offline-llama

Autonomously manage and use local Ollama models for continuous operation without internet dependency. Includes model health monitoring, automatic fallback, and self-healing capabilities.

3,891 stars

Best use case

offline-llama is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using offline-llama should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/offline-llama/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/and-ray-m/offline-llama/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/offline-llama/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How offline-llama Compares

| Feature / Agent | offline-llama | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Autonomously manage and use local Ollama models for continuous operation without internet dependency. Includes model health monitoring, automatic fallback, and self-healing capabilities.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# offline-llama

Autonomously manage and use local Ollama models for continuous operation without internet dependency. Includes model health monitoring, automatic fallback, and self-healing capabilities.

## Overview

This skill enables autonomous operation with local Ollama models. It monitors model health, automatically switches between models when issues occur, and maintains functionality even without internet connectivity. The skill includes self-healing capabilities to restart services and clear resources when needed.

## Core Features

### Model Management
- **Health Monitoring**: Continuously check model availability and performance
- **Automatic Fallback**: Switch to an alternative model when the primary fails
- **Model Switching**: Dynamically select the best available model for the task

### Self-Healing
- **Service Restart**: Automatically restart Ollama when models become unavailable
- **Resource Management**: Clear cache and temporary files to free resources
- **Model Reinstallation**: Reinstall problematic models automatically

### Connectivity Awareness
- **Internet Detection**: Monitor internet connectivity status
- **Smart Fallback**: Switch to remote models when local models are unavailable and internet is present
- **Offline Mode**: Maintain full functionality without internet
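
Internet detection can be as simple as a short TCP probe. The sketch below is an assumption about how such a check might look, not the skill's actual implementation; the function name `internet_available` and the default probe target (`1.1.1.1`, port 53) are illustrative choices.

```python
import socket


def internet_available(host: str = "1.1.1.1", port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds.

    The default target is an assumption (a public DNS resolver); any host that
    is reliably reachable from your network would work as well.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

A short timeout keeps the probe from stalling the agent when the network is down; the result can be cached for a few seconds if the check runs often.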

## Configuration

### Models
- **Primary**: llama-3.1-8b-instruct (general tasks)
- **Secondary**: mistral-7b-instruct (faster responses)
- **Specialized**: code-llama-7b (coding tasks)
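
The three roles above could be represented as a small lookup table. This is a minimal sketch: the `ModelRole` class, the `model_for_task` helper, and the task kinds `"coding"` and `"fast"` are hypothetical names introduced for illustration. The model names are copied from the list as written; the tags that Ollama actually uses on your machine may be spelled differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRole:
    name: str      # model name as listed in the configuration above
    purpose: str   # what the role is intended for


# Names are copied from the configuration; real Ollama tags may differ.
MODELS = {
    "primary": ModelRole("llama-3.1-8b-instruct", "general tasks"),
    "secondary": ModelRole("mistral-7b-instruct", "faster responses"),
    "specialized": ModelRole("code-llama-7b", "coding tasks"),
}


def model_for_task(task_kind: str) -> ModelRole:
    """Pick a role for a task kind; unknown kinds use the primary model."""
    routing = {"coding": "specialized", "fast": "secondary"}  # hypothetical task kinds
    return MODELS[routing.get(task_kind, "primary")]
```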

### Health Checks
- **Model Status**: Monitor availability every 30 seconds
- **Latency Tracking**: Monitor response times every minute
- **Resource Usage**: Monitor GPU/CPU and memory every 5 minutes
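
One way to drive checks that run at different intervals is to keep the timestamp of each check's last run and ask which ones are due. The sketch below assumes that approach; `INTERVALS` and `due_checks` are illustrative names, and only the interval values come from the list above.

```python
# Check intervals in seconds, taken from the list above.
INTERVALS = {"model_status": 30, "latency": 60, "resources": 300}


def due_checks(now: float, last_run: dict) -> list:
    """Return the names of checks whose interval has elapsed since their last run.

    `last_run` maps a check name to the timestamp of its last run; a check
    that has never run is always due.
    """
    return [
        name
        for name, every in INTERVALS.items()
        if name not in last_run or now - last_run[name] >= every
    ]
```

A monitoring loop could call `due_checks(time.monotonic(), last_run)` once per second, run whatever is returned, and record the new timestamps.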

### Fallback Strategies
1. **Model Switching**: Automatically switch to alternative local models
2. **Response Retry**: Retry failed requests with exponential backoff
3. **Degraded Mode**: Continue with limited functionality if all models are unavailable
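
The three strategies can be combined into a single request path. The following is a minimal sketch under stated assumptions: `call` is a caller-supplied function that sends a prompt to one model and raises on failure, and the delay values (0.5 s base, doubling, capped at 8 s) are illustrative rather than taken from the skill.

```python
import time


def backoff_delays(count: int, base: float = 0.5, factor: float = 2.0, cap: float = 8.0) -> list:
    """Return `count` exponentially growing delays, each capped at `cap` seconds."""
    return [min(cap, base * factor ** i) for i in range(count)]


def generate_with_fallback(prompt, models, call, attempts: int = 3, sleep=time.sleep) -> dict:
    """Try each model in order, retrying with exponential backoff before moving on.

    `call(model, prompt)` is a hypothetical callable that returns text or raises.
    If every model fails, a degraded-mode marker is returned instead of raising.
    """
    delays = backoff_delays(attempts - 1)
    for model in models:
        for attempt in range(attempts):
            try:
                return {"model": model, "text": call(model, prompt), "degraded": False}
            except Exception:
                # Any failure counts; wait before retrying the same model,
                # but do not wait after its final attempt.
                if attempt < attempts - 1:
                    sleep(delays[attempt])
    return {"model": None, "text": "", "degraded": True}
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.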

## Usage

### When Internet is Available
- Use local models primarily
- Fall back to remote models if local models are unavailable
- Maintain optimal performance

### When Internet is Unavailable
- Use local models exclusively
- Continue all operations without interruption
- Provide degraded functionality if needed
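
The two cases above reduce to a small decision rule. This sketch assumes two boolean inputs computed elsewhere (whether any local model is healthy, and whether internet is reachable); `choose_backend` and its return labels are hypothetical names.

```python
def choose_backend(local_ok: bool, online: bool) -> str:
    """Return "local", "remote", or "degraded" following the rules above."""
    if local_ok:
        return "local"      # local models are preferred whenever they are healthy
    if online:
        return "remote"     # remote fallback is only possible with internet
    return "degraded"       # offline and no healthy local model
```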

## Commands

### Model Management
- `model_status` - Check current model health
- `switch_model` - Manually switch between models
- `restart_ollama` - Restart Ollama service

### Health Monitoring
- `check_health` - Run comprehensive health check
- `monitor_resources` - Monitor system resources
- `clear_cache` - Clear model cache and temporary files
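
The source does not show how these commands are implemented. As an illustration only, a status check in the spirit of `model_status` could be backed by Ollama's HTTP API, whose `/api/tags` endpoint lists locally installed models. The helper names below and the default address are assumptions.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address; adjust if yours differs


def installed_models(payload: dict) -> list:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 3.0):
    """Return the parsed /api/tags response, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection errors, timeouts, and malformed JSON all count as "down".
        return None


def model_status(wanted: list, payload) -> dict:
    """Map each wanted model name to True/False, or to None if the service is down."""
    if payload is None:
        return {name: None for name in wanted}
    present = set(installed_models(payload))
    return {name: name in present for name in wanted}
```

Typical use would be `model_status(["<model tag>"], fetch_tags())`, where a `None` value signals that the Ollama service itself did not answer.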

## Self-Healing

### Automatic Actions
- **Service Restart**: Triggered when a model becomes unavailable
- **Resource Cleanup**: Triggered when high memory usage is detected
- **Model Reinstallation**: Triggered when persistent failures occur
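
The triggers above can be expressed as a pure function that maps observations to actions. This is a sketch, not the skill's logic: the thresholds (90% memory use, 3 consecutive failures) and the `reinstall_model` label are assumptions, while `restart_ollama` and `clear_cache` reuse the command names listed earlier.

```python
def plan_healing(model_available: bool, memory_fraction: float, consecutive_failures: int,
                 memory_limit: float = 0.9, failure_limit: int = 3) -> list:
    """Return the self-healing actions suggested by the current observations.

    `memory_fraction` is used memory as a value between 0 and 1. Both limits
    are hypothetical defaults and should be tuned for the host.
    """
    actions = []
    if not model_available:
        actions.append("restart_ollama")    # service restart
    if memory_fraction >= memory_limit:
        actions.append("clear_cache")       # resource cleanup
    if consecutive_failures >= failure_limit:
        actions.append("reinstall_model")   # hypothetical action label
    return actions
```

Keeping the decision separate from the code that executes it makes the policy easy to test and to log before anything is restarted or deleted.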

### Manual Intervention
- **Manual Restart**: User can manually restart services
- **Cache Clearing**: User can manually clear resources
- **Model Updates**: User can update models as needed

## Security Considerations

- All operations are performed locally unless the skill falls back to remote models
- No external dependencies required
- Secure model management
- Privacy-preserving by default

## Performance Optimization

- **Resource Monitoring**: Track GPU/CPU usage and memory
- **Latency Tracking**: Monitor response times and performance
- **Model Selection**: Choose optimal model based on task requirements

## Maintenance

### Regular Tasks
- **Health Checks**: Run periodic health checks
- **Cache Management**: Clear unused cache regularly
- **Model Updates**: Keep models updated when possible

### Troubleshooting
- **Log Analysis**: Monitor logs for issues
- **Performance Metrics**: Track performance over time
- **Error Handling**: Graceful error handling and recovery

## Integration

This skill integrates with:
- **Ollama**: Local model management
- **System Resources**: Monitor and manage system resources
- **Network**: Detect internet connectivity
- **OpenClaw**: Seamless integration with existing tools

## Future Enhancements

- **Model Training**: Support for custom model training
- **Advanced Routing**: Intelligent model selection based on task
- **Multi-GPU Support**: Scale across multiple GPUs
- **Cloud Sync**: Optional cloud backup and synchronization

## License

This skill is part of the OpenClaw ecosystem and follows the same licensing terms as OpenClaw itself.

Related Skills

ollama

3891
from openclaw/skills

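Ollama local LLM invocation skill. Supports interacting with an Ollama instance through its API for text generation. Use when: (1) you need to call a local or remote Ollama model, (2) you need to run LLM inference tasks, (3) you need to interact with a specific Ollama instance (such as qwen3.5:9b) through a Python script.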

mirofish-offline-simulation

3823
from openclaw/skills

Fully local multi-agent swarm intelligence simulation engine using Neo4j + Ollama for public opinion, market sentiment, and social dynamics prediction.

article-factory-wechat

3891
from openclaw/skills

Content & Documentation

humanizer

3891
from openclaw/skills

Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.

Content & Documentation

find-skills

3891
from openclaw/skills

Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.

General Utilities

tavily-search

3891
from openclaw/skills

Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the web. Requires Tavily API key.

Data & Research

baidu-search

3891
from openclaw/skills

Search the web using Baidu AI Search Engine (BDSE). Use for live information, documentation, or research topics.

Data & Research

agent-autonomy-kit

3891
from openclaw/skills

Stop waiting for prompts. Keep working.

Workflow & Productivity

Meeting Prep

3891
from openclaw/skills

Never walk into a meeting unprepared again. Your agent researches all attendees before calendar events—pulling LinkedIn profiles, recent company news, mutual connections, and conversation starters. Generates a briefing doc with talking points, icebreakers, and context so you show up informed and confident. Triggered automatically before meetings or on-demand. Configure research depth, advance timing, and output format. Walking into meetings blind is amateur hour—missed connections, generic small talk, zero leverage. Use when setting up meeting intelligence, researching specific attendees, generating pre-meeting briefs, or automating your prep workflow.

Workflow & Productivity

self-improvement

3891
from openclaw/skills

Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Claude ('No, that's wrong...', 'Actually...'), (3) User requests a capability that doesn't exist, (4) An external API or tool fails, (5) Claude realizes its knowledge is outdated or incorrect, (6) A better approach is discovered for a recurring task. Also review learnings before major tasks.

Agent Intelligence & Learning

botlearn-healthcheck

3891
from openclaw/skills

botlearn-healthcheck — BotLearn autonomous health inspector for OpenClaw instances across 5 domains (hardware, config, security, skills, autonomy); triggers on system check, health report, diagnostics, or scheduled heartbeat inspection.

DevOps & Infrastructure

linkedin-cli

3891
from openclaw/skills

A bird-like LinkedIn CLI for searching profiles, checking messages, and summarizing your feed using session cookies.

Content & Documentation