ollama

Ollama local LLM deployment and management. Use for running LLMs locally.

7 stars

Best use case

ollama is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Ollama local LLM deployment and management. Use for running LLMs locally.

Teams using ollama should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ollama/SKILL.md --create-dirs "https://raw.githubusercontent.com/G1Joshi/Agent-Skills/main/skills/ai-ml/ollama/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ollama/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ollama Compares

Feature / AgentollamaStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Ollama local LLM deployment and management. Use for running LLMs locally.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Ollama

Ollama makes running LLMs locally as easy as `docker run`. 2025 updates include **Windows/AMD** support, **Multimodal** input, and Tool Calling.

## When to Use

- **Local Development**: Coding without wifi or API costs.
- **Privacy**: Processing sensitive documents on-device.
- **Integration**: Works with LangChain, LlamaIndex, and Obsidian natively.

## Core Concepts

### Modelfile

Docker-like file to define a custom model (System prompt + Base model).

```dockerfile
FROM llama3
SYSTEM You are Mario from Super Mario Bros.
```

### API

Ollama runs a local server (`localhost:11434`) compatible with OpenAI SDK.

## Best Practices (2025)

**Do**:

- **Use high-speed RAM**: Local LLM speed depends on memory bandwidth.
- **Use Quantized Models**: `q4_k_m` is the sweet spot for speed/quality balance.
- **Unload**: `ollama stop` when done to free VRAM for games/rendering.

**Don't**:

- **Don't expect GPT-4 level**: Smaller local models (8B) are smart but lack deep reasoning.

## References

- [Ollama Website](https://ollama.com/)