mlx-local-inference
Full local AI inference stack on Apple Silicon Macs via MLX.
Best use case
mlx-local-inference is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using mlx-local-inference should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/mlx-local-inference/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
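The manual steps above amount to copying a downloaded SKILL.md into a fixed location under the project. The following is a minimal Python sketch of that copy step, assuming the file has already been downloaded to a local path. The function name `install_skill` and its parameters are hypothetical and are not part of the skill or of any agent's API.

```python
import shutil
from pathlib import Path

# Skill directory name taken from the install path given in the steps above.
SKILL_NAME = "mlx-local-inference"


def install_skill(downloaded_skill_md: str, project_root: str = ".") -> Path:
    """Copy a downloaded SKILL.md to <project_root>/.claude/skills/<skill>/SKILL.md.

    Hypothetical helper for illustration only; it does not download anything.
    """
    source = Path(downloaded_skill_md)
    if not source.is_file():
        raise FileNotFoundError(f"SKILL.md not found at {source}")

    # Create .claude/skills/mlx-local-inference/ if it does not exist yet.
    target_dir = Path(project_root) / ".claude" / "skills" / SKILL_NAME
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "SKILL.md"
    shutil.copyfile(source, target)
    return target
```

Calling `install_skill("SKILL.md")` from the project root copies the file into place and returns the destination path; the agent still needs to be restarted afterwards.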
How mlx-local-inference Compares
| Feature / Agent | mlx-local-inference | Standard Approach |
|---|---|---|
| Platform Support | Apple Silicon Macs (via MLX) | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Full local AI inference stack on Apple Silicon Macs via MLX.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# mlx-local-inference

Full local AI inference stack on Apple Silicon Macs via MLX.

## Install

```
npx clawhub@latest install mlx-local-inference
```
Related Skills
local-first-llm
Routes LLM requests to a local model (Ollama, LM Studio, llamafile) before falling back to cloud APIs.
qwen3-tts-local-inference
Generate speech from text using Qwen3-TTS via direct Python inference — no server required.
local-system-info
Return system metrics (CPU, RAM, disk, processes) using psutil.
iyeque-local-system-info
Return system metrics (CPU, RAM, disk, processes) using psutil.
whisper-mlx-local
Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.
parakeet-local-asr
Install and operate local NVIDIA Parakeet ASR for OpenClaw with an OpenAI-compatible transcription API.
browser-use-local
Use when you need browser automation via the browser-use CLI or Python code in this OpenClaw container/host: open pages, click/type, take screenshots, extract HTML/links, or run an Agent with an OpenAI-compatible LLM (e.g. Moonshot/Kimi) using a custom base_url. Also use for debugging browser-use sessions (state empty, page readiness timeouts), and for extracting login QR codes from demo/login pages via screenshots or HTML data:image.
zvec-local-rag-service
Operate an always-on local semantic-search service using zvec + Ollama embeddings.
shodh-local
Local Shodh-Memory v0.1.74 (offline cognitive memory for AI agents). Use for persistent remembering, semantic recall, GTD todos/projects, knowledge graph. Triggers: "remember/save/merke X", "recall/Erinnere/search memories about Y", "todos/add/complete", "projects", "proactive context", "what learned about Z". Server localhost:3030 (amber-seaslug), key in TOOLS.md. Hebbian learning, 3-tier (working/session/LTM), TUI dashboard.
comfyui-local
Generate high-quality images using a local ComfyUI instance.
local-task-runner
This skill provides a mechanism to execute Node.js code snippets or full scripts locally on the host machine.
localsend
Send and receive files to/from nearby devices using the LocalSend protocol.