Mojo — Python-Speed Systems Language for AI

## Overview

This skill gives an AI agent a repeatable workflow for Mojo, Modular's systems language that pairs Python's usability with C-level performance. It is best used when you need a reusable AI agent workflow rather than a one-off prompt; teams using it should expect more consistent output, faster repeated execution, and less prompt rewriting.

## When to use this skill

- You want a reusable workflow that can be run more than once with consistent structure.

## When not to use this skill

- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.

## Installation

### Claude Code / Cursor / Codex

```bash
curl -o ~/.claude/skills/mojo/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/mojo/SKILL.md"
```

### Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it at `.claude/skills/mojo/SKILL.md` inside your project
  3. Restart your AI agent; it will auto-discover the skill

## How Mojo — Python-Speed Systems Language for AI Compares

| Feature / Agent | Mojo — Python-Speed Systems Language for AI | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

## Frequently Asked Questions

**What does this skill do?**

It teaches the agent to write high-performance Mojo code: Python interop, SIMD vectorization, multi-core parallelism, and gradual porting of Python hot paths to Mojo.

**Where can I find the source code?**

You can find the source code on GitHub using the link provided at the top of the page.

## SKILL.md Source

# Mojo — Python-Speed Systems Language for AI


## Overview


Mojo is a programming language by Modular that combines Python's usability with C-level performance. This skill helps developers write high-performance AI/ML code, optimize numerical computations with SIMD and parallelism, and gradually port Python code to Mojo for orders-of-magnitude speedups.


## Instructions

### Python Compatibility

```mojo
# Mojo can run Python code directly and call Python libraries.
# Start with Python, optimize hot paths in Mojo.

from python import Python

fn main() raises:
    # Import any Python library
    let np = Python.import_module("numpy")
    let plt = Python.import_module("matplotlib.pyplot")

    # Use NumPy arrays as usual
    let data = np.random.randn(1000)
    let mean = np.mean(data)
    let std = np.std(data)
    print("Mean:", mean, "Std:", std)

    # Plot with matplotlib
    plt.hist(data, bins=30)
    plt.title("Distribution")
    plt.savefig("plot.png")
```

### High-Performance Structs

```mojo
# matrix.mojo — Custom matrix type with SIMD operations
# This runs 10-100x faster than equivalent NumPy for small matrices.

from math import sqrt
from memory import memset_zero
from sys.info import simdwidthof
from algorithm import vectorize, parallelize

struct Matrix:
    var data: DTypePointer[DType.float64]
    var rows: Int
    var cols: Int

    fn __init__(inout self, rows: Int, cols: Int):
        self.rows = rows
        self.cols = cols
        self.data = DTypePointer[DType.float64].alloc(rows * cols)
        memset_zero(self.data, rows * cols)

    fn __getitem__(self, row: Int, col: Int) -> Float64:
        return self.data.load(row * self.cols + col)

    fn __setitem__(inout self, row: Int, col: Int, val: Float64):
        self.data.store(row * self.cols + col, val)

    fn __del__(owned self):
        self.data.free()

    fn matmul(self, other: Matrix) -> Matrix:
        """Matrix multiplication using SIMD vectorization.

        Processes multiple floating-point operations per CPU instruction.
        On a modern CPU with 256-bit SIMD, this processes 4 float64 ops
        at once — a 4x speedup over scalar code.
        """
        var result = Matrix(self.rows, other.cols)
        let simd_width = simdwidthof[DType.float64]()

        @parameter
        fn calc_row(row: Int):
            for k in range(self.cols):
                @parameter
                fn dot[simd_width: Int](col: Int):
                    result.data.store[width=simd_width](
                        row * other.cols + col,
                        result.data.load[width=simd_width](row * other.cols + col)
                        + self[row, k] * other.data.load[width=simd_width](k * other.cols + col)
                    )
                vectorize[dot, simd_width](other.cols)

        parallelize[calc_row](self.rows)    # Parallelize across CPU cores
        return result

    fn frobenius_norm(self) -> Float64:
        """Compute the Frobenius norm using SIMD reduction."""
        var sum: Float64 = 0.0
        let simd_width = simdwidthof[DType.float64]()

        @parameter
        fn accumulate[width: Int](idx: Int):
            let vals = self.data.load[width=width](idx)
            sum += (vals * vals).reduce_add()

        vectorize[accumulate, simd_width](self.rows * self.cols)
        return sqrt(sum)
```
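The Mojo kernels above are easiest to trust when checked against a plain-Python reference. Here is a minimal sketch (pure Python, nested lists, no NumPy) of the same matrix multiply and Frobenius norm, usable as a correctness oracle for `Matrix.matmul` and `frobenius_norm`; the function names are illustrative, not part of the skill:

```python
import math

def matmul_ref(a, b):
    """Reference matrix multiply on nested lists: a is (n x k), b is (k x m)."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            aip = a[i][p]
            for j in range(m):
                out[i][j] += aip * b[p][j]
    return out

def frobenius_ref(a):
    """Reference Frobenius norm: sqrt of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in a for x in row))
```

Note the loop order (`i`, `p`, `j`) mirrors the Mojo kernel: the inner `j` loop walks a contiguous row of `b`, which is also what the SIMD `vectorize` call exploits.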

### SIMD and Vectorization

```mojo
# simd_example.mojo — Process data in parallel lanes
from algorithm import vectorize
from sys.info import simdwidthof
from math import exp

fn relu_activation(inout data: DTypePointer[DType.float32], size: Int):
    """Apply ReLU activation using SIMD.

    Processes 8 float32 values simultaneously on AVX2 CPUs,
    or 16 on AVX-512. The @parameter decorator ensures the
    SIMD width is resolved at compile time.
    """
    let simd_width = simdwidthof[DType.float32]()  # 8 on AVX2

    @parameter
    fn apply_relu[width: Int](idx: Int):
        let values = data.load[width=width](idx)
        let zeros = SIMD[DType.float32, width](0)
        data.store[width=width](idx, values.max(zeros))

    vectorize[apply_relu, simd_width](size)


fn softmax(inout data: DTypePointer[DType.float32], size: Int):
    """Numerically stable softmax with SIMD operations."""
    # Find max for numerical stability
    var max_val: Float32 = data.load(0)
    for i in range(1, size):
        let val = data.load(i)
        if val > max_val:
            max_val = val

    # Compute exp(x - max) and sum
    var sum: Float32 = 0.0
    for i in range(size):
        let exp_val = exp(data.load(i) - max_val)
        data.store(i, exp_val)
        sum += exp_val

    # Normalize
    let inv_sum = 1.0 / sum
    let simd_width = simdwidthof[DType.float32]()

    @parameter
    fn normalize[width: Int](idx: Int):
        data.store[width=width](idx, data.load[width=width](idx) * inv_sum)

    vectorize[normalize, simd_width](size)
```
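The max-subtraction trick above is the key to a stable softmax, and it is worth having a plain-Python reference to test the Mojo version against. A sketch (the name `softmax_ref` is illustrative):

```python
import math

def softmax_ref(xs):
    """Numerically stable softmax: subtract the max before exponentiating
    so exp() never overflows, then normalize so the outputs sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]
```

Because `x - m` is never positive, `math.exp` never overflows even for inputs like `[1000.0, 1000.0]`, where a naive `exp(x)` would produce `inf`.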

### Parallelism

```mojo
# parallel.mojo — Multi-core parallel processing
from algorithm import parallelize

fn parallel_image_processing(
    inout pixels: DTypePointer[DType.uint8],
    width: Int,
    height: Int,
):
    """Apply grayscale conversion in parallel across CPU cores.

    Each row is processed by a different core, and within each row,
    SIMD processes multiple pixels simultaneously.
    """
    @parameter
    fn process_row(row: Int):
        let row_offset = row * width * 3    # 3 channels (RGB)

        for col in range(width):
            let idx = row_offset + col * 3
            let r = pixels.load(idx).cast[DType.float32]()
            let g = pixels.load(idx + 1).cast[DType.float32]()
            let b = pixels.load(idx + 2).cast[DType.float32]()

            # Luminance formula: 0.299R + 0.587G + 0.114B
            let gray = (0.299 * r + 0.587 * g + 0.114 * b).cast[DType.uint8]()
            pixels.store(idx, gray)
            pixels.store(idx + 1, gray)
            pixels.store(idx + 2, gray)

    parallelize[process_row](height)    # Each core handles a subset of rows
```
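A plain-Python version of the same Rec. 601 luminance conversion is useful for validating the parallel Mojo kernel on a small test image. This sketch works on a flat `[R, G, B, R, G, B, ...]` list, mirroring the Mojo buffer layout (the function name is illustrative):

```python
def grayscale_ref(pixels):
    """Convert a flat RGB byte list to grayscale in place, writing the
    luminance value (0.299R + 0.587G + 0.114B) back to all three channels."""
    for i in range(0, len(pixels), 3):
        r, g, b = pixels[i], pixels[i + 1], pixels[i + 2]
        gray = int(0.299 * r + 0.587 * g + 0.114 * b)
        pixels[i] = pixels[i + 1] = pixels[i + 2] = gray
    return pixels
```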

## Installation

```bash
# Install Modular CLI
curl -s https://get.modular.com | sh -

# Install Mojo
modular install mojo

# Verify
mojo --version

# Run a Mojo file
mojo run my_program.mojo

# Build a binary
mojo build my_program.mojo -o my_program
```


## Examples


### Example 1: Optimizing a Python hot path with Mojo

**User request:**

```
My NumPy-based matrix multiplication is too slow. Port the hot loop to Mojo.
```

The agent keeps the Python entry point, rewrites the inner loop as a Mojo `fn` with SIMD vectorization and multi-core parallelism, and wires the two together through Mojo's Python interop layer.

### Example 2: Migrating an existing numeric kernel to Mojo

**User request:**

```
I have a custom matrix type written in plain Python. Migrate it to Mojo for better performance.
```

The agent reads the existing implementation, maps the custom logic to Mojo structs, rewrites the hot methods using `fn` functions with `vectorize` and `parallelize`, preserves existing behavior, and adds optimizations only possible with Mojo (like SIMD vectorization and multi-core parallelism).


## Guidelines

1. **Start with Python, optimize in Mojo** — Use Python imports for prototyping; rewrite hot loops in native Mojo for 10-1000x speedups
2. **Use SIMD for data processing** — `vectorize` processes multiple values per instruction; always prefer it over scalar loops for numeric data
3. **Parallelize across cores** — `parallelize` distributes work across CPU cores; combine with `vectorize` for maximum throughput
4. **`fn` over `def`** — Use `fn` for performance-critical functions (strict typing, no overhead); use `def` for flexibility
5. **Owned and borrowed references** — Use `borrowed` (default) for read-only, `inout` for mutation, `owned` for transfers; this enables zero-copy optimizations
6. **Compile-time metaprogramming** — Use `@parameter` for compile-time evaluation; SIMD widths, loop unrolling, and specialization happen at compile time
7. **Profile before optimizing** — Use `mojo build -O3` and benchmark; not everything needs SIMD — focus on actual bottlenecks
8. **Gradual migration** — Port one function at a time from Python to Mojo; the interop layer makes incremental adoption easy
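The "start with Python, profile, then port" loop from guidelines 1, 7, and 8 can be sketched in plain Python: time the hot function first, so you know whether a Mojo port is worth the effort. The function and threshold below are illustrative:

```python
import time

def hot_path(n):
    """A typical porting candidate: a tight numeric loop over scalars."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()
result = hot_path(1_000_000)
elapsed = time.perf_counter() - start
# If this dominates your profile, it is a good candidate for a Mojo `fn`
# with `vectorize`/`parallelize`; if not, leave it in Python.
print(f"hot_path({1_000_000}) took {elapsed * 1000:.1f} ms")
```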

## Related Skills

All from ComeOnOliver/skillshub:

- **building-recommendation-systems**: This skill empowers Claude to construct recommendation systems using collaborative filtering, content-based filtering, or hybrid approaches. It analyzes user preferences, item features, and interaction data to generate personalized recommendations. Use this skill when the user requests to build a recommendation engine, needs help with collaborative filtering, wants to implement content-based filtering, or seeks to rank items based on relevance for a specific user or group of users. It is triggered by requests involving "recommendations", "collaborative filtering", "content-based filtering", "ranking items", or "building a recommender".
- **orchestrating-multi-agent-systems**: Orchestrate multi-agent systems with handoffs, routing, and workflows across AI providers. Use when building complex AI systems requiring agent collaboration, task delegation, or workflow coordination. Trigger with phrases like "create multi-agent system", "orchestrate agents", or "coordinate agent workflows".
- **python-mcp-server-generator**: Generate a complete MCP server project in Python with tools, resources, and proper configuration.
- **next-intl-add-language**: Add a new language to a Next.js + next-intl application.
- **dataverse-python-usecase-builder**: Generate complete solutions for specific Dataverse SDK use cases with architecture recommendations.
- **dataverse-python-quickstart**: Generate Python SDK setup + CRUD + bulk + paging snippets using official patterns.
- **dataverse-python-production-code**: Generate production-ready Python code using the Dataverse SDK with error handling, optimization, and best practices.
- **dataverse-python-advanced-patterns**: Generate production code for the Dataverse SDK using advanced patterns, error handling, and optimization techniques.
- **aws-cdk-python-setup**: Setup and initialization guide for developing AWS CDK (Cloud Development Kit) applications in Python. This skill enables users to configure environment prerequisites, create new CDK projects, manage dependencies, and deploy to AWS.
- **answering-natural-language-questions-with-dbt**: Writes and executes SQL queries against the data warehouse using dbt's Semantic Layer or ad-hoc SQL to answer business questions. Use when a user asks about analytics, metrics, KPIs, or data (e.g., "What were total sales last quarter?", "Show me top customers by revenue"). Not for validating, testing, or building dbt models during development.
- **python-design-patterns**: Python design patterns including KISS, Separation of Concerns, Single Responsibility, and composition over inheritance. Use when making architecture decisions, refactoring code structure, or evaluating when abstractions are appropriate.
- **temporal-python-testing**: Test Temporal workflows with pytest, time-skipping, and mocking strategies. Covers unit testing, integration testing, replay testing, and local development setup. Use when implementing Temporal workflow tests or debugging test failures.