modly-image-to-3d

Desktop app that generates 3D models from images using local AI running entirely on your GPU

by Aradotso

Best use case

modly-image-to-3d is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using modly-image-to-3d should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/modly-image-to-3d/SKILL.md --create-dirs "https://raw.githubusercontent.com/Aradotso/trending-skills/main/skills/modly-image-to-3d/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/modly-image-to-3d/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How modly-image-to-3d Compares

| Feature / Agent | modly-image-to-3d | Standard Approach |
|-----------------|-------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Desktop app that generates 3D models from images using local AI running entirely on your GPU

Where can I find the source code?

The skill file lives in the Aradotso/trending-skills repository on GitHub (see the URL under Installation). Modly itself is at https://github.com/lightningpixel/modly.

SKILL.md Source

# Modly Image-to-3D Skill

> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.

Modly is a local, open-source desktop application (Windows/Linux) that converts photos into 3D mesh models using AI models running entirely on your GPU — no cloud, no API keys required.

---

## Architecture Overview

```
modly/
├── src/                    # Electron + TypeScript frontend
│   ├── main/               # Electron main process
│   ├── renderer/           # React UI (renderer process)
│   └── preload/            # IPC bridge
├── api/                    # Python FastAPI backend
│   ├── generator.py        # Core generation logic
│   └── requirements.txt
├── resources/
│   └── icons/
├── launcher.bat            # Windows quick-start
├── launcher.sh             # Linux quick-start
└── package.json
```

The app runs as an Electron shell over a local Python FastAPI server. Extensions are GitHub repos with a `manifest.json` + `generator.py` that plug into the extension system.

---

## Installation

### Quick start (no build required)

```bash
# Windows
launcher.bat

# Linux
chmod +x launcher.sh
./launcher.sh
```

### Development setup

```bash
# 1. Clone
git clone https://github.com/lightningpixel/modly
cd modly

# 2. Install JS dependencies
npm install

# 3. Set up Python backend
cd api
python -m venv .venv

# Activate the venv: run only the command for your OS
# Windows
.venv\Scripts\activate
# Linux/macOS
source .venv/bin/activate

pip install -r requirements.txt
cd ..

# 4. Run dev mode (starts Electron + Python backend)
npm run dev
```

### Production build

```bash
# Build installers for current platform
npm run build

# Output goes to dist/
```

---

## Key npm Scripts

```bash
npm run dev        # Start app in development mode (hot reload)
npm run build      # Package app for distribution
npm run lint       # Run ESLint
npm run typecheck  # TypeScript type checking
```

---

## Extension System

Extensions are GitHub repositories containing:
- `manifest.json` — metadata and model variants
- `generator.py` — generation logic implementing the Modly extension interface

### manifest.json structure

```json
{
  "name": "My 3D Extension",
  "id": "my-extension-id",
  "description": "Generates 3D models using XYZ model",
  "version": "1.0.0",
  "author": "Your Name",
  "repository": "https://github.com/yourname/my-modly-extension",
  "variants": [
    {
      "id": "model-small",
      "name": "Small (faster)",
      "description": "Lighter variant for faster generation",
      "size_gb": 4.2,
      "vram_gb": 6,
      "files": [
        {
          "url": "https://huggingface.co/yourorg/yourmodel/resolve/main/weights.safetensors",
          "filename": "weights.safetensors",
          "sha256": "abc123..."
        }
      ]
    }
  ]
}
```
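A manifest with a missing field is easy to catch before publishing. The sketch below checks the fields shown in the example above and hashes a downloaded file for comparison against `sha256`. It assumes every field in the example is required, which the source does not confirm, and `validate_manifest` and `sha256_of` are hypothetical helper names rather than part of Modly.

```python
import hashlib
from pathlib import Path

# Assumption: these keys mirror the example manifest; Modly may require more or fewer.
TOP_LEVEL_KEYS = ("name", "id", "description", "version", "author", "repository", "variants")
VARIANT_KEYS = ("id", "name", "description", "size_gb", "vram_gb", "files")
FILE_KEYS = ("url", "filename", "sha256")


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing top-level key: {key}" for key in TOP_LEVEL_KEYS if key not in manifest]
    for i, variant in enumerate(manifest.get("variants", [])):
        for key in VARIANT_KEYS:
            if key not in variant:
                problems.append(f"variants[{i}] missing key: {key}")
        for j, entry in enumerate(variant.get("files", [])):
            for key in FILE_KEYS:
                if key not in entry:
                    problems.append(f"variants[{i}].files[{j}] missing key: {key}")
    return problems


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

To use it, load the file with `json.loads(Path("manifest.json").read_text())`, pass the result to `validate_manifest`, and compare `sha256_of(weights_path)` with the `sha256` value listed for that file.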

### generator.py interface

```python
# api/extensions/<extension-id>/generator.py
# Required interface every extension must implement

from pathlib import Path

def generate(
    image_path: str,
    output_path: str,
    variant_id: str,
    models_dir: str,
    **kwargs
) -> dict:
    """
    Required entry point for all Modly extensions.
    
    Args:
        image_path:  Path to input image file
        output_path: Path where output .glb/.obj should be saved
        variant_id:  Which model variant to use
        models_dir:  Directory where downloaded model weights live
    
    Returns:
        dict with keys:
            success (bool)
            output_file (str) — path to generated mesh
            error (str, optional)
    """
    try:
        # Load your model weights
        weights = Path(models_dir) / variant_id / "weights.safetensors"
        
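        # Run your inference. run_inference is a placeholder for your own
        # model code; it must return a mesh object with an export() method
        # (for example, a trimesh.Trimesh).
        mesh = run_inference(str(weights), image_path)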
        
        # Save output
        mesh.export(output_path)
        
        return {
            "success": True,
            "output_file": output_path
        }
    except Exception as e:
        return {
            "success": False,
            "error": str(e)
        }
```
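Because every extension returns the same dictionary shape, the contract can be checked without a GPU. The sketch below uses a hypothetical `check_result` helper and a stand-in `fake_generate` function (neither is part of Modly) to show which keys a caller can rely on. It assumes the return contract in the docstring above and treats `error` as expected whenever `success` is false.

```python
import tempfile
from pathlib import Path


def check_result(result: dict) -> None:
    """Raise AssertionError if a generate() return value breaks the documented contract."""
    assert isinstance(result.get("success"), bool), "success must be a bool"
    if result["success"]:
        assert Path(result["output_file"]).is_file(), "output_file must exist on disk"
    else:
        assert isinstance(result.get("error"), str), "failed results should carry an error string"


def fake_generate(image_path, output_path, variant_id, models_dir, **kwargs) -> dict:
    """Stand-in for a real extension: writes an empty placeholder file instead of a mesh."""
    try:
        if not Path(image_path).is_file():
            raise FileNotFoundError(f"input image not found: {image_path}")
        Path(output_path).write_bytes(b"")  # placeholder, not a valid .glb
        return {"success": True, "output_file": output_path}
    except Exception as exc:
        return {"success": False, "error": str(exc)}


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "input.png"
    image.write_bytes(b"")  # dummy input; a real run needs an actual image
    ok = fake_generate(str(image), str(Path(tmp) / "out.glb"), "model-small", tmp)
    failed = fake_generate(str(Path(tmp) / "missing.png"), str(Path(tmp) / "out2.glb"), "model-small", tmp)
    check_result(ok)
    check_result(failed)
```

Swapping `fake_generate` for a real extension's `generate` gives a quick smoke test to run before installing the extension through the UI.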

### Installing an extension (UI flow)

1. Open Modly → go to **Models** page
2. Click **Install from GitHub**
3. Paste the HTTPS URL, e.g. `https://github.com/lightningpixel/modly-hunyuan3d-mini-extension`
4. After install, click **Download** on the desired model variant
5. Select the installed model and upload an image to generate
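Step 3 expects an HTTPS GitHub URL, so a pre-flight check can save a failed install. The function below is a minimal sketch with a hypothetical name, `is_https_github_repo_url`. It only checks the URL's shape and does not reproduce whatever validation Modly performs internally.

```python
from urllib.parse import urlparse


def is_https_github_repo_url(url: str) -> bool:
    """Return True for URLs shaped like https://github.com/<owner>/<repo>."""
    parsed = urlparse(url.strip())
    if parsed.scheme != "https" or parsed.netloc.lower() != "github.com":
        return False
    # Exactly two non-empty path segments: the owner and the repository name.
    parts = [part for part in parsed.path.split("/") if part]
    return len(parts) == 2
```

SSH-style remotes such as `git@github.com:owner/repo.git` fail this check, which matches the HTTPS requirement in the steps above.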

### Official Extensions

| Extension | Model |
|-----------|-------|
| [modly-hunyuan3d-mini-extension](https://github.com/lightningpixel/modly-hunyuan3d-mini-extension) | Hunyuan3D 2 Mini |

---

## Python Backend API (FastAPI)

The backend runs locally. Key endpoints used by the Electron frontend:

```python
# Typical backend route patterns (api/main.py or similar)

# GET /extensions         — list installed extensions
# GET /extensions/{id}    — get extension details + variants
# POST /extensions/install — install extension from GitHub URL
# POST /generate          — trigger 3D generation
# GET /generate/status    — poll generation progress
# GET /models             — list downloaded model variants
# POST /models/download   — download a model variant
```
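The routes above can also be exercised from a script while debugging. The sketch below builds a `POST /generate` request with the Python standard library. The base URL and port are hypothetical. The `image_path`, `extension_id`, and `variant_id` field names are taken from the IPC handler example in this document, and the response format is not confirmed by the source.

```python
import json
import urllib.request


def build_generate_request(
    base_url: str, image_path: str, extension_id: str, variant_id: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /generate request for the local backend."""
    payload = {
        "image_path": image_path,
        "extension_id": extension_id,
        "variant_id": variant_id,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical base URL: replace the port with the one your backend listens on.
request = build_generate_request(
    "http://localhost:8000", "/tmp/chair.png", "my-extension-id", "model-small"
)
print(request.get_method(), request.full_url)
```

With the app running, `urllib.request.urlopen(request)` sends the request. Building the request separately keeps the payload easy to inspect first.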

### Calling the backend from Electron (IPC pattern)

```typescript
// src/preload/index.ts — exposing backend calls to renderer
import { contextBridge, ipcRenderer } from 'electron'

contextBridge.exposeInMainWorld('modly', {
  generate: (imagePath: string, extensionId: string, variantId: string) =>
    ipcRenderer.invoke('generate', { imagePath, extensionId, variantId }),

  installExtension: (repoUrl: string) =>
    ipcRenderer.invoke('install-extension', { repoUrl }),

  listExtensions: () =>
    ipcRenderer.invoke('list-extensions'),
})
```

```typescript
// src/main/ipc-handlers.ts — main process handling
import { ipcMain } from 'electron'

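// PORT is a placeholder: substitute the port your local backend listens on.
ipcMain.handle('generate', async (_event, { imagePath, extensionId, variantId }) => {
  const response = await fetch('http://localhost:PORT/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ image_path: imagePath, extension_id: extensionId, variant_id: variantId }),
  })
  if (!response.ok) {
    // Surface HTTP failures in the same shape the renderer already handles.
    return { success: false, error: `Backend returned HTTP ${response.status}` }
  }
  return response.json()
})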
```

```typescript
// src/renderer/components/GenerateButton.tsx — UI usage
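// Shape of an installed extension (fields assumed from the manifest example).
interface Extension {
  id: string
  name: string
  description: string
}

declare global {
  interface Window {
    modly: {
      generate: (imagePath: string, extensionId: string, variantId: string) => Promise<{ success: boolean; output_file?: string; error?: string }>
      installExtension: (repoUrl: string) => Promise<{ success: boolean }>
      listExtensions: () => Promise<Extension[]>
    }
  }
}

// Marks this file as a module, which `declare global` requires.
export {}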

async function handleGenerate(imagePath: string) {
  const result = await window.modly.generate(
    imagePath,
    'modly-hunyuan3d-mini-extension',
    'hunyuan3d-mini-turbo'
  )

  if (result.success) {
    console.log('Mesh saved to:', result.output_file)
  } else {
    console.error('Generation failed:', result.error)
  }
}
```

---

## Writing a Custom Extension

### Minimal extension repository structure

```
my-modly-extension/
├── manifest.json
└── generator.py
```

### Example: wrapping a Hugging Face diffusion model

```python
# generator.py
import torch
from PIL import Image
from pathlib import Path

def generate(image_path, output_path, variant_id, models_dir, **kwargs):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    weights_dir = Path(models_dir) / variant_id

    try:
        # Load model (example pattern)
        from your_model_lib import ImageTo3DPipeline
        
        pipe = ImageTo3DPipeline.from_pretrained(
            str(weights_dir),
            torch_dtype=torch.float16
        ).to(device)

        image = Image.open(image_path).convert("RGB")
        
        with torch.no_grad():
            mesh = pipe(image).mesh

        mesh.export(output_path)

        return {"success": True, "output_file": output_path}

    except Exception as e:
        return {"success": False, "error": str(e)}
```

---

## Configuration & Environment

Modly runs fully locally — no environment variables or API keys needed. GPU/CUDA is auto-detected by PyTorch in extensions.

Relevant configuration lives in:

```
package.json          # Electron app metadata, build targets
api/requirements.txt  # Python dependencies for backend
```

If you need to configure the backend port or extension directory, check the Electron main process config (typically `src/main/index.ts`) for constants like `API_PORT` or `EXTENSIONS_DIR`.

---

## Common Patterns

### Check if CUDA is available in an extension

```python
import torch

def get_device():
    if torch.cuda.is_available():
        print(f"Using GPU: {torch.cuda.get_device_name(0)}")
        return "cuda"
    print("No GPU found, falling back to CPU (slow)")
    return "cpu"
```

### Progress reporting from generator.py

```python
import json

def report_progress(percent: int, message: str):
    """Write progress to stdout so Modly can display it."""
    print(json.dumps({"progress": percent, "message": message}), flush=True)

def generate(image_path, output_path, variant_id, models_dir, **kwargs):
    report_progress(0, "Loading model...")
    # ... load model ...
    report_progress(30, "Processing image...")
    # ... inference ...
    report_progress(90, "Exporting mesh...")
    # ... export ...
    report_progress(100, "Done")
    return {"success": True, "output_file": output_path}
```
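On the receiving side, progress lines have to be separated from ordinary log output that libraries also print to stdout. The parser below is a sketch of one way to do that. `parse_progress_line` is a hypothetical helper, and how Modly actually consumes these lines is not shown in the source.

```python
import json
from typing import Optional, Tuple


def parse_progress_line(line: str) -> Optional[Tuple[int, str]]:
    """Return (percent, message) for a progress line, or None for any other output."""
    try:
        data = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or "progress" not in data:
        return None
    try:
        percent = int(data["progress"])
    except (TypeError, ValueError):
        return None
    # Clamp to 0..100 so a misbehaving extension cannot break a progress bar.
    return max(0, min(100, percent)), str(data.get("message", ""))


lines = [
    '{"progress": 30, "message": "Processing image..."}',
    "Loading checkpoint shards",  # ordinary log output, ignored
    '{"progress": 100, "message": "Done"}',
]
updates = [update for update in map(parse_progress_line, lines) if update is not None]
```

Emitting one JSON object per line, as `report_progress` does, is what makes this line-by-line filtering possible.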

### Adding a new page in the renderer (React)

```typescript
// src/renderer/pages/MyPage.tsx
import React, { useEffect, useState } from 'react'

interface Extension {
  id: string
  name: string
  description: string
}

export default function MyPage() {
  const [extensions, setExtensions] = useState<Extension[]>([])

  useEffect(() => {
    window.modly.listExtensions().then(setExtensions)
  }, [])

  return (
    <div>
      <h1>Installed Extensions</h1>
      {extensions.map(ext => (
        <div key={ext.id}>
          <h2>{ext.name}</h2>
          <p>{ext.description}</p>
        </div>
      ))}
    </div>
  )
}
```

---

## Troubleshooting

| Problem | Fix |
|---------|-----|
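| `npm run dev`: Python backend not starting | Ensure the venv exists and is activated before installing dependencies: `cd api && python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt` (on Windows, activate with `.venv\Scripts\activate`) |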
| CUDA out of memory | Use a smaller model variant or close other GPU processes |
| Extension install fails | Verify the GitHub URL is HTTPS and the repo contains `manifest.json` at root |
| Generation hangs | Check that your GPU drivers and CUDA toolkit match the PyTorch version in `requirements.txt` |
| App won't launch on Linux | Make `launcher.sh` executable: `chmod +x launcher.sh` |
| Model download stalls | Check disk space; large models (4–10 GB) need adequate free space |
| `torch` not found in extension | Ensure PyTorch is in `api/requirements.txt`, not just the extension's own deps |

### Verifying GPU is detected

```bash
cd api
source .venv/bin/activate   # or .venv\Scripts\activate on Windows
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'no GPU')"
```
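For the "Model download stalls" row in the table above, free disk space can be checked before starting a download. This is a minimal sketch using the standard library. `has_room_for` and its `headroom_gb` margin are hypothetical, and it assumes a variant's `size_gb` from `manifest.json` is measured in GiB.

```python
import shutil


def has_room_for(size_gb: float, directory: str, headroom_gb: float = 1.0) -> bool:
    """Return True if `directory` has at least size_gb + headroom_gb of free space."""
    free_gb = shutil.disk_usage(directory).free / 1024**3
    return free_gb >= size_gb + headroom_gb
```

For example, `has_room_for(4.2, "/path/to/models")` checks whether a 4.2 GB variant fits in the models directory with 1 GB to spare.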

---

## Resources

- **Homepage**: https://modly3d.app
- **Releases**: https://github.com/lightningpixel/modly/releases/latest
- **Official extension**: https://github.com/lightningpixel/modly-hunyuan3d-mini-extension
- **Discord**: https://discord.gg/FjzjRgweVk
- **License**: MIT (attribution required — credit Modly + Lightning Pixel in forks)
