openless-voice-input

OpenLess open-source voice input for macOS & Windows — press a hotkey, speak, get AI-polished text inserted at your cursor in any app.

22 stars

byAradotso

View on GitHub Installation ↓

Best use case

openless-voice-input is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

OpenLess open-source voice input for macOS & Windows — press a hotkey, speak, get AI-polished text inserted at your cursor in any app.

Teams using openless-voice-input should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/openless-voice-input/SKILL.md --create-dirs "https://raw.githubusercontent.com/Aradotso/trending-skills/main/skills/openless-voice-input/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/openless-voice-input/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How openless-voice-input Compares

Feature / Agent	openless-voice-input	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

OpenLess open-source voice input for macOS & Windows — press a hotkey, speak, get AI-polished text inserted at your cursor in any app.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# OpenLess Voice Input

> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.

OpenLess is a cross-platform (macOS 12+, Windows 10+) voice-input app built with Tauri 2 + Rust + React/TypeScript. Press a global hotkey, speak, release — the app records audio, transcribes via Volcengine streaming ASR or Whisper, polishes the transcript with an LLM, and inserts the result at the active cursor in any app. It is a fully open-source alternative to Typeless, Wispr Flow, and Superwhisper.

---

## Installation (End Users)

### macOS
1. Download `OpenLess_<version>_aarch64.dmg` from [Releases](https://github.com/appergb/openless/releases/latest).
2. Open the DMG, drag `OpenLess.app` to `/Applications`.
3. Launch, grant **Microphone** and **Accessibility** permissions when prompted.
4. **Quit and reopen** — Accessibility only takes effect after a restart.
5. Open Settings → fill in ASR + LLM credentials.

### Windows
1. Download `OpenLess_<version>_x64-setup.exe` from Releases.
2. Run the installer.
3. Grant Microphone access when prompted.
4. Open Settings → Permissions → verify the global hotkey listener is active.
5. Fill in ASR + LLM credentials in Settings.

---

## Build from Source (Developers)

### Prerequisites
- Node.js 18+, npm
- Rust 1.77+ (`rustup`)
- Tauri CLI v2 (`cargo install tauri-cli --version "^2"`)
- macOS: Xcode Command Line Tools
- Windows: MSVC build tools or MinGW (see `openless-all/README.md`)

### Steps

```bash
git clone https://github.com/appergb/openless.git
cd openless/openless-all/app

npm ci

# Development (Vite at :1420 + Tauri shell with hot reload)
npm run tauri dev

# Production build — macOS (signs, installs, resets TCC)
./scripts/build-mac.sh

# Build only, skip install step
INSTALL=0 ./scripts/build-mac.sh

# Rust type-check without full compile
cargo check --manifest-path src-tauri/Cargo.toml

# Frontend TypeScript type-check
npm run build
```

### Log locations
- **macOS**: `~/Library/Logs/OpenLess/openless.log`
- **Windows**: `%LOCALAPPDATA%\OpenLess\Logs\openless.log`

---

## Configuration & Credentials

Credentials are stored in the platform Keychain (service = `com.openless.app`). A plaintext fallback is written to `~/.openless/credentials.json` (mode `0600`) when Keychain is unavailable in dev mode.

**Never commit API keys.** Reference them via environment variables or enter them in the Settings UI.

### Required credentials

| Key | Where to get it |
|-----|----------------|
| Volcengine ASR APP ID | Volcengine console → Speech Recognition |
| Volcengine ASR Access Token | Same console |
| Volcengine ASR Resource ID | Same console |
| Ark/LLM API Key | Volcengine Ark console or any OpenAI-compatible provider |
| Ark Model ID | e.g. `ep-XXXXXXXX` or a DeepSeek model ID |
| Ark Endpoint | Default: `https://ark.cn-beijing.volces.com/api/v3/chat/completions` |

### Setting credentials programmatically (Rust, for tests/CI)

```rust
// src-tauri/src/persistence.rs pattern
use crate::types::Credentials;

let creds = Credentials {
    asr_app_id: std::env::var("OPENLESS_ASR_APP_ID").unwrap(),
    asr_token: std::env::var("OPENLESS_ASR_TOKEN").unwrap(),
    asr_resource_id: std::env::var("OPENLESS_ASR_RESOURCE_ID").unwrap(),
    ark_api_key: std::env::var("OPENLESS_ARK_API_KEY").unwrap(),
    ark_model_id: std::env::var("OPENLESS_ARK_MODEL_ID").unwrap(),
    ark_endpoint: std::env::var("OPENLESS_ARK_ENDPOINT")
        .unwrap_or_else(|_| "https://ark.cn-beijing.volces.com/api/v3/chat/completions".to_string()),
};
// Pass to coordinator via Tauri state
```

---

## Architecture Overview

```
openless-all/app/
├── src/                    # React/TypeScript frontend
│   ├── pages/
│   │   ├── _atoms.tsx      # Recoil global state atoms
│   │   ├── Home.tsx
│   │   ├── History.tsx
│   │   ├── Dictionary.tsx
│   │   └── Settings.tsx
│   └── lib/
│       └── ipc.ts          # All Tauri invoke() calls (IPC surface)
└── src-tauri/src/          # Rust backend
    ├── types.rs            # Value types: DictationSession, PolishMode, errors
    ├── hotkey.rs           # CGEventTap (macOS) / WH_KEYBOARD_LL (Windows)
    ├── recorder.rs         # Mic → 16 kHz mono Int16 PCM + RMS callback
    ├── asr/                # Volcengine WebSocket ASR + Whisper HTTP
    ├── polish.rs           # OpenAI-compatible chat-completions
    ├── insertion.rs        # AX focused-element → clipboard+paste → copy fallback
    ├── persistence.rs      # History / prefs / vocab JSON + Keychain
    ├── permissions.rs      # TCC checks
    ├── coordinator.rs      # State machine: Idle→Starting→Listening→Processing
    └── commands.rs         # Tauri #[tauri::command] IPC surface
```

### Dictation pipeline

```
hotkey DOWN
  → coordinator: Idle → Starting → Listening
  → recorder.start() + asr.open_session()
  → [audio frames streamed to ASR via WebSocket]

hotkey UP
  → recorder.stop() + asr.send_last_frame()
  → coordinator: Listening → Processing
  → polish(transcript, mode) → LLM API call
  → insertion.insert_at_cursor(polished_text)
      ├─ AX focused element write (macOS Accessibility API)
      ├─ clipboard + Cmd+V / Ctrl+V paste
      └─ copy-only fallback (text in clipboard, user pastes manually)
  → history.save(session)
  → coordinator: Processing → Idle
```

`Esc` cancels at **any** phase including polish/insert.

---

## Polish Modes

| Mode | Tauri enum | Behaviour |
|------|-----------|-----------|
| Raw | `PolishMode::Raw` | Transcript verbatim, no LLM call |
| Light | `PolishMode::Light` | Remove filler words, fix punctuation |
| Structured | `PolishMode::Structured` | **AI-prompt mode** — reshapes speech into a structured, context-rich prompt |
| Formal | `PolishMode::Formal` | Formal prose, fixes grammar, organises paragraphs |

### Structured mode — what it does

Input (spoken): *"uh so I want ChatGPT to write me a SQL query from the orders table get last month's orders group by customer sort by amount desc top ten"*

Output inserted at cursor:
```text
Please write a SQL query that:

- Pulls orders from last month from the `orders` table.
- Groups by customer.
- Sorts by total amount, descending.
- Returns the top 10 rows only.
```

**Key invariant**: the LLM only *reshapes* your text. It does not answer questions or execute commands. If you say "what features are missing?", the output is `What features are missing?` — not a feature list.

---

## IPC Surface (Frontend → Backend)

All calls go through `src/lib/ipc.ts` using Tauri's `invoke()`.

```typescript
// src/lib/ipc.ts — representative subset

import { invoke } from "@tauri-apps/api/core";
import type { DictationSession, PolishMode, HotkeyBinding } from "./types";

// Save credentials (written to Keychain)
export async function saveCredentials(creds: {
  asrAppId: string;
  asrToken: string;
  asrResourceId: string;
  arkApiKey: string;
  arkModelId: string;
  arkEndpoint: string;
}): Promise<void> {
  return invoke("save_credentials", { creds });
}

// Load saved credentials (for Settings UI pre-fill)
export async function loadCredentials(): Promise<typeof creds | null> {
  return invoke("load_credentials");
}

// Get dictation history
export async function getHistory(): Promise<DictationSession[]> {
  return invoke("get_history");
}

// Set active polish mode
export async function setPolishMode(mode: PolishMode): Promise<void> {
  return invoke("set_polish_mode", { mode });
}

// Update hotkey binding
export async function setHotkey(binding: HotkeyBinding): Promise<void> {
  return invoke("set_hotkey", { binding });
}

// Check platform permissions
export async function checkPermissions(): Promise<{
  microphone: boolean;
  accessibility: boolean;
}> {
  return invoke("check_permissions");
}

// Add a vocabulary/dictionary entry
export async function addVocabEntry(entry: {
  word: string;
  category: string;
  notes: string;
}): Promise<void> {
  return invoke("add_vocab_entry", { entry });
}
```

---

## Recoil State Atoms

```typescript
// src/pages/_atoms.tsx — key atoms

import { atom, selector } from "recoil";
import type { DictationSession, PolishMode } from "../lib/types";

export const polishModeAtom = atom<PolishMode>({
  key: "polishMode",
  default: "Structured",
});

export const historyAtom = atom<DictationSession[]>({
  key: "history",
  default: [],
});

export const recordingStateAtom = atom<
  "idle" | "starting" | "listening" | "processing"
>({
  key: "recordingState",
  default: "idle",
});

export const rmsLevelAtom = atom<number>({
  key: "rmsLevel",
  default: 0,
});
```

---

## Adding a Custom Polish Mode (Rust)

```rust
// src-tauri/src/types.rs
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub enum PolishMode {
    Raw,
    Light,
    Structured,
    Formal,
    // Add your mode:
    Technical,
}

// src-tauri/src/polish.rs
fn build_system_prompt(mode: &PolishMode) -> &'static str {
    match mode {
        PolishMode::Raw => "",
        PolishMode::Light => LIGHT_PROMPT,
        PolishMode::Structured => STRUCTURED_PROMPT,
        PolishMode::Formal => FORMAL_PROMPT,
        PolishMode::Technical => {
            "You are a technical writer. Convert the spoken transcript into \
             precise technical documentation prose. Use correct terminology. \
             Do not answer questions — output the cleaned text only, no preamble."
        }
    }
}
```

---

## Dictionary / Vocabulary

Dictionary entries are sent as Volcengine ASR `context.hotwords` (improving transcription accuracy) and injected into the polish prompt (the LLM applies context-aware substitution).

### Adding entries via UI
Settings → Dictionary tab → **New** button → fill Word, Category, Notes → Save.

### Adding entries via IPC (programmatically)

```typescript
import { addVocabEntry } from "../lib/ipc";

await addVocabEntry({
  word: "OpenLess",
  category: "product",
  notes: "Open-source voice input app, not 'open list' or 'open less'",
});
```

### How hotword injection works (Rust)

```rust
// src-tauri/src/asr/volcengine.rs (simplified)
async fn open_session(
    &self,
    vocab: &[VocabEntry],
    config: &AsrConfig,
) -> Result<AsrSession> {
    let hotwords: Vec<String> = vocab
        .iter()
        .filter(|e| e.enabled)
        .map(|e| e.word.clone())
        .collect();

    let payload = serde_json::json!({
        "app": { "appid": config.app_id, "token": config.token },
        "audio": { "format": "pcm", "sample_rate": 16000, "bits": 16, "channel": 1 },
        "request": {
            "model_name": config.resource_id,
            "context": { "hotwords": hotwords }
        }
    });
    // open WebSocket, send payload ...
}
```

---

## Hotkey Configuration

### Recording modes
- **Push-to-talk**: hold hotkey → recording → release → process
- **Toggle**: press once to start, press again to stop

### Changing the hotkey (TypeScript)

```typescript
import { setHotkey } from "../lib/ipc";
import type { HotkeyBinding } from "../lib/types";

// Example: Right Option key, push-to-talk mode
const binding: HotkeyBinding = {
  key: "AltRight",
  modifiers: [],
  mode: "PushToTalk",
};
await setHotkey(binding);
```

### Platform hotkey internals (Rust)

```rust
// src-tauri/src/hotkey.rs (macOS path — CGEventTap)
#[cfg(target_os = "macos")]
pub fn register_global_hotkey(
    binding: HotkeyBinding,
    on_press: impl Fn() + Send + 'static,
    on_release: impl Fn() + Send + 'static,
) -> Result<HotkeyHandle> {
    // Uses CGEventTap — requires Accessibility permission
    // Spawns a dedicated CFRunLoop thread
    // Sends Tauri events: "hotkey-press" / "hotkey-release"
    todo!("see hotkey.rs for full implementation")
}

#[cfg(target_os = "windows")]
pub fn register_global_hotkey(/* ... */) -> Result<HotkeyHandle> {
    // Uses SetWindowsHookExA(WH_KEYBOARD_LL, ...)
    todo!("see hotkey.rs for full implementation")
}
```

---

## Coordinator State Machine

```rust
// src-tauri/src/coordinator.rs
#[derive(Debug, Clone, PartialEq)]
pub enum RecordingState {
    Idle,
    Starting,
    Listening,
    Processing,
}

// Transitions:
// Idle      --[hotkey press]-->  Starting
// Starting  --[ASR ready]---->  Listening
// Listening --[hotkey release]-> Processing
// Listening --[Esc]----------->  Idle        (cancel)
// Processing--[done / Esc]---->  Idle
```

Listen for state changes on the frontend:

```typescript
import { listen } from "@tauri-apps/api/event";
import { useSetRecoilState } from "recoil";
import { recordingStateAtom } from "./_atoms";

const setRecordingState = useSetRecoilState(recordingStateAtom);

useEffect(() => {
  const unlisten = listen<string>("recording-state-changed", (event) => {
    setRecordingState(event.payload as any);
  });
  return () => { unlisten.then(fn => fn()); };
}, []);
```

---

## Troubleshooting

### Text not being inserted at cursor

1. **macOS**: Accessibility permission not granted or not restarted after granting.
   - System Settings → Privacy & Security → Accessibility → enable OpenLess → **quit and reopen**.
2. **Windows**: Some apps (e.g. UWP, sandboxed Electron) block synthetic input — OpenLess falls back to clipboard copy. Check the tray notification.
3. Verify insertion fallback chain: AX write → clipboard+paste → copy-only.

### Hotkey not responding

- **macOS**: Ensure Accessibility permission is active. CGEventTap silently fails without it.
- **Windows**: Another app may have grabbed the same key combo. Change the hotkey in Settings.
- Only one OpenLess instance can run (single-instance lock). Check Activity Monitor / Task Manager.

### ASR returns empty / garbled transcript

- Verify Volcengine ASR credentials: APP ID, Access Token, Resource ID all correct.
- Check microphone sample rate — OpenLess records at 16 kHz mono Int16 PCM. Some USB mics need explicit configuration.
- Check `openless.log` for WebSocket errors.

### LLM polish not working / timeout

- Verify Ark API Key and Model ID.
- Confirm endpoint is reachable: `curl -s $OPENLESS_ARK_ENDPOINT` (should return 401, not timeout).
- DeepSeek / OpenAI-compatible endpoints work — set the endpoint URL in Settings.

### Build fails on macOS: `codesign` error

```bash
# Skip signing for local dev build
CODESIGN_IDENTITY="" INSTALL=0 ./scripts/build-mac.sh
```

### Cargo check errors after pulling

```bash
cd openless-all/app
cargo check --manifest-path src-tauri/Cargo.toml
# Look for changed feature flags in Cargo.toml
# Run `cargo update` if lock file is stale
```

---

## Key Files Reference

| File | Purpose |
|------|---------|
| `src-tauri/src/coordinator.rs` | Master state machine — start here to understand the flow |
| `src-tauri/src/commands.rs` | All `#[tauri::command]` IPC endpoints |
| `src-tauri/src/polish.rs` | LLM prompt construction and API calls |
| `src-tauri/src/asr/` | Volcengine WebSocket ASR + Whisper HTTP |
| `src-tauri/src/insertion.rs` | Cursor insertion + clipboard fallback |
| `src/lib/ipc.ts` | Frontend IPC surface — all `invoke()` calls |
| `src/pages/_atoms.tsx` | Recoil global state |
| `CLAUDE.md` | Module-wiring invariants for AI coding agents |
| `docs/polish-reference-corpus.md` | Polish example corpus design |
| `USAGE.md` | Full end-user walkthrough |

Related Skills

type4me-macos-voice-input

from Aradotso/trending-skills

MacOS voice input tool with local/cloud ASR engines, LLM text optimization, and fully local storage built in Swift

omnivoice-tts

from Aradotso/trending-skills

Expert skill for OmniVoice, a massively multilingual zero-shot TTS model supporting 600+ languages with voice cloning and voice design capabilities.

```markdown

from Aradotso/trending-skills

---

zeroboot-vm-sandbox

from Aradotso/trending-skills

Sub-millisecond VM sandboxes for AI agents using copy-on-write KVM forking via Zeroboot

yourvpndead-vpn-detection

from Aradotso/trending-skills

Android app that detects VPN/proxy servers (VLESS/xray/sing-box) via local SOCKS5 vulnerability, exposing exit IPs and server configs without root

xata-postgres-platform

from Aradotso/trending-skills

Expert skill for Xata open-source cloud-native Postgres platform with copy-on-write branching, scale-to-zero, and Kubernetes deployment

x-mentor-skill-nuwa

from Aradotso/trending-skills

AI-powered X (Twitter) content strategy skill that distills methodologies from 6 top creators + open-source algorithm data into actionable writing, growth, and monetization guidance.

wx-favorites-report

from Aradotso/trending-skills

End-to-end pipeline to extract, decrypt, and visualize WeChat Mac favorites from encrypted SQLite DB into an interactive HTML report.

wterm-web-terminal

from Aradotso/trending-skills

Web terminal emulator with Zig/WASM core, DOM rendering, and React/vanilla JS bindings

worldmonitor-intelligence-dashboard

from Aradotso/trending-skills

Real-time global intelligence dashboard with AI-powered news aggregation, geopolitical monitoring, and infrastructure tracking

witr-process-inspector

from Aradotso/trending-skills

CLI and TUI tool that explains why processes, services, and ports are running by tracing causality chains across supervisors, containers, and shells.

wildworld-dataset

from Aradotso/trending-skills

WildWorld large-scale action-conditioned world modeling dataset with 108M+ frames from a photorealistic ARPG game, featuring per-frame annotations, 450+ actions, and explicit state information for generative world modeling research.