This file provides guidance to AI coding assistants working with code in this repository.
Prerequisites:
Core Development:
# Install dependencies
bun install
# Run in development mode
bun run tauri dev
# If cmake error on macOS:
CMAKE_POLICY_VERSION_MINIMUM=3.5 bun run tauri dev
# Build for production
bun run tauri build
# Frontend only development
bun run dev # Start Vite dev server
bun run build # Build frontend (TypeScript + Vite)
bun run preview # Preview built frontendLinting and Formatting (run before committing):
bun run lint # ESLint for frontend
bun run lint:fix # ESLint with auto-fix
bun run format # Prettier + cargo fmt
bun run format:check # Check formatting without changes
bun run format:frontend # Prettier only
bun run format:backend # cargo fmt onlyModel Setup (Required for Development):
mkdir -p src-tauri/resources/models
curl -o src-tauri/resources/models/silero_vad_v4.onnx https://blob.handy.computer/silero_vad_v4.onnxFor detailed platform-specific build setup, see BUILD.md.
Handy is a cross-platform desktop speech-to-text application built with Tauri 2.x (Rust backend + React/TypeScript frontend).
lib.rs- Main entry point, Tauri setup, manager initializationmanagers/- Core business logic:audio.rs- Audio recording and device managementmodel.rs- Model downloading and managementtranscription.rs- Speech-to-text processing pipelinehistory.rs- Transcription history storage
audio_toolkit/- Low-level audio processing:audio/- Device enumeration, recording, resamplingvad/- Voice Activity Detection (Silero VAD)
commands/- Tauri command handlers for frontend communicationcli.rs- CLI argument definitions (clap derive)shortcut.rs- Global keyboard shortcut handlingsettings.rs- Application settings managementoverlay.rs- Recording overlay window (platform-specific)signal_handle.rs-send_transcription_input()reusable functionutils.rs- Platform detection helpers
App.tsx- Main component with onboarding flowcomponents/- React UI components:settings/- Settings UImodel-selector/- Model management interfaceonboarding/- First-run experienceoverlay/- Recording overlay UIupdate-checker/- App update notificationsshared/,ui/,icons/,footer/- Shared components
hooks/useSettings.ts- Settings state management hookstores/settingsStore.ts- Zustand store for settingsbindings.ts- Auto-generated Tauri type bindings (via tauri-specta)overlay/- Recording overlay window entry pointlib/types.ts- Shared TypeScript type definitions
Manager Pattern: Core functionality organized into managers (Audio, Model, Transcription) initialized at startup and managed via Tauri state.
Command-Event Architecture: Frontend → Backend via Tauri commands; Backend → Frontend via events.
Pipeline Processing: Audio → VAD → Whisper/Parakeet → Text output → Clipboard/Paste
State Flow: Zustand → Tauri Command → Rust State → Persistence (tauri-plugin-store)
Core Libraries:
whisper-rs- Local Whisper inference with GPU accelerationcpal- Cross-platform audio I/Ovad-rs- Voice Activity Detectionrdev- Global keyboard shortcutsrubato- Audio resamplingrodio- Audio playback for feedback sounds
- Initialization: App starts minimized to tray, loads settings, initializes managers
- Model Setup: First-run downloads preferred Whisper model (Small/Medium/Turbo/Large)
- Recording: Global shortcut triggers audio recording with VAD filtering
- Processing: Audio sent to Whisper model for transcription
- Output: Text pasted to active application via system clipboard
Settings are stored using Tauri's store plugin with reactive updates:
- Keyboard shortcuts (configurable, supports push-to-talk)
- Audio devices (microphone/output selection)
- Model preferences (Small/Medium/Turbo/Large Whisper variants)
- Audio feedback and translation options
The app enforces single instance behavior — launching when already running brings the settings window to front rather than creating a new process. Remote control flags (--toggle-transcription, etc.) work by launching a second instance that sends args to the running instance via tauri_plugin_single_instance, then exits.
All user-facing strings must use i18next translations. ESLint enforces this (no hardcoded strings in JSX).
Adding new text:
- Add key to
src/i18n/locales/en/translation.json - Use in component:
const { t } = useTranslation(); t('key.path')
File structure:
src/i18n/
├── index.ts # i18n setup
├── languages.ts # Language metadata
└── locales/
├── en/translation.json # English (source)
├── de/, es/, fr/, ja/, ru/, zh/, ...
└── ...
For translation contribution guidelines, see CONTRIBUTING_TRANSLATIONS.md.
Rust:
- Run
cargo fmtandcargo clippybefore committing - Handle errors explicitly (avoid unwrap in production)
- Use descriptive names, add doc comments for public APIs
TypeScript/React:
- Strict TypeScript, avoid
anytypes - Functional components with hooks
- Tailwind CSS for styling
- Path aliases:
@/→./src/
Use conventional commits: feat:, fix:, docs:, refactor:, chore:
Handy supports command-line parameters on all platforms for integration with scripts, window managers, and autostart configurations.
Implementation: cli.rs (definitions), main.rs (parsing), lib.rs (applying), signal_handle.rs (shared logic)
| Flag | Description |
|---|---|
--toggle-transcription |
Toggle recording on/off on a running instance |
--toggle-post-process |
Toggle recording with post-processing on/off |
--cancel |
Cancel the current operation on a running instance |
--start-hidden |
Launch without showing the main window (tray icon visible) |
--no-tray |
Launch without system tray (closing window quits the app) |
--debug |
Enable debug mode with verbose (Trace) logging |
Key design decisions:
- CLI flags are runtime-only overrides — they do NOT modify persisted settings
- Remote control flags work via
tauri_plugin_single_instance: second instance sends args, then exits send_transcription_input()insignal_handle.rsis shared between signal handlers and CLI
Access debug features: Cmd+Shift+D (macOS) or Ctrl+Shift+D (Windows/Linux)
- macOS: Metal acceleration, accessibility permissions required for keyboard shortcuts
- Windows: Vulkan acceleration, code signing
- Linux: OpenBLAS + Vulkan, limited Wayland support, overlay uses GTK layer shell (disable with
HANDY_NO_GTK_LAYER_SHELL=1)
See the Troubleshooting section in README.md.
Follow CONTRIBUTING.md for the full workflow and PR template when submitting pull requests. For translations, see CONTRIBUTING_TRANSLATIONS.md.
Note: Feature freeze is active — bug fixes are top priority. New features require community support via Discussions.