KeyNote

2026 · GitHub

A voice-controlled note-taking and audio-to-text test harness for local multimodal models. KeyNote combines push-to-talk recording, prompt modes, local SQLite storage, exports, clipboard workflows, and a terminal UI around a llama-server transcription loop.

What It Does

The project started as a way to test local audio-to-text capabilities in multimodal LLMs. It grew into a compact note system: notes can be created from recordings, appended through a separate hotkey, searched later, exported to Markdown, and transformed with reusable prompt modes.

System Pieces

Layer	Role	Implementation
Capture	Push-to-talk and long recordings	Global hotkeys, audio device selection, microphone or loopback input.
Processing	Local transcription and mode prompts	Requests to a local `llama-server`, with modes such as mail, Slack, transcript, and summarize.
Storage	Searchable local notes	SQLite-backed notes, metadata, active-note appends, and Markdown export.
Interface	CLI, TUI, and desktop overlay	Click commands, Textual screens, clipboard automation, and a compact mode/status overlay.

Design Notes

Balancing quick push-to-talk capture with longer recordings that need chunked processing.
Keeping prompt modes editable while still making them easy to switch with hotkeys.
Handling local-only data, clipboard paste, and app focus without turning the tool into a cloud service.
Making the same note store usable from both direct CLI commands and an interactive terminal UI.