Run FunASR on CPU — the llama.cpp / GGUF Runtime
FunASR on llama.cpp is to FunASR what whisper.cpp is to Whisper: it runs SenseVoice, Paraformer, and Fun-ASR-Nano on the ggml stack, so the models work where there is no GPU and no Python (laptops, edge boxes, embedded C/C++ apps). It complements the PyTorch / vLLM paths used for GPU serving. FSMN-VAD is built into the binaries.
Download prebuilt binaries (download & run)
Linux (x64/arm64), macOS (arm64), Windows (x64): static, self-contained, zero dependencies. Available in all three repos' Releases:
- ⬇ FunASR (this repo): all three models
- ⬇ FunAudioLLM/SenseVoice: multilingual + emotion/events
- ⬇ FunAudioLLM/Fun-ASR: Fun-ASR-Nano (LLM-ASR)
Three lines to run

    # 1) Unpack the binaries, fetch a model (downloads GGUF + VAD)
    bash download-funasr-model.sh sensevoice ./gguf
    # 2) Get text directly (in-binary detok, no Python)
    llama-funasr-sensevoice -m ./gguf/SenseVoiceSmall-f16.gguf --vad ./gguf/fsmn-vad.gguf -a audio.wav

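For more than one file, the same command can be looped over a directory. This is a minimal sketch, assuming the binary prints the transcript to stdout and exits non-zero on failure. `transcribe_dir`, the `FUNASR_*` variables, and the `.txt` output naming are hypothetical helpers for illustration, not part of the release.

```shell
# Defaults mirror the single-file command; override them via the environment.
: "${FUNASR_BIN:=llama-funasr-sensevoice}"
: "${FUNASR_MODEL:=./gguf/SenseVoiceSmall-f16.gguf}"
: "${FUNASR_VAD:=./gguf/fsmn-vad.gguf}"

# transcribe_dir DIR: write DIR/<name>.txt for every DIR/<name>.wav.
# Hypothetical helper; assumes the transcript is written to stdout.
transcribe_dir() {
  local dir="$1" wav
  for wav in "$dir"/*.wav; do
    [ -e "$wav" ] || continue   # the glob matched nothing
    "$FUNASR_BIN" -m "$FUNASR_MODEL" --vad "$FUNASR_VAD" -a "$wav" \
      > "${wav%.wav}.txt" || { echo "failed: $wav" >&2; return 1; }
  done
}
```

Usage would look like `transcribe_dir ./recordings`, which leaves one `.txt` next to each `.wav`. Stopping at the first failure is a design choice here; a long batch may prefer to log the error and continue.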
Other models: download-funasr-model.sh paraformer with llama-funasr-paraformer; download-funasr-model.sh nano with llama-funasr-cli (Fun-ASR-Nano, 31 languages).
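The pairing of download key and binary can be kept in one place when scripting several models. This is a small sketch: `funasr_pair` is a hypothetical helper, and only the three pairings named in this README are encoded. Whether the Paraformer and Fun-ASR-Nano binaries accept the same flags as the SenseVoice one is not stated here, so check each binary's help output before reusing a command line.

```shell
# funasr_pair MODEL: print "<download-key> <binary>" for a model family.
# Hypothetical helper; the pairings come from the text above.
funasr_pair() {
  case "$1" in
    sensevoice) echo "sensevoice llama-funasr-sensevoice" ;;
    paraformer) echo "paraformer llama-funasr-paraformer" ;;
    nano)       echo "nano llama-funasr-cli" ;;
    *)          echo "unknown model: $1" >&2; return 1 ;;
  esac
}
```

For example, `funasr_pair nano` prints `nano llama-funasr-cli`; the first word is the argument for `download-funasr-model.sh`, the second is the binary to run.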
Accuracy: far ahead of whisper.cpp on Chinese
Same 184-clip Chinese test set, character error rate (CER, micro-avg, lower is better):
| Model | CER ↓ | Notes |
|---|---|---|
| FunASR SenseVoice | 8.01% | multilingual + emotion/events |
| FunASR Paraformer | 9.85% | non-autoregressive, industrial Chinese |
| FunASR Fun-ASR-Nano | 8.30% | LLM-ASR, 31 languages |
| whisper.cpp small | 22.12% | |
| whisper.cpp large-v3-turbo | 23.15% | |
| whisper.cpp base | 31.33% | |
FunASR's Chinese CER (8.01% to 9.85%) is roughly a third to just under a half of whisper.cpp's best result on this set (22.12%, small). Full methodology in each repo's runtime/llama.cpp/BENCHMARKS.md.
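For readers unfamiliar with the metric, micro-averaged CER is usually computed by pooling edit counts over all clips before dividing, rather than averaging per-clip rates. The formula below assumes that standard definition; the symbols are introduced here for illustration, and the exact text normalization used for the table is described in BENCHMARKS.md, not here.

```latex
% S_i, D_i, I_i: character substitutions, deletions, insertions for clip i
% N_i: number of reference characters in clip i; 184 clips in total
\mathrm{CER}_{\text{micro}}
  = \frac{\sum_{i=1}^{184} \left( S_i + D_i + I_i \right)}{\sum_{i=1}^{184} N_i}

% Ratios against the best whisper.cpp row (small, 22.12%):
\frac{8.01}{22.12} \approx 0.36, \qquad
\frac{8.30}{22.12} \approx 0.38, \qquad
\frac{9.85}{22.12} \approx 0.45
```

Because the counts are pooled, long clips weigh more than short ones in the final figure.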
What's included
- 6 binaries: llama-funasr-{cli,encoder,embd,sensevoice,paraformer,vad}, static, no .so dependencies.
- Built-in FSMN-VAD (--vad), in-binary detokenization (prints text), kaldi-compatible fbank front end.
- GGUF models on Hugging Face: FunAudioLLM / funasr (f16/f32 with embedded vocab).
- Source & docs: runtime/llama.cpp/ (README / DESIGN / BENCHMARKS).
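After unpacking a release, a quick check that all six binaries are present can save a confusing "command not found" later. This is a minimal sketch, assuming a Linux or macOS archive whose binaries sit directly in one directory under the names listed above (Windows builds presumably carry an `.exe` suffix, which this does not handle). `missing_funasr_binaries` is a hypothetical helper.

```shell
# missing_funasr_binaries DIR: print the name of each expected binary
# that is absent or not executable in DIR. Prints nothing if all six exist.
missing_funasr_binaries() {
  local dir="$1" suffix
  for suffix in cli encoder embd sensevoice paraformer vad; do
    [ -x "$dir/llama-funasr-$suffix" ] || echo "llama-funasr-$suffix"
  done
  return 0
}
```

Running `missing_funasr_binaries ./bin` with an empty result means the unpack step completed; any printed name points at a binary to re-extract or `chmod +x`.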
If it helps, a GitHub Star really supports the project. Fully open-source, commercial-friendly.
Also star: SenseVoice · Fun-ASR · FunClip
Further reading: FunASR on llama.cpp (a whisper.cpp alternative) — deep dive