Transcribe Audio from the Command Line with FunASR (text / JSON / SRT)
Don't want to write Python? FunASR ships a command-line tool that turns audio into text, JSON, or SRT subtitles right in your terminal. Local, free, and especially strong on Chinese. Every command below is tested.
Install
pip install -U torch torchaudio
pip install -U funasr # requires funasr >= 1.3.10
Simplest: print text
funasr audio.wav
Prints the transcript to the terminal. Defaults to SenseVoice (non-autoregressive, very fast, 50+ languages).
Generate SRT subtitles
funasr audio.wav -f srt -o ./subs
Writes ./subs/audio.srt. Add --spk for speaker-split cues with real timestamps:
funasr meeting.wav --spk -f srt -o ./subs
1
00:00:02,919 --> 00:00:08,169
Hi everyone, let's talk about the Q3 plan.
2
00:00:10,029 --> 00:00:18,550
Quick status update - core features are 80% done.
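If you want to post-process the subtitles (for example, to measure how long each speaker turn lasts), the SRT file can be read back with a few lines of Python. This is a minimal sketch, not part of FunASR: `parse_srt` and `TIMING` are hypothetical names, and it assumes the file follows the usual SRT layout of index line, timing line, then text, with cues separated by blank lines.

```python
import re

# Matches an SRT timing line such as "00:00:02,919 --> 00:00:08,169".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, cue_text) tuples."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for idx, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the cue text.
                cues.append((start, end, " ".join(lines[idx + 1:]).strip()))
                break
    return cues
```

For example, `parse_srt(open("./subs/audio.srt", encoding="utf-8").read())` would give one tuple per cue, from which `end - start` is the cue duration in seconds.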
Structured JSON
funasr audio.wav -f json
{
  "text": "Hi everyone let's talk about the Q3 plan...",
  "file": "audio.wav",
  "model": "sensevoice",
  "language": "auto",
  "audio_duration_s": 59.52,
  "processing_s": 2.17
}
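The `audio_duration_s` and `processing_s` fields make it easy to check how fast a run was. The sketch below parses one JSON result and derives the real-time factor; `summarize` is a hypothetical helper, and it assumes the JSON text has the fields shown above (for instance, captured from the command's standard output).

```python
import json


def summarize(json_text):
    """Parse one JSON result and derive speed figures from it."""
    result = json.loads(json_text)
    duration = result["audio_duration_s"]  # length of the audio, in seconds
    elapsed = result["processing_s"]       # time spent transcribing, in seconds
    return {
        "text": result["text"],
        # Real-time factor: below 1.0 means faster than real time.
        "real_time_factor": elapsed / duration,
        # How many seconds of audio were processed per second of compute.
        "speedup": duration / elapsed,
    }
```

With the sample above, 59.52 s of audio processed in 2.17 s works out to a speedup of roughly 27x.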
Common options
| Command | What it does |
|---|---|
| -f text/json/srt/tsv | Output format (default text) |
| --spk | Speaker diarization (who spoke when) |
| --model sensevoice/paraformer/fun-asr-nano | Pick a model (fun-asr-nano = highest accuracy) |
| --hotwords "term,jargon" | Hotwords to boost rare terms |
| -o ./out | Output directory |
| funasr a.wav b.wav | Transcribe multiple files at once |
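When the CLI is driven from a script, it can help to assemble the argument list in one place. The following is a rough sketch that uses only the options from the table above; `build_command` is a hypothetical helper, and it assumes the flags can be freely combined after the file names, as in the `--spk -f srt -o ./subs` example earlier.

```python
def build_command(files, fmt="text", spk=False, model=None,
                  hotwords=None, out_dir=None):
    """Build an argv list for the funasr CLI from the options in the table."""
    cmd = ["funasr", *files, "-f", fmt]
    if spk:
        cmd.append("--spk")                      # speaker diarization
    if model:
        cmd += ["--model", model]                # e.g. "paraformer"
    if hotwords:
        # The table shows hotwords as one comma-separated string.
        cmd += ["--hotwords", ",".join(hotwords)]
    if out_dir:
        cmd += ["-o", out_dir]                   # output directory
    return cmd


# Usage (requires funasr to be installed and on PATH):
#   import subprocess
#   subprocess.run(build_command(["meeting.wav"], fmt="srt", spk=True,
#                                out_dir="./subs"), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names or hotwords that contain spaces.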
Deploy as an API server (OpenAI-compatible)
funasr-server --device cuda # localhost:8000, POST /v1/audio/transcriptions
Call it with any OpenAI SDK:
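from openai import OpenAI
# api_key is a placeholder; the local server is assumed not to check it
client = OpenAI(base_url="http://localhost:8000/v1", api_key="x")
with open("audio.wav", "rb") as audio:  # close the file once the upload is done
    r = client.audio.transcriptions.create(model="sensevoice", file=audio)
print(r.text)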
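To send several files to the server, the same call can be wrapped in a small loop. This is a sketch under assumptions: `transcribe_files` is a hypothetical helper, it relies only on the `client.audio.transcriptions.create(model=..., file=...)` call shown above, and it assumes the server returns an object with a `.text` attribute for each request.

```python
from pathlib import Path


def transcribe_files(client, paths, model="sensevoice"):
    """Send each file to the server and return {file name: transcript}."""
    results = {}
    for path in map(Path, paths):
        # Open each file only for the duration of its upload.
        with path.open("rb") as audio:
            response = client.audio.transcriptions.create(model=model, file=audio)
        results[path.name] = response.text
    return results


# Usage, assuming funasr-server is running on localhost:8000:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8000/v1", api_key="x")
#   print(transcribe_files(client, ["a.wav", "b.wav"]))
```

Results are keyed by file name, so two inputs with the same name in different directories would overwrite each other; key by the full path if that matters.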
Why the FunASR CLI
- Local, free, private - no internet, no API key.
- Fast: SenseVoice is non-autoregressive, far faster than Whisper (benchmark) - real-time even on CPU.
- Strong on Chinese, with 50+ languages supported; subtitles, speaker labels, hotwords, and JSON output work out of the box.
FunASR is Tongyi Lab's open-source, industrial-grade speech recognition toolkit.
Star FunASR on GitHub ★