Audio Chunking for Long-Form Transcription: Splitting and Stitching with ffmpeg + TypeScript

This content originally appeared on DEV Community and was authored by nareshipme

APIs that do speech-to-text — Groq Whisper, OpenAI Whisper, and friends — all have one thing in common: a file size limit. Groq's hard cap is 25MB. A typical one-hour interview at decent quality can easily be 80–150MB. If you just try to send that, you'll get a 413 or a rate-limit error before the transcription even starts.

The fix is chunking: split the audio into manageable pieces, transcribe each one, then stitch the results back together — with correct timestamps. That last part is where most implementations go wrong.
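
To make the timestamp problem concrete: each chunk is transcribed in isolation, so its timestamps restart at zero, and stitching has to add back the position where that chunk began in the original recording. The sketch below shows one way to do that. The Segment and ChunkResult shapes and the stitchSegments name are hypothetical, chosen for illustration rather than taken from any particular API response, and the example assumes you already know each chunk's starting offset in seconds.

```typescript
// Hypothetical shapes: real API responses may name these fields differently.
interface Segment {
  start: number; // seconds, relative to the start of the chunk
  end: number;   // seconds, relative to the start of the chunk
  text: string;
}

interface ChunkResult {
  offsetSec: number; // where this chunk begins in the original recording
  segments: Segment[];
}

// Shift every segment by its chunk's offset and return one flat,
// chronologically ordered list.
function stitchSegments(chunks: ChunkResult[]): Segment[] {
  const ordered = chunks.slice().sort((a, b) => a.offsetSec - b.offsetSec);
  const stitched: Segment[] = [];
  for (const chunk of ordered) {
    for (const segment of chunk.segments) {
      stitched.push({
        start: segment.start + chunk.offsetSec,
        end: segment.end + chunk.offsetSec,
        text: segment.text,
      });
    }
  }
  return stitched;
}

// Chunks may finish transcribing out of order; sorting by offset restores the sequence.
const stitched = stitchSegments([
  { offsetSec: 1200, segments: [{ start: 0.5, end: 3, text: "second chunk" }] },
  { offsetSec: 0, segments: [{ start: 0, end: 2.5, text: "first chunk" }] },
]);
```

If the splitter does not cut at exactly the nominal boundary, the measured start of each chunk is a safer offset than index times chunk length.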

Here's the approach I landed on, built around ffmpeg and TypeScript.

The Strategy

if file < 24MB → send directly (fast path)
else           → chunk into 20-min segments at 32kbps mono → transcribe each → stitch
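
As a rough sketch of that decision, the function below picks the fast path or builds an ffmpeg argument list for the chunked path. The names planTranscription, DIRECT_UPLOAD_LIMIT_BYTES, and CHUNK_SECONDS are hypothetical, and reading "24MB" as 24 × 1024 × 1024 bytes is an assumption. The ffmpeg options are standard ones (the segment muxer with -segment_time, plus -ac 1 and -b:a 32k for mono at 32 kbps), but the exact command used in the original approach is not shown here, so treat this as one plausible form.

```typescript
// Assumption: "24MB" is read as 24 MiB; adjust if your limit is decimal megabytes.
const DIRECT_UPLOAD_LIMIT_BYTES = 24 * 1024 * 1024;
const CHUNK_SECONDS = 20 * 60; // 20-minute segments

type Plan =
  | { mode: "direct" }
  | { mode: "chunk"; ffmpegArgs: string[] };

// Decide between the fast path and the chunked path.
// outputPattern is a printf-style name such as "chunk_%03d.mp3".
function planTranscription(
  fileSizeBytes: number,
  inputPath: string,
  outputPattern: string,
): Plan {
  if (fileSizeBytes < DIRECT_UPLOAD_LIMIT_BYTES) {
    return { mode: "direct" };
  }
  return {
    mode: "chunk",
    ffmpegArgs: [
      "-i", inputPath,
      "-vn",                                  // drop any video stream
      "-ac", "1",                             // downmix to mono
      "-b:a", "32k",                          // 32 kbps audio bitrate
      "-f", "segment",                        // split output into numbered files
      "-segment_time", String(CHUNK_SECONDS), // target length of each file
      "-reset_timestamps", "1",               // each file starts at t = 0
      outputPattern,
    ],
  };
}

const smallPlan = planTranscription(10 * 1024 * 1024, "talk.mp3", "chunk_%03d.mp3");
const largePlan = planTranscription(120 * 1024 * 1024, "talk.wav", "chunk_%03d.mp3");
```

In a Node.js program the argument list could be handed to child_process.spawn("ffmpeg", ffmpegArgs). Keeping the planning step a pure function makes the size threshold easy to unit test without ffmpeg installed.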

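At 32kbps, a 20-minute chunk holds about 4.8MB of audio (1,200 s × 32 kbit/s ÷ 8), so each chunk lands just under 5MB. That leaves plenty of headroom below the 25MB limit regardless of source format, because every chunk is re-encoded to the same bitrate.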
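
The size arithmetic is easy to check in code. This is a minimal sketch that assumes a constant bitrate and ignores container overhead, so real files will be slightly larger. The function names are hypothetical, and the 25MB limit is treated as 25,000,000 bytes purely for illustration.

```typescript
// Approximate size of constant-bitrate audio, ignoring container overhead.
function estimateChunkBytes(durationSec: number, bitrateKbps: number): number {
  return (durationSec * bitrateKbps * 1000) / 8;
}

// Longest duration, in whole seconds, that fits under a byte limit at a given bitrate.
function maxDurationSec(limitBytes: number, bitrateKbps: number): number {
  return Math.floor((limitBytes * 8) / (bitrateKbps * 1000));
}

const chunkBytes = estimateChunkBytes(20 * 60, 32);  // 4800000 bytes, about 4.8 MB
const longestChunk = maxDurationSec(25_000_000, 32); // 6250 seconds, roughly 104 minutes
```

By this estimate a 20-minute chunk uses about a fifth of the limit, so the chunk length could be raised considerably before size becomes the constraint.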


nareshipme | Sciencx (2026-03-21T15:38:27+00:00) Audio Chunking for Long-Form Transcription: Splitting and Stitching with ffmpeg + TypeScript. Retrieved from https://www.scien.cx/2026/03/21/audio-chunking-for-long-form-transcription-splitting-and-stitching-with-ffmpeg-typescript/
