Remote

English Speech-to-Text Adaptation

Deadline: 2026-04-10
Project-Based

Description

Budget: £18 - £36/hr

I have a collection of recorded English audio that needs to be converted into accurate, well-punctuated plain-text transcripts. The recordings vary in speaker accent and subject matter, so I’m after more than a straight “run it through an API” pass—I want the model or pipeline adapted to the material so accuracy climbs beyond the default.

You’re free to use Whisper, Kaldi, Vosk, DeepSpeech, or any comparable stack, provided you can fine-tune vocabulary, handle automatic punctuation and casing, and cope with the occasional background noise. Speaker diarization is a welcome bonus if your toolset supports it.

Deliverables

• One clean UTF-8 TXT file per audio clip, reflecting the spoken words verbatim (with punctuation).
• A short reproducibility note outlining the toolchain and any model tweaks applied.

I’ll start you on a small sample; once the quality checks out, we’ll roll straight into the full batch. Audio files (WAV/MP3) are ready to go as soon as you are.
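One way the sample quality check could be made measurable is word error rate (WER) against hand-corrected reference transcripts. The sketch below is only an illustration and assumes such references exist for the sample clips; the function name `word_error_rate` is hypothetical and not tied to any of the toolkits named above. It lowercases both texts and compares whitespace-separated tokens, so punctuation attached to a word counts as part of that word.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    baseline = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {baseline:.3f}")
```

Running this on the same clips before and after adaptation would give a like-for-like comparison of the default and adapted pipelines.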

Skills

AI Model Development, AI Development, AI (Artificial Intelligence) HW/SW, AI Research, AI Text-to-text, AI Writing, Automatic Speech Recognition, API, AI Chatbot Development, Go
