With audio transcription, you get what you pay for. For a dollar per minute of audio, Rev will give you a well-formatted transcript. For 25 cents a minute (a little lower for large volumes), Trint will give you a transcript riddled with errors, but will help you correct it with a sophisticated editing tool.
And for 10 cents a minute, the brand-new CheapTranscription.io will give you a no-frills text doc with even more errors. Depending on your needs, you might want to use all three. Here’s how to choose.
For Public Transcripts: Rev
Say you put out a podcast with multiple voices on it, and you’d like to publish transcripts for each episode. Or say you need captions for a video. Or say you’ve interviewed someone and want to print the entire interview. If you can afford it, Rev is the best solution.
Rev charges $1/minute for transcripts and for video captions in English. Video captions include timestamps.
This adds up fast if you’re transcribing a weekly hour-long podcast out of pocket, but it’s cheap enough for a corporate budget. I’ve also used Rev to make high-quality captions for my personal creative video projects. There’s no minimum fee, so I’ve even run three-minute YouTube videos through Rev.
These are flat rates, with no extra charge for more speakers. You can pay extra for things like including every verbal tic, or for including timestamps on a transcript. Rev also offers translations for 10 cents/word on written documents, $4/minute on audio and video recordings.
Transcript Quality: High
Upload your audio or video to Rev, and they’ll come back within 24 hours with a meticulous transcription, with each line identified by speaker and with “ums” and other verbal tics edited out. Any unintelligible words will be marked. These are usually minimal, maybe two to five in an hour-long phone call.
Rev’s transcribers can accurately write out both sides of a phone conversation, or an in-person interview with medium background noise. They can work with a fairly wide range of accents, though if you have trouble hearing something in your own audio recording, they’ll probably have trouble too. They’re not magic.
Here’s an excerpt from a phone interview I did with Welcome to Night Vale actor Cecil Baldwin, transcribed by Rev:
Video caption transcription is accurate and accurately timed, which is crucial—in online video, sloppy captions stand out immediately. At Rev’s rate, it’s always worth it to pay for transcripts of any short to medium video, especially if you’re uploading to platforms like Facebook or Instagram, where many people watch videos on mute. You’ll save yourself a lot of grunt work.
No transcription service is always perfect. So most of the time, you want to do at least a very light skim to check for names, and for any insertions of [unintelligible] or [inaudible], or a few too many faithful transcriptions of an interviewer butting in with “Right” and “Oh!”
Rev delivers its transcripts in an editor interface, which includes a player for the original audio. In the editor, you can directly edit the text, change speaker names, make comments on certain passages, and share the transcript with other users. You can export your text as docx, PDF, or txt.
Annoyingly, Rev’s audio player doesn’t sync up to your cursor in the text, nor does it highlight the corresponding text as the audio plays (at least in our tests on Chrome). So if you need to find a certain line to correct, you’ll have to hunt around in the recording. It’s an embarrassing oversight from an otherwise top-shelf service.
When You Don’t Need Perfection: Trint
If you need to dig around through a recording, but you don’t actually need to publish it (or you have time to edit it), you don’t need to pay a high rate for human transcription. Machine learning has gotten very good at speech recognition, and while automated transcription services aren’t nearly as accurate as human transcribers, they’re good enough for many uses, and they’re a lot cheaper. Our favourite, and the one we use at Lifehacker, is Trint.
Trint was founded by war correspondent Jeff Kofman (whom I’ve interviewed for Lifehacker), and it’s calibrated to be useful for a lot of back-end transcription needs. It’s a great solution for journalists who need to hunt through a recording for a certain quote, or who want to publish a partial transcript, or want to build a full transcript in less time.
Trint charges 25 cents per minute of audio or video, a fourth of Rev’s rate. And because it’s automated, the turnaround time for a transcription is usually under an hour. Like Rev, Trint charges a flat rate regardless of multiple voices or accents, and optional timestamps are included.
If you transcribe over 3 hours of audio each month, you can subscribe for a slightly lower rate of 22 cents a minute. At 10+ hours per month, you pay just 20 cents per minute.
A subscription includes use of Trint’s excellent transcript editor, which is the real killer product.
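To see how these per-minute rates add up over a month, here is a rough Python sketch. It uses the prices quoted in this article and a few assumptions of my own: the function names are made up for illustration, Trint's discounted rate is assumed to apply to every minute once you cross a threshold, and any subscription fee is ignored.

```python
# Per-minute rates in US cents, as quoted in this article.
REV_CENTS = 100
CHEAP_TRANSCRIPTION_CENTS = 10


def trint_rate_cents(minutes_per_month):
    """Return Trint's per-minute rate (in cents) for a monthly volume.

    Assumption: the discounted rate applies to every minute once you
    cross a threshold, and subscription fees are ignored.
    """
    if minutes_per_month >= 10 * 60:   # 10+ hours a month
        return 20
    if minutes_per_month > 3 * 60:     # over 3 hours a month
        return 22
    return 25


def monthly_costs_cents(minutes_per_month):
    """Return each service's monthly cost in cents for the given volume."""
    return {
        "Rev": minutes_per_month * REV_CENTS,
        "Trint": minutes_per_month * trint_rate_cents(minutes_per_month),
        "CheapTranscription": minutes_per_month * CHEAP_TRANSCRIPTION_CENTS,
    }


if __name__ == "__main__":
    # Hypothetical workload: four hour-long episodes a month.
    for service, cents in monthly_costs_cents(4 * 60).items():
        print(f"{service}: ${cents / 100:.2f}")
```

For that hypothetical workload of four hour-long episodes, the sketch prints $240.00 for Rev, $52.80 for Trint, and $24.00 for CheapTranscription.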
Transcript Quality: Medium
At these rates, transcribing a weekly podcast is relatively affordable—but it will take some extra work. Trint’s automated transcripts aren’t nearly as accurate as anything human-made. They trip up easily on phone calls. But if you’ve got a direct recording of each side of the conversation—even with cheap mics—then Trint will get the transcript mostly right.
Here’s that same Cecil Baldwin excerpt, transcribed by Trint:
One annoyance is that Trint does catch every verbal tic. And when you read through a transcript, you might notice that plenty of intelligent, articulate people fill their speech with little ums and likes and you knows.
I’ve had to delete annoying filler words in every sentence he spoke. Barely noticeable in speech, highly noticeable in transcripts. (In fact, an overly literal transcript is a great way to make someone sound stupid, so a little editing is a matter of ethics.)
Trint’s editor is the service’s real advantage. Put your cursor anywhere in the transcript, and you can start playing audio from that exact point. This allows for quick precise editing, and helps you interpret the many baffling mis-transcriptions that Trint will make.
The editor lets you check off paragraphs as you finish them, add bookmarks and comments, and teach Trint specific words, names, and phrases. There’s also a find and replace function. There are granular playback controls with keyboard shortcuts, and once you learn them you’ll have a smooth workflow, with audio seamlessly attached to the written word. You can export text as docx, xml, edl for Premiere, an html player, an embeddable Trint player, or srt or vtt subtitles.
Even if you need a publication-worthy clean transcript, you’ll spend way less time editing a Trint transcript than you would manually transcribing audio from scratch. You’ll have to measure your own performance, but you’ll probably be paying a lot less for that saved time than your own hourly wage. And if you can use an organizational budget, it’s absolutely worth it. That’s why Lifehacker’s publisher, Gizmodo Media Group, bought a blanket Trint subscription for its reporters.
When You Just Need the Gist: CheapTranscription.io
Maybe you want a “good enough” transcript just to get the gist across. You don’t care about editing software, and you don’t mind if there’s at least one incorrect word in every sentence. Maybe you just need a reference so you can manually scrub through an audio interview for a particular quote.
Maybe you just want your podcast episode to show up in relevant Google searches. In that case, you can get away with using CheapTranscription.io, a new quick-and-dirty automatic transcription service from tech journalist and podcaster John Biggs and developer Tom Printy.
Price: Very low
CheapTranscription really is cheap, charging 10 cents/minute with a minimum order of 50 cents. Upload your audio (in our test, the site didn’t accept video), and after a few minutes you get a text transcript.
Right now there’s no export feature, so you’ll have to copy-paste your transcript into a text editor. And there are no timestamps. Speakers are separated, but given generic names.
Transcript Quality: Low
Anecdotally, CheapTranscription seems to make more mistakes than Trint, as would be expected from a new service. Like Trint, it includes every verbal tic, which makes reading a transcription very annoying. But you can still get the gist of a passage without any editing, and you can usually tell which words are mistakes.
Your transcript is just text on a static web page. Edit in Word or Google Docs or whatever you usually write in.
And that’s the real problem: Trint helps you correct its work. If you want to edit a CheapTranscription transcript, you’ll still want to use a specialised transcription editor like oTranscribe or Otter.
You’ll still get better results than the free option: running your audio through YouTube and grabbing the transcript. At that point, you’re getting near gibberish.
As of publication, this service is two days old, so expect it to get new features as Biggs and Printy keep developing. They’ve already announced plans to add human transcribers. Also expect an increase in transcription quality as they train their algorithms. Of course, this doesn’t mean they’ll ever catch up to Trint or other algorithmic services.
Only Buy What You Need
Rev and CheapTranscription have basically no minimum order, while Trint offers all its best features at $40/month. So you can put in orders at all three services as you need.
Most of the time, Trint is plenty for my purposes. (If I were paying out of pocket, I’d use it less often but still choose it over CheapTranscription.) But when I’m on a deadline with a lot of transcripts to process, I ask my editor to shell out for Rev.
The human transcription costs four times as much, but it’s incredibly accurate. Last year, my editor and I used Rev to churn out transcripts for four phone interviews, which we then edited into one long feature about the touring audio drama Welcome to Night Vale. Getting perfect transcripts saved us hours of work on a high-profile piece.