Lab Notes: Audio Transcription Tools

By Nakiyyah Adams

How do you review hundreds of jail calls? What about 3 months of 24-hour video surveillance on a house? Twenty body cameras, each covering 1-2 hours of a multi-jurisdictional car chase and stop?

Audio transcription tools are one potential answer. These tools can analyze audio and video files and create text-searchable transcripts. The time savings can be large: instead of watching or listening to hours of recordings, only some of which might be relevant, you can search a transcript by keyword, or simply read it in less time than it would take to play the recording in real time.
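To make the keyword-search idea concrete, here is a minimal Python sketch of searching timestamped transcript segments. It is an illustration only, not how any of the tools below work internally: the Segment structure, the function names, and the sample calls are all hypothetical and made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped chunk of a transcript (hypothetical structure)."""
    recording: str  # file name of the source recording
    start: float    # offset into the recording, in seconds
    text: str       # transcribed words for this chunk


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS so a hit can be located in the recording."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_segments(segments, keywords):
    """Return (recording, timestamp, text) for each segment containing any keyword.

    Matching is a case-insensitive substring test, so "truck" also matches "trucks".
    """
    lowered = [k.lower() for k in keywords]
    hits = []
    for seg in segments:
        text = seg.text.lower()
        if any(k in text for k in lowered):
            hits.append((seg.recording, format_timestamp(seg.start), seg.text))
    return hits


# Made-up sample data for illustration only.
SAMPLE = [
    Segment("call_001.wav", 12.0, "Hey, did you get my letter?"),
    Segment("call_001.wav", 95.5, "Meet me by the blue truck on Friday."),
    Segment("call_002.wav", 3725.0, "I never saw the truck that night."),
]

hits = search_segments(SAMPLE, ["truck"])
for recording, stamp, text in hits:
    print(f"{recording} @ {stamp}: {text}")
```

Each hit carries the recording name and a timestamp, so a reviewer can jump straight to that point in the audio and confirm what was actually said. That check matters because a keyword search can only find words the transcript got right.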

There are significant considerations besides efficiency. One is cost. Another is security—who owns the data once it is uploaded to the transcription service? How easily can it be accessed by nefarious actors? Where is the data going to be stored? Who is going to be doing the transcription? Yet another is quality and thus value—the transcript will only be as good as the audio submitted.

NLST does not have a national contract with any transcription company, and we do not take a position as to whether any of these companies is better than another. However, because we are frequently asked for information about audio transcription vendors, we have put together a non-exclusive list of different tools we have looked at and considerations for each. They are presented below in alphabetical order. If you have a tool you have used and liked, please reach out to us and let us know!

JusticeText: https://justicetext.com/

JusticeText uses AI to generate automatic transcripts from body-worn camera footage, interrogation videos, and jail calls. The transcripts are timestamped, synced with the files, and searchable. Files can be clipped and presented, and the program allows for timestamped note-taking. JusticeText partners with numerous public defender agencies, including federal ones.

It is a cloud-based service, and uploaded videos are transcribed almost instantly. A built-in AI tool can be asked defense-oriented questions and will point the user to relevant portions of the video. Videos can be translated into 7 different languages, including Spanish-English. A demo version is available to try before purchasing.

Reduct.Video: https://reduct.video/legal

Reduct offers both AI and human transcription (for an additional cost). The transcript can be searched in the program, and text can be selected to download the corresponding video clips. Captions can be included. A synced multicam view option allows footage from different angles to be viewed simultaneously.

Reduct also offers a live capture option for live proceedings that will produce an audio file and transcript immediately after the proceeding concludes. They offer a 14-day free trial with 2 hours of AI transcription included.

Rev: https://www.rev.com/

Rev offers both AI and human transcription services, ranging from an “instant rough draft,” in which AI produces a rough transcript in 30 minutes or less, to “ready to certify” transcripts that can be signed, certified, and filed. Closed captioning and subtitling are available in different languages. Rev outsources human transcription to freelancers. Rush orders, verbatim transcription, and timestamping are available but add to the overall cost.

Sonix AI: https://sonix.ai

Sonix is an automated transcription tool with an in-browser editor that allows the user to search, play, edit, organize, and share transcripts. It offers automated translation, subtitle options, and a media player for sharing and publishing. Users can grant team members permission to upload, comment, and edit, or restrict access to certain files and folders. A free trial includes 30 minutes of transcription. Translation is a separate cost.

Subtitle Edit: https://www.nikse.dk/subtitleedit

Subtitle Edit is a free, open-source editor that allows you to create and modify subtitles that are synchronized with video files.

[Screenshot: Subtitle Edit main window]

Some features of the program include auto timing, batch processing, find and replace, merge/split of subtitles, network collaboration, OCR, and more. The translation tool is based on Google Translate. There is no speaker diarization and the transcript format cannot be changed easily.

Veritone: https://www.veritone.com

Veritone has three different platforms (Investigate, Illuminate, and Redact) that can help transcribe audio and video. “Investigate” offers facial recognition, object tracking, transcription, and translation. “Illuminate” and “Redact” allow searching within audio and video files. Transcription is done when the audio and video files are first ingested into the platform. The enhanced searchability feature lets users enter a term, and the tool will find where that term is spoken in the audio or where it appears on screen in the video. It can search across all content and import a preset dictionary for things like slang.

Whisper by OpenAI: https://openai.com/index/whisper/

Whisper is a free, open-source automatic speech recognition tool. It was trained on a dataset of roughly 700,000 hours of multilingual and multitask data collected from the web, and it offers transcription in multiple languages as well as translation from those languages into English. OpenAI provides the code, but many features require a user to know how to program, and any computer used will need a good amount of processing capability: on average, a minute of audio or video will take 3-5 minutes to transcribe.
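For readers curious what "knowing how to program" looks like in practice, here is a minimal Python sketch, offered as a starting point rather than a recommended workflow. It assumes the open-source openai-whisper package and ffmpeg are installed. The helper names (estimate_processing_hours, transcribe_file), the "base" model choice, and the 100-hour caseload are illustrative assumptions, not part of Whisper itself. The 3-5x slowdown is the rough figure quoted above, and actual speed will vary with hardware and model size.

```python
def estimate_processing_hours(audio_hours, minutes_per_audio_minute):
    """Processing time scales linearly: hours of audio times the slowdown ratio."""
    return audio_hours * minutes_per_audio_minute


def transcribe_file(path, model_name="base", translate_to_english=False):
    """Transcribe one recording with Whisper and return timestamped lines.

    Assumes the `openai-whisper` package and ffmpeg are installed.
    Returns a list of (start_seconds, end_seconds, text) tuples.
    """
    import whisper  # imported here so the estimate above works without it

    model = whisper.load_model(model_name)
    # Whisper's "translate" task produces English text from non-English speech.
    task = "translate" if translate_to_english else "transcribe"
    result = model.transcribe(path, task=task)
    return [
        (segment["start"], segment["end"], segment["text"].strip())
        for segment in result["segments"]
    ]


# Rough planning arithmetic using the 3-5x figure quoted above:
# 100 hours of recordings (a made-up caseload) at 3x and 5x real time.
low = estimate_processing_hours(100, 3)
high = estimate_processing_hours(100, 5)
print(f"Estimated processing time: {low} to {high} hours")
```

Calling transcribe_file("interview.mp3") on a local recording (a hypothetical file name) would return the timestamped segments, which could then be saved or searched. Because everything runs on your own machine, nothing is uploaded to a third party, which bears on the security questions raised earlier. The trade-off is the processing time estimated above.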