How to install the Kdenlive requirements for the speech to text feature

The open source video editor Kdenlive supports speech to text. This neat feature automatically creates subtitles from the audio track.

You can enable "speech to text" in Kdenlive under Settings -> Configure Kdenlive -> Speech To Text.

However, before this feature can be enabled, a couple of requirements need to be installed. The user interface also points this out:

Some required Python modules are missing in order to enable speech to text.

In this situation, Kdenlive requires additional Python modules:

The srt python module is required for automated subtitling
The vosk python module is required for speech features
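
Before installing anything, it can help to see which of the two modules the system interpreter already finds. The following is only a minimal sketch and not Kdenlive's own check: it assumes Kdenlive uses the same python3 that runs the script, and the helper name missing_modules is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that this interpreter cannot find."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the Kdenlive messages above.
    missing = missing_modules(["srt", "vosk"])
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("srt and vosk are both available.")
```

Saved to a file and run with python3, this prints the modules that still need to be installed, or confirms that both are present.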

If you are on Debian or a Debian-based Linux distribution, such as Ubuntu or Linux Mint, you can install the Python srt module from the official APT repositories:

ck@mint ~ $ sudo apt-get install python3-srt

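The vosk module, however, is not available from the APT repositories. The easiest way to install this Python module is to use pip3. On recent Debian-based releases the system Python is marked as externally managed, so pip refuses to install into it unless --break-system-packages is given: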

ck@mint ~ $ sudo pip3 install vosk --break-system-packages
Installing collected packages: tqdm, pycparser, cffi, vosk
Successfully installed cffi-2.0.0 pycparser-2.23 tqdm-4.67.1 vosk-0.3.45
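
To confirm what ended up installed for the system interpreter, the package metadata can be queried from Python. This is a hedged sketch: the function name installed_version is hypothetical, and it assumes the distribution names match the module names (srt and vosk), which is the case for the pip and APT packages used here as far as I know.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as used by pip; assumed to match the module names.
    for name in ("srt", "vosk"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If a module shows up as not installed here although pip reported success, pip3 and python3 probably point to different Python installations.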

Back in Kdenlive, you can now scroll down a bit and click on the "Check configuration" button. Kdenlive then verifies the Python modules and should show a message on a green background, confirming that all requirements are installed.

All required Python modules were installed and Kdenlive can now use speech to text.

You can now proceed to install a speech model from the URL mentioned in the settings dialog.

Claudio Kuenzler

Claudio has written well over 1000 articles on his own blog since 2008. He is fascinated by technology, especially Open Source Software. As a Senior Systems Engineer he has seen and solved a lot of problems - and writes about them.