SpeechCluster: now on PyPI!
Available for download here: https://pypi.org/project/speechcluster/
SpeechCluster is a library (a set of modules) and a CLI (a set of terminal commands) that help automate some of the data handling and curation tasks involved in building a speech database, i.e., a corpus of speech audio and matching transcriptions.
The new version 3.0 brings two sets of updates: I have updated the main code paths to work with Python 3.12, and I have added the CLI. Now, instead of doing fake segmentation by running a script like this:
$ segFake.py -d wav -c context.json -o TextGrid
… we can use the CLI like this:
$ sc fake -d wav -c context.json -o TextGrid
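For readers curious how a single entry point with subcommands can replace a set of standalone scripts, here is a minimal sketch using Python's standard argparse module. It is not SpeechCluster's actual implementation: only the command name sc, the fake subcommand and the short flags -d, -c and -o come from the example above, while the long option names and help texts are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per task, mirroring `sc fake`."""
    parser = argparse.ArgumentParser(prog="sc")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # `fake` accepts the same short flags as the command shown above.
    # The long option names and help texts are guesses, not the real interface.
    fake = subcommands.add_parser("fake", help="fake segmentation")
    fake.add_argument("-d", "--data-dir", required=True, help="input data (assumed)")
    fake.add_argument("-c", "--context", required=True, help="context file (assumed)")
    fake.add_argument("-o", "--out-format", required=True, help="output format (assumed)")
    return parser


# Parse the same arguments as the shell example, without touching any files.
args = build_parser().parse_args(
    ["fake", "-d", "wav", "-c", "context.json", "-o", "TextGrid"]
)
print(args.command, args.data_dir, args.context, args.out_format)
```

With this layout, each new subcommand is one more add_parser call, which is one plausible reason to prefer a single CLI over many separate scripts.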
Next steps
Any development work will include updating the codebase generally and expanding the tests into a comprehensive safety net.
The next main feature to add will be a force subcommand, which will use PyTorch to do forced alignment on input data (e.g., pairs of audio and transcription files).
This will open the way to developing SpeechCluster as a CLI frontend to PyTorch/JAX/etc.
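Since forced alignment works on pairs of audio and transcription files, one of the first data handling steps is collecting those pairs. The sketch below shows one way this could be done, assuming that each pair shares a file stem and uses the extensions .wav and .txt. The find_pairs helper and that naming convention are hypothetical, not part of SpeechCluster.

```python
import tempfile
from pathlib import Path


def find_pairs(data_dir: Path) -> list[tuple[Path, Path]]:
    """Return (audio, transcription) pairs that share a file stem."""
    pairs = []
    for wav in sorted(data_dir.glob("*.wav")):
        txt = wav.with_suffix(".txt")  # assumed naming convention
        if txt.exists():
            pairs.append((wav, txt))
    return pairs


# Demonstrate on a throwaway directory: utt2.wav has no transcription,
# so it is skipped.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("utt1.wav", "utt1.txt", "utt2.wav"):
        (root / name).touch()
    pair_names = [(audio.name, text.name) for audio, text in find_pairs(root)]

print(pair_names)
```

Audio files without a matching transcription are silently dropped here; a real tool would probably want to report them instead.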