Transcription of Audio Recordings into Text (Prepis zvukových nahrávok do textovej podoby)
Publisher
Vysoká škola báňská – Technická univerzita Ostrava
Abstract
This thesis deals with methods for transcribing speech into text and with training the Whisper speech recognition model for Slovak. The goal was to develop a model capable of efficiently processing natural spoken Slovak with varying sentence length and speech rate. Publicly available data from the Common Voice project and our own collection of recordings were used for training, and the data was preprocessed before training. Training was performed using the Hugging Face Transformers library. The resulting model was evaluated on recognition accuracy, measured by word error rate (WER) and character error rate (CER), and improves on existing pre-trained models for Slovak.
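The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: the edit distance between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. The abstract does not say how the thesis computed these scores or which text normalization it applied, so the function names `edit_distance`, `wer`, and `cer` below are illustrative only and are not taken from the thesis.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.
    Assumption: spaces count as characters and no normalization is applied."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical transcript pair: one wrong word out of four.
print(wer("dnes je pekny den", "dnes je pekne den"))  # 0.25
print(cer("dnes je pekny den", "dnes je pekne den"))
```

Both metrics can exceed 1.0 when the hypothesis contains many insertions, and lower values are better. In practice the reference and hypothesis are usually normalized (case, punctuation) before scoring, which this sketch deliberately leaves out.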
Subject(s)
speech recognition, Slovak language, Whisper, machine learning, audio processing, Hugging Face Transformers, training, Common Voice