How to use saattrupdan/wav2vec2-xls-r-300m-ftspeech with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="saattrupdan/wav2vec2-xls-r-300m-ftspeech")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForCTC

    processor = AutoProcessor.from_pretrained("saattrupdan/wav2vec2-xls-r-300m-ftspeech")
    model = AutoModelForCTC.from_pretrained("saattrupdan/wav2vec2-xls-r-300m-ftspeech")
XLS-R-300m-FTSpeech
Model description
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, trained on the FTSpeech dataset, which contains 1,800 hours of transcribed speeches from the Danish parliament.
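Because the model is a CTC fine-tune of wav2vec 2.0, transcription comes down to feeding it a raw waveform and decoding the per-frame predictions. The following is a minimal sketch, not the author's evaluation code. It assumes mono audio sampled at 16 kHz (the rate the XLS-R base model expects) and uses plain greedy decoding without a language model. The names `transcribe`, `MODEL_ID`, and `EXPECTED_SAMPLING_RATE` are illustrative. `Wav2Vec2Processor` is chosen here as an assumption so that decoding stays greedy regardless of whether the repository ships language-model files.

```python
MODEL_ID = "saattrupdan/wav2vec2-xls-r-300m-ftspeech"
# Assumption: the fine-tuned model keeps the 16 kHz input rate of the XLS-R base model.
EXPECTED_SAMPLING_RATE = 16_000


def transcribe(waveform, sampling_rate):
    """Transcribe a mono waveform (1-D sequence of floats) with greedy CTC decoding."""
    if sampling_rate != EXPECTED_SAMPLING_RATE:
        raise ValueError(
            f"Expected {EXPECTED_SAMPLING_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample the audio first."
        )

    # Heavy imports are deferred so the input check above runs without them.
    import torch
    from transformers import AutoModelForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = AutoModelForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # Normalise the waveform and wrap it as a batch of size 1.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, frames, vocab)

    # Greedy decoding: most likely token per frame; the tokenizer then
    # collapses repeated tokens and drops CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

The waveform can come from any audio loader that returns floats, as long as it is resampled to 16 kHz before the call. Loading the processor and model inside the function keeps the sketch short; for repeated calls they would normally be loaded once and reused.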
Performance
The model achieves the following word error rate (WER) scores, in percent (lower is better):
| Dataset | WER without LM | WER with 5-gram LM |
|---|---|---|
| Danish part of Common Voice 8.0 | 20.48 | 17.91 |
| Alvenir test set | 15.46 | 13.84 |
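To make the numbers above concrete: WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not the evaluation script behind the table; the function names and the Danish example sentence are hypothetical, and published scores also depend on text normalisation choices that are not modelled here. The second helper computes the relative WER reduction from adding the 5-gram LM, which works out to roughly 12.5% on Common Voice 8.0 and 10.5% on the Alvenir test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("The reference transcript must contain at least one word.")

    # Levenshtein distance over words, keeping only one row of the DP table.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current

    return 100.0 * previous[-1] / len(ref_words)


def relative_reduction(wer_without_lm: float, wer_with_lm: float) -> float:
    """Return the relative WER reduction in percent."""
    return 100.0 * (wer_without_lm - wer_with_lm) / wer_without_lm


# One deleted word out of five reference words gives a WER of 20%.
example_wer = word_error_rate("det er en god dag", "det er en dag")

# Relative gains from the 5-gram LM, using the scores in the table.
common_voice_gain = relative_reduction(20.48, 17.91)
alvenir_gain = relative_reduction(15.46, 13.84)
```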
License
Use of this model must comply with the license from the Danish Parliament.