---
base_model: openai/whisper-large-v3
datasets:
- bn
language: bn
library_name: transformers
license: apache-2.0
model-index:
- name: Finetuned openai/whisper-large-v3 on Bengali
  results:
  - task:
      type: automatic-speech-recognition
      name: Speech-to-Text
    dataset:
      name: Common Voice (Bengali)
      type: common_voice
    metrics:
    - type: wer
      value: 9.651
---

# Finetuned openai/whisper-large-v3 on 21409 Bengali training audio samples from cv-corpus-21.0-2025-03-14/bn.

This model was created from the Mozilla.ai Blueprint: [speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).

## Evaluation results on 9363 audio samples of Bengali:

### Baseline model (before finetuning) on Bengali

- Word Error Rate (Normalized): 55.463
- Word Error Rate (Orthographic): 83.344
- Character Error Rate (Normalized): 35.66
- Character Error Rate (Orthographic): 40.754
- Loss: 0.567

### Finetuned model (after finetuning) on Bengali

- Word Error Rate (Normalized): 9.651
- Word Error Rate (Orthographic): 24.288
- Character Error Rate (Normalized): 4.876
- Character Error Rate (Orthographic): 6.312
- Loss: 0.092
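
To make the metrics above easier to interpret, here is a minimal sketch of how word and character error rates are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, expressed as a percentage. The helper names (`edit_distance`, `normalize`, `wer`, `cer`, `relative_reduction`) are hypothetical, and the `normalize` function is only an illustrative assumption (it strips punctuation and collapses whitespace). It is not necessarily the normalizer used by the Blueprint to produce the "Normalized" figures reported here.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def normalize(text):
    """Illustrative normalizer (an assumption): drop punctuation, collapse whitespace."""
    kept = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    return " ".join(kept.split())


def wer(reference, hypothesis):
    """Word error rate in percent, relative to the reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline, finetuned):
    """Relative error reduction in percent."""
    return 100.0 * (baseline - finetuned) / baseline


# Orthographic scoring counts the missing sentence-final danda as a word error;
# scoring after the illustrative normalization does not.
reference = "আমি ভাত খাই।"
hypothesis = "আমি ভাত খাই"
print("orthographic WER:", wer(reference, hypothesis))
print("normalized WER:", wer(normalize(reference), normalize(hypothesis)))

# Relative reduction in normalized WER, using the figures reported above.
print("relative WER reduction:", relative_reduction(55.463, 9.651))
```

This illustrates why the orthographic scores are consistently higher than the normalized ones: punctuation and similar surface differences count as errors only in the orthographic case. Using the reported figures, the normalized WER drops from 55.463 to 9.651, a relative reduction of roughly 82.6%.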