---
library_name: transformers
language:
- sq
license: mit
base_model: openai/whisper-large-v3-turbo
datasets:
- Kushtrim/audioshqip-200h
metrics:
- wer
model-index:
- name: Whisper Large v3 Turbo Shqip
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: Audio Shqip 200 orë
      type: Kushtrim/audioshqip-200h
      args: 'config: sq, split: test'
    metrics:
    - type: wer
      value: 19.891368436098556
      name: Wer
---

# Whisper Large V3 Turbo Shqip

This model is a fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) for the Albanian language, including the Gheg dialect. It was trained on `Kushtrim/audioshqip-200h`, a curated dataset of 200 hours of Albanian audio, and the card metadata reports a word error rate (WER) of 19.89 on that dataset's test split.

## Key Features

- **Language Coverage**: Covers standard Albanian as well as the Gheg dialect, so transcription is intended to work across regional variations.
- **Dataset**: Fine-tuned on 200 hours of annotated Albanian audio spanning a range of accents, speech contexts, and domains.

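The metadata of this card reports word error rate (WER) on the dataset's test split. As a reminder of what that metric measures, here is a minimal pure-Python sketch of the standard definition: word-level edit distance divided by the number of reference words. It is not the evaluation script behind the reported number. The function name and the example sentences are illustrative, any text normalization used in the real evaluation is not documented here, and the reported value is assumed to be a percentage.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from the dataset): one substituted word out of four.
wer = word_error_rate("unë jam nga Prishtina", "unë jam në Prishtina")
print(f"WER: {wer * 100:.1f}%")
```

Corpus-level tools usually sum errors and reference words over all utterances before dividing, so averaging this per-sentence value would not necessarily reproduce a reported corpus WER.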
This model is intended for automatic speech recognition (ASR) in Albanian and can be used in applications such as transcription, subtitling, and real-time speech processing.

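For transcription, a minimal sketch using the Transformers `pipeline` API might look like the following. The repository ID in `MODEL_ID` is a hypothetical placeholder, since this card does not state the model's Hub ID, and the helper names and the `sample.wav` path are illustrative assumptions. Passing `language` and `task` through `generate_kwargs` pins Whisper to Albanian transcription instead of relying on language detection.

```python
from typing import Any

# Hypothetical placeholder: replace with this model's actual Hub repository ID.
MODEL_ID = "your-namespace/whisper-large-v3-turbo-shqip"


def build_generate_kwargs(language: str = "albanian") -> dict[str, str]:
    """Decoding options that pin Whisper to transcription in the given language."""
    return {"language": language, "task": "transcribe"}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported inside the function so this module can be imported
    # without loading torch and transformers up front.
    import torch
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device="cuda:0" if torch.cuda.is_available() else "cpu",
        chunk_length_s=30,  # process long recordings in 30-second windows
    )
    result: dict[str, Any] = asr(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"]


# Example call (assumes a local audio file named sample.wav exists):
# print(transcribe("sample.wav"))
```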