Automatic Speech Recognition
TensorBoard
Safetensors
Welsh
wav2vec2
Generated from Trainer
DewiBrynJones committed · Commit b594d32 · verified · 1 Parent(s): 1c61e48

Update README.md

Files changed (1):
  1. README.md +14 -63
README.md CHANGED
@@ -3,77 +3,28 @@ license: apache-2.0
 base_model: facebook/wav2vec2-large-xlsr-53
 tags:
 - automatic-speech-recognition
-- ./data-configs/btb-cv.json
 - generated_from_trainer
 metrics:
 - wer
 model-index:
 - name: wav2vec2-xlsr-53-ft-btb-cv-cy-cand
   results: []
+datasets:
+- techiaith/commonvoice_18_0_cy
+- techiaith/banc-trawsgrifiadau-bangor
+language:
+- cy
+pipeline_tag: automatic-speech-recognition
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# wav2vec2-xlsr-53-ft-btb-cv-cy-cand
-
-This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: inf
-- Wer: 0.3598
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 0.0003
-- train_batch_size: 8
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 800
-- training_steps: 8000
-- mixed_precision_training: Native AMP
-
-### Training results
-
-| Training Loss | Epoch  | Step | Validation Loss | Wer    |
-|:-------------:|:------:|:----:|:---------------:|:------:|
-| 4.6973        | 0.0714 | 500  | inf             | 1.0    |
-| 1.448         | 0.1428 | 1000 | inf             | 0.7574 |
-| 1.053         | 0.2142 | 1500 | inf             | 0.6584 |
-| 0.9304        | 0.2856 | 2000 | inf             | 0.5963 |
-| 0.8755        | 0.3569 | 2500 | inf             | 0.5946 |
-| 0.8238        | 0.4283 | 3000 | inf             | 0.5392 |
-| 0.7819        | 0.4997 | 3500 | inf             | 0.4967 |
-| 0.729         | 0.5711 | 4000 | inf             | 0.4834 |
-| 0.6923        | 0.6425 | 4500 | inf             | 0.4564 |
-| 0.7052        | 0.7139 | 5000 | inf             | 0.4346 |
-| 0.6675        | 0.7853 | 5500 | inf             | 0.4163 |
-| 0.6217        | 0.8567 | 6000 | inf             | 0.3962 |
-| 0.5954        | 0.9280 | 6500 | inf             | 0.3883 |
-| 0.5687        | 0.9994 | 7000 | inf             | 0.3746 |
-| 0.477         | 1.0708 | 7500 | inf             | 0.3647 |
-| 0.4804        | 1.1422 | 8000 | inf             | 0.3598 |
-
-### Framework versions
-
-- Transformers 4.44.0
-- Pytorch 2.4.0+cu121
-- Datasets 2.21.0
-- Tokenizers 0.19.1
+# wav2vec2-xlsr-53-ft-btb-cv-cy
+
+This model is a version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
+fine-tuned on Welsh-language read speech from [commonvoice_18_cy](https://huggingface.co/datasets/techiaith/commonvoice_18_0_cy)
+and spontaneous speech from [Bangor Transcriptions 24.10](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor/tree/24.10).
+
+It achieves the following results on the Bangor Transcriptions test set:
+
+- WER: 36.23
+- CER: 12.55
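The updated card reports WER and CER as percentages (the training log above reports WER as a fraction). Both metrics reduce to Levenshtein edit distance, computed over words for WER and over characters for CER, divided by the reference length. A minimal, self-contained sketch of that computation (illustrative only, not the project's evaluation script):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (rolling-row DP)."""
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(
                dp[j] + 1,                            # deletion
                dp[j - 1] + 1,                        # insertion
                prev + (ref[i - 1] != hyp[j - 1]),    # substitution
            )
            prev = cur
    return dp[n]


def wer(reference, hypothesis):
    """Word error rate as a fraction: word-level edits / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a fraction: char-level edits / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)
```

In practice, evaluation for models like this is usually done with a library such as `jiwer` or the Hugging Face `evaluate` WER/CER metrics, which also handle text normalization; the standalone version above just shows what the numbers mean.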
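Since the commit adds `pipeline_tag: automatic-speech-recognition`, the published model can be loaded with the Transformers ASR pipeline. A sketch under assumptions: the model id below is a guess based on the card's model name and the dataset namespace (the actual Hub repository may differ), and `speech.wav` is a placeholder path:

```python
from transformers import pipeline

# Model id is an assumption inferred from the card; replace with the
# actual Hub repository id for this checkpoint.
asr = pipeline(
    "automatic-speech-recognition",
    model="techiaith/wav2vec2-xlsr-53-ft-btb-cv-cy",
)

# Placeholder path; wav2vec2 XLSR models expect 16 kHz mono audio.
result = asr("speech.wav")
print(result["text"])
```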