w2v-bert-2.0-bemgen-combined-model

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the BEMGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2647
  • WER (word error rate): 0.4656
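As a usage sketch (not part of the original card): the checkpoint can be loaded for CTC transcription with transformers. The repo id comes from this model's Hub page; the audio file path is a placeholder, and the example assumes the repo ships tokenizer files so that AutoProcessor resolves.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "csikasote/w2v-bert-2.0-bemgen-combined-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# w2v-bert-2.0 expects 16 kHz audio; librosa resamples on load.
speech, sr = librosa.load("audio.wav", sr=16000)  # placeholder path

inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax over the vocabulary, then collapse
# repeats and blanks inside batch_decode.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```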

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 3000
  • mixed_precision_training: Native AMP
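For reproducibility, these settings map onto a transformers TrainingArguments configuration roughly as sketched below; the output_dir value is illustrative, and fp16=True is an assumption inferred from the "Native AMP" note.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-bemgen-combined-model",  # illustrative
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=3000,
    fp16=True,  # assumed: "Native AMP" mixed-precision training
)
```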

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|
| No log        | 0.0516 | 100  | 0.8343          | 0.9377 |
| No log        | 0.1031 | 200  | 0.9492          | 1.0774 |
| No log        | 0.1547 | 300  | 0.9122          | 0.9494 |
| No log        | 0.2063 | 400  | 0.8973          | 0.9051 |
| 1.0755        | 0.2579 | 500  | 0.8540          | 0.9089 |
| 1.0755        | 0.3094 | 600  | 0.9065          | 0.9213 |
| 1.0755        | 0.3610 | 700  | 0.7465          | 0.8448 |
| 1.0755        | 0.4126 | 800  | 0.7102          | 0.8322 |
| 1.0755        | 0.4642 | 900  | 0.6741          | 0.8340 |
| 0.6705        | 0.5157 | 1000 | 0.6682          | 0.8348 |
| 0.6705        | 0.5673 | 1100 | 0.6621          | 0.8139 |
| 0.6705        | 0.6189 | 1200 | 0.5506          | 0.7664 |
| 0.6705        | 0.6704 | 1300 | 0.5300          | 0.7415 |
| 0.6705        | 0.7220 | 1400 | 0.4942          | 0.7151 |
| 0.5147        | 0.7736 | 1500 | 0.4778          | 0.6796 |
| 0.5147        | 0.8252 | 1600 | 0.4969          | 0.7064 |
| 0.5147        | 0.8767 | 1700 | 0.4353          | 0.6733 |
| 0.5147        | 0.9283 | 1800 | 0.4286          | 0.6409 |
| 0.5147        | 0.9799 | 1900 | 0.4428          | 0.6467 |
| 0.4399        | 1.0315 | 2000 | 0.3634          | 0.5654 |
| 0.4399        | 1.0830 | 2100 | 0.3541          | 0.5706 |
| 0.4399        | 1.1346 | 2200 | 0.3472          | 0.5540 |
| 0.4399        | 1.1862 | 2300 | 0.3454          | 0.5528 |
| 0.4399        | 1.2378 | 2400 | 0.3253          | 0.5276 |
| 0.3065        | 1.2893 | 2500 | 0.3191          | 0.5279 |
| 0.3065        | 1.3409 | 2600 | 0.3047          | 0.5028 |
| 0.3065        | 1.3925 | 2700 | 0.2911          | 0.4922 |
| 0.3065        | 1.4440 | 2800 | 0.2828          | 0.4775 |
| 0.3065        | 1.4956 | 2900 | 0.2689          | 0.4666 |
| 0.2666        | 1.5472 | 3000 | 0.2647          | 0.4654 |
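The WER column is the standard word error rate. As a minimal sketch, it can be computed with the evaluate library (the exact scoring code used for this card is not documented, so this pairing is an assumption; the strings are hypothetical placeholders):

```python
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["the cat sat on the mat"]  # hypothetical decoded output
references = ["the cat sat on a mat"]     # hypothetical reference transcript

# One substituted word out of six reference words -> WER = 1/6 ≈ 0.1667
print(wer_metric.compute(predictions=predictions, references=references))
```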

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0