w2v-bert-2.0-bemgen-combined-model

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the BEMGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2647
  • WER (word error rate): 0.4656
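As a usage sketch (not part of the original card): the checkpoint can be loaded for CTC transcription with transformers. The repo id comes from this model's Hub page; the audio file path is a placeholder, and the example assumes the repo ships tokenizer files so that AutoProcessor resolves.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "csikasote/w2v-bert-2.0-bemgen-combined-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# w2v-bert-2.0 expects 16 kHz audio; librosa resamples on load.
speech, sr = librosa.load("audio.wav", sr=16000)  # placeholder path

inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax over the vocabulary, then collapse
# repeats and blanks inside batch_decode.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```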

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 3000
  • mixed_precision_training: Native AMP
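For reproducibility, these settings map onto a transformers TrainingArguments configuration roughly as sketched below; the output_dir value is illustrative, and fp16=True is an assumption inferred from the "Native AMP" note.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-bemgen-combined-model",  # illustrative
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=3000,
    fp16=True,  # assumed: "Native AMP" mixed-precision training
)
```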

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|
| No log        | 0.0516 | 100  | 0.8343          | 0.9377 |
| No log        | 0.1031 | 200  | 0.9492          | 1.0774 |
| No log        | 0.1547 | 300  | 0.9122          | 0.9494 |
| No log        | 0.2063 | 400  | 0.8973          | 0.9051 |
| 1.0755        | 0.2579 | 500  | 0.8540          | 0.9089 |
| 1.0755        | 0.3094 | 600  | 0.9065          | 0.9213 |
| 1.0755        | 0.3610 | 700  | 0.7465          | 0.8448 |
| 1.0755        | 0.4126 | 800  | 0.7102          | 0.8322 |
| 1.0755        | 0.4642 | 900  | 0.6741          | 0.8340 |
| 0.6705        | 0.5157 | 1000 | 0.6682          | 0.8348 |
| 0.6705        | 0.5673 | 1100 | 0.6621          | 0.8139 |
| 0.6705        | 0.6189 | 1200 | 0.5506          | 0.7664 |
| 0.6705        | 0.6704 | 1300 | 0.5300          | 0.7415 |
| 0.6705        | 0.7220 | 1400 | 0.4942          | 0.7151 |
| 0.5147        | 0.7736 | 1500 | 0.4778          | 0.6796 |
| 0.5147        | 0.8252 | 1600 | 0.4969          | 0.7064 |
| 0.5147        | 0.8767 | 1700 | 0.4353          | 0.6733 |
| 0.5147        | 0.9283 | 1800 | 0.4286          | 0.6409 |
| 0.5147        | 0.9799 | 1900 | 0.4428          | 0.6467 |
| 0.4399        | 1.0315 | 2000 | 0.3634          | 0.5654 |
| 0.4399        | 1.0830 | 2100 | 0.3541          | 0.5706 |
| 0.4399        | 1.1346 | 2200 | 0.3472          | 0.5540 |
| 0.4399        | 1.1862 | 2300 | 0.3454          | 0.5528 |
| 0.4399        | 1.2378 | 2400 | 0.3253          | 0.5276 |
| 0.3065        | 1.2893 | 2500 | 0.3191          | 0.5279 |
| 0.3065        | 1.3409 | 2600 | 0.3047          | 0.5028 |
| 0.3065        | 1.3925 | 2700 | 0.2911          | 0.4922 |
| 0.3065        | 1.4440 | 2800 | 0.2828          | 0.4775 |
| 0.3065        | 1.4956 | 2900 | 0.2689          | 0.4666 |
| 0.2666        | 1.5472 | 3000 | 0.2647          | 0.4654 |
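The WER column is the standard word error rate. As a minimal sketch, it can be computed with the evaluate library (the exact scoring code used for this card is not documented, so this pairing is an assumption; the strings are hypothetical placeholders):

```python
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["the cat sat on the mat"]  # hypothetical decoded output
references = ["the cat sat on a mat"]     # hypothetical reference transcript

# One substituted word out of six reference words -> WER = 1/6 ≈ 0.1667
print(wer_metric.compute(predictions=predictions, references=references))
```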

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0