LayerNorm.__init__() got an unexpected keyword argument 'bias'
#65 opened 1 day ago
by
clabluo
ModernBert vs Bert for text classification
2
#64 opened 5 days ago
by
Joseph2805
Question about MLDR Evaluation Metrics in ModernBERT Paper
#62 opened 11 days ago
by
WoutDeRijck
I have trained a multilingual version of ModernBert
#60 opened 11 days ago
by
neavo
NaN or 0.0 loss when training with flash attention
16
#59 opened 11 days ago
by
roadtoagi
Modernbert with Golang
#58 opened 15 days ago
by
Thibault-Requesty
ModernBERT fails to work without FlashAttention!
3
#56 opened 18 days ago
by
benhachem
Import fails on AWS Lambda instance
4
#55 opened 20 days ago
by
obeijbom
Performance vs the original architecture on approximate original data sizes (BooksCorpus/Wikipedia)
#54 opened 26 days ago
by
tollefj
Problem with highly padded sequences
4
#49 opened about 1 month ago
by
fmrs
Speed Benchmarks with MPS Backend
1
#47 opened about 1 month ago
by
mlburnham
Continual pre-training for multilingual support (extend embedding matrix and tokenizer)
1
#46 opened about 1 month ago
by
ibotana
Encountering Error: cannot import name 'shard_checkpoint' from 'transformers.modeling_utils'
1
#44 opened about 1 month ago
by
rkabir
ModernBertModel works on the CPU but fails on the GPU
1
#43 opened about 1 month ago
by
rudigung
ModernBERT-base-chinese
4
#42 opened about 1 month ago
by
ZBW
Error: RuntimeError: Failed to import transformers.models.modernbert.modeling_modernbert because of the following error (look up to see its traceback): Windows not yet supported for torch.compile
4
#40 opened about 1 month ago
by
JoAmps42i
ModernBART wen?
6
#38 opened about 1 month ago
by
Fizzarolli
Pretraining Using HF Tokenizers and Transformers
2
#36 opened about 1 month ago
by
akhooli
Update README.md
1
#35 opened about 1 month ago
by
solankibhargav
Unpadding and Sequence Packing inference example?
2
#34 opened about 1 month ago
by
denti
Interview Request: Thoughts on Model Documentation
#33 opened about 1 month ago
by
evatang
Training Data?
2
#32 opened about 2 months ago
by
binarymax
What is the position of this model in MTEB leaderboard?
3
#31 opened about 2 months ago
by
deepak-banka
tokenizer
1
#24 opened about 2 months ago
by
ulasarikaya
RuntimeError: Failed to import transformers.models.modernbert.modeling_modernbert
2
#21 opened about 2 months ago
by
SantoshHF
Pretraining data cutoff?
#17 opened about 2 months ago
by
ytsaig
How to use ModernBERT with the AutoModelForQuestionAnswering class?
3
#15 opened about 2 months ago
by
sraj
Is ModernBERT already fine-tuned for IR tasks?
4
#13 opened about 2 months ago
by
belerico
Question about output embedding vector of ModernBERT
#12 opened about 2 months ago
by
Youm9602
ModernBert for multi-vector embeddings
3
#11 opened about 2 months ago
by
admarcosai
How to use ModernBERT as a sentence transformer?
30
#9 opened about 2 months ago
by
hungrybiker
multilingual
3
#8 opened about 2 months ago
by
ale-volpe
Is this model meant for full bfloat16, AMP bfloat16 or no bfloat16?
2
#7 opened about 2 months ago
by
umarbutler
Fine-tuning ModernBERT on a Large Dataset with Masked Language Modelling
1
#6 opened about 2 months ago
by
ssmits
Clarifications about the config properties with respect to the paper
1
#5 opened about 2 months ago
by
TomSchelsen
bug: model output logits have detached gradient
#4 opened about 2 months ago
by
andersonbcdefg
How to see which version of Transformers library is needed to get access to this model
16
#3 opened about 2 months ago
by
aero-artem