CrossEncoder based on answerdotai/ModernBERT-base

This is a Cross Encoder model finetuned from answerdotai/ModernBERT-base on the ms_marco dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: answerdotai/ModernBERT-base
  • Maximum Sequence Length: 8192 tokens
  • Number of Output Labels: 1 label
  • Training Dataset: ms_marco
  • Language: en

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-msmarco-v1.1-ModernBERT-base-cmnrl")
# Get scores for pairs...
pairs = [
    ['how much is a pelvic ultrasound', '1 The typical cost range is $250-$1,100, with a national average cost of $525, according to NewChoiceHealth.com. 2  For example, Concierge Medicine in California charges $275 for a pelvic ultrasound. 3  Baptist Memorial Health Care in Tennessee charges $395 for a transvaginal ultrasound. 1 And Saint Elizabeth Regional Medical Center in Nebraska, charges $240-$620 for a non-obstetric pelvic ultrasound, $484 for a transrectal ultrasound and $580 for a transvaginal ultrasound, not including the radiologist fee.'],
    ['what is in paella ingredients', '1 Heat 2 tablespoons olive oil in a large skillet or paella pan over medium heat. 2  Stir in garlic, red pepper flakes, and rice. 3  Cook, stirring, to coat rice with oil, about 3 minutes. 4  Stir in saffron threads, bay leaf, parsley, chicken stock, and lemon zest. Ready In. 1  In a medium bowl, mix together 2 tablespoons olive oil, paprika, oregano, and salt and pepper. 2  Stir in chicken pieces to coat. 3  Cover, and refrigerate. 4  Heat 2 tablespoons olive oil in a large skillet or paella pan over medium heat. 5'],
    ['what is the geocentric model', 'A Geocentric theory is an astronomical theory which describes the universe as a Geocentric system, i.e., a system which puts the Earth in the center of the universe, and describes other objects from the point of view of the Earth. '],
    ['how can you prevent osteoporosis', 'You can build strong bones and help prevent osteoporosis with weight-bearing exercise and a diet rich in calcium and vitamin D. Young women in particular need to be aware of their risk for osteoporosis. They can take steps early to slow its progress and prevent complications. '],
    ['spay neuter costs dogs', '1 Some clinics and animal hospitals can charge up to $200-$300 or more, depending on the weight of the dog. 2  The cost of both neutering and spaying vary greatly by geographic region, and even by veterinarian. 1 But lower cost sometimes means an assembly-line approach is used, so the dog might not get as much attention or recovery time. 2  Spay/USA has a referral service for reduced cost spay and neuter clinics. 3  In rare cases, such as with programs that use veterinary students, spaying and neutering can be free.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# ... or rank different texts based on similarity to a single text
ranks = model.rank(
    'how much is a pelvic ultrasound',
    [
        '1 The typical cost range is $250-$1,100, with a national average cost of $525, according to NewChoiceHealth.com. 2  For example, Concierge Medicine in California charges $275 for a pelvic ultrasound. 3  Baptist Memorial Health Care in Tennessee charges $395 for a transvaginal ultrasound. 1 And Saint Elizabeth Regional Medical Center in Nebraska, charges $240-$620 for a non-obstetric pelvic ultrasound, $484 for a transrectal ultrasound and $580 for a transvaginal ultrasound, not including the radiologist fee.',
        '1 Heat 2 tablespoons olive oil in a large skillet or paella pan over medium heat. 2  Stir in garlic, red pepper flakes, and rice. 3  Cook, stirring, to coat rice with oil, about 3 minutes. 4  Stir in saffron threads, bay leaf, parsley, chicken stock, and lemon zest. Ready In. 1  In a medium bowl, mix together 2 tablespoons olive oil, paprika, oregano, and salt and pepper. 2  Stir in chicken pieces to coat. 3  Cover, and refrigerate. 4  Heat 2 tablespoons olive oil in a large skillet or paella pan over medium heat. 5',
        'A Geocentric theory is an astronomical theory which describes the universe as a Geocentric system, i.e., a system which puts the Earth in the center of the universe, and describes other objects from the point of view of the Earth. ',
        'You can build strong bones and help prevent osteoporosis with weight-bearing exercise and a diet rich in calcium and vitamin D. Young women in particular need to be aware of their risk for osteoporosis. They can take steps early to slow its progress and prevent complications. ',
        '1 Some clinics and animal hospitals can charge up to $200-$300 or more, depending on the weight of the dog. 2  The cost of both neutering and spaying vary greatly by geographic region, and even by veterinarian. 1 But lower cost sometimes means an assembly-line approach is used, so the dog might not get as much attention or recovery time. 2  Spay/USA has a referral service for reduced cost spay and neuter clinics. 3  In rare cases, such as with programs that use veterinary students, spaying and neutering can be free.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
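
The ranking step above amounts to scoring every (query, document) pair and sorting the documents by descending score. A minimal sketch of that post-processing, using made-up scores in place of real model output (the helper name `rank_from_scores` and the numbers are hypothetical and not part of the library):

```python
def rank_from_scores(scores):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by descending score."""
    ranked = [{"corpus_id": i, "score": float(s)} for i, s in enumerate(scores)]
    ranked.sort(key=lambda item: item["score"], reverse=True)
    return ranked


# Hypothetical scores for five documents; real values would come from model.predict.
example_scores = [0.91, 0.02, 0.01, 0.03, 0.05]
ranking = rank_from_scores(example_scores)
print([item["corpus_id"] for item in ranking])
```

Each `corpus_id` is the position of the document in the input list, so the ranking can be mapped back to the original texts.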

Evaluation

Metrics

Cross Encoder Reranking

Metric   NanoMSMARCO       NanoNFCorpus      NanoNQ
map      0.4642 (-0.0254)  0.3093 (+0.0389)  0.5583 (+0.1376)
mrr@10   0.4513 (-0.0262)  0.4855 (-0.0143)  0.5600 (+0.1333)
ndcg@10  0.5191 (-0.0213)  0.3283 (+0.0033)  0.6032 (+0.1025)
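
For readers unfamiliar with these metrics, the sketch below shows the textbook per-query definitions of reciprocal rank, average precision and NDCG with binary relevance. It is an illustration only; the evaluator that produced the numbers above may differ in details such as how ties or missing relevant documents are handled. The reported map and mrr@10 values are averages of such per-query values.

```python
import math


def reciprocal_rank_at_k(relevance, k=10):
    """relevance: 0/1 flags in ranked order. Returns 1/rank of the first hit in the top k."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of precision@rank over the ranks at which a relevant document appears."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def ndcg_at_k(relevance, k=10):
    """DCG of the top k divided by the DCG of an ideal ordering of the same list."""
    dcg = sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevance[:k], start=1))
    ideal = sorted(relevance, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 1) for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


# Hypothetical ranked list: relevant documents at positions 2 and 4.
ranked_relevance = [0, 1, 0, 1, 0]
print(reciprocal_rank_at_k(ranked_relevance))
print(average_precision(ranked_relevance))
print(ndcg_at_k(ranked_relevance))
```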

Cross Encoder Nano BEIR

Metric   Value
map      0.4439 (+0.0504)
mrr@10   0.4989 (+0.0309)
ndcg@10  0.4835 (+0.0282)
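
The NanoBEIR values are consistent with the arithmetic mean of the three per-dataset values reported under Cross Encoder Reranking. A quick check under that assumption:

```python
# Per-dataset values copied from the reranking table (NanoMSMARCO, NanoNFCorpus, NanoNQ).
per_dataset = {
    "map": [0.4642, 0.3093, 0.5583],
    "mrr@10": [0.4513, 0.4855, 0.5600],
    "ndcg@10": [0.5191, 0.3283, 0.6032],
}

# Average each metric over the three datasets and round to four decimals.
nano_beir = {name: round(sum(values) / len(values), 4) for name, values in per_dataset.items()}
print(nano_beir)
```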

Training Details

Training Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 82,326 training samples
  • Columns: query, positive, negative_1, negative_2, negative_3, negative_4, and negative_5
  • Approximate statistics based on the first 1000 samples (lengths in characters):
    column      type    min  mean    max
    query       string  11   34.43   92
    positive    string  72   433.57  992
    negative_1  string  85   415.68  1018
    negative_2  string  110  413.13  902
    negative_3  string  65   409.55  922
    negative_4  string  85   415.51  866
    negative_5  string  109  405.71  863
  • Samples:
    query positive negative_1 negative_2 negative_3 negative_4 negative_5
    how soon do puppies start eating solid food Starting Puppies On Solid Food When puppies reach the age of four weeks, they can start to eat solid food. It is best to only supply the food for the puppies and allow mama dog to tech her babies what to do, and when to do it. 6 weeks you can start them on foods, and as soon as they are eating and no longer needing there mother they should be ready to sell...weve sold three out of our litter of 9 at 7 weeks old now :). ANSWER #13 of 13. Puppies can start eating sofr puppy food at 4 weeks. The entire process usually takes a little over a month or so, with many puppies not being completely weaned until they're about 8 weeks old. However, puppies can begin eating soft-textured foods as soon as weaning begins -- think 3 weeks old, for instance. The best time to introduce water and puppy food is around 3 to 4 weeks of age. This is the time to start to slowly wean the puppy from her mother’s milk and care so she can become more independent. Be patient and take your time with this. When to Start Weaning. Start introducing puppies to puppy food at the age of 3.5 weeks. This allows you to gradually transition them from their mother's milk to solid puppy food. By the time they are independent at 8 weeks old, they should be eating solid food. By the time the puppies are 6½-7 weeks of age, they should be fully weaned from the dam's milk, eating dry food, and drinking water. If the weaning is not rushed, she will naturally start decreasing milk production, as the puppies increase their intake of solid food.
    what age should a baby go into a jumperoo Pinterest0. With any baby toy that parents introduce to their newborn, safety is the highest priority. The introduction of the jumperoo to your baby is the same. Parents should wait until their baby can hold their head up without any assistance for a good 10-20 minutes. If their neck is not strong enough or if they get too tired then they are not ready for the jumperoo. Typically, newborns are able to reach this milestone around 3-4 months. For the parents that we asked, 4 to 6 months is the typical age they start using the jumperoo for their baby to play. There are many different variety of jumperoos from companies like Fisher Price, Baby Einstein, EvenFlo and Bright Starts. RE: Fisher Price Jumperoo-How old should baby be before I buy one? Thinking of getting one of these for our little one. He's 13 weeks, can hold his head steady and loves standing up when we hold him. Our son will be 10 months in 6 days and is 21 lbs and makes a god-awful racket in his, though to be fair it seems like this particular one (Fisher Price's Laugh & Learn farm theme jumperoo with the red barn arch) seems to make a TON of noise. The Fisher-Price Rainforest Jumperoo is a great alternative to doorway bouncers for your child's development, and a height-adjustable seat means that it will grow with baby. My son just turned 5 months and I put him in both a bouncer: fisher price jumperoo (when he was 4 1/2 and a walker). He loves the bouncer, and he just learned how to move in the walker 3 days ago. He can already roll over on his own and hold himself up with assistance. Witht he new Fisher-Price Go Wild Jumperoo baby will have a wild time laughing and jumping, with two ways to play! In Musical mode, a variety of toy stations and overhead toys let baby activate lights and music.
    what is tea tree oil uses for Tea Tree Oil (also known as Melaleuca) is a natural antibacterial disinfectant that was commonly used as a general antiseptic by the aborigine tribes for thousands of years. More recently, the scientific community has confirmed that Tea Tree Oil has tremendous medicinal benefits. Many people have also found that Tea Tree Oil can be used as a very effective treatment for Genital Warts. An effective remedy is to dip a cotton swab into the tea tree oil and gently apply it to the wart. Repeat this once a day for 10 day Acne 2 - a comparative study of tea-tree oil versus benzoyl peroxide in the treatment of acne found that 5% tea-tree oil and 5% benzoyl peroxide had a significant effect in ameliorating the patients' acne. Tea-tree oil and tea oil are completely different products. Tea oil is the sweet seasoning and cooking oil from pressed Camellia sinensis (beverage tea plant), or the tea oil plant Camellia oleifera. Acne Treatment. A common use of Tea Tree Oil is as a natural Acne treatment. An effective remedy is to dab a cotton swab into tea tree oil and then gently apply the oil onto affected areas before goin to sleep. In the morning, rinse off the oil and wash your face as usual. Many people have also found that Tea Tree Oil can be used as a very effective treatment for Genital Warts. An effective remedy is to dip a cotton swab into the tea tree oil and gently apply it to the wart. Repeat this once a day for 10 days 9. For dandruff and dry scalp. 10. In the form of aromatherapy, tea tree oil is used to treat colds, persistent coughs, acne, toothaches, and sunburn. For Cleaning. 11. To create an all-purpose cleaner, combine 2 teaspoons of tea tree oil in 2 cups of water in a spray bottle. 12 Tea tree oil, also known as Melaleuca alternifolia is, is an essential oil that has been around for quite a while but not until the past decade has its healing benefits been catching on like wildfire. 
The oil is extracted from a plant native to Australia and cannot be found naturally occurring elsewhere. Tea tree oil is a “jack of all trades” as far as remedies go. The easiest way to grasp the benefits is to think: skin issues + tea tree oil = healing (in most cases). Tea tree oil is known for its topical antiseptic and anti-fungal treatment or infection-reducing benefits. Tea tree oil is thought to have antiseptic properties and has been used to prevent and treat infections. Other traditional uses of tea tree oil include treatment of fungal infections (including fungal infections of the nails and athlete's foot), dental health, parasites, skin allergic reactions, and vaginal infections. In addition, there is evidence supporting tea tree oil use for acne; however, further research is needed.
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "num_negatives": 5
    }
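
As a rough illustration of what these parameters control, the sketch below computes a multiple-negatives ranking loss for a single query: the positive and its num_negatives negatives are scored, the scores are multiplied by scale, and a softmax cross-entropy rewards ranking the positive first. This is a simplified reading of the loss, not the library's implementation; it ignores the mini-batch caching and any activation applied to the raw scores, and the scores are invented.

```python
import math


def mnrl_single_query(pos_score, neg_scores, scale=20.0):
    """Cross-entropy over [positive, negatives] with the positive as the target class."""
    logits = [scale * pos_score] + [scale * s for s in neg_scores]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]


# Hypothetical scores in [0, 1] for one positive and five negatives.
loss_good = mnrl_single_query(0.9, [0.1, 0.2, 0.1, 0.3, 0.2])  # positive clearly on top
loss_bad = mnrl_single_query(0.2, [0.1, 0.2, 0.1, 0.3, 0.9])   # a negative outranks it
print(loss_good, loss_bad)
```

A larger scale sharpens the softmax, so a negative that outscores the positive is penalized more heavily.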
    

Evaluation Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 82,326 evaluation samples
  • Columns: query, positive, negative_1, negative_2, negative_3, negative_4, and negative_5
  • Approximate statistics based on the first 1000 samples (lengths in characters):
    column      type    min  mean    max
    query       string  11   33.37   95
    positive    string  53   435.97  939
    negative_1  string  63   416.84  1010
    negative_2  string  54   423.42  972
    negative_3  string  102  422.71  934
    negative_4  string  69   423.67  944
    negative_5  string  100  418.21  936
  • Samples:
    query positive negative_1 negative_2 negative_3 negative_4 negative_5
    how much is a pelvic ultrasound 1 The typical cost range is $250-$1,100, with a national average cost of $525, according to NewChoiceHealth.com. 2 For example, Concierge Medicine in California charges $275 for a pelvic ultrasound. 3 Baptist Memorial Health Care in Tennessee charges $395 for a transvaginal ultrasound. 1 And Saint Elizabeth Regional Medical Center in Nebraska, charges $240-$620 for a non-obstetric pelvic ultrasound, $484 for a transrectal ultrasound and $580 for a transvaginal ultrasound, not including the radiologist fee. 1. Transvaginal Ultrasound is an examination of the female pelvis and urogenital tract (kidneys and bladder). It helps to see if there is any abnormality in your uterus (or womb), cervix (the neck of the womb), endometrium (lining of the womb), fallopian tubes, ovaries, bladder and the pelvic cavity. Pelvic Ultrasound. Guide. A pelvic ultrasound uses sound waves to make a picture of the organs and structures in the lower belly (pelvis). A pelvic ultrasound looks at the bladder and: 1 The ovaries, uterus, cervix, and fallopian tubes of a woman (female organs). 2 The prostate gland and seminal vesicles of a man (male organs). Why It Is Done. For men and women, pelvic ultrasound may be done to: 1 Find the cause of blood in the urine (hematuria). 2 An ultrasound of the kidneys may also be done. 3 Find the cause of urinary problems. 4 Look at the size of the bladder before and after urination. 1 Dartmouth-Hitchcock Medical Center, in New Hampshire, charges $561 for a non-obstetric pelvic ultrasound, and $710 for an obstetric ultrasound, including the doctor fee, after an uninsured discount of 30%. 1 And Saint Elizabeth Regional Medical Center in Nebraska, charges $240-$620 for a non-obstetric pelvic ultrasound, $484 for a transrectal ultrasound and $580 for a transvaginal ultrasound, not including the radiologist fee. 
A pelvic ultrasound is a noninvasive (the skin is not pierced) procedure used to assess organs and structures within the female pelvis. A pelvic ultrasound allows quick visualization of the female pelvic organs and structures including the uterus, cervix, vagina, fallopian tubes, and ovaries. A pelvic ultrasound has no known risks. Typical costs: A pelvic ultrasound typically is covered by health insurance when ordered by a doctor for diagnosis of a problem. For patients covered by health insurance, out-of-pocket costs typically consist of a copay of $10 -$50 or more, or coinsurance of 10%-50% or more. For patients not covered by health insurance, the cost of a pelvic ultrasound typically varies by provider and geographic region. The typical cost range is $250 -$1,100, with a national average cost of $525, according to NewChoiceHealth.com. For example, Concierge Medicine in California charges $275 for a pelvic ultrasound. Baptist Memorial Health Care in Tennessee charges $395 for a transvaginal ultrasound
    what is in paella ingredients 1 Heat 2 tablespoons olive oil in a large skillet or paella pan over medium heat. 2 Stir in garlic, red pepper flakes, and rice. 3 Cook, stirring, to coat rice with oil, about 3 minutes. 4 Stir in saffron threads, bay leaf, parsley, chicken stock, and lemon zest. Ready In. 1 In a medium bowl, mix together 2 tablespoons olive oil, paprika, oregano, and salt and pepper. 2 Stir in chicken pieces to coat. 3 Cover, and refrigerate. 4 Heat 2 tablespoons olive oil in a large skillet or paella pan over medium heat. 5 Paella Ingredients. Great paella starts with great ingredients. From essentials like rice, saffron, and paprika to extras like piquillo peppers and chorizo, our mission is to deliver the best Spanish foods to your table. Bomba Rice (1/2 kilo to 1 kilo bag). Heat oil in a large nonstick skillet over medium-high heat. Sprinkle shrimp with 1/4 teaspoon salt and 1/8 teaspoon black pepper. Add shrimp to pan; saute sauté 4 minutes or until shrimp are. Done place shrimp in a medium. Bowl add chorizo to, pan and cook for 1 minute or until. browned Cover, reduce heat, and simmer 25 minutes or until rice is tender. Add shrimp mixture, peas, 1/4 cup water, and mussels to pan. Cover and cook 8 minutes over medium heat or until mussels open; discard any unopened shells. Remove from heat, and stir in bell pepper and cilantro. Sprinkle chicken with 1/4 teaspoon salt and remaining 1/8 teaspoon black pepper. Add chicken to pan, and cook for 2 minutes on each side or until browned. Add onion and garlic to pan; cook 2 minutes or until tender, stirring frequently. Stir in the tomato, capers, and saffron; cook 1 minute. Cover, reduce heat, and simmer 25 minutes or until rice is tender. Add shrimp mixture, peas, 1/4 cup water, and mussels to pan. Cover and cook 8 minutes over medium heat or until mussels open; discard any unopened shells. Remove from heat, and stir in bell pepper and cilantro. 
To prepare paella, combine water, saffron, and broth in a large saucepan. Bring to a simmer (do not boil). Keep warm over low heat. Peel and devein shrimp, leaving tails intact; set aside. Heat 1 tablespoon oil in a large paella pan or large skillet over medium-high heat. Add chicken; saute 2 minutes on each side. Remove from pan. Add sausage and prosciutto; saute 2 minutes. Remove from pan. Add shrimp, and saute 2 minutes. Remove from pan. Reduce heat to medium-low. Valencian paella is believed to be the original recipe and consists of white rice, green beans (bajoqueta and tavella), meat (chicken and rabbit), white beans (garrofon), garrofón, snails and seasoning such as saffron and. rosemary Consequently, paella recipes went from being relatively simple to including a wide variety of seafood, meat, sausage, (even chorizo) vegetables and many different seasonings. However, the most globally popular recipe is seafood paella.
    what is the geocentric model A Geocentric theory is an astronomical theory which describes the universe as a Geocentric system, i.e., a system which puts the Earth in the center of the universe, and describes other objects from the point of view of the Earth. In astronomy, the geocentric model (also known as geocentrism, or the Ptolemaic system) is a description of the cosmos where Earth is at the orbital center of all celestial bodies. The second observation supporting the geocentric model was that the Earth does not seem to move from the perspective of an Earth-bound observer, and that it is solid, stable, and unmoving. The geocentric theory was the model that the earth is at the center of the universe, and everything (sun, stars,moon,ect.) revolve around it daily. It was disproven about 500 … years ago by Copernipus in favor of the heliocentric model which said that the earth went around the sun. Geo means earth, and centric means centered.. The geocentric view is therefore the earth centered view. The word typically refers to the view of the earth as the ce … nter of the universe. The earth is the center of human interest. Beyond that the earth/moon system isn't the center of anything. Geocentric model. The geocentric model of the cosmos is a paradigm which places the Earth at the center of the universe. Common in ancient Greece, it was believed by both Aristotle and Ptolemy. Most Greeks assumed that the Sun, Moon, stars, and planets orbit Earth. Similar ideas were held in ancient China. The geocentric model was gradually replaced by the heliocentric model of Copernicus and Galileo due to the simplicity and predictive accuracy of that newer model. In this model, a set of fifty-five concentric crystalline spheres were considered to hold the Sun, the planets, and the stars. 
The geocentric model, also known as the Ptolemaic system, is a theory that was developed by philosophers in Ancient Greece and was named after the philosopher Claudius Ptolemy who lived circa 90 to 168 A.D. It was developed to explain how the planets, the Sun, and even the stars orbit around the Earth. Copernicus proposed a heliocentric model of the solar system – a model where everything orbited around the Sun. Today, with advancements in science and technology, the geocentric model seems preposterous. Simple tools, such as the telescope – which helped convince Galileo that the Earth was not the center of the universe – can prove that ancient theory incorrect The geocentric model of our solar system is how people believed the universe to be hundreds of years ago, in which the Earth was the center of the universe, and the sun and in … ner planets orbited the Earth. Wikipedia has plenty more information on this. Geo means earth, and centric means centered.. The geocentric view is therefore the earth centered view. The word typically refers to the view of the earth as the ce … nter of the universe. The earth is the center of human interest. Beyond that the earth/moon system isn't the center of anything.
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "num_negatives": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
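
To make the interaction of learning_rate and warmup_ratio concrete, here is a sketch of a linear warmup followed by linear decay, which is what the linear lr_scheduler_type listed under All Hyperparameters conventionally means. The function name and the total step count of 505 are assumptions; the latter is inferred from the training log, where step 500 corresponds to roughly epoch 0.99.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 505  # assumption: inferred from the training log, not stated in the card
for step in (0, 25, 51, 278, 505):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate peaks at 2e-05 after 51 steps and falls linearly to zero by the final step.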

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss  NanoMSMARCO_ndcg@10  NanoNFCorpus_ndcg@10  NanoNQ_ndcg@10    NanoBEIR_mean_ndcg@10
-1      -1    -              -                0.0063 (-0.5341)     0.1991 (-0.1260)      0.0144 (-0.4862)  0.0733 (-0.3821)
0.0020  1     9.6141         -                -                    -                     -                 -
0.0396  20    5.8181         -                -                    -                     -                 -
0.0792  40    2.3293         -                -                    -                     -                 -
0.1188  60    2.1089         -                -                    -                     -                 -
0.1584  80    1.7934         -                -                    -                     -                 -
0.1980  100   1.6682         1.5977           0.3668 (-0.1736)     0.3009 (-0.0242)      0.4574 (-0.0432)  0.3750 (-0.0803)
0.2376  120   1.5908         -                -                    -                     -                 -
0.2772  140   1.5464         -                -                    -                     -                 -
0.3168  160   1.5186         -                -                    -                     -                 -
0.3564  180   1.4791         -                -                    -                     -                 -
0.3960  200   1.4319         1.4150           0.4541 (-0.0863)     0.3288 (+0.0038)      0.6130 (+0.1123)  0.4653 (+0.0099)
0.4356  220   1.4298         -                -                    -                     -                 -
0.4752  240   1.4115         -                -                    -                     -                 -
0.5149  260   1.3997         -                -                    -                     -                 -
0.5545  280   1.3786         -                -                    -                     -                 -
0.5941  300   1.3716         1.3619           0.4681 (-0.0724)     0.3412 (+0.0161)      0.5753 (+0.0746)  0.4615 (+0.0061)
0.6337  320   1.378          -                -                    -                     -                 -
0.6733  340   1.3577         -                -                    -                     -                 -
0.7129  360   1.2947         -                -                    -                     -                 -
0.7525  380   1.3589         -                -                    -                     -                 -
0.7921  400   1.3628         1.3226           0.5075 (-0.0329)     0.3316 (+0.0066)      0.6003 (+0.0997)  0.4798 (+0.0244)
0.8317  420   1.3227         -                -                    -                     -                 -
0.8713  440   1.3347         -                -                    -                     -                 -
0.9109  460   1.3518         -                -                    -                     -                 -
0.9505  480   1.3185         -                -                    -                     -                 -
0.9901  500   1.3089         1.3084           0.5191 (-0.0213)     0.3283 (+0.0033)      0.6032 (+0.1025)  0.4835 (+0.0282)
-1      -1    -              -                0.5191 (-0.0213)     0.3283 (+0.0033)      0.6032 (+0.1025)  0.4835 (+0.0282)
  • The bold row denotes the saved checkpoint.
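
With load_best_model_at_end set to True, the trainer reloads the best evaluated checkpoint when training finishes. The sketch below picks that checkpoint from the evaluation rows of the log, assuming the criterion is either the lowest validation loss or the highest NanoBEIR mean ndcg@10; the card does not state which metric was used.

```python
# (step, validation loss, NanoBEIR_mean_ndcg@10) for the rows that include an evaluation.
eval_rows = [
    (100, 1.5977, 0.3750),
    (200, 1.4150, 0.4653),
    (300, 1.3619, 0.4615),
    (400, 1.3226, 0.4798),
    (500, 1.3084, 0.4835),
]

best_by_loss = min(eval_rows, key=lambda row: row[1])
best_by_ndcg = max(eval_rows, key=lambda row: row[2])
print(best_by_loss[0], best_by_ndcg[0])
```

Both criteria select the same step here, which matches the final log row repeating that step's metrics.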

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.5.0.dev0
  • Transformers: 4.49.0.dev0
  • PyTorch: 2.6.0.dev20241112+cu121
  • Accelerate: 1.2.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size: 150M params (Safetensors, tensor type F32)