CrossEncoder based on bansalaman18/bert-uncased_L-8_H-768_A-12

This is a Cross Encoder model fine-tuned from bansalaman18/bert-uncased_L-8_H-768_A-12 on the ms_marco dataset using the sentence-transformers library. It computes a score for each pair of texts; these scores can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-8_H-768_A-12-listnet")
# Get scores for pairs of texts
pairs = [
    ['do mice come out during the day or night', 'Making the world better, one answer at a time. Most definitely. Although mice tend to come out in the night when most of the occupants of the house are asleep or immobile. This allows them to stealthily infiltrate any of your food bags, crumbs etc. During the day time, mice will normally not come out, but if opportunities exist then they surely can.'],
    ['do mice come out during the day or night', "It is really hot during the day but it is much cooler during the night so that's why rats stay underground during the hot day and come out during the cooler night."],
    ['do mice come out during the day or night', 'Rats and mice are usually active at night and are not likely to gain entry through open doors and windows during the day. The exception might be when they are under real pressure to find food and shelter. Then they may take the risk of venturing out'],
    ['do mice come out during the day or night', 'Hypothesis. The activity levels of the mice will increase when tested at night, when the lights are off, when compared to during the day when the lights are on. Activity level is defined by number of rotations the mice do in a 10-minute period.'],
    ['do mice come out during the day or night', 'If you do see a mouse during the day, you have a real problem. They usually only come out at night, and only during daylight, if they are having trouble getting food. The daytime mice you see, are not the aggressive mice in the group, but the ones that have to wait for the big guys to eat first. Take action or move out.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'do mice come out during the day or night',
    [
        'Making the world better, one answer at a time. Most definitely. Although mice tend to come out in the night when most of the occupants of the house are asleep or immobile. This allows them to stealthily infiltrate any of your food bags, crumbs etc. During the day time, mice will normally not come out, but if opportunities exist then they surely can.',
        "It is really hot during the day but it is much cooler during the night so that's why rats stay underground during the hot day and come out during the cooler night.",
        'Rats and mice are usually active at night and are not likely to gain entry through open doors and windows during the day. The exception might be when they are under real pressure to find food and shelter. Then they may take the risk of venturing out',
        'Hypothesis. The activity levels of the mice will increase when tested at night, when the lights are off, when compared to during the day when the lights are on. Activity level is defined by number of rotations the mice do in a 10-minute period.',
        'If you do see a mouse during the day, you have a real problem. They usually only come out at night, and only during daylight, if they are having trouble getting food. The daytime mice you see, are not the aggressive mice in the group, but the ones that have to wait for the big guys to eat first. Take action or move out.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
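
The list returned by `rank` pairs each document's position in the input list (`corpus_id`) with its score, ordered from the highest score to the lowest. The ordering step can be sketched in plain Python as below. The scores are made-up placeholders, not outputs of this model, and `rank_from_scores` is a hypothetical helper rather than part of the library.

```python
def rank_from_scores(scores):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by descending score."""
    ranked = [{"corpus_id": i, "score": float(s)} for i, s in enumerate(scores)]
    ranked.sort(key=lambda item: item["score"], reverse=True)
    return ranked


# Placeholder scores for five documents (illustrative values only).
example_scores = [0.12, -0.40, 0.95, 0.30, 0.05]
example_ranks = rank_from_scores(example_scores)

# Document indices from most to least relevant under these placeholder scores.
top_ids = [item["corpus_id"] for item in example_ranks]
print(top_ids)
```

Looking up `corpus_id` in the original document list recovers the text of each ranked document.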

Evaluation

Metrics

Cross Encoder Reranking

  • Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
  • Evaluated with CrossEncoderRerankingEvaluator with these parameters:
    {
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric  | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100
map     | 0.1044 (-0.3852) | 0.2809 (+0.0200)  | 0.0454 (-0.3743)
mrr@10  | 0.0800 (-0.3975) | 0.4532 (-0.0466)  | 0.0212 (-0.4055)
ndcg@10 | 0.0897 (-0.4507) | 0.2917 (-0.0334)  | 0.0288 (-0.4718)
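
For readers unfamiliar with these metrics, the following is a minimal sketch of how MRR@10, NDCG@10 and average precision can be computed for one query from binary relevance labels listed in ranked order. It uses the standard textbook definitions and is an assumption about the general approach; the exact behaviour of `CrossEncoderRerankingEvaluator` (for example tie handling or how the ideal ranking is formed) may differ. MAP is the mean of the per-query average precision.

```python
import math


def mrr_at_k(labels, k=10):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, label in enumerate(labels[:k], start=1):
        if label > 0:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(labels, k=10):
    """NDCG@k with the labels as gains and a log2 position discount."""
    def dcg(ordered):
        return sum(
            gain / math.log2(pos + 1)
            for pos, gain in enumerate(ordered[:k], start=1)
        )

    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0


def average_precision(labels):
    """Mean of precision@i over the positions i that hold a relevant item."""
    hits, total = 0, 0.0
    for pos, label in enumerate(labels, start=1):
        if label > 0:
            hits += 1
            total += hits / pos
    return total / hits if hits else 0.0


# Relevance labels in ranked order: the only relevant document sits at rank 3.
ranked_labels = [0, 0, 1, 0, 0]
print(mrr_at_k(ranked_labels), ndcg_at_k(ranked_labels), average_precision(ranked_labels))
```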

Cross Encoder Nano BEIR

  • Dataset: NanoBEIR_R100_mean
  • Evaluated with CrossEncoderNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ],
        "rerank_k": 100,
        "at_k": 10,
        "always_rerank_positives": true
    }
    
Metric  | Value
map     | 0.1436 (-0.2465)
mrr@10  | 0.1848 (-0.2832)
ndcg@10 | 0.1368 (-0.3186)
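
The NanoBEIR_R100_mean values appear to be the unweighted mean of the three per-dataset results reported in the Cross Encoder Reranking table. The short check below recomputes that mean from the rounded per-dataset numbers. It is only an arithmetic sanity check, and small differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
# Per-dataset values in the order NanoMSMARCO_R100, NanoNFCorpus_R100, NanoNQ_R100.
per_dataset = {
    "map": [0.1044, 0.2809, 0.0454],
    "mrr@10": [0.0800, 0.4532, 0.0212],
    "ndcg@10": [0.0897, 0.2917, 0.0288],
}
reported_mean = {"map": 0.1436, "mrr@10": 0.1848, "ndcg@10": 0.1368}

# Unweighted mean across the three datasets for each metric.
recomputed = {name: sum(vals) / len(vals) for name, vals in per_dataset.items()}

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.4f}, reported {reported_mean[name]:.4f}")
```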

Training Details

Training Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 78,704 training samples
  • Columns: query, docs, and labels
  • Approximate statistics based on the first 1000 samples:
    Column | Type   | Details
    query  | string | min: 10 characters; mean: 33.52 characters; max: 99 characters
    docs   | list   | min: 3 elements; mean: 6.50 elements; max: 10 elements
    labels | list   | min: 3 elements; mean: 6.50 elements; max: 10 elements
  • Samples:
    query docs labels
    what are zodiac boats made of ["Most foul weather gear is 220 or 440 denier. Zodiac boats are made of 1000 denier and up fabric (the HD's are as high as 1800). Some other companies vary the weight of the fabric with the weight of the boat. Some light duty boats are 200 denier. The tightness of the weave is another measure. Zodiac uses a polyurethane fabric called Strongan and assembles their inflatable boats by thermobonding the fabric. HAND-GLUED SEAMS = HYPALON FABRIC Traditional assembly method for hypalon fabric. The 2 panels are glued, one overlapping the other. You will only see part of fabric that covers the other section.", 'An inflatable boat is a lightweight boat constructed with its sides and bow made of flexible tubes containing pressurised gas. Meanwhile, in France a very similar pattern was emerging. The airship company Zodiac began to develop inflatable rubber boats and in 1934 they invented the inflatable kayak and catamaran which led on to the modern day Zodiac inflatable boat.', 'Mastering the Ele... [1, 0, 0, 0, 0, ...]
    what is a capping inversion ['Report Abuse. A capping inversion is usually a situation where a layer of relatively warm air exists over a cooler lower air layer. Recall that an inversion is a layer of warm air above a cooler layer. Since the inversion caps the air, keeping it from rising, it suppresses convection. That said, this can be a good thing. The capping inversion can keep storms from developing early on in the day, allowing the surface to warm and become more unstable, and in some cases allow more moisture to advect in.', 'In meteorology, an inversion is a deviation from the normal change of an atmospheric property with altitude. It almost always refers to a temperature inversion, i.e. an increase in temperature with height, or to the layer (inversion layer) within which such an increase occurs. Sometimes the inversion layer is at a high enough altitude that cumulus clouds can condense but can only spread out under the inversion layer. This decreases the amount of sunlight reaching the ground and prevent... [1, 0, 0, 0, 0, ...]
    to be born of water and the spirit what does it mean ['“Born of water and Spirit” occurs as a reiteration of John 3:3’s phrase “born again”. The word, “again” possess two meanings. Though Nicodemus translates the word as “a second time,” the word also means “from above.” It is this later interpretation, which Jesus seems to intend. In other words, we are not to take this is first you must be born of water and then of spirit; rather, unless one is born of water and spirit in v5 is parallel to unless one is born again in v3.', 'In verse 5, Jesus proceeds to say, Unless one is born of water and the Spirit, you cannot enter the kingdom of God. Nicodemus, who was a Pharisee, believed like the other Jews that because he was born a Jew and kept God s ordinances that he should automatically enter into the kingdom of God. You must be born again that which is born of the Spirit is spirit.. The new birth from above is a second birth which gives us eternal life. V.5 The new birth is invisible, he likens it to the wind. It is not from the water benea... [1, 0, 0, 0, 0, ...]
  • Loss: ListNetLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 16
    }
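
As a rough illustration of the objective described in the ListNet paper (Cao et al., 2007), the sketch below computes the top-one cross-entropy for a single query: the labels and the predicted scores are each turned into a probability distribution with softmax, and the loss is the cross-entropy between the two. This is a plain-Python sketch under that assumption, not the library's `ListNetLoss` implementation, which presumably also has to deal with batching and lists of different lengths. `listnet_top_one_loss` is a hypothetical name, and the identity activation from the parameters above corresponds to using the raw scores unchanged.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top_one_loss(scores, labels):
    """Cross-entropy between softmax(labels) and softmax(scores) for one query."""
    target = softmax([float(l) for l in labels])
    predicted = softmax([float(s) for s in scores])
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# One relevant document (label 1) among three candidates.
labels = [1, 0, 0]
good = listnet_top_one_loss([2.0, 0.0, 0.0], labels)  # relevant doc scored highest
bad = listnet_top_one_loss([0.0, 2.0, 0.0], labels)   # irrelevant doc scored highest
print(good < bad)
```

The loss is lower when the relevant document receives the highest score, which is the behaviour the training signal rewards.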
    

Evaluation Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 1,000 evaluation samples
  • Columns: query, docs, and labels
  • Approximate statistics based on the first 1000 samples:
    Column | Type   | Details
    query  | string | min: 11 characters; mean: 32.97 characters; max: 75 characters
    docs   | list   | min: 3 elements; mean: 6.50 elements; max: 10 elements
    labels | list   | min: 3 elements; mean: 6.50 elements; max: 10 elements
  • Samples:
    query docs labels
    do mice come out during the day or night ['Making the world better, one answer at a time. Most definitely. Although mice tend to come out in the night when most of the occupants of the house are asleep or immobile. This allows them to stealthily infiltrate any of your food bags, crumbs etc. During the day time, mice will normally not come out, but if opportunities exist then they surely can.', "It is really hot during the day but it is much cooler during the night so that's why rats stay underground during the hot day and come out during the cooler night.", 'Rats and mice are usually active at night and are not likely to gain entry through open doors and windows during the day. The exception might be when they are under real pressure to find food and shelter. Then they may take the risk of venturing out', 'Hypothesis. The activity levels of the mice will increase when tested at night, when the lights are off, when compared to during the day when the lights are on. Activity level is defined by number of rotations the mice do in a 10-minute period.', 'If you do see a mouse during the day, you have a real problem. They usually only come out at night, and only during daylight, if they are having trouble getting food. The daytime mice you see, are not the aggressive mice in the group, but the ones that have to wait for the big guys to eat first. Take action or move out.'] [1, 0, 0, 0, 0]
    what is an association area of the cerebral cortex ["Association Cortex Association cortex is the cerebral cortex outside the primary areas (Figure 1). It is essential for mental functions that are more complex than detecting basic dimensions of sensory stimulation, for which primary sensory areas appear to be necessary. The surface area of the human cerebral cortex (and monkey's, dog's, and horse's as well) is further enlarged by the sulci and gyri shown by the curving lines on the cerebral hemispheres. Figure 1-2e. Primary and association cortical areas in human and rat. The pink area, which shows the association cortex, is much larger and also takes up a much larger percentage of cortex in the human than in the rat cerebral hemisphere", "Association areas take up an increasingly larger percentage of the cerebral cortex as brain size increases among different species. Figure 2-1e illustrates the increase in relative size of association areas as the brain gets bigger. Association cortex is shown as the pink area outside the primary co... [1, 0, 0, 0, 0, ...]
    definition of vitamin c ['Definition of VITAMIN C. : a water-soluble vitamin C6H8O6 found in plants and especially in fruits and leafy vegetables or made synthetically and used in the prevention and treatment of scurvy and as an antioxidant for foods —called also ascorbic acid. See vitamin C defined for kids. ADVERTISEMENT.', "The name 'vitamin C' always refers to the L-enantiomer of ascorbic acid and its oxidized forms. The opposite D-enantiomer called D-ascorbate has equal antioxidant power, but is not found in nature, and has no physiological significance. Vitamin C is a cofactor in at least eight enzymatic reactions, including several collagen synthesis reactions that, when dysfunctional, cause the most severe symptoms of scurvy. In animals, these reactions are especially important in wound-healing and in preventing bleeding from capillaries.", 'Vitamin C or L-ascorbic acid, or simply ascorbate (the anion of ascorbic acid), is an essential nutrient for humans and certain other animal species. Vitamin C is... [1, 0, 0, 0, 0, ...]
  • Loss: ListNetLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 16
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • load_best_model_at_end: True
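
To make these settings concrete, the sketch below works out the number of optimizer steps and the learning-rate curve they imply. It assumes a single device and keeps the last partial batch; gradient accumulation is 1 and the scheduler is linear, as listed under All Hyperparameters. The schedule function is a simplified re-implementation for illustration (linear warmup, then linear decay to zero), not a call into the training library, and `linear_schedule_lr` is a hypothetical name.

```python
import math

num_samples = 78_704   # training set size reported above
batch_size = 16        # per_device_train_batch_size
num_epochs = 1         # num_train_epochs
base_lr = 2e-05        # learning_rate
warmup_ratio = 0.1     # warmup_ratio

# Assumption: one device, gradient_accumulation_steps = 1, last partial batch kept.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_schedule_lr(step):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(steps_per_epoch, warmup_steps)
print(linear_schedule_lr(250), linear_schedule_lr(warmup_steps), linear_schedule_lr(total_steps))
```

Under these assumptions, step 250 falls at about 0.0508 of an epoch, which agrees with the Epoch column in the training logs.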

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch  | Step  | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10
-1     | -1    | -             | -               | 0.0559 (-0.4845)         | 0.2567 (-0.0684)          | 0.0097 (-0.4909)    | 0.1074 (-0.3480)
0.0002 | 1     | 2.0831        | -               | -                        | -                         | -                   | -
0.0508 | 250   | 2.099         | -               | -                        | -                         | -                   | -
0.1016 | 500   | 2.0925        | 2.0983          | 0.0555 (-0.4849)         | 0.2598 (-0.0652)          | 0.0494 (-0.4513)    | 0.1216 (-0.3338)
0.1525 | 750   | 2.0823        | -               | -                        | -                         | -                   | -
0.2033 | 1000  | 2.0891        | 2.0973          | 0.0679 (-0.4725)         | 0.2620 (-0.0630)          | 0.0496 (-0.4511)    | 0.1265 (-0.3289)
0.2541 | 1250  | 2.0855        | -               | -                        | -                         | -                   | -
0.3049 | 1500* | 2.0898        | 2.0964          | 0.0897 (-0.4507)         | 0.2917 (-0.0334)          | 0.0288 (-0.4718)    | 0.1368 (-0.3186)
0.3558 | 1750  | 2.091         | -               | -                        | -                         | -                   | -
0.4066 | 2000  | 2.0875        | 2.0961          | 0.0698 (-0.4706)         | 0.2837 (-0.0414)          | 0.0467 (-0.4540)    | 0.1334 (-0.3220)
0.4574 | 2250  | 2.086         | -               | -                        | -                         | -                   | -
0.5082 | 2500  | 2.0835        | 2.0952          | 0.0526 (-0.4878)         | 0.2637 (-0.0614)          | 0.0447 (-0.4559)    | 0.1203 (-0.3350)
0.5591 | 2750  | 2.0873        | -               | -                        | -                         | -                   | -
0.6099 | 3000  | 2.0832        | 2.0951          | 0.0586 (-0.4818)         | 0.2491 (-0.0759)          | 0.0501 (-0.4505)    | 0.1193 (-0.3361)
0.6607 | 3250  | 2.0844        | -               | -                        | -                         | -                   | -
0.7115 | 3500  | 2.0839        | 2.0950          | 0.0613 (-0.4791)         | 0.2518 (-0.0732)          | 0.0444 (-0.4562)    | 0.1192 (-0.3362)
0.7624 | 3750  | 2.0875        | -               | -                        | -                         | -                   | -
0.8132 | 4000  | 2.0905        | 2.0951          | 0.0674 (-0.4730)         | 0.2587 (-0.0663)          | 0.0417 (-0.4589)    | 0.1226 (-0.3328)
0.8640 | 4250  | 2.0858        | -               | -                        | -                         | -                   | -
0.9148 | 4500  | 2.0867        | 2.0948          | 0.0673 (-0.4731)         | 0.2449 (-0.0802)          | 0.0441 (-0.4566)    | 0.1187 (-0.3366)
0.9656 | 4750  | 2.084         | -               | -                        | -                         | -                   | -
-1     | -1    | -             | -               | 0.0897 (-0.4507)         | 0.2917 (-0.0334)          | 0.0288 (-0.4718)    | 0.1368 (-0.3186)
  • The row marked with * (step 1500) denotes the saved checkpoint; its metrics match the final evaluation row.
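
With load_best_model_at_end enabled, the saved checkpoint is the one with the best value of the selection metric. The model card does not state which metric was used, so the sketch below assumes it was NanoBEIR_R100_mean_ndcg@10 with higher being better. Under that assumption, picking the maximum over the evaluation rows reproduces the checkpoint whose metrics match the final row of the log.

```python
# (step, NanoBEIR_R100_mean_ndcg@10) pairs copied from the evaluation rows above.
eval_rows = [
    (500, 0.1216), (1000, 0.1265), (1500, 0.1368), (2000, 0.1334),
    (2500, 0.1203), (3000, 0.1193), (3500, 0.1192), (4000, 0.1226),
    (4500, 0.1187),
]

# Assumption: higher mean NDCG@10 is better and decides the saved checkpoint.
best_step, best_score = max(eval_rows, key=lambda row: row[1])
print(best_step, best_score)
```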

Framework Versions

  • Python: 3.10.18
  • Sentence Transformers: 5.0.0
  • Transformers: 4.56.0.dev0
  • PyTorch: 2.7.1+cu126
  • Accelerate: 1.9.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ListNetLoss

@inproceedings{cao2007learning,
    title={Learning to Rank: From Pairwise Approach to Listwise Approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}

Model size: 81.1M params (Safetensors, tensor type F32)