SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1

This is a sentence-transformers model finetuned from sentence-transformers/multi-qa-MiniLM-L6-cos-v1. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 22.7M parameters (F32)

Model Sources

  • Documentation: https://sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
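
Because of the final Normalize() module, every embedding has unit L2 norm, so dot product and cosine similarity give the same ranking. A quick check (using the model id from this repository):

import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("SwapnilVishwakarma/USMED")

# Transformer -> mean pooling -> L2 normalization, as listed above
embedding = model.encode("Bone Saw", convert_to_tensor=True)
print(embedding.shape)               # torch.Size([384])
print(torch.linalg.norm(embedding))  # ~1.0, thanks to Normalize()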

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("SwapnilVishwakarma/USMED")
# Run inference
sentences = [
    'Bone Saw',
    'Bone Saw Sklar  Inch',
    'Mask Component  Headgear Opus',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
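
Beyond pairwise similarity, the same embeddings support semantic search: encode a query and a set of catalog entries, then rank the entries by cosine similarity against the query. A minimal sketch reusing the strings above (any product-style text works the same way):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("SwapnilVishwakarma/USMED")

query = "Bone Saw"
corpus = [
    "Bone Saw Sklar  Inch",
    "Mask Component  Headgear Opus",
    "Tissue Cassette Fluorescent Green Acetal",
]

query_embedding = model.encode([query])
corpus_embeddings = model.encode(corpus)

# Cosine similarity between the query and every corpus entry (1 x 3)
scores = model.similarity(query_embedding, corpus_embeddings)[0]

# Print entries from most to least similar
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.4f}  {corpus[idx]}")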

Training Details

Training Dataset

Unnamed Dataset

  • Size: 231,882 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string; min: 3 tokens, mean: 14.16 tokens, max: 49 tokens
    positive: string; min: 3 tokens, mean: 13.28 tokens, max: 53 tokens
  • Samples:
    anchor: Biopsy Cassette Thermo Scientific Shandon Acetal Blue
    positive: Biopsy Cassette Blue Acetal
    anchor: Tissue Cassette Thermo Scientific Shandon Acetal Fluorescent Green
    positive: Tissue Cassette Fluorescent Green Acetal
    anchor: Tissue Cassette Thermo Scientific Shandon Acetal Fluorescent Pink
    positive: Tissue Cassette Fluorescent Pink Acetal
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20,
        "similarity_fct": "cos_sim"
    }
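
Taken together, the pair format and loss parameters above correspond to a training setup roughly like the following sketch (a reconstruction from the card's metadata, not the author's original script; the sample pair is copied from the table above):

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-cos-v1")

# (anchor, positive) pairs; the loss treats every other positive in the
# batch as an in-batch negative, so no explicit negatives are needed
train_dataset = Dataset.from_dict({
    "anchor": ["Biopsy Cassette Thermo Scientific Shandon Acetal Blue"],
    "positive": ["Biopsy Cassette Blue Acetal"],
})

# scale=20 with cosine similarity matches the parameters listed above
# (these are also the library defaults)
loss = MultipleNegativesRankingLoss(model, scale=20)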
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 4
  • batch_sampler: no_duplicates
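
Continuing the sketch from the dataset section, these two non-default settings map onto the Sentence Transformers v3 training API roughly as follows (output_dir is a placeholder, not from the original run):

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    num_train_epochs=4,
    # NO_DUPLICATES keeps repeated texts out of a batch, where they
    # would otherwise act as false negatives for the ranking loss
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,                  # from the sketch above
    args=args,
    train_dataset=train_dataset,  # from the sketch above
    loss=loss,
)
trainer.train()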

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0172 500 0.1383
0.0345 1000 0.1183
0.0517 1500 0.1054
0.0690 2000 0.0727
0.0862 2500 0.0829
0.1035 3000 0.0559
0.1207 3500 0.1274
0.1380 4000 0.0587
0.1552 4500 0.0704
0.1725 5000 0.0863
0.1897 5500 0.0888
0.2070 6000 0.1099
0.2242 6500 0.1126
0.2415 7000 0.1192
0.2587 7500 0.1082
0.2760 8000 0.1069
0.2932 8500 0.1268
0.3105 9000 0.0913
0.3277 9500 0.1267
0.3450 10000 0.1156
0.3622 10500 0.1522
0.3795 11000 0.088
0.3967 11500 0.0906
0.4140 12000 0.0776
0.4312 12500 0.0956
0.4485 13000 0.1111
0.4657 13500 0.0889
0.4830 14000 0.0765
0.5002 14500 0.1162
0.5175 15000 0.0581
0.5347 15500 0.0831
0.5520 16000 0.0915
0.5692 16500 0.0623
0.5865 17000 0.0702
0.6037 17500 0.0447
0.6210 18000 0.0715
0.6382 18500 0.0749
0.6555 19000 0.3381
0.6727 19500 0.0749
0.6900 20000 0.0614
0.7072 20500 0.1093
0.7245 21000 0.0847
0.7417 21500 0.063
0.7590 22000 0.0657
0.7762 22500 0.061
0.7935 23000 0.0837
0.8107 23500 0.0989
0.8280 24000 0.0523
0.8452 24500 0.0817
0.8625 25000 0.0533
0.8797 25500 0.0584
0.8970 26000 0.0353
0.9142 26500 0.0146
0.9315 27000 0.0831
0.9487 27500 0.049
0.9660 28000 0.0741
0.9832 28500 0.0469
1.0004 29000 0.063
1.0177 29500 0.0846
1.0349 30000 0.058
1.0522 30500 0.0701
1.0694 31000 0.0451
1.0867 31500 0.0506
1.1039 32000 0.0311
1.1212 32500 0.0761
1.1384 33000 0.0356
1.1557 33500 0.0387
1.1729 34000 0.0532
1.1902 34500 0.0568
1.2074 35000 0.0654
1.2247 35500 0.0726
1.2419 36000 0.0839
1.2592 36500 0.0698
1.2764 37000 0.0824
1.2937 37500 0.0832
1.3109 38000 0.0622
1.3282 38500 0.0849
1.3454 39000 0.0724
1.3627 39500 0.1039
1.3799 40000 0.0581
1.3972 40500 0.0561
1.4144 41000 0.0666
1.4317 41500 0.0687
1.4489 42000 0.0793
1.4662 42500 0.0638
1.4834 43000 0.0544
1.5007 43500 0.0686
1.5179 44000 0.0408
1.5352 44500 0.0602
1.5524 45000 0.0663
1.5697 45500 0.0488
1.5869 46000 0.047
1.6042 46500 0.0326
1.6214 47000 0.0644
1.6387 47500 0.0582
1.6559 48000 0.2124
1.6732 48500 0.0482
1.6904 49000 0.0389
1.7077 49500 0.0847
1.7249 50000 0.0636
1.7422 50500 0.044
1.7594 51000 0.0403
1.7767 51500 0.0397
1.7939 52000 0.0545
1.8112 52500 0.0681
1.8284 53000 0.0422
1.8456 53500 0.0522
1.8629 54000 0.0394
1.8801 54500 0.041
1.8974 55000 0.0232
1.9146 55500 0.0176
1.9319 56000 0.0471
1.9491 56500 0.0337
1.9664 57000 0.0439
1.9836 57500 0.0321
2.0008 58000 0.0433
2.0181 58500 0.0672
2.0353 59000 0.0441
2.0526 59500 0.0459
2.0698 60000 0.0342
2.0871 60500 0.0369
2.1043 61000 0.0205
2.1216 61500 0.0605
2.1388 62000 0.0252
2.1561 62500 0.0276
2.1733 63000 0.0406
2.1906 63500 0.0451
2.2078 64000 0.0447
2.2251 64500 0.0523
2.2423 65000 0.062
2.2596 65500 0.0514
2.2768 66000 0.0677
2.2941 66500 0.0655
2.3113 67000 0.0494
2.3286 67500 0.0728
2.3458 68000 0.0585
2.3631 68500 0.0866
2.3803 69000 0.0409
2.3976 69500 0.0429
2.4148 70000 0.0534
2.4321 70500 0.0542
2.4493 71000 0.0563
2.4666 71500 0.0488
2.4838 72000 0.0401
2.5011 72500 0.0575
2.5183 73000 0.0344
2.5356 73500 0.052
2.5528 74000 0.0569
2.5701 74500 0.0408
2.5873 75000 0.0384
2.6046 75500 0.0281
2.6218 76000 0.0447
2.6391 76500 0.0495
2.6563 77000 0.1492
2.6736 77500 0.0314
2.6908 78000 0.0314
2.7081 78500 0.0691
2.7253 79000 0.0496
2.7426 79500 0.0309
2.7598 80000 0.0323
2.7771 80500 0.0357
2.7943 81000 0.0387
2.8116 81500 0.0544
2.8288 82000 0.0297
2.8461 82500 0.0384
2.8633 83000 0.0332
2.8806 83500 0.031
2.8978 84000 0.017
2.9151 84500 0.0223
2.9323 85000 0.0271
2.9496 85500 0.0298
2.9668 86000 0.0297
2.9841 86500 0.026
3.0012 87000 0.0266
3.0185 87500 0.0531
3.0357 88000 0.0342
3.0530 88500 0.039
3.0702 89000 0.0263
3.0875 89500 0.0288
3.1047 90000 0.0158
3.1220 90500 0.0484
3.1392 91000 0.0179
3.1565 91500 0.0215
3.1737 92000 0.0316
3.1910 92500 0.0395
3.2082 93000 0.037
3.2255 93500 0.0389
3.2427 94000 0.0512
3.2600 94500 0.0451
3.2772 95000 0.0583
3.2945 95500 0.0502
3.3117 96000 0.0407
3.3290 96500 0.0628
3.3462 97000 0.0434
3.3635 97500 0.0741
3.3807 98000 0.0318
3.3980 98500 0.0387
3.4152 99000 0.041
3.4325 99500 0.0429
3.4497 100000 0.0514
3.4670 100500 0.0377
3.4842 101000 0.0355
3.5015 101500 0.043
3.5187 102000 0.029
3.5360 102500 0.047
3.5532 103000 0.0554
3.5705 103500 0.0385
3.5877 104000 0.0294
3.6050 104500 0.023
3.6222 105000 0.0381
3.6395 105500 0.0422
3.6567 106000 0.1091
3.6740 106500 0.0289
3.6912 107000 0.0276
3.7085 107500 0.0606
3.7257 108000 0.0402
3.7430 108500 0.0256
3.7602 109000 0.0279
3.7775 109500 0.0317
3.7947 110000 0.0303
3.8120 110500 0.0492
3.8292 111000 0.0239
3.8465 111500 0.0297
3.8637 112000 0.0293
3.8810 112500 0.0278
3.8982 113000 0.0134
3.9155 113500 0.0192
3.9327 114000 0.0235
3.9500 114500 0.0268
3.9672 115000 0.022
3.9845 115500 0.0235

Framework Versions

  • Python: 3.9.19
  • Sentence Transformers: 3.0.0
  • Transformers: 4.41.2
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.30.1
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1
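
To reproduce this environment, the versions above can be pinned at install time (a sketch; nearby compatible versions should also work, and the CUDA-specific PyTorch build is omitted):

pip install sentence-transformers==3.0.0 transformers==4.41.2 accelerate==0.30.1 datasets==2.19.1 tokenizers==0.19.1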

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}