SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
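
For reference, the mean pooling configured above (pooling_mode_mean_tokens) can be reproduced with the plain transformers API. The sketch below is an illustration, assuming the repository exposes the underlying BertModel weights in the standard Hugging Face layout; the example sentence is taken from the usage section further down.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ManishThota/QueryRouter")
bert = AutoModel.from_pretrained("ManishThota/QueryRouter")

encoded = tokenizer(
    ["Can you provide the latest research insights on ABC?"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = bert(**encoded).last_hidden_state  # (batch, seq_len, 384)

# Mean pooling over non-padding tokens, matching the Pooling module above
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embedding.shape)  # torch.Size([1, 384])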

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ManishThota/QueryRouter")
# Run inference
sentences = [
    'Research',
    'Can you provide the latest research insights on ABC?',
    'Who are the main rivals of ABC?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
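
Given the model's name and training data, one plausible way to apply it is as a query router: embed a small set of route labels and send each incoming query to the most similar one. The sketch below works under that assumption; only the labels "Rating" and "Research" are attested in this card's data samples, and the routing logic itself is not part of the card.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ManishThota/QueryRouter")

# Route labels; "Rating" and "Research" appear in this card's training samples
routes = ["Rating", "Research"]
route_embeddings = model.encode(routes)

def route_query(query: str) -> str:
    """Return the route label whose embedding is most similar to the query."""
    scores = model.similarity(model.encode([query]), route_embeddings)  # shape (1, n_routes)
    return routes[int(scores.argmax())]

print(route_query("What rating does XYZ have?"))        # expected: Rating
print(route_query("Latest research insights on ABC?"))  # expected: Research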

Evaluation

Metrics

Semantic Similarity

Metric               Value
pearson_cosine       nan
spearman_cosine      nan
pearson_manhattan    nan
spearman_manhattan   nan
pearson_euclidean    nan
spearman_euclidean   nan
pearson_dot          nan
spearman_dot         nan
pearson_max          nan
spearman_max         nan

The nan values are expected here: every pair in the evaluation data carries the same score (1.0, see the dataset statistics below), and Pearson/Spearman correlation against a constant label is undefined.

Training Details

Training Dataset

Unnamed Dataset

  • Size: 724 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
                sentence1            sentence2             score
    type        string               string                float
    details     min: 3 tokens        min: 9 tokens         min: 1.0
                mean: 3.27 tokens    mean: 14.23 tokens    mean: 1.0
                max: 4 tokens        max: 29 tokens        max: 1.0
  • Samples:
    sentence1   sentence2                              score
    Rating      What rating does XYZ have?             1.0
    Rating      Can you provide the rating for XYZ?    1.0
    Rating      How is XYZ rated?                      1.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
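
For context, this loss configuration corresponds to the standard CoSENTLoss constructor in sentence-transformers; a minimal sketch, with the base model id taken from the top of this card:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# scale=20.0 and pairwise cosine similarity match the parameters listed above
train_loss = losses.CoSENTLoss(model, scale=20.0, similarity_fct=util.pairwise_cos_sim)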
    

Evaluation Dataset

Unnamed Dataset

  • Size: 60 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
                sentence1            sentence2             score
    type        string               string                float
    details     min: 3 tokens        min: 9 tokens         min: 1.0
                mean: 3.25 tokens    mean: 12.48 tokens    mean: 1.0
                max: 4 tokens        max: 20 tokens        max: 1.0
  • Samples:
    sentence1   sentence2                              score
    Rating      What is the current rating of ABC?     1.0
    Rating      Can you tell me the rating for ABC?    1.0
    Rating      What rating has ABC been assigned?     1.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • save_only_model: True
  • seed: 33
  • fp16: True
  • load_best_model_at_end: True
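
These settings map onto SentenceTransformerTrainingArguments in sentence-transformers 3.x as shown in the sketch below; only the listed values come from this card, and the output directory is a hypothetical placeholder.

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="query-router",   # hypothetical path, not from the card
    eval_strategy="steps",
    learning_rate=2e-5,
    num_train_epochs=10,
    warmup_ratio=0.1,
    save_only_model=True,
    seed=33,
    fp16=True,
    load_best_model_at_end=True,
)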

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: True
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 33
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine
0.0220 2 - 0.0 nan
0.0440 4 - 0.0 nan
0.0659 6 - 0.0 nan
0.0879 8 - 0.0 nan
0.1099 10 - 0.0 nan
0.1319 12 - 0.0 nan
0.1538 14 - 0.0 nan
0.1758 16 - 0.0 nan
0.1978 18 - 0.0 nan
0.2198 20 - 0.0 nan
0.2418 22 - 0.0 nan
0.2637 24 - 0.0 nan
0.2857 26 - 0.0 nan
0.3077 28 - 0.0 nan
0.3297 30 - 0.0 nan
0.3516 32 - 0.0 nan
0.3736 34 - 0.0 nan
0.3956 36 - 0.0 nan
0.4176 38 - 0.0 nan
0.4396 40 - 0.0 nan
0.4615 42 - 0.0 nan
0.4835 44 - 0.0 nan
0.5055 46 - 0.0 nan
0.5275 48 - 0.0 nan
0.5495 50 - 0.0 nan
0.5714 52 - 0.0 nan
0.5934 54 - 0.0 nan
0.6154 56 - 0.0 nan
0.6374 58 - 0.0 nan
0.6593 60 - 0.0 nan
0.6813 62 - 0.0 nan
0.7033 64 - 0.0 nan
0.7253 66 - 0.0 nan
0.7473 68 - 0.0 nan
0.7692 70 - 0.0 nan
0.7912 72 - 0.0 nan
0.8132 74 - 0.0 nan
0.8352 76 - 0.0 nan
0.8571 78 - 0.0 nan
0.8791 80 - 0.0 nan
0.9011 82 - 0.0 nan
0.9231 84 - 0.0 nan
0.9451 86 - 0.0 nan
0.9670 88 - 0.0 nan
0.9890 90 - 0.0 nan
1.0110 92 - 0.0 nan
1.0330 94 - 0.0 nan
1.0549 96 - 0.0 nan
1.0769 98 - 0.0 nan
1.0989 100 - 0.0 nan
1.1209 102 - 0.0 nan
1.1429 104 - 0.0 nan
1.1648 106 - 0.0 nan
1.1868 108 - 0.0 nan
1.2088 110 - 0.0 nan
1.2308 112 - 0.0 nan
1.2527 114 - 0.0 nan
1.2747 116 - 0.0 nan
1.2967 118 - 0.0 nan
1.3187 120 - 0.0 nan
1.3407 122 - 0.0 nan
1.3626 124 - 0.0 nan
1.3846 126 - 0.0 nan
1.4066 128 - 0.0 nan
1.4286 130 - 0.0 nan
1.4505 132 - 0.0 nan
1.4725 134 - 0.0 nan
1.4945 136 - 0.0 nan
1.5165 138 - 0.0 nan
1.5385 140 - 0.0 nan
1.5604 142 - 0.0 nan
1.5824 144 - 0.0 nan
1.6044 146 - 0.0 nan
1.6264 148 - 0.0 nan
1.6484 150 - 0.0 nan
1.6703 152 - 0.0 nan
1.6923 154 - 0.0 nan
1.7143 156 - 0.0 nan
1.7363 158 - 0.0 nan
1.7582 160 - 0.0 nan
1.7802 162 - 0.0 nan
1.8022 164 - 0.0 nan
1.8242 166 - 0.0 nan
1.8462 168 - 0.0 nan
1.8681 170 - 0.0 nan
1.8901 172 - 0.0 nan
1.9121 174 - 0.0 nan
1.9341 176 - 0.0 nan
1.9560 178 - 0.0 nan
1.9780 180 - 0.0 nan
2.0 182 - 0.0 nan
2.0220 184 - 0.0 nan
2.0440 186 - 0.0 nan
2.0659 188 - 0.0 nan
2.0879 190 - 0.0 nan
2.1099 192 - 0.0 nan
2.1319 194 - 0.0 nan
2.1538 196 - 0.0 nan
2.1758 198 - 0.0 nan
2.1978 200 - 0.0 nan
2.2198 202 - 0.0 nan
2.2418 204 - 0.0 nan
2.2637 206 - 0.0 nan
2.2857 208 - 0.0 nan
2.3077 210 - 0.0 nan
2.3297 212 - 0.0 nan
2.3516 214 - 0.0 nan
2.3736 216 - 0.0 nan
2.3956 218 - 0.0 nan
2.4176 220 - 0.0 nan
2.4396 222 - 0.0 nan
2.4615 224 - 0.0 nan
2.4835 226 - 0.0 nan
2.5055 228 - 0.0 nan
2.5275 230 - 0.0 nan
2.5495 232 - 0.0 nan
2.5714 234 - 0.0 nan
2.5934 236 - 0.0 nan
2.6154 238 - 0.0 nan
2.6374 240 - 0.0 nan
2.6593 242 - 0.0 nan
2.6813 244 - 0.0 nan
2.7033 246 - 0.0 nan
2.7253 248 - 0.0 nan
2.7473 250 - 0.0 nan
2.7692 252 - 0.0 nan
2.7912 254 - 0.0 nan
2.8132 256 - 0.0 nan
2.8352 258 - 0.0 nan
2.8571 260 - 0.0 nan
2.8791 262 - 0.0 nan
2.9011 264 - 0.0 nan
2.9231 266 - 0.0 nan
2.9451 268 - 0.0 nan
2.9670 270 - 0.0 nan
2.9890 272 - 0.0 nan
3.0110 274 - 0.0 nan
3.0330 276 - 0.0 nan
3.0549 278 - 0.0 nan
3.0769 280 - 0.0 nan
3.0989 282 - 0.0 nan
3.1209 284 - 0.0 nan
3.1429 286 - 0.0 nan
3.1648 288 - 0.0 nan
3.1868 290 - 0.0 nan
3.2088 292 - 0.0 nan
3.2308 294 - 0.0 nan
3.2527 296 - 0.0 nan
3.2747 298 - 0.0 nan
3.2967 300 - 0.0 nan
3.3187 302 - 0.0 nan
3.3407 304 - 0.0 nan
3.3626 306 - 0.0 nan
3.3846 308 - 0.0 nan
3.4066 310 - 0.0 nan
3.4286 312 - 0.0 nan
3.4505 314 - 0.0 nan
3.4725 316 - 0.0 nan
3.4945 318 - 0.0 nan
3.5165 320 - 0.0 nan
3.5385 322 - 0.0 nan
3.5604 324 - 0.0 nan
3.5824 326 - 0.0 nan
3.6044 328 - 0.0 nan
3.6264 330 - 0.0 nan
3.6484 332 - 0.0 nan
3.6703 334 - 0.0 nan
3.6923 336 - 0.0 nan
3.7143 338 - 0.0 nan
3.7363 340 - 0.0 nan
3.7582 342 - 0.0 nan
3.7802 344 - 0.0 nan
3.8022 346 - 0.0 nan
3.8242 348 - 0.0 nan
3.8462 350 - 0.0 nan
3.8681 352 - 0.0 nan
3.8901 354 - 0.0 nan
3.9121 356 - 0.0 nan
3.9341 358 - 0.0 nan
3.9560 360 - 0.0 nan
3.9780 362 - 0.0 nan
4.0 364 - 0.0 nan
4.0220 366 - 0.0 nan
4.0440 368 - 0.0 nan
4.0659 370 - 0.0 nan
4.0879 372 - 0.0 nan
4.1099 374 - 0.0 nan
4.1319 376 - 0.0 nan
4.1538 378 - 0.0 nan
4.1758 380 - 0.0 nan
4.1978 382 - 0.0 nan
4.2198 384 - 0.0 nan
4.2418 386 - 0.0 nan
4.2637 388 - 0.0 nan
4.2857 390 - 0.0 nan
4.3077 392 - 0.0 nan
4.3297 394 - 0.0 nan
4.3516 396 - 0.0 nan
4.3736 398 - 0.0 nan
4.3956 400 - 0.0 nan
4.4176 402 - 0.0 nan
4.4396 404 - 0.0 nan
4.4615 406 - 0.0 nan
4.4835 408 - 0.0 nan
4.5055 410 - 0.0 nan
4.5275 412 - 0.0 nan
4.5495 414 - 0.0 nan
4.5714 416 - 0.0 nan
4.5934 418 - 0.0 nan
4.6154 420 - 0.0 nan
4.6374 422 - 0.0 nan
4.6593 424 - 0.0 nan
4.6813 426 - 0.0 nan
4.7033 428 - 0.0 nan
4.7253 430 - 0.0 nan
4.7473 432 - 0.0 nan
4.7692 434 - 0.0 nan
4.7912 436 - 0.0 nan
4.8132 438 - 0.0 nan
4.8352 440 - 0.0 nan
4.8571 442 - 0.0 nan
4.8791 444 - 0.0 nan
4.9011 446 - 0.0 nan
4.9231 448 - 0.0 nan
4.9451 450 - 0.0 nan
4.9670 452 - 0.0 nan
4.9890 454 - 0.0 nan
5.0110 456 - 0.0 nan
5.0330 458 - 0.0 nan
5.0549 460 - 0.0 nan
5.0769 462 - 0.0 nan
5.0989 464 - 0.0 nan
5.1209 466 - 0.0 nan
5.1429 468 - 0.0 nan
5.1648 470 - 0.0 nan
5.1868 472 - 0.0 nan
5.2088 474 - 0.0 nan
5.2308 476 - 0.0 nan
5.2527 478 - 0.0 nan
5.2747 480 - 0.0 nan
5.2967 482 - 0.0 nan
5.3187 484 - 0.0 nan
5.3407 486 - 0.0 nan
5.3626 488 - 0.0 nan
5.3846 490 - 0.0 nan
5.4066 492 - 0.0 nan
5.4286 494 - 0.0 nan
5.4505 496 - 0.0 nan
5.4725 498 - 0.0 nan
5.4945 500 0.0 0.0 nan
5.5165 502 - 0.0 nan
5.5385 504 - 0.0 nan
5.5604 506 - 0.0 nan
5.5824 508 - 0.0 nan
5.6044 510 - 0.0 nan
5.6264 512 - 0.0 nan
5.6484 514 - 0.0 nan
5.6703 516 - 0.0 nan
5.6923 518 - 0.0 nan
5.7143 520 - 0.0 nan
5.7363 522 - 0.0 nan
5.7582 524 - 0.0 nan
5.7802 526 - 0.0 nan
5.8022 528 - 0.0 nan
5.8242 530 - 0.0 nan
5.8462 532 - 0.0 nan
5.8681 534 - 0.0 nan
5.8901 536 - 0.0 nan
5.9121 538 - 0.0 nan
5.9341 540 - 0.0 nan
5.9560 542 - 0.0 nan
5.9780 544 - 0.0 nan
6.0 546 - 0.0 nan
6.0220 548 - 0.0 nan
6.0440 550 - 0.0 nan
6.0659 552 - 0.0 nan
6.0879 554 - 0.0 nan
6.1099 556 - 0.0 nan
6.1319 558 - 0.0 nan
6.1538 560 - 0.0 nan
6.1758 562 - 0.0 nan
6.1978 564 - 0.0 nan
6.2198 566 - 0.0 nan
6.2418 568 - 0.0 nan
6.2637 570 - 0.0 nan
6.2857 572 - 0.0 nan
6.3077 574 - 0.0 nan
6.3297 576 - 0.0 nan
6.3516 578 - 0.0 nan
6.3736 580 - 0.0 nan
6.3956 582 - 0.0 nan
6.4176 584 - 0.0 nan
6.4396 586 - 0.0 nan
6.4615 588 - 0.0 nan
6.4835 590 - 0.0 nan
6.5055 592 - 0.0 nan
6.5275 594 - 0.0 nan
6.5495 596 - 0.0 nan
6.5714 598 - 0.0 nan
6.5934 600 - 0.0 nan
6.6154 602 - 0.0 nan
6.6374 604 - 0.0 nan
6.6593 606 - 0.0 nan
6.6813 608 - 0.0 nan
6.7033 610 - 0.0 nan
6.7253 612 - 0.0 nan
6.7473 614 - 0.0 nan
6.7692 616 - 0.0 nan
6.7912 618 - 0.0 nan
6.8132 620 - 0.0 nan
6.8352 622 - 0.0 nan
6.8571 624 - 0.0 nan
6.8791 626 - 0.0 nan
6.9011 628 - 0.0 nan
6.9231 630 - 0.0 nan
6.9451 632 - 0.0 nan
6.9670 634 - 0.0 nan
6.9890 636 - 0.0 nan
7.0110 638 - 0.0 nan
7.0330 640 - 0.0 nan
7.0549 642 - 0.0 nan
7.0769 644 - 0.0 nan
7.0989 646 - 0.0 nan
7.1209 648 - 0.0 nan
7.1429 650 - 0.0 nan
7.1648 652 - 0.0 nan
7.1868 654 - 0.0 nan
7.2088 656 - 0.0 nan
7.2308 658 - 0.0 nan
7.2527 660 - 0.0 nan
7.2747 662 - 0.0 nan
7.2967 664 - 0.0 nan
7.3187 666 - 0.0 nan
7.3407 668 - 0.0 nan
7.3626 670 - 0.0 nan
7.3846 672 - 0.0 nan
7.4066 674 - 0.0 nan
7.4286 676 - 0.0 nan
7.4505 678 - 0.0 nan
7.4725 680 - 0.0 nan
7.4945 682 - 0.0 nan
7.5165 684 - 0.0 nan
7.5385 686 - 0.0 nan
7.5604 688 - 0.0 nan
7.5824 690 - 0.0 nan
7.6044 692 - 0.0 nan
7.6264 694 - 0.0 nan
7.6484 696 - 0.0 nan
7.6703 698 - 0.0 nan
7.6923 700 - 0.0 nan
7.7143 702 - 0.0 nan
7.7363 704 - 0.0 nan
7.7582 706 - 0.0 nan
7.7802 708 - 0.0 nan
7.8022 710 - 0.0 nan
7.8242 712 - 0.0 nan
7.8462 714 - 0.0 nan
7.8681 716 - 0.0 nan
7.8901 718 - 0.0 nan
7.9121 720 - 0.0 nan
7.9341 722 - 0.0 nan
7.9560 724 - 0.0 nan
7.9780 726 - 0.0 nan
8.0 728 - 0.0 nan
8.0220 730 - 0.0 nan
8.0440 732 - 0.0 nan
8.0659 734 - 0.0 nan
8.0879 736 - 0.0 nan
8.1099 738 - 0.0 nan
8.1319 740 - 0.0 nan
8.1538 742 - 0.0 nan
8.1758 744 - 0.0 nan
8.1978 746 - 0.0 nan
8.2198 748 - 0.0 nan
8.2418 750 - 0.0 nan
8.2637 752 - 0.0 nan
8.2857 754 - 0.0 nan
8.3077 756 - 0.0 nan
8.3297 758 - 0.0 nan
8.3516 760 - 0.0 nan
8.3736 762 - 0.0 nan
8.3956 764 - 0.0 nan
8.4176 766 - 0.0 nan
8.4396 768 - 0.0 nan
8.4615 770 - 0.0 nan
8.4835 772 - 0.0 nan
8.5055 774 - 0.0 nan
8.5275 776 - 0.0 nan
8.5495 778 - 0.0 nan
8.5714 780 - 0.0 nan
8.5934 782 - 0.0 nan
8.6154 784 - 0.0 nan
8.6374 786 - 0.0 nan
8.6593 788 - 0.0 nan
8.6813 790 - 0.0 nan
8.7033 792 - 0.0 nan
8.7253 794 - 0.0 nan
8.7473 796 - 0.0 nan
8.7692 798 - 0.0 nan
8.7912 800 - 0.0 nan
8.8132 802 - 0.0 nan
8.8352 804 - 0.0 nan
8.8571 806 - 0.0 nan
8.8791 808 - 0.0 nan
8.9011 810 - 0.0 nan
8.9231 812 - 0.0 nan
8.9451 814 - 0.0 nan
8.9670 816 - 0.0 nan
8.9890 818 - 0.0 nan
9.0110 820 - 0.0 nan
9.0330 822 - 0.0 nan
9.0549 824 - 0.0 nan
9.0769 826 - 0.0 nan
9.0989 828 - 0.0 nan
9.1209 830 - 0.0 nan
9.1429 832 - 0.0 nan
9.1648 834 - 0.0 nan
9.1868 836 - 0.0 nan
9.2088 838 - 0.0 nan
9.2308 840 - 0.0 nan
9.2527 842 - 0.0 nan
9.2747 844 - 0.0 nan
9.2967 846 - 0.0 nan
9.3187 848 - 0.0 nan
9.3407 850 - 0.0 nan
9.3626 852 - 0.0 nan
9.3846 854 - 0.0 nan
9.4066 856 - 0.0 nan
9.4286 858 - 0.0 nan
9.4505 860 - 0.0 nan
9.4725 862 - 0.0 nan
9.4945 864 - 0.0 nan
9.5165 866 - 0.0 nan
9.5385 868 - 0.0 nan
9.5604 870 - 0.0 nan
9.5824 872 - 0.0 nan
9.6044 874 - 0.0 nan
9.6264 876 - 0.0 nan
9.6484 878 - 0.0 nan
9.6703 880 - 0.0 nan
9.6923 882 - 0.0 nan
9.7143 884 - 0.0 nan
9.7363 886 - 0.0 nan
9.7582 888 - 0.0 nan
9.7802 890 - 0.0 nan
9.8022 892 - 0.0 nan
9.8242 894 - 0.0 nan
9.8462 896 - 0.0 nan
9.8681 898 - 0.0 nan
9.8901 900 - 0.0 nan
9.9121 902 - 0.0 nan
9.9341 904 - 0.0 nan
9.9560 906 - 0.0 nan
9.9780 908 - 0.0 nan
10.0 910 - 0.0 nan
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.0.1+cu118
  • Accelerate: 0.31.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}