SentenceTransformer based on dunzhang/stella_en_400M_v5
This is a sentence-transformers model finetuned from dunzhang/stella_en_400M_v5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: dunzhang/stella_en_400M_v5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: NewModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 1024, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
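The module list above describes mean pooling over token embeddings (`pooling_mode_mean_tokens: True`) followed by a linear `Dense` layer with identity activation. The following is a minimal pure-Python sketch of those two steps, not the library's implementation. It uses toy 3-dimensional vectors in place of the real 1024-dimensional ones, and the helper names `mean_pool` and `dense` are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def dense(x: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Linear layer y = W x + b with identity activation (no non-linearity)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


# Toy input: two real tokens and one padding token (mask 0).
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)

# Toy weights (assumption): an identity matrix and a constant bias.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sentence_embedding = dense(pooled, identity, [0.5, 0.5, 0.5])
print(pooled, sentence_embedding)
```

The padding token does not affect the pooled vector because its mask entry is 0.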
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub (replace the placeholder below with this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Title: \nText: what was the average for "other" loans held in 2012 and 2011?',
'Title: \nText: LOANS HELD FOR SALE Table 15: Loans Held For Sale\n| In millions | December 312012 | December 312011 |\n| Commercial mortgages at fair value | $772 | $843 |\n| Commercial mortgages at lower of cost or market | 620 | 451 |\n| Total commercial mortgages | 1,392 | 1,294 |\n| Residential mortgages at fair value | 2,096 | 1,415 |\n| Residential mortgages at lower of cost or market | 124 | 107 |\n| Total residential mortgages | 2,220 | 1,522 |\n| Other | 81 | 120 |\n| Total | $3,693 | $2,936 |\nWe stopped originating commercial mortgage loans held for sale designated at fair value in 2008 and continue pursuing opportunities to reduce these positions at appropriate prices.\nAt December 31, 2012, the balance relating to these loans was $772 million, compared to $843 million at December 31, 2011.\nWe sold $32 million in unpaid principal balances of these commercial mortgage loans held for sale carried at fair value in 2012 and sold $25 million in 2011.',
'Title: \nText: Investments and Derivative Instruments (continued) Security Unrealized Loss Aging The following tables present the Company’s unrealized loss aging for AFS securities by type and length of time the security was in a continuous unrealized loss position.\n| | December 31, 2011 |\n| | Less Than 12 Months | 12 Months or More | Total |\n| | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized |\n| | Cost | Value | Losses | Cost | Value | Losses | Cost | Value | Losses |\n| ABS | $629 | $594 | $-35 | $1,169 | $872 | $-297 | $1,798 | $1,466 | $-332 |\n| CDOs | 81 | 59 | -22 | 2,709 | 2,383 | -326 | 2,790 | 2,442 | -348 |\n| CMBS | 1,297 | 1,194 | -103 | 2,144 | 1,735 | -409 | 3,441 | 2,929 | -512 |\n| Corporate [1] | 4,388 | 4,219 | -169 | 3,268 | 2,627 | -570 | 7,656 | 6,846 | -739 |\n| Foreign govt./govt. agencies | 218 | 212 | -6 | 51 | 47 | -4 | 269 | 259 | -10 |\n| Municipal | 299 | 294 | -5 | 627 | 560 | -67 | 926 | 854 | -72 |\n| RMBS | 415 | 330 | -85 | 1,206 | 835 | -371 | 1,621 | 1,165 | -456 |\n| U.S. Treasuries | 343 | 341 | -2 | — | — | — | 343 | 341 | -2 |\n| Total fixed maturities | 7,670 | 7,243 | -427 | 11,174 | 9,059 | -2,044 | 18,844 | 16,302 | -2,471 |\n| Equity securities | 167 | 138 | -29 | 439 | 265 | -174 | 606 | 403 | -203 |\n| Total securities in an unrealized loss | $7,837 | $7,381 | $-456 | $11,613 | $9,324 | $-2,218 | $19,450 | $16,705 | $-2,674 |\nDecember 31, 2010\n| | December 31, 2010 |\n| | Less Than 12 Months | 12 Months or More | Total |\n| | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized |\n| | Cost | Value | Losses | Cost | Value | Losses | Cost | Value | Losses |\n| ABS | $302 | $290 | $-12 | $1,410 | $1,026 | $-384 | $1,712 | $1,316 | $-396 |\n| CDOs | 321 | 293 | -28 | 2,724 | 2,274 | -450 | 3,045 | 2,567 | -478 |\n| CMBS | 556 | 530 | -26 | 3,962 | 3,373 | -589 | 4,518 | 3,903 | -615 |\n| Corporate | 5,533 | 5,329 | -199 | 4,017 | 3,435 | -548 | 9,550 | 8,764 | -747 |\n| Foreign govt./govt. agencies | 356 | 349 | -7 | 78 | 68 | -10 | 434 | 417 | -17 |\n| Municipal | 7,485 | 7,173 | -312 | 1,046 | 863 | -183 | 8,531 | 8,036 | -495 |\n| RMBS | 1,744 | 1,702 | -42 | 1,567 | 1,147 | -420 | 3,311 | 2,849 | -462 |\n| U.S. Treasuries | 2,436 | 2,321 | -115 | 158 | 119 | -39 | 2,594 | 2,440 | -154 |\n| Total fixed maturities | 18,733 | 17,987 | -741 | 14,962 | 12,305 | -2,623 | 33,695 | 30,292 | -3,364 |\n| Equity securities | 53 | 52 | -1 | 637 | 506 | -131 | 690 | 558 | -132 |\n| Total securities in an unrealized loss | $18,786 | $18,039 | $-742 | $15,599 | $12,811 | $-2,754 | $34,385 | $30,850 | $-3,496 |\n[1] Unrealized losses exclude the change in fair value of bifurcated embedded derivative features of certain securities.\nSubsequent changes in fair value are recorded in net realized capital gains (losses).\nAs of December 31, 2011, AFS securities in an unrealized loss position, comprised of 2,549 securities, primarily related to corporate securities within the financial services sector, CMBS, and RMBS which have experienced significant price deterioration.\nAs of December 31, 2011, 75% of these securities were depressed less than 20% of cost or amortized cost.\nThe decline in unrealized losses during 2011 was primarily attributable to a decline in interest rates, partially offset by credit spread widening.\nMost of the securities depressed for twelve months or more relate to structured securities with exposure to commercial and residential real estate, as well as certain floating rate corporate securities or those securities with greater than 10 years to maturity, concentrated in the financial services sector.\nCurrent market spreads continue to be significantly wider for structured securities with exposure to commercial and residential real estate, as compared to spreads at the security’s respective purchase date, largely due to the economic and market uncertainties regarding future performance of commercial and residential real estate.\nIn addition, the majority of securities have a floating-rate coupon referenced to a market index where rates have declined substantially.\nThe Company neither has an intention to sell nor does it expect to be required to sell the securities outlined above.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
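The similarity scores computed above can be used to rank passages against a query. The sketch below illustrates that ranking step with cosine similarity in pure Python. The toy 3-dimensional vectors stand in for real 1024-dimensional embeddings, and the helper names `cosine` and `rank_passages` are hypothetical, not part of the Sentence Transformers API.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(
    query: Sequence[float], passages: List[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (passage_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query, p)) for i, p in enumerate(passages)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): passage 0 points in nearly the same direction as the query.
query_vec = [1.0, 0.0, 1.0]
passage_vecs = [[0.9, 0.1, 0.8], [0.0, 1.0, 0.0]]
ranking = rank_passages(query_vec, passage_vecs)
print(ranking)
```

With real embeddings, row 0 of the similarity matrix plays the role of `query_vec` scored against the remaining rows.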
Evaluation
Metrics
Information Retrieval
- Dataset: Evaluate
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.3617 |
cosine_accuracy@3 | 0.5194 |
cosine_accuracy@5 | 0.6092 |
cosine_accuracy@10 | 0.7015 |
cosine_precision@1 | 0.3617 |
cosine_precision@3 | 0.1788 |
cosine_precision@5 | 0.1267 |
cosine_precision@10 | 0.0752 |
cosine_recall@1 | 0.331 |
cosine_recall@3 | 0.4768 |
cosine_recall@5 | 0.5614 |
cosine_recall@10 | 0.6548 |
cosine_ndcg@10 | 0.496 |
cosine_mrr@10 | 0.4668 |
cosine_map@100 | 0.4482 |
dot_accuracy@1 | 0.3325 |
dot_accuracy@3 | 0.5243 |
dot_accuracy@5 | 0.5922 |
dot_accuracy@10 | 0.6748 |
dot_precision@1 | 0.3325 |
dot_precision@3 | 0.1796 |
dot_precision@5 | 0.1248 |
dot_precision@10 | 0.0726 |
dot_recall@1 | 0.3059 |
dot_recall@3 | 0.4762 |
dot_recall@5 | 0.5446 |
dot_recall@10 | 0.6273 |
dot_ndcg@10 | 0.4723 |
dot_mrr@10 | 0.4422 |
dot_map@100 | 0.4264 |
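For readers unfamiliar with these metrics, here is a small sketch of how accuracy@k, precision@k, recall@k, and MRR@k are commonly defined for a single query. The evaluator reports the average over all queries. The function names and the toy ranking are illustrative assumptions, not the evaluator's code.

```python
from typing import List, Set


def accuracy_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k positions occupied by relevant documents."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)


def mrr_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Toy query (assumption): two relevant documents, the first one retrieved at rank 2.
ranked_docs = ["d3", "d1", "d7", "d2"]
relevant_docs = {"d1", "d2"}
print(accuracy_at_k(ranked_docs, relevant_docs, 1))   # top result is not relevant
print(precision_at_k(ranked_docs, relevant_docs, 3))  # one hit in three positions
print(recall_at_k(ranked_docs, relevant_docs, 3))     # one of two relevant found
print(mrr_at_k(ranked_docs, relevant_docs, 10))       # first hit at rank 2
```

Under these definitions accuracy@1 and precision@1 coincide, which matches the identical values (0.3617 for cosine, 0.3325 for dot) in the table above.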
Training Details
Training Dataset
Unnamed Dataset
- Size: 2,256 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | min: 29 tokens; mean: 45.01 tokens; max: 121 tokens | min: 26 tokens; mean: 406.1 tokens; max: 512 tokens |
- Samples:
sentence_0 | sentence_1 |
---|---|
Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Title: Text: In the year with largest amount of Net credit losses, what's the amount of Revenues, net of interest expense and Total operating expenses? (in million) | Title: Text: Comparison of Five-Year Cumulative Total Return The following graph compares the cumulative total return on Citigroup’s common stock with the S&P 500 Index and the S&P Financial Index over the five-year period extending through December31, 2009. The graph assumes that $100 was invested on December31, 2004 in Citigroup’s common stock, the S&P 500 Index and the S&P Financial Index and that all dividends were reinvested. |
Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Title: Text: what was the total of net earnings attributable to pmi in 2017? | Title: Text: the fair value of the psu award at the date of grant is amortized to expense over the performance period , which is typically three years after the date of the award , or upon death , disability or reaching the age of 58 . as of december 31 , 2017 , pmi had $ 34 million of total unrecognized compensation cost related to non-vested psu awards . this cost is recognized over a weighted-average performance cycle period of two years , or upon death , disability or reaching the age of 58 . during the years ended december 31 , 2017 , and 2016 , there were no psu awards that vested . pmi did not grant any psu awards during note 10 . earnings per share : unvested share-based payment awards that contain non-forfeitable rights to dividends or dividend equivalents are participating securities and therefore are included in pmi 2019s earnings per share calculation pursuant to the two-class method . basic and diluted earnings per share ( 201ceps 201d ) were calculated using the following: . ( in millions ) |
Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Title: Text: for the terrestar acquisition what will the final cash purchase price be in millions paid upon closing? | Title: Text: dish network corporation notes to consolidated financial statements - continued this transaction was accounted for as a business combination using purchase price accounting . the allocation of the purchase consideration is in the table below . purchase allocation ( in thousands ) . |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
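To make the loss parameters above concrete, the sketch below shows the usual formulation of a multiple-negatives ranking loss. Each anchor is scored against every positive in the batch with scaled cosine similarity, and cross-entropy is taken with the anchor's own positive as the target. This is a pure-Python illustration on toy 2-dimensional vectors under that assumption, not the library's implementation, and `mnr_loss` is a hypothetical name.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(
    anchors: List[Sequence[float]],
    positives: List[Sequence[float]],
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumption): orthogonal unit vectors, so matches are unambiguous.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])     # each anchor matches its positive
mismatched = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # positives swapped
print(aligned, mismatched)
```

The `scale` of 20.0 sharpens the softmax over cosine scores, so a correctly aligned batch yields a loss near zero while a mismatched one is penalized heavily.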
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 2
- fp16: True
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin
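As a rough illustration, the non-default values listed above could be collected into keyword arguments for `SentenceTransformerTrainingArguments`. This is a sketch under assumptions, not the author's training script. The `output_dir` value is a placeholder, the helper `build_training_args` is hypothetical, and `fp16=True` generally requires a compatible GPU.

```python
# Non-default hyperparameters copied from the list above.
NON_DEFAULT_HYPERPARAMETERS = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 2,
    "fp16": True,
    "batch_sampler": "no_duplicates",
    "multi_dataset_batch_sampler": "round_robin",
}


def build_training_args(output_dir: str):
    """Build training arguments from the values above.

    The import is done lazily so this module loads without
    sentence-transformers installed. `output_dir` is a placeholder
    chosen by the caller (assumption).
    """
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(
        output_dir=output_dir, **NON_DEFAULT_HYPERPARAMETERS
    )


print(sorted(NON_DEFAULT_HYPERPARAMETERS))
```

All other arguments keep their defaults, which the full list below spells out.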
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Evaluate_cosine_map@100 |
---|---|---|
0 | 0 | 0.2566 |
1.0 | 141 | 0.3931 |
2.0 | 282 | 0.4482 |
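The step counts in the log follow from the dataset size and batch size reported earlier. The short check below assumes a single training device and uses the listed `gradient_accumulation_steps` of 1. The variable names are illustrative.

```python
import math

train_samples = 2256   # "Size: 2,256 training samples"
batch_size = 16        # per_device_train_batch_size
epochs = 2             # num_train_epochs

# One optimizer step per batch, assuming one device and no gradient accumulation.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

# Change in cosine MAP@100 from the untrained checkpoint to the final one.
map_gain = round(0.4482 - 0.2566, 4)

print(steps_per_epoch, total_steps, map_gain)
```

The results, 141 steps per epoch and 282 in total, agree with the Step column of the table.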
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.45.2
- PyTorch: 2.5.1+cu121
- Accelerate: 1.1.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Model tree for thomaskim1130/stella_en_400M_v5-FinanceRAG