SentenceTransformer based on answerdotai/ModernBERT-base

This is a sentence-transformers model finetuned from answerdotai/ModernBERT-base on the test-minn dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: answerdotai/ModernBERT-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: test-minn

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
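
The Pooling module above has only `pooling_mode_mean_tokens` enabled, so each sentence embedding is the average of its token vectors. A minimal sketch of that idea in plain Python follows; the function name `mean_pool` and the toy 4-dimensional vectors are illustrative assumptions, not the library's implementation, which operates on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sequence, skipping padded positions."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy sequence: 3 token positions, 4 dimensions (the model itself uses 768).
tokens = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0]]
mask = [1, 1, 0]  # the last position is padding and is ignored
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Masking matters here because padded positions would otherwise pull the average toward whatever values the padding tokens produce.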

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("conceptofmind/teraflop-minn-caselaw")
# Run inference
sentences = [
    'Minnesota court ruling on co-owned farm division',
    "ELLEN BRANDIN v. JASPER SWENSON.\nJune 19, 1925.\nNo 24,681.\n8. H. Écfanqn, for appellant.\nJohn Heitmarm, for respondent.\nReported in 204 N. W. 468.\n\nDibell, J.\nAction in St. Louis county to have the plaintiff adjudged to be the owner of an 80-acre tract of land, and, if such relief were denied, that the land be partitioned. There was judgment for a partition in specie, and the plaintiff appeals.\nThe evidence is not returned. The only question, as correctly stated by counsel for appellant, is whether the findings of fact justify the judgment; and in stating the facts we follow the findings of the trial court.\nA marriage ceremony' was performed between the plaintiff, Ellen Brandin, and the defendant, Jasper Swenson, on February 10, 1906. At that time the plaintiff had a husband living. He had deserted her 10 years before and she and the defendant supposed that he was dead. In 1920 it was ascertained that he was living; and on January 8, 1922, a judgment was entered annuling the marriage of the plaintiff and defendant.\nOn April 9, 1906, the plaintiff and the defendant purchased an 80-acre tract as tenants in common and the deed ran to both. The purchase price was paid by the plaintiff, but a part thereof the defendant had given her from his earnings subsequent to their formal marriage, and not long afterwards he gave her money exceeding his one-half of the purchase price. In 1907 the defendant moved upon the land and has since lived there and improved one forty. The plaintiff continued living in Duluth, operating a boarding house. She contributed to the improvement of the farm, and received cash and products from it. The court set off to her the west forty of the eighty, and to the defendant the east forty upon which he had made the improvements. This was done on the basis of its finding that the value of the west forty was to the value contributed by the plaintiff approximately as was the value-of the east forty to the amount contributed by the defendant. This was an equitable division. Each got one-half in area of the land. The defendant got'the forty upon which he had improved. Each got a value proportionate to contribution. The principles stated in Hunt v. Meeker County A. & L. Co. 135 Minn. 134, 160 N. W. 496, sustain the division. With the record as it is, neither the form of the pleadings nor the procedure adopted is important. No complaint is made of either.\nJudgment affirmed.",
    'STATE of Minnesota, Respondent, v. James Darrell GIBSON, Petitioner, Appellant.\nNo. C1-91-1332.\nSupreme Court of Minnesota.\nDec. 20, 1991.\nJohn M. Stuart, State Public Defender, Mark F. Anderson, Asst. State Public Defender, Minneapolis, for appellant.\nScott A. Hersey, Isanti County Atty., Cambridge, and Hubert H. Humphrey, III, Atty. Gen., St. Paul, for respondent.\n\nTOMLJANOVICH, Justice.\nIn its decision in this case the court of appeals affirmed the use of multiple concurrent sentences for two offenses that defendant contends arose from a single behavioral incident. State v. Gibson, 475 N.W.2d 896 (Minn.App.1991). We agree with defendant and therefore vacate the lesser of the two sentences pursuant to Minn.Stat. § 609.035 (1990), the so-called single-behavioral-incident statute.\nThe offenses of conviction here are criminal vehicular operation resulting in injury and felony leaving the scene of an accident, for which defendant received concurrent terms of 23 and 15 months. The first conviction is based on defendant’s involvement in a head-on collision while driving under the influence of alcohol. The second conviction is based on the fact that immediately after the accident, in which both defendant and the driver of the other vehicle were injured, defendant fled the scene on foot, went to a nearby farmhouse and called his girl friend to come and pick him up.\nMinnesota Statute § 609.035 provides in relevant part that if a person’s conduct “constitutes more than one offense under the laws of this state, the person may be punished for only one of such offenses.” The approach we have used in determining whether two nonintentional crimes or a nonintentional and an intentional crime are part of the same course of conduct is to analyze all the facts and determine whether the offenses “[arose] out of a continuing and uninterrupted course of conduct, manifesting an indivisible state of mind or coincident errors of judgment.” State v. Sailor, 257 N.W.2d 349, 352 (Minn.1977); see also State v. Johnson, 273 Minn. 394, 405, 141 N.W.2d 517, 525 (1966). When both crimes are intentional crimes we focus on factors such as time and place and whether the conduct involved was motivated by an effort to obtain but one criminal objective. State v. Johnson, supra.\nIn a series of decisions — the avoidance-of-apprehension cases — we have held that multiple sentences may not be used for two offenses if the defendant, substantially contemporaneously committed the second offense in order to avoid apprehension for the first offense. State v. Gilbertson, 323 N.W.2d 810 (Minn.1982); State v. Zuehlke, 320 N.W.2d 79 (Minn.1982); State v. Boley, 299 N.W.2d 924 (Minn.1980); Matter of Castillo, 293 N.W.2d 839 (Minn.1980); State v. White, 292 N.W.2d 16 (Minn.1980); State v. Finn, 295 Minn. 520, 203 N.W.2d 114 (1972).\nHere the defendant committed the felonious act of leaving the scene of an accident in part to avoid being apprehended for any crime committed in connection with the accident. Accordingly, we vacate the lesser of the two sentences, the 15 month concur rent sentence for leaving the scene of an accident.\nAffirmed in part, reversed in part.\n. Closely related to the avoidance-of-apprehension cases are the facilitation-of-offense cases. See State v. Naylor, 474 N.W.2d 314 (Minn.1991); State v. Beito, 332 N.W.2d 645 (Minn.1983).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
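
In the example above, the first entry is a short query and the other two are case opinions, so the scores of interest are those between the query and each opinion. The sketch below shows how such cosine scores can be turned into a ranking. It uses hand-written toy vectors in place of real embeddings, and the helper names `cosine_similarity` and `rank_documents` are hypothetical, not part of Sentence Transformers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (score, document index) pairs, best match first."""
    scores = [(cosine_similarity(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    return sorted(scores, reverse=True)


# Toy stand-ins for embeddings: document 0 points the same way as the query,
# document 1 is orthogonal to it.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [[2.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
ranking = rank_documents(query_vec, doc_vecs)
print(ranking)
```

Cosine similarity ignores vector length, which is why document 0 scores as a perfect match even though it is twice as long as the query vector.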

Training Details

Training Dataset

test-minn

  • Dataset: test-minn at a383680
  • Size: 248,554 training samples
  • Columns: query and reponse
  • Approximate statistics based on the first 1000 samples:
              query                 reponse
    type      string                string
    details   min: 4 tokens         min: 119 tokens
              mean: 15.14 tokens    mean: 2706.53 tokens
              max: 31 tokens        max: 8192 tokens
  • Samples:
    query reponse
    query: The role of seed-grain notes in property liens and collections
    reponse: WINTER & AMES COMPANY v. ATLANTIC ELEVATOR COMPANY.
    January 9,1903.
    Nos. 13,155 — (140).
    Authority of Agent — Evidence.
    Evidence examined, and held sufficient to sustain .the findings of the trial court to the effect that an agent of plaintiff had authority to authorize the sale of certain flaxseed on which plaintiff held a lien by virtue of a seed-grain note, and to release the lien thus held.
    Action in the municipal court of Minneapolis to recover $250, and interest, for the conversion of certain flaxseed. The case was tried before Holt, J., who found in favor of defendant. From an order denying a motion for a new trial, plaintiff appealed.
    Affirmed.
    L. J. Van Fossen, for appellant.
    Wilson & Van Derlip, for respondent.
    Reported in 92 N. W. 955.

    BROWN, J.
    Action to recover the value of certain flaxseed alleged to have been converted by defendant, in which defendant had judgment in the court below, and plaintiff appeals from an order denying a new trial.
    The short facts are as follows...
    query: on priority disputes involving misdescribed mortgages and judgments?
    reponse: Lucy H. Gill vs. William C. Russell, impleaded, etc.
    February 12, 1877.
    Exceptions where Evidence is Taken by Referee. — Upon the hearing of a case upon evidence taken and reported by a referee appointed for that purpose-alone, a party desiring to avail himself of any objection interposed before-the referee must renew it, and obtain a ruling thereon by the court, and, if adverse, take an exception.
    Estoppel — Director of Corporation Cannot Profit by Mistake in a Mortgage by the Company wbieb he Took Part in Making. — B., a corporation, duly executed to plaintiff a real estate mortgage, for valuable consideration, which, through mutual mistake of parties, misdescribed the premises intended and agreed to be mortgaged. Plaintiff caused the mortgage to bo duly recorded. It., one of the directors, who participated in the giving of the mortgage and in the mistake, afterwards obtained a judgment against the corporation, and duly docketed the same, so as to make it a lien upon the premises, be...
    query: On what grounds can neglect claims against railroads be challenged?
    reponse: Iver Anderson vs. Southern Minnesota Railroad Company.
    Aug. 10, 1874.
    Waiver by Corporation of Defective Service of Sammons. — A corporation, after appearing generally and pleading to the merits in an action in a justice’s court, cannot afterwards object that the summons was not served in conformity with the requirements of statute.
    Justice of Peace — Adjournment—Docket Entry. — A docket entry, “by consent of parties, the case is adjourned till Monday, September 23, 1873, at one o’clock in the afternoon,” sufficiently complies with the statute requiring that the justice shall enter in his docket “every adjournment, stating to what time and place."
    This action was brought in a justice’s court, where the plaintiff had judgment. The defendant appealed, upon questions of law, to the district court for Fillmore county, Waterman, J., presiding, where the judgment of the justice was reversed, and judgment entered for the defendant, from which the plaintiff appeals. The case is stated in the o...
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
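
The parameters above (`scale` 20.0, `cos_sim`) describe an in-batch ranking objective: for each query, the paired reponse should score higher than every other reponse in the batch. The following is a minimal sketch of that objective in plain Python, under the assumption that it is a cross-entropy over scaled cosine similarities. The function names are illustrative, and the gradient-caching that gives the Cached variant its name (a memory-saving technique) is deliberately left out.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def ranking_loss(query_vecs, response_vecs, scale=20.0):
    """Mean cross-entropy where response i is the positive for query i
    and all other responses in the batch act as negatives."""
    losses = []
    for i, q in enumerate(query_vecs):
        logits = [scale * cosine(q, r) for r in response_vecs]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs with 2-dimensional "embeddings".
queries = [[1.0, 0.0], [0.0, 1.0]]
responses = [[1.0, 0.0], [0.0, 1.0]]
aligned = ranking_loss(queries, responses)
mismatched = ranking_loss(queries, list(reversed(responses)))
print(aligned, mismatched)
```

With in-batch negatives, a larger batch supplies more negatives per query, which is one reason a batch size such as 1024 can be attractive for this kind of loss.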
    

Evaluation Dataset

test-minn

  • Dataset: test-minn at a383680
  • Size: 248,554 evaluation samples
  • Columns: query and reponse
  • Approximate statistics based on the first 1000 samples:
              query                 reponse
    type      string                string
    details   min: 3 tokens         min: 131 tokens
              mean: 14.9 tokens     mean: 2599.64 tokens
              max: 33 tokens        max: 8192 tokens
  • Samples:
    query reponse
    query: Legal definition of "foul brood" in fraudulent bee sales
    reponse: C. E. SAMPSON v. F. C. PENNEY.
    February 17, 1922.
    No. 22,564.
    New trial because of lack of evidence to support verdict.
    1. There is evidence that, in a sale of bees, all of the elements of fraud were present, if certain representations made were proven false. There is doubt as to whether the proof of falsity was sufficient. But a new trial must be granted on the ground that the evidence .fails to sustain the verdict as to.the amount of damages.
    Measure of damages, direct and consequential, from fraud in sale of diseased bees.
    3. The direct damage for fraud which induces a contract, is the difference in value between what the party defrauded parted with and what he received. In addition to this, the party defrauded may recover consequential damages flowing naturally and proximately from, the 'breach. If -one through fraud procures a sale of animals afflicted with disease, the purchaser may recover for the loss of other animals of his own to Which the disease is communicated, but not for...
    query: What cases differentiate liability based on whether a thief was in flight?
    reponse: ANNE WANNEBO v. ELNATHAN GATES AND ANOTHER.
    November 26, 1948.
    No. 34,713.
    Meagher, Geer & Markham and Clyde F. Anderson, for appellants.
    R. 8. hammers and Allan h. Johnson, for respondent.
    Reported in 34 N. W. (2d) 695.

    Magney, Justice.
    Defendants appeal from an order overruling a demurrer to the complaint herein, the question presented having been certified as important and doubtful.
    On July 2,1947, defendant Frances L. Gates parked a car owned by defendant Elnathan Gates on a public street in a business area in Minneapolis. She went shopping and left the car unattended and the doors and ignition unlocked. The key was not removed from the ignition switch and taken with her. The car was stolen. That night, át about 11:30, the stolen car) negligently operated by a person unknown, collided with plaintiff’s automobile, damaging the same and injuring plaintiff. The above facts state briefly the material allegations of the complaint to which defendants demur.
    A part of § 11 of an ordinanc...
    query: How does the relationship between the testator and beneficiaries affect claims of undue influence in Minnesota?
    reponse: In the Matter of the ESTATE OF Gerald Charles ANDERSON, a.k.a. Gerald C. Anderson, Deceased.
    No. C5-85-871.
    Court of Appeals of Minnesota.
    Dec. 24, 1985.
    Review Denied Feb. 19, 1986.
    Richard A. Beens, Anoka, for appellant Mary Ann Reynolds.
    Rolf T. Nelson, Robbinsdale, for respondents Sally Ann Sellers, Carol Ann Young, Robert Charles Anderson and Carl Earl Anderson.
    Heard, considered and decided by HUS-PENI, P.J., and FOLEY and FORSBERG, JJ.

    OPINION
    HUSPENI, Judge.
    Mary Ann Reynolds, appellant and daughter of decedent Gerald Anderson, attempted to admit into probate a second codicil to decedent’s will. Respondents, who were decedent’s four other children, objected to the probate of this second codicil. An advisory jury found that the second codicil was executed as a result of undue influence exerted by Reynolds. The trial court adopted the advisory jury’s finding of undue influence. Reynolds appeals from the order denying probate of the second codicil and the trial court’s denial of ...
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 1024
  • per_device_eval_batch_size: 1024
  • learning_rate: 0.0003
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
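
To make the `learning_rate` and `warmup_ratio` settings concrete, here is a rough sketch of a linear warmup followed by linear decay, which is what `lr_scheduler_type: linear` usually denotes. Two things are assumptions rather than facts from this card: the total of 219 steps (inferred from the Epoch/Step ratio in the training logs, where step 10 corresponds to epoch 0.0457) and the rounding up of the warmup step count. The function name is hypothetical.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=3e-4, warmup_ratio=0.1):
    """Learning rate at a given step: linear ramp up, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


TOTAL_STEPS = 219  # assumption: roughly 10 / 0.0457 from the training logs

for step in (0, 11, 22, 120, 219):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate peaks at 0.0003 around step 22 and falls back to zero by the final step.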

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 1024
  • per_device_eval_batch_size: 1024
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0003
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss
0.0457  10    6.5431         -
0.0913  20    4.3376         -
0.1370  30    3.0217         -
0.1826  40    2.5811         -
0.2283  50    2.4191         2.2439
0.2740  60    2.2218         -
0.3196  70    2.1584         -
0.3653  80    2.0668         -
0.4110  90    2.0528         -
0.4566  100   2.0014         1.9200
0.5023  110   1.9779         -
0.5479  120   1.9102         -
0.5936  130   1.9071         -
0.6393  140   1.8794         -
0.6849  150   1.8269         1.8022
0.7306  160   1.8606         -
0.7763  170   1.8572         -
0.8219  180   1.8332         -
0.8676  190   1.8227         -
0.9132  200   1.7875         1.7674
0.9589  210   1.8351         -

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model size: 149M params (Safetensors, tensor type BF16)