retrieval finetuning and loss function used

#70 opened by hail75

Hi, thanks for your enormous contribution in open-sourcing these models. I want to finetune on my retrieval dataset, and I know that we can finetune using Sentence Transformers. As far as I can tell, the library doesn't contain the "InfoNCE" loss used in the paper (correct me if I'm wrong). Can you suggest which loss function I should use to get similar performance?

Hello!

I should really clear this up in the documentation, but the MultipleNegativesRankingLoss is equivalent to InfoNCE. Another name is in-batch negatives loss.
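
For reference, a minimal fine-tuning sketch with MultipleNegativesRankingLoss using the Sentence Transformers v3 trainer API. The model name and the (anchor, positive) pairs are placeholders, not anything from this thread:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Placeholder (anchor, positive) pairs; replace with your retrieval dataset
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do plants make food?"],
    "positive": ["Paris is the capital of France.", "Plants make food via photosynthesis."],
})

# Stand-in model name; swap in the checkpoint you want to fine-tune
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# MultipleNegativesRankingLoss == InfoNCE / in-batch negatives loss
loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```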

  • Tom Aarsen

Thanks for the kind reply. Just one more question: how many negative samples should there be for my dataset?

There can be any number, as long as every sample has the same number of negatives.
You can even use 0 negatives if you don't have any, i.e. just (anchor, positive) pairs. For a bit of context, the loss uses all texts from the other samples in the batch as negatives, so it's okay if you don't have any. Having explicit negatives can result in slightly better performance, though.
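
To make the two accepted dataset shapes concrete, a small sketch; the texts are made up, and "anchor"/"positive"/"negative" follow the usual Sentence Transformers column convention:

```python
from datasets import Dataset

# (anchor, positive) pairs: the other texts in the batch act as negatives
pairs = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
})

# (anchor, positive, negative) triplets: one explicit negative per sample,
# on top of the in-batch negatives
triplets = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
    "negative": ["Berlin is the capital of Germany."],
})
```

Either shape can be passed as the train_dataset with MultipleNegativesRankingLoss.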

And you can use mine_hard_negatives if you'd like to "mine" some hard negatives that might help the model train nicely.
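
A hedged sketch of what that mining step might look like; mine_hard_negatives lives in sentence_transformers.util, though exact keyword arguments can vary between library versions, and the model name and pairs are placeholders:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

# Stand-in model used to score candidate negatives
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder (anchor, positive) pairs to mine negatives for
pairs = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do plants make food?"],
    "positive": ["Paris is the capital of France.", "Plants make food via photosynthesis."],
})

# Picks passages that are similar to the anchor but not the positive,
# turning the pairs into (anchor, positive, negative) training data
hard_dataset = mine_hard_negatives(pairs, model, num_negatives=1)
```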

  • Tom Aarsen