retrieval finetuning and loss function used

#70 opened by hail75

Hi, thanks for your enormous contribution in open-sourcing these models. I want to finetune on my retrieval dataset, and I know that we can finetune using Sentence Transformers. As far as I can tell, the library doesn't contain the "InfoNCE" loss used in the paper (correct me if I'm wrong). Can you suggest which loss function I should use to get similar performance?

Hello!

I should really clear this up in the documentation, but the MultipleNegativesRankingLoss is equivalent to InfoNCE. Another name is in-batch negatives loss.
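
For reference, a minimal fine-tuning sketch with MultipleNegativesRankingLoss using the Sentence Transformers v3 trainer API. The model name and the (anchor, positive) pairs are placeholders, not anything from this thread:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Placeholder (anchor, positive) pairs; replace with your retrieval dataset
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do plants make food?"],
    "positive": ["Paris is the capital of France.", "Plants make food via photosynthesis."],
})

# Stand-in model name; swap in the checkpoint you want to fine-tune
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# MultipleNegativesRankingLoss == InfoNCE / in-batch negatives loss
loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```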

  • Tom Aarsen

Thanks for the kind reply. Just one more question: how many negative samples should there be for my dataset?

There can be any number, as long as every sample has the same number of negatives.
You can even use 0 negatives if you don't have any, i.e. just (anchor, positive) pairs. For a bit of context, the loss uses all texts from the other samples in the batch as negatives, so it's okay if you don't have any. Having explicit negatives can result in slightly better performance, though.
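
To make the two accepted dataset shapes concrete, a small sketch; the texts are made up, and "anchor"/"positive"/"negative" follow the usual Sentence Transformers column convention:

```python
from datasets import Dataset

# (anchor, positive) pairs: the other texts in the batch act as negatives
pairs = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
})

# (anchor, positive, negative) triplets: one explicit negative per sample,
# on top of the in-batch negatives
triplets = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
    "negative": ["Berlin is the capital of Germany."],
})
```

Either shape can be passed as the train_dataset with MultipleNegativesRankingLoss.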

And you can use mine_hard_negatives if you'd like to "mine" some hard negatives that might help the model train nicely.
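
A hedged sketch of what that mining step might look like; mine_hard_negatives lives in sentence_transformers.util, though exact keyword arguments can vary between library versions, and the model name and pairs are placeholders:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

# Stand-in model used to score candidate negatives
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder (anchor, positive) pairs to mine negatives for
pairs = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do plants make food?"],
    "positive": ["Paris is the capital of France.", "Plants make food via photosynthesis."],
})

# Picks passages that are similar to the anchor but not the positive,
# turning the pairs into (anchor, positive, negative) training data
hard_dataset = mine_hard_negatives(pairs, model, num_negatives=1)
```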

  • Tom Aarsen