Retrieval fine-tuning and loss function used
Hi, thanks for your enormous contribution in open-sourcing these models. I want to fine-tune on my retrieval dataset, and I know we can fine-tune using SentenceTransformers. As far as I can tell, the library doesn't contain the "InfoNCE" loss used in the paper (correct me if I'm wrong). Can you suggest which loss function I should use to get similar performance?
Hello!
I should really clear this up in the documentation, but the MultipleNegativesRankingLoss is equivalent to InfoNCE. Another name for it is the in-batch negatives loss.
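For reference, here is a minimal sketch of fine-tuning with this loss. It assumes a recent sentence-transformers (v3+) with the SentenceTransformerTrainer API; the model name and the toy pairs are placeholders for your own data:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

# Toy (anchor, positive) pairs; replace with your retrieval data.
train_dataset = Dataset.from_dict({
    "anchor": [
        "What is the capital of France?",
        "How do plants make their food?",
    ],
    "positive": [
        "Paris is the capital of France.",
        "Plants produce food through photosynthesis.",
    ],
})

model = SentenceTransformer("all-MiniLM-L6-v2")

# MultipleNegativesRankingLoss: for each anchor, its positive is the target
# and the positives of the other samples in the batch act as negatives.
loss = losses.MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```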
- Tom Aarsen
Thanks for the kind reply. Just one more question: how many negative samples should there be in my dataset?
There can be any number, as long as every sample has the same number of negatives. You can even use 0 negatives if you don't have any, i.e. just (anchor, positive). For a bit of context, the loss uses all texts from the other samples in the batch as negatives, so it's okay if you don't have any. Having negatives can result in slightly better performance, though.
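To illustrate the expected data formats, here is a hypothetical sketch (the column values are made up):

```python
from datasets import Dataset

# Pairs only: (anchor, positive); in-batch negatives are used automatically.
pairs = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
})

# With explicit negatives: every row must have the same number of negative
# columns, e.g. (anchor, positive, negative).
triplets = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
    "negative": ["Berlin is the capital of Germany."],
})
```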
And you can use mine_hard_negatives if you'd like to "mine" some hard negatives that might help the model train nicely.
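A rough sketch of how that could look, assuming a pair dataset like the one above (num_negatives is one of the function's parameters; check the docs for the many other mining options):

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical (anchor, positive) pairs to mine negatives for.
pairs = Dataset.from_dict({
    "anchor": ["What is the capital of France?"],
    "positive": ["Paris is the capital of France."],
})

# Embeds the corpus of positives, then for each anchor retrieves similar
# texts that are not its own positive and attaches them as hard negatives.
mined = mine_hard_negatives(pairs, model, num_negatives=1)
```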
- Tom Aarsen