bert-base-multilingual-cased-segment1

This is a version of multilingual BERT (bert-base-multilingual-cased) in which the segment embedding for segment 1 is copied into the embedding for segment 0. Yes, that's all there is to it. We have found that this improves performance substantially in low-resource setups for word-level tasks (e.g. an average gain of 2.5 LAS on a variety of UD treebanks). More details are to be released in our LREC 2022 paper titled: Frustratingly Easy Performance Improvements for Cross-lingual Transfer: A Tale on BERT and Segment Embeddings.

These embeddings are generated by the following code:

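import torch
from transformers import AutoModel

baseEmbeddings = AutoModel.from_pretrained("bert-base-multilingual-cased")
tte = baseEmbeddings.embeddings.token_type_embeddings.weight.clone().detach()
# Copy the segment-1 embedding into the segment-0 row; no_grad permits the in-place write
with torch.no_grad():
    baseEmbeddings.embeddings.token_type_embeddings.weight[0, :] = tte[1, :]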

More details and other varieties can be found in the repo: https://bitbucket.org/robvanderg/segmentembeds/

Note that when using this model on a single-sentence task (or word-level task), the results should be similar to those of the original bert-base-multilingual-cased with token_type_id=1 set for all tokens.
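The equivalence mentioned in the note can be checked with a small sketch. To avoid downloading anything, it assumes a tiny, randomly initialised BertModel as a stand-in for the real checkpoint; the config sizes, the variable names (original, patched) and the input IDs are made up for illustration and are not taken from the repo.

```python
import copy

import torch
from transformers import BertConfig, BertModel

# Tiny, randomly initialised BERT (hypothetical sizes) so nothing is downloaded.
torch.manual_seed(0)
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
original = BertModel(config).eval()

# Patched copy: the segment-1 embedding overwrites the segment-0 row.
patched = copy.deepcopy(original)
with torch.no_grad():
    tte = patched.embeddings.token_type_embeddings.weight
    tte[0, :] = tte[1, :].clone()

# Arbitrary token IDs for a single "sentence".
input_ids = torch.tensor([[1, 5, 9, 2]])
zeros = torch.zeros_like(input_ids)
ones = torch.ones_like(input_ids)

with torch.no_grad():
    # Patched model with the default segment ID 0 ...
    out_patched = patched(input_ids=input_ids, token_type_ids=zeros).last_hidden_state
    # ... versus the unmodified model with segment ID 1 everywhere.
    out_original = original(input_ids=input_ids, token_type_ids=ones).last_hidden_state

same = torch.allclose(out_patched, out_original, atol=1e-6)
print(same)
```

With the real checkpoints, the same comparison would load bert-base-multilingual-cased and this model through AutoModel.from_pretrained instead of building a random model.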
