kazzand/ru-longformer-tiny-16384


kazzand committed on Jul 31, 2023

Commit 71073ae · 1 Parent(s): 1ce35f8

Create README.md

Files changed (1)

README.md +44 -0


	@@ -0,0 +1,44 @@

+---
+language:
+- ru
+- en
+---
+This is a tiny Longformer model for the Russian language. It was initialized from [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) weights and modified to support a context length of up to 16384 tokens.
+We fine-tuned it on a dataset of Russian books. For more details, see our post on Habr.
+Model attributes:
+- 12 attention heads
+- 3 hidden layers
+- Context length of 16384 tokens
+The model can be used as-is to produce text embeddings or it can be further fine-tuned for a specific downstream task.
+Text embeddings can be produced as follows:
+```python
+# pip install transformers sentencepiece
+import torch
+from transformers import LongformerModel, LongformerTokenizerFast
+model = LongformerModel.from_pretrained('kazzand/ru-longformer-tiny-16384')
+tokenizer = LongformerTokenizerFast.from_pretrained('kazzand/ru-longformer-tiny-16384')
+def get_cls_embedding(text, model, tokenizer, device='cuda'):
+    model.to(device)
+    batch = tokenizer(text, return_tensors='pt')
+    # Give the CLS token global attention; all other tokens keep local (windowed) attention.
+    global_attention_mask = [
+        [1 if token_id == tokenizer.cls_token_id else 0 for token_id in input_ids]
+        for input_ids in batch["input_ids"]
+    ]
+    # Add the global attention mask to the batch.
+    batch["global_attention_mask"] = torch.tensor(global_attention_mask)
+    with torch.no_grad():
+        output = model(**batch.to(device))
+    return output.last_hidden_state[:,0,:]
+```
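
The nested list comprehension in the README marks every CLS position for global attention. As a minimal sketch, the same mask could be built with one tensor comparison. The helper name `build_global_attention_mask` and the toy token ids below are assumptions for illustration and are not part of the commit; the value `2` merely stands in for the CLS id.

```python
import torch


def build_global_attention_mask(input_ids: torch.Tensor, cls_token_id: int) -> torch.Tensor:
    """Return an int64 mask with 1 at every CLS position and 0 elsewhere.

    Hypothetical helper: a vectorized equivalent of the list comprehension
    used in the README snippet.
    """
    return (input_ids == cls_token_id).long()


# Toy batch of made-up token ids; 2 is an assumed stand-in for the CLS id.
toy_ids = torch.tensor([[2, 15, 27, 3],
                        [2, 8, 3, 0]])
toy_mask = build_global_attention_mask(toy_ids, cls_token_id=2)
print(toy_mask.tolist())
```

Inside `get_cls_embedding`, assigning `batch["global_attention_mask"] = build_global_attention_mask(batch["input_ids"], tokenizer.cls_token_id)` should yield the same mask as the original loop, since `batch["input_ids"]` is already a tensor when `return_tensors='pt'` is used.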