# RomanSetu

This model was trained as part of the paper *RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization*. The codebase used to train and evaluate this model is available at https://github.com/AI4Bharat/romansetu.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
model_path = "ai4bharat/romansetu-base-sft-roman"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
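
A minimal generation sketch building on the snippet above. The romanized prompt and the generation parameters are illustrative assumptions, not documented defaults for this model:

```python
# Illustrative only: the prompt text and generation settings below are
# assumptions, not taken from the model card.
prompt = "bharat ki rajdhani kya hai?"  # romanized Hindi: "What is the capital of India?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```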
## Model details

- Format: Safetensors
- Model size: 6.74B params
- Tensor type: F32
