Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 956 column 3

#1
by zerozj - opened

Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 956 column 3

Hi,

Can you please provide the context or code that produced this error?

Thanks!

Thank you very much for your reply. It works now, though I don't know why. However, the translation is not very accurate: for example, 'བདེ་མོ' is translated as 'a collection of advice like garlanded beams of nectar from the moon of'.

Yes, the translations in this dataset aren't great.

I recommend that you instead use "billingsmoore/tibetan-to-english-translation-dataset".

It's a smaller dataset but it's much higher quality.
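
A minimal sketch of loading that dataset, assuming the Hugging Face `datasets` library is installed and the repository loads with its default configuration. The helper name `load_translation_dataset` is hypothetical, and split and column names are not stated in this thread, so inspect the returned object before relying on them:

```python
# Dataset ID taken from the recommendation above.
DATASET_ID = "billingsmoore/tibetan-to-english-translation-dataset"


def load_translation_dataset(dataset_id: str = DATASET_ID):
    """Download the dataset from the Hugging Face Hub and return it.

    Hypothetical helper for illustration; requires `pip install datasets`
    and network access when called.
    """
    # Imported lazily so that defining the helper does not require the library.
    from datasets import load_dataset

    return load_dataset(dataset_id)


if __name__ == "__main__":
    dataset = load_translation_dataset()
    # Printing the object lists the available splits and column names.
    print(dataset)
```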

Let me know if you have any other issues or questions!

I have a question about usage. Is this model intended for literature and Buddhist texts rather than daily life?

Yes, the datasets that are currently available on my page are extracted from Buddhist texts and the models on my page have been trained on that data.

For daily life translations, I recommend Monlam AI. Their model is a work in progress but is the best option available right now.

You can use their website here: https://monlam.ai/model/mt

If you are interested in training a model for daily life, I don't know of a high-quality dataset that is currently available.

If you would like to stay up to date on Tibetan language machine translation, I recommend following the OpenPecha forum, which you can find here:

https://forum.openpecha.org/

Thank you very much for your suggestion. I am mainly interested in translating everyday Tibetan.
