indiejoseph/bert-base-cantonese

indiejoseph committed on Oct 19, 2023

Commit f5f6f43 • 1 Parent(s): 6f3590e

Update README.md

Files changed (1)

README.md +11 -9


@@ -5,18 +5,25 @@ tags:
 model-index:
 - name: bert-base-cantonese
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 # bert-base-cantonese
-This model is a fine-tuned version of [/notebooks/cantonese/bert-base-cantonese](https://huggingface.co//notebooks/cantonese/bert-base-cantonese) on an unknown dataset.
 ## Model description
-More information needed
 ## Intended uses & limitations
@@ -39,14 +46,9 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 192
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 15.0
 ### Training results
 ### Framework versions
 - Transformers 4.34.0.dev0
 - Pytorch 2.0.1+cu117
 - Datasets 2.14.5

 model-index:
 - name: bert-base-cantonese
   results: []
+license: cc-by-4.0
+language:
+  - yue
+pipeline_tag: fill-mask
+widget:
+  - text: 香港原本[MASK]一個人煙稀少嘅漁港。
+    example_title: 係
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 # bert-base-cantonese
+This model is a version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) that has undergone continued pre-training on [indiejoseph/wikipedia-zh-yue-filtered](https://huggingface.co/datasets/indiejoseph/wikipedia-zh-yue-filtered).
 ## Model description
+The vocabulary of this model has been extended with 500 additional Chinese characters that are very common in Cantonese, such as `冧`, `噉`, `麪`, `笪`, `冚`, `乸`, etc., and the model was then further pre-trained on [indiejoseph/wikipedia-zh-yue-filtered](https://huggingface.co/datasets/indiejoseph/wikipedia-zh-yue-filtered).
 ## Intended uses & limitations
 - total_train_batch_size: 192
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 10.0
 ### Training results
 ### Framework versions
 - Transformers 4.34.0.dev0
 - Pytorch 2.0.1+cu117
 - Datasets 2.14.5
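
The new front matter sets `pipeline_tag: fill-mask` and adds a widget sentence with a single `[MASK]` token. A minimal sketch of how that widget example could be reproduced locally is shown below. It assumes the model ID is `indiejoseph/bert-base-cantonese` (taken from the repository path), that the `transformers` library and a backend such as PyTorch are installed, and that the weights can be downloaded on first use. The helper name `top_predictions` is hypothetical and not part of the model card.

```python
# Sketch only: reproduces the model card's widget example with the
# Transformers fill-mask pipeline. MODEL_ID is assumed from the repo path.
MODEL_ID = "indiejoseph/bert-base-cantonese"

# Widget sentence from the README front matter; the card's example_title
# suggests the expected fill is "係".
WIDGET_TEXT = "香港原本[MASK]一個人煙稀少嘅漁港。"


def top_predictions(text, model_id=MODEL_ID, top_k=5):
    """Return (token, score) pairs for the single [MASK] in `text`.

    Hypothetical helper for illustration; it is not defined by the model card.
    """
    # Validate before loading anything, so bad input fails fast.
    if text.count("[MASK]") != 1:
        raise ValueError("text must contain exactly one [MASK] token")

    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    # With a single mask, the pipeline returns a list of dicts that include
    # the keys "token_str" and "score".
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=top_k)]
```

Calling `top_predictions(WIDGET_TEXT)` downloads the model on first use and returns the highest-scoring candidates for the masked position. The scores depend on the published weights, so no expected output is listed here.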