ProtBert-BFD finetuned on Rosetta 20,40,60AA dataset
This model is finetuned to predict Rosetta fold energy. The training dataset contains 300k protein sequences: 100k of length 20AA, 100k of length 40AA, and 100k of length 60AA.
Current model in this repo: prot_bert_bfd-finetuned-032822_1323
Performance
- 20AA sequences (1k eval set):
  Metrics: 'mae': 0.100418, 'r2': 0.989028, 'mse': 0.016266, 'rmse': 0.127537
- 40AA sequences (10k eval set):
  Metrics: 'mae': 0.173888, 'r2': 0.963361, 'mse': 0.048218, 'rmse': 0.219587
- 60AA sequences (10k eval set):
  Metrics: 'mae': 0.235238, 'r2': 0.930164, 'mse': 0.088131, 'rmse': 0.2968
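For readers who want to reproduce numbers of this kind on their own predictions, the four metrics above can be computed as in the minimal sketch below. It uses the standard definitions of MAE, MSE, RMSE, and R². The function name `regression_metrics` is hypothetical, and this is not necessarily the evaluation code used for this model.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, R^2, MSE, and RMSE for two equal-length lists of floats.

    Hypothetical helper using the standard textbook definitions; the
    original evaluation script for this model is not shown in the card.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Per-sample prediction errors.
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R^2 = 1 - SS_res / SS_tot, undefined when the targets are constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "r2": r2, "mse": mse, "rmse": rmse}


if __name__ == "__main__":
    # Toy energies, made up purely for illustration.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0]))
```

As a sanity check on the reported values, RMSE is the square root of MSE in each row (for example, the square root of 0.016266 is about 0.1275 for the 20AA set).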
prot_bert_bfd
from ProtTrans
The starting pretrained model is from ProtTrans, trained on 2.1 billion proteins from BFD. It was trained on protein sequences using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository.
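The upstream prot_bert_bfd tokenizer expects uppercase amino acids separated by single spaces, with the rare residues U, Z, O, and B mapped to X. The sketch below shows that input formatting under the assumption that this finetuned model keeps the same convention, which the card does not state. The helper name `format_for_protbert` is hypothetical.

```python
import re


def format_for_protbert(sequence):
    """Format a raw amino-acid string the way upstream ProtBert expects.

    Assumption: the finetuned model uses the same input convention as the
    pretrained prot_bert_bfd checkpoint (space-separated uppercase residues,
    rare residues U/Z/O/B replaced by X).
    """
    # Drop any whitespace and normalize to uppercase.
    cleaned = re.sub(r"\s+", "", sequence).upper()
    # Map rare or ambiguous residues to the unknown token X.
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    # Insert a single space between residues.
    return " ".join(cleaned)


if __name__ == "__main__":
    # Made-up 20AA sequence, for illustration only.
    print(format_for_protbert("MKTAYIAKQRQISFVKSHFS"))
```

The formatted string can then be passed to the tokenizer that ships with the model checkpoint.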
Created by Ladislav Rampasek