Loading ./pipeline/ requires you to execute the configuration file in that repo on your local machine
Hello, I want to avoid re-downloading the model every time for the code below:
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch
model = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
so I saved the pipeline by doing
pipeline.save_pretrained("./pipeline_path/")
However, I am unable to simply reload the pipeline with
pipe_load = transformers.pipeline("text-generation", model="./pipeline_path/")
Getting
ValueError: Loading ./pipeline_path/ requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code=True` to remove this error.
I added trust_remote_code=True to avoid this error, but then my Jupyter kernel dies.
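For reference, this is roughly what I am running after adding the flag; the torch_dtype and device_map arguments are only my guess that the crash is the kernel running out of memory because the weights otherwise come back in full float32:

import torch
import transformers

# Reload the locally saved pipeline. trust_remote_code=True is what the
# error message asks for; torch_dtype / device_map are an attempt to keep
# memory usage down, since from_pretrained otherwise defaults to float32.
pipe_load = transformers.pipeline(
    "text-generation",
    model="./pipeline_path/",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)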
My questions are:
- Any idea how to resolve this? Should I just not run this stuff in a Jupyter notebook?
- I also don't quite understand what "execute the configuration file in that repo on your local machine" means.
- Is there any other, better way to avoid re-downloading the model?
Any help would be greatly appreciated!
ChatGPT suggests this:
To save and load models using Hugging Face's Transformers library, you might want to save both the model and the tokenizer, not the pipeline, as they are the primary components. Here's how to do it:
To save:
tokenizer = AutoTokenizer.from_pretrained(model)
model = AutoModelForCausalLM.from_pretrained(model)
tokenizer.save_pretrained("./model_path/")
model.save_pretrained("./model_path/")
To load:
tokenizer = AutoTokenizer.from_pretrained("./model_path/")
model = AutoModelForCausalLM.from_pretrained("./model_path/")
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
In this way, you don't need to download the model each time you run your script, and it should resolve the issues you are encountering with the trust_remote_code=True setting.
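As a side note, from_pretrained also caches everything it downloads (under ~/.cache/huggingface by default), so even without saving a local copy a second run should reuse the cache instead of downloading again. A minimal sketch, with cache_dir as an optional, explicitly chosen location:

from transformers import AutoTokenizer

# Without cache_dir the files land in the default Hub cache
# (~/.cache/huggingface); repeat calls reuse whatever is already cached.
tokenizer = AutoTokenizer.from_pretrained(
    "tiiuae/falcon-7b-instruct",
    cache_dir="./hf_cache/",  # example directory, not required
)

The same cache_dir argument works for AutoModelForCausalLM.from_pretrained.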
Thank you @designfailure
I tried loading the model and tokenizer separately like below:
tokenizer = AutoTokenizer.from_pretrained(model)
model = AutoModelForCausalLM.from_pretrained(model)
But I get the same "execute the configuration file" error again, on the line that loads the model:
ValueError: Loading tiiuae/falcon-7b-instruct requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code=True` to remove this error.
https://huggingface.co./tiiuae/falcon-7b-instruct/discussions/10 flags the same issue when running on AWS SageMaker.
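For completeness, that second message goes away if trust_remote_code=True is also passed to the from_pretrained calls themselves, as the error suggests (same caveat as before about reviewing the repo's code first). A minimal sketch of the save-then-load flow, with ./model_path/ as an example directory:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-7b-instruct"

# First run: download once and save a local copy. trust_remote_code=True is
# needed because this checkpoint ships its own modelling code on the
# transformers version I'm using.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype=torch.bfloat16
)
tokenizer.save_pretrained("./model_path/")
model.save_pretrained("./model_path/")

# Later runs: load from the local copy instead of re-downloading.
tokenizer = AutoTokenizer.from_pretrained("./model_path/", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "./model_path/",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)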
What was the fix in the end?
I am encountering the same problem. What was the fix, please?
Hello! Same problem here too. Does someone have a fix? Thanks
Hello! Same problem here too. What was the fix in the end? Thanks
I had the same issue and resolved it by updating to transformers==4.34.0.
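If I understand it right, that works because transformers gained native Falcon support around the 4.33/4.34 releases, so the checkpoint's custom code (and therefore trust_remote_code) should no longer be needed, at least when loading the current Hub checkpoint; an older locally saved copy may still point at the custom code. A minimal sketch on an updated install:

# pip install --upgrade "transformers>=4.34.0"
import torch
import transformers

pipe = transformers.pipeline(
    "text-generation",
    model="tiiuae/falcon-7b-instruct",  # cached locally after the first download
    torch_dtype=torch.bfloat16,
    device_map="auto",
)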
I'm on Transformers 4.39.3 and I still have the same problem :(