Loading ./pipeline/ requires you to execute the configuration file in that repo on your local machine

#21
by kycrowe - opened

Hello, I want to avoid re-downloading the model every time for the code below:

from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
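    # the Falcon repo ships custom modeling code, hence trust_remote_code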
    trust_remote_code=True,
    device_map="auto",
)

so I saved the pipeline by doing

pipeline.save_pretrained("./pipeline_path/")

However, I am unable to simply reload the pipeline with

pipe_load = transformers.pipeline("text-generation", model="./pipeline_path/")

I'm getting:

ValueError: Loading ./pipeline_path/ requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code=True` to remove this error.

I added trust_remote_code=True to avoid this error, but then my Jupyter kernel dies.
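
For reference, this is roughly the reload call I used (same local path as above):

pipe_load = transformers.pipeline(
    "text-generation",
    model="./pipeline_path/",
    trust_remote_code=True,
)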

My questions are:

  1. Any idea how to resolve this? Should I just not run this in a Jupyter notebook?
  2. I also don't quite understand what "execute the configuration file in that repo on your local machine" means.
  3. Are there any better ways to avoid re-downloading the model?

Any help would be greatly appreciated!

ChatGPT suggests this:

To save and load models with the Hugging Face Transformers library, you might want to save the model and the tokenizer rather than the pipeline, as they are the primary components. Here's how to do it:

To save:

tokenizer = AutoTokenizer.from_pretrained(model)
model = AutoModelForCausalLM.from_pretrained(model)

tokenizer.save_pretrained("./model_path/")
model.save_pretrained("./model_path/")

To load:

tokenizer = AutoTokenizer.from_pretrained("./model_path/")
model = AutoModelForCausalLM.from_pretrained("./model_path/")

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

This way, you don't need to download the model each time you run your script, and it should resolve the issues you are encountering with the trust_remote_code=True setting.
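
To sanity-check the reloaded pipeline, here is a minimal generation call (the prompt and sampling settings are just illustrative):

sequences = pipeline(
    "Write a short poem about the sea.",
    max_new_tokens=50,
    do_sample=True,
    top_k=10,
)
# the text-generation pipeline returns a list of dicts
print(sequences[0]["generated_text"])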

#designfailure

Thank you @designfailure
I tried loading the model and tokenizer separately, like below

tokenizer = AutoTokenizer.from_pretrained(model)
model = AutoModelForCausalLM.from_pretrained(model)

But I get the same "execute the configuration file" error again, on the line that loads the model:

ValueError: Loading tiiuae/falcon-7b-instruct requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code=True` to remove this error.

https://huggingface.co./tiiuae/falcon-7b-instruct/discussions/10 flags the same issue when running on AWS SageMaker.
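
For completeness, the calls the error message is asking for would look like this (only do this after reading the repo's custom code, since it executes locally):

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-7b-instruct"
# on older transformers versions, Falcon's custom code must be trusted explicitly
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)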

What was the fix in the end?

I am encountering the same problem. What was the fix, please?

Hello! Same problem here too. Does someone have a fix? Thanks.

Hello! Same problem here too. What was the fix in the end? Thanks.

I had the same issue and resolved it by updating to transformers==4.34.0.
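
My understanding is that Falcon support was merged natively into transformers around 4.33, so after upgrading, the plain load should work without trust_remote_code at all:

from transformers import AutoTokenizer, AutoModelForCausalLM

# with native Falcon support, no remote code execution is needed
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b-instruct")
model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b-instruct")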

I'm on transformers 4.39.3 and still have the same problem :(
