What training loss and eval loss should I expect when loading this adapter for continual training on this task?
#1 opened by Xtracta-Qiming
Hi,
I am loading the adapter and continuing training on the same ChartQA task, but the loss and eval_loss do not behave like continual training; they look much like training from scratch. Here is how I load the adapter. Thanks for any suggestions!
import torch
from transformers import Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
# adapter_path = "./qwen2-2b-instruct-trl-sft-ChartQA/checkpoint-264"
adapter_path = "sergiopaniego/qwen2-7b-instruct-trl-sft-ChartQA"

# Load the base model, then attach the fine-tuned adapter.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model.load_adapter(adapter_path)
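In case it helps: depending on your transformers/PEFT/TRL versions, an adapter attached with load_adapter may be frozen, and passing a fresh peft_config to SFTTrainer can wrap the model with a new, randomly initialized adapter instead of reusing the loaded one, which would look exactly like training from scratch. Below is a minimal sketch of loading the adapter as trainable via PEFT's PeftModel.from_pretrained with is_trainable=True; whether this matches the cause in your setup is an assumption.

import torch
from peft import PeftModel
from transformers import Qwen2VLForConditionalGeneration

base = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the existing LoRA weights and keep them trainable, so the
# optimizer updates them rather than a freshly initialized adapter.
model = PeftModel.from_pretrained(
    base,
    "sergiopaniego/qwen2-7b-instruct-trl-sft-ChartQA",
    is_trainable=True,  # defaults to False, which freezes the adapter
)

# Then pass this model to SFTTrainer *without* a new peft_config, so
# the trainer reuses the loaded adapter instead of creating a new one.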
Hi @Xtracta-Qiming!
Did you manage to solve the issue?
You need to follow the same steps to load the model as in the recipe. Also, the recipe only trains on a subset of the dataset, so using a different subset could also cause the losses you are seeing.
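For reference, a minimal sketch of reproducing the recipe's data selection; the 10% split percentages here are illustrative assumptions, so copy the exact split strings from the recipe:

from datasets import load_dataset

# The recipe trains and evaluates on slices of ChartQA, not the full
# dataset. Continuing training on a different slice makes the loss
# curves hard to compare against the original run.
dataset_id = "HuggingFaceM4/ChartQA"
train_dataset, eval_dataset, test_dataset = load_dataset(
    dataset_id, split=["train[:10%]", "val[:10%]", "test[:10%]"]
)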