Baichuan 2 RAG-Enhanced AWQ Quantization

Quick Start

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer
import time

quant_path = "csdc-atl/buffer-baichuan2-13B-rag-awq-int4"
# Load model
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt_template = """\
<s>
{context}
{question}
</s>
"""
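context = '''
"Reviewing the old to learn the new" (温故而知新) has four interpretations:
First, "only by reviewing the old does one learn the new": revisit what you have already learned and gain new insight from it.
Second, "reviewing the old as well as learning the new": on the one hand review the classical institutions and precedents, and on the other strive to take in new knowledge.
Third, "review the old, know the new": as your experience grows and your understanding improves, going back to knowledge you have seen before always yields something more.
Fourth, it means that by reflecting on history one can foresee and solve the problems of the future. This is the ability a true master should have.
Combining these four readings may be more complete: within one's ability, read the classics as widely as possible and ponder their meaning repeatedly; regularly review the knowledge one has already heard, so as to gain insight and understanding; and also strive to absorb new knowledge. In this way one can, going forward, open up new fields of human knowledge and, looking back, give the wisdom of earlier sages a meaning for the present age. Only such a blending of new and old, connecting past and present, can be called "reviewing the old to learn the new, one is fit to be a teacher."
Some scholars also hold that the reading "reviewing the old as well as learning the new" is not quite appropriate, because taken literally, merely absorbing ancient and modern knowledge without gaining any insight makes one only a trader in knowledge, not fit to be a teacher. So let us look at the meaning of "teacher" (师): in the Analects the character 师 appears in 14 chapters, and besides this chapter there are three others where its meaning is close to today's "teacher."
'''
question = '''
Explain "reviewing the old to learn the new" (温故而知新).
'''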

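# Start the timer; the measurement below covers tokenization and generation
start = time.time()
# Fill the template, tokenize it, and move the input IDs to the GPU
tokens = tokenizer(
    prompt_template.format(context=context, question=question),
    return_tensors='pt'
).input_ids.cuda()

# Generate output; the streamer prints tokens to stdout as they are produced
generation_output = model.generate(
    tokens,
    streamer=streamer,
    max_new_tokens=512
)
end = time.time()
elapsed = end - start
print('Elapsed time is %f seconds.' % elapsed)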
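
In a retrieval setup the context usually comes from several retrieved passages rather than one hand-written string. The following is a minimal sketch of how such passages could be packed into the same template; it does not come from the model authors. The helper name build_rag_prompt, the max_context_chars parameter, and the character-based budget are all hypothetical choices made for illustration, and the helper strips surrounding whitespace, which the quick-start strings above do not.

```python
from typing import List

# Same layout as the prompt template in the quick-start code.
PROMPT_TEMPLATE = "<s>\n{context}\n{question}\n</s>\n"


def build_rag_prompt(passages: List[str], question: str,
                     max_context_chars: int = 2000) -> str:
    """Hypothetical helper: pack retrieved passages into the prompt template.

    Passages are kept in the given order (assumed most relevant first) until
    the next one would exceed max_context_chars. The budget counts characters,
    not tokens, which is an assumption made to keep the sketch model-free.
    """
    kept: List[str] = []
    used = 0
    for passage in passages:
        text = passage.strip()
        if not text:
            continue  # skip empty retrieval results
        # Count one extra character for the newline that joins passages.
        cost = len(text) + (1 if kept else 0)
        if used + cost > max_context_chars:
            break
        kept.append(text)
        used += cost
    context = "\n".join(kept)
    return PROMPT_TEMPLATE.format(context=context, question=question.strip())
```

The returned string can be passed to the tokenizer in place of prompt_template.format(context=context, question=question). A token-based budget computed with the model's tokenizer would follow the context window more closely than a character count.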


Model size: 2.97B params (Safetensors)
Tensor types: I32, FP16