catyung/t5l-turbo-hotpot-0331

Text2Text Generation

text-generation-inference

Inference Endpoints

catyung committed on Aug 7, 2024

Commit 7c2bec0 · verified · 1 Parent(s): 315aebc

Update README.md

Files changed (1)

README.md +14 -7

README.md CHANGED

@@ -7,19 +7,22 @@ license: apache-2.0
 <!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 - **Developed by:** https://github.com/xbmxb/RAG-query-rewriting
 - **Model type:** google/t5-large
-### Downstream Use [optional]
-```from transformers import T5Tokenizer,T5ForConditionalGeneration,BitsAndBytesConfig
 import torch
-from google.colab import userdata
-os.environ["HUGGINGFACE_TOKEN"] = userdata.get('HF_TOKEN')
 quantization_config = BitsAndBytesConfig(
     load_in_8bit=True)
@@ -30,6 +33,11 @@ model = T5ForConditionalGeneration.from_pretrained('catyung/t5l-turbo-hotpot-033
 tokenizer = T5Tokenizer.from_pretrained('catyung/t5l-turbo-hotpot-0331')
 input_ids = tokenizer(rewrite_prompt, return_tensors="pt").input_ids.to(device)
@@ -38,5 +46,4 @@ outputs = model.generate(input_ids,max_new_tokens=50)
 result = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(result)
 ```

 <!-- Provide a longer summary of what this model is. -->
+Query Rewriting in Retrieval-Augmented Large Language Models
+Arxiv : https://arxiv.org/abs/2305.14283
+Large Language Models (LLMs) play powerful, black-box readers in the retrieve-then-read pipeline, making remarkable progress in knowledge-intensive tasks. This work introduces a new framework, Rewrite-Retrieve-Read instead of the previous retrieve-then-read for the retrieval-augmented LLMs from the perspective of the query rewriting. We first prompt an LLM to generate the query, then use a web search engine to retrieve contexts. Furthermore, to better align the query to the frozen modules, we propose a trainable scheme for our pipeline. A small language model is adopted as a trainable rewriter to cater to the black-box LLM reader. The rewriter is trained using the feedback of the LLM reader by reinforcement learning.
 - **Developed by:** https://github.com/xbmxb/RAG-query-rewriting
 - **Model type:** google/t5-large
+- **Checkpoint:** checkpoint_20
+### Inference
+```
+from transformers import T5Tokenizer,T5ForConditionalGeneration,BitsAndBytesConfig
 import torch
+# 8 bit Quantization
 quantization_config = BitsAndBytesConfig(
     load_in_8bit=True)
 tokenizer = T5Tokenizer.from_pretrained('catyung/t5l-turbo-hotpot-0331')
+rewrite_prompt = f"""rewrite a better search query: {user_query}
+answer:"""
+# Inference
+user_query = "What profession does Nicholas Ray and Elia Kazan have in common?"
 input_ids = tokenizer(rewrite_prompt, return_tensors="pt").input_ids.to(device)
 result = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(result)
 ```
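
The diff only shows fragments of the inference snippet, and in the added lines `rewrite_prompt` is built before `user_query` is assigned, which would raise a `NameError` if run top to bottom. The following is a minimal sketch of the same flow with the order corrected, not the author's exact script. Assumptions: the helper names `build_rewrite_prompt` and `rewrite_query` are hypothetical, `device` is chosen from CUDA availability (the diff does not show how it is defined), and the model is loaded in full precision rather than with the 8-bit `BitsAndBytesConfig` from the README, so that it also runs without `bitsandbytes`.

```python
def build_rewrite_prompt(user_query: str) -> str:
    """Format a question with the prompt template shown in the README diff."""
    return f"rewrite a better search query: {user_query}\nanswer:"


def rewrite_query(
    user_query: str,
    model_id: str = "catyung/t5l-turbo-hotpot-0331",
    max_new_tokens: int = 50,
) -> str:
    """Generate a rewritten search query for `user_query` (hypothetical helper)."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    import torch
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # Assumption: the diff does not show how `device` is defined.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = T5Tokenizer.from_pretrained(model_id)
    # Assumption: full-precision load; the README passes an 8-bit
    # BitsAndBytesConfig as quantization_config instead.
    model = T5ForConditionalGeneration.from_pretrained(model_id).to(device)

    # Build the prompt only after the query is known, then tokenize it.
    prompt = build_rewrite_prompt(user_query)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `rewrite_query("What profession does Nicholas Ray and Elia Kazan have in common?")` downloads the checkpoint on first use and returns the decoded rewrite as a string.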