prhegde committed
Commit ed47934
1 Parent(s): 7267c23

Update README.md

Files changed (1)
  1. README.md +5 -8
README.md CHANGED
@@ -9,20 +9,17 @@ pipeline_tag: text2text-generation
---

## Model Summary
- This is a text2text generative model that can be used for search query rewriting.
- It uses sequence-to-sequence model to generate reformulated query and uses policy gradient algorithm to fine-tune the model.
- Reward functions used to train the model aims to improve the model’s ability to generate queries with a greater variety of paraphrased keywords.
- This model can be used in combination with sparse retrieval techniques (e.g. bm25 based retrieval) to improve the search document recall.

### Model Description

Training Procedure

- 1. The sequence-to-sequence model is initialized with google's t5-base model (https://huggingface.co/google-t5/t5-base).
- 2. This model is first trained in supervised manner using ms-marco query pairs data (https://github.com/Narabzad/msmarco-query-reformulation/tree/main/datasets/queries)
- 3. Model is then fine-tuned with an RL framework to further improve the model's capability to generate more diverse but relevant queries.
4. It uses a policy gradient approach to fine-tune the model. For a given input query, a set of trajectories (reformulated queries) are sampled from the model and reward is computed. Policy gradient algorithm is applied to update the model.
- 5. Reward is computed heuristically to improve the paraphrasing capability. But this can be replaced with any other domain/goal specific reward functions.

Refer https://github.com/PraveenSH/RL-Query-Reformulation for more details.
 
 
---

## Model Summary
+ This is a generative model designed specifically for search query rewriting, employing a sequence-to-sequence architecture for generating reformulated queries. Its training incorporates a policy gradient algorithm to enhance performance. The model is trained with reward functions aimed at diversifying the generated queries by paraphrasing keywords. It can be integrated with sparse retrieval methods, such as bm25-based retrieval, to enhance document recall in search.
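For reference, a minimal inference sketch of how a sequence-to-sequence rewriter like this is typically used with the `transformers` library. The checkpoint id below is the base T5 model, not this card's fine-tuned checkpoint (this commit page does not state the repo id), and the sampling settings are illustrative rather than recommended values:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Placeholder checkpoint: substitute this model card's fine-tuned repo id.
MODEL_ID = "google-t5/t5-base"

tokenizer = T5Tokenizer.from_pretrained(MODEL_ID)
model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
model.eval()

query = "how to tighten a loose door hinge"
inputs = tokenizer(query, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=True,          # sampling surfaces differently paraphrased rewrites
        top_k=50,
        num_return_sequences=4,  # several candidate reformulations per query
    )

for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```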
 
 
 
### Model Description

Training Procedure

+ 1. The training process begins by initializing the sequence-to-sequence model with Google's T5-base model (https://huggingface.co/google-t5/t5-base).
+ 2. Initially, the model undergoes supervised training using the MS-MARCO query pairs dataset (https://github.com/Narabzad/msmarco-query-reformulation/tree/main/datasets/queries).
+ 3. Subsequently, the model is fine-tuned using a reinforcement learning (RL) framework to enhance its ability to generate queries that are both diverse and relevant.
4. It uses a policy gradient approach to fine-tune the model. For a given input query, a set of trajectories (reformulated queries) is sampled from the model and a reward is computed. The policy gradient algorithm is then applied to update the model.
+ 5. Rewards are heuristically computed to enhance the model's paraphrasing capability. However, these rewards can be substituted with other domain-specific or goal-specific reward functions as needed (see the sketch after this list).
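Steps 3-5 describe sampling trajectories, scoring them with a reward, and applying a policy gradient update. Below is a minimal REINFORCE-style sketch under those assumptions; `paraphrase_reward` is a toy stand-in for the heuristic reward, and the actual reward and training loop live in the repository linked below:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google-t5/t5-base")
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)


def paraphrase_reward(source: str, candidate: str) -> float:
    """Toy reward: fraction of candidate words that do not appear in the source."""
    src, cand = set(source.lower().split()), candidate.lower().split()
    return sum(w not in src for w in cand) / len(cand) if cand else 0.0


def reinforce_step(query: str, num_samples: int = 4):
    enc = tokenizer(query, return_tensors="pt")

    # 1) Sample trajectories (reformulated queries) from the current policy.
    with torch.no_grad():
        samples = model.generate(
            **enc, do_sample=True, top_k=50,
            max_new_tokens=40, num_return_sequences=num_samples,
        )
    texts = tokenizer.batch_decode(samples, skip_special_tokens=True)

    # 2) Compute a reward per trajectory and subtract a mean baseline.
    rewards = torch.tensor([paraphrase_reward(query, t) for t in texts])
    rewards = rewards - rewards.mean()

    # 3) Re-score the sampled sequences to get per-sequence log-probabilities.
    batch = tokenizer([query] * num_samples, return_tensors="pt")
    labels = samples[:, 1:].clone()                   # drop the decoder start token
    labels[labels == tokenizer.pad_token_id] = -100   # ignore padding positions
    logits = model(**batch, labels=labels).logits     # logits align with `labels`
    logprobs = torch.log_softmax(logits, dim=-1)
    token_lp = logprobs.gather(-1, labels.clamp(min=0).unsqueeze(-1)).squeeze(-1)
    seq_logprob = (token_lp * (labels != -100)).sum(dim=-1)

    # 4) REINFORCE loss: increase log-probability of high-reward trajectories.
    loss = -(rewards * seq_logprob).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return texts, rewards
```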

Refer to https://github.com/PraveenSH/RL-Query-Reformulation for more details.
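The summary also mentions pairing the rewriter with sparse retrieval (e.g. bm25) to improve recall. A hedged sketch of that integration, assuming the third-party `rank_bm25` package (not a stated dependency of this model) and a toy corpus:

```python
# Merge BM25 results for the original query and its reformulations.
from rank_bm25 import BM25Okapi

documents = [
    "replace a squeaky door hinge with new screws",
    "how to bake sourdough bread at home",
    "fixing a loose hinge on a cabinet door",
]
bm25 = BM25Okapi([doc.split() for doc in documents])

def retrieve(query: str, k: int = 2):
    scores = bm25.get_scores(query.split())
    ranked = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

original = "how to tighten a loose door hinge"
reformulations = ["repair loose door hinge screws", "fix wobbly door hinge"]

# Taking the union of results across the original query and its reformulations
# broadens recall compared with issuing the original query alone.
recalled = set()
for q in [original] + reformulations:
    recalled.update(retrieve(q))
print([documents[i] for i in sorted(recalled)])
```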