jaspercatapang committed on
Commit f177d8c · verified · 1 Parent(s): d58d533

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ inference: false
  <img src="logo.png" width=25%>
 
  # Model Description
- RoBERTA ReRanker for Retrieved Results or **R*** (pronounced R-star) is an advanced model designed to enhance search results' relevance and accuracy through reranking. By integrating the retrieval capabilities of **R*** with generative models, this hybrid approach significantly enhances the relevance and contextual depth of search results. Based on the [RoBERTa tiny](https://huggingface.co/haisongzhang/roberta-tiny-cased) architecture, **R*** is specialized in distinguishing relevant from irrelevant query-passage pairs, thereby refining the output of LLMs in retrieval and generative tasks.
+ RoBERTA ReRanker for Retrieved Results or **R*** (pronounced R-star) is an advanced model designed to enhance search results' relevance and accuracy through reranking. By integrating the retrieval capabilities of **R*** with generative models, this hybrid approach significantly enhances the relevance and contextual depth of search results. Based on the [RoBERTa tiny](https://huggingface.co/haisongzhang/roberta-tiny-cased) architecture, **R*** is specialized in distinguishing relevant from irrelevant query-passage pairs, thereby refining the output of LLMs in retrieval and generative tasks. This model is an experiment featured and presented in [PACLIC 38 (2024)](https://sites.google.com/view/paclic38), which would be published in the ACL Anthology.
 
  ## Training Data
  R* was trained on a dataset derived from the MS MARCO passage ranking dataset, consisting of 2.5 million query-positive passage pairs and an equal number of query-negative passage pairs, totaling 5 million query-passage pairs. This ensures a balanced training approach, exposing R* to both relevant and irrelevant examples equally.
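
For readers of this diff, the reranking step the README describes (score each query-passage pair, then reorder the retrieved passages by score) can be sketched roughly as below. This is only an illustration under stated assumptions, not the R* implementation: `overlap_score` is a hypothetical stand-in for the model's learned relevance score, and `rerank` and the sample passages are likewise made up for the example.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in scorer: fraction of query tokens found in the passage.

    A trained reranker such as R* would return a learned relevance score here.
    """
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort passages by descending score."""
    scored = [(passage, score_fn(query, passage)) for passage in passages]
    # sorted() is stable, so passages with equal scores keep their retrieval order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative retrieved results (invented for this sketch).
retrieved = [
    "Paris is known for its museums.",
    "The capital of France is Paris.",
    "Bananas are rich in potassium.",
]
ranked = rerank("capital of France", retrieved)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

Swapping `score_fn` for a function that calls a trained query-passage classifier would leave the rest of the sketch unchanged, which is the main reason the scorer is passed in as a parameter.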