buzzCraft committed on
Commit
5aaced9
1 Parent(s): 37e4dff

Update README.md

  1. README.md +16 -3
README.md CHANGED
@@ -3,8 +3,8 @@
  ## Abstract
  The rapid evolution of digital sports media necessitates sophisticated information retrieval systems that can efficiently parse extensive multimodal datasets. This work introduces SoccerRAG, an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related information through natural language queries. By leveraging a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation, enhancing user interaction and accessibility to sports archives. Our evaluations indicate that SoccerRAG effectively handles complex queries, offering significant improvements over traditional retrieval systems in terms of accuracy and user engagement. The results underscore the potential of using RAG and LLMs in sports analytics, paving the way for future advancements in the accessibility and real-time processing of sports data.
 
- ## Setup
- The framework run on python 3.12
  ````bash
  pip install -r requirements.txt
  ````
@@ -19,7 +19,17 @@ Files needed are:
  * Labels-v2.json [link](https://www.soccer-net.org/data#h.5klq86rmgt96)
  * Labels-captions.json [link](https://www.soccer-net.org/data#h.ccybjenq8od4)
 
- For a full guide on how to download the data, please refer to the [SoccerNet package website](https://pypi.org/project/SoccerNet/).
 
  The data should be placed in the ./data/Dataset/SoccerNet/ directory
  For each league, create a new folder with the name of the league
@@ -27,6 +37,9 @@ For each season create a new folder with the name of the season (YYYY-YYYY)
  For each game create a new folder with the name of the game (YYYY-MM-DD - HomeTeam Score - Score AwayTeam)
  In each game folder, place the Labels-v2.json and Labels-captions.json files
 
  ### Setting up and populating the database
  To set up the database, execute the following command:
  ````bash
 
  ## Abstract
  The rapid evolution of digital sports media necessitates sophisticated information retrieval systems that can efficiently parse extensive multimodal datasets. This work introduces SoccerRAG, an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related information through natural language queries. By leveraging a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation, enhancing user interaction and accessibility to sports archives. Our evaluations indicate that SoccerRAG effectively handles complex queries, offering significant improvements over traditional retrieval systems in terms of accuracy and user engagement. The results underscore the potential of using RAG and LLMs in sports analytics, paving the way for future advancements in the accessibility and real-time processing of sports data.
 
+ ## Environment setup
+ The framework requires Python 3.12.
  ````bash
  pip install -r requirements.txt
  ````
 
  * Labels-v2.json [link](https://www.soccer-net.org/data#h.5klq86rmgt96)
  * Labels-captions.json [link](https://www.soccer-net.org/data#h.ccybjenq8od4)
 
+ One can use the soccernet package to download the data:
+ ````bash
+ pip install soccernet
+ ````
+
+ ````python
+ from SoccerNet.Downloader import SoccerNetDownloader
+ mySoccerNetDownloader = SoccerNetDownloader(LocalDirectory="data/dataset/SoccerNet")
+ mySoccerNetDownloader.downloadDataTask(task="caption-2023", split=["train", "valid", "test", "challenge"])
+ mySoccerNetDownloader.downloadGames(files=["Labels-v2.json"], split=["train", "valid", "test"])
+ ````
 
  The data should be placed in the ./data/Dataset/SoccerNet/ directory
  For each league, create a new folder with the name of the league
 
  For each game create a new folder with the name of the game (YYYY-MM-DD - HomeTeam Score - Score AwayTeam)
  In each game folder, place the Labels-v2.json and Labels-captions.json files
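
The layout described above (league / season / game, with both label files in each game folder) can be checked before populating the database. The following is a minimal sketch, not part of the SoccerRAG code base: the function name `find_incomplete_games` and the constant `REQUIRED_FILES` are hypothetical, and it assumes game folders sit exactly three levels below the dataset root.

```python
from pathlib import Path

# File names taken from the README; every game folder should contain both.
REQUIRED_FILES = ("Labels-v2.json", "Labels-captions.json")


def find_incomplete_games(root):
    """Return {game_dir: [missing file names]} for game folders under root.

    Assumes the layout root/<league>/<season>/<game>/ (an assumption based
    on the folder description in the README).
    """
    root = Path(root)
    incomplete = {}
    # Three directory levels below the root: league, season, game.
    for game_dir in sorted(root.glob("*/*/*")):
        if not game_dir.is_dir():
            continue
        missing = [name for name in REQUIRED_FILES
                   if not (game_dir / name).is_file()]
        if missing:
            incomplete[game_dir] = missing
    return incomplete


if __name__ == "__main__":
    dataset_root = Path("data/Dataset/SoccerNet")
    if not dataset_root.is_dir():
        print(f"{dataset_root} not found")
    else:
        for game_dir, missing in find_incomplete_games(dataset_root).items():
            print(f"{game_dir}: missing {', '.join(missing)}")
```

On case-sensitive filesystems `data/dataset` and `data/Dataset` are different directories, so the root passed to this check should be the same path that was used as the download directory.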
 
+ For a full guide on how to download the data, please refer to the [SoccerNet package website](https://pypi.org/project/SoccerNet/).
+
+
  ### Setting up and populating the database
  To set up the database, execute the following command:
  ````bash