jiviteshjain committed on
Commit 5fbb9be · 1 Parent(s): 1536dad

improvements
README.md CHANGED
@@ -8,51 +8,128 @@ app_port: 8501
  pinned: false
  ---
 
- # Answering Questions about Pittsburgh using Retrieval Augmented Generation
- 
- This is a retrieval-augmented generation (RAG) system for answering questions about the city of Pittsburgh, PA, and
- Carnegie Mellon University. Please refer to the technical report (`technical-report.pdf`) for details.
- 
- ## System Components
- 
- 1. Data Collection Pipeline, including web scraping using Scrapy and BeautifulSoup, and HTML, PDF, and structured data (calendar events, tables) parsing (see `src/data_collection/using_scrapy`).
- 2. Synthetic QA pair generation using a large language model (see `src/data_annotation/qa-generation.ipynb`).
- 3. Manual annotation of the test set (see `src/data_annotation/annotate.py` for a simple annotation UI in Marimo).
- 4. Parameter-efficient reader model fine-tuning in 4-bit using Quantized LoRA (see `src/finetuning`).
- 5. Embedding and dense FAISS index creation (see `src/rag_pipeline/embed_and_build_dense_index.py` and `src/rag_pipeline/load_embeddings_and_build_dense_index.py`).
- 6. Sparse entity and date index for augmentation (see `src/rag_pipeline/build_sparse_index.py`).
- 7. Query rewording using Hypothetical Document Embeddings (see `src/rag_pipeline/rag_inference.ipynb`).
- 8. Retrieval using sparse and dense indices (see `src/rag_pipeline/rag_inference.ipynb`).
- 9. Retrieved document reranking using a cross-encoder model (see `src/rag_pipeline/rag_inference.ipynb`).
- 10. Retrieved document summarization over all documents (see `src/rag_pipeline/rag_inference.ipynb`).
- 11. Answer generation using our quantized, fine-tuned models (see `src/rag_pipeline/rag_inference.ipynb`).
- 
- ## Example Outputs
- 
- | *Mistral - No Finetune* | *Mistral - Finetuned* | *Mistral - RAG* |
- |---|---|---|
- | *Question:* Where is the NNOC House located? <br> *Answer:* The NNOC House is located in the heart of the city, on a busy street. It is a three-story building with a large front door and a small side entrance. The building is made of brick and has a red roof. The windows are small and the walls are white. The building is surrounded by a small garden. | *Question:* Where is the NNOC House located? <br> *Answer:* The NNOC House is located in Pittsburgh, Pennsylvania. | *Question:* Where is the NNOC House located? <br> *Answer:* 7101 Apple Street at Pittsburgh, PA. |
- | *Question:* What is the theme of Tim Pearce's jokes? <br> *Answer:* The theme of Tim Pearce's jokes is that he is a very funny person. | *Question:* What is the theme of Tim Pearce's jokes? <br> *Answer:* His jokes are about his life and experiences. | *Question:* What is the theme of Tim Pearce's jokes? <br> *Answer:* They are primarily snail-related jokes. |
- 
- As can be seen above, the RAG model provides the most accurate, concise, and factual answers, followed by the
- fine-tuned Mistral, and then by Mistral with in-context learning, which does not control its output despite strong prompts.
- 
- ## Ablations and Comparisons
- 
- Please see the report (`technical-report.pdf`).
 
  ## Usage
 
- To run any component of this RAG system, please first install the required dependencies listed in
- `src/environment.yaml`. All code has only been tested on Linux. Dependency support for quantization and GPU acceleration
- libraries can vary from system to system.
 
  ```shell
- $ conda env create -n rag_env -f src/environment.yaml
- $ conda activate rag_env
  ```
 
- ### Data Collection
 
  All data collection scripts, which include crawlers and parsers for various websites, are located in the
  `src/data_collection` directory.
@@ -64,7 +141,7 @@ $ cd src/data_collection/using_scrapy
  $ scrapy crawl visit_pittsburgh -O path/to/output.jsonl # or pittsburgh_pa, steelers, pirates, penquins
  ```
 
- ### Data Annotation
 
  `src/data_annotation` includes a QA generation notebook (`qa-generation.ipynb`) for automated data processing and question-answer generation.
@@ -72,46 +149,97 @@ $ scrapy crawl visit_pittsburgh -O path/to/output.jsonl # or pittsburgh_pa, ste
  To execute the notebook, open it in Jupyter Notebook or a compatible IDE and run the cells in
  order.
 
- ### Finetuning
 
- `src/finetuning_scripts` includes training notebooks for three 4-bit models, `gemma-2b`, `llama3.2-3b`, and
- `mistral-7b`, designed for fine-tuning on the question-answer pairs and model optimization.
 
  To execute the notebook, open it in Jupyter Notebook or a compatible IDE and run the cells in order.
 
- ### RAG Pipeline
 
  Components of the RAG pipeline, such as embedding documents and building the dense index, loading existing embeddings
  and building the dense index, building the sparse index, and inferring using the pipeline, can be run as Python scripts
- from the `src/rag_pipeline` directory. The appropriate configuration needs to be set in a config file present in the
- `src/rag_pipeline/conf` directory and then specified in the command. Config files are managed using Hydra/OmegaConf
  and are in the Hydra format. Please look at existing files for an example. To run the pipeline with a specific
  configuration, run:
 
  ```shell
- $ python src/rag_pipeline/embed_and_build_index.py --config-name=config
  ```
 
- Further, the complete inference pipeline can be run as an IPython notebook -- `src/rag_pipeline/rag_inference.ipynb`. Open the notebook in a Jupyter IDE, set the configuration in OmegaConf format in the configuration cell located at the bottom of the notebook, and execute all cells in order.
 
  ### Data
 
- Our processed dataset, including the complete knowledge corpus, generated and manually annotated QA pairs, embeddings, and dense and sparse indices, is available on Kaggle at https://www.kaggle.com/datasets/jiviteshjain/final-rag-data/.
 
  ### Fine-tuned Model Weights
 
- Our reader models, fine-tuned using Q-LoRA on ~40,000 QA pairs, are available on the Huggingface Hub at the following
- links:
 
- - [Fine-tuned Gemma-2b 4-bit](https://huggingface.co/SupritiVijay/RAG-finetuned-gemma-2b-bnb-4bit)
- - [Fine-tuned Mistral-7b 4-bit](https://huggingface.co/neelbhandari/lora_mistral_model)
- - [Fine-tuned Llama-3.2-3b 4-bit](https://huggingface.co/neelbhandari/lora_llama3b_model)
 
  They can be loaded via Unsloth AI's `FastLanguageModel` or Huggingface's `AutoPeftModel` classes. Please see
- `src/rag_pipeline/rag_inference.ipynb` for an example.
 
  ### Weights and Biases dashboard with visualizations of our experiments
 
- [Link to dashboard (please request an invitation)](https://wandb.ai/aanlp/anlp-ass-2)
 
- ![wandb](wandb.png)
 
  pinned: false
  ---
 
+ # Answering Questions using Retrieval-Augmented Generation
+ _Implementation and Analysis of State-of-the-Art Methods_
+ 
+ <div align="center">
+ <img src="assets/pittsburgh.webp" width=200>
+ </div>
+ 
+ This project implements and evaluates an end-to-end retrieval-augmented generation (RAG) pipeline for question answering, built with state-of-the-art techniques. The system specializes in answering questions about Pittsburgh and Carnegie Mellon University.
+ 
+ We use a fine-tuned, quantized version of __Mistral-7B__ as our answering/reader model.
 
+ ## Try it out
+ 
+ You can try out the final best-performing system on Huggingface Spaces here: [![Hugging Face Spaces](https://img.shields.io/badge/🤗%20Spaces-Try%20it%20out-blue)](https://huggingface.co/spaces/jiviteshjn/mistral-rag-qa)
+ To understand what's happening behind the scenes, please read the rest of this README before trying it out.
+ 
+ You can ask questions like:
+ 
+ - _Who is the largest employer in Pittsburgh?_
+ - _Where is the Smithsonian-affiliated regional history museum in Pittsburgh?_
+ - _Who is the president of CMU?_
+ - _Where was the polio vaccine developed?_
+ - and anything else about Pittsburgh and CMU!
+ 
+ A sample list of questions and two sets of outputs generated by the system are provided in `outputs/`.
+ 
+ > [!NOTE]
+ > Huggingface puts inactive Spaces to sleep, and they can take a while to cold start. If you find the Space sleeping, please press restart and wait for a few minutes.
+ >
+ > Further, this is a complex pipeline, consisting of several models, running on a low-tier GPU Space. It may take a few minutes for models to load and caches to warm up, especially after a cold start. Please be patient; subsequent queries will be faster.
+ 
+ ## System description
+ 
+ This project implements and analyzes RAG end-to-end, from knowledge corpus collection and model fine-tuning, through index construction, to inference, ablations, and comparisons.
+ The following is a brief description of each component:
+ 
+ ### Data collection
+ 
+ This project builds its knowledge corpus by scraping websites related to Pittsburgh and Carnegie Mellon University. These include the official city website, [pittsburghpa.gov](https://www.pittsburghpa.gov); websites about the city's sports teams, such as [steelers.com](https://www.steelers.com); websites about the city's events, music, and lifestyle, such as [visitpittsburgh.com](https://www.visitpittsburgh.com); websites belonging to Carnegie Mellon University; as well as hundreds of relevant Wikipedia and Encyclopaedia Britannica pages (obtained by searching for keywords related to Pittsburgh according to BERT embeddings).
+ 
+ Scrapy is used as the primary web crawler, owing to its flexibility and controls for not overloading web servers. BeautifulSoup and PDF parsers are also used where necessary, and manual tuning is performed to extract structured data such as calendar events and news items.
+ 
+ See `src/data_collection`.
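+ 
+ As an illustration, a minimal spider in the style of those in `src/data_collection/using_scrapy` might look like the following sketch (the class name, selectors, and settings here are illustrative, not the project's actual code):
+ 
+ ```python
+ import scrapy
+ 
+ 
+ class VisitPittsburghSpider(scrapy.Spider):
+     name = "visit_pittsburgh"
+     start_urls = ["https://www.visitpittsburgh.com"]
+     custom_settings = {
+         # Be polite to web servers: throttle requests and obey robots.txt.
+         "DOWNLOAD_DELAY": 1.0,
+         "AUTOTHROTTLE_ENABLED": True,
+         "ROBOTSTXT_OBEY": True,
+     }
+ 
+     def parse(self, response):
+         # Keep the visible paragraph text of each page as one document.
+         yield {
+             "url": response.url,
+             "text": " ".join(response.css("p::text").getall()),
+         }
+         # Follow links and parse them with this same method.
+         for href in response.css("a::attr(href)").getall():
+             yield response.follow(href, callback=self.parse)
+ ```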
+ 
+ ### Synthetic QA pair generation
+ 
+ To fine-tune the answering/reader model as well as to evaluate our system, we generate synthetic questions and answers from the knowledge corpus using a large language model. We use quantized models for efficiency. A total of __~38,000__ QA pairs are generated.
+ 
+ See `src/data_annotation/qa-generation.ipynb`.
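+ 
+ The following is a minimal sketch of the generation loop; the prompt and model choice here are illustrative assumptions, while the actual prompting lives in the notebook:
+ 
+ ```python
+ # Sketch of synthetic QA-pair generation from corpus passages.
+ from transformers import pipeline
+ 
+ # Any instruction-tuned, optionally quantized chat model works here.
+ generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")
+ 
+ PROMPT = (
+     "Read the passage below and write one factual question about it, "
+     "followed by a short answer.\n\nPassage: {passage}\n\nQuestion:"
+ )
+ 
+ def generate_qa(passage: str) -> str:
+     out = generator(PROMPT.format(passage=passage), max_new_tokens=96,
+                     do_sample=False, return_full_text=False)
+     return out[0]["generated_text"].strip()
+ 
+ print(generate_qa("The polio vaccine was developed at the University of "
+                   "Pittsburgh by Jonas Salk's team."))
+ ```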
+ 
+ ### Manual annotation of the test set
+ 
+ To evaluate our system on gold-standard examples, ~100 question-answer pairs are manually annotated.
+ 
+ See `src/data_annotation/annotate.py` for a simple annotation UI in Marimo.
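+ 
+ A minimal sketch of what one Marimo annotation cell could look like (the widget layout is illustrative; the real UI is in `annotate.py`):
+ 
+ ```python
+ import marimo as mo
+ 
+ # Show a passage and collect a question-answer pair for it.
+ passage = mo.md("**Passage:** The polio vaccine was developed at the University of Pittsburgh.")
+ question = mo.ui.text(label="Question")
+ answer = mo.ui.text(label="Answer")
+ mo.vstack([passage, question, answer])
+ ```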
+ 
+ ### Reader model fine-tuning
+ 
+ We fine-tune the reader model on the generated QA pairs, using parameter-efficient fine-tuning with 4-bit quantization (Quantized LoRA) for efficiency. We compare Mistral 7B, Llama 3.2 3B, and Gemma 2B, and find Mistral to be the best-performing model.
+ 
+ See `src/finetuning`.
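+ 
+ A minimal QLoRA sketch using the PEFT and BitsAndBytes stack from `src/requirements.txt` (hyperparameters are illustrative; the actual notebook is in `src/finetuning`):
+ 
+ ```python
+ import torch
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ 
+ base = "mistralai/Mistral-7B-v0.1"
+ bnb = BitsAndBytesConfig(
+     load_in_4bit=True,  # keep the frozen base weights in 4-bit NF4
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained(base)
+ 
+ model = prepare_model_for_kbit_training(model)
+ lora = LoraConfig(  # train only small low-rank adapters on the attention projections
+     r=16, lora_alpha=32, lora_dropout=0.05,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+     task_type="CAUSAL_LM",
+ )
+ model = get_peft_model(model, lora)
+ model.print_trainable_parameters()  # a small fraction of the 7B parameters
+ ```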
+ 
+ ### Embedding and dense FAISS index creation
+ 
+ We chunk our documents to a length of 512 tokens and use FAISS as our index store, with a quantized HNSW index for its good performance at reduced memory. We use Snowflake's Arctic Embed Medium Long for embedding textual documents, owing to its small size, large context length, and near-SOTA performance on the MTEB leaderboard. In total, we embed around 20,000 documents from 14,000 URLs.
+ 
+ See `src/rag_pipeline/embed_and_build_dense_index.py` and `src/rag_pipeline/load_embeddings_and_build_dense_index.py`.
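+ 
+ A sketch of the embed-and-index step, assuming the `Snowflake/snowflake-arctic-embed-m-long` checkpoint and FAISS's scalar-quantized HNSW index (the scripts above hold the real configuration):
+ 
+ ```python
+ import faiss
+ from sentence_transformers import SentenceTransformer
+ 
+ docs = [
+     "The Andy Warhol Museum is located on Pittsburgh's North Shore.",
+     "Carnegie Mellon University was founded by Andrew Carnegie in 1900.",
+ ]
+ 
+ model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-long", trust_remote_code=True)
+ embeddings = model.encode(docs, normalize_embeddings=True)
+ 
+ # HNSW graph over 8-bit scalar-quantized vectors: fast search, smaller memory footprint.
+ index = faiss.IndexHNSWSQ(embeddings.shape[1], faiss.ScalarQuantizer.QT_8bit, 32)
+ index.train(embeddings)  # the scalar quantizer needs a training pass
+ index.add(embeddings)
+ 
+ query = model.encode(["Where is the Warhol museum?"], normalize_embeddings=True)
+ scores, ids = index.search(query, 2)
+ ```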
+ 
+ ### Sparse entity and date index for augmentation
+ 
+ Our experiments reveal that our retrieval system struggles with entities such as event names and dates: documents corresponding to two different events tend to be similar as a whole, differing only in small specifics, which translates to embeddings that are similar.
+ 
+ To mitigate this, we experiment with a sparse TF-IDF index built only over extracted entities and dates. We extract dates at index-building and inference time using spaCy, and entities using an off-the-shelf fine-tuned RoBERTa model. In practice, however, we find the sparse index to be noisy (as is to be expected), and its benefits are not enough to offset the noise and latency it adds to the retrieval system. We hypothesize that fine-tuning the embedding model contrastively would be a better solution to this problem.
+ 
+ See `src/rag_pipeline/build_sparse_index.py`.
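+ 
+ A minimal sketch of the idea using scikit-learn's TF-IDF, with spaCy's small English NER standing in for both the date extractor and the fine-tuned RoBERTa entity tagger:
+ 
+ ```python
+ import spacy
+ from sklearn.feature_extraction.text import TfidfVectorizer
+ from sklearn.metrics.pairwise import cosine_similarity
+ 
+ nlp = spacy.load("en_core_web_sm")
+ 
+ def entities_and_dates(text: str) -> str:
+     # Keep only entity and date spans; all other tokens are dropped.
+     keep = {"EVENT", "DATE", "ORG", "PERSON", "GPE", "FAC"}
+     return " ".join(ent.text for ent in nlp(text).ents if ent.label_ in keep)
+ 
+ docs = [
+     "The Picklesburgh festival returns to downtown Pittsburgh on July 18, 2025.",
+     "Light Up Night kicks off the holiday season downtown on November 22, 2025.",
+ ]
+ vectorizer = TfidfVectorizer()
+ doc_matrix = vectorizer.fit_transform([entities_and_dates(d) for d in docs])
+ 
+ query = vectorizer.transform([entities_and_dates("When is Picklesburgh happening?")])
+ print(cosine_similarity(query, doc_matrix))  # one sparse score per document
+ ```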
+ 
+ ### Query rewording using Hypothetical Document Embeddings
+ 
+ Rewriting the query to make it more similar to the documents that would potentially contain the answer has emerged as a popular technique. We implement this using an off-the-shelf LLM as the rewriting model, and see significant gains as a result.
+ 
+ See `src/rag_pipeline/rag_validation.py`.
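+ 
+ The core of HyDE is to embed a hypothetical answer passage rather than the raw question. A minimal sketch, with the prompt and model choices as illustrative assumptions:
+ 
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from transformers import pipeline
+ 
+ rewriter = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")
+ embedder = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-long", trust_remote_code=True)
+ 
+ def hyde_embedding(question: str):
+     prompt = f"Write a short encyclopedia-style passage that answers: {question}\nPassage:"
+     passage = rewriter(prompt, max_new_tokens=128, return_full_text=False)[0]["generated_text"]
+     # The hypothetical passage reads like a corpus document, so its embedding
+     # lands closer to real answer-bearing documents than the bare question does.
+     return embedder.encode([passage], normalize_embeddings=True)
+ 
+ query_vector = hyde_embedding("Who is the largest employer in Pittsburgh?")
+ # Search the dense FAISS index with query_vector instead of the raw question.
+ ```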
+ 
+ ### Retrieval and reranking
+ 
+ We retrieve documents from the dense and sparse indices separately, and then rerank them using a cross-encoder model (BAAI's BGE-reranker-v2-m3), keeping only the top-scoring third of the documents. This approach works remarkably well at maintaining high recall, while also making sure the context is not too large for the reader model to handle (high precision).
+ 
+ See `src/rag_pipeline/rag_validation.py`.
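+ 
+ A sketch of the reranking step with the cross-encoder named above (the candidate documents are illustrative):
+ 
+ ```python
+ from sentence_transformers import CrossEncoder
+ 
+ reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")
+ 
+ query = "Where was the polio vaccine developed?"
+ candidates = [
+     "The polio vaccine was developed by Jonas Salk at the University of Pittsburgh.",
+     "Pittsburgh's Strip District is known for its food markets.",
+     "Salk's team announced the vaccine's success in 1955.",
+ ]
+ 
+ # A cross-encoder scores each (query, document) pair jointly,
+ # which is slower but more accurate than bi-encoder retrieval.
+ scores = reranker.predict([(query, doc) for doc in candidates])
+ 
+ # Keep only the top-scoring third, as described above.
+ top_k = max(1, len(candidates) // 3)
+ reranked = sorted(zip(scores, candidates), reverse=True)[:top_k]
+ ```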
+ 
+ ### Retrieved document summarization
+ 
+ Our documents are at most 512 tokens, which makes context lengths long enough to degrade performance, even at small k's (k = 3, 4, or 5). To mitigate this, we summarize the retrieved documents using an LLM. The summarization LLM is query-aware.
+ 
+ See `src/rag_pipeline/rag_validation.py`.
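+ 
+ A sketch of the query-aware summarization step (prompt and model are illustrative):
+ 
+ ```python
+ from transformers import pipeline
+ 
+ summarizer = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")
+ 
+ def summarize(question: str, documents: list[str]) -> str:
+     context = "\n\n".join(documents)
+     # Because the summarizer sees the question, it keeps only answer-relevant facts.
+     prompt = (
+         f"Question: {question}\n\nDocuments:\n{context}\n\n"
+         "Summarize only the information needed to answer the question:"
+     )
+     return summarizer(prompt, max_new_tokens=128,
+                       return_full_text=False)[0]["generated_text"]
+ ```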
+ 
+ ### Answer generation using our quantized, fine-tuned models
+ 
+ Finally, we get to generating answers! The pipeline implemented in `src/rag_pipeline/rag_validation.py` is batched and meant for running evaluations on a test set and computing metrics. `src/rag_qa.py` implements a simple question-answering class that uses the pipeline to answer queries one at a time. `src/app.py` uses this class to create the demo app hosted on Huggingface.
 
  ## Usage
 
+ To run any component of this RAG system, please first install the required dependencies listed in `src/requirements.txt` using pip.
 
  ```shell
+ $ # preferably inside a virtual environment
+ $ pip install -r src/requirements.txt
  ```
 
+ > [!WARNING]
+ > Many quantization frameworks are under active development, and support varies across systems and hardware. This project uses BitsAndBytes, which is not compatible with Apple Silicon at this time. This project has only been tested on Linux servers. The exact requirements may need some tweaking to ensure compatibility with your system (hardware, OS, CUDA versions, etc.).
+ >
+ > The Huggingface Space is the more convenient way to try this project out.
+ 
+ ### Data collection
 
  All data collection scripts, which include crawlers and parsers for various websites, are located in the
  `src/data_collection` directory.
 
  ```shell
  $ cd src/data_collection/using_scrapy
  $ scrapy crawl visit_pittsburgh -O path/to/output.jsonl # or pittsburgh_pa, steelers, pirates, penquins
  ```
 
+ ### Data annotation
 
  `src/data_annotation` includes a QA generation notebook (`qa-generation.ipynb`) for automated data processing and question-answer generation.
 
  To execute the notebook, open it in Jupyter Notebook or a compatible IDE and run the cells in
  order.
 
+ ### Fine-tuning
 
+ `src/finetuning_scripts` includes the notebook used to fine-tune the Mistral 7B model on the generated QA pairs using Q-LoRA with 4-bit quantization.
 
  To execute the notebook, open it in Jupyter Notebook or a compatible IDE and run the cells in order.
 
+ ### RAG pipeline
 
  Components of the RAG pipeline, such as embedding documents and building the dense index, loading existing embeddings
  and building the dense index, building the sparse index, and inferring using the pipeline, can be run as Python scripts
+ from the `src/rag_pipeline` directory. The appropriate configuration needs to be set in a config file present in the `src/rag_pipeline/conf` directory and then specified in the command. Config files are managed using Hydra/OmegaConf
  and are in the Hydra format. Please look at existing files for an example. To run the pipeline with a specific
  configuration, run:
 
  ```shell
+ $ python src/rag_pipeline/embed_and_build_index.py --config-name=validation
  ```
 
+ The complete validation pipeline can be run as:
+ 
+ ```shell
+ $ python src/rag_pipeline/rag_validation.py --config-name=validation
+ ```
+ 
+ ### Demo app
+ 
+ The demo app can be run as:
+ 
+ ```shell
+ $ streamlit run src/app.py
+ ```
 
  ### Data
 
+ The processed dataset, including the complete knowledge corpus, generated and manually annotated QA pairs, embeddings, and dense and sparse indices, is available on [Kaggle](https://www.kaggle.com/datasets/jiviteshjain/final-rag-data/).
 
  ### Fine-tuned Model Weights
 
+ Adapters for the best-performing reader model, Mistral 7B fine-tuned using Q-LoRA on ~38,000 QA pairs, are available on the [Huggingface Hub](https://huggingface.co/jiviteshjn/pittsburgh-rag-qa-mistral-finetuned).
 
191
 
 
 
 
192
 
193
  They can be loaded via Unsloth AI's `FastLanguageModel` or Huggingface's `AutoPeftModel` classes. Please see `
194
+ src/rag_pipeline/rag_validation.py` for an example.
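+ 
+ For instance, loading through PEFT might look like this sketch (generation settings are illustrative):
+ 
+ ```python
+ from peft import AutoPeftModelForCausalLM
+ from transformers import AutoTokenizer
+ 
+ repo = "jiviteshjn/pittsburgh-rag-qa-mistral-finetuned"
+ # Loads the base model with the LoRA adapters applied, quantized to 4-bit.
+ model = AutoPeftModelForCausalLM.from_pretrained(repo, load_in_4bit=True, device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained(repo)
+ 
+ inputs = tokenizer("Who is the president of CMU?", return_tensors="pt").to(model.device)
+ output = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```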
 
  ### Weights and Biases dashboard with visualizations of our experiments
 
+ Along with LangChain for implementation, this project uses Hydra for configuration management and Weights and Biases for experiment tracking.
+ 
+ ![wandb](assets/wandb.png)
+ 
+ ## Experiments and ablations
+ 
+ ### RAG and fine-tuning help
+ 
+ While the numbers tell the same story, the difference is best seen through examples. Note the difference in quality and factuality across the off-the-shelf Mistral 7B model, the fine-tuned model, and the fine-tuned model with retrieved context.
+ 
+ | *Mistral - No Finetune* | *Mistral - Finetuned* | *Mistral - RAG* |
+ |---|---|---|
+ | *Question:* Where is the NNOC House located? <br> *Answer:* The NNOC House is located in the heart of the city, on a busy street. It is a three-story building with a large front door and a small side entrance. The building is made of brick and has a red roof. The windows are small and the walls are white. The building is surrounded by a small garden. | *Question:* Where is the NNOC House located? <br> *Answer:* The NNOC House is located in Pittsburgh, Pennsylvania. | *Question:* Where is the NNOC House located? <br> *Answer:* 7101 Apple Street at Pittsburgh, PA. |
+ | *Question:* What is the theme of Tim Pearce's jokes? <br> *Answer:* The theme of Tim Pearce's jokes is that he is a very funny person. | *Question:* What is the theme of Tim Pearce's jokes? <br> *Answer:* His jokes are about his life and experiences. | *Question:* What is the theme of Tim Pearce's jokes? <br> *Answer:* They are primarily snail-related jokes. |
+ 
+ ### Sophisticated retrieval techniques help
+ 
+ <div align="left">
+ <img src="assets/fig-1.jpg" width=400>
+ </div>
+ 
+ Reranking retrieved documents using a cross-encoder model, as well as query rewording using HyDE, leads to significant performance gains. All results are reported with k = 5.
+ 
+ ### Higher k's lead to better recall; sparse retrieval is noisy
+ 
+ <div align="left">
+ <img src="assets/fig-2.jpg" width=400>
+ </div>
+ 
+ The x-axis labels correspond to dense k - sparse k - reranking k. The first two sets of bars show that dense retrieval significantly beats sparse entity-based retrieval. Sets 2 and 3 show the benefit of using a larger k for dense retrieval: even if the mean reciprocal rank (MRR) goes down, the overall recall rate improves. Finally, sets 3, 4, and 5 show that the recall rate improves with larger k's for reranking, without hurting the MRR.
+ 
+ ### Mistral 7B is the best-performing model*
+ 
+ <div align="left">
+ <img src="assets/fig-3.jpg" width=500>
+ </div>
+ 
+ Mistral 7B performs best on our test set according to the SQuAD exact match metric, while Gemma 2B performs better according to the SQuAD F1 metric, hence the star. The fine-tuned models perform significantly better than the off-the-shelf models (which manage to score 0 on the exact match metric).
+ 
+ ## Demo app screenshot
+ 
+ ![demo screenshot](assets/app.jpg)
assets/app.jpg ADDED
assets/fig-1.jpg ADDED
assets/fig-2.jpg ADDED
assets/fig-3.jpg ADDED
pittsburgh.webp → assets/pittsburgh.webp RENAMED
File without changes
wandb.png → assets/wandb.png RENAMED
File without changes
src/requirements.txt ADDED
@@ -0,0 +1,168 @@
+ accelerate==1.1.1
+ aiohappyeyeballs==2.4.3
+ aiohttp==3.11.2
+ aiosignal==1.3.1
+ altair==5.4.1
+ annotated-types==0.7.0
+ antlr4-python3-runtime==4.9.3
+ anyio==4.6.2.post1
+ attrs==24.2.0
+ bitsandbytes==0.44.1
+ blinker==1.9.0
+ blis==0.7.11
+ cachetools==5.5.0
+ catalogue==2.0.10
+ certifi==2024.8.30
+ charset-normalizer==3.4.0
+ click==8.1.7
+ cloudpathlib==0.20.0
+ confection==0.1.5
+ cymem==2.0.8
+ dataclasses-json==0.6.7
+ datasets==3.1.0
+ date-spacy==0.0.1
+ dateparser==1.2.0
+ dill==0.3.8
+ docker-pycreds==0.4.0
+ docstring_parser==0.16
+ einops==0.8.0
+ evaluate==0.4.3
+ faiss-cpu==1.9.0
+ fastapi==0.115.5
+ filelock==3.16.1
+ frozenlist==1.5.0
+ fsspec==2024.9.0
+ gitdb==4.0.11
+ GitPython==3.1.43
+ greenlet==3.1.1
+ h11==0.14.0
+ hf_transfer==0.1.8
+ httpcore==1.0.7
+ httpx==0.27.2
+ httpx-sse==0.4.0
+ huggingface-hub==0.23.5
+ hydra-core==1.3.2
+ idna==3.10
+ Jinja2==3.1.4
+ joblib==1.4.2
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jsonschema==4.23.0
+ jsonschema-specifications==2024.10.1
+ langchain==0.3.7
+ langchain-community==0.3.7
+ langchain-core==0.3.19
+ langchain-huggingface==0.1.2
+ langchain-text-splitters==0.3.2
+ langcodes==3.4.1
+ langsmith==0.1.143
+ language_data==1.2.0
+ marisa-trie==1.2.1
+ markdown-it-py==3.0.0
+ MarkupSafe==3.0.2
+ marshmallow==3.23.1
+ mdurl==0.1.2
+ mpmath==1.3.0
+ multidict==6.1.0
+ multiprocess==0.70.16
+ murmurhash==1.0.10
+ mypy-extensions==1.0.0
+ narwhals==1.13.5
+ networkx==3.4.2
+ numpy==1.26.4
+ nvidia-cublas-cu12==12.1.3.1
+ nvidia-cuda-cupti-cu12==12.1.105
+ nvidia-cuda-nvrtc-cu12==12.1.105
+ nvidia-cuda-runtime-cu12==12.1.105
+ nvidia-cudnn-cu12==9.1.0.70
+ nvidia-cufft-cu12==11.0.2.54
+ nvidia-curand-cu12==10.3.2.106
+ nvidia-cusolver-cu12==11.4.5.107
+ nvidia-cusparse-cu12==12.1.0.106
+ nvidia-nccl-cu12==2.20.5
+ nvidia-nvjitlink-cu12==12.6.77
+ nvidia-nvtx-cu12==12.1.105
+ omegaconf==2.3.0
+ orjson==3.10.11
+ packaging==24.2
+ pandas==2.2.2
+ peft==0.13.2
+ pillow==11.0.0
+ pip==24.0
+ pip3-autoremove==1.2.2
+ platformdirs==4.3.6
+ preshed==3.0.9
+ propcache==0.2.0
+ protobuf==3.20.3
+ psutil==6.1.0
+ pyarrow==18.0.0
+ pydantic==2.9.2
+ pydantic_core==2.23.4
+ pydantic-settings==2.6.1
+ pydeck==0.9.1
+ Pygments==2.18.0
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.1
+ pytz==2024.2
+ PyYAML==6.0.2
+ referencing==0.35.1
+ regex==2024.11.6
+ requests==2.32.3
+ requests-toolbelt==1.0.0
+ rich==13.9.4
+ rpds-py==0.21.0
+ safetensors==0.4.5
+ scikit-learn==1.5.2
+ scipy==1.14.1
+ sentence-transformers==3.3.0
+ sentencepiece==0.2.0
+ sentry-sdk==2.18.0
+ seqeval==1.2.2
+ setproctitle==1.3.3
+ setuptools==65.5.1
+ shellingham==1.5.4
+ shtab==1.7.1
+ six==1.16.0
+ smart-open==7.0.5
+ smmap==5.0.1
+ sniffio==1.3.1
+ spacy==3.7.5
+ spacy-legacy==3.0.12
+ spacy-loggers==1.0.5
+ span-marker==1.5.0
+ SQLAlchemy==2.0.35
+ srsly==2.4.8
+ starlette==0.41.2
+ streamlit==1.40.1
+ sympy==1.13.3
+ tenacity==9.0.0
+ thinc==8.2.5
+ threadpoolctl==3.5.0
+ tokenizers==0.20.3
+ toml==0.10.2
+ torch==2.4.0
+ torchaudio==2.4.0
+ torchvision==0.19.0
+ tornado==6.4.1
+ tqdm==4.67.0
+ transformers==4.46.2
+ triton==3.0.0
+ trl==0.12.1
+ typer==0.13.0
+ typing_extensions==4.12.2
+ typing-inspect==0.9.0
+ tyro==0.8.14
+ tzdata==2024.2
+ tzlocal==5.2
+ unsloth==2024.11.7
+ unsloth_zoo==2024.11.5
+ urllib3==2.2.3
+ wandb==0.18.7
+ wasabi==1.1.3
+ watchdog==6.0.0
+ weasel==0.4.1
+ wheel==0.45.0
+ wrapt==1.16.0
+ xformers==0.0.27.post2
+ xxhash==3.5.0
+ yarl==1.17.1