Omar Solano committed · Commit 3d774c7 · 1 Parent(s): be9c1c3
re-add scripts
Browse files
- .gitattributes +4 -0
- .github/workflows/deploy_hf.yaml +21 -0
- .gitignore +164 -0
- README.md +55 -0
- notebooks/01-Basic_Tutor.ipynb +291 -0
- notebooks/02-Basic_RAG.ipynb +1083 -0
- notebooks/03-RAG_with_LlamaIndex.ipynb +360 -0
- notebooks/04-RAG_with_VectorStore.ipynb +449 -0
- notebooks/05-Improve_Prompts_+_Add_Source.ipynb +1420 -0
- notebooks/06-Evaluate_RAG.ipynb +1491 -0
- notebooks/07-RAG_Improve_Chunking.ipynb +0 -0
- notebooks/08-Finetune_Embedding.ipynb +0 -0
- notebooks/09-Better_Embedding_Model.ipynb +1575 -0
- notebooks/10-Adding_Reranking.ipynb +1462 -0
- notebooks/11-Adding_Hybrid_Search.ipynb +1645 -0
- notebooks/12-Improve_Query.ipynb +1786 -0
- notebooks/13-Adding_Router.ipynb +0 -0
- notebooks/14-Adding_Chat.ipynb +1618 -0
- notebooks/15-Use_OpenSource_Models.ipynb +0 -0
- notebooks/17-Using_LLMs_to_rank_chunks_as_the_Judge.ipynb +830 -0
- notebooks/Crawl_a_Website.ipynb +574 -0
- notebooks/Web_Search_API.ipynb +491 -0
- requirements.txt +18 -0
- scripts/basic_tutor.py +60 -0
- scripts/call_openai.py +79 -0
- scripts/create_db.ipynb +380 -0
- scripts/gradio-ui.py +295 -0
- scripts/tutor_prompts.py +100 -0
- scripts/utils.py +16 -0
.gitattributes
ADDED
@@ -0,0 +1,4 @@
+scripts/ai-tutor-db/** filter=lfs diff=lfs merge=lfs -text
+*.csv filter=lfs diff=lfs merge=lfs -text
+*.json filter=lfs diff=lfs merge=lfs -text
+*.jsonl filter=lfs diff=lfs merge=lfs -text
.github/workflows/deploy_hf.yaml
ADDED
@@ -0,0 +1,21 @@
+name: Sync to Hugging Face hub
+on:
+  push:
+    branches: [main]
+
+  # to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+jobs:
+  sync-to-hub:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          fetch-depth: 0
+          lfs: true
+      - name: Push to hub
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_USERNAME: ${{ secrets.HF_USERNAME }}
+        run: git push --force https://$HF_USERNAME:$HF_TOKEN@huggingface.co/spaces/towardsai-buster/ai-tutor-chatbot main:main
.gitignore
ADDED
@@ -0,0 +1,164 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+ai-tutor/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+
+notebooks/mini-llama-articles/
+.vscode/
README.md
ADDED
@@ -0,0 +1,55 @@
+---
+title: AI Tutor Chatbot
+emoji: 🧑🏻‍🏫
+colorFrom: gray
+colorTo: pink
+sdk: gradio
+sdk_version: 4.19.2
+app_file: scripts/gradio-ui.py
+pinned: false
+---
+---
+This project creates a helpful and accurate AI Tutor chatbot, leveraging GPT-3.5-Turbo and a RAG system. We design it to address student questions about AI with precision and clarity.
+
+### Installation
+
+1. **Create a new Python environment:**
+
+   ```bash
+   python -m venv .venv
+   ```
+
+   This command creates a virtual environment named `.venv`.
+
+2. **Activate the environment:**
+
+   For macOS and Linux:
+
+   ```bash
+   source .venv/bin/activate
+   ```
+
+3. **Install the dependencies:**
+
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+### Usage
+
+1. **Set environment variables:**
+
+   Before running the application, you need to set up your OpenAI API key and MongoDB URI as environment variables:
+
+   ```bash
+   export OPENAI_API_KEY=your_openai_api_key_here
+   export MONGODB_URI=your_mongodb_uri_here
+   ```
+
+2. **Run the application:**
+
+   ```bash
+   python scripts/gradio-ui.py
+   ```
+
+   This command starts the Gradio interface for the AI Tutor chatbot.
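
The README above tells the user to export `OPENAI_API_KEY` and `MONGODB_URI` but leaves implicit how the app consumes them. A minimal sketch, assuming `scripts/gradio-ui.py` reads them from the process environment — only the two variable names come from the README; the error handling is illustrative:

```python
import os

# Fail fast with a clear message if a variable from the README is missing.
# The variable names are taken from the README; the handling is a sketch.
try:
    openai_api_key = os.environ["OPENAI_API_KEY"]
    mongodb_uri = os.environ["MONGODB_URI"]
except KeyError as missing:
    raise SystemExit(f"Missing required environment variable: {missing}")

print("Environment configured; the Gradio UI can now be started.")
```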
notebooks/01-Basic_Tutor.ipynb
ADDED
@@ -0,0 +1,291 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "authorship_tag": "ABX9TyOUuEM41HPKH6uCJFqocvSD",
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/01-Basic_Tutor.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Install Packages and Setup Variables"
+      ],
+      "metadata": {
+        "id": "DMXyyXD0xix9"
+      }
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "o4Q0N2omkAoZ",
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "outputId": "703fe996-2acf-4e90-92c1-252041ba7d7a"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m225.4/225.4 kB\u001b[0m \u001b[31m3.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m51.7/51.7 kB\u001b[0m \u001b[31m1.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m8.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.6/75.6 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.1/3.1 MB\u001b[0m \u001b[31m17.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.8/77.8 kB\u001b[0m \u001b[31m6.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m5.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[?25h"
+          ]
+        }
+      ],
+      "source": [
+        "!pip install -q openai==1.12.0 cohere==4.47 tiktoken==0.6.0"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import os\n",
+        "\n",
+        "# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
+        "os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
+      ],
+      "metadata": {
+        "id": "xxK7EAAvr2aT"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Load the API client"
+      ],
+      "metadata": {
+        "id": "68RbStS-xpbL"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "from openai import OpenAI\n",
+        "\n",
+        "# Defining the \"client\" object that enables\n",
+        "# us to connect to OpenAI API endpoints.\n",
+        "client = OpenAI()"
+      ],
+      "metadata": {
+        "id": "La8hdWqJkFkh"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Query the API"
+      ],
+      "metadata": {
+        "id": "CC-sa_uv6J2C"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Define two questions: 1) Related to AI, 2) Unrelated topic.\n",
+        "# These questions will be used to evaluate model's performance.\n",
+        "QUESTION_AI = \"List a number of famous artificial intelligence frameworks?\"\n",
+        "QUESTION_NOT_AI = \"What is the name of the highest mountain in the world and its height?\""
+      ],
+      "metadata": {
+        "id": "7JRrn0uIsBfg"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Defining a function to answer a question using \"gpt-3.5-turbo-16k\" model.\n",
+        "def ask_ai_tutor(question):\n",
+        "    try:\n",
+        "        # Formulating the system prompt and condition the model to answer only AI-related questions.\n",
+        "        system_prompt = (\n",
+        "            \"You are an AI tutor specialized in answering artificial intelligence-related questions. \"\n",
+        "            \"Only answer AI-related question, else say that you cannot answer this question.\"\n",
+        "        )\n",
+        "\n",
+        "        # Create a user prompt with the user's question\n",
+        "        prompt = f\"Please provide an informative and accurate answer to the following question.\\nQuestion: {question}\\nAnswer:\"\n",
+        "\n",
+        "        # Call the OpenAI API\n",
+        "        response = client.chat.completions.create(\n",
+        "            model='gpt-3.5-turbo-16k',\n",
+        "            temperature=0.0,\n",
+        "            messages=[\n",
+        "                {\"role\": \"system\", \"content\": system_prompt},\n",
+        "                {\"role\": \"user\", \"content\": prompt}\n",
+        "            ]\n",
+        "        )\n",
+        "\n",
+        "        # Return the AI's response\n",
+        "        return response.choices[0].message.content.strip()\n",
+        "\n",
+        "    except Exception as e:\n",
+        "        return f\"An error occurred: {e}\""
+      ],
+      "metadata": {
+        "id": "CcP26IauuBuV"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Ask the AI-related question.\n",
+        "RES_AI = ask_ai_tutor( QUESTION_AI )\n",
+        "print( RES_AI )"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "W_dbwURpufR7",
+        "outputId": "3cd84fb9-fe6f-4561-e9ee-ed606a983629"
+      },
+      "execution_count": null,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Sure! There are several famous artificial intelligence frameworks that are widely used in the field. Some of the popular ones include:\n",
+            "\n",
+            "1. TensorFlow: Developed by Google, TensorFlow is an open-source framework that is widely used for machine learning and deep learning tasks. It provides a comprehensive ecosystem of tools, libraries, and resources for building and deploying AI models.\n",
+            "\n",
+            "2. PyTorch: Developed by Facebook's AI Research lab, PyTorch is another popular open-source framework for deep learning. It is known for its dynamic computational graph, which allows for more flexibility and ease of use compared to other frameworks.\n",
+            "\n",
+            "3. Keras: Keras is a high-level neural networks API written in Python. It is built on top of TensorFlow and provides a user-friendly interface for building and training deep learning models. Keras is known for its simplicity and ease of use, making it a popular choice for beginners.\n",
+            "\n",
+            "4. Caffe: Caffe is a deep learning framework developed by Berkeley AI Research (BAIR). It is known for its speed and efficiency, particularly for convolutional neural networks (CNNs). Caffe has been widely used in computer vision tasks and has a large community of users and contributors.\n",
+            "\n",
+            "5. Theano: Theano is a Python library that allows for efficient mathematical computations, particularly for deep learning tasks. It provides a high-level interface for defining and optimizing mathematical expressions, making it a popular choice for researchers and developers.\n",
+            "\n",
+            "These are just a few examples of famous AI frameworks, and there are many others available depending on specific needs and preferences.\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Ask the unrelated question.\n",
+        "RES_NOT_AI = ask_ai_tutor( QUESTION_NOT_AI )\n",
+        "print( RES_NOT_AI )"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "37YuVJQquhpN",
+        "outputId": "4550c44d-2150-4cca-f23e-c89ea43e2040"
+      },
+      "execution_count": null,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "I'm sorry, but I cannot answer that question as it is not related to artificial intelligence.\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# History"
+      ],
+      "metadata": {
+        "id": "NRBgk6WToIK0"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "response = client.chat.completions.create(\n",
+        "    model='gpt-3.5-turbo-16k',\n",
+        "    temperature=0.0,\n",
+        "    messages=[\n",
+        "        {\"role\": \"system\", \"content\": \"You are an AI tutor specialized in answering artificial intelligence-related questions. Only answer AI-related question, else say that you cannot answer this question.\"},\n",
+        "        {\"role\": \"user\", \"content\": \"Please provide an informative and accurate answer to the following question.\\nQuestion: List a number of famous artificial intelligence frameworks?\\nAnswer:\"},\n",
+        "        {\"role\": \"assistant\", \"content\": RES_AI},\n",
+        "        {\"role\": \"user\", \"content\": \"Please provide an informative and accurate answer to the following question.\\nQuestion: What is the name of the highest mountain in the world and its height?\\nAnswer:\"},\n",
+        "        {\"role\": \"assistant\", \"content\": RES_NOT_AI},\n",
+        "        {\"role\": \"user\", \"content\": \"Please provide an informative and accurate answer to the following question.\\nQuestion: Can you write a summary of the first suggested AI framework in the first question?\\nAnswer:\"}\n",
+        "    ]\n",
+        ")\n",
+        "\n",
+        "print( response.choices[0].message.content.strip() )"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "0_6GN2XsoEyM",
+        "outputId": "3e66a833-a552-4bcc-9808-7b9f6b539310"
+      },
+      "execution_count": null,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Certainly! The first suggested AI framework in the previous question was TensorFlow. TensorFlow is an open-source framework developed by Google that has gained significant popularity in the field of artificial intelligence. It is primarily used for building and training machine learning and deep learning models.\n",
+            "\n",
+            "TensorFlow provides a comprehensive ecosystem of tools, libraries, and resources that make it easier for developers to create and deploy AI models. It offers a flexible architecture that allows for efficient computation on both CPUs and GPUs, enabling faster training and inference.\n",
+            "\n",
+            "One of the key features of TensorFlow is its ability to construct and execute computational graphs. These graphs represent the flow of data through a series of mathematical operations, making it easier to visualize and understand the model's structure. TensorFlow also supports automatic differentiation, which simplifies the process of calculating gradients for training neural networks.\n",
+            "\n",
+            "Moreover, TensorFlow has a vast community of users and contributors, which means there is extensive documentation, tutorials, and pre-trained models available. This makes it easier for developers to get started and leverage the collective knowledge of the community.\n",
+            "\n",
+            "Overall, TensorFlow is a powerful and versatile AI framework that has been widely adopted in various domains, including computer vision, natural language processing, and reinforcement learning. Its flexibility, scalability, and extensive community support make it a popular choice for both researchers and practitioners in the field of artificial intelligence.\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [],
+      "metadata": {
+        "id": "ET_l06LiojaN"
+      },
+      "execution_count": null,
+      "outputs": []
+    }
+  ]
+}
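
The History cell in this notebook re-sends the whole conversation by hard-coding each earlier turn into the messages list, which is what gives the model "memory" of the first answer. A minimal sketch of the same pattern with an appendable list instead of hard-coded turns — the model name and system prompt are copied from the notebook, while the `chat()` helper is hypothetical:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# One running transcript; every call sends the full history back to the model.
messages = [{
    "role": "system",
    "content": (
        "You are an AI tutor specialized in answering artificial "
        "intelligence-related questions. Only answer AI-related question, "
        "else say that you cannot answer this question."
    ),
}]

def chat(user_text: str) -> str:
    # Append the user turn, call the model, record and return the reply.
    messages.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-16k", temperature=0.0, messages=messages
    )
    reply = response.choices[0].message.content.strip()
    messages.append({"role": "assistant", "content": reply})
    return reply

print(chat("List a number of famous artificial intelligence frameworks?"))
print(chat("Can you write a summary of the first suggested AI framework?"))
```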
notebooks/02-Basic_RAG.ipynb
ADDED
@@ -0,0 +1,1083 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
{
|
2 |
+
"nbformat": 4,
|
3 |
+
"nbformat_minor": 0,
|
4 |
+
"metadata": {
|
5 |
+
"colab": {
|
6 |
+
"provenance": [],
|
7 |
+
"authorship_tag": "ABX9TyMiGemqWYAYHaqF1t8bElQ/",
|
8 |
+
"include_colab_link": true
|
9 |
+
},
|
10 |
+
"kernelspec": {
|
11 |
+
"name": "python3",
|
12 |
+
"display_name": "Python 3"
|
13 |
+
},
|
14 |
+
"language_info": {
|
15 |
+
"name": "python"
|
16 |
+
},
|
17 |
+
"widgets": {
|
18 |
+
"application/vnd.jupyter.widget-state+json": {
|
19 |
+
"46a91770024e4802acd3e64e9bc46f32": {
|
20 |
+
"model_module": "@jupyter-widgets/controls",
|
21 |
+
"model_name": "HBoxModel",
|
22 |
+
"model_module_version": "1.5.0",
|
23 |
+
"state": {
|
24 |
+
"_dom_classes": [],
|
25 |
+
"_model_module": "@jupyter-widgets/controls",
|
26 |
+
"_model_module_version": "1.5.0",
|
27 |
+
"_model_name": "HBoxModel",
|
28 |
+
"_view_count": null,
|
29 |
+
"_view_module": "@jupyter-widgets/controls",
|
30 |
+
"_view_module_version": "1.5.0",
|
31 |
+
"_view_name": "HBoxView",
|
32 |
+
"box_style": "",
|
33 |
+
"children": [
|
34 |
+
"IPY_MODEL_613898a418d64df3b18d35083f0bb36d",
|
35 |
+
"IPY_MODEL_9f9427eb6a644166906bb321f13eaf48",
|
36 |
+
"IPY_MODEL_a4a232c5b5e1493897e9acdd25b8efd4"
|
37 |
+
],
|
38 |
+
"layout": "IPY_MODEL_b2e91819e1c94f28b7bbad66918cb797"
|
39 |
+
}
|
40 |
+
},
|
41 |
+
"613898a418d64df3b18d35083f0bb36d": {
|
42 |
+
"model_module": "@jupyter-widgets/controls",
|
43 |
+
"model_name": "HTMLModel",
|
44 |
+
"model_module_version": "1.5.0",
|
45 |
+
"state": {
|
46 |
+
"_dom_classes": [],
|
47 |
+
"_model_module": "@jupyter-widgets/controls",
|
48 |
+
"_model_module_version": "1.5.0",
|
49 |
+
"_model_name": "HTMLModel",
|
50 |
+
"_view_count": null,
|
51 |
+
"_view_module": "@jupyter-widgets/controls",
|
52 |
+
"_view_module_version": "1.5.0",
|
53 |
+
"_view_name": "HTMLView",
|
54 |
+
"description": "",
|
55 |
+
"description_tooltip": null,
|
56 |
+
"layout": "IPY_MODEL_010cbcb0f1364576b15f792f4d11f605",
|
57 |
+
"placeholder": "",
|
58 |
+
"style": "IPY_MODEL_f51d5da0f39e4c1885357d3d4c9964d9",
|
59 |
+
"value": ""
|
60 |
+
}
|
61 |
+
},
|
62 |
+
"9f9427eb6a644166906bb321f13eaf48": {
|
63 |
+
"model_module": "@jupyter-widgets/controls",
|
64 |
+
"model_name": "FloatProgressModel",
|
65 |
+
"model_module_version": "1.5.0",
|
66 |
+
"state": {
|
67 |
+
"_dom_classes": [],
|
68 |
+
"_model_module": "@jupyter-widgets/controls",
|
69 |
+
"_model_module_version": "1.5.0",
|
70 |
+
"_model_name": "FloatProgressModel",
|
71 |
+
"_view_count": null,
|
72 |
+
"_view_module": "@jupyter-widgets/controls",
|
73 |
+
"_view_module_version": "1.5.0",
|
74 |
+
"_view_name": "ProgressView",
|
75 |
+
"bar_style": "success",
|
76 |
+
"description": "",
|
77 |
+
"description_tooltip": null,
|
78 |
+
"layout": "IPY_MODEL_c4ceff5437e0470089c161e21488d2a7",
|
79 |
+
"max": 1,
|
80 |
+
"min": 0,
|
81 |
+
"orientation": "horizontal",
|
82 |
+
"style": "IPY_MODEL_6aafd52b0e3e4e0183b1666ad1e8a448",
|
83 |
+
"value": 1
|
84 |
+
}
|
85 |
+
},
|
86 |
+
"a4a232c5b5e1493897e9acdd25b8efd4": {
|
87 |
+
"model_module": "@jupyter-widgets/controls",
|
88 |
+
"model_name": "HTMLModel",
|
89 |
+
"model_module_version": "1.5.0",
|
90 |
+
"state": {
|
91 |
+
"_dom_classes": [],
|
92 |
+
"_model_module": "@jupyter-widgets/controls",
|
93 |
+
"_model_module_version": "1.5.0",
|
94 |
+
"_model_name": "HTMLModel",
|
95 |
+
"_view_count": null,
|
96 |
+
"_view_module": "@jupyter-widgets/controls",
|
97 |
+
"_view_module_version": "1.5.0",
|
98 |
+
"_view_name": "HTMLView",
|
99 |
+
"description": "",
|
100 |
+
"description_tooltip": null,
|
101 |
+
"layout": "IPY_MODEL_80137fc11d4b4e518d8c8957ca5461b1",
|
102 |
+
"placeholder": "",
|
103 |
+
"style": "IPY_MODEL_c4236d507b354bff830620a8bde32191",
|
104 |
+
"value": " 174/? [00:31<00:00, 6.30it/s]"
|
105 |
+
}
|
106 |
+
},
|
107 |
+
"b2e91819e1c94f28b7bbad66918cb797": {
|
108 |
+
"model_module": "@jupyter-widgets/base",
|
109 |
+
"model_name": "LayoutModel",
|
110 |
+
"model_module_version": "1.2.0",
|
111 |
+
"state": {
|
112 |
+
"_model_module": "@jupyter-widgets/base",
|
113 |
+
"_model_module_version": "1.2.0",
|
114 |
+
"_model_name": "LayoutModel",
|
115 |
+
"_view_count": null,
|
116 |
+
"_view_module": "@jupyter-widgets/base",
|
117 |
+
"_view_module_version": "1.2.0",
|
118 |
+
"_view_name": "LayoutView",
|
119 |
+
"align_content": null,
|
120 |
+
"align_items": null,
|
121 |
+
"align_self": null,
|
122 |
+
"border": null,
|
123 |
+
"bottom": null,
|
124 |
+
"display": null,
|
125 |
+
"flex": null,
|
126 |
+
"flex_flow": null,
|
127 |
+
"grid_area": null,
|
128 |
+
"grid_auto_columns": null,
|
129 |
+
"grid_auto_flow": null,
|
130 |
+
"grid_auto_rows": null,
|
131 |
+
"grid_column": null,
|
132 |
+
"grid_gap": null,
|
133 |
+
"grid_row": null,
|
134 |
+
"grid_template_areas": null,
|
135 |
+
"grid_template_columns": null,
|
136 |
+
"grid_template_rows": null,
|
137 |
+
"height": null,
|
138 |
+
"justify_content": null,
|
139 |
+
"justify_items": null,
|
140 |
+
"left": null,
|
141 |
+
"margin": null,
|
142 |
+
"max_height": null,
|
143 |
+
"max_width": null,
|
144 |
+
"min_height": null,
|
145 |
+
"min_width": null,
|
146 |
+
"object_fit": null,
|
147 |
+
"object_position": null,
|
148 |
+
"order": null,
|
149 |
+
"overflow": null,
|
150 |
+
"overflow_x": null,
|
151 |
+
"overflow_y": null,
|
152 |
+
"padding": null,
|
153 |
+
"right": null,
|
154 |
+
"top": null,
|
155 |
+
"visibility": null,
|
156 |
+
"width": null
|
157 |
+
}
|
158 |
+
},
|
159 |
+
"010cbcb0f1364576b15f792f4d11f605": {
|
160 |
+
"model_module": "@jupyter-widgets/base",
|
161 |
+
"model_name": "LayoutModel",
|
162 |
+
"model_module_version": "1.2.0",
|
163 |
+
"state": {
|
164 |
+
"_model_module": "@jupyter-widgets/base",
|
165 |
+
"_model_module_version": "1.2.0",
|
166 |
+
"_model_name": "LayoutModel",
|
167 |
+
"_view_count": null,
|
168 |
+
"_view_module": "@jupyter-widgets/base",
|
169 |
+
"_view_module_version": "1.2.0",
|
170 |
+
"_view_name": "LayoutView",
|
171 |
+
"align_content": null,
|
172 |
+
"align_items": null,
|
173 |
+
"align_self": null,
|
174 |
+
"border": null,
|
175 |
+
"bottom": null,
|
176 |
+
"display": null,
|
177 |
+
"flex": null,
|
178 |
+
"flex_flow": null,
|
179 |
+
"grid_area": null,
|
180 |
+
"grid_auto_columns": null,
|
181 |
+
"grid_auto_flow": null,
|
182 |
+
"grid_auto_rows": null,
|
183 |
+
"grid_column": null,
|
184 |
+
"grid_gap": null,
|
185 |
+
"grid_row": null,
|
186 |
+
"grid_template_areas": null,
|
187 |
+
"grid_template_columns": null,
|
188 |
+
"grid_template_rows": null,
|
189 |
+
"height": null,
|
190 |
+
"justify_content": null,
|
191 |
+
"justify_items": null,
|
192 |
+
"left": null,
|
193 |
+
"margin": null,
|
194 |
+
"max_height": null,
|
195 |
+
"max_width": null,
|
196 |
+
"min_height": null,
|
197 |
+
"min_width": null,
|
198 |
+
"object_fit": null,
|
199 |
+
"object_position": null,
|
200 |
+
"order": null,
|
201 |
+
"overflow": null,
|
202 |
+
"overflow_x": null,
|
203 |
+
"overflow_y": null,
|
204 |
+
"padding": null,
|
205 |
+
"right": null,
|
206 |
+
"top": null,
|
207 |
+
"visibility": null,
|
208 |
+
"width": null
|
209 |
+
}
|
210 |
+
},
|
211 |
+
"f51d5da0f39e4c1885357d3d4c9964d9": {
|
212 |
+
"model_module": "@jupyter-widgets/controls",
|
213 |
+
"model_name": "DescriptionStyleModel",
|
214 |
+
"model_module_version": "1.5.0",
|
215 |
+
"state": {
|
216 |
+
"_model_module": "@jupyter-widgets/controls",
|
217 |
+
"_model_module_version": "1.5.0",
|
218 |
+
"_model_name": "DescriptionStyleModel",
|
219 |
+
"_view_count": null,
|
220 |
+
"_view_module": "@jupyter-widgets/base",
|
221 |
+
"_view_module_version": "1.2.0",
|
222 |
+
"_view_name": "StyleView",
|
223 |
+
"description_width": ""
|
224 |
+
}
|
225 |
+
},
|
226 |
+
"c4ceff5437e0470089c161e21488d2a7": {
|
227 |
+
"model_module": "@jupyter-widgets/base",
|
228 |
+
"model_name": "LayoutModel",
|
229 |
+
"model_module_version": "1.2.0",
|
230 |
+
"state": {
|
231 |
+
"_model_module": "@jupyter-widgets/base",
|
232 |
+
"_model_module_version": "1.2.0",
|
233 |
+
"_model_name": "LayoutModel",
|
234 |
+
"_view_count": null,
|
235 |
+
"_view_module": "@jupyter-widgets/base",
|
236 |
+
"_view_module_version": "1.2.0",
|
237 |
+
"_view_name": "LayoutView",
|
238 |
+
"align_content": null,
|
239 |
+
"align_items": null,
|
240 |
+
"align_self": null,
|
241 |
+
"border": null,
|
242 |
+
"bottom": null,
|
243 |
+
"display": null,
|
244 |
+
"flex": null,
|
245 |
+
"flex_flow": null,
|
246 |
+
"grid_area": null,
|
247 |
+
"grid_auto_columns": null,
|
248 |
+
"grid_auto_flow": null,
|
249 |
+
"grid_auto_rows": null,
|
250 |
+
"grid_column": null,
|
251 |
+
"grid_gap": null,
|
252 |
+
"grid_row": null,
|
253 |
+
"grid_template_areas": null,
|
254 |
+
"grid_template_columns": null,
|
255 |
+
"grid_template_rows": null,
|
256 |
+
"height": null,
|
257 |
+
"justify_content": null,
|
258 |
+
"justify_items": null,
|
259 |
+
"left": null,
|
260 |
+
"margin": null,
|
261 |
+
"max_height": null,
|
262 |
+
"max_width": null,
|
263 |
+
"min_height": null,
|
264 |
+
"min_width": null,
|
265 |
+
"object_fit": null,
|
266 |
+
"object_position": null,
|
267 |
+
"order": null,
|
268 |
+
"overflow": null,
|
269 |
+
"overflow_x": null,
|
270 |
+
"overflow_y": null,
|
271 |
+
"padding": null,
|
272 |
+
"right": null,
|
273 |
+
"top": null,
|
274 |
+
"visibility": null,
|
275 |
+
"width": "20px"
|
276 |
+
}
|
277 |
+
},
|
278 |
+
"6aafd52b0e3e4e0183b1666ad1e8a448": {
|
279 |
+
"model_module": "@jupyter-widgets/controls",
|
280 |
+
"model_name": "ProgressStyleModel",
|
281 |
+
"model_module_version": "1.5.0",
|
282 |
+
"state": {
|
283 |
+
"_model_module": "@jupyter-widgets/controls",
|
284 |
+
"_model_module_version": "1.5.0",
|
285 |
+
"_model_name": "ProgressStyleModel",
|
286 |
+
"_view_count": null,
|
287 |
+
"_view_module": "@jupyter-widgets/base",
|
288 |
+
"_view_module_version": "1.2.0",
|
289 |
+
"_view_name": "StyleView",
|
290 |
+
"bar_color": null,
|
291 |
+
"description_width": ""
|
292 |
+
}
|
293 |
+
},
|
294 |
+
"80137fc11d4b4e518d8c8957ca5461b1": {
|
295 |
+
"model_module": "@jupyter-widgets/base",
|
296 |
+
"model_name": "LayoutModel",
|
297 |
+
"model_module_version": "1.2.0",
|
298 |
+
"state": {
|
299 |
+
"_model_module": "@jupyter-widgets/base",
|
300 |
+
"_model_module_version": "1.2.0",
|
301 |
+
"_model_name": "LayoutModel",
|
302 |
+
"_view_count": null,
|
303 |
+
"_view_module": "@jupyter-widgets/base",
|
304 |
+
"_view_module_version": "1.2.0",
|
305 |
+
"_view_name": "LayoutView",
|
306 |
+
"align_content": null,
|
307 |
+
"align_items": null,
|
308 |
+
"align_self": null,
|
309 |
+
"border": null,
|
310 |
+
"bottom": null,
|
311 |
+
"display": null,
|
312 |
+
"flex": null,
|
313 |
+
"flex_flow": null,
|
314 |
+
"grid_area": null,
|
315 |
+
"grid_auto_columns": null,
|
316 |
+
"grid_auto_flow": null,
|
317 |
+
"grid_auto_rows": null,
|
318 |
+
"grid_column": null,
|
319 |
+
"grid_gap": null,
|
320 |
+
"grid_row": null,
|
321 |
+
"grid_template_areas": null,
|
322 |
+
"grid_template_columns": null,
|
323 |
+
"grid_template_rows": null,
|
324 |
+
"height": null,
|
325 |
+
"justify_content": null,
|
326 |
+
"justify_items": null,
|
327 |
+
"left": null,
|
328 |
+
"margin": null,
|
329 |
+
"max_height": null,
|
330 |
+
"max_width": null,
|
331 |
+
"min_height": null,
|
332 |
+
"min_width": null,
|
333 |
+
"object_fit": null,
|
334 |
+
"object_position": null,
|
335 |
+
"order": null,
|
336 |
+
"overflow": null,
|
337 |
+
"overflow_x": null,
|
338 |
+
"overflow_y": null,
|
339 |
+
"padding": null,
|
340 |
+
"right": null,
|
341 |
+
"top": null,
|
342 |
+
"visibility": null,
|
343 |
+
"width": null
|
344 |
+
}
|
345 |
+
},
|
346 |
+
"c4236d507b354bff830620a8bde32191": {
|
347 |
+
"model_module": "@jupyter-widgets/controls",
|
348 |
+
"model_name": "DescriptionStyleModel",
|
349 |
+
"model_module_version": "1.5.0",
|
350 |
+
"state": {
|
351 |
+
"_model_module": "@jupyter-widgets/controls",
|
352 |
+
"_model_module_version": "1.5.0",
|
353 |
+
"_model_name": "DescriptionStyleModel",
|
354 |
+
"_view_count": null,
|
355 |
+
"_view_module": "@jupyter-widgets/base",
|
356 |
+
"_view_module_version": "1.2.0",
|
357 |
+
"_view_name": "StyleView",
|
358 |
+
"description_width": ""
|
359 |
+
}
|
360 |
+
}
|
361 |
+
}
|
362 |
+
}
|
363 |
+
},
|
364 |
+
"cells": [
|
365 |
+
{
|
366 |
+
"cell_type": "markdown",
|
367 |
+
"metadata": {
|
368 |
+
"id": "view-in-github",
|
369 |
+
"colab_type": "text"
|
370 |
+
},
|
371 |
+
"source": [
|
372 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/02-Basic_RAG.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
373 |
+
]
|
374 |
+
},
|
375 |
+
{
|
376 |
+
"cell_type": "markdown",
|
377 |
+
"source": [
|
378 |
+
"# Install Packages and Setup Variables"
|
379 |
+
],
|
380 |
+
"metadata": {
|
381 |
+
"id": "4Tw3tvMs6R-Y"
|
382 |
+
}
|
383 |
+
},
|
384 |
+
{
|
385 |
+
"cell_type": "code",
|
386 |
+
"execution_count": null,
|
387 |
+
"metadata": {
|
388 |
+
"colab": {
|
389 |
+
"base_uri": "https://localhost:8080/"
|
390 |
+
},
|
391 |
+
"id": "HaB4G9zr0BYm",
|
392 |
+
"outputId": "2a76e676-6fae-44df-ae8c-e4869bfbbc2d"
|
393 |
+
},
|
394 |
+
"outputs": [
|
395 |
+
{
|
396 |
+
"output_type": "stream",
|
397 |
+
"name": "stdout",
|
398 |
+
"text": [
|
399 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m225.4/225.4 kB\u001b[0m \u001b[31m2.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
400 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m51.7/51.7 kB\u001b[0m \u001b[31m3.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
401 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m17.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
402 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.6/75.6 kB\u001b[0m \u001b[31m6.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
403 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━���━━━━━━━\u001b[0m \u001b[32m3.1/3.1 MB\u001b[0m \u001b[31m17.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
404 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.8/77.8 kB\u001b[0m \u001b[31m3.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
405 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m5.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
406 |
+
"\u001b[?25h"
|
407 |
+
]
|
408 |
+
}
|
409 |
+
],
|
410 |
+
"source": [
|
411 |
+
"!pip install -q openai==1.12.0 cohere==4.47 tiktoken==0.6.0"
|
412 |
+
]
|
413 |
+
},
|
414 |
+
{
|
415 |
+
"cell_type": "code",
|
416 |
+
"source": [
|
417 |
+
"import os\n",
|
418 |
+
"\n",
|
419 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
420 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
|
421 |
+
],
|
422 |
+
"metadata": {
|
423 |
+
"id": "MYvUA6CF2Le6"
|
424 |
+
},
|
425 |
+
"execution_count": null,
|
426 |
+
"outputs": []
|
427 |
+
},
|
428 |
+
{
|
429 |
+
"cell_type": "code",
|
430 |
+
"source": [
|
431 |
+
"# False: Generate the embedding for the dataset. (Associated cost with using OpenAI endpoint)\n",
|
432 |
+
"# True: Load the dataset that already has the embedding vectors.\n",
|
433 |
+
"load_embedding = False"
|
434 |
+
],
|
435 |
+
"metadata": {
|
436 |
+
"id": "0ViVXXIqXBai"
|
437 |
+
},
|
438 |
+
"execution_count": null,
|
439 |
+
"outputs": []
|
440 |
+
},
|
441 |
+
{
|
442 |
+
"cell_type": "markdown",
|
443 |
+
"source": [
|
444 |
+
"# Load Dataset"
|
445 |
+
],
|
446 |
+
"metadata": {
|
447 |
+
"id": "D8Nzx-cN_bDz"
|
448 |
+
}
|
449 |
+
},
|
450 |
+
{
|
451 |
+
"cell_type": "markdown",
|
452 |
+
"source": [
|
453 |
+
"## Download Dataset (JSON)"
|
454 |
+
],
|
455 |
+
"metadata": {
|
456 |
+
"id": "5JpI7GiZ--Gw"
|
457 |
+
}
|
458 |
+
},
|
459 |
+
{
|
460 |
+
"cell_type": "markdown",
|
461 |
+
"source": [
|
462 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model."
|
463 |
+
],
|
464 |
+
"metadata": {
|
465 |
+
"id": "NT68BDYt-GkG"
|
466 |
+
}
|
467 |
+
},
|
468 |
+
{
|
469 |
+
"cell_type": "code",
|
470 |
+
"source": [
|
471 |
+
"!wget https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv\n",
|
472 |
+
"!wget https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles-with_embeddings.csv"
|
473 |
+
],
|
474 |
+
"metadata": {
|
475 |
+
"colab": {
|
476 |
+
"base_uri": "https://localhost:8080/"
|
477 |
+
},
|
478 |
+
"id": "p6NEJT9S2OoH",
|
479 |
+
"outputId": "fd3aa19c-a644-4635-9838-2c20526c4da2"
|
480 |
+
},
|
481 |
+
"execution_count": null,
|
482 |
+
"outputs": [
|
483 |
+
{
|
484 |
+
"output_type": "stream",
|
485 |
+
"name": "stdout",
|
486 |
+
"text": [
|
487 |
+
"--2024-03-20 16:18:39-- https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv\n",
|
488 |
+
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
|
489 |
+
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
|
490 |
+
"HTTP request sent, awaiting response... 200 OK\n",
|
491 |
+
"Length: 173646 (170K) [text/plain]\n",
|
492 |
+
"Saving to: ‘mini-llama-articles.csv’\n",
|
493 |
+
"\n",
|
494 |
+
"\rmini-llama-articles 0%[ ] 0 --.-KB/s \rmini-llama-articles 100%[===================>] 169.58K --.-KB/s in 0.02s \n",
|
495 |
+
"\n",
|
496 |
+
"2024-03-20 16:18:40 (6.91 MB/s) - ‘mini-llama-articles.csv’ saved [173646/173646]\n",
|
497 |
+
"\n",
|
498 |
+
"--2024-03-20 16:18:40-- https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles-with_embeddings.csv\n",
|
499 |
+
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
|
500 |
+
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
|
501 |
+
"HTTP request sent, awaiting response... 200 OK\n",
|
502 |
+
"Length: 11868176 (11M) [text/plain]\n",
|
503 |
+
"Saving to: ‘mini-llama-articles-with_embeddings.csv’\n",
|
504 |
+
"\n",
|
505 |
+
"mini-llama-articles 100%[===================>] 11.32M --.-KB/s in 0.1s \n",
|
506 |
+
"\n",
|
507 |
+
"2024-03-20 16:18:40 (103 MB/s) - ‘mini-llama-articles-with_embeddings.csv’ saved [11868176/11868176]\n",
|
508 |
+
"\n"
|
509 |
+
]
|
510 |
+
}
|
511 |
+
]
|
512 |
+
},
|
513 |
+
{
|
514 |
+
"cell_type": "markdown",
|
515 |
+
"source": [
|
516 |
+
"## Read File"
|
517 |
+
],
|
518 |
+
"metadata": {
|
519 |
+
"id": "oYDd03Qn_clh"
|
520 |
+
}
|
521 |
+
},
|
522 |
+
{
|
523 |
+
"cell_type": "code",
|
524 |
+
"source": [
|
525 |
+
"# Split the input text into chunks of specified size.\n",
|
526 |
+
"def split_into_chunks(text, chunk_size=1024):\n",
|
527 |
+
" chunks = []\n",
|
528 |
+
" for i in range(0, len(text), chunk_size):\n",
|
529 |
+
" chunks.append( text[i:i+chunk_size] )\n",
|
530 |
+
"\n",
|
531 |
+
" return chunks"
|
532 |
+
],
|
533 |
+
"metadata": {
|
534 |
+
"id": "_bfhs5NMYr4N"
|
535 |
+
},
|
536 |
+
"execution_count": null,
|
537 |
+
"outputs": []
|
538 |
+
},
|
539 |
+
{
|
540 |
+
"cell_type": "code",
|
541 |
+
"source": [
|
542 |
+
"import csv\n",
|
543 |
+
"\n",
|
544 |
+
"chunks = []\n",
|
545 |
+
"\n",
|
546 |
+
"# Load the file as a CSV\n",
|
547 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
548 |
+
" csv_reader = csv.reader(file)\n",
|
549 |
+
"\n",
|
550 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
551 |
+
" if idx == 0: continue; # Skip header row\n",
|
552 |
+
" chunks.extend( split_into_chunks(row[1]) )"
|
553 |
+
],
|
554 |
+
"metadata": {
|
555 |
+
"id": "UcQ7Ge_XCuXa"
|
556 |
+
},
|
557 |
+
"execution_count": null,
|
558 |
+
"outputs": []
|
559 |
+
},
|
560 |
+
{
|
561 |
+
"cell_type": "code",
|
562 |
+
"source": [
|
563 |
+
"import pandas as pd\n",
|
564 |
+
"\n",
|
565 |
+
"# Convert the JSON list to a Pandas Dataframe\n",
|
566 |
+
"df = pd.DataFrame(chunks, columns=['chunk'])\n",
|
567 |
+
"\n",
|
568 |
+
"df.keys()"
|
569 |
+
],
|
570 |
+
"metadata": {
|
571 |
+
"colab": {
|
572 |
+
"base_uri": "https://localhost:8080/"
|
573 |
+
},
|
574 |
+
"id": "JKdFSOb0NXjx",
|
575 |
+
"outputId": "ce43c97f-2083-49b5-837d-62cc427fe848"
|
576 |
+
},
|
577 |
+
"execution_count": null,
|
578 |
+
"outputs": [
|
579 |
+
{
|
580 |
+
"output_type": "execute_result",
|
581 |
+
"data": {
|
582 |
+
"text/plain": [
|
583 |
+
"Index(['chunk'], dtype='object')"
|
584 |
+
]
|
585 |
+
},
|
586 |
+
"metadata": {},
|
587 |
+
"execution_count": 8
|
588 |
+
}
|
589 |
+
]
|
590 |
+
},
|
591 |
+
{
|
592 |
+
"cell_type": "markdown",
|
593 |
+
"source": [
|
594 |
+
"# Generate Embedding"
|
595 |
+
],
|
596 |
+
"metadata": {
|
597 |
+
"id": "21pFDgNdW9rO"
|
598 |
+
}
|
599 |
+
},
|
600 |
+
{
|
601 |
+
"cell_type": "code",
|
602 |
+
"source": [
|
603 |
+
"from openai import OpenAI\n",
|
604 |
+
"\n",
|
605 |
+
"client = OpenAI()\n",
|
606 |
+
"\n",
|
607 |
+
"# Defining a function that converts a text to embedding vector using OpenAI's Ada model.\n",
|
608 |
+
"def get_embedding(text):\n",
|
609 |
+
" try:\n",
|
610 |
+
" # Remove newlines\n",
|
611 |
+
" text = text.replace(\"\\n\", \" \")\n",
|
612 |
+
" res = client.embeddings.create(input = [text], model=\"text-embedding-ada-002\")\n",
|
613 |
+
"\n",
|
614 |
+
" return res.data[0].embedding\n",
|
615 |
+
"\n",
|
616 |
+
" except:\n",
|
617 |
+
" return None"
|
618 |
+
],
|
619 |
+
"metadata": {
|
620 |
+
"id": "AfS9w9eQAKyu"
|
621 |
+
},
|
622 |
+
"execution_count": null,
|
623 |
+
"outputs": []
|
624 |
+
},
|
625 |
+
{
|
626 |
+
"cell_type": "code",
|
627 |
+
"source": [
|
628 |
+
"from tqdm.notebook import tqdm\n",
|
629 |
+
"import numpy as np\n",
|
630 |
+
"\n",
|
631 |
+
"# Generate embedding\n",
|
632 |
+
"if not load_embedding:\n",
|
633 |
+
" print(\"Generating embeddings...\")\n",
|
634 |
+
" embeddings = []\n",
|
635 |
+
" for index, row in tqdm( df.iterrows() ):\n",
|
636 |
+
" # df.at[index, 'embedding'] = get_embedding( row['chunk'] )\n",
|
637 |
+
" embeddings.append( get_embedding( row['chunk'] ) )\n",
|
638 |
+
"\n",
|
639 |
+
" embeddings_values = pd.Series(embeddings)\n",
|
640 |
+
" df.insert(loc=1, column='embedding', value=embeddings_values)\n",
|
641 |
+
"\n",
|
642 |
+
"# Or, load the embedding from the file.\n",
|
643 |
+
"else:\n",
|
644 |
+
" print(\"Loaded the embedding file.\")\n",
|
645 |
+
" # Load the file as a CSV\n",
|
646 |
+
" df = pd.read_csv('mini-llama-articles-with_embeddings.csv')\n",
|
647 |
+
" # Convert embedding column to an array\n",
|
648 |
+
" df['embedding'] = df['embedding'].apply(lambda x: np.array(eval(x)), 0)"
|
649 |
+
],
|
650 |
+
"metadata": {
|
651 |
+
"colab": {
|
652 |
+
"base_uri": "https://localhost:8080/",
|
653 |
+
"height": 67,
|
654 |
+
"referenced_widgets": [
|
655 |
+
"46a91770024e4802acd3e64e9bc46f32",
|
656 |
+
"613898a418d64df3b18d35083f0bb36d",
|
657 |
+
"9f9427eb6a644166906bb321f13eaf48",
|
658 |
+
"a4a232c5b5e1493897e9acdd25b8efd4",
|
659 |
+
"b2e91819e1c94f28b7bbad66918cb797",
|
660 |
+
"010cbcb0f1364576b15f792f4d11f605",
|
661 |
+
"f51d5da0f39e4c1885357d3d4c9964d9",
|
662 |
+
"c4ceff5437e0470089c161e21488d2a7",
|
663 |
+
"6aafd52b0e3e4e0183b1666ad1e8a448",
|
664 |
+
"80137fc11d4b4e518d8c8957ca5461b1",
|
665 |
+
"c4236d507b354bff830620a8bde32191"
|
666 |
+
]
|
667 |
+
},
|
668 |
+
"id": "qC6aeFr3Rmi2",
|
669 |
+
"outputId": "7f54333f-fcb9-44ce-d4a0-94a9a8d822d5"
|
670 |
+
},
|
671 |
+
"execution_count": null,
|
672 |
+
"outputs": [
|
673 |
+
{
|
674 |
+
"output_type": "stream",
|
675 |
+
"name": "stdout",
|
676 |
+
"text": [
|
677 |
+
"Generating embeddings...\n"
|
678 |
+
]
|
679 |
+
},
|
680 |
+
{
|
681 |
+
"output_type": "display_data",
|
682 |
+
"data": {
|
683 |
+
"text/plain": [
|
684 |
+
"0it [00:00, ?it/s]"
|
685 |
+
],
|
686 |
+
"application/vnd.jupyter.widget-view+json": {
|
687 |
+
"version_major": 2,
|
688 |
+
"version_minor": 0,
|
689 |
+
"model_id": "46a91770024e4802acd3e64e9bc46f32"
|
690 |
+
}
|
691 |
+
},
|
692 |
+
"metadata": {}
|
693 |
+
}
|
694 |
+
]
|
695 |
+
},
|
696 |
+
{
|
697 |
+
"cell_type": "code",
|
698 |
+
"source": [
|
699 |
+
"# df.to_csv('mini-llama-articles-with_embeddings.csv')"
|
700 |
+
],
|
701 |
+
"metadata": {
|
702 |
+
"id": "jyX9M_n9o2ve"
|
703 |
+
},
|
704 |
+
"execution_count": null,
|
705 |
+
"outputs": []
|
706 |
+
},
|
707 |
+
{
|
708 |
+
"cell_type": "markdown",
|
709 |
+
"source": [
|
710 |
+
"# User Question"
|
711 |
+
],
|
712 |
+
"metadata": {
|
713 |
+
"id": "E_qrXwImXrXJ"
|
714 |
+
}
|
715 |
+
},
|
716 |
+
{
|
717 |
+
"cell_type": "code",
|
718 |
+
"source": [
|
719 |
+
"# Define the user question, and convert it to embedding.\n",
|
720 |
+
"QUESTION = \"How many parameters LLaMA2 model has?\"\n",
|
721 |
+
"QUESTION_emb = get_embedding( QUESTION )\n",
|
722 |
+
"\n",
|
723 |
+
"len( QUESTION_emb )"
|
724 |
+
],
|
725 |
+
"metadata": {
|
726 |
+
"colab": {
|
727 |
+
"base_uri": "https://localhost:8080/"
|
728 |
+
},
|
729 |
+
"id": "xGTa7cqCX97q",
|
730 |
+
"outputId": "6ae836e3-1a65-4447-b732-88758378e9dd"
|
731 |
+
},
|
732 |
+
"execution_count": null,
|
733 |
+
"outputs": [
|
734 |
+
{
|
735 |
+
"output_type": "execute_result",
|
736 |
+
"data": {
|
737 |
+
"text/plain": [
|
738 |
+
"1536"
|
739 |
+
]
|
740 |
+
},
|
741 |
+
"metadata": {},
|
742 |
+
"execution_count": 15
|
743 |
+
}
|
744 |
+
]
|
745 |
+
},
|
746 |
+
{
|
747 |
+
"cell_type": "markdown",
|
748 |
+
"source": [
|
749 |
+
"# Test Cosine Similarity"
|
750 |
+
],
|
751 |
+
"metadata": {
|
752 |
+
"id": "BXNzNWrJYWhU"
|
753 |
+
}
|
754 |
+
},
|
755 |
+
{
|
756 |
+
"cell_type": "markdown",
|
757 |
+
"source": [
|
758 |
+
"Calculating the similarity of embedding representations can help us to find pieces of text that are close to each other. In the following sample you see how the Cosine Similarity metric can identify which sentence could be a possible answer for the given user question. Obviously, the unrelated answer will score lower."
|
759 |
+
],
|
760 |
+
"metadata": {
|
761 |
+
"id": "Vxaq-FgLIhIj"
|
762 |
+
}
|
763 |
+
},
|
764 |
+
{
|
765 |
+
"cell_type": "code",
|
766 |
+
"source": [
|
767 |
+
"BAD_SOURCE_emb = get_embedding( \"The sky is blue.\" )\n",
|
768 |
+
"GOOD_SOURCE_emb = get_embedding( \"LLaMA2 model has a total of 2B parameters.\" )"
|
769 |
+
],
|
770 |
+
"metadata": {
|
771 |
+
"id": "LqDWcPd4b-ZI"
|
772 |
+
},
|
773 |
+
"execution_count": null,
|
774 |
+
"outputs": []
|
775 |
+
},
|
776 |
+
{
|
777 |
+
"cell_type": "code",
|
778 |
+
"source": [
|
779 |
+
"from sklearn.metrics.pairwise import cosine_similarity\n",
|
780 |
+
"\n",
|
781 |
+
"# A sample that how a good piece of text can achieve high similarity score compared\n",
|
782 |
+
"# to a completely unrelated text.\n",
|
783 |
+
"print(\"> Bad Response Score:\", cosine_similarity([QUESTION_emb], [BAD_SOURCE_emb]) )\n",
|
784 |
+
"print(\"> Good Response Score:\", cosine_similarity([QUESTION_emb], [GOOD_SOURCE_emb]) )"
|
785 |
+
],
|
786 |
+
"metadata": {
|
787 |
+
"colab": {
|
788 |
+
"base_uri": "https://localhost:8080/"
|
789 |
+
},
|
790 |
+
"id": "OI00eN86YZKB",
|
791 |
+
"outputId": "0d06c9ea-7de2-48a0-e6d8-3fc6e428914b"
|
792 |
+
},
|
793 |
+
"execution_count": null,
|
794 |
+
"outputs": [
|
795 |
+
{
|
796 |
+
"output_type": "stream",
|
797 |
+
"name": "stdout",
|
798 |
+
"text": [
|
799 |
+
"> Bad Response Score: [[0.69953438]]\n",
|
800 |
+
"> Good Response Score: [[0.93126147]]\n"
|
801 |
+
]
|
802 |
+
}
|
803 |
+
]
|
804 |
+
},
|
805 |
+
{
|
806 |
+
"cell_type": "markdown",
|
807 |
+
"source": [
|
808 |
+
"# Calculate Cosine Similarities"
|
809 |
+
],
|
810 |
+
"metadata": {
|
811 |
+
"id": "kdJlEtaaJC4I"
|
812 |
+
}
|
813 |
+
},
|
814 |
+
{
|
815 |
+
"cell_type": "code",
|
816 |
+
"source": [
|
817 |
+
"# The similarity between the questions and each part of the essay.\n",
|
818 |
+
"cosine_similarities = cosine_similarity( [QUESTION_emb], df['embedding'].tolist() )\n",
|
819 |
+
"\n",
|
820 |
+
"print( cosine_similarities )"
|
821 |
+
],
|
822 |
+
"metadata": {
|
823 |
+
"colab": {
|
824 |
+
"base_uri": "https://localhost:8080/"
|
825 |
+
},
|
826 |
+
"id": "PNPN7OAXemmH",
|
827 |
+
"outputId": "54beed07-04de-4696-b513-f49a935d6820"
|
828 |
+
},
|
829 |
+
"execution_count": null,
|
830 |
+
"outputs": [
|
831 |
+
{
|
832 |
+
"output_type": "stream",
|
833 |
+
"name": "stdout",
|
834 |
+
"text": [
|
835 |
+
"[[0.82047387 0.79858187 0.74135248 0.73226232 0.72406104 0.75608299\n",
|
836 |
+
" 0.76808965 0.77621683 0.80498431 0.71399955 0.69822549 0.67532971\n",
|
837 |
+
" 0.72473021 0.73449361 0.69998132 0.73749561 0.68490681 0.75076836\n",
|
838 |
+
" 0.72540663 0.70675593 0.76047822 0.73849418 0.78103858 0.75189435\n",
|
839 |
+
" 0.73619013 0.76962672 0.71289635 0.76996122 0.7827543 0.77959332\n",
|
840 |
+
" 0.82716952 0.77719335 0.80172766 0.76301732 0.78111546 0.75179235\n",
|
841 |
+
" 0.74741505 0.7576328 0.78998865 0.77283347 0.79180172 0.78170323\n",
|
842 |
+
" 0.80264132 0.79923073 0.76146584 0.75199024 0.8341403 0.74460259\n",
|
843 |
+
" 0.76259332 0.73693499 0.78469623 0.81698455 0.8254561 0.77921093\n",
|
844 |
+
" 0.75351863 0.79319721 0.73098248 0.71716001 0.73210099 0.74684248\n",
|
845 |
+
" 0.75760574 0.71070101 0.71507394 0.70847896 0.72395535 0.77801292\n",
|
846 |
+
" 0.75446732 0.75100258 0.7361131 0.78430831 0.74170516 0.71862961\n",
|
847 |
+
" 0.76792911 0.76471996 0.78551313 0.80846857 0.79231644 0.79505895\n",
|
848 |
+
" 0.76910825 0.78341548 0.74952152 0.7849115 0.80407507 0.82641741\n",
|
849 |
+
" 0.77074756 0.7356681 0.77452715 0.76224969 0.79906149 0.84520641\n",
|
850 |
+
" 0.82301383 0.8362749 0.81676624 0.8035085 0.80532594 0.81186134\n",
|
851 |
+
" 0.69082726 0.72587048 0.70070204 0.7155819 0.71758016 0.74945217\n",
|
852 |
+
" 0.72555195 0.7356198 0.73695714 0.75553407 0.77502366 0.71438692\n",
|
853 |
+
" 0.75846916 0.79831901 0.78600515 0.7601161 0.78696534 0.80404804\n",
|
854 |
+
" 0.85209549 0.77037783 0.76985195 0.75062239 0.69339426 0.7108229\n",
|
855 |
+
" 0.72051435 0.75137579 0.71168549 0.72276919 0.77669437 0.7726572\n",
|
856 |
+
" 0.74774188 0.73290677 0.70262553 0.72831247 0.7525444 0.7495277\n",
|
857 |
+
" 0.75188765 0.71491865 0.74460111 0.73599028 0.76314747 0.71318814\n",
|
858 |
+
" 0.70723754 0.73098562 0.72745902 0.76077793 0.72614335 0.72636887\n",
|
859 |
+
" 0.77770561 0.69882456 0.72396024 0.70349095 0.70541201 0.76424393\n",
|
860 |
+
" 0.72785191 0.74371405 0.67802651 0.7353597 0.69916559 0.70605271\n",
|
861 |
+
" 0.71477477 0.71021711 0.77423355 0.70897606 0.74946665 0.70971011\n",
|
862 |
+
" 0.72360056 0.72906996 0.76590153 0.74469991 0.73669136 0.71547661\n",
|
863 |
+
" 0.6958848 0.71459824 0.74863434 0.71430407 0.75165385 0.74221148]]\n"
|
864 |
+
]
|
865 |
+
}
|
866 |
+
]
|
867 |
+
},
|
868 |
+
{
|
869 |
+
"cell_type": "code",
|
870 |
+
"source": [
|
871 |
+
"import numpy as np\n",
|
872 |
+
"\n",
|
873 |
+
"number_of_chunks_to_retrieve = 3\n",
|
874 |
+
"\n",
|
875 |
+
"# Sort the scores\n",
|
876 |
+
"highest_index = np.argmax( cosine_similarities )\n",
|
877 |
+
"\n",
|
878 |
+
"# Pick the N highest scored chunks\n",
|
879 |
+
"indices = np.argsort(cosine_similarities[0])[::-1][:number_of_chunks_to_retrieve]\n",
|
880 |
+
"print( indices )"
|
881 |
+
],
|
882 |
+
"metadata": {
|
883 |
+
"colab": {
|
884 |
+
"base_uri": "https://localhost:8080/"
|
885 |
+
},
|
886 |
+
"id": "1-XI1_7mhlw4",
|
887 |
+
"outputId": "9598da10-ab61-45e9-e0bb-3e1d7046b657"
|
888 |
+
},
|
889 |
+
"execution_count": null,
|
890 |
+
"outputs": [
|
891 |
+
{
|
892 |
+
"output_type": "stream",
|
893 |
+
"name": "stdout",
|
894 |
+
"text": [
|
895 |
+
"[114 89 91]\n"
|
896 |
+
]
|
897 |
+
}
|
898 |
+
]
|
899 |
+
},
|
900 |
+
{
|
901 |
+
"cell_type": "code",
|
902 |
+
"source": [
|
903 |
+
"# Look at the highest scored retrieved pieces of text\n",
|
904 |
+
"for idx, item in enumerate( df.chunk[indices] ):\n",
|
905 |
+
" print(f\"> Chunk {idx+1}\")\n",
|
906 |
+
" print(item)\n",
|
907 |
+
" print(\"----\")"
|
908 |
+
],
|
909 |
+
"metadata": {
|
910 |
+
"colab": {
|
911 |
+
"base_uri": "https://localhost:8080/"
|
912 |
+
},
|
913 |
+
"id": "JPmhCb9kfB0w",
|
914 |
+
"outputId": "5089b207-a65a-4856-c065-56b3b9bbba72"
|
915 |
+
},
|
916 |
+
"execution_count": null,
|
917 |
+
"outputs": [
|
918 |
+
{
|
919 |
+
"output_type": "stream",
|
920 |
+
"name": "stdout",
|
921 |
+
"text": [
|
922 |
+
"> Chunk 1\n",
|
923 |
+
"by Meta that ventures into both the AI and academic spaces. The model aims to help researchers, scientists, and engineers advance their work in exploring AI applications. It will be released under a non-commercial license to prevent misuse, and access will be granted to academic researchers, individuals, and organizations affiliated with the government, civil society, academia, and industry research facilities on a selective case-by-case basis. The sharing of codes and weights allows other researchers to test new approaches in LLMs. The LLaMA models have a range of 7 billion to 65 billion parameters. LLaMA-65B can be compared to DeepMind's Chinchilla and Google's PaLM. Publicly available unlabeled data was used to train these models, and training smaller foundational models require less computing power and resources. LLaMA 65B and 33B have been trained on 1.4 trillion tokens in 20 different languages, and according to the Facebook Artificial Intelligence Research (FAIR) team, the model's performance varies ac\n",
|
924 |
+
"----\n",
|
925 |
+
"> Chunk 2\n",
|
926 |
+
"I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annota\n",
|
927 |
+
"----\n",
|
928 |
+
"> Chunk 3\n",
|
929 |
+
"vely address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. \n",
|
930 |
+
"----\n"
|
931 |
+
]
|
932 |
+
}
|
933 |
+
]
|
934 |
+
},
|
935 |
+
{
|
936 |
+
"cell_type": "markdown",
|
937 |
+
"source": [
|
938 |
+
"# Augment the Prompt"
|
939 |
+
],
|
940 |
+
"metadata": {
|
941 |
+
"id": "7uvQACqAkHg4"
|
942 |
+
}
|
943 |
+
},
|
944 |
+
{
|
945 |
+
"cell_type": "code",
|
946 |
+
"source": [
|
947 |
+
"# Use the OpenAI API to answer the questions based on the retrieved pieces of text.\n",
|
948 |
+
"try:\n",
|
949 |
+
" # Formulating the system prompt and condition the model to answer only AI-related questions.\n",
|
950 |
+
" system_prompt = (\n",
|
951 |
+
" \"You are an assistant and expert in answering questions from a chunks of content. \"\n",
|
952 |
+
" \"Only answer AI-related question, else say that you cannot answer this question.\"\n",
|
953 |
+
" )\n",
|
954 |
+
"\n",
|
955 |
+
" # Create a user prompt with the user's question\n",
|
956 |
+
" prompt = (\n",
|
957 |
+
" \"Read the following informations that might contain the context you require to answer the question. You can use the informations starting from the <START_OF_CONTEXT> tag and end with the <END_OF_CONTEXT> tag. Here is the content:\\n\\n<START_OF_CONTEXT>\\n{}\\n<END_OF_CONTEXT>\\n\\n\"\n",
|
958 |
+
" \"Please provide an informative and accurate answer to the following question based on the avaiable context. Be concise and take your time. \\nQuestion: {}\\nAnswer:\"\n",
|
959 |
+
" )\n",
|
960 |
+
" # Add the retrieved pieces of text to the prompt.\n",
|
961 |
+
" prompt = prompt.format( \"\".join( df.chunk[indices] ), QUESTION )\n",
|
962 |
+
"\n",
|
963 |
+
" # Call the OpenAI API\n",
|
964 |
+
" response = client.chat.completions.create(\n",
|
965 |
+
" model='gpt-3.5-turbo-16k',\n",
|
966 |
+
" temperature=0.0,\n",
|
967 |
+
" messages=[\n",
|
968 |
+
" {\"role\": \"system\", \"content\": system_prompt},\n",
|
969 |
+
" {\"role\": \"user\", \"content\": prompt}\n",
|
970 |
+
" ]\n",
|
971 |
+
" )\n",
|
972 |
+
"\n",
|
973 |
+
" # Return the AI's response\n",
|
974 |
+
" res = response.choices[0].message.content.strip()\n",
|
975 |
+
"\n",
|
976 |
+
"except Exception as e:\n",
|
977 |
+
" print( f\"An error occurred: {e}\" )"
|
978 |
+
],
|
979 |
+
"metadata": {
|
980 |
+
"id": "MXRdzta5kJ3V"
|
981 |
+
},
|
982 |
+
"execution_count": null,
|
983 |
+
"outputs": []
|
984 |
+
},
|
985 |
+
{
|
986 |
+
"cell_type": "code",
|
987 |
+
"source": [
|
988 |
+
"print( res )"
|
989 |
+
],
|
990 |
+
"metadata": {
|
991 |
+
"colab": {
|
992 |
+
"base_uri": "https://localhost:8080/"
|
993 |
+
},
|
994 |
+
"id": "9tBvJ8oMucha",
|
995 |
+
"outputId": "418c0220-c2ee-43cf-a9bc-0ea755f7a04e"
|
996 |
+
},
|
997 |
+
"execution_count": null,
|
998 |
+
"outputs": [
|
999 |
+
{
|
1000 |
+
"output_type": "stream",
|
1001 |
+
"name": "stdout",
|
1002 |
+
"text": [
|
1003 |
+
"The LLaMA2 model has four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.\n"
|
1004 |
+
]
|
1005 |
+
}
|
1006 |
+
]
|
1007 |
+
},
|
1008 |
+
{
|
1009 |
+
"cell_type": "markdown",
|
1010 |
+
"source": [
|
1011 |
+
"# Without Augmentation"
|
1012 |
+
],
|
1013 |
+
"metadata": {
|
1014 |
+
"id": "pW-BNCAC2JzE"
|
1015 |
+
}
|
1016 |
+
},
|
1017 |
+
{
|
1018 |
+
"cell_type": "markdown",
|
1019 |
+
"source": [
|
1020 |
+
"Test the OpenAI API to answer the same question without the addition of retrieved documents. Basically, the LLM will use its knowledge to answer the question."
|
1021 |
+
],
|
1022 |
+
"metadata": {
|
1023 |
+
"id": "tr5zXEGIMwJu"
|
1024 |
+
}
|
1025 |
+
},
|
1026 |
+
{
|
1027 |
+
"cell_type": "code",
|
1028 |
+
"source": [
|
1029 |
+
"# Formulating the system prompt\n",
|
1030 |
+
"system_prompt = (\n",
|
1031 |
+
" \"You are an assistant and expert in answering questions.\"\n",
|
1032 |
+
")\n",
|
1033 |
+
"\n",
|
1034 |
+
"# Combining the system prompt with the user's question\n",
|
1035 |
+
"prompt = (\n",
|
1036 |
+
" \"Be concise and take your time to answer the following question. \\nQuestion: {}\\nAnswer:\"\n",
|
1037 |
+
")\n",
|
1038 |
+
"prompt = prompt.format( QUESTION )\n",
|
1039 |
+
"\n",
|
1040 |
+
"# Call the OpenAI API\n",
|
1041 |
+
"response = client.chat.completions.create(\n",
|
1042 |
+
" model='gpt-3.5-turbo-16k',\n",
|
1043 |
+
" temperature=.9,\n",
|
1044 |
+
" messages=[\n",
|
1045 |
+
" {\"role\": \"system\", \"content\": system_prompt},\n",
|
1046 |
+
" {\"role\": \"user\", \"content\": prompt}\n",
|
1047 |
+
" ]\n",
|
1048 |
+
")\n",
|
1049 |
+
"\n",
|
1050 |
+
"# Return the AI's response\n",
|
1051 |
+
"res = response.choices[0].message.content.strip()"
|
1052 |
+
],
|
1053 |
+
"metadata": {
|
1054 |
+
"id": "RuyXjzZyuecE"
|
1055 |
+
},
|
1056 |
+
"execution_count": null,
|
1057 |
+
"outputs": []
|
1058 |
+
},
|
1059 |
+
{
|
1060 |
+
"cell_type": "code",
|
1061 |
+
"source": [
|
1062 |
+
"print( res )"
|
1063 |
+
],
|
1064 |
+
"metadata": {
|
1065 |
+
"colab": {
|
1066 |
+
"base_uri": "https://localhost:8080/"
|
1067 |
+
},
|
1068 |
+
"id": "YAy34tPTzGbh",
|
1069 |
+
"outputId": "54041329-dd5f-4cdd-db38-f1440ae77181"
|
1070 |
+
},
|
1071 |
+
"execution_count": null,
|
1072 |
+
"outputs": [
|
1073 |
+
{
|
1074 |
+
"output_type": "stream",
|
1075 |
+
"name": "stdout",
|
1076 |
+
"text": [
|
1077 |
+
"The LLaMA2 model has a total of [insert number] parameters.\n"
|
1078 |
+
]
|
1079 |
+
}
|
1080 |
+
]
|
1081 |
+
}
|
1082 |
+
]
|
1083 |
+
}
|
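The notebook above builds the RAG loop step by step; condensed into one helper, the flow is roughly the following sketch (get_embedding, df, and client are the names defined in the cells above; the system prompt here is simplified):

from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

def answer_with_rag(question, k=3):
    q_emb = get_embedding(question)                                   # 1. embed the question
    scores = cosine_similarity([q_emb], df["embedding"].tolist())[0]  # 2. score every chunk
    top = np.argsort(scores)[::-1][:k]                                # 3. keep the k best chunks
    context = "".join(df.chunk[top])                                  # 4. stuff them into the prompt
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        temperature=0.0,
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"},
        ],
    )
    return response.choices[0].message.content.strip()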
notebooks/03-RAG_with_LlamaIndex.ipynb
ADDED
@@ -0,0 +1,360 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
{
|
2 |
+
"nbformat": 4,
|
3 |
+
"nbformat_minor": 0,
|
4 |
+
"metadata": {
|
5 |
+
"colab": {
|
6 |
+
"provenance": [],
|
7 |
+
"authorship_tag": "ABX9TyO9EXKHngvJa9fUydE3Tlen",
|
8 |
+
"include_colab_link": true
|
9 |
+
},
|
10 |
+
"kernelspec": {
|
11 |
+
"name": "python3",
|
12 |
+
"display_name": "Python 3"
|
13 |
+
},
|
14 |
+
"language_info": {
|
15 |
+
"name": "python"
|
16 |
+
}
|
17 |
+
},
|
18 |
+
"cells": [
|
19 |
+
{
|
20 |
+
"cell_type": "markdown",
|
21 |
+
"metadata": {
|
22 |
+
"id": "view-in-github",
|
23 |
+
"colab_type": "text"
|
24 |
+
},
|
25 |
+
"source": [
|
26 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/03-RAG_with_LlamaIndex.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
27 |
+
]
|
28 |
+
},
|
29 |
+
{
|
30 |
+
"cell_type": "markdown",
|
31 |
+
"source": [
|
32 |
+
"# Install Packages and Setup Variables"
|
33 |
+
],
|
34 |
+
"metadata": {
|
35 |
+
"id": "v9bpz99INAc1"
|
36 |
+
}
|
37 |
+
},
|
38 |
+
{
|
39 |
+
"cell_type": "code",
|
40 |
+
"execution_count": null,
|
41 |
+
"metadata": {
|
42 |
+
"colab": {
|
43 |
+
"base_uri": "https://localhost:8080/"
|
44 |
+
},
|
45 |
+
"id": "BeuFJKlj9jKz",
|
46 |
+
"outputId": "a14a78f4-e43e-4aef-bc69-4ced559df34e"
|
47 |
+
},
|
48 |
+
"outputs": [
|
49 |
+
{
|
50 |
+
"output_type": "stream",
|
51 |
+
"name": "stdout",
|
52 |
+
"text": [
|
53 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m226.7/226.7 kB\u001b[0m \u001b[31m2.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
54 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m52.3/52.3 kB\u001b[0m \u001b[31m3.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
55 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.8/1.8 MB\u001b[0m \u001b[31m9.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
56 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m15.4/15.4 MB\u001b[0m \u001b[31m26.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
57 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m21.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
58 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.6/75.6 kB\u001b[0m \u001b[31m4.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
59 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.1/3.1 MB\u001b[0m \u001b[31m35.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
60 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.9/77.9 kB\u001b[0m \u001b[31m3.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
61 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m2.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
62 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m136.0/136.0 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
63 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.9/3.9 MB\u001b[0m \u001b[31m27.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
64 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m290.4/290.4 kB\u001b[0m \u001b[31m22.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
65 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m30.8/30.8 MB\u001b[0m \u001b[31m32.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
66 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.4/49.4 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
67 |
+
"\u001b[?25h"
|
68 |
+
]
|
69 |
+
}
|
70 |
+
],
|
71 |
+
"source": [
|
72 |
+
"!pip install -q llama-index==0.10.30 openai==1.12.0 cohere==4.47 tiktoken==0.6.0"
|
73 |
+
]
|
74 |
+
},
|
75 |
+
{
|
76 |
+
"cell_type": "code",
|
77 |
+
"source": [
|
78 |
+
"import os\n",
|
79 |
+
"\n",
|
80 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
81 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
|
82 |
+
],
|
83 |
+
"metadata": {
|
84 |
+
"id": "XuzgSNqcABpV"
|
85 |
+
},
|
86 |
+
"execution_count": null,
|
87 |
+
"outputs": []
|
88 |
+
},
|
89 |
+
{
|
90 |
+
"cell_type": "markdown",
|
91 |
+
"source": [
|
92 |
+
"# Load Dataset"
|
93 |
+
],
|
94 |
+
"metadata": {
|
95 |
+
"id": "f5eV5EnvNCMM"
|
96 |
+
}
|
97 |
+
},
|
98 |
+
{
|
99 |
+
"cell_type": "markdown",
|
100 |
+
"source": [
|
101 |
+
"## Download"
|
102 |
+
],
|
103 |
+
"metadata": {
|
104 |
+
"id": "q-7mRQ-mNJlm"
|
105 |
+
}
|
106 |
+
},
|
107 |
+
{
|
108 |
+
"cell_type": "markdown",
|
109 |
+
"source": [
|
110 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model."
|
111 |
+
],
|
112 |
+
"metadata": {
|
113 |
+
"id": "3PsdOdMUNmEi"
|
114 |
+
}
|
115 |
+
},
|
116 |
+
{
|
117 |
+
"cell_type": "code",
|
118 |
+
"source": [
|
119 |
+
"!wget https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
120 |
+
],
|
121 |
+
"metadata": {
|
122 |
+
"colab": {
|
123 |
+
"base_uri": "https://localhost:8080/"
|
124 |
+
},
|
125 |
+
"id": "3ImRCP7pACaI",
|
126 |
+
"outputId": "c782f06a-5fcb-4134-e197-e2a9c3193ce9"
|
127 |
+
},
|
128 |
+
"execution_count": null,
|
129 |
+
"outputs": [
|
130 |
+
{
|
131 |
+
"output_type": "stream",
|
132 |
+
"name": "stdout",
|
133 |
+
"text": [
|
134 |
+
"--2024-04-09 18:54:34-- https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv\n",
|
135 |
+
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
|
136 |
+
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
|
137 |
+
"HTTP request sent, awaiting response... 200 OK\n",
|
138 |
+
"Length: 173646 (170K) [text/plain]\n",
|
139 |
+
"Saving to: ‘mini-llama-articles.csv’\n",
|
140 |
+
"\n",
|
141 |
+
"mini-llama-articles 100%[===================>] 169.58K --.-KB/s in 0.09s \n",
|
142 |
+
"\n",
|
143 |
+
"2024-04-09 18:54:35 (1.89 MB/s) - ‘mini-llama-articles.csv’ saved [173646/173646]\n",
|
144 |
+
"\n"
|
145 |
+
]
|
146 |
+
}
|
147 |
+
]
|
148 |
+
},
|
149 |
+
{
|
150 |
+
"cell_type": "markdown",
|
151 |
+
"source": [
|
152 |
+
"## Read File"
|
153 |
+
],
|
154 |
+
"metadata": {
|
155 |
+
"id": "bZZLK_wyEc-L"
|
156 |
+
}
|
157 |
+
},
|
158 |
+
{
|
159 |
+
"cell_type": "code",
|
160 |
+
"source": [
|
161 |
+
"import csv\n",
|
162 |
+
"\n",
|
163 |
+
"rows = []\n",
|
164 |
+
"\n",
|
165 |
+
"# Load the CSV file\n",
|
166 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
167 |
+
" csv_reader = csv.reader(file)\n",
|
168 |
+
"\n",
|
169 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
170 |
+
" if idx == 0: continue; # Skip header row\n",
|
171 |
+
" rows.append( row )\n",
|
172 |
+
"\n",
|
173 |
+
"# The number of characters in the dataset.\n",
|
174 |
+
"print( \"number of articles:\", len( rows ) )"
|
175 |
+
],
|
176 |
+
"metadata": {
|
177 |
+
"colab": {
|
178 |
+
"base_uri": "https://localhost:8080/"
|
179 |
+
},
|
180 |
+
"id": "miUqycqAEfr7",
|
181 |
+
"outputId": "911985c6-6884-48ff-fa24-869d44a1a012"
|
182 |
+
},
|
183 |
+
"execution_count": null,
|
184 |
+
"outputs": [
|
185 |
+
{
|
186 |
+
"output_type": "stream",
|
187 |
+
"name": "stdout",
|
188 |
+
"text": [
|
189 |
+
"number of articles: 14\n"
|
190 |
+
]
|
191 |
+
}
|
192 |
+
]
|
193 |
+
},
|
194 |
+
{
|
195 |
+
"cell_type": "markdown",
|
196 |
+
"source": [
|
197 |
+
"# Generate Embedding"
|
198 |
+
],
|
199 |
+
"metadata": {
|
200 |
+
"id": "f86yksB9K571"
|
201 |
+
}
|
202 |
+
},
|
203 |
+
{
|
204 |
+
"cell_type": "code",
|
205 |
+
"source": [
|
206 |
+
"from llama_index.core import Document\n",
|
207 |
+
"\n",
|
208 |
+
"# Convert the texts to Document objects so the LlamaIndex framework can process them.\n",
|
209 |
+
"documents = [Document(text=row[1]) for row in rows]"
|
210 |
+
],
|
211 |
+
"metadata": {
|
212 |
+
"id": "iXrr5-tnEfm9"
|
213 |
+
},
|
214 |
+
"execution_count": null,
|
215 |
+
"outputs": []
|
216 |
+
},
|
217 |
+
{
|
218 |
+
"cell_type": "code",
|
219 |
+
"source": [
|
220 |
+
"from llama_index.core import VectorStoreIndex\n",
|
221 |
+
"from llama_index.core.node_parser import SentenceSplitter\n",
|
222 |
+
"\n",
|
223 |
+
"# Build index / generate embeddings using OpenAI.\n",
|
224 |
+
"index = VectorStoreIndex.from_documents(\n",
|
225 |
+
" documents,\n",
|
226 |
+
" transformations=[SentenceSplitter(chunk_size=768, chunk_overlap=64)],\n",
|
227 |
+
")"
|
228 |
+
],
|
229 |
+
"metadata": {
|
230 |
+
"id": "Bsa7Q-DoNWBk"
|
231 |
+
},
|
232 |
+
"execution_count": null,
|
233 |
+
"outputs": []
|
234 |
+
},
|
235 |
+
{
|
236 |
+
"cell_type": "code",
|
237 |
+
"source": [
|
238 |
+
"# Save the generated embeddings.\n",
|
239 |
+
"# index.storage_context.persist(persist_dir=\"indexes\")"
|
240 |
+
],
|
241 |
+
"metadata": {
|
242 |
+
"id": "xxB0A9ZYM-OD"
|
243 |
+
},
|
244 |
+
"execution_count": null,
|
245 |
+
"outputs": []
|
246 |
+
},
|
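The persist call above is left commented out; once an index has been persisted, it can be reloaded later without re-embedding the documents. A minimal sketch using LlamaIndex's standard storage API, with the same "indexes" directory as the commented line:

from llama_index.core import StorageContext, load_index_from_storage

# Rebuild the index from the files written by index.storage_context.persist(persist_dir="indexes").
storage_context = StorageContext.from_defaults(persist_dir="indexes")
index = load_index_from_storage(storage_context)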
247 |
+
{
|
248 |
+
"cell_type": "markdown",
|
249 |
+
"source": [
|
250 |
+
"# Query Dataset"
|
251 |
+
],
|
252 |
+
"metadata": {
|
253 |
+
"id": "3DoUxd8KK--Q"
|
254 |
+
}
|
255 |
+
},
|
256 |
+
{
|
257 |
+
"cell_type": "code",
|
258 |
+
"source": [
|
259 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
260 |
+
"# and using a LLM to formulate the final answer.\n",
|
261 |
+
"query_engine = index.as_query_engine()"
|
262 |
+
],
|
263 |
+
"metadata": {
|
264 |
+
"id": "bUaNH97dEfh9"
|
265 |
+
},
|
266 |
+
"execution_count": null,
|
267 |
+
"outputs": []
|
268 |
+
},
|
269 |
+
{
|
270 |
+
"cell_type": "code",
|
271 |
+
"source": [
|
272 |
+
"response = query_engine.query(\n",
|
273 |
+
" \"How many parameters LLaMA2 model has?\"\n",
|
274 |
+
")\n",
|
275 |
+
"print(response)"
|
276 |
+
],
|
277 |
+
"metadata": {
|
278 |
+
"colab": {
|
279 |
+
"base_uri": "https://localhost:8080/"
|
280 |
+
},
|
281 |
+
"id": "KHK4V_GRR6ZG",
|
282 |
+
"outputId": "8d656836-622a-4261-e24a-9cadf857b376"
|
283 |
+
},
|
284 |
+
"execution_count": null,
|
285 |
+
"outputs": [
|
286 |
+
{
|
287 |
+
"output_type": "stream",
|
288 |
+
"name": "stdout",
|
289 |
+
"text": [
|
290 |
+
"The Llama 2 model has 7 billion, 13 billion, 34 billion, and 70 billion parameters.\n"
|
291 |
+
]
|
292 |
+
}
|
293 |
+
]
|
294 |
+
},
|
295 |
+
{
|
296 |
+
"cell_type": "code",
|
297 |
+
"source": [
|
298 |
+
"response = query_engine.query(\n",
|
299 |
+
" \"When will Llama3 will be released?\"\n",
|
300 |
+
")\n",
|
301 |
+
"print(response)"
|
302 |
+
],
|
303 |
+
"metadata": {
|
304 |
+
"id": "S-BmyTBbNd9y",
|
305 |
+
"outputId": "c6a4ec79-7555-4b4d-f212-0b5864c7bded",
|
306 |
+
"colab": {
|
307 |
+
"base_uri": "https://localhost:8080/"
|
308 |
+
}
|
309 |
+
},
|
310 |
+
"execution_count": null,
|
311 |
+
"outputs": [
|
312 |
+
{
|
313 |
+
"output_type": "stream",
|
314 |
+
"name": "stdout",
|
315 |
+
"text": [
|
316 |
+
"The release date for Llama3 is not provided in the given context information.\n"
|
317 |
+
]
|
318 |
+
}
|
319 |
+
]
|
320 |
+
},
|
321 |
+
{
|
322 |
+
"cell_type": "code",
|
323 |
+
"source": [
|
324 |
+
"# Test with smaller chunk size\n",
|
325 |
+
"# transformations=[SentenceSplitter(chunk_size=512, chunk_overlap=20)]\n",
|
326 |
+
"\n",
|
327 |
+
"response = query_engine.query(\n",
|
328 |
+
" \"How many parameters LLaMA2 model has?\"\n",
|
329 |
+
")\n",
|
330 |
+
"print(response)"
|
331 |
+
],
|
332 |
+
"metadata": {
|
333 |
+
"colab": {
|
334 |
+
"base_uri": "https://localhost:8080/"
|
335 |
+
},
|
336 |
+
"id": "tEgFx_aeFS5e",
|
337 |
+
"outputId": "0353f9e4-0f63-4739-eb5b-717bf19572ef"
|
338 |
+
},
|
339 |
+
"execution_count": null,
|
340 |
+
"outputs": [
|
341 |
+
{
|
342 |
+
"output_type": "stream",
|
343 |
+
"name": "stdout",
|
344 |
+
"text": [
|
345 |
+
"The LLaMA2 model has a range of 7 billion to 65 billion parameters.\n"
|
346 |
+
]
|
347 |
+
}
|
348 |
+
]
|
349 |
+
},
|
350 |
+
{
|
351 |
+
"cell_type": "code",
|
352 |
+
"source": [],
|
353 |
+
"metadata": {
|
354 |
+
"id": "oZt_sG86RwZ3"
|
355 |
+
},
|
356 |
+
"execution_count": null,
|
357 |
+
"outputs": []
|
358 |
+
}
|
359 |
+
]
|
360 |
+
}
|
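As the last two queries show, the chunk size changes what gets retrieved and therefore the answer. Two standard knobs worth experimenting with, sketched here with the names used in this notebook (similarity_top_k widens retrieval beyond the small default; the splitter settings follow the commented suggestion in the last code cell):

# Retrieve more chunks per query instead of the default top-k.
query_engine = index.as_query_engine(similarity_top_k=5)

# Or rebuild the index with smaller chunks and less overlap.
index = VectorStoreIndex.from_documents(
    documents,
    transformations=[SentenceSplitter(chunk_size=512, chunk_overlap=20)],
)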
notebooks/04-RAG_with_VectorStore.ipynb
ADDED
@@ -0,0 +1,449 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"id": "view-in-github"
|
7 |
+
},
|
8 |
+
"source": [
|
9 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/04-RAG_with_VectorStore.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
10 |
+
]
|
11 |
+
},
|
12 |
+
{
|
13 |
+
"cell_type": "markdown",
|
14 |
+
"metadata": {
|
15 |
+
"id": "5BGJ3fxhOk2V"
|
16 |
+
},
|
17 |
+
"source": [
|
18 |
+
"# Install Packages and Setup Variables"
|
19 |
+
]
|
20 |
+
},
|
21 |
+
{
|
22 |
+
"cell_type": "code",
|
23 |
+
"execution_count": 1,
|
24 |
+
"metadata": {
|
25 |
+
"id": "QPJzr-I9XQ7l"
|
26 |
+
},
|
27 |
+
"outputs": [],
|
28 |
+
"source": [
|
29 |
+
"!pip install -q llama-index==0.10.5 llama-index-vector-stores-chroma==0.1.7 langchain==0.1.17 langchain-chroma==0.1.0 langchain_openai==0.1.5 openai==1.12.0 cohere==4.47 tiktoken==0.6.0 chromadb==0.4.22"
|
30 |
+
]
|
31 |
+
},
|
32 |
+
{
|
33 |
+
"cell_type": "code",
|
34 |
+
"execution_count": 2,
|
35 |
+
"metadata": {
|
36 |
+
"id": "riuXwpSPcvWC"
|
37 |
+
},
|
38 |
+
"outputs": [],
|
39 |
+
"source": [
|
40 |
+
"import os\n",
|
41 |
+
"\n",
|
42 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
43 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
|
44 |
+
]
|
45 |
+
},
|
46 |
+
{
|
47 |
+
"cell_type": "markdown",
|
48 |
+
"metadata": {
|
49 |
+
"id": "I9JbAzFcjkpn"
|
50 |
+
},
|
51 |
+
"source": [
|
52 |
+
"# Load the Dataset (CSV)"
|
53 |
+
]
|
54 |
+
},
|
55 |
+
{
|
56 |
+
"cell_type": "markdown",
|
57 |
+
"metadata": {
|
58 |
+
"id": "_Tif8-JoRH68"
|
59 |
+
},
|
60 |
+
"source": [
|
61 |
+
"## Download"
|
62 |
+
]
|
63 |
+
},
|
64 |
+
{
|
65 |
+
"cell_type": "markdown",
|
66 |
+
"metadata": {
|
67 |
+
"id": "4fQaa1LN1mXL"
|
68 |
+
},
|
69 |
+
"source": [
|
70 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a long string."
|
71 |
+
]
|
72 |
+
},
|
73 |
+
{
|
74 |
+
"cell_type": "code",
|
75 |
+
"execution_count": 3,
|
76 |
+
"metadata": {
|
77 |
+
"colab": {
|
78 |
+
"base_uri": "https://localhost:8080/"
|
79 |
+
},
|
80 |
+
"id": "-QTUkdfJjY4N",
|
81 |
+
"outputId": "a88b2f8a-0c84-45a0-9b32-5088fe596612"
|
82 |
+
},
|
83 |
+
"outputs": [
|
84 |
+
{
|
85 |
+
"name": "stdout",
|
86 |
+
"output_type": "stream",
|
87 |
+
"text": [
|
88 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
89 |
+
" Dload Upload Total Spent Left Speed\n",
|
90 |
+
"100 169k 100 169k 0 0 277k 0 --:--:-- --:--:-- --:--:-- 281k\n"
|
91 |
+
]
|
92 |
+
}
|
93 |
+
],
|
94 |
+
"source": [
|
95 |
+
"!curl -o ./mini-dataset.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
96 |
+
]
|
97 |
+
},
|
98 |
+
{
|
99 |
+
"cell_type": "markdown",
|
100 |
+
"metadata": {
|
101 |
+
"id": "zk-4alIxROo8"
|
102 |
+
},
|
103 |
+
"source": [
|
104 |
+
"## Read File"
|
105 |
+
]
|
106 |
+
},
|
107 |
+
{
|
108 |
+
"cell_type": "code",
|
109 |
+
"execution_count": 4,
|
110 |
+
"metadata": {
|
111 |
+
"colab": {
|
112 |
+
"base_uri": "https://localhost:8080/"
|
113 |
+
},
|
114 |
+
"id": "7CYwRT6R0o0I",
|
115 |
+
"outputId": "351f170f-9a00-4b09-ae08-b45c3c48fce5"
|
116 |
+
},
|
117 |
+
"outputs": [
|
118 |
+
{
|
119 |
+
"data": {
|
120 |
+
"text/plain": [
|
121 |
+
"841"
|
122 |
+
]
|
123 |
+
},
|
124 |
+
"execution_count": 4,
|
125 |
+
"metadata": {},
|
126 |
+
"output_type": "execute_result"
|
127 |
+
}
|
128 |
+
],
|
129 |
+
"source": [
|
130 |
+
"import csv\n",
|
131 |
+
"\n",
|
132 |
+
"text = \"\"\n",
|
133 |
+
"\n",
|
134 |
+
"# Load the file as a JSON\n",
|
135 |
+
"with open(\"./mini-dataset.csv\", mode=\"r\", encoding=\"ISO-8859-1\") as file:\n",
|
136 |
+
" csv_reader = csv.reader(file)\n",
|
137 |
+
"\n",
|
138 |
+
" for row in csv_reader:\n",
|
139 |
+
" text += row[0]\n",
|
140 |
+
"\n",
|
141 |
+
"# The number of characters in the dataset.\n",
|
142 |
+
"len( text )"
|
143 |
+
]
|
144 |
+
},
|
145 |
+
{
|
146 |
+
"cell_type": "markdown",
|
147 |
+
"metadata": {
|
148 |
+
"id": "S17g2RYOjmf2"
|
149 |
+
},
|
150 |
+
"source": [
|
151 |
+
"# Chunking"
|
152 |
+
]
|
153 |
+
},
|
154 |
+
{
|
155 |
+
"cell_type": "code",
|
156 |
+
"execution_count": 5,
|
157 |
+
"metadata": {
|
158 |
+
"colab": {
|
159 |
+
"base_uri": "https://localhost:8080/"
|
160 |
+
},
|
161 |
+
"id": "STACTMUR1z9N",
|
162 |
+
"outputId": "15a61eac-8774-4cdb-db8d-e2eb5b07e517"
|
163 |
+
},
|
164 |
+
"outputs": [
|
165 |
+
{
|
166 |
+
"data": {
|
167 |
+
"text/plain": [
|
168 |
+
"2"
|
169 |
+
]
|
170 |
+
},
|
171 |
+
"execution_count": 5,
|
172 |
+
"metadata": {},
|
173 |
+
"output_type": "execute_result"
|
174 |
+
}
|
175 |
+
],
|
176 |
+
"source": [
|
177 |
+
"chunk_size = 512\n",
|
178 |
+
"chunks = []\n",
|
179 |
+
"\n",
|
180 |
+
"# Split the long text into smaller manageable chunks of 512 characters.\n",
|
181 |
+
"for i in range(0, len(text), chunk_size):\n",
|
182 |
+
" chunks.append(text[i:i + chunk_size])\n",
|
183 |
+
"\n",
|
184 |
+
"len( chunks )"
|
185 |
+
]
|
186 |
+
},
|
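Note that this fixed 512-character split can cut words and sentences in half at chunk boundaries. A common refinement, sketched below with an assumed 64-character overlap, is to let consecutive chunks overlap so that text straddling a boundary survives intact in at least one chunk:

chunk_size = 512
overlap = 64
chunks = []

# Step by (chunk_size - overlap) so each chunk repeats the tail of the previous one.
for i in range(0, len(text), chunk_size - overlap):
    chunks.append(text[i:i + chunk_size])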
187 |
+
{
|
188 |
+
"cell_type": "markdown",
|
189 |
+
"metadata": {
|
190 |
+
"id": "9fOomeMGqu10"
|
191 |
+
},
|
192 |
+
"source": [
|
193 |
+
"#Interface of Chroma with LlamaIndex"
|
194 |
+
]
|
195 |
+
},
|
196 |
+
{
|
197 |
+
"cell_type": "code",
|
198 |
+
"execution_count": 6,
|
199 |
+
"metadata": {
|
200 |
+
"id": "CtdsIUQ81_hT"
|
201 |
+
},
|
202 |
+
"outputs": [],
|
203 |
+
"source": [
|
204 |
+
"from llama_index.core import Document\n",
|
205 |
+
"\n",
|
206 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
207 |
+
"documents = [Document(text=t) for t in chunks]"
|
208 |
+
]
|
209 |
+
},
|
210 |
+
{
|
211 |
+
"cell_type": "markdown",
|
212 |
+
"metadata": {
|
213 |
+
"id": "OWaT6rL7ksp8"
|
214 |
+
},
|
215 |
+
"source": [
|
216 |
+
"Save on Chroma\n",
|
217 |
+
"\n"
|
218 |
+
]
|
219 |
+
},
|
220 |
+
{
|
221 |
+
"cell_type": "code",
|
222 |
+
"execution_count": 7,
|
223 |
+
"metadata": {
|
224 |
+
"id": "mXi56KTXk2sp"
|
225 |
+
},
|
226 |
+
"outputs": [],
|
227 |
+
"source": [
|
228 |
+
"import chromadb\n",
|
229 |
+
"\n",
|
230 |
+
"# create client and a new collection\n",
|
231 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
232 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-chunked-dataset\")\n",
|
233 |
+
"chroma_collection = chroma_client.create_collection(\"mini-chunked-dataset\")"
|
234 |
+
]
|
235 |
+
},
|
236 |
+
{
|
237 |
+
"cell_type": "code",
|
238 |
+
"execution_count": 8,
|
239 |
+
"metadata": {
|
240 |
+
"id": "jKXURvLtkuTS"
|
241 |
+
},
|
242 |
+
"outputs": [],
|
243 |
+
"source": [
|
244 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
245 |
+
"from llama_index.core import StorageContext\n",
|
246 |
+
"# Define a storage context object using the created vector database.\n",
|
247 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)\n",
|
248 |
+
"storage_context = StorageContext.from_defaults(vector_store=vector_store)"
|
249 |
+
]
|
250 |
+
},
|
251 |
+
{
|
252 |
+
"cell_type": "code",
|
253 |
+
"execution_count": 9,
|
254 |
+
"metadata": {
|
255 |
+
"id": "WsD52wtrlESi"
|
256 |
+
},
|
257 |
+
"outputs": [],
|
258 |
+
"source": [
|
259 |
+
"from llama_index.core import VectorStoreIndex\n",
|
260 |
+
"\n",
|
261 |
+
"# Add the documents to the database and create Index / embeddings\n",
|
262 |
+
"index = VectorStoreIndex.from_documents(\n",
|
263 |
+
" documents, storage_context=storage_context\n",
|
264 |
+
")"
|
265 |
+
]
|
266 |
+
},
|
267 |
+
{
|
268 |
+
"cell_type": "markdown",
|
269 |
+
"metadata": {
|
270 |
+
"id": "8JPD8yAinVSq"
|
271 |
+
},
|
272 |
+
"source": [
|
273 |
+
"Query Dataset"
|
274 |
+
]
|
275 |
+
},
|
276 |
+
{
|
277 |
+
"cell_type": "code",
|
278 |
+
"execution_count": 10,
|
279 |
+
"metadata": {
|
280 |
+
"id": "mzS13x1ZlZ5X"
|
281 |
+
},
|
282 |
+
"outputs": [],
|
283 |
+
"source": [
|
284 |
+
"from llama_index.llms.openai import OpenAI\n",
|
285 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
286 |
+
"# and using a LLM to formulate the final answer.\n",
|
287 |
+
"\n",
|
288 |
+
"llm = OpenAI(temperature=0, model=\"gpt-3.5-turbo-0125\", max_tokens=512)\n",
|
289 |
+
"query_engine = index.as_query_engine(llm=llm)"
|
290 |
+
]
|
291 |
+
},
|
292 |
+
{
|
293 |
+
"cell_type": "code",
|
294 |
+
"execution_count": 11,
|
295 |
+
"metadata": {
|
296 |
+
"colab": {
|
297 |
+
"base_uri": "https://localhost:8080/"
|
298 |
+
},
|
299 |
+
"id": "AYsQ4uLN_Oxg",
|
300 |
+
"outputId": "5066a06c-77ff-48a2-ee61-3abe2e9755e2"
|
301 |
+
},
|
302 |
+
"outputs": [
|
303 |
+
{
|
304 |
+
"name": "stdout",
|
305 |
+
"output_type": "stream",
|
306 |
+
"text": [
|
307 |
+
"The LLaMA2 model has 7 billion parameters.\n"
|
308 |
+
]
|
309 |
+
}
|
310 |
+
],
|
311 |
+
"source": [
|
312 |
+
"response = query_engine.query(\n",
|
313 |
+
" \"How many parameters LLaMA2 model has?\"\n",
|
314 |
+
")\n",
|
315 |
+
"print(response)"
|
316 |
+
]
|
317 |
+
},
|
318 |
+
{
|
319 |
+
"cell_type": "markdown",
|
320 |
+
"metadata": {
|
321 |
+
"id": "kWK571VNg-qR"
|
322 |
+
},
|
323 |
+
"source": [
|
324 |
+
"#Interface of Chroma with LangChain"
|
325 |
+
]
|
326 |
+
},
|
327 |
+
{
|
328 |
+
"cell_type": "code",
|
329 |
+
"execution_count": 12,
|
330 |
+
"metadata": {
|
331 |
+
"id": "SMPAniL2e4NP"
|
332 |
+
},
|
333 |
+
"outputs": [],
|
334 |
+
"source": [
|
335 |
+
"from langchain.schema.document import Document\n",
|
336 |
+
"# Convert the chunks to Document objects so the LangChain framework can process them.\n",
|
337 |
+
"documents = [Document(page_content=t) for t in chunks]"
|
338 |
+
]
|
339 |
+
},
|
340 |
+
{
|
341 |
+
"cell_type": "markdown",
|
342 |
+
"metadata": {
|
343 |
+
"id": "QBt8qGxArUPD"
|
344 |
+
},
|
345 |
+
"source": [
|
346 |
+
"Save on Chroma"
|
347 |
+
]
|
348 |
+
},
|
349 |
+
{
|
350 |
+
"cell_type": "code",
|
351 |
+
"execution_count": 13,
|
352 |
+
"metadata": {
|
353 |
+
"id": "2xas7HkuhJ8A"
|
354 |
+
},
|
355 |
+
"outputs": [],
|
356 |
+
"source": [
|
357 |
+
"from langchain_chroma import Chroma\n",
|
358 |
+
"from langchain_openai import OpenAIEmbeddings\n",
|
359 |
+
"# Add the documents to chroma DB and create Index / embeddings\n",
|
360 |
+
"\n",
|
361 |
+
"embeddings = OpenAIEmbeddings(model=\"text-embedding-ada-002\")\n",
|
362 |
+
"chroma_db = Chroma.from_documents(\n",
|
363 |
+
" documents=documents,\n",
|
364 |
+
" embedding=embeddings,\n",
|
365 |
+
" persist_directory=\"./mini-chunked-dataset\",\n",
|
366 |
+
" collection_name=\"mini-chunked-dataset\"\n",
|
367 |
+
")"
|
368 |
+
]
|
369 |
+
},
|
370 |
+
{
|
371 |
+
"cell_type": "markdown",
|
372 |
+
"metadata": {
|
373 |
+
"id": "P8AXJJyBrZWF"
|
374 |
+
},
|
375 |
+
"source": [
|
376 |
+
"Query Dataset"
|
377 |
+
]
|
378 |
+
},
|
379 |
+
{
|
380 |
+
"cell_type": "code",
|
381 |
+
"execution_count": 14,
|
382 |
+
"metadata": {
|
383 |
+
"id": "-H64YLxshM2b"
|
384 |
+
},
|
385 |
+
"outputs": [],
|
386 |
+
"source": [
|
387 |
+
"from langchain_openai import ChatOpenAI\n",
|
388 |
+
"# Initializing the LLM model\n",
|
389 |
+
"llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
390 |
+
]
|
391 |
+
},
|
392 |
+
{
|
393 |
+
"cell_type": "code",
|
394 |
+
"execution_count": 16,
|
395 |
+
"metadata": {
|
396 |
+
"colab": {
|
397 |
+
"base_uri": "https://localhost:8080/"
|
398 |
+
},
|
399 |
+
"id": "AxBqPNtthPaa",
|
400 |
+
"outputId": "93c9ad64-1cd1-4f52-c51e-6f3ec5d6542d"
|
401 |
+
},
|
402 |
+
"outputs": [
|
403 |
+
{
|
404 |
+
"name": "stdout",
|
405 |
+
"output_type": "stream",
|
406 |
+
"text": [
|
407 |
+
"The LLaMA-2 model has 7 billion parameters.\n"
|
408 |
+
]
|
409 |
+
}
|
410 |
+
],
|
411 |
+
"source": [
|
412 |
+
"from langchain.chains import RetrievalQA\n",
|
413 |
+
"query = \"How many parameters LLaMA2 model has?\"\n",
|
414 |
+
"retriever = chroma_db.as_retriever(search_kwargs={\"k\": 2})\n",
|
415 |
+
"# Define a RetrievalQA chain that is responsible for retrieving related pieces of text,\n",
|
416 |
+
"# and using a LLM to formulate the final answer.\n",
|
417 |
+
"chain = RetrievalQA.from_chain_type(llm=llm,\n",
|
418 |
+
" chain_type=\"stuff\",\n",
|
419 |
+
" retriever=retriever)\n",
|
420 |
+
"\n",
|
421 |
+
"response = chain(query)\n",
|
422 |
+
"print(response[\"result\"])"
|
423 |
+
]
|
424 |
+
}
|
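Calling the chain directly as chain(query) still works here but is deprecated in LangChain 0.1.x in favor of the invoke API; the equivalent call:

# Preferred invocation style in LangChain 0.1.x and later.
response = chain.invoke({"query": query})
print(response["result"])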
425 |
+
],
|
426 |
+
"metadata": {
|
427 |
+
"colab": {
|
428 |
+
"provenance": []
|
429 |
+
},
|
430 |
+
"kernelspec": {
|
431 |
+
"display_name": "Python 3",
|
432 |
+
"name": "python3"
|
433 |
+
},
|
434 |
+
"language_info": {
|
435 |
+
"codemirror_mode": {
|
436 |
+
"name": "ipython",
|
437 |
+
"version": 3
|
438 |
+
},
|
439 |
+
"file_extension": ".py",
|
440 |
+
"mimetype": "text/x-python",
|
441 |
+
"name": "python",
|
442 |
+
"nbconvert_exporter": "python",
|
443 |
+
"pygments_lexer": "ipython3",
|
444 |
+
"version": "3.11.8"
|
445 |
+
}
|
446 |
+
},
|
447 |
+
"nbformat": 4,
|
448 |
+
"nbformat_minor": 0
|
449 |
+
}
|
notebooks/05-Improve_Prompts_+_Add_Source.ipynb
ADDED
@@ -0,0 +1,1420 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"colab_type": "text",
|
7 |
+
"id": "view-in-github"
|
8 |
+
},
|
9 |
+
"source": [
|
10 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/05-Improve_Prompts_%2B_Add_Source.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "markdown",
|
15 |
+
"metadata": {
|
16 |
+
"id": "5BGJ3fxhOk2V"
|
17 |
+
},
|
18 |
+
"source": [
|
19 |
+
"# Install Packages and Setup Variables"
|
20 |
+
]
|
21 |
+
},
|
22 |
+
{
|
23 |
+
"cell_type": "code",
|
24 |
+
"execution_count": 4,
|
25 |
+
"metadata": {
|
26 |
+
"colab": {
|
27 |
+
"base_uri": "https://localhost:8080/"
|
28 |
+
},
|
29 |
+
"id": "QPJzr-I9XQ7l",
|
30 |
+
"outputId": "b6cb3d46-9ad9-4658-be9c-a24bcab98c7c"
|
31 |
+
},
|
32 |
+
"outputs": [],
|
33 |
+
"source": [
|
34 |
+
"!pip install -q llama-index==0.10.9 openai==1.12.0 cohere==4.47 tiktoken==0.6.0 chromadb==0.4.22"
|
35 |
+
]
|
36 |
+
},
|
37 |
+
{
|
38 |
+
"cell_type": "code",
|
39 |
+
"execution_count": 1,
|
40 |
+
"metadata": {
|
41 |
+
"id": "riuXwpSPcvWC"
|
42 |
+
},
|
43 |
+
"outputs": [],
|
44 |
+
"source": [
|
45 |
+
"import os\n",
|
46 |
+
"\n",
|
47 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
48 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\"\n"
|
49 |
+
]
|
50 |
+
},
|
51 |
+
{
|
52 |
+
"cell_type": "code",
|
53 |
+
"execution_count": 2,
|
54 |
+
"metadata": {
|
55 |
+
"id": "km-KQOrgr3VB"
|
56 |
+
},
|
57 |
+
"outputs": [],
|
58 |
+
"source": [
|
59 |
+
"# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
|
60 |
+
"\n",
|
61 |
+
"import nest_asyncio\n",
|
62 |
+
"\n",
|
63 |
+
"nest_asyncio.apply()"
|
64 |
+
]
|
65 |
+
},
|
66 |
+
{
|
67 |
+
"cell_type": "markdown",
|
68 |
+
"metadata": {
|
69 |
+
"id": "Bkgi2OrYzF7q"
|
70 |
+
},
|
71 |
+
"source": [
|
72 |
+
"# Load a Model"
|
73 |
+
]
|
74 |
+
},
|
75 |
+
{
|
76 |
+
"cell_type": "code",
|
77 |
+
"execution_count": 3,
|
78 |
+
"metadata": {
|
79 |
+
"id": "9oGT6crooSSj"
|
80 |
+
},
|
81 |
+
"outputs": [],
|
82 |
+
"source": [
|
83 |
+
"from llama_index.llms.openai import OpenAI\n",
|
84 |
+
"\n",
|
85 |
+
"llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
86 |
+
]
|
87 |
+
},
|
88 |
+
{
|
89 |
+
"cell_type": "markdown",
|
90 |
+
"metadata": {
|
91 |
+
"id": "0BwVuJXlzHVL"
|
92 |
+
},
|
93 |
+
"source": [
|
94 |
+
"# Create a VectoreStore"
|
95 |
+
]
|
96 |
+
},
|
97 |
+
{
|
98 |
+
"cell_type": "code",
|
99 |
+
"execution_count": 4,
|
100 |
+
"metadata": {
|
101 |
+
"id": "SQP87lHczHKc"
|
102 |
+
},
|
103 |
+
"outputs": [],
|
104 |
+
"source": [
|
105 |
+
"import chromadb\n",
|
106 |
+
"\n",
|
107 |
+
"# create client and a new collection\n",
|
108 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
109 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
110 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
111 |
+
]
|
112 |
+
},
|
113 |
+
{
|
114 |
+
"cell_type": "code",
|
115 |
+
"execution_count": 5,
|
116 |
+
"metadata": {
|
117 |
+
"id": "zAaGcYMJzHAN"
|
118 |
+
},
|
119 |
+
"outputs": [],
|
120 |
+
"source": [
|
121 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
122 |
+
"\n",
|
123 |
+
"# Define a storage context object using the created vector database.\n",
|
124 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
125 |
+
]
|
126 |
+
},
|
127 |
+
{
|
128 |
+
"cell_type": "markdown",
|
129 |
+
"metadata": {
|
130 |
+
"id": "I9JbAzFcjkpn"
|
131 |
+
},
|
132 |
+
"source": [
|
133 |
+
"# Load the Dataset (CSV)"
|
134 |
+
]
|
135 |
+
},
|
136 |
+
{
|
137 |
+
"cell_type": "markdown",
|
138 |
+
"metadata": {
|
139 |
+
"id": "_Tif8-JoRH68"
|
140 |
+
},
|
141 |
+
"source": [
|
142 |
+
"## Download"
|
143 |
+
]
|
144 |
+
},
|
145 |
+
{
|
146 |
+
"cell_type": "markdown",
|
147 |
+
"metadata": {
|
148 |
+
"id": "4fQaa1LN1mXL"
|
149 |
+
},
|
150 |
+
"source": [
|
151 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model."
|
152 |
+
]
|
153 |
+
},
|
154 |
+
{
|
155 |
+
"cell_type": "code",
|
156 |
+
"execution_count": 6,
|
157 |
+
"metadata": {
|
158 |
+
"colab": {
|
159 |
+
"base_uri": "https://localhost:8080/"
|
160 |
+
},
|
161 |
+
"id": "fQtpDvUzKNzI",
|
162 |
+
"outputId": "829f8e63-7767-43a1-b3c9-95ae099012e7"
|
163 |
+
},
|
164 |
+
"outputs": [
|
165 |
+
{
|
166 |
+
"name": "stdout",
|
167 |
+
"output_type": "stream",
|
168 |
+
"text": [
|
169 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
170 |
+
" Dload Upload Total Spent Left Speed\n",
|
171 |
+
"100 169k 100 169k 0 0 1044k 0 --:--:-- --:--:-- --:--:-- 1040k\n"
|
172 |
+
]
|
173 |
+
}
|
174 |
+
],
|
175 |
+
"source": [
|
176 |
+
"!curl -o ./mini-dataset.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
177 |
+
]
|
178 |
+
},
|
179 |
+
{
|
180 |
+
"cell_type": "markdown",
|
181 |
+
"metadata": {
|
182 |
+
"id": "zk-4alIxROo8"
|
183 |
+
},
|
184 |
+
"source": [
|
185 |
+
"## Load the Articles"
|
186 |
+
]
|
187 |
+
},
|
188 |
+
{
|
189 |
+
"cell_type": "code",
|
190 |
+
"execution_count": 7,
|
191 |
+
"metadata": {
|
192 |
+
"colab": {
|
193 |
+
"base_uri": "https://localhost:8080/"
|
194 |
+
},
|
195 |
+
"id": "_WER5lt0N7c5",
|
196 |
+
"outputId": "2e4eae71-fa3a-4faf-a4e2-d3efaeaa591a"
|
197 |
+
},
|
198 |
+
"outputs": [
|
199 |
+
{
|
200 |
+
"data": {
|
201 |
+
"text/plain": [
|
202 |
+
"14"
|
203 |
+
]
|
204 |
+
},
|
205 |
+
"execution_count": 7,
|
206 |
+
"metadata": {},
|
207 |
+
"output_type": "execute_result"
|
208 |
+
}
|
209 |
+
],
|
210 |
+
"source": [
|
211 |
+
"import csv\n",
|
212 |
+
"\n",
|
213 |
+
"rows = []\n",
|
214 |
+
"\n",
|
215 |
+
"# Load the file as a JSON\n",
|
216 |
+
"with open(\"./mini-dataset.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
217 |
+
" csv_reader = csv.reader(file)\n",
|
218 |
+
"\n",
|
219 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
220 |
+
" if idx == 0: continue; # Skip header row\n",
|
221 |
+
" rows.append( row )\n",
|
222 |
+
"\n",
|
223 |
+
"# The number of characters in the dataset.\n",
|
224 |
+
"len( rows )"
|
225 |
+
]
|
226 |
+
},
|
227 |
+
{
|
228 |
+
"cell_type": "markdown",
|
229 |
+
"metadata": {
|
230 |
+
"id": "wxEStggPdxYs"
|
231 |
+
},
|
232 |
+
"source": [
|
233 |
+
"# Convert to Document obj"
|
234 |
+
]
|
235 |
+
},
|
236 |
+
{
|
237 |
+
"cell_type": "code",
|
238 |
+
"execution_count": 8,
|
239 |
+
"metadata": {
|
240 |
+
"id": "lFvW_886dxKX"
|
241 |
+
},
|
242 |
+
"outputs": [],
|
243 |
+
"source": [
|
244 |
+
"from llama_index.core import Document\n",
|
245 |
+
"\n",
|
246 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
247 |
+
"documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
|
248 |
+
]
|
249 |
+
},
|
250 |
+
{
|
251 |
+
"cell_type": "code",
|
252 |
+
"execution_count": 9,
|
253 |
+
"metadata": {
|
254 |
+
"colab": {
|
255 |
+
"base_uri": "https://localhost:8080/"
|
256 |
+
},
|
257 |
+
"id": "Njoc3XEVkKkf",
|
258 |
+
"outputId": "bab3878d-252d-4f9a-8a65-d2933e8dc891"
|
259 |
+
},
|
260 |
+
"outputs": [
|
261 |
+
{
|
262 |
+
"data": {
|
263 |
+
"text/plain": [
|
264 |
+
"14"
|
265 |
+
]
|
266 |
+
},
|
267 |
+
"execution_count": 9,
|
268 |
+
"metadata": {},
|
269 |
+
"output_type": "execute_result"
|
270 |
+
}
|
271 |
+
],
|
272 |
+
"source": [
|
273 |
+
"len( documents )"
|
274 |
+
]
|
275 |
+
},
|
276 |
+
{
|
277 |
+
"cell_type": "markdown",
|
278 |
+
"metadata": {
|
279 |
+
"id": "S17g2RYOjmf2"
|
280 |
+
},
|
281 |
+
"source": [
|
282 |
+
"# Transforming"
|
283 |
+
]
|
284 |
+
},
|
285 |
+
{
|
286 |
+
"cell_type": "code",
|
287 |
+
"execution_count": 10,
|
288 |
+
"metadata": {
|
289 |
+
"id": "STACTMUR1z9N"
|
290 |
+
},
|
291 |
+
"outputs": [],
|
292 |
+
"source": [
|
293 |
+
"from llama_index.core.node_parser import TokenTextSplitter\n",
|
294 |
+
"\n",
|
295 |
+
"# Define the splitter object that split the text into segments with 512 tokens,\n",
|
296 |
+
"# with a 128 overlap between the segments.\n",
|
297 |
+
"text_splitter = TokenTextSplitter(\n",
|
298 |
+
" separator=\" \", chunk_size=512, chunk_overlap=128\n",
|
299 |
+
")"
|
300 |
+
]
|
301 |
+
},
|
302 |
+
{
|
303 |
+
"cell_type": "code",
|
304 |
+
"execution_count": 11,
|
305 |
+
"metadata": {
|
306 |
+
"colab": {
|
307 |
+
"base_uri": "https://localhost:8080/",
|
308 |
+
"height": 331,
|
309 |
+
"referenced_widgets": [
|
310 |
+
"9b38fd520d1a4700bbc596b260a9a96f",
|
311 |
+
"5320a84d7a00443e86af8f031d71685d",
|
312 |
+
"4f3f1f990d244eb290482be55525daec",
|
313 |
+
"9a4eb44d43dc42d9acdb606b6d55ad9f",
|
314 |
+
"51de9732c1e04961b16351d3f410ac1d",
|
315 |
+
"b40ee74dabec45ce842bcfb983d3fa75",
|
316 |
+
"0c0ba53346954abc85f0921b682e7279",
|
317 |
+
"9372c35dcfc04e16a97c0eb63003520e",
|
318 |
+
"c6f3cd2404ef4a3096a61c1fcdbddd8f",
|
319 |
+
"181bd6b10e9e4ec693ece948fd432302",
|
320 |
+
"0c55e54063ea44ab8ea83466d9603a6d",
|
321 |
+
"739a7d470a024bc2806e2ea998bf1dac",
|
322 |
+
"299757dc40394c3287beea74c40dec27",
|
323 |
+
"6c111aa1d43a4af9b04355a65c8fccb2",
|
324 |
+
"4926bed77e464729b902c20bd7874a03",
|
325 |
+
"5c1eaae6cf2840ab96f1a1d6a1f91881",
|
326 |
+
"d4b409c70f3f4398ad88ede8f438e32a",
|
327 |
+
"85fa4db33aa8427ba18d43f9a529529b",
|
328 |
+
"a9e8371d627a48e69c7a725646f689d5",
|
329 |
+
"e8a00080ca684fcc97189f5f3ea325e3",
|
330 |
+
"d7213ef5bbb7409cbe40437bde51b5c9",
|
331 |
+
"652d2e07d8be4f1f87c2f258cf288f1a"
|
332 |
+
]
|
333 |
+
},
|
334 |
+
"id": "CtdsIUQ81_hT",
|
335 |
+
"outputId": "6a48a887-be9e-4bf3-d54d-3e0575a24e52"
|
336 |
+
},
|
337 |
+
"outputs": [
|
338 |
+
{
|
339 |
+
"name": "stderr",
|
340 |
+
"output_type": "stream",
|
341 |
+
"text": [
|
342 |
+
"/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
343 |
+
" from .autonotebook import tqdm as notebook_tqdm\n",
|
344 |
+
"Parsing nodes: 100%|██████████| 14/14 [00:00<00:00, 38.44it/s]\n",
|
345 |
+
"Generating embeddings: 100%|██████████| 108/108 [00:01<00:00, 79.43it/s]\n"
|
346 |
+
]
|
347 |
+
}
|
348 |
+
],
|
349 |
+
"source": [
|
350 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
351 |
+
"from llama_index.core.ingestion import IngestionPipeline\n",
|
352 |
+
"\n",
|
353 |
+
"# Create the pipeline to apply the transformation (splitting and embedding) on each chunk,\n",
|
354 |
+
"# and store the transformed text in the chroma vector store.\n",
|
355 |
+
"pipeline = IngestionPipeline(\n",
|
356 |
+
" transformations=[\n",
|
357 |
+
" text_splitter,\n",
|
358 |
+
" OpenAIEmbedding(),\n",
|
359 |
+
" ],\n",
|
360 |
+
" vector_store=vector_store\n",
|
361 |
+
")\n",
|
362 |
+
"\n",
|
363 |
+
"# Run the transformation pipeline.\n",
|
364 |
+
"b = pipeline.run(documents=documents, show_progress=True);"
|
365 |
+
]
|
366 |
+
},
|
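As a sanity check (a sketch, not in the original cell): `pipeline.run` returns the transformed nodes, so their count should match the 108 embedded chunks reported by the progress bar above.

# The pipeline returns the transformed nodes (sketch).
print(len(nodes))  # expected: 108 chunks produced from the 14 articles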
367 |
+
{
|
368 |
+
"cell_type": "markdown",
|
369 |
+
"metadata": {
|
370 |
+
"id": "EV0ll57p46Dc"
|
371 |
+
},
|
372 |
+
"source": [
|
373 |
+
"# Load Indexes"
|
374 |
+
]
|
375 |
+
},
|
376 |
+
{
|
377 |
+
"cell_type": "code",
|
378 |
+
"execution_count": 12,
|
379 |
+
"metadata": {
|
380 |
+
"id": "PS215gCGkGD-"
|
381 |
+
},
|
382 |
+
"outputs": [],
|
383 |
+
"source": [
|
384 |
+
"# Load the vector store from the local storage.\n",
|
385 |
+
"db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
386 |
+
"chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
|
387 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
388 |
+
]
|
389 |
+
},
|
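To confirm the persisted collection actually holds the ingested chunks, it can be counted directly (a sketch; `count` is the standard chromadb collection method):

# Verify the persisted collection holds the ingested chunks (sketch).
print(chroma_collection.count())  # expected: 108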
390 |
+
{
|
391 |
+
"cell_type": "code",
|
392 |
+
"execution_count": 13,
|
393 |
+
"metadata": {
|
394 |
+
"id": "HbT3-kRO4Qpt"
|
395 |
+
},
|
396 |
+
"outputs": [],
|
397 |
+
"source": [
|
398 |
+
"from llama_index.core import VectorStoreIndex\n",
|
399 |
+
"\n",
|
400 |
+
"# Create the index based on the vector store.\n",
|
401 |
+
"index = VectorStoreIndex.from_vector_store(vector_store)"
|
402 |
+
]
|
403 |
+
},
|
404 |
+
{
|
405 |
+
"cell_type": "code",
|
406 |
+
"execution_count": 14,
|
407 |
+
"metadata": {
|
408 |
+
"id": "sb61DWU84bHP"
|
409 |
+
},
|
410 |
+
"outputs": [],
|
411 |
+
"source": [
|
412 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
413 |
+
"# and using a LLM to formulate the final answer.\n",
|
414 |
+
"query_engine = index.as_query_engine()"
|
415 |
+
]
|
416 |
+
},
|
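By default `as_query_engine` retrieves the two most similar chunks per query, which is why two source nodes are printed below. If more context per answer is needed, `similarity_top_k` is the standard keyword argument (a sketch; the value 4 here is arbitrary):

# Retrieve more chunks per query if needed (the default is 2).
query_engine_top4 = index.as_query_engine(similarity_top_k=4)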
417 |
+
{
|
418 |
+
"cell_type": "code",
|
419 |
+
"execution_count": 15,
|
420 |
+
"metadata": {
|
421 |
+
"id": "G32W2LMMCmnv"
|
422 |
+
},
|
423 |
+
"outputs": [],
|
424 |
+
"source": [
|
425 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
426 |
+
]
|
427 |
+
},
|
428 |
+
{
|
429 |
+
"cell_type": "code",
|
430 |
+
"execution_count": 16,
|
431 |
+
"metadata": {
|
432 |
+
"colab": {
|
433 |
+
"base_uri": "https://localhost:8080/",
|
434 |
+
"height": 35
|
435 |
+
},
|
436 |
+
"id": "obc20cU5Cxf2",
|
437 |
+
"outputId": "6f89e848-da19-40db-90bb-777a5483af04"
|
438 |
+
},
|
439 |
+
"outputs": [
|
440 |
+
{
|
441 |
+
"data": {
|
442 |
+
"text/plain": [
|
443 |
+
"'The Llama 2 model is available in four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
444 |
+
]
|
445 |
+
},
|
446 |
+
"execution_count": 16,
|
447 |
+
"metadata": {},
|
448 |
+
"output_type": "execute_result"
|
449 |
+
}
|
450 |
+
],
|
451 |
+
"source": [
|
452 |
+
"res.response"
|
453 |
+
]
|
454 |
+
},
|
455 |
+
{
|
456 |
+
"cell_type": "code",
|
457 |
+
"execution_count": 17,
|
458 |
+
"metadata": {
|
459 |
+
"colab": {
|
460 |
+
"base_uri": "https://localhost:8080/"
|
461 |
+
},
|
462 |
+
"id": "oIAO-saJCzYe",
|
463 |
+
"outputId": "985a5eca-9e1c-45e7-e650-63f90f7df964"
|
464 |
+
},
|
465 |
+
"outputs": [
|
466 |
+
{
|
467 |
+
"name": "stdout",
|
468 |
+
"output_type": "stream",
|
469 |
+
"text": [
|
470 |
+
"Node ID\t de7de537-c87d-44e3-ac43-5180a95acb90\n",
|
471 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
472 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
473 |
+
"Score\t 0.7122361910421624\n",
|
474 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
475 |
+
"Node ID\t 1dfbee1d-1073-4f89-a286-1f0321729e58\n",
|
476 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
477 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
478 |
+
"Score\t 0.7047493574957753\n",
|
479 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
480 |
+
]
|
481 |
+
}
|
482 |
+
],
|
483 |
+
"source": [
|
484 |
+
"# Show the retrieved nodes\n",
|
485 |
+
"for src in res.source_nodes:\n",
|
486 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
487 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
488 |
+
" print(\"Text\\t\", src.text)\n",
|
489 |
+
" print(\"Score\\t\", src.score)\n",
|
490 |
+
" print(\"-_\"*20)"
|
491 |
+
]
|
492 |
+
},
|
493 |
+
{
|
494 |
+
"cell_type": "markdown",
|
495 |
+
"metadata": {
|
496 |
+
"id": "pVJif4uhPNXM"
|
497 |
+
},
|
498 |
+
"source": [
|
499 |
+
"# Response Modes\n",
|
500 |
+
"\n"
|
501 |
+
]
|
502 |
+
},
|
503 |
+
{
|
504 |
+
"cell_type": "markdown",
|
505 |
+
"metadata": {
|
506 |
+
"id": "ykZOaQYvPWMj"
|
507 |
+
},
|
508 |
+
"source": [
|
509 |
+
"The behavior of the query engine during response generation can be adjusted. Several modes are available for consideration, including the following:\n",
|
510 |
+
"\n",
|
511 |
+
"- compact (default): Concatenate all the retrieved chunks and use them in the prompt to generate an answer.\n",
|
512 |
+
"- refine: Generate an answer based on the first retrieved chunk, then improve the answer based on the other retrieved chunks one at a time. (will send one request for each chunk to refine the response)\n",
|
513 |
+
"- tree summarize: concatenate the retrieved chunks until they fit the context window and summarize them. The summaized chunks will then recusively fed back to the LLM for summarization until one chunk remains which would be the final answer.\n",
|
514 |
+
"\n",
|
515 |
+
"\n",
|
516 |
+
"Refer to [documentation](https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/root.html#configuring-the-response-mode) for a comprehensive list.\n",
|
517 |
+
"\n",
|
518 |
+
"Due to the limited size of the sample dataset, the examples provided will yield identical responses. It's crucial to evaluate these methods in the context of your specific use case and cost considerations."
|
519 |
+
]
|
520 |
+
},
|
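A small comparison sketch (added here, not in the original notebook) that runs the same question through all three modes; on this tiny dataset the answers should come out nearly identical, as noted above:

# Compare the three response modes side by side (sketch).
for mode in ("compact", "refine", "tree_summarize"):
    engine = index.as_query_engine(response_mode=mode)
    print(mode, "->", engine.query("How many parameters LLaMA2 model has?").response)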
521 |
+
{
|
522 |
+
"cell_type": "code",
|
523 |
+
"execution_count": 18,
|
524 |
+
"metadata": {
|
525 |
+
"id": "d4xxZHbdN0lK"
|
526 |
+
},
|
527 |
+
"outputs": [],
|
528 |
+
"source": [
|
529 |
+
"query_engine = index.as_query_engine(response_mode=\"refine\")\n",
|
530 |
+
"# query_engine = index.as_query_engine(response_mode=\"tree_summarize\")"
|
531 |
+
]
|
532 |
+
},
|
533 |
+
{
|
534 |
+
"cell_type": "code",
|
535 |
+
"execution_count": 19,
|
536 |
+
"metadata": {
|
537 |
+
"id": "uNKJfIn-SDLm"
|
538 |
+
},
|
539 |
+
"outputs": [],
|
540 |
+
"source": [
|
541 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
542 |
+
]
|
543 |
+
},
|
544 |
+
{
|
545 |
+
"cell_type": "code",
|
546 |
+
"execution_count": 20,
|
547 |
+
"metadata": {
|
548 |
+
"colab": {
|
549 |
+
"base_uri": "https://localhost:8080/",
|
550 |
+
"height": 35
|
551 |
+
},
|
552 |
+
"id": "Z1XmLBEoSFzB",
|
553 |
+
"outputId": "53ee59b9-a2ad-4700-e8c9-7f450d650242"
|
554 |
+
},
|
555 |
+
"outputs": [
|
556 |
+
{
|
557 |
+
"data": {
|
558 |
+
"text/plain": [
|
559 |
+
"'The Llama 2 model is available in four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
560 |
+
]
|
561 |
+
},
|
562 |
+
"execution_count": 20,
|
563 |
+
"metadata": {},
|
564 |
+
"output_type": "execute_result"
|
565 |
+
}
|
566 |
+
],
|
567 |
+
"source": [
|
568 |
+
"res.response"
|
569 |
+
]
|
570 |
+
},
|
571 |
+
{
|
572 |
+
"cell_type": "code",
|
573 |
+
"execution_count": 21,
|
574 |
+
"metadata": {
|
575 |
+
"colab": {
|
576 |
+
"base_uri": "https://localhost:8080/"
|
577 |
+
},
|
578 |
+
"id": "pZUgM-mSST4X",
|
579 |
+
"outputId": "6803179b-95f5-46d1-ad98-d799ea1b6289"
|
580 |
+
},
|
581 |
+
"outputs": [
|
582 |
+
{
|
583 |
+
"name": "stdout",
|
584 |
+
"output_type": "stream",
|
585 |
+
"text": [
|
586 |
+
"Node ID\t de7de537-c87d-44e3-ac43-5180a95acb90\n",
|
587 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
588 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
589 |
+
"Score\t 0.7122361910421624\n",
|
590 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
591 |
+
"Node ID\t 1dfbee1d-1073-4f89-a286-1f0321729e58\n",
|
592 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
593 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
594 |
+
"Score\t 0.7047493574957753\n",
|
595 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
596 |
+
]
|
597 |
+
}
|
598 |
+
],
|
599 |
+
"source": [
|
600 |
+
"# Show the retrieved nodes\n",
|
601 |
+
"for src in res.source_nodes:\n",
|
602 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
603 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
604 |
+
" print(\"Text\\t\", src.text)\n",
|
605 |
+
" print(\"Score\\t\", src.score)\n",
|
606 |
+
" print(\"-_\"*20)"
|
607 |
+
]
|
608 |
+
},
|
609 |
+
{
|
610 |
+
"cell_type": "markdown",
|
611 |
+
"metadata": {
|
612 |
+
"id": "697hg9YWTAoq"
|
613 |
+
},
|
614 |
+
"source": [
|
615 |
+
"The `no_text` mode will retrieve the documents, but will not send the request to the API to synthesize the final response. It is a great approach to debug the retrieved documents."
|
616 |
+
]
|
617 |
+
},
|
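The same LLM-free retrieval can also be reached through the lower-level retriever API (a sketch under that assumption; `as_retriever` and `retrieve` are the standard index methods):

# Equivalent LLM-free retrieval via the retriever API (sketch).
retriever = index.as_retriever()
retrieved_nodes = retriever.retrieve("How many parameters LLaMA2 model has?")
print(len(retrieved_nodes))  # the same two nodes the no_text engine returns below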
618 |
+
{
|
619 |
+
"cell_type": "code",
|
620 |
+
"execution_count": 22,
|
621 |
+
"metadata": {
|
622 |
+
"colab": {
|
623 |
+
"base_uri": "https://localhost:8080/"
|
624 |
+
},
|
625 |
+
"id": "H2x55KW0S1Jg",
|
626 |
+
"outputId": "39e8924c-c445-4658-d39f-7a300e8d516f"
|
627 |
+
},
|
628 |
+
"outputs": [],
|
629 |
+
"source": [
|
630 |
+
"query_engine = index.as_query_engine(response_mode=\"no_text\")\n",
|
631 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
632 |
+
]
|
633 |
+
},
|
634 |
+
{
|
635 |
+
"cell_type": "code",
|
636 |
+
"execution_count": 23,
|
637 |
+
"metadata": {
|
638 |
+
"colab": {
|
639 |
+
"base_uri": "https://localhost:8080/",
|
640 |
+
"height": 35
|
641 |
+
},
|
642 |
+
"id": "gvvtYQcBS-Ug",
|
643 |
+
"outputId": "85dd7301-6d12-4758-86b0-652396d6fe39"
|
644 |
+
},
|
645 |
+
"outputs": [
|
646 |
+
{
|
647 |
+
"data": {
|
648 |
+
"text/plain": [
|
649 |
+
"''"
|
650 |
+
]
|
651 |
+
},
|
652 |
+
"execution_count": 23,
|
653 |
+
"metadata": {},
|
654 |
+
"output_type": "execute_result"
|
655 |
+
}
|
656 |
+
],
|
657 |
+
"source": [
|
658 |
+
"res.response"
|
659 |
+
]
|
660 |
+
},
|
661 |
+
{
|
662 |
+
"cell_type": "code",
|
663 |
+
"execution_count": 24,
|
664 |
+
"metadata": {
|
665 |
+
"colab": {
|
666 |
+
"base_uri": "https://localhost:8080/"
|
667 |
+
},
|
668 |
+
"id": "o9ijBEkXS5LC",
|
669 |
+
"outputId": "616c8315-15c5-47cd-a9ed-2830b2f88d5d"
|
670 |
+
},
|
671 |
+
"outputs": [
|
672 |
+
{
|
673 |
+
"name": "stdout",
|
674 |
+
"output_type": "stream",
|
675 |
+
"text": [
|
676 |
+
"Node ID\t de7de537-c87d-44e3-ac43-5180a95acb90\n",
|
677 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
678 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
679 |
+
"Score\t 0.7122361910421624\n",
|
680 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
681 |
+
"Node ID\t 1dfbee1d-1073-4f89-a286-1f0321729e58\n",
|
682 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
683 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
684 |
+
"Score\t 0.7047493574957753\n",
|
685 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
686 |
+
]
|
687 |
+
}
|
688 |
+
],
|
689 |
+
"source": [
|
690 |
+
"# Show the retrieved nodes\n",
|
691 |
+
"for src in res.source_nodes:\n",
|
692 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
693 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
694 |
+
" print(\"Text\\t\", src.text)\n",
|
695 |
+
" print(\"Score\\t\", src.score)\n",
|
696 |
+
" print(\"-_\"*20)"
|
697 |
+
]
|
698 |
+
},
|
699 |
+
{
|
700 |
+
"cell_type": "code",
|
701 |
+
"execution_count": null,
|
702 |
+
"metadata": {},
|
703 |
+
"outputs": [],
|
704 |
+
"source": []
|
705 |
+
}
|
706 |
+
],
|
707 |
+
"metadata": {
|
708 |
+
"colab": {
|
709 |
+
"authorship_tag": "ABX9TyPHUCVR9OPVGnLj3XoIzKS4",
|
710 |
+
"include_colab_link": true,
|
711 |
+
"provenance": []
|
712 |
+
},
|
713 |
+
"kernelspec": {
|
714 |
+
"display_name": "Python 3",
|
715 |
+
"name": "python3"
|
716 |
+
},
|
717 |
+
"language_info": {
|
718 |
+
"codemirror_mode": {
|
719 |
+
"name": "ipython",
|
720 |
+
"version": 3
|
721 |
+
},
|
722 |
+
"file_extension": ".py",
|
723 |
+
"mimetype": "text/x-python",
|
724 |
+
"name": "python",
|
725 |
+
"nbconvert_exporter": "python",
|
726 |
+
"pygments_lexer": "ipython3",
|
727 |
+
"version": "3.11.8"
|
728 |
+
},
|
729 |
+
"widgets": {
|
730 |
+
"application/vnd.jupyter.widget-state+json": {
|
731 |
+
"0c0ba53346954abc85f0921b682e7279": {
|
732 |
+
"model_module": "@jupyter-widgets/controls",
|
733 |
+
"model_module_version": "1.5.0",
|
734 |
+
"model_name": "DescriptionStyleModel",
|
735 |
+
"state": {
|
736 |
+
"_model_module": "@jupyter-widgets/controls",
|
737 |
+
"_model_module_version": "1.5.0",
|
738 |
+
"_model_name": "DescriptionStyleModel",
|
739 |
+
"_view_count": null,
|
740 |
+
"_view_module": "@jupyter-widgets/base",
|
741 |
+
"_view_module_version": "1.2.0",
|
742 |
+
"_view_name": "StyleView",
|
743 |
+
"description_width": ""
|
744 |
+
}
|
745 |
+
},
|
746 |
+
"0c55e54063ea44ab8ea83466d9603a6d": {
|
747 |
+
"model_module": "@jupyter-widgets/controls",
|
748 |
+
"model_module_version": "1.5.0",
|
749 |
+
"model_name": "DescriptionStyleModel",
|
750 |
+
"state": {
|
751 |
+
"_model_module": "@jupyter-widgets/controls",
|
752 |
+
"_model_module_version": "1.5.0",
|
753 |
+
"_model_name": "DescriptionStyleModel",
|
754 |
+
"_view_count": null,
|
755 |
+
"_view_module": "@jupyter-widgets/base",
|
756 |
+
"_view_module_version": "1.2.0",
|
757 |
+
"_view_name": "StyleView",
|
758 |
+
"description_width": ""
|
759 |
+
}
|
760 |
+
},
|
761 |
+
"181bd6b10e9e4ec693ece948fd432302": {
|
762 |
+
"model_module": "@jupyter-widgets/base",
|
763 |
+
"model_module_version": "1.2.0",
|
764 |
+
"model_name": "LayoutModel",
|
765 |
+
"state": {
|
766 |
+
"_model_module": "@jupyter-widgets/base",
|
767 |
+
"_model_module_version": "1.2.0",
|
768 |
+
"_model_name": "LayoutModel",
|
769 |
+
"_view_count": null,
|
770 |
+
"_view_module": "@jupyter-widgets/base",
|
771 |
+
"_view_module_version": "1.2.0",
|
772 |
+
"_view_name": "LayoutView",
|
773 |
+
"align_content": null,
|
774 |
+
"align_items": null,
|
775 |
+
"align_self": null,
|
776 |
+
"border": null,
|
777 |
+
"bottom": null,
|
778 |
+
"display": null,
|
779 |
+
"flex": null,
|
780 |
+
"flex_flow": null,
|
781 |
+
"grid_area": null,
|
782 |
+
"grid_auto_columns": null,
|
783 |
+
"grid_auto_flow": null,
|
784 |
+
"grid_auto_rows": null,
|
785 |
+
"grid_column": null,
|
786 |
+
"grid_gap": null,
|
787 |
+
"grid_row": null,
|
788 |
+
"grid_template_areas": null,
|
789 |
+
"grid_template_columns": null,
|
790 |
+
"grid_template_rows": null,
|
791 |
+
"height": null,
|
792 |
+
"justify_content": null,
|
793 |
+
"justify_items": null,
|
794 |
+
"left": null,
|
795 |
+
"margin": null,
|
796 |
+
"max_height": null,
|
797 |
+
"max_width": null,
|
798 |
+
"min_height": null,
|
799 |
+
"min_width": null,
|
800 |
+
"object_fit": null,
|
801 |
+
"object_position": null,
|
802 |
+
"order": null,
|
803 |
+
"overflow": null,
|
804 |
+
"overflow_x": null,
|
805 |
+
"overflow_y": null,
|
806 |
+
"padding": null,
|
807 |
+
"right": null,
|
808 |
+
"top": null,
|
809 |
+
"visibility": null,
|
810 |
+
"width": null
|
811 |
+
}
|
812 |
+
},
|
813 |
+
"299757dc40394c3287beea74c40dec27": {
|
814 |
+
"model_module": "@jupyter-widgets/controls",
|
815 |
+
"model_module_version": "1.5.0",
|
816 |
+
"model_name": "HTMLModel",
|
817 |
+
"state": {
|
818 |
+
"_dom_classes": [],
|
819 |
+
"_model_module": "@jupyter-widgets/controls",
|
820 |
+
"_model_module_version": "1.5.0",
|
821 |
+
"_model_name": "HTMLModel",
|
822 |
+
"_view_count": null,
|
823 |
+
"_view_module": "@jupyter-widgets/controls",
|
824 |
+
"_view_module_version": "1.5.0",
|
825 |
+
"_view_name": "HTMLView",
|
826 |
+
"description": "",
|
827 |
+
"description_tooltip": null,
|
828 |
+
"layout": "IPY_MODEL_d4b409c70f3f4398ad88ede8f438e32a",
|
829 |
+
"placeholder": "",
|
830 |
+
"style": "IPY_MODEL_85fa4db33aa8427ba18d43f9a529529b",
|
831 |
+
"value": "Generating embeddings: 100%"
|
832 |
+
}
|
833 |
+
},
|
834 |
+
"4926bed77e464729b902c20bd7874a03": {
|
835 |
+
"model_module": "@jupyter-widgets/controls",
|
836 |
+
"model_module_version": "1.5.0",
|
837 |
+
"model_name": "HTMLModel",
|
838 |
+
"state": {
|
839 |
+
"_dom_classes": [],
|
840 |
+
"_model_module": "@jupyter-widgets/controls",
|
841 |
+
"_model_module_version": "1.5.0",
|
842 |
+
"_model_name": "HTMLModel",
|
843 |
+
"_view_count": null,
|
844 |
+
"_view_module": "@jupyter-widgets/controls",
|
845 |
+
"_view_module_version": "1.5.0",
|
846 |
+
"_view_name": "HTMLView",
|
847 |
+
"description": "",
|
848 |
+
"description_tooltip": null,
|
849 |
+
"layout": "IPY_MODEL_d7213ef5bbb7409cbe40437bde51b5c9",
|
850 |
+
"placeholder": "",
|
851 |
+
"style": "IPY_MODEL_652d2e07d8be4f1f87c2f258cf288f1a",
|
852 |
+
"value": " 108/108 [00:05<00:00, 28.51it/s]"
|
853 |
+
}
|
854 |
+
},
|
855 |
+
"4f3f1f990d244eb290482be55525daec": {
|
856 |
+
"model_module": "@jupyter-widgets/controls",
|
857 |
+
"model_module_version": "1.5.0",
|
858 |
+
"model_name": "FloatProgressModel",
|
859 |
+
"state": {
|
860 |
+
"_dom_classes": [],
|
861 |
+
"_model_module": "@jupyter-widgets/controls",
|
862 |
+
"_model_module_version": "1.5.0",
|
863 |
+
"_model_name": "FloatProgressModel",
|
864 |
+
"_view_count": null,
|
865 |
+
"_view_module": "@jupyter-widgets/controls",
|
866 |
+
"_view_module_version": "1.5.0",
|
867 |
+
"_view_name": "ProgressView",
|
868 |
+
"bar_style": "success",
|
869 |
+
"description": "",
|
870 |
+
"description_tooltip": null,
|
871 |
+
"layout": "IPY_MODEL_9372c35dcfc04e16a97c0eb63003520e",
|
872 |
+
"max": 14,
|
873 |
+
"min": 0,
|
874 |
+
"orientation": "horizontal",
|
875 |
+
"style": "IPY_MODEL_c6f3cd2404ef4a3096a61c1fcdbddd8f",
|
876 |
+
"value": 14
|
877 |
+
}
|
878 |
+
},
|
879 |
+
"51de9732c1e04961b16351d3f410ac1d": {
|
880 |
+
"model_module": "@jupyter-widgets/base",
|
881 |
+
"model_module_version": "1.2.0",
|
882 |
+
"model_name": "LayoutModel",
|
883 |
+
"state": {
|
884 |
+
"_model_module": "@jupyter-widgets/base",
|
885 |
+
"_model_module_version": "1.2.0",
|
886 |
+
"_model_name": "LayoutModel",
|
887 |
+
"_view_count": null,
|
888 |
+
"_view_module": "@jupyter-widgets/base",
|
889 |
+
"_view_module_version": "1.2.0",
|
890 |
+
"_view_name": "LayoutView",
|
891 |
+
"align_content": null,
|
892 |
+
"align_items": null,
|
893 |
+
"align_self": null,
|
894 |
+
"border": null,
|
895 |
+
"bottom": null,
|
896 |
+
"display": null,
|
897 |
+
"flex": null,
|
898 |
+
"flex_flow": null,
|
899 |
+
"grid_area": null,
|
900 |
+
"grid_auto_columns": null,
|
901 |
+
"grid_auto_flow": null,
|
902 |
+
"grid_auto_rows": null,
|
903 |
+
"grid_column": null,
|
904 |
+
"grid_gap": null,
|
905 |
+
"grid_row": null,
|
906 |
+
"grid_template_areas": null,
|
907 |
+
"grid_template_columns": null,
|
908 |
+
"grid_template_rows": null,
|
909 |
+
"height": null,
|
910 |
+
"justify_content": null,
|
911 |
+
"justify_items": null,
|
912 |
+
"left": null,
|
913 |
+
"margin": null,
|
914 |
+
"max_height": null,
|
915 |
+
"max_width": null,
|
916 |
+
"min_height": null,
|
917 |
+
"min_width": null,
|
918 |
+
"object_fit": null,
|
919 |
+
"object_position": null,
|
920 |
+
"order": null,
|
921 |
+
"overflow": null,
|
922 |
+
"overflow_x": null,
|
923 |
+
"overflow_y": null,
|
924 |
+
"padding": null,
|
925 |
+
"right": null,
|
926 |
+
"top": null,
|
927 |
+
"visibility": null,
|
928 |
+
"width": null
|
929 |
+
}
|
930 |
+
},
|
931 |
+
"5320a84d7a00443e86af8f031d71685d": {
|
932 |
+
"model_module": "@jupyter-widgets/controls",
|
933 |
+
"model_module_version": "1.5.0",
|
934 |
+
"model_name": "HTMLModel",
|
935 |
+
"state": {
|
936 |
+
"_dom_classes": [],
|
937 |
+
"_model_module": "@jupyter-widgets/controls",
|
938 |
+
"_model_module_version": "1.5.0",
|
939 |
+
"_model_name": "HTMLModel",
|
940 |
+
"_view_count": null,
|
941 |
+
"_view_module": "@jupyter-widgets/controls",
|
942 |
+
"_view_module_version": "1.5.0",
|
943 |
+
"_view_name": "HTMLView",
|
944 |
+
"description": "",
|
945 |
+
"description_tooltip": null,
|
946 |
+
"layout": "IPY_MODEL_b40ee74dabec45ce842bcfb983d3fa75",
|
947 |
+
"placeholder": "",
|
948 |
+
"style": "IPY_MODEL_0c0ba53346954abc85f0921b682e7279",
|
949 |
+
"value": "Parsing nodes: 100%"
|
950 |
+
}
|
951 |
+
},
|
952 |
+
"5c1eaae6cf2840ab96f1a1d6a1f91881": {
|
953 |
+
"model_module": "@jupyter-widgets/base",
|
954 |
+
"model_module_version": "1.2.0",
|
955 |
+
"model_name": "LayoutModel",
|
956 |
+
"state": {
|
957 |
+
"_model_module": "@jupyter-widgets/base",
|
958 |
+
"_model_module_version": "1.2.0",
|
959 |
+
"_model_name": "LayoutModel",
|
960 |
+
"_view_count": null,
|
961 |
+
"_view_module": "@jupyter-widgets/base",
|
962 |
+
"_view_module_version": "1.2.0",
|
963 |
+
"_view_name": "LayoutView",
|
964 |
+
"align_content": null,
|
965 |
+
"align_items": null,
|
966 |
+
"align_self": null,
|
967 |
+
"border": null,
|
968 |
+
"bottom": null,
|
969 |
+
"display": null,
|
970 |
+
"flex": null,
|
971 |
+
"flex_flow": null,
|
972 |
+
"grid_area": null,
|
973 |
+
"grid_auto_columns": null,
|
974 |
+
"grid_auto_flow": null,
|
975 |
+
"grid_auto_rows": null,
|
976 |
+
"grid_column": null,
|
977 |
+
"grid_gap": null,
|
978 |
+
"grid_row": null,
|
979 |
+
"grid_template_areas": null,
|
980 |
+
"grid_template_columns": null,
|
981 |
+
"grid_template_rows": null,
|
982 |
+
"height": null,
|
983 |
+
"justify_content": null,
|
984 |
+
"justify_items": null,
|
985 |
+
"left": null,
|
986 |
+
"margin": null,
|
987 |
+
"max_height": null,
|
988 |
+
"max_width": null,
|
989 |
+
"min_height": null,
|
990 |
+
"min_width": null,
|
991 |
+
"object_fit": null,
|
992 |
+
"object_position": null,
|
993 |
+
"order": null,
|
994 |
+
"overflow": null,
|
995 |
+
"overflow_x": null,
|
996 |
+
"overflow_y": null,
|
997 |
+
"padding": null,
|
998 |
+
"right": null,
|
999 |
+
"top": null,
|
1000 |
+
"visibility": null,
|
1001 |
+
"width": null
|
1002 |
+
}
|
1003 |
+
},
|
1004 |
+
"652d2e07d8be4f1f87c2f258cf288f1a": {
|
1005 |
+
"model_module": "@jupyter-widgets/controls",
|
1006 |
+
"model_module_version": "1.5.0",
|
1007 |
+
"model_name": "DescriptionStyleModel",
|
1008 |
+
"state": {
|
1009 |
+
"_model_module": "@jupyter-widgets/controls",
|
1010 |
+
"_model_module_version": "1.5.0",
|
1011 |
+
"_model_name": "DescriptionStyleModel",
|
1012 |
+
"_view_count": null,
|
1013 |
+
"_view_module": "@jupyter-widgets/base",
|
1014 |
+
"_view_module_version": "1.2.0",
|
1015 |
+
"_view_name": "StyleView",
|
1016 |
+
"description_width": ""
|
1017 |
+
}
|
1018 |
+
},
|
1019 |
+
"6c111aa1d43a4af9b04355a65c8fccb2": {
|
1020 |
+
"model_module": "@jupyter-widgets/controls",
|
1021 |
+
"model_module_version": "1.5.0",
|
1022 |
+
"model_name": "FloatProgressModel",
|
1023 |
+
"state": {
|
1024 |
+
"_dom_classes": [],
|
1025 |
+
"_model_module": "@jupyter-widgets/controls",
|
1026 |
+
"_model_module_version": "1.5.0",
|
1027 |
+
"_model_name": "FloatProgressModel",
|
1028 |
+
"_view_count": null,
|
1029 |
+
"_view_module": "@jupyter-widgets/controls",
|
1030 |
+
"_view_module_version": "1.5.0",
|
1031 |
+
"_view_name": "ProgressView",
|
1032 |
+
"bar_style": "success",
|
1033 |
+
"description": "",
|
1034 |
+
"description_tooltip": null,
|
1035 |
+
"layout": "IPY_MODEL_a9e8371d627a48e69c7a725646f689d5",
|
1036 |
+
"max": 108,
|
1037 |
+
"min": 0,
|
1038 |
+
"orientation": "horizontal",
|
1039 |
+
"style": "IPY_MODEL_e8a00080ca684fcc97189f5f3ea325e3",
|
1040 |
+
"value": 108
|
1041 |
+
}
|
1042 |
+
},
|
1043 |
+
"739a7d470a024bc2806e2ea998bf1dac": {
|
1044 |
+
"model_module": "@jupyter-widgets/controls",
|
1045 |
+
"model_module_version": "1.5.0",
|
1046 |
+
"model_name": "HBoxModel",
|
1047 |
+
"state": {
|
1048 |
+
"_dom_classes": [],
|
1049 |
+
"_model_module": "@jupyter-widgets/controls",
|
1050 |
+
"_model_module_version": "1.5.0",
|
1051 |
+
"_model_name": "HBoxModel",
|
1052 |
+
"_view_count": null,
|
1053 |
+
"_view_module": "@jupyter-widgets/controls",
|
1054 |
+
"_view_module_version": "1.5.0",
|
1055 |
+
"_view_name": "HBoxView",
|
1056 |
+
"box_style": "",
|
1057 |
+
"children": [
|
1058 |
+
"IPY_MODEL_299757dc40394c3287beea74c40dec27",
|
1059 |
+
"IPY_MODEL_6c111aa1d43a4af9b04355a65c8fccb2",
|
1060 |
+
"IPY_MODEL_4926bed77e464729b902c20bd7874a03"
|
1061 |
+
],
|
1062 |
+
"layout": "IPY_MODEL_5c1eaae6cf2840ab96f1a1d6a1f91881"
|
1063 |
+
}
|
1064 |
+
},
|
1065 |
+
"85fa4db33aa8427ba18d43f9a529529b": {
|
1066 |
+
"model_module": "@jupyter-widgets/controls",
|
1067 |
+
"model_module_version": "1.5.0",
|
1068 |
+
"model_name": "DescriptionStyleModel",
|
1069 |
+
"state": {
|
1070 |
+
"_model_module": "@jupyter-widgets/controls",
|
1071 |
+
"_model_module_version": "1.5.0",
|
1072 |
+
"_model_name": "DescriptionStyleModel",
|
1073 |
+
"_view_count": null,
|
1074 |
+
"_view_module": "@jupyter-widgets/base",
|
1075 |
+
"_view_module_version": "1.2.0",
|
1076 |
+
"_view_name": "StyleView",
|
1077 |
+
"description_width": ""
|
1078 |
+
}
|
1079 |
+
},
|
1080 |
+
"9372c35dcfc04e16a97c0eb63003520e": {
|
1081 |
+
"model_module": "@jupyter-widgets/base",
|
1082 |
+
"model_module_version": "1.2.0",
|
1083 |
+
"model_name": "LayoutModel",
|
1084 |
+
"state": {
|
1085 |
+
"_model_module": "@jupyter-widgets/base",
|
1086 |
+
"_model_module_version": "1.2.0",
|
1087 |
+
"_model_name": "LayoutModel",
|
1088 |
+
"_view_count": null,
|
1089 |
+
"_view_module": "@jupyter-widgets/base",
|
1090 |
+
"_view_module_version": "1.2.0",
|
1091 |
+
"_view_name": "LayoutView",
|
1092 |
+
"align_content": null,
|
1093 |
+
"align_items": null,
|
1094 |
+
"align_self": null,
|
1095 |
+
"border": null,
|
1096 |
+
"bottom": null,
|
1097 |
+
"display": null,
|
1098 |
+
"flex": null,
|
1099 |
+
"flex_flow": null,
|
1100 |
+
"grid_area": null,
|
1101 |
+
"grid_auto_columns": null,
|
1102 |
+
"grid_auto_flow": null,
|
1103 |
+
"grid_auto_rows": null,
|
1104 |
+
"grid_column": null,
|
1105 |
+
"grid_gap": null,
|
1106 |
+
"grid_row": null,
|
1107 |
+
"grid_template_areas": null,
|
1108 |
+
"grid_template_columns": null,
|
1109 |
+
"grid_template_rows": null,
|
1110 |
+
"height": null,
|
1111 |
+
"justify_content": null,
|
1112 |
+
"justify_items": null,
|
1113 |
+
"left": null,
|
1114 |
+
"margin": null,
|
1115 |
+
"max_height": null,
|
1116 |
+
"max_width": null,
|
1117 |
+
"min_height": null,
|
1118 |
+
"min_width": null,
|
1119 |
+
"object_fit": null,
|
1120 |
+
"object_position": null,
|
1121 |
+
"order": null,
|
1122 |
+
"overflow": null,
|
1123 |
+
"overflow_x": null,
|
1124 |
+
"overflow_y": null,
|
1125 |
+
"padding": null,
|
1126 |
+
"right": null,
|
1127 |
+
"top": null,
|
1128 |
+
"visibility": null,
|
1129 |
+
"width": null
|
1130 |
+
}
|
1131 |
+
},
|
1132 |
+
"9a4eb44d43dc42d9acdb606b6d55ad9f": {
|
1133 |
+
"model_module": "@jupyter-widgets/controls",
|
1134 |
+
"model_module_version": "1.5.0",
|
1135 |
+
"model_name": "HTMLModel",
|
1136 |
+
"state": {
|
1137 |
+
"_dom_classes": [],
|
1138 |
+
"_model_module": "@jupyter-widgets/controls",
|
1139 |
+
"_model_module_version": "1.5.0",
|
1140 |
+
"_model_name": "HTMLModel",
|
1141 |
+
"_view_count": null,
|
1142 |
+
"_view_module": "@jupyter-widgets/controls",
|
1143 |
+
"_view_module_version": "1.5.0",
|
1144 |
+
"_view_name": "HTMLView",
|
1145 |
+
"description": "",
|
1146 |
+
"description_tooltip": null,
|
1147 |
+
"layout": "IPY_MODEL_181bd6b10e9e4ec693ece948fd432302",
|
1148 |
+
"placeholder": "",
|
1149 |
+
"style": "IPY_MODEL_0c55e54063ea44ab8ea83466d9603a6d",
|
1150 |
+
"value": " 14/14 [00:01<00:00, 15.95it/s]"
|
1151 |
+
}
|
1152 |
+
},
|
1153 |
+
"9b38fd520d1a4700bbc596b260a9a96f": {
|
1154 |
+
"model_module": "@jupyter-widgets/controls",
|
1155 |
+
"model_module_version": "1.5.0",
|
1156 |
+
"model_name": "HBoxModel",
|
1157 |
+
"state": {
|
1158 |
+
"_dom_classes": [],
|
1159 |
+
"_model_module": "@jupyter-widgets/controls",
|
1160 |
+
"_model_module_version": "1.5.0",
|
1161 |
+
"_model_name": "HBoxModel",
|
1162 |
+
"_view_count": null,
|
1163 |
+
"_view_module": "@jupyter-widgets/controls",
|
1164 |
+
"_view_module_version": "1.5.0",
|
1165 |
+
"_view_name": "HBoxView",
|
1166 |
+
"box_style": "",
|
1167 |
+
"children": [
|
1168 |
+
"IPY_MODEL_5320a84d7a00443e86af8f031d71685d",
|
1169 |
+
"IPY_MODEL_4f3f1f990d244eb290482be55525daec",
|
1170 |
+
"IPY_MODEL_9a4eb44d43dc42d9acdb606b6d55ad9f"
|
1171 |
+
],
|
1172 |
+
"layout": "IPY_MODEL_51de9732c1e04961b16351d3f410ac1d"
|
1173 |
+
}
|
1174 |
+
},
|
1175 |
+
"a9e8371d627a48e69c7a725646f689d5": {
|
1176 |
+
"model_module": "@jupyter-widgets/base",
|
1177 |
+
"model_module_version": "1.2.0",
|
1178 |
+
"model_name": "LayoutModel",
|
1179 |
+
"state": {
|
1180 |
+
"_model_module": "@jupyter-widgets/base",
|
1181 |
+
"_model_module_version": "1.2.0",
|
1182 |
+
"_model_name": "LayoutModel",
|
1183 |
+
"_view_count": null,
|
1184 |
+
"_view_module": "@jupyter-widgets/base",
|
1185 |
+
"_view_module_version": "1.2.0",
|
1186 |
+
"_view_name": "LayoutView",
|
1187 |
+
"align_content": null,
|
1188 |
+
"align_items": null,
|
1189 |
+
"align_self": null,
|
1190 |
+
"border": null,
|
1191 |
+
"bottom": null,
|
1192 |
+
"display": null,
|
1193 |
+
"flex": null,
|
1194 |
+
"flex_flow": null,
|
1195 |
+
"grid_area": null,
|
1196 |
+
"grid_auto_columns": null,
|
1197 |
+
"grid_auto_flow": null,
|
1198 |
+
"grid_auto_rows": null,
|
1199 |
+
"grid_column": null,
|
1200 |
+
"grid_gap": null,
|
1201 |
+
"grid_row": null,
|
1202 |
+
"grid_template_areas": null,
|
1203 |
+
"grid_template_columns": null,
|
1204 |
+
"grid_template_rows": null,
|
1205 |
+
"height": null,
|
1206 |
+
"justify_content": null,
|
1207 |
+
"justify_items": null,
|
1208 |
+
"left": null,
|
1209 |
+
"margin": null,
|
1210 |
+
"max_height": null,
|
1211 |
+
"max_width": null,
|
1212 |
+
"min_height": null,
|
1213 |
+
"min_width": null,
|
1214 |
+
"object_fit": null,
|
1215 |
+
"object_position": null,
|
1216 |
+
"order": null,
|
1217 |
+
"overflow": null,
|
1218 |
+
"overflow_x": null,
|
1219 |
+
"overflow_y": null,
|
1220 |
+
"padding": null,
|
1221 |
+
"right": null,
|
1222 |
+
"top": null,
|
1223 |
+
"visibility": null,
|
1224 |
+
"width": null
|
1225 |
+
}
|
1226 |
+
},
|
1227 |
+
"b40ee74dabec45ce842bcfb983d3fa75": {
|
1228 |
+
"model_module": "@jupyter-widgets/base",
|
1229 |
+
"model_module_version": "1.2.0",
|
1230 |
+
"model_name": "LayoutModel",
|
1231 |
+
"state": {
|
1232 |
+
"_model_module": "@jupyter-widgets/base",
|
1233 |
+
"_model_module_version": "1.2.0",
|
1234 |
+
"_model_name": "LayoutModel",
|
1235 |
+
"_view_count": null,
|
1236 |
+
"_view_module": "@jupyter-widgets/base",
|
1237 |
+
"_view_module_version": "1.2.0",
|
1238 |
+
"_view_name": "LayoutView",
|
1239 |
+
"align_content": null,
|
1240 |
+
"align_items": null,
|
1241 |
+
"align_self": null,
|
1242 |
+
"border": null,
|
1243 |
+
"bottom": null,
|
1244 |
+
"display": null,
|
1245 |
+
"flex": null,
|
1246 |
+
"flex_flow": null,
|
1247 |
+
"grid_area": null,
|
1248 |
+
"grid_auto_columns": null,
|
1249 |
+
"grid_auto_flow": null,
|
1250 |
+
"grid_auto_rows": null,
|
1251 |
+
"grid_column": null,
|
1252 |
+
"grid_gap": null,
|
1253 |
+
"grid_row": null,
|
1254 |
+
"grid_template_areas": null,
|
1255 |
+
"grid_template_columns": null,
|
1256 |
+
"grid_template_rows": null,
|
1257 |
+
"height": null,
|
1258 |
+
"justify_content": null,
|
1259 |
+
"justify_items": null,
|
1260 |
+
"left": null,
|
1261 |
+
"margin": null,
|
1262 |
+
"max_height": null,
|
1263 |
+
"max_width": null,
|
1264 |
+
"min_height": null,
|
1265 |
+
"min_width": null,
|
1266 |
+
"object_fit": null,
|
1267 |
+
"object_position": null,
|
1268 |
+
"order": null,
|
1269 |
+
"overflow": null,
|
1270 |
+
"overflow_x": null,
|
1271 |
+
"overflow_y": null,
|
1272 |
+
"padding": null,
|
1273 |
+
"right": null,
|
1274 |
+
"top": null,
|
1275 |
+
"visibility": null,
|
1276 |
+
"width": null
|
1277 |
+
}
|
1278 |
+
},
|
1279 |
+
"c6f3cd2404ef4a3096a61c1fcdbddd8f": {
|
1280 |
+
"model_module": "@jupyter-widgets/controls",
|
1281 |
+
"model_module_version": "1.5.0",
|
1282 |
+
"model_name": "ProgressStyleModel",
|
1283 |
+
"state": {
|
1284 |
+
"_model_module": "@jupyter-widgets/controls",
|
1285 |
+
"_model_module_version": "1.5.0",
|
1286 |
+
"_model_name": "ProgressStyleModel",
|
1287 |
+
"_view_count": null,
|
1288 |
+
"_view_module": "@jupyter-widgets/base",
|
1289 |
+
"_view_module_version": "1.2.0",
|
1290 |
+
"_view_name": "StyleView",
|
1291 |
+
"bar_color": null,
|
1292 |
+
"description_width": ""
|
1293 |
+
}
|
1294 |
+
},
|
1295 |
+
"d4b409c70f3f4398ad88ede8f438e32a": {
|
1296 |
+
"model_module": "@jupyter-widgets/base",
|
1297 |
+
"model_module_version": "1.2.0",
|
1298 |
+
"model_name": "LayoutModel",
|
1299 |
+
"state": {
|
1300 |
+
"_model_module": "@jupyter-widgets/base",
|
1301 |
+
"_model_module_version": "1.2.0",
|
1302 |
+
"_model_name": "LayoutModel",
|
1303 |
+
"_view_count": null,
|
1304 |
+
"_view_module": "@jupyter-widgets/base",
|
1305 |
+
"_view_module_version": "1.2.0",
|
1306 |
+
"_view_name": "LayoutView",
|
1307 |
+
"align_content": null,
|
1308 |
+
"align_items": null,
|
1309 |
+
"align_self": null,
|
1310 |
+
"border": null,
|
1311 |
+
"bottom": null,
|
1312 |
+
"display": null,
|
1313 |
+
"flex": null,
|
1314 |
+
"flex_flow": null,
|
1315 |
+
"grid_area": null,
|
1316 |
+
"grid_auto_columns": null,
|
1317 |
+
"grid_auto_flow": null,
|
1318 |
+
"grid_auto_rows": null,
|
1319 |
+
"grid_column": null,
|
1320 |
+
"grid_gap": null,
|
1321 |
+
"grid_row": null,
|
1322 |
+
"grid_template_areas": null,
|
1323 |
+
"grid_template_columns": null,
|
1324 |
+
"grid_template_rows": null,
|
1325 |
+
"height": null,
|
1326 |
+
"justify_content": null,
|
1327 |
+
"justify_items": null,
|
1328 |
+
"left": null,
|
1329 |
+
"margin": null,
|
1330 |
+
"max_height": null,
|
1331 |
+
"max_width": null,
|
1332 |
+
"min_height": null,
|
1333 |
+
"min_width": null,
|
1334 |
+
"object_fit": null,
|
1335 |
+
"object_position": null,
|
1336 |
+
"order": null,
|
1337 |
+
"overflow": null,
|
1338 |
+
"overflow_x": null,
|
1339 |
+
"overflow_y": null,
|
1340 |
+
"padding": null,
|
1341 |
+
"right": null,
|
1342 |
+
"top": null,
|
1343 |
+
"visibility": null,
|
1344 |
+
"width": null
|
1345 |
+
}
|
1346 |
+
},
|
1347 |
+
"d7213ef5bbb7409cbe40437bde51b5c9": {
|
1348 |
+
"model_module": "@jupyter-widgets/base",
|
1349 |
+
"model_module_version": "1.2.0",
|
1350 |
+
"model_name": "LayoutModel",
|
1351 |
+
"state": {
|
1352 |
+
"_model_module": "@jupyter-widgets/base",
|
1353 |
+
"_model_module_version": "1.2.0",
|
1354 |
+
"_model_name": "LayoutModel",
|
1355 |
+
"_view_count": null,
|
1356 |
+
"_view_module": "@jupyter-widgets/base",
|
1357 |
+
"_view_module_version": "1.2.0",
|
1358 |
+
"_view_name": "LayoutView",
|
1359 |
+
"align_content": null,
|
1360 |
+
"align_items": null,
|
1361 |
+
"align_self": null,
|
1362 |
+
"border": null,
|
1363 |
+
"bottom": null,
|
1364 |
+
"display": null,
|
1365 |
+
"flex": null,
|
1366 |
+
"flex_flow": null,
|
1367 |
+
"grid_area": null,
|
1368 |
+
"grid_auto_columns": null,
|
1369 |
+
"grid_auto_flow": null,
|
1370 |
+
"grid_auto_rows": null,
|
1371 |
+
"grid_column": null,
|
1372 |
+
"grid_gap": null,
|
1373 |
+
"grid_row": null,
|
1374 |
+
"grid_template_areas": null,
|
1375 |
+
"grid_template_columns": null,
|
1376 |
+
"grid_template_rows": null,
|
1377 |
+
"height": null,
|
1378 |
+
"justify_content": null,
|
1379 |
+
"justify_items": null,
|
1380 |
+
"left": null,
|
1381 |
+
"margin": null,
|
1382 |
+
"max_height": null,
|
1383 |
+
"max_width": null,
|
1384 |
+
"min_height": null,
|
1385 |
+
"min_width": null,
|
1386 |
+
"object_fit": null,
|
1387 |
+
"object_position": null,
|
1388 |
+
"order": null,
|
1389 |
+
"overflow": null,
|
1390 |
+
"overflow_x": null,
|
1391 |
+
"overflow_y": null,
|
1392 |
+
"padding": null,
|
1393 |
+
"right": null,
|
1394 |
+
"top": null,
|
1395 |
+
"visibility": null,
|
1396 |
+
"width": null
|
1397 |
+
}
|
1398 |
+
},
|
1399 |
+
"e8a00080ca684fcc97189f5f3ea325e3": {
|
1400 |
+
"model_module": "@jupyter-widgets/controls",
|
1401 |
+
"model_module_version": "1.5.0",
|
1402 |
+
"model_name": "ProgressStyleModel",
|
1403 |
+
"state": {
|
1404 |
+
"_model_module": "@jupyter-widgets/controls",
|
1405 |
+
"_model_module_version": "1.5.0",
|
1406 |
+
"_model_name": "ProgressStyleModel",
|
1407 |
+
"_view_count": null,
|
1408 |
+
"_view_module": "@jupyter-widgets/base",
|
1409 |
+
"_view_module_version": "1.2.0",
|
1410 |
+
"_view_name": "StyleView",
|
1411 |
+
"bar_color": null,
|
1412 |
+
"description_width": ""
|
1413 |
+
}
|
1414 |
+
}
|
1415 |
+
}
|
1416 |
+
}
|
1417 |
+
},
|
1418 |
+
"nbformat": 4,
|
1419 |
+
"nbformat_minor": 0
|
1420 |
+
}
|
notebooks/06-Evaluate_RAG.ipynb
ADDED
@@ -0,0 +1,1491 @@
|
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"colab_type": "text",
|
7 |
+
"id": "view-in-github"
|
8 |
+
},
|
9 |
+
"source": [
|
10 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/05-Improve_Prompts_%2B_Add_Source.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "markdown",
|
15 |
+
"metadata": {
|
16 |
+
"id": "5BGJ3fxhOk2V"
|
17 |
+
},
|
18 |
+
"source": [
|
19 |
+
"# Install Packages and Setup Variables"
|
20 |
+
]
|
21 |
+
},
|
22 |
+
{
|
23 |
+
"cell_type": "code",
|
24 |
+
"execution_count": 1,
|
25 |
+
"metadata": {
|
26 |
+
"colab": {
|
27 |
+
"base_uri": "https://localhost:8080/"
|
28 |
+
},
|
29 |
+
"id": "QPJzr-I9XQ7l",
|
30 |
+
"outputId": "809b17a0-5b45-4e3c-9d3f-72ad8b0e0d9d"
|
31 |
+
},
|
32 |
+
"outputs": [],
|
33 |
+
"source": [
|
34 |
+
"!pip install -q llama-index==0.10.9 openai==1.12.0 cohere==4.47 tiktoken==0.6.0 chromadb==0.4.22"
|
35 |
+
]
|
36 |
+
},
|
37 |
+
{
|
38 |
+
"cell_type": "code",
|
39 |
+
"execution_count": 1,
|
40 |
+
"metadata": {
|
41 |
+
"id": "riuXwpSPcvWC"
|
42 |
+
},
|
43 |
+
"outputs": [],
|
44 |
+
"source": [
|
45 |
+
"import os\n",
|
46 |
+
"\n",
|
47 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
48 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
|
49 |
+
]
|
50 |
+
},
|
51 |
+
{
|
52 |
+
"cell_type": "code",
|
53 |
+
"execution_count": 2,
|
54 |
+
"metadata": {
|
55 |
+
"id": "km-KQOrgr3VB"
|
56 |
+
},
|
57 |
+
"outputs": [],
|
58 |
+
"source": [
|
59 |
+
"# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
|
60 |
+
"\n",
|
61 |
+
"import nest_asyncio\n",
|
62 |
+
"\n",
|
63 |
+
"nest_asyncio.apply()"
|
64 |
+
]
|
65 |
+
},
|
66 |
+
{
|
67 |
+
"cell_type": "markdown",
|
68 |
+
"metadata": {
|
69 |
+
"id": "Bkgi2OrYzF7q"
|
70 |
+
},
|
71 |
+
"source": [
|
72 |
+
"# Load a Model"
|
73 |
+
]
|
74 |
+
},
|
75 |
+
{
|
76 |
+
"cell_type": "code",
|
77 |
+
"execution_count": 3,
|
78 |
+
"metadata": {
|
79 |
+
"id": "9oGT6crooSSj"
|
80 |
+
},
|
81 |
+
"outputs": [],
|
82 |
+
"source": [
|
83 |
+
"from llama_index.llms.openai import OpenAI\n",
|
84 |
+
"\n",
|
85 |
+
"llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
86 |
+
]
|
87 |
+
},
|
88 |
+
{
|
89 |
+
"cell_type": "markdown",
|
90 |
+
"metadata": {
|
91 |
+
"id": "0BwVuJXlzHVL"
|
92 |
+
},
|
93 |
+
"source": [
|
94 |
+
"# Create a VectoreStore"
|
95 |
+
]
|
96 |
+
},
|
97 |
+
{
|
98 |
+
"cell_type": "code",
|
99 |
+
"execution_count": 4,
|
100 |
+
"metadata": {
|
101 |
+
"id": "SQP87lHczHKc"
|
102 |
+
},
|
103 |
+
"outputs": [],
|
104 |
+
"source": [
|
105 |
+
"import chromadb\n",
|
106 |
+
"\n",
|
107 |
+
"# create client and a new collection\n",
|
108 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
109 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
110 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
111 |
+
]
|
112 |
+
},
|
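A quick persistence check (a minimal sketch, not in the original notebook): because PersistentClient writes to the ./mini-llama-articles directory, a fresh client opened on the same path can retrieve the existing collection instead of recreating it, which is exactly what the later "Load Indexes" cells rely on.

# Hypothetical sanity check: reopen the on-disk store and count its records
# (it is still empty at this point; vectors are added during ingestion below).
reopened = chromadb.PersistentClient(path="./mini-llama-articles")
print(reopened.get_collection("mini-llama-articles").count())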
113 |
+
{
|
114 |
+
"cell_type": "code",
|
115 |
+
"execution_count": 5,
|
116 |
+
"metadata": {
|
117 |
+
"id": "zAaGcYMJzHAN"
|
118 |
+
},
|
119 |
+
"outputs": [],
|
120 |
+
"source": [
|
121 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
122 |
+
"\n",
|
123 |
+
"# Define a storage context object using the created vector database.\n",
|
124 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
125 |
+
]
|
126 |
+
},
|
127 |
+
{
|
128 |
+
"cell_type": "markdown",
|
129 |
+
"metadata": {
|
130 |
+
"id": "I9JbAzFcjkpn"
|
131 |
+
},
|
132 |
+
"source": [
|
133 |
+
"# Load the Dataset (CSV)"
|
134 |
+
]
|
135 |
+
},
|
136 |
+
{
|
137 |
+
"cell_type": "markdown",
|
138 |
+
"metadata": {
|
139 |
+
"id": "_Tif8-JoRH68"
|
140 |
+
},
|
141 |
+
"source": [
|
142 |
+
"## Download"
|
143 |
+
]
|
144 |
+
},
|
145 |
+
{
|
146 |
+
"cell_type": "markdown",
|
147 |
+
"metadata": {
|
148 |
+
"id": "4fQaa1LN1mXL"
|
149 |
+
},
|
150 |
+
"source": [
|
151 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model."
|
152 |
+
]
|
153 |
+
},
|
154 |
+
{
|
155 |
+
"cell_type": "code",
|
156 |
+
"execution_count": 6,
|
157 |
+
"metadata": {
|
158 |
+
"colab": {
|
159 |
+
"base_uri": "https://localhost:8080/"
|
160 |
+
},
|
161 |
+
"id": "fQtpDvUzKNzI",
|
162 |
+
"outputId": "f170fb33-8edc-4993-8025-b2bc5c0d0e99"
|
163 |
+
},
|
164 |
+
"outputs": [
|
165 |
+
{
|
166 |
+
"name": "stdout",
|
167 |
+
"output_type": "stream",
|
168 |
+
"text": [
|
169 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
170 |
+
" Dload Upload Total Spent Left Speed\n",
|
171 |
+
"100 169k 100 169k 0 0 743k 0 --:--:-- --:--:-- --:--:-- 743k\n"
|
172 |
+
]
|
173 |
+
}
|
174 |
+
],
|
175 |
+
"source": [
|
176 |
+
"!curl -o ./mini-dataset.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
177 |
+
]
|
178 |
+
},
|
179 |
+
{
|
180 |
+
"cell_type": "markdown",
|
181 |
+
"metadata": {
|
182 |
+
"id": "zk-4alIxROo8"
|
183 |
+
},
|
184 |
+
"source": [
|
185 |
+
"## Load the Articles"
|
186 |
+
]
|
187 |
+
},
|
188 |
+
{
|
189 |
+
"cell_type": "code",
|
190 |
+
"execution_count": 7,
|
191 |
+
"metadata": {
|
192 |
+
"colab": {
|
193 |
+
"base_uri": "https://localhost:8080/"
|
194 |
+
},
|
195 |
+
"id": "_WER5lt0N7c5",
|
196 |
+
"outputId": "521f21f1-c84d-4e1b-9983-8ea17e80ea6c"
|
197 |
+
},
|
198 |
+
"outputs": [
|
199 |
+
{
|
200 |
+
"data": {
|
201 |
+
"text/plain": [
|
202 |
+
"14"
|
203 |
+
]
|
204 |
+
},
|
205 |
+
"execution_count": 7,
|
206 |
+
"metadata": {},
|
207 |
+
"output_type": "execute_result"
|
208 |
+
}
|
209 |
+
],
|
210 |
+
"source": [
|
211 |
+
"import csv\n",
|
212 |
+
"\n",
|
213 |
+
"rows = []\n",
|
214 |
+
"\n",
|
215 |
+
"# Load the file as a JSON\n",
|
216 |
+
"with open(\"./mini-dataset.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
217 |
+
" csv_reader = csv.reader(file)\n",
|
218 |
+
"\n",
|
219 |
+
" for idx, row in enumerate(csv_reader):\n",
|
220 |
+
" if idx == 0:\n",
|
221 |
+
" continue\n",
|
222 |
+
" # Skip header row\n",
|
223 |
+
" rows.append(row)\n",
|
224 |
+
"\n",
|
225 |
+
"# The number of characters in the dataset.\n",
|
226 |
+
"len(rows)"
|
227 |
+
]
|
228 |
+
},
|
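For reference, the column layout these rows carry (inferred from the Document construction in the next cell) is: row[0] = title, row[1] = article text, row[2] = url, row[3] = source_name. A minimal peek, assuming the rows list loaded above:

# Print the title, URL, and source of the first article (the text column is omitted for brevity).
print(rows[0][0], rows[0][2], rows[0][3])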
229 |
+
{
|
230 |
+
"cell_type": "markdown",
|
231 |
+
"metadata": {
|
232 |
+
"id": "wxEStggPdxYs"
|
233 |
+
},
|
234 |
+
"source": [
|
235 |
+
"# Convert to Document obj"
|
236 |
+
]
|
237 |
+
},
|
238 |
+
{
|
239 |
+
"cell_type": "code",
|
240 |
+
"execution_count": 8,
|
241 |
+
"metadata": {
|
242 |
+
"id": "lFvW_886dxKX"
|
243 |
+
},
|
244 |
+
"outputs": [],
|
245 |
+
"source": [
|
246 |
+
"from llama_index.core import Document\n",
|
247 |
+
"from llama_index.core.schema import TextNode\n",
|
248 |
+
"\n",
|
249 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
250 |
+
"documents = [\n",
|
251 |
+
" Document(\n",
|
252 |
+
" text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}, \n",
|
253 |
+
" )\n",
|
254 |
+
" for row in rows\n",
|
255 |
+
"]\n",
|
256 |
+
"# By default, the node/chunks ids are set to random uuids. To ensure same id's per run, we manually set them.\n",
|
257 |
+
"for idx, doc in enumerate(documents):\n",
|
258 |
+
" doc.id_ = f\"doc_{idx}\""
|
259 |
+
]
|
260 |
+
},
|
261 |
+
{
|
262 |
+
"cell_type": "code",
|
263 |
+
"execution_count": 9,
|
264 |
+
"metadata": {
|
265 |
+
"colab": {
|
266 |
+
"base_uri": "https://localhost:8080/"
|
267 |
+
},
|
268 |
+
"id": "Njoc3XEVkKkf",
|
269 |
+
"outputId": "8dec6077-4301-44ed-ad9b-95d943e00af6"
|
270 |
+
},
|
271 |
+
"outputs": [
|
272 |
+
{
|
273 |
+
"data": {
|
274 |
+
"text/plain": [
|
275 |
+
"Document(id_='doc_0', embedding=None, metadata={'title': \"Beyond GPT-4: What's New?\", 'url': 'https://pub.towardsai.net/beyond-gpt-4-whats-new-cbd61a448eb9#dda8', 'source_name': 'towards_ai'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='LLM Variants and Meta\\'s Open Source Before shedding light on four major trends, I\\'d share the latest Meta\\'s Llama 2 and Code Llama. Meta\\'s Llama 2 represents a sophisticated evolution in LLMs. This suite spans models pretrained and fine-tuned across a parameter spectrum of 7 billion to 70 billion. A specialized derivative, Llama 2-Chat, has been engineered explicitly for dialogue-centric applications. Benchmarking revealed Llama 2\\'s superior performance over most extant open-source chat models. Human-centric evaluations, focusing on safety and utility metrics, positioned Llama 2-Chat as a potential contender against proprietary, closed-source counterparts. The development trajectory of Llama 2 emphasized rigorous fine-tuning methodologies. Meta\\'s transparent delineation of these processes aims to catalyze community-driven advancements in LLMs, underscoring a commitment to collaborative and responsible AI development. Code Llama is built on top of Llama 2 and is available in three models: Code Llama, the foundational code model;Codel Llama - Python specialized for Python;and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions. Based on its benchmark testing, Code Llama outperformed state-of-the-art publicly available LLMs (except GPT-4) on code tasks. Llama 2, Llama 2-Chat, and Code Llama are key steps in LLM development but still have a way to go compared to GPT-4. Meta\\'s open access and commitment to improving these models promise transparent and faster LLM progress in the future. Please refer to the LLM and Llama variants below: From LLMs to Multimodal LLMs, like OpenAI\\'s ChatGPT (GPT-3.5), primarily focus on understanding and generating human language. They\\'ve been instrumental in tasks like text generation, translation, and even creative writing. However, their scope is limited to text. Enter multimodal models like GPT-4. These are a new breed of AI models that can understand and generate not just text, but also images, sounds, and potentially other types of data. The term \"multimodal\" refers to their ability to process multiple modes or types of data simultaneously. This is a game-changer. Imagine an AI that can not only read a description of a dress but also visualize it or even design it! Multimodal AI models are moving us towards more holistic AI systems. These systems can potentially understand our world in a more comprehensive manner, bridging the gap between different forms of data and providing richer, more integrated solutions. As we stand on the cusp of this new era, it\\'s exciting to envision the myriad of applications and innovations that Multimodal models will bring to the table. The future of AI looks more integrated and versatile than ever before. From Connections to Vector DB The AI landscape is witnessing a fascinating transition: from Language Model (LLM) connections or integrations, e.g., LangChain and LlamaIndex, to the rise of Vector Databases (Vector DB) such as Weaviate, Milvus, Pinecone, Chroma, and Vespa.ai. But what\\'s driving this shift, and why does it matter? LLM connections, like the LlamaIndex, primarily focus on linking and understanding vast amounts of external data. 
They\\'ve been pivotal in creating semantic connections, enabling more intuitive search experiences, and enhancing data accessibility. However, as the volume and variety of data grow, the need for more advanced storage and retrieval mechanisms becomes evident. This is where Vector DBs come into play. Unlike traditional databases that store data in rows and columns, Vector DBs store data in high-dimensional space, allowing for more efficient and accurate similarity searches. Tools like Weaviate and Milvus are designed to handle massive datasets, making them ideal for tasks like image recognition, recommendation systems, and more. The rise of Vector DBs represents a broader trend in AI: the quest for more efficient, scalable, and versatile data handling solutions. As we navigate this evolution, it\\'s clear that the combination of LLMs and Vector DBs will redefine how we store, access, and understand data in the AI-driven future. From Agents to OS The AI realm is abuzz with innovations, and one of the most intriguing shifts we\\'re witnessing is the transition from LLM agents to using LLMs as Operating Systems (OS). Let\\'s delve into this evolution and its implications. LLM agents, like AutoGPT, AgentGPT, BabyAGI, and HuggingGPT, have been groundbreaking in automating tasks based on user requests. These agents leverage the power of Language Models (LLMs) to understand and execute commands, making them invaluable in tasks ranging from content generation to data analysis. Their adaptability and intelligence have made them a staple in many AI toolkits. However, the vision for AI doesn\\'t stop there. The concept of LLM as an OS is emerging as the next big thing. Imagine an operating system where the core is a language model, orchestrating everything around it. Such a system would not just execute tasks but would understand context, anticipate needs, and offer solutions in real time. It\\'s like turning the LLM into the brain of the digital ecosystem, making devices and applications more intuitive and responsive than ever. The move towards LLM as OS signifies a paradigm shift in how we perceive and utilize AI. It\\'s not just about automation anymore; it\\'s about creating a seamless, intelligent interface between humans and technology. As we stand on the brink of this transformation, the potential for LLM-driven OS to revolutionize our digital interactions is immense. From Fine-tuning to Plugins The world of LLMs is undergoing a transformative shift, moving from intricate fine-tuning processes to the more dynamic realm of plugins. Let\\'s unpack this evolution. Historically, fine-tuning has been the cornerstone of LLM optimization. There are two primary ways to fine-tune LLMs: feeding data into the LLM in real-time and directly fine-tuning on the LLM. From a technical standpoint, this involves three methods: Transfer Learning: Adapting a pre-trained model to new tasks.Sequential Fine-tuning: Refining models in stages for specific tasks.Task-specific Fine-tuning: Tailoring models for a particular function. Moreover, LLM techniques like In-context learning, Few-shot learning, and Zero-shot learning have further enhanced the model\\'s adaptability, allowing them to understand and generate content with minimal data. However, the future of LLMs is leaning towards plugins. With the introduction of tools like GPT-4 Plugins, the focus is on extending LLMs seamlessly. Instead of running LLMs as a service, they\\'re envisioned as platforms. 
This means integrating LLMs with various tools, enhancing their capabilities, and offering a more modular and scalable approach to AI applications. The journey from fine-tuning to plugins represents a move from static optimization to dynamic adaptability, ensuring that LLMs remain at the forefront of AI innovation. In a Nutshell The AI domain is witnessing rapid shifts, with LLMs playing a central role. Initially, the move was from LLMs to Multimodal models, expanding from text to include images and sounds. Simultaneously, the trend shifted from LLM connections, which linked external data, to Vector Databases for efficient high-dimensional storage. Another evolution saw LLM agents, which automated tasks, transitioning towards LLMs as Operating Systems. This change aims for more intuitive, context-aware devices and applications. Furthermore, the traditional fine-tuning processes of LLMs are now being replaced by dynamic plugins, turning LLMs into platforms integrated with various tools. Leading this LLM revolution are OpenAI\\'s GPT-4 and Meta\\'s LLaMA2. Their pioneering efforts are setting the stage for an AI future that\\'s more integrated, responsive, and attuned to human interactions. More Readings Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond: https://arxiv.org/abs/2304.13712Sparks of Artificial General Intelligence: Early experiments with GPT-4: https://arxiv.org/abs/2303.12712GPT4All-J: https://huggingface.co/nomic-ai/gpt4all-jIntroducing Code Llama, a state-of-the-art large language model for coding: https://ai.meta.com/blog/code-llama-large-language-model-coding/Llama 2: Open Foundation and Fine-Tuned Chat Models: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n')"
|
276 |
+
]
|
277 |
+
},
|
278 |
+
"execution_count": 9,
|
279 |
+
"metadata": {},
|
280 |
+
"output_type": "execute_result"
|
281 |
+
}
|
282 |
+
],
|
283 |
+
"source": [
|
284 |
+
"documents[0]"
|
285 |
+
]
|
286 |
+
},
|
287 |
+
{
|
288 |
+
"cell_type": "markdown",
|
289 |
+
"metadata": {
|
290 |
+
"id": "S17g2RYOjmf2"
|
291 |
+
},
|
292 |
+
"source": [
|
293 |
+
"# Transforming"
|
294 |
+
]
|
295 |
+
},
|
296 |
+
{
|
297 |
+
"cell_type": "code",
|
298 |
+
"execution_count": 10,
|
299 |
+
"metadata": {
|
300 |
+
"id": "STACTMUR1z9N"
|
301 |
+
},
|
302 |
+
"outputs": [],
|
303 |
+
"source": [
|
304 |
+
"from llama_index.core.node_parser import TokenTextSplitter\n",
|
305 |
+
"from llama_index.core.schema import BaseNode\n",
|
306 |
+
"import hashlib\n",
|
307 |
+
"\n",
|
308 |
+
"def deterministic_id_func(i: int, doc: BaseNode) -> str:\n",
|
309 |
+
" \"\"\"Deterministic ID function for the text splitter.\n",
|
310 |
+
" This will be used to generate a unique repeatable identifier for each node.\"\"\"\n",
|
311 |
+
" unique_identifier = doc.id_ + str(i)\n",
|
312 |
+
" hasher = hashlib.sha256()\n",
|
313 |
+
" hasher.update(unique_identifier.encode('utf-8')) \n",
|
314 |
+
" return hasher.hexdigest()\n",
|
315 |
+
"\n",
|
316 |
+
"text_splitter = TokenTextSplitter(separator=\" \", chunk_size=512, chunk_overlap=128, id_func=deterministic_id_func)"
|
317 |
+
]
|
318 |
+
},
|
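A small sanity check (a sketch, assuming the documents list built earlier): because the ID is a SHA-256 hash of doc.id_ plus the chunk index, the same (index, document) pair always yields the same node ID, unlike the random-UUID default.

# The function is pure, so repeated calls agree; this is what makes re-ingestion repeatable.
assert deterministic_id_func(0, documents[0]) == deterministic_id_func(0, documents[0])
print(deterministic_id_func(0, documents[0]))  # stable hex digest for doc_0, chunk 0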
319 |
+
{
|
320 |
+
"cell_type": "code",
|
321 |
+
"execution_count": 11,
|
322 |
+
"metadata": {
|
323 |
+
"colab": {
|
324 |
+
"base_uri": "https://localhost:8080/",
|
325 |
+
"height": 331,
|
326 |
+
"referenced_widgets": [
|
327 |
+
"76fea2dabfea42aa8bc7ae719f2a22ee",
|
328 |
+
"6c575687c8f1468a803b88eea3d26b7b",
|
329 |
+
"c266531dafcf4624af5fe9bcbc9d8df9",
|
330 |
+
"e20a27a2f7764cb4b9537e34a3659c9a",
|
331 |
+
"bba307f545cd4533be6f0489f95b9895",
|
332 |
+
"eb057e56f0f94e4993b8ae960c78b0ad",
|
333 |
+
"2073b65c0db045aa8e86d91a4fea2e2b",
|
334 |
+
"8141417665024172a4baa78c497acb69",
|
335 |
+
"01d27fdbe86a4ca2830b9bf3ccbf1ae9",
|
336 |
+
"e4fe85a095e64d52b6a53c2a4bba8aeb",
|
337 |
+
"70e17db8fc2f490f85b7af8aa664f0c7",
|
338 |
+
"c0a70bcdf3fb4bbfb2675b8012b2ef24",
|
339 |
+
"665b9b5e85a34be8a20d40c51e57cfe0",
|
340 |
+
"b604cef3deca4847afcc459e5c8a9e0f",
|
341 |
+
"076728d713254b49935c7938d18014f2",
|
342 |
+
"be591abb84a24c4b9903087501ebb0e5",
|
343 |
+
"85f23ab21c3b404aaa146cfcaefc85d8",
|
344 |
+
"10340f8e7c8e482c8d35047a3e43ee7f",
|
345 |
+
"1095efa793804a3fb625855e715a5317",
|
346 |
+
"b43a5a6a65034a16927700e442dde52a",
|
347 |
+
"121dbf44a222434cbc57ebe6beb83e2a",
|
348 |
+
"2af0821ebb7e47988d134d4ec2776e87"
|
349 |
+
]
|
350 |
+
},
|
351 |
+
"id": "CtdsIUQ81_hT",
|
352 |
+
"outputId": "325e8cd3-ce27-4ab0-e542-cbfdb7a0debb"
|
353 |
+
},
|
354 |
+
"outputs": [
|
355 |
+
{
|
356 |
+
"name": "stderr",
|
357 |
+
"output_type": "stream",
|
358 |
+
"text": [
|
359 |
+
"/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
360 |
+
" from .autonotebook import tqdm as notebook_tqdm\n",
|
361 |
+
"Parsing nodes: 0%| | 0/14 [00:00<?, ?it/s]"
|
362 |
+
]
|
363 |
+
},
|
364 |
+
{
|
365 |
+
"name": "stderr",
|
366 |
+
"output_type": "stream",
|
367 |
+
"text": [
|
368 |
+
"Parsing nodes: 100%|██████████| 14/14 [00:00<00:00, 38.11it/s]\n",
|
369 |
+
"Generating embeddings: 100%|██████████| 108/108 [00:01<00:00, 75.25it/s]\n"
|
370 |
+
]
|
371 |
+
}
|
372 |
+
],
|
373 |
+
"source": [
|
374 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
375 |
+
"from llama_index.core.ingestion import IngestionPipeline\n",
|
376 |
+
"\n",
|
377 |
+
"pipeline = IngestionPipeline(\n",
|
378 |
+
" transformations=[\n",
|
379 |
+
" text_splitter,\n",
|
380 |
+
" OpenAIEmbedding(),\n",
|
381 |
+
" ],\n",
|
382 |
+
" vector_store=vector_store\n",
|
383 |
+
")\n",
|
384 |
+
"\n",
|
385 |
+
"nodes = pipeline.run(documents=documents, show_progress=True)"
|
386 |
+
]
|
387 |
+
},
|
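The run above reports 108 generated embeddings, so a quick cross-check (a sketch using only names already defined in this notebook) is that the number of returned nodes matches the number of vectors now stored in Chroma:

# Each 512-token chunk became one embedded node persisted in the collection.
print(len(nodes))                 # expected: 108
print(chroma_collection.count())  # expected: 108, one vector per node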
388 |
+
{
|
389 |
+
"cell_type": "code",
|
390 |
+
"execution_count": 12,
|
391 |
+
"metadata": {},
|
392 |
+
"outputs": [
|
393 |
+
{
|
394 |
+
"data": {
|
395 |
+
"text/plain": [
|
396 |
+
"TextNode(id_='4ab5bd897f01474fc9b0049f95e31edae3ccd9e74d0f0acd3932b50a74d608b6', embedding=[-0.022741511464118958, 0.010871483013033867, -0.017776913940906525, -0.013163917697966099, 0.004405552521348, 0.013564742170274258, -0.02842337265610695, 0.025638697668910027, -0.03861978277564049, -0.02869058959186077, 0.02842337265610695, 0.028282733634114265, -0.028310861438512802, -0.014127302914857864, 0.008079776540398598, 0.01933801919221878, 0.014879727736115456, 0.0029657490085810423, 0.004658704623579979, -0.004802860785275698, -0.0027108388021588326, 8.63068999024108e-05, -0.006613602861762047, -0.01984432525932789, 0.004848569165915251, 0.026398155838251114, 0.025976235046982765, -0.028887486085295677, -0.017312802374362946, 0.001968962140381336, 0.01291076559573412, 0.014056982472538948, -0.029225021600723267, -0.00135805644094944, -0.013853054493665695, -0.017256546765565872, 0.01682056114077568, -0.0057416339404881, 0.035750724375247955, -0.010927739553153515, 0.014296070672571659, 0.007974295876920223, 0.006483510602265596, -0.030462656170129776, -0.027888940647244453, -8.394458563998342e-05, 0.022572742775082588, -0.02655285969376564, -0.025498058646917343, 0.0010969931026920676, -0.004036372061818838, 0.04545489326119423, -0.03299417719244957, 0.019858388230204582, 0.0024524126201868057, -0.004117240197956562, 0.006311226636171341, -0.0013053163420408964, 0.02604655548930168, 0.013824926689267159, -0.0024770244490355253, -0.004141852259635925, -0.017819106578826904, 0.021278854459524155, -0.010730843059718609, -0.00561505788937211, -0.030575167387723923, 0.033022306859493256, 0.008930649608373642, -0.008635304868221283, -0.0006724356790073216, 0.01545635238289833, 0.008473568595945835, -0.022910280153155327, 0.028831230476498604, 0.007833655923604965, -0.018578562885522842, -0.02040688507258892, -0.024935496971011162, 0.006392094772309065, 0.017003392800688744, 0.003584565594792366, -0.001132153207436204, 0.03456934913992882, 0.017383122816681862, -0.005024369340389967, 0.02116634137928486, -0.019155187532305717, -0.011982540600001812, -0.027087291702628136, -0.0009071289678104222, -0.0011550071649253368, 0.05105237290263176, 0.022249270230531693, -0.031644031405448914, 0.0063604507595300674, -0.01480940729379654, 0.014000726863741875, -0.020899126306176186, -0.021827351301908493, -0.025287097319960594, -0.019112994894385338, -0.018086323514580727, -0.019731812179088593, -0.015400095842778683, 0.010189378634095192, 0.01698932982981205, 0.021672645583748817, 0.0048942770808935165, 0.03127836808562279, -0.01703152246773243, 0.045567408204078674, 0.005386517383158207, -0.04013869911432266, -0.017354993149638176, 0.0065186708234250546, 0.027720171958208084, -0.010751939378678799, -0.009275217540562153, 0.022010182961821556, 0.02680601179599762, 0.02210863120853901, 0.00830480083823204, -0.00379376788623631, 0.021025702357292175, 9.32290349737741e-05, -0.016398640349507332, -0.003577533643692732, -0.020055284723639488, 0.0017799768829718232, 0.023543160408735275, 0.024190105497837067, 0.03380989283323288, 0.004201624542474747, -0.03794471174478531, 0.02441512979567051, -0.02019592560827732, -0.013227205723524094, -0.02594810724258423, -0.01770659349858761, -0.0036144517362117767, 0.02594810724258423, 0.003022005083039403, 0.013613966293632984, -0.020055284723639488, 0.017987875267863274, 0.021278854459524155, 0.014401551336050034, 0.026398155838251114, 0.0005067440215498209, 0.005400581751018763, -0.03347235545516014, -0.021967990323901176, 0.011806740425527096, 0.002165858168154955, 
0.014893791638314724, 0.019225507974624634, 0.006919495295733213, -0.01608923263847828, -0.0027723689563572407, -0.014992239885032177, 0.014253878965973854, -0.013473326340317726, 0.006068622227758169, 0.0272701233625412, 0.03181280195713043, 0.02984383888542652, -0.018128514289855957, 0.0013457504101097584, -0.017903489992022514, -0.03108147159218788, 0.013234238140285015, -0.044245388358831406, 0.02099757455289364, -0.0010732600931078196, 0.011982540600001812, 0.003305043326690793, -0.005488481838256121, -0.014978175051510334, -0.020294373854994774, 0.0017544857691973448, 0.001155886217020452, 0.0035634697414934635, 0.007165615446865559, -0.02210863120853901, -0.011391852051019669, 0.0019619299564510584, 0.010646458715200424, 0.0017035037744790316, -0.010899611748754978, 0.02902812696993351, 0.01720028929412365, -0.002190470229834318, -0.023754119873046875, -0.618816614151001, -0.032122209668159485, -0.0021482782904058695, -0.03226285055279732, -0.0014064015122130513, -0.01592046394944191, -0.01878952421247959, -0.005463869776576757, -0.02334626391530037, 0.03850727155804634, -0.021067893132567406, -0.003493149532005191, 0.010449563153088093, -0.0165674090385437, 0.002985086990520358, -0.023149367421865463, 0.0019021580228582025, -0.023121239617466927, 0.019689619541168213, 0.007320319768041372, -0.011398883536458015, -0.0023627544287592173, 0.0028514789883047342, -0.007242967374622822, -0.01711590588092804, -0.0023170465137809515, -0.01265761349350214, 0.00934553798288107, 0.009514305740594864, 0.01250994112342596, -0.04587681591510773, 0.019436467438936234, 0.004739572759717703, -0.026116875931620598, 0.04058874770998955, -0.008860329166054726, -0.01150436419993639, 0.01831134781241417, -0.0053126816637814045, 0.013993694446980953, -0.02372599206864834, -0.015779824927449226, 0.013262365944683552, 0.013494421727955341, -0.01517507154494524, 0.029337534680962563, 0.02411978505551815, -0.006427254527807236, -0.021714838221669197, -0.014049950987100601, 0.0036566436756402254, -5.878318552277051e-05, 0.020772550255060196, -0.008543889038264751, 0.001970720011740923, 0.012439620681107044, 0.04013869911432266, -0.011293403804302216, 0.003962535876780748, 0.005804921966046095, -0.0010213990462943912, 0.010632394813001156, -0.032544128596782684, -0.02804364450275898, -0.02646847628057003, 0.017622210085392, 0.006578442640602589, 0.013332685455679893, 0.0073695434257388115, 0.0006047526258043945, 0.00031116630998440087, 0.027607660740613937, 0.013093597255647182, -0.016243936493992805, -0.002934104995802045, 0.01480940729379654, 0.01035111490637064, -0.00815009605139494, 0.014092142693698406, 0.03189718350768089, 0.015779824927449226, -0.01521726418286562, 0.004880213178694248, -0.009225993417203426, 0.03718525543808937, 0.01163094025105238, -0.002315288409590721, -0.011497331783175468, 0.0270591638982296, 0.011201987974345684, 0.018902035430073738, 0.012179436162114143, -0.038141608238220215, -0.032769154757261276, 0.015386031940579414, 0.021321045234799385, -0.01732686534523964, 0.012109116651117802, 0.018930163234472275, -0.03200969845056534, -0.015245391987264156, -0.016961202025413513, 0.032206594944000244, 0.008782977238297462, 0.03366925194859505, 0.02770610898733139, -0.03808535262942314, -0.008248544298112392, 0.0160470400005579, -0.03400678560137749, -0.01009796280413866, 0.0051861051470041275, -0.016061104834079742, -0.016764305531978607, 0.019183315336704254, -0.02514645829796791, -0.0013334443792700768, -0.016975264996290207, -0.003433377481997013, -0.008297768421471119, 
0.0320940800011158, -0.013698350638151169, 0.009036129340529442, -0.017144033685326576, 0.01900048367679119, 0.02634189836680889, -0.008965808898210526, -0.024808920919895172, -0.014049950987100601, 0.018887972459197044, -0.014739086851477623, -0.01082225888967514, 0.012481813319027424, -0.01566731184720993, 0.003106389194726944, 0.01310766115784645, 0.044245388358831406, 0.005010304972529411, 0.007320319768041372, -0.013803830370306969, -0.026876332238316536, -0.009127545170485973, 0.01860669068992138, -0.004475872498005629, -0.03915421664714813, -0.031193984672427177, -0.01916925236582756, 0.008107904344797134, 0.007063651457428932, -0.006574926897883415, -0.014795343391597271, -0.008993937633931637, 0.009148641489446163, 0.018986418843269348, 0.0015171555569395423, -0.011820804327726364, -0.005783826112747192, -0.030068863183259964, -0.0043879724107682705, -0.01642676815390587, 0.008368088863790035, 4.3263327825115994e-05, -0.006859723012894392, 0.0019759940914809704, 0.004169980529695749, -0.010442530736327171, -0.022896215319633484, 0.028029581531882286, -0.025498058646917343, -0.021096020936965942, -0.004581352695822716, -0.03518816456198692, 0.006782371085137129, 0.011961444281041622, -0.014007758349180222, 0.02420416846871376, -0.003804316045716405, -0.00504898140206933, -0.0074961199425160885, -0.001010851003229618, 0.003296253504231572, 0.031109599396586418, 0.0004518064670264721, -0.02177109383046627, 0.0158360805362463, 0.017622210085392, 0.03760717436671257, 0.014457806944847107, -0.021053830161690712, 0.010850387625396252, 0.016511153429746628, 0.01686275377869606, -0.022994663566350937, 0.03375363349914551, -0.017214354127645493, 0.011623907834291458, 0.0070601352490484715, -0.01805819384753704, 0.013156885281205177, 0.0377478152513504, 0.00894471351057291, 0.0156251210719347, -0.016722112894058228, -0.010238602757453918, 0.010533946566283703, -0.030153246596455574, 0.012306013144552708, -0.019014548510313034, -0.010393306612968445, -0.005608025938272476, 0.003994180355221033, -0.00656437873840332, -0.008740784600377083, -0.012207564897835255, 0.0011330321431159973, 0.031475264579057693, -0.005491997580975294, 0.007183195557445288, -0.02642628364264965, 0.010674587450921535, 0.003213627263903618, 0.016919009387493134, -0.01376867014914751, 0.012678708881139755, -0.010801163502037525, 0.004704413004219532, -0.019689619541168213, 0.020378757268190384, -0.007545343600213528, -0.03144713491201401, 0.004500484559684992, 0.00932444166392088, 0.0327128991484642, 0.004528612829744816, 0.023107176646590233, -0.017833169549703598, 0.022769639268517494, 0.0011602812446653843, 0.044414158910512924, -0.005952594336122274, -0.00727812759578228, 0.003642579773440957, -4.436207746039145e-05, -0.03068768046796322, 0.012629484757781029, -0.01033001858741045, 0.038141608238220215, -0.014471870847046375, -0.017312802374362946, -0.005414645653218031, -0.036482054740190506, 0.011680164374411106, -0.0024383484851568937, 0.00471496069803834, 0.029309406876564026, -0.009830745868384838, 0.004349296446889639, 0.0031169371213763952, 0.015287583693861961, 0.0036671918351203203, -0.013086565770208836, 0.0012965262867510319, -0.0029358630999922752, 0.014978175051510334, 0.021883606910705566, -0.005231813527643681, -0.00420514028519392, -0.011427012272179127, -0.007165615446865559, -0.0137897664681077, -0.020842868834733963, -0.01005577016621828, 0.024612026289105415, -0.040532488375902176, 0.042838986963033676, 0.020856933668255806, 0.004560256842523813, 0.014725022949278355, -0.003726963885128498, 
0.03170028701424599, -0.024851113557815552, -0.03752278909087181, 0.015076623298227787, -0.00843137688934803, -0.032037824392318726, -0.019577108323574066, -0.018705138936638832, 0.007657855749130249, -0.0017035037744790316, 0.00044235720997676253, -0.009092384949326515, -0.008635304868221283, -0.01237633265554905, 0.012460717000067234, 0.00033292159787379205, 0.008093840442597866, 0.015146943740546703, -0.0065995389595627785, 0.00830480083823204, -0.020983509719371796, 0.02028030902147293, 0.011834868229925632, -0.00966900959610939, -0.005361905321478844, 0.01197550818324089, -0.01579388789832592, -0.03364112228155136, 0.0001978850777959451, 0.0003425906179472804, -0.03347235545516014, 0.003646095748990774, -0.007545343600213528, 0.008157128468155861, -0.04098253697156906, 0.015822015702724457, 0.012481813319027424, 0.020603781566023827, 0.0033683315850794315, 0.019239572808146477, 0.013185014016926289, -0.008129000663757324, 0.001795798889361322, -0.010787099599838257, 0.01933801919221878, 0.04838021099567413, 0.01873326674103737, 0.0039273761212825775, 0.0011312741553410888, -0.005878758151084185, 0.003296253504231572, -0.024837050586938858, 0.0017369057750329375, 0.0009800860425457358, 0.010836322791874409, -0.0165674090385437, -0.019323956221342087, 0.018241027370095253, 0.001310590305365622, 0.04008243978023529, 0.0030817771330475807, 0.010301890783011913, -0.014239815063774586, -0.009514305740594864, -0.012974053621292114, 0.014570319093763828, -0.002651066752150655, 0.009929194115102291, 0.024358872324228287, 0.011729388497769833, -0.009739330038428307, 0.008143064565956593, 0.02847963012754917, -0.006339354440569878, -0.02168671041727066, 0.01212318055331707, 0.004612996708601713, 0.008768913336098194, 0.008614208549261093, -0.016792433336377144, 0.01146217156201601, -0.0003208353300578892, -0.0036918038967996836, 0.01391634251922369, 0.015090687200427055, 0.004380940459668636, 0.02403540164232254, 0.008192288689315319, 0.013262365944683552, 0.009619786404073238, -0.014950047247111797, -0.003923859912902117, 0.010154218412935734, -0.006958171259611845, -0.03935111314058304, 0.0036812557373195887, 0.004398520570248365, -0.04084189981222153, -0.001738663762807846, 0.028451502323150635, 0.00656437873840332, 0.0013360814191401005, -0.011019155383110046, -0.004669252783060074, -0.03513190895318985, -0.006300678476691246, -0.03051891177892685, 0.007559407968074083, -0.015315711498260498, -0.003642579773440957, -0.0036953198723495007, -0.003934408072382212, 0.0012437863042578101, -0.016511153429746628, -0.0004693864902947098, -0.01644083298742771, -0.010871483013033867, -0.05805625393986702, -0.013649126514792442, -0.0014090384356677532, -0.004268428310751915, 0.010885546915233135, -0.002598326653242111, 0.0035740176681429148, 0.021799223497509956, -0.008677496574819088, -0.02057565376162529, 0.002466476522386074, -0.019999029114842415, 0.0057416339404881, -0.023275943472981453, -0.003797283861786127, -0.020674102008342743, -0.012531036511063576, 0.022558679804205894, -0.008881425485014915, -0.014092142693698406, -0.020097477361559868, 0.0024207686074078083, 0.005583413876593113, 0.02420416846871376, 0.015990784391760826, 0.006757759023457766, 0.02330407127737999, -0.023191560059785843, -0.0009449259960092604, -0.018044130876660347, -0.019956836476922035, -0.035835109651088715, 0.0031257271766662598, 0.008550920523703098, 0.03538506105542183, 0.008515761233866215, 0.010147186927497387, -0.020645974203944206, 0.0007199017563834786, -0.014120270498096943, 0.01212318055331707, 
-0.0017773398431017995, 0.01248884480446577, -0.014106206595897675, 0.01186299603432417, -0.003447441617026925, -0.004848569165915251, -0.029900094494223595, 0.017003392800688744, -0.03018137440085411, 0.020392820239067078, 0.01030892226845026, 0.010140154510736465, 0.017186226323246956, 0.022657128050923347, 0.001765912864357233, -0.045398637652397156, 0.0003348993486724794, 0.001233238261193037, 0.014155430719256401, -0.003814863972365856, -0.011419979855418205, -0.0023838505148887634, -0.014570319093763828, -0.015231328085064888, 0.009099417366087437, -0.02487924136221409, 0.0063604507595300674, -0.015118815936148167, -0.004324684385210276, -0.009317409247159958, -0.01492191944271326, 0.004757152870297432, -0.02919689379632473, -0.009401793591678143, 0.029309406876564026, 0.017383122816681862, 0.031137729063630104, -0.013494421727955341, 0.010386275127530098, -0.03811347857117653, -0.016412705183029175, 0.0005243240157142282, -0.02361348085105419, -0.010744906961917877, -0.005970173981040716, 0.011722356081008911, 0.016539281234145164, 0.021785158663988113, 0.006036978214979172, 0.018283218145370483, 0.01575169712305069, -0.001937318011187017, -0.0064307707361876965, -0.009929194115102291, 0.00021964035113342106, -0.02001309208571911, -0.013466293923556805, 0.012650581076741219, -0.0034861175809055567, 0.009844809770584106, 0.004764184821397066, -0.0019654459320008755, 0.002165858168154955, -0.015118815936148167, -0.00407504802569747, -0.0183535385876894, -0.04098253697156906, -0.021335110068321228, 0.008550920523703098, -0.0065010907128453255, -0.002301224274560809, -0.04643937572836876, -0.017790978774428368, 0.01856449991464615, 0.008438408374786377, 0.014626574702560902, 0.011912220157682896, 0.03704461455345154, -0.028887486085295677, -0.0025860206224024296, 0.030378270894289017, 0.016975264996290207, -0.00828370451927185, -0.007063651457428932, -0.043907854706048965, 0.013909310102462769, 0.015203199349343777, 0.007179679349064827, 0.040448106825351715, 0.02629970759153366, -0.015639184042811394, 0.016876816749572754, 0.014141366817057133, 0.0032487872522324324, 0.010231570340692997, -0.004451260436326265, -0.010259699076414108, 0.0035828077234327793, -0.012263820506632328, -0.025118330493569374, -0.023768184706568718, -0.019239572808146477, 0.011047283187508583, 0.01329752616584301, 0.030631422996520996, -0.024921434000134468, -0.020730357617139816, 0.02372599206864834, 0.008958777412772179, 0.050827350467443466, 0.013311590068042278, 0.008396216668188572, 0.02378224954009056, 0.009549465961754322, -0.01113869994878769, 0.01109650731086731, 0.01238336507230997, -0.014106206595897675, 0.020645974203944206, 0.015822015702724457, 0.002637002617120743, -0.009788554161787033, 0.012446653097867966, 0.010315954685211182, -0.03935111314058304, -0.04860523343086243, 0.010034674778580666, 0.02129291743040085, 0.0055060614831745625, -0.03589136525988579, -0.0300969909876585, -0.02510426566004753, -0.0009765700669959188, -0.02535741776227951, 0.023163432255387306, 0.009992482140660286, -0.008185256272554398, 0.010998059064149857, 0.008881425485014915, 0.010119058191776276, -0.0005753060686402023, -0.004873181227594614, 0.021714838221669197, 0.004651672672480345, 0.0014406824484467506, -0.0032030793372541666, 0.010168282315135002, -0.006128394510596991, 0.03760717436671257, -0.008930649608373642, 0.011968476697802544, 0.010428466834127903, -0.0013633304042741656, 0.0061811343766748905, -0.008192288689315319, 0.004426648374646902, 0.03693210333585739, -0.03552570194005966, 
-0.011110571213066578, -0.008241512812674046, -0.016187680885195732, 0.016243936493992805, -0.015892336145043373, 0.014049950987100601, -0.004612996708601713, -0.01374757383018732, 0.0036777397617697716, 0.023571288213133812, 0.024021336808800697, -0.03181280195713043, 0.006944107357412577, 0.0028690588660538197, -0.03240348771214485, -0.027002908289432526, 0.005797890014946461, 0.03257225826382637, -0.0371289998292923, 0.007854752242565155, 0.008916584774851799, -0.0213913656771183, 0.021278854459524155, 0.021025702357292175, -0.003814863972365856, -0.029421918094158173, 0.03231910616159439, -0.03386614844202995, 0.02189766988158226, 0.0010591960744932294, -0.010400339029729366, -0.026651307940483093, -0.001455625519156456, -0.015273519791662693, -0.029253149405121803, 0.004468840546905994, -0.025413675233721733, -0.022094566375017166, -0.011448107659816742, 0.01690494641661644, 0.0065714106895029545, -0.010217506438493729, 0.01355067826807499, 0.003635547822341323, 0.0031116632744669914, -0.001038100104779005, -0.01575169712305069, -0.00142222351860255, 0.023191560059785843, 0.000530477031134069, 0.003885183949023485, 0.030575167387723923, -0.003380637615919113, 0.011926284059882164, -0.013958534225821495, -0.00555880181491375, -0.009486177936196327, -0.057606205344200134, -0.020674102008342743, 0.009493209421634674, 0.001775581855326891, -7.636320515302941e-05, 0.001283341320231557, -0.01648302562534809, -0.01020344253629446, -0.01263651717454195, -0.0020234601106494665, 0.010372210294008255, 0.0027477568946778774, 0.007390639744699001, 0.023360328748822212, -0.00031160583603195846, 0.008614208549261093, -0.01801600307226181, -0.02074442058801651, -0.019014548510313034, -0.003157371189445257, -0.03189718350768089, -0.018620755523443222, -0.03366925194859505, 0.05063045397400856, -0.006374514661729336, -0.03876042366027832, -0.02122259885072708, -0.014992239885032177, -0.03825411945581436, -0.020730357617139816, 0.002598326653242111, 0.018114451318979263, 0.012531036511063576, 0.016933074221014977, 0.0025719567202031612, 0.036003876477479935, 0.006339354440569878, 0.0050630453042685986, -0.027481084689497948, 0.012685741297900677, -0.000674193724989891, -0.012917797081172466, 0.01278418954461813, 0.01776285097002983, -0.02103976532816887, 0.018536372110247612, 0.012031764723360538, -0.02783268503844738, -0.024429192766547203, 0.02701697126030922, -0.01521726418286562, -0.009901066310703754, 0.022038310766220093, -0.008867361582815647, 0.007046071346849203, -0.012650581076741219, 0.020435012876987457, -0.03116585686802864, -0.009493209421634674, 0.026398155838251114, -0.006409674417227507, 0.016272064298391342, -0.014781279489398003, 0.0174112506210804, 0.0093314740806818, 0.008804073557257652, 0.016314256936311722, -0.012594325467944145, 0.00619871448725462, 0.004686832893639803, 0.043823469430208206, 0.01959117315709591, 0.01073787547647953, 0.029393790289759636, -0.01634238474071026, -0.0015250665601342916, -0.007678952068090439, 0.015090687200427055, 0.0007809923263266683, -0.00855795294046402, 0.04354218766093254, -0.016511153429746628, 0.00981668196618557, -0.010133122093975544, 0.002937620971351862, -0.02250242419540882, -0.017228417098522186, -0.016272064298391342, -0.0027917069382965565, -0.022685255855321884, 0.014246846549212933, 0.019872453063726425, -0.022164886817336082, -0.0031608871649950743, -0.012931860983371735, 0.02258680760860443, 0.0036707078106701374, -0.01404291857033968, -0.005818985868245363, -0.0012341173132881522, -0.003450957592576742, 0.019239572808146477, 
0.010126090608537197, -0.006184650585055351, 0.014324198476970196, 0.003595113754272461, -0.022136759012937546, 0.0158360805362463, 0.199258953332901, -0.031222112476825714, 0.013909310102462769, 0.02873278222978115, 0.01715809851884842, -0.016637729480862617, 0.04435790330171585, 0.007981328293681145, 0.001445077476091683, -0.004553224891424179, 0.006673374678939581, 0.005931498017162085, -0.016328321769833565, 0.00015118815645109862, 0.01912705972790718, -0.026327835395932198, -0.021588262170553207, -0.035919494926929474, -0.017861299216747284, -0.00420514028519392, 0.005949078127741814, 0.0009370149928145111, -0.00689488323405385, -0.022572742775082588, -0.0030677132308483124, 0.005235329270362854, 3.282519173808396e-05, -0.0031485813669860363, 0.01869107596576214, 0.0013018003664910793, -0.01660960167646408, 0.005207201465964317, -0.008368088863790035, 0.0019197380170226097, 0.00042521668365225196, -0.00966900959610939, 0.010379242710769176, -0.0004133501788601279, 0.006100266240537167, 0.024738602340221405, 0.02189766988158226, 0.022136759012937546, 0.0036812557373195887, -0.025301162153482437, 0.01545635238289833, 0.011363723315298557, -0.003892216132953763, 0.008593113161623478, 0.008009456098079681, 0.007341415621340275, -0.022558679804205894, 0.022657128050923347, 0.023233752697706223, 0.020842868834733963, -0.006497574504464865, 0.0011752241989597678, -0.01963336393237114, 0.015090687200427055, 0.00044389546383172274, -0.004852084908634424, -0.027115419507026672, -0.008501696400344372, 0.00033907461329363286, 0.02399320900440216, -0.010442530736327171, 0.012242725118994713, -0.007510183844715357, -0.023922888562083244, 0.007875848561525345, -0.02911251038312912, -0.011954412795603275, -0.014865663833916187, 0.00011613799870247021, -0.011574683710932732, -0.019830260425806046, -0.03887293487787247, 0.021841414272785187, 0.028015516698360443, 0.0007084747194312513, 0.04874587431550026, -0.003790251910686493, -0.03906983137130737, 0.004268428310751915, -0.012038796208798885, 0.005245877429842949, -0.023669736459851265, 0.009394762106239796, -0.015273519791662693, -0.021616389974951744, -0.011546555906534195, -0.016722112894058228, -0.0095424335449934, 0.004212172236293554, 0.025160521268844604, -0.00016404355119448155, 0.004493452608585358, 0.007671920116990805, 0.005734601989388466, -0.010660522617399693, -0.03116585686802864, -0.007249999325722456, 0.05923762917518616, 0.021714838221669197, 0.0031749513000249863, -0.0006869392236694694, 0.01933801919221878, -0.002934104995802045, 0.000356215110514313, -0.0023064983543008566, 0.0006966082146391273, 0.009640881791710854, -0.027903005480766296, 0.011201987974345684, 0.003617967711761594, -0.0031151792500168085, 0.011989572085440159, 0.010927739553153515, -0.009753393940627575, 0.016159553080797195, -0.009992482140660286, -0.007200775668025017, -0.022052375599741936, -0.005903370212763548, 0.011427012272179127, -0.00012185150262666866, -0.02714354731142521, 0.0069792671129107475, 0.0008552678627893329, -0.027860812842845917, 0.017186226323246956, 0.0003729161398950964, -0.03982928767800331, 0.009605721570551395, 0.003660159884020686, 0.0006271671736612916, -0.008593113161623478, 0.014654703438282013, -0.006374514661729336, -0.02860620617866516, 0.013628030195832253, -0.008782977238297462, 0.024597961455583572, 0.004169980529695749, -0.021757030859589577, 0.014324198476970196, -0.014106206595897675, 0.0022766124457120895, 0.01530164759606123, -0.013044373132288456, -0.020125605165958405, -0.01980213262140751, 0.007995392195880413, 
0.005274005234241486, 0.009443986229598522, -0.0011945621808990836, 0.024133849889039993, -0.011968476697802544, -0.0006983662024140358, 0.022980600595474243, 0.008607177063822746, -0.028578078374266624, 0.00297278119251132, 0.01558292843401432, 0.007042555138468742, -0.016032977029681206, -0.006543282885104418, -0.180806964635849, -0.014753151684999466, 0.011553588323295116, -0.04022308066487312, 0.018381666392087936, 0.005629121791571379, 0.012967021204531193, 0.008325896225869656, -0.011187923140823841, 0.001034584129229188, 0.021714838221669197, -0.0183535385876894, -0.0046270606108009815, -0.005984238348901272, -0.0009106449433602393, 0.00826260820031166, -0.008438408374786377, 0.009809650480747223, 0.011884092353284359, 0.0008056043297983706, 0.03127836808562279, -0.026876332238316536, 0.00981668196618557, -0.009465081617236137, 0.017523761838674545, 0.012334140948951244, 0.009190833196043968, 0.042276427149772644, -0.01736905798316002, -0.03099708817899227, -0.011265275999903679, -0.015034431591629982, 0.028999997302889824, 0.006212778389453888, 0.030968960374593735, 0.031193984672427177, 0.011490300297737122, -0.01967555657029152, -0.018578562885522842, -0.015653248876333237, 0.022375846281647682, 0.013424102216959, 0.023979144170880318, -0.008593113161623478, -0.032122209668159485, 0.007573471870273352, 0.007573471870273352, -0.021503878757357597, -0.0015022126026451588, -0.01291076559573412, 0.016398640349507332, 0.009718233719468117, 0.014654703438282013, -0.004286008421331644, 0.024865178391337395, 0.03085644729435444, -0.005695926025509834, 0.003632031846791506, -0.007123423274606466, -0.020224053412675858, -0.00035885212128050625, -0.0001596485381014645, 0.0007027612300589681, -0.0007317682611756027, 0.00857904925942421, -0.03496313840150833, -0.007819592021405697, 0.005207201465964317, -0.04025121033191681, 0.0018617239547893405, -0.03338797017931938, 0.003080019261687994, -0.028057709336280823, -0.013986662030220032, 0.027818620204925537, 0.038788553327322006, -0.030490783974528313, 0.01736905798316002, 0.04427351802587509, 0.008459504693746567, -0.019984964281320572, 0.0027477568946778774, -0.01874733157455921, 0.02129291743040085, -0.004099660087376833, 0.005516609642654657, 0.015934528782963753, 0.0254839938133955, -0.015245391987264156, -0.009183801710605621, 0.019619300961494446, -0.009844809770584106, 0.017397185787558556, -0.011827835813164711, -0.0007642912678420544, 0.01374757383018732, 0.010780067183077335, -0.03479437157511711, -0.0058717261999845505, -0.016468960791826248, 0.0074679916724562645, 0.0060123661532998085, -0.009289281442761421, -0.011012122966349125, 0.019956836476922035, 0.022136759012937546, -0.022952470928430557, 0.021025702357292175, 0.028324924409389496, -0.003278673393651843, -0.01950678788125515, -0.00892361719161272, 0.0023539643734693527, 0.003345477394759655, 0.0018441439606249332, -0.0009686590055935085, -0.018817652016878128, -0.028676524758338928, 0.03248787298798561, -0.0020093959756195545, 0.05136178061366081, -0.007967264391481876, 0.0026440348010510206, 0.02185547910630703, -0.009774490259587765, -0.03456934913992882, -0.10452375560998917, -0.03563821315765381, 0.018902035430073738, 0.03150339424610138, -0.016581473872065544, 0.03282541036605835, -0.005140397232025862, 0.04115130752325058, -0.00771411182358861, 0.03926672786474228, -0.005210717208683491, -0.004187560174614191, 0.023965081200003624, 0.016145488247275352, 0.014471870847046375, -0.011982540600001812, 0.000530477031134069, -0.015822015702724457, -0.027888940647244453, 
0.029478173702955246, 0.017045585438609123, -0.01020344253629446, 0.01061129942536354, -0.03217846527695656, 0.01365615800023079, -0.020772550255060196, -0.038225989788770676, 0.019408339634537697, 0.005414645653218031, -0.003378879511728883, 0.012291948311030865, -0.0156251210719347, 0.008986905217170715, -0.016792433336377144, 0.011180891655385494, -0.004261396359652281, 0.015245391987264156, -0.019816195592284203, 0.015526671893894672, -0.00015261652879416943, -0.010252666659653187, 0.023022791370749474, 0.01214427687227726, -0.008775944821536541, -0.02531522698700428, 0.004595416598021984, -0.009310377761721611, 0.019070804119110107, 0.003340203547850251, -0.028156157582998276, -0.040194954723119736, -0.0027407249435782433, -0.048295825719833374, 0.008958777412772179, 0.013030309230089188, -0.010344082489609718, -0.0016797707648947835, 0.02539961040019989, -0.011005091480910778, -0.009261153638362885, -0.000408515683375299, -0.018423859030008316, -0.014078078791499138, 0.0028409308288246393, -0.014366391114890575, 0.006553830578923225, -0.008642337284982204, -0.024612026289105415, 0.02594810724258423, -0.004859116859734058, 0.00039313314482569695, 0.023374391719698906, -0.011834868229925632, 0.0035458896309137344, -0.01686275377869606, -0.006156522314995527, -0.000893064949195832, -0.016665857285261154, 0.0112230833619833, -0.0014740845654159784, -0.01797381043434143, -0.011441076174378395, 0.016848688945174217, -0.013613966293632984, -0.005066561046987772, 0.011427012272179127, -0.014710959047079086, -0.008396216668188572, 0.01022453885525465, -0.054062072187662125, -0.007320319768041372, 0.032544128596782684, 0.007116391323506832, -0.018030066043138504, 0.023177495226264, 0.0007840688340365887, -0.011623907834291458, 0.004841537214815617, 0.006086202338337898, 0.0004056589095853269, -0.02120853401720524, -0.02198205515742302, -0.07212026417255402, 0.019577108323574066, 0.0016832867404446006, -0.010147186927497387, 0.014028854668140411, -0.012812317349016666, 0.007517215795814991, -0.010491754859685898, 0.009753393940627575, 0.026524731889367104, -0.028099901974201202, 0.011933316476643085, -0.022938407957553864, -0.03403491526842117, -0.008853296749293804, -0.009366633370518684, 0.0188739076256752, -0.00919786561280489, 0.005080625414848328, -0.0015373725909739733, 0.032937921583652496, -0.025849658995866776, 0.025723082944750786, 0.018550435081124306, -0.0004966355045326054, -0.002341658342629671, -0.043907854706048965, 0.01406401488929987, -0.009633850306272507, -0.01587827317416668, 0.0070882635191082954, -0.014317166991531849, 0.013388941995799541, 0.014795343391597271, 0.001852933899499476, -0.016497088596224785, 8.790009451331571e-05, 0.026187194511294365, 0.007826624438166618, 0.03428806737065315, -0.013206109404563904, -0.04238893836736679, 0.015104752033948898, -0.005713506136089563, -0.0024524126201868057, -0.00817822478711605, -0.02246023155748844, -0.00890252087265253, 0.04624247923493385, 0.010533946566283703, 0.04199514910578728, 0.0017738238675519824, -0.019858388230204582, -0.050517939031124115, -0.011778612621128559, 0.0018582079792395234, 0.024654217064380646, -0.009999514557421207, 0.014204654842615128, 0.012685741297900677, 0.0377478152513504, 0.008037583902478218, 0.020603781566023827, -0.015020367689430714, -0.0038640880957245827, -0.014260910451412201, -0.004500484559684992, -0.008255576714873314, -0.0011620392324402928, -0.03234723210334778, 0.010140154510736465, -0.004039888270199299, 0.014851599000394344, -0.0011708291713148355, 0.0030677132308483124, 
-0.010062802582979202, 0.0013018003664910793, 0.014401551336050034, -0.01665179245173931, 0.01720028929412365, 0.018831714987754822, -0.024949561804533005, -0.002348690526559949, -0.0004386214422993362, 0.025596506893634796, 0.0001995332131627947, -0.019155187532305717, 0.007693015970289707, -0.015132879838347435, -0.0011708291713148355, 0.0023609965573996305, 0.0018248058622702956, -0.007042555138468742, 0.0190848670899868, 0.0018142579356208444, -0.018072258681058884, -0.0017878878861665726, -0.008508728817105293, 0.027438892051577568, -0.009043161757290363, 0.00010284310701536015, -0.010491754859685898, 0.00427897647023201, -0.007091779261827469, -0.008192288689315319, -0.009528369642794132, -0.0448923334479332, -0.021967990323901176, 0.03974490612745285, 0.03853540122509003, 0.015639184042811394, -0.009296313859522343, -0.012474780902266502, 0.0040680160745978355, -0.012664644978940487, 0.017045585438609123, -0.004521580878645182, -0.015132879838347435, -0.0247526653110981, 0.03341609984636307, 0.028718717396259308, 0.015596992336213589, 0.05344325676560402, 0.0001705261820461601, 0.020097477361559868, 0.006620634812861681, 0.030743936076760292, -0.0026739207096397877, 0.025554314255714417, 0.020252181217074394, -0.005562317557632923, -0.010815227404236794, -0.015104752033948898, -0.010147186927497387, -0.0056748297065496445, -0.013213141821324825, -0.020927254110574722, 0.016708049923181534, 0.012221628800034523, 0.09467894583940506, 0.01810038648545742, -0.012566196732223034, 0.009633850306272507, -0.009451017715036869, 0.010920707136392593, 0.008691561408340931, 0.022769639268517494, 0.0076648881658911705, -0.010878515429794788, -0.002415494527667761, -0.011659068055450916, -0.012559165246784687, -0.015132879838347435, 0.020800678059458733, -0.003934408072382212, -0.01071677915751934, 0.015639184042811394, 0.005646701902151108, -0.0022396943531930447, 0.04199514910578728, -0.012594325467944145, 0.006195198278874159, 0.005344325676560402, -0.012052860110998154, 0.022178951650857925, 0.029421918094158173, 0.0042367842979729176, 0.0032259332947432995, -0.018114451318979263, 0.03265664353966713, 0.018466051667928696, -0.05015227571129799, -0.019197380170226097, -0.009493209421634674, -0.008550920523703098, 0.004859116859734058, -0.013311590068042278, -0.005752182099968195, 0.005586929619312286, 0.005833050236105919, 0.0020709261298179626, -0.014338262379169464, -0.026918523013591766, 0.024991754442453384, -0.013072501868009567, -0.015146943740546703, -0.02002715691924095, -0.028057709336280823], metadata={'title': \"Beyond GPT-4: What's New?\", 'url': 'https://pub.towardsai.net/beyond-gpt-4-whats-new-cbd61a448eb9#dda8', 'source_name': 'towards_ai'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='doc_0', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'title': \"Beyond GPT-4: What's New?\", 'url': 'https://pub.towardsai.net/beyond-gpt-4-whats-new-cbd61a448eb9#dda8', 'source_name': 'towards_ai'}, hash='3b095b0e25cdf965d950cdbd7feb8024030e7645998c1a33dc4427affca624ab'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='e470fa0d001e50b3ec3088022462a94ea7c87dd80106411b7d120f90b379e977', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='71418de3d50e604c2581574f1abf2248e5cc3ab7c74a3182c37cb1152d0cfd21')}, text='LLM Variants and Meta\\'s Open Source Before shedding light on four major trends, I\\'d share the latest Meta\\'s Llama 2 and Code Llama. 
Meta\\'s Llama 2 represents a sophisticated evolution in LLMs. This suite spans models pretrained and fine-tuned across a parameter spectrum of 7 billion to 70 billion. A specialized derivative, Llama 2-Chat, has been engineered explicitly for dialogue-centric applications. Benchmarking revealed Llama 2\\'s superior performance over most extant open-source chat models. Human-centric evaluations, focusing on safety and utility metrics, positioned Llama 2-Chat as a potential contender against proprietary, closed-source counterparts. The development trajectory of Llama 2 emphasized rigorous fine-tuning methodologies. Meta\\'s transparent delineation of these processes aims to catalyze community-driven advancements in LLMs, underscoring a commitment to collaborative and responsible AI development. Code Llama is built on top of Llama 2 and is available in three models: Code Llama, the foundational code model;Codel Llama - Python specialized for Python;and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions. Based on its benchmark testing, Code Llama outperformed state-of-the-art publicly available LLMs (except GPT-4) on code tasks. Llama 2, Llama 2-Chat, and Code Llama are key steps in LLM development but still have a way to go compared to GPT-4. Meta\\'s open access and commitment to improving these models promise transparent and faster LLM progress in the future. Please refer to the LLM and Llama variants below: From LLMs to Multimodal LLMs, like OpenAI\\'s ChatGPT (GPT-3.5), primarily focus on understanding and generating human language. They\\'ve been instrumental in tasks like text generation, translation, and even creative writing. However, their scope is limited to text. Enter multimodal models like GPT-4. These are a new breed of AI models that can understand and generate not just text, but also images, sounds, and potentially other types of data. The term \"multimodal\" refers to their ability to process multiple modes or', start_char_idx=0, end_char_idx=2117, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n')"
|
397 |
+
]
|
398 |
+
},
|
399 |
+
"execution_count": 12,
|
400 |
+
"metadata": {},
|
401 |
+
"output_type": "execute_result"
|
402 |
+
}
|
403 |
+
],
|
404 |
+
"source": [
|
405 |
+
"nodes[0]"
|
406 |
+
]
|
407 |
+
},
|
408 |
+
{
|
409 |
+
"cell_type": "markdown",
|
410 |
+
"metadata": {
|
411 |
+
"id": "EV0ll57p46Dc"
|
412 |
+
},
|
413 |
+
"source": [
|
414 |
+
"# Load Indexes"
|
415 |
+
]
|
416 |
+
},
|
417 |
+
{
|
418 |
+
"cell_type": "code",
|
419 |
+
"execution_count": 13,
|
420 |
+
"metadata": {
|
421 |
+
"id": "PS215gCGkGD-"
|
422 |
+
},
|
423 |
+
"outputs": [],
|
424 |
+
"source": [
|
425 |
+
"# Create your index\n",
|
426 |
+
"db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
427 |
+
"chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
|
428 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
429 |
+
]
|
430 |
+
},
|
431 |
+
{
|
432 |
+
"cell_type": "code",
|
433 |
+
"execution_count": 14,
|
434 |
+
"metadata": {
|
435 |
+
"id": "HbT3-kRO4Qpt"
|
436 |
+
},
|
437 |
+
"outputs": [],
|
438 |
+
"source": [
|
439 |
+
"# Create your index\n",
|
440 |
+
"from llama_index.core import VectorStoreIndex\n",
|
441 |
+
"\n",
|
442 |
+
"index = VectorStoreIndex.from_vector_store(vector_store)"
|
443 |
+
]
|
444 |
+
},
|
445 |
+
{
|
446 |
+
"cell_type": "code",
|
447 |
+
"execution_count": 15,
|
448 |
+
"metadata": {
|
449 |
+
"id": "sb61DWU84bHP"
|
450 |
+
},
|
451 |
+
"outputs": [],
|
452 |
+
"source": [
|
453 |
+
"query_engine = index.as_query_engine()"
|
454 |
+
]
|
455 |
+
},
|
456 |
+
{
|
457 |
+
"cell_type": "code",
|
458 |
+
"execution_count": 16,
|
459 |
+
"metadata": {
|
460 |
+
"id": "G32W2LMMCmnv"
|
461 |
+
},
|
462 |
+
"outputs": [],
|
463 |
+
"source": [
|
464 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
465 |
+
]
|
466 |
+
},
|
467 |
+
{
|
468 |
+
"cell_type": "code",
|
469 |
+
"execution_count": 17,
|
470 |
+
"metadata": {
|
471 |
+
"colab": {
|
472 |
+
"base_uri": "https://localhost:8080/",
|
473 |
+
"height": 53
|
474 |
+
},
|
475 |
+
"id": "obc20cU5Cxf2",
|
476 |
+
"outputId": "837babce-9edf-4a3f-f996-c0c407ae027c"
|
477 |
+
},
|
478 |
+
"outputs": [
|
479 |
+
{
|
480 |
+
"data": {
|
481 |
+
"text/plain": [
|
482 |
+
"'The Llama 2 model is available in four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
483 |
+
]
|
484 |
+
},
|
485 |
+
"execution_count": 17,
|
486 |
+
"metadata": {},
|
487 |
+
"output_type": "execute_result"
|
488 |
+
}
|
489 |
+
],
|
490 |
+
"source": [
|
491 |
+
"res.response"
|
492 |
+
]
|
493 |
+
},
|
494 |
+
{
|
495 |
+
"cell_type": "code",
|
496 |
+
"execution_count": 18,
|
497 |
+
"metadata": {
|
498 |
+
"colab": {
|
499 |
+
"base_uri": "https://localhost:8080/"
|
500 |
+
},
|
501 |
+
"id": "oIAO-saJCzYe",
|
502 |
+
"outputId": "bce85c7c-502c-4a7b-f3e2-f721f3d6b5a4"
|
503 |
+
},
|
504 |
+
"outputs": [
|
505 |
+
{
|
506 |
+
"name": "stdout",
|
507 |
+
"output_type": "stream",
|
508 |
+
"text": [
|
509 |
+
"Node ID\t f707756065d1f788b41fb97fcef81979e1fd241dbfa4034a24bec8e57b648482\n",
|
510 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
511 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
512 |
+
"Score\t 0.7122361910421624\n",
|
513 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
514 |
+
"Node ID\t 636f98cf8754c3a4759da02aa11a3f2aa7cdeb848a4980ec99300ece4a2e92fd\n",
|
515 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
516 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
517 |
+
"Score\t 0.7047493574957753\n",
|
518 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
519 |
+
]
|
520 |
+
}
|
521 |
+
],
|
522 |
+
"source": [
|
523 |
+
"for src in res.source_nodes:\n",
|
524 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
525 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
526 |
+
" print(\"Text\\t\", src.text)\n",
|
527 |
+
" print(\"Score\\t\", src.score)\n",
|
528 |
+
" print(\"-_\"*20)"
|
529 |
+
]
|
530 |
+
},
|
531 |
+
{
|
532 |
+
"cell_type": "markdown",
|
533 |
+
"metadata": {
|
534 |
+
"id": "d4xxZHbdN0lK"
|
535 |
+
},
|
536 |
+
"source": [
|
537 |
+
"# Evaluate the retrieval process and quality of answers\n",
|
538 |
+
"\n",
|
539 |
+
"We can evaluate our RAG system with a dataset of questions and associated chunks. Given a question, we can see if the RAG system retrieves the correct chunks of text that can answer the question.\n",
|
540 |
+
"\n",
|
541 |
+
"You can generate a synthetic dataset with an LLM such as `gpt-3.5-turbo` or create an authentic and manually curated dataset. \n",
|
542 |
+
"\n",
|
543 |
+
"Note that a **well curated dataset will always be a better option**, especially for a specific domain or use case.\n"
|
544 |
+
]
|
545 |
+
},
|
546 |
+
{
|
547 |
+
"cell_type": "markdown",
|
548 |
+
"metadata": {},
|
549 |
+
"source": [
|
550 |
+
"In our example, we will generate a synthetic dataset using `gpt-3.5-turbo` to make it simple.\n",
|
551 |
+
"\n",
|
552 |
+
"This is the default prompt that the `generate_question_context_pairs` function will uses:\n",
|
553 |
+
"\n",
|
554 |
+
"```python\n",
|
555 |
+
"DEFAULT_QA_GENERATE_PROMPT_TMPL = \"\"\"\\\n",
|
556 |
+
"Context information is below.\n",
|
557 |
+
"\n",
|
558 |
+
"---------------------\n",
|
559 |
+
"{context_str}\n",
|
560 |
+
"---------------------\n",
|
561 |
+
"\n",
|
562 |
+
"Given the context information and no prior knowledge,\n",
|
563 |
+
"generate only questions based on the below query.\n",
|
564 |
+
"\n",
|
565 |
+
"You are a Teacher/Professor. Your task is to setup \\\n",
|
566 |
+
"{num_questions_per_chunk} questions for an upcoming \\\n",
|
567 |
+
"quiz/examination. The questions should be diverse in nature \\\n",
|
568 |
+
"across the document. Restrict the questions to the \\\n",
|
569 |
+
"context information provided.\"\n",
|
570 |
+
"\"\"\"\n",
|
571 |
+
"```\n",
|
572 |
+
"\n"
|
573 |
+
]
|
574 |
+
},
|
575 |
+
{
|
576 |
+
"cell_type": "code",
|
577 |
+
"execution_count": 19,
|
578 |
+
"metadata": {},
|
579 |
+
"outputs": [
|
580 |
+
{
|
581 |
+
"name": "stderr",
|
582 |
+
"output_type": "stream",
|
583 |
+
"text": [
|
584 |
+
"100%|██████████| 108/108 [05:59<00:00, 3.33s/it]\n"
|
585 |
+
]
|
586 |
+
}
|
587 |
+
],
|
588 |
+
"source": [
|
589 |
+
"from llama_index.core.evaluation import generate_question_context_pairs\n",
|
590 |
+
"from llama_index.llms.openai import OpenAI\n",
|
591 |
+
"\n",
|
592 |
+
"llm = OpenAI(model=\"gpt-3.5-turbo-0125\")\n",
|
593 |
+
"rag_eval_dataset = generate_question_context_pairs(\n",
|
594 |
+
" nodes,\n",
|
595 |
+
" llm=llm,\n",
|
596 |
+
" num_questions_per_chunk=1\n",
|
597 |
+
")\n",
|
598 |
+
"# We can save the dataset as a json file for later use.\n",
|
599 |
+
"rag_eval_dataset.save_json(\"./rag_eval_dataset.json\")"
|
600 |
+
]
|
601 |
+
},
|
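If you want to sanity-check the generated pairs, the dataset object exposes `queries`, `corpus`, and `relevant_docs` dictionaries. Here is a minimal inspection sketch (assuming the `rag_eval_dataset` produced above; the 200-character preview length is an arbitrary choice):

```python
# Peek at one generated question and the chunk it was generated from.
query_id, query_text = next(iter(rag_eval_dataset.queries.items()))
relevant_node_ids = rag_eval_dataset.relevant_docs[query_id]

print("Question:", query_text)
print("Source chunk:", rag_eval_dataset.corpus[relevant_node_ids[0]][:200])
```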
602 |
+
{
|
603 |
+
"cell_type": "code",
|
604 |
+
"execution_count": 20,
|
605 |
+
"metadata": {},
|
606 |
+
"outputs": [],
|
607 |
+
"source": [
|
608 |
+
"# We can also load the dataset from a previously saved json file.\n",
|
609 |
+
"from llama_index.core.evaluation import EmbeddingQAFinetuneDataset\n",
|
610 |
+
"rag_eval_dataset = EmbeddingQAFinetuneDataset.from_json(\n",
|
611 |
+
" \"./rag_eval_dataset.json\"\n",
|
612 |
+
")"
|
613 |
+
]
|
614 |
+
},
|
615 |
+
{
|
616 |
+
"cell_type": "markdown",
|
617 |
+
"metadata": {},
|
618 |
+
"source": [
|
619 |
+
"### Evaluation for Hit Rate and Mean Reciprocal Rank (MRR)\n",
|
620 |
+
"\n",
|
621 |
+
"We will make use of `RetrieverEvaluator` available in Llama-index. We will measure the Hit Rate and Mean Reciprocal Rank (MRR).\n",
|
622 |
+
"\n",
|
623 |
+
"**Hit Rate:**\n",
|
624 |
+
"\n",
|
625 |
+
"Think of the Hit Rate like playing a game of guessing. You're given a question and you need to guess the correct answer from a list of options. The Hit Rate measures how often you guess the correct answer by only looking at your top few guesses. If you often find the right answer in your first few guesses, you have a high Hit Rate. So, in the context of a retrieval system, it's about how frequently the system finds the correct document within its top 'k' picks (where 'k' is a number you decide, like top 5 or top 10).\n",
|
626 |
+
"\n",
|
627 |
+
"**Mean Reciprocal Rank (MRR):**\n",
|
628 |
+
"\n",
|
629 |
+
"MRR is a bit like measuring how quickly you can find a treasure in a list of boxes. Imagine you have a row of boxes and only one of them has a treasure. The MRR calculates how close to the start of the row the treasure box is, on average. If the treasure is always in the first box you open, you're doing great and have an MRR of 1. If it's in the second box, the score is 1/2, since you took two tries to find it. If it's in the third box, your score is 1/3, and so on. MRR averages these scores across all your searches. So, for a retrieval system, MRR looks at where the correct document ranks in the system's guesses. If it's usually near the top, the MRR will be high, indicating good performance.\n",
|
630 |
+
"In summary, Hit Rate tells you how often the system gets it right in its top guesses, and MRR tells you how close to the top the right answer usually is. Both metrics are useful for evaluating the effectiveness of a retrieval system, like how well a search engine or a recommendation system works."
|
631 |
+
]
|
632 |
+
},
|
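To make the two definitions concrete, here is a minimal, self-contained sketch of how both metrics fall out of ranked retrieval results (plain Python, independent of LlamaIndex; the document IDs and rank lists are made up for illustration):

```python
# Each inner list holds retrieved doc IDs, best match first (top-k = 3 here);
# `expected` holds the single relevant doc ID for each query.
retrieved = [["d3", "d1", "d7"], ["d2", "d5", "d9"], ["d8", "d4", "d6"]]
expected = ["d1", "d2", "d9"]

hits = 0
reciprocal_ranks = []
for docs, target in zip(retrieved, expected):
    if target in docs:                         # found within top-k -> counts as a "hit"
        hits += 1
        reciprocal_ranks.append(1 / (docs.index(target) + 1))
    else:                                      # missed entirely -> contributes 0 to MRR
        reciprocal_ranks.append(0.0)

print("Hit Rate:", hits / len(expected))              # 2/3 ~= 0.667
print("MRR:", sum(reciprocal_ranks) / len(expected))  # (1/2 + 1 + 0) / 3 = 0.5
```

This is the same quantity `RetrieverEvaluator` reports below, just computed by hand on toy data.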
633 |
+
{
|
634 |
+
"cell_type": "code",
|
635 |
+
"execution_count": 21,
|
636 |
+
"metadata": {},
|
637 |
+
"outputs": [],
|
638 |
+
"source": [
|
639 |
+
"import pandas as pd\n",
|
640 |
+
"\n",
|
641 |
+
"def display_results_retriever(name, eval_results):\n",
|
642 |
+
" \"\"\"Display results from evaluate.\"\"\"\n",
|
643 |
+
"\n",
|
644 |
+
" metric_dicts = []\n",
|
645 |
+
" for eval_result in eval_results:\n",
|
646 |
+
" metric_dict = eval_result.metric_vals_dict\n",
|
647 |
+
" metric_dicts.append(metric_dict)\n",
|
648 |
+
"\n",
|
649 |
+
" full_df = pd.DataFrame(metric_dicts)\n",
|
650 |
+
"\n",
|
651 |
+
" hit_rate = full_df[\"hit_rate\"].mean()\n",
|
652 |
+
" mrr = full_df[\"mrr\"].mean()\n",
|
653 |
+
"\n",
|
654 |
+
" metric_df = pd.DataFrame(\n",
|
655 |
+
" {\"Retriever Name\": [name], \"Hit Rate\": [hit_rate], \"MRR\": [mrr]}\n",
|
656 |
+
" )\n",
|
657 |
+
"\n",
|
658 |
+
" return metric_df"
|
659 |
+
]
|
660 |
+
},
|
661 |
+
{
|
662 |
+
"cell_type": "code",
|
663 |
+
"execution_count": 22,
|
664 |
+
"metadata": {},
|
665 |
+
"outputs": [
|
666 |
+
{
|
667 |
+
"name": "stdout",
|
668 |
+
"output_type": "stream",
|
669 |
+
"text": [
|
670 |
+
" Retriever Name Hit Rate MRR\n",
|
671 |
+
"0 Retriever top_2 0.703557 0.570158\n",
|
672 |
+
" Retriever Name Hit Rate MRR\n",
|
673 |
+
"0 Retriever top_4 0.822134 0.606884\n",
|
674 |
+
" Retriever Name Hit Rate MRR\n",
|
675 |
+
"0 Retriever top_6 0.857708 0.613472\n",
|
676 |
+
" Retriever Name Hit Rate MRR\n",
|
677 |
+
"0 Retriever top_8 0.883399 0.616937\n",
|
678 |
+
" Retriever Name Hit Rate MRR\n",
|
679 |
+
"0 Retriever top_10 0.901186 0.61904\n"
|
680 |
+
]
|
681 |
+
}
|
682 |
+
],
|
683 |
+
"source": [
|
684 |
+
"from llama_index.core.evaluation import RetrieverEvaluator\n",
|
685 |
+
"\n",
|
686 |
+
"# We can evaluate the retievers with different top_k values.\n",
|
687 |
+
"for i in [2, 4, 6, 8, 10]:\n",
|
688 |
+
" retriever = index.as_retriever(similarity_top_k=i)\n",
|
689 |
+
" retriever_evaluator = RetrieverEvaluator.from_metric_names(\n",
|
690 |
+
" [\"mrr\", \"hit_rate\"], retriever=retriever\n",
|
691 |
+
" )\n",
|
692 |
+
" eval_results = await retriever_evaluator.aevaluate_dataset(rag_eval_dataset, workers=32)\n",
|
693 |
+
" print(display_results_retriever(f\"Retriever top_{i}\", eval_results))"
|
694 |
+
]
|
695 |
+
},
|
696 |
+
{
|
697 |
+
"cell_type": "markdown",
|
698 |
+
"metadata": {},
|
699 |
+
"source": [
|
700 |
+
"### Evaluation using Relevance and Faithfulness metrics.\n",
|
701 |
+
"\n",
|
702 |
+
"Here, we evaluate the answer generated by the LLM. Is the answer using the correct context? Is the answer faithful to the context? Is the answer relevant to the question?\n",
|
703 |
+
"\n",
|
704 |
+
"An LLM will answer these questions, more specifically `gpt-4-0125-preview`.\n",
|
705 |
+
"\n",
|
706 |
+
"**`FaithfulnessEvaluator`**\n",
|
707 |
+
"Evaluates if the answer is faithful to the retrieved contexts (in other words, whether there's an hallucination).\n",
|
708 |
+
"\n",
|
709 |
+
"**`RelevancyEvaluator`**\n",
|
710 |
+
"Evaluates whether the retrieved context and answer are relevant to the user question.\n",
|
711 |
+
"\n",
|
712 |
+
"\n",
|
713 |
+
"Now, let's see how the top_k value affects these two metrics."
|
714 |
+
]
|
715 |
+
},
|
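Before running the batch evaluation below, it may help to see the single-query shape of these evaluators. A minimal sketch (assuming the `query_engine` and the `faithfulness_evaluator`/`relevancy_evaluator` instances created in the next cell; `evaluate_response` returns an `EvaluationResult` whose `passing` flag is what gets aggregated):

```python
# Evaluate one query end-to-end: generate an answer, then judge it with GPT-4.
question = "How many parameters does the Llama 2 model have?"
response = query_engine.query(question)

faithfulness_result = faithfulness_evaluator.evaluate_response(response=response)
relevancy_result = relevancy_evaluator.evaluate_response(query=question, response=response)

print("Faithful:", faithfulness_result.passing)  # True if the answer is grounded in the context
print("Relevant:", relevancy_result.passing)     # True if context + answer address the question
```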
716 |
+
{
|
717 |
+
"cell_type": "code",
|
718 |
+
"execution_count": 23,
|
719 |
+
"metadata": {},
|
720 |
+
"outputs": [
|
721 |
+
{
|
722 |
+
"name": "stdout",
|
723 |
+
"output_type": "stream",
|
724 |
+
"text": [
|
725 |
+
"top_2 faithfulness_score: 1.0\n",
|
726 |
+
"top_2 relevancy_score: 1.0\n",
|
727 |
+
"top_4 faithfulness_score: 1.0\n",
|
728 |
+
"top_4 relevancy_score: 0.95\n",
|
729 |
+
"top_6 faithfulness_score: 1.0\n",
|
730 |
+
"top_6 relevancy_score: 0.95\n",
|
731 |
+
"top_8 faithfulness_score: 0.65\n",
|
732 |
+
"top_8 relevancy_score: 0.7\n",
|
733 |
+
"top_10 faithfulness_score: 0.45\n",
|
734 |
+
"top_10 relevancy_score: 0.5\n"
|
735 |
+
]
|
736 |
+
}
|
737 |
+
],
|
738 |
+
"source": [
|
739 |
+
"from llama_index.core.evaluation import RelevancyEvaluator, FaithfulnessEvaluator, BatchEvalRunner\n",
|
740 |
+
"from llama_index.llms.openai import OpenAI\n",
|
741 |
+
"\n",
|
742 |
+
"llm_gpt4 = OpenAI(temperature=0, model=\"gpt-4-0125-preview\")\n",
|
743 |
+
"\n",
|
744 |
+
"faithfulness_evaluator = FaithfulnessEvaluator(llm=llm_gpt4)\n",
|
745 |
+
"relevancy_evaluator = RelevancyEvaluator(llm=llm_gpt4)\n",
|
746 |
+
"\n",
|
747 |
+
"# Run evaluation\n",
|
748 |
+
"queries = list(rag_eval_dataset.queries.values())\n",
|
749 |
+
"batch_eval_queries = queries[:20]\n",
|
750 |
+
"\n",
|
751 |
+
"runner = BatchEvalRunner(\n",
|
752 |
+
"{\"faithfulness\": faithfulness_evaluator, \"relevancy\": relevancy_evaluator},\n",
|
753 |
+
"workers=32,\n",
|
754 |
+
")\n",
|
755 |
+
"\n",
|
756 |
+
"for i in [2, 4, 6, 8, 10]:\n",
|
757 |
+
" # Set Faithfulness and Relevancy evaluators\n",
|
758 |
+
" query_engine = index.as_query_engine(similarity_top_k=i)\n",
|
759 |
+
"\n",
|
760 |
+
" eval_results = await runner.aevaluate_queries(\n",
|
761 |
+
" query_engine, queries=batch_eval_queries\n",
|
762 |
+
" )\n",
|
763 |
+
" faithfulness_score = sum(result.passing for result in eval_results['faithfulness']) / len(eval_results['faithfulness'])\n",
|
764 |
+
" print(f\"top_{i} faithfulness_score: {faithfulness_score}\")\n",
|
765 |
+
"\n",
|
766 |
+
" relevancy_score = sum(result.passing for result in eval_results['relevancy']) / len(eval_results['relevancy'])\n",
|
767 |
+
" print(f\"top_{i} relevancy_score: {relevancy_score}\")\n"
|
768 |
+
]
|
769 |
+
},
|
770 |
+
{
|
771 |
+
"cell_type": "code",
|
772 |
+
"execution_count": null,
|
773 |
+
"metadata": {},
|
774 |
+
"outputs": [],
|
775 |
+
"source": []
|
776 |
+
}
|
777 |
+
],
|
778 |
+
"metadata": {
|
779 |
+
"colab": {
|
780 |
+
"authorship_tag": "ABX9TyOnRtEA1r5V6nZnTDjOEHPs",
|
781 |
+
"include_colab_link": true,
|
782 |
+
"provenance": []
|
783 |
+
},
|
784 |
+
"kernelspec": {
|
785 |
+
"display_name": "Python 3",
|
786 |
+
"name": "python3"
|
787 |
+
},
|
788 |
+
"language_info": {
|
789 |
+
"codemirror_mode": {
|
790 |
+
"name": "ipython",
|
791 |
+
"version": 3
|
792 |
+
},
|
793 |
+
"file_extension": ".py",
|
794 |
+
"mimetype": "text/x-python",
|
795 |
+
"name": "python",
|
796 |
+
"nbconvert_exporter": "python",
|
797 |
+
"pygments_lexer": "ipython3",
|
798 |
+
"version": "3.11.8"
|
799 |
+
},
|
800 |
+
"widgets": {
|
801 |
+
"application/vnd.jupyter.widget-state+json": {
|
802 |
+
"01d27fdbe86a4ca2830b9bf3ccbf1ae9": {
|
803 |
+
"model_module": "@jupyter-widgets/controls",
|
804 |
+
"model_module_version": "1.5.0",
|
805 |
+
"model_name": "ProgressStyleModel",
|
806 |
+
"state": {
|
807 |
+
"_model_module": "@jupyter-widgets/controls",
|
808 |
+
"_model_module_version": "1.5.0",
|
809 |
+
"_model_name": "ProgressStyleModel",
|
810 |
+
"_view_count": null,
|
811 |
+
"_view_module": "@jupyter-widgets/base",
|
812 |
+
"_view_module_version": "1.2.0",
|
813 |
+
"_view_name": "StyleView",
|
814 |
+
"bar_color": null,
|
815 |
+
"description_width": ""
|
816 |
+
}
|
817 |
+
},
|
818 |
+
"076728d713254b49935c7938d18014f2": {
|
819 |
+
"model_module": "@jupyter-widgets/controls",
|
820 |
+
"model_module_version": "1.5.0",
|
821 |
+
"model_name": "HTMLModel",
|
822 |
+
"state": {
|
823 |
+
"_dom_classes": [],
|
824 |
+
"_model_module": "@jupyter-widgets/controls",
|
825 |
+
"_model_module_version": "1.5.0",
|
826 |
+
"_model_name": "HTMLModel",
|
827 |
+
"_view_count": null,
|
828 |
+
"_view_module": "@jupyter-widgets/controls",
|
829 |
+
"_view_module_version": "1.5.0",
|
830 |
+
"_view_name": "HTMLView",
|
831 |
+
"description": "",
|
832 |
+
"description_tooltip": null,
|
833 |
+
"layout": "IPY_MODEL_121dbf44a222434cbc57ebe6beb83e2a",
|
834 |
+
"placeholder": "",
|
835 |
+
"style": "IPY_MODEL_2af0821ebb7e47988d134d4ec2776e87",
|
836 |
+
"value": " 108/108 [00:34<00:00, 3.66it/s]"
|
837 |
+
}
|
838 |
+
},
|
839 |
+
"10340f8e7c8e482c8d35047a3e43ee7f": {
|
840 |
+
"model_module": "@jupyter-widgets/controls",
|
841 |
+
"model_module_version": "1.5.0",
|
842 |
+
"model_name": "DescriptionStyleModel",
|
843 |
+
"state": {
|
844 |
+
"_model_module": "@jupyter-widgets/controls",
|
845 |
+
"_model_module_version": "1.5.0",
|
846 |
+
"_model_name": "DescriptionStyleModel",
|
847 |
+
"_view_count": null,
|
848 |
+
"_view_module": "@jupyter-widgets/base",
|
849 |
+
"_view_module_version": "1.2.0",
|
850 |
+
"_view_name": "StyleView",
|
851 |
+
"description_width": ""
|
852 |
+
}
|
853 |
+
},
|
854 |
+
"1095efa793804a3fb625855e715a5317": {
|
855 |
+
"model_module": "@jupyter-widgets/base",
|
856 |
+
"model_module_version": "1.2.0",
|
857 |
+
"model_name": "LayoutModel",
|
858 |
+
"state": {
|
859 |
+
"_model_module": "@jupyter-widgets/base",
|
860 |
+
"_model_module_version": "1.2.0",
|
861 |
+
"_model_name": "LayoutModel",
|
862 |
+
"_view_count": null,
|
863 |
+
"_view_module": "@jupyter-widgets/base",
|
864 |
+
"_view_module_version": "1.2.0",
|
865 |
+
"_view_name": "LayoutView",
|
866 |
+
"align_content": null,
|
867 |
+
"align_items": null,
|
868 |
+
"align_self": null,
|
869 |
+
"border": null,
|
870 |
+
"bottom": null,
|
871 |
+
"display": null,
|
872 |
+
"flex": null,
|
873 |
+
"flex_flow": null,
|
874 |
+
"grid_area": null,
|
875 |
+
"grid_auto_columns": null,
|
876 |
+
"grid_auto_flow": null,
|
877 |
+
"grid_auto_rows": null,
|
878 |
+
"grid_column": null,
|
879 |
+
"grid_gap": null,
|
880 |
+
"grid_row": null,
|
881 |
+
"grid_template_areas": null,
|
882 |
+
"grid_template_columns": null,
|
883 |
+
"grid_template_rows": null,
|
884 |
+
"height": null,
|
885 |
+
"justify_content": null,
|
886 |
+
"justify_items": null,
|
887 |
+
"left": null,
|
888 |
+
"margin": null,
|
889 |
+
"max_height": null,
|
890 |
+
"max_width": null,
|
891 |
+
"min_height": null,
|
892 |
+
"min_width": null,
|
893 |
+
"object_fit": null,
|
894 |
+
"object_position": null,
|
895 |
+
"order": null,
|
896 |
+
"overflow": null,
|
897 |
+
"overflow_x": null,
|
898 |
+
"overflow_y": null,
|
899 |
+
"padding": null,
|
900 |
+
"right": null,
|
901 |
+
"top": null,
|
902 |
+
"visibility": null,
|
903 |
+
"width": null
|
904 |
+
}
|
905 |
+
},
|
906 |
+
"121dbf44a222434cbc57ebe6beb83e2a": {
|
907 |
+
"model_module": "@jupyter-widgets/base",
|
908 |
+
"model_module_version": "1.2.0",
|
909 |
+
"model_name": "LayoutModel",
|
910 |
+
"state": {
|
911 |
+
"_model_module": "@jupyter-widgets/base",
|
912 |
+
"_model_module_version": "1.2.0",
|
913 |
+
"_model_name": "LayoutModel",
|
914 |
+
"_view_count": null,
|
915 |
+
"_view_module": "@jupyter-widgets/base",
|
916 |
+
"_view_module_version": "1.2.0",
|
917 |
+
"_view_name": "LayoutView",
|
918 |
+
"align_content": null,
|
919 |
+
"align_items": null,
|
920 |
+
"align_self": null,
|
921 |
+
"border": null,
|
922 |
+
"bottom": null,
|
923 |
+
"display": null,
|
924 |
+
"flex": null,
|
925 |
+
"flex_flow": null,
|
926 |
+
"grid_area": null,
|
927 |
+
"grid_auto_columns": null,
|
928 |
+
"grid_auto_flow": null,
|
929 |
+
"grid_auto_rows": null,
|
930 |
+
"grid_column": null,
|
931 |
+
"grid_gap": null,
|
932 |
+
"grid_row": null,
|
933 |
+
"grid_template_areas": null,
|
934 |
+
"grid_template_columns": null,
|
935 |
+
"grid_template_rows": null,
|
936 |
+
"height": null,
|
937 |
+
"justify_content": null,
|
938 |
+
"justify_items": null,
|
939 |
+
"left": null,
|
940 |
+
"margin": null,
|
941 |
+
"max_height": null,
|
942 |
+
"max_width": null,
|
943 |
+
"min_height": null,
|
944 |
+
"min_width": null,
|
945 |
+
"object_fit": null,
|
946 |
+
"object_position": null,
|
947 |
+
"order": null,
|
948 |
+
"overflow": null,
|
949 |
+
"overflow_x": null,
|
950 |
+
"overflow_y": null,
|
951 |
+
"padding": null,
|
952 |
+
"right": null,
|
953 |
+
"top": null,
|
954 |
+
"visibility": null,
|
955 |
+
"width": null
|
956 |
+
}
|
957 |
+
},
|
958 |
+
"2073b65c0db045aa8e86d91a4fea2e2b": {
|
959 |
+
"model_module": "@jupyter-widgets/controls",
|
960 |
+
"model_module_version": "1.5.0",
|
961 |
+
"model_name": "DescriptionStyleModel",
|
962 |
+
"state": {
|
963 |
+
"_model_module": "@jupyter-widgets/controls",
|
964 |
+
"_model_module_version": "1.5.0",
|
965 |
+
"_model_name": "DescriptionStyleModel",
|
966 |
+
"_view_count": null,
|
967 |
+
"_view_module": "@jupyter-widgets/base",
|
968 |
+
"_view_module_version": "1.2.0",
|
969 |
+
"_view_name": "StyleView",
|
970 |
+
"description_width": ""
|
971 |
+
}
|
972 |
+
},
|
973 |
+
"2af0821ebb7e47988d134d4ec2776e87": {
|
974 |
+
"model_module": "@jupyter-widgets/controls",
|
975 |
+
"model_module_version": "1.5.0",
|
976 |
+
"model_name": "DescriptionStyleModel",
|
977 |
+
"state": {
|
978 |
+
"_model_module": "@jupyter-widgets/controls",
|
979 |
+
"_model_module_version": "1.5.0",
|
980 |
+
"_model_name": "DescriptionStyleModel",
|
981 |
+
"_view_count": null,
|
982 |
+
"_view_module": "@jupyter-widgets/base",
|
983 |
+
"_view_module_version": "1.2.0",
|
984 |
+
"_view_name": "StyleView",
|
985 |
+
"description_width": ""
|
986 |
+
}
|
987 |
+
},
|
988 |
+
"665b9b5e85a34be8a20d40c51e57cfe0": {
|
989 |
+
"model_module": "@jupyter-widgets/controls",
|
990 |
+
"model_module_version": "1.5.0",
|
991 |
+
"model_name": "HTMLModel",
|
992 |
+
"state": {
|
993 |
+
"_dom_classes": [],
|
994 |
+
"_model_module": "@jupyter-widgets/controls",
|
995 |
+
"_model_module_version": "1.5.0",
|
996 |
+
"_model_name": "HTMLModel",
|
997 |
+
"_view_count": null,
|
998 |
+
"_view_module": "@jupyter-widgets/controls",
|
999 |
+
"_view_module_version": "1.5.0",
|
1000 |
+
"_view_name": "HTMLView",
|
1001 |
+
"description": "",
|
1002 |
+
"description_tooltip": null,
|
1003 |
+
"layout": "IPY_MODEL_85f23ab21c3b404aaa146cfcaefc85d8",
|
1004 |
+
"placeholder": "",
|
1005 |
+
"style": "IPY_MODEL_10340f8e7c8e482c8d35047a3e43ee7f",
|
1006 |
+
"value": "Generating embeddings: 100%"
|
1007 |
+
}
|
1008 |
+
},
|
1009 |
+
"6c575687c8f1468a803b88eea3d26b7b": {
|
1010 |
+
"model_module": "@jupyter-widgets/controls",
|
1011 |
+
"model_module_version": "1.5.0",
|
1012 |
+
"model_name": "HTMLModel",
|
1013 |
+
"state": {
|
1014 |
+
"_dom_classes": [],
|
1015 |
+
"_model_module": "@jupyter-widgets/controls",
|
1016 |
+
"_model_module_version": "1.5.0",
|
1017 |
+
"_model_name": "HTMLModel",
|
1018 |
+
"_view_count": null,
|
1019 |
+
"_view_module": "@jupyter-widgets/controls",
|
1020 |
+
"_view_module_version": "1.5.0",
|
1021 |
+
"_view_name": "HTMLView",
|
1022 |
+
"description": "",
|
1023 |
+
"description_tooltip": null,
|
1024 |
+
"layout": "IPY_MODEL_eb057e56f0f94e4993b8ae960c78b0ad",
|
1025 |
+
"placeholder": "",
|
1026 |
+
"style": "IPY_MODEL_2073b65c0db045aa8e86d91a4fea2e2b",
|
1027 |
+
"value": "Parsing nodes: 100%"
|
1028 |
+
}
|
1029 |
+
},
|
1030 |
+
"70e17db8fc2f490f85b7af8aa664f0c7": {
|
1031 |
+
"model_module": "@jupyter-widgets/controls",
|
1032 |
+
"model_module_version": "1.5.0",
|
1033 |
+
"model_name": "DescriptionStyleModel",
|
1034 |
+
"state": {
|
1035 |
+
"_model_module": "@jupyter-widgets/controls",
|
1036 |
+
"_model_module_version": "1.5.0",
|
1037 |
+
"_model_name": "DescriptionStyleModel",
|
1038 |
+
"_view_count": null,
|
1039 |
+
"_view_module": "@jupyter-widgets/base",
|
1040 |
+
"_view_module_version": "1.2.0",
|
1041 |
+
"_view_name": "StyleView",
|
1042 |
+
"description_width": ""
|
1043 |
+
}
|
1044 |
+
},
|
1045 |
+
"76fea2dabfea42aa8bc7ae719f2a22ee": {
|
1046 |
+
"model_module": "@jupyter-widgets/controls",
|
1047 |
+
"model_module_version": "1.5.0",
|
1048 |
+
"model_name": "HBoxModel",
|
1049 |
+
"state": {
|
1050 |
+
"_dom_classes": [],
|
1051 |
+
"_model_module": "@jupyter-widgets/controls",
|
1052 |
+
"_model_module_version": "1.5.0",
|
1053 |
+
"_model_name": "HBoxModel",
|
1054 |
+
"_view_count": null,
|
1055 |
+
"_view_module": "@jupyter-widgets/controls",
|
1056 |
+
"_view_module_version": "1.5.0",
|
1057 |
+
"_view_name": "HBoxView",
|
1058 |
+
"box_style": "",
|
1059 |
+
"children": [
|
1060 |
+
"IPY_MODEL_6c575687c8f1468a803b88eea3d26b7b",
|
1061 |
+
"IPY_MODEL_c266531dafcf4624af5fe9bcbc9d8df9",
|
1062 |
+
"IPY_MODEL_e20a27a2f7764cb4b9537e34a3659c9a"
|
1063 |
+
],
|
1064 |
+
"layout": "IPY_MODEL_bba307f545cd4533be6f0489f95b9895"
|
1065 |
+
}
|
1066 |
+
},
|
1067 |
+
"8141417665024172a4baa78c497acb69": {
|
1068 |
+
"model_module": "@jupyter-widgets/base",
|
1069 |
+
"model_module_version": "1.2.0",
|
1070 |
+
"model_name": "LayoutModel",
|
1071 |
+
"state": {
|
1072 |
+
"_model_module": "@jupyter-widgets/base",
|
1073 |
+
"_model_module_version": "1.2.0",
|
1074 |
+
"_model_name": "LayoutModel",
|
1075 |
+
"_view_count": null,
|
1076 |
+
"_view_module": "@jupyter-widgets/base",
|
1077 |
+
"_view_module_version": "1.2.0",
|
1078 |
+
"_view_name": "LayoutView",
|
1079 |
+
"align_content": null,
|
1080 |
+
"align_items": null,
|
1081 |
+
"align_self": null,
|
1082 |
+
"border": null,
|
1083 |
+
"bottom": null,
|
1084 |
+
"display": null,
|
1085 |
+
"flex": null,
|
1086 |
+
"flex_flow": null,
|
1087 |
+
"grid_area": null,
|
1088 |
+
"grid_auto_columns": null,
|
1089 |
+
"grid_auto_flow": null,
|
1090 |
+
"grid_auto_rows": null,
|
1091 |
+
"grid_column": null,
|
1092 |
+
"grid_gap": null,
|
1093 |
+
"grid_row": null,
|
1094 |
+
"grid_template_areas": null,
|
1095 |
+
"grid_template_columns": null,
|
1096 |
+
"grid_template_rows": null,
|
1097 |
+
"height": null,
|
1098 |
+
"justify_content": null,
|
1099 |
+
"justify_items": null,
|
1100 |
+
"left": null,
|
1101 |
+
"margin": null,
|
1102 |
+
"max_height": null,
|
1103 |
+
"max_width": null,
|
1104 |
+
"min_height": null,
|
1105 |
+
"min_width": null,
|
1106 |
+
"object_fit": null,
|
1107 |
+
"object_position": null,
|
1108 |
+
"order": null,
|
1109 |
+
"overflow": null,
|
1110 |
+
"overflow_x": null,
|
1111 |
+
"overflow_y": null,
|
1112 |
+
"padding": null,
|
1113 |
+
"right": null,
|
1114 |
+
"top": null,
|
1115 |
+
"visibility": null,
|
1116 |
+
"width": null
|
1117 |
+
}
|
1118 |
+
},
|
1119 |
+
"85f23ab21c3b404aaa146cfcaefc85d8": {
|
1120 |
+
"model_module": "@jupyter-widgets/base",
|
1121 |
+
"model_module_version": "1.2.0",
|
1122 |
+
"model_name": "LayoutModel",
|
1123 |
+
"state": {
|
1124 |
+
"_model_module": "@jupyter-widgets/base",
|
1125 |
+
"_model_module_version": "1.2.0",
|
1126 |
+
"_model_name": "LayoutModel",
|
1127 |
+
"_view_count": null,
|
1128 |
+
"_view_module": "@jupyter-widgets/base",
|
1129 |
+
"_view_module_version": "1.2.0",
|
1130 |
+
"_view_name": "LayoutView",
|
1131 |
+
"align_content": null,
|
1132 |
+
"align_items": null,
|
1133 |
+
"align_self": null,
|
1134 |
+
"border": null,
|
1135 |
+
"bottom": null,
|
1136 |
+
"display": null,
|
1137 |
+
"flex": null,
|
1138 |
+
"flex_flow": null,
|
1139 |
+
"grid_area": null,
|
1140 |
+
"grid_auto_columns": null,
|
1141 |
+
"grid_auto_flow": null,
|
1142 |
+
"grid_auto_rows": null,
|
1143 |
+
"grid_column": null,
|
1144 |
+
"grid_gap": null,
|
1145 |
+
"grid_row": null,
|
1146 |
+
"grid_template_areas": null,
|
1147 |
+
"grid_template_columns": null,
|
1148 |
+
"grid_template_rows": null,
|
1149 |
+
"height": null,
|
1150 |
+
"justify_content": null,
|
1151 |
+
"justify_items": null,
|
1152 |
+
"left": null,
|
1153 |
+
"margin": null,
|
1154 |
+
"max_height": null,
|
1155 |
+
"max_width": null,
|
1156 |
+
"min_height": null,
|
1157 |
+
"min_width": null,
|
1158 |
+
"object_fit": null,
|
1159 |
+
"object_position": null,
|
1160 |
+
"order": null,
|
1161 |
+
"overflow": null,
|
1162 |
+
"overflow_x": null,
|
1163 |
+
"overflow_y": null,
|
1164 |
+
"padding": null,
|
1165 |
+
"right": null,
|
1166 |
+
"top": null,
|
1167 |
+
"visibility": null,
|
1168 |
+
"width": null
|
1169 |
+
}
|
1170 |
+
},
|
1171 |
+
"b43a5a6a65034a16927700e442dde52a": {
|
1172 |
+
"model_module": "@jupyter-widgets/controls",
|
1173 |
+
"model_module_version": "1.5.0",
|
1174 |
+
"model_name": "ProgressStyleModel",
|
1175 |
+
"state": {
|
1176 |
+
"_model_module": "@jupyter-widgets/controls",
|
1177 |
+
"_model_module_version": "1.5.0",
|
1178 |
+
"_model_name": "ProgressStyleModel",
|
1179 |
+
"_view_count": null,
|
1180 |
+
"_view_module": "@jupyter-widgets/base",
|
1181 |
+
"_view_module_version": "1.2.0",
|
1182 |
+
"_view_name": "StyleView",
|
1183 |
+
"bar_color": null,
|
1184 |
+
"description_width": ""
|
1185 |
+
}
|
1186 |
+
},
|
1187 |
+
"b604cef3deca4847afcc459e5c8a9e0f": {
|
1188 |
+
"model_module": "@jupyter-widgets/controls",
|
1189 |
+
"model_module_version": "1.5.0",
|
1190 |
+
"model_name": "FloatProgressModel",
|
1191 |
+
"state": {
|
1192 |
+
"_dom_classes": [],
|
1193 |
+
"_model_module": "@jupyter-widgets/controls",
|
1194 |
+
"_model_module_version": "1.5.0",
|
1195 |
+
"_model_name": "FloatProgressModel",
|
1196 |
+
"_view_count": null,
|
1197 |
+
"_view_module": "@jupyter-widgets/controls",
|
1198 |
+
"_view_module_version": "1.5.0",
|
1199 |
+
"_view_name": "ProgressView",
|
1200 |
+
"bar_style": "success",
|
1201 |
+
"description": "",
|
1202 |
+
"description_tooltip": null,
|
1203 |
+
"layout": "IPY_MODEL_1095efa793804a3fb625855e715a5317",
|
1204 |
+
"max": 108,
|
1205 |
+
"min": 0,
|
1206 |
+
"orientation": "horizontal",
|
1207 |
+
"style": "IPY_MODEL_b43a5a6a65034a16927700e442dde52a",
|
1208 |
+
"value": 108
|
1209 |
+
}
|
1210 |
+
},
|
1211 |
+
"bba307f545cd4533be6f0489f95b9895": {
|
1212 |
+
"model_module": "@jupyter-widgets/base",
|
1213 |
+
"model_module_version": "1.2.0",
|
1214 |
+
"model_name": "LayoutModel",
|
1215 |
+
"state": {
|
1216 |
+
"_model_module": "@jupyter-widgets/base",
|
1217 |
+
"_model_module_version": "1.2.0",
|
1218 |
+
"_model_name": "LayoutModel",
|
1219 |
+
"_view_count": null,
|
1220 |
+
"_view_module": "@jupyter-widgets/base",
|
1221 |
+
"_view_module_version": "1.2.0",
|
1222 |
+
"_view_name": "LayoutView",
|
1223 |
+
"align_content": null,
|
1224 |
+
"align_items": null,
|
1225 |
+
"align_self": null,
|
1226 |
+
"border": null,
|
1227 |
+
"bottom": null,
|
1228 |
+
"display": null,
|
1229 |
+
"flex": null,
|
1230 |
+
"flex_flow": null,
|
1231 |
+
"grid_area": null,
|
1232 |
+
"grid_auto_columns": null,
|
1233 |
+
"grid_auto_flow": null,
|
1234 |
+
"grid_auto_rows": null,
|
1235 |
+
"grid_column": null,
|
1236 |
+
"grid_gap": null,
|
1237 |
+
"grid_row": null,
|
1238 |
+
"grid_template_areas": null,
|
1239 |
+
"grid_template_columns": null,
|
1240 |
+
"grid_template_rows": null,
|
1241 |
+
"height": null,
|
1242 |
+
"justify_content": null,
|
1243 |
+
"justify_items": null,
|
1244 |
+
"left": null,
|
1245 |
+
"margin": null,
|
1246 |
+
"max_height": null,
|
1247 |
+
"max_width": null,
|
1248 |
+
"min_height": null,
|
1249 |
+
"min_width": null,
|
1250 |
+
"object_fit": null,
|
1251 |
+
"object_position": null,
|
1252 |
+
"order": null,
|
1253 |
+
"overflow": null,
|
1254 |
+
"overflow_x": null,
|
1255 |
+
"overflow_y": null,
|
1256 |
+
"padding": null,
|
1257 |
+
"right": null,
|
1258 |
+
"top": null,
|
1259 |
+
"visibility": null,
|
1260 |
+
"width": null
|
1261 |
+
}
|
1262 |
+
},
|
1263 |
+
"be591abb84a24c4b9903087501ebb0e5": {
|
1264 |
+
"model_module": "@jupyter-widgets/base",
|
1265 |
+
"model_module_version": "1.2.0",
|
1266 |
+
"model_name": "LayoutModel",
|
1267 |
+
"state": {
|
1268 |
+
"_model_module": "@jupyter-widgets/base",
|
1269 |
+
"_model_module_version": "1.2.0",
|
1270 |
+
"_model_name": "LayoutModel",
|
1271 |
+
"_view_count": null,
|
1272 |
+
"_view_module": "@jupyter-widgets/base",
|
1273 |
+
"_view_module_version": "1.2.0",
|
1274 |
+
"_view_name": "LayoutView",
|
1275 |
+
"align_content": null,
|
1276 |
+
"align_items": null,
|
1277 |
+
"align_self": null,
|
1278 |
+
"border": null,
|
1279 |
+
"bottom": null,
|
1280 |
+
"display": null,
|
1281 |
+
"flex": null,
|
1282 |
+
"flex_flow": null,
|
1283 |
+
"grid_area": null,
|
1284 |
+
"grid_auto_columns": null,
|
1285 |
+
"grid_auto_flow": null,
|
1286 |
+
"grid_auto_rows": null,
|
1287 |
+
"grid_column": null,
|
1288 |
+
"grid_gap": null,
|
1289 |
+
"grid_row": null,
|
1290 |
+
"grid_template_areas": null,
|
1291 |
+
"grid_template_columns": null,
|
1292 |
+
"grid_template_rows": null,
|
1293 |
+
"height": null,
|
1294 |
+
"justify_content": null,
|
1295 |
+
"justify_items": null,
|
1296 |
+
"left": null,
|
1297 |
+
"margin": null,
|
1298 |
+
"max_height": null,
|
1299 |
+
"max_width": null,
|
1300 |
+
"min_height": null,
|
1301 |
+
"min_width": null,
|
1302 |
+
"object_fit": null,
|
1303 |
+
"object_position": null,
|
1304 |
+
"order": null,
|
1305 |
+
"overflow": null,
|
1306 |
+
"overflow_x": null,
|
1307 |
+
"overflow_y": null,
|
1308 |
+
"padding": null,
|
1309 |
+
"right": null,
|
1310 |
+
"top": null,
|
1311 |
+
"visibility": null,
|
1312 |
+
"width": null
|
1313 |
+
}
|
1314 |
+
},
|
1315 |
+
"c0a70bcdf3fb4bbfb2675b8012b2ef24": {
|
1316 |
+
"model_module": "@jupyter-widgets/controls",
|
1317 |
+
"model_module_version": "1.5.0",
|
1318 |
+
"model_name": "HBoxModel",
|
1319 |
+
"state": {
|
1320 |
+
"_dom_classes": [],
|
1321 |
+
"_model_module": "@jupyter-widgets/controls",
|
1322 |
+
"_model_module_version": "1.5.0",
|
1323 |
+
"_model_name": "HBoxModel",
|
1324 |
+
"_view_count": null,
|
1325 |
+
"_view_module": "@jupyter-widgets/controls",
|
1326 |
+
"_view_module_version": "1.5.0",
|
1327 |
+
"_view_name": "HBoxView",
|
1328 |
+
"box_style": "",
|
1329 |
+
"children": [
|
1330 |
+
"IPY_MODEL_665b9b5e85a34be8a20d40c51e57cfe0",
|
1331 |
+
"IPY_MODEL_b604cef3deca4847afcc459e5c8a9e0f",
|
1332 |
+
"IPY_MODEL_076728d713254b49935c7938d18014f2"
|
1333 |
+
],
|
1334 |
+
"layout": "IPY_MODEL_be591abb84a24c4b9903087501ebb0e5"
|
1335 |
+
}
|
1336 |
+
},
|
1337 |
+
"c266531dafcf4624af5fe9bcbc9d8df9": {
|
1338 |
+
"model_module": "@jupyter-widgets/controls",
|
1339 |
+
"model_module_version": "1.5.0",
|
1340 |
+
"model_name": "FloatProgressModel",
|
1341 |
+
"state": {
|
1342 |
+
"_dom_classes": [],
|
1343 |
+
"_model_module": "@jupyter-widgets/controls",
|
1344 |
+
"_model_module_version": "1.5.0",
|
1345 |
+
"_model_name": "FloatProgressModel",
|
1346 |
+
"_view_count": null,
|
1347 |
+
"_view_module": "@jupyter-widgets/controls",
|
1348 |
+
"_view_module_version": "1.5.0",
|
1349 |
+
"_view_name": "ProgressView",
|
1350 |
+
"bar_style": "success",
|
1351 |
+
"description": "",
|
1352 |
+
"description_tooltip": null,
|
1353 |
+
"layout": "IPY_MODEL_8141417665024172a4baa78c497acb69",
|
1354 |
+
"max": 14,
|
1355 |
+
"min": 0,
|
1356 |
+
"orientation": "horizontal",
|
1357 |
+
"style": "IPY_MODEL_01d27fdbe86a4ca2830b9bf3ccbf1ae9",
|
1358 |
+
"value": 14
|
1359 |
+
}
|
1360 |
+
},
|
1361 |
+
"e20a27a2f7764cb4b9537e34a3659c9a": {
|
1362 |
+
"model_module": "@jupyter-widgets/controls",
|
1363 |
+
"model_module_version": "1.5.0",
|
1364 |
+
"model_name": "HTMLModel",
|
1365 |
+
"state": {
|
1366 |
+
"_dom_classes": [],
|
1367 |
+
"_model_module": "@jupyter-widgets/controls",
|
1368 |
+
"_model_module_version": "1.5.0",
|
1369 |
+
"_model_name": "HTMLModel",
|
1370 |
+
"_view_count": null,
|
1371 |
+
"_view_module": "@jupyter-widgets/controls",
|
1372 |
+
"_view_module_version": "1.5.0",
|
1373 |
+
"_view_name": "HTMLView",
|
1374 |
+
"description": "",
|
1375 |
+
"description_tooltip": null,
|
1376 |
+
"layout": "IPY_MODEL_e4fe85a095e64d52b6a53c2a4bba8aeb",
|
1377 |
+
"placeholder": "",
|
1378 |
+
"style": "IPY_MODEL_70e17db8fc2f490f85b7af8aa664f0c7",
|
1379 |
+
"value": " 14/14 [00:00<00:00, 26.60it/s]"
|
1380 |
+
}
|
1381 |
+
},
|
1382 |
+
"e4fe85a095e64d52b6a53c2a4bba8aeb": {
|
1383 |
+
"model_module": "@jupyter-widgets/base",
|
1384 |
+
"model_module_version": "1.2.0",
|
1385 |
+
"model_name": "LayoutModel",
|
1386 |
+
"state": {
|
1387 |
+
"_model_module": "@jupyter-widgets/base",
|
1388 |
+
"_model_module_version": "1.2.0",
|
1389 |
+
"_model_name": "LayoutModel",
|
1390 |
+
"_view_count": null,
|
1391 |
+
"_view_module": "@jupyter-widgets/base",
|
1392 |
+
"_view_module_version": "1.2.0",
|
1393 |
+
"_view_name": "LayoutView",
|
1394 |
+
"align_content": null,
|
1395 |
+
"align_items": null,
|
1396 |
+
"align_self": null,
|
1397 |
+
"border": null,
|
1398 |
+
"bottom": null,
|
1399 |
+
"display": null,
|
1400 |
+
"flex": null,
|
1401 |
+
"flex_flow": null,
|
1402 |
+
"grid_area": null,
|
1403 |
+
"grid_auto_columns": null,
|
1404 |
+
"grid_auto_flow": null,
|
1405 |
+
"grid_auto_rows": null,
|
1406 |
+
"grid_column": null,
|
1407 |
+
"grid_gap": null,
|
1408 |
+
"grid_row": null,
|
1409 |
+
"grid_template_areas": null,
|
1410 |
+
"grid_template_columns": null,
|
1411 |
+
"grid_template_rows": null,
|
1412 |
+
"height": null,
|
1413 |
+
"justify_content": null,
|
1414 |
+
"justify_items": null,
|
1415 |
+
"left": null,
|
1416 |
+
"margin": null,
|
1417 |
+
"max_height": null,
|
1418 |
+
"max_width": null,
|
1419 |
+
"min_height": null,
|
1420 |
+
"min_width": null,
|
1421 |
+
"object_fit": null,
|
1422 |
+
"object_position": null,
|
1423 |
+
"order": null,
|
1424 |
+
"overflow": null,
|
1425 |
+
"overflow_x": null,
|
1426 |
+
"overflow_y": null,
|
1427 |
+
"padding": null,
|
1428 |
+
"right": null,
|
1429 |
+
"top": null,
|
1430 |
+
"visibility": null,
|
1431 |
+
"width": null
|
1432 |
+
}
|
1433 |
+
},
|
1434 |
+
"eb057e56f0f94e4993b8ae960c78b0ad": {
|
1435 |
+
"model_module": "@jupyter-widgets/base",
|
1436 |
+
"model_module_version": "1.2.0",
|
1437 |
+
"model_name": "LayoutModel",
|
1438 |
+
"state": {
|
1439 |
+
"_model_module": "@jupyter-widgets/base",
|
1440 |
+
"_model_module_version": "1.2.0",
|
1441 |
+
"_model_name": "LayoutModel",
|
1442 |
+
"_view_count": null,
|
1443 |
+
"_view_module": "@jupyter-widgets/base",
|
1444 |
+
"_view_module_version": "1.2.0",
|
1445 |
+
"_view_name": "LayoutView",
|
1446 |
+
"align_content": null,
|
1447 |
+
"align_items": null,
|
1448 |
+
"align_self": null,
|
1449 |
+
"border": null,
|
1450 |
+
"bottom": null,
|
1451 |
+
"display": null,
|
1452 |
+
"flex": null,
|
1453 |
+
"flex_flow": null,
|
1454 |
+
"grid_area": null,
|
1455 |
+
"grid_auto_columns": null,
|
1456 |
+
"grid_auto_flow": null,
|
1457 |
+
"grid_auto_rows": null,
|
1458 |
+
"grid_column": null,
|
1459 |
+
"grid_gap": null,
|
1460 |
+
"grid_row": null,
|
1461 |
+
"grid_template_areas": null,
|
1462 |
+
"grid_template_columns": null,
|
1463 |
+
"grid_template_rows": null,
|
1464 |
+
"height": null,
|
1465 |
+
"justify_content": null,
|
1466 |
+
"justify_items": null,
|
1467 |
+
"left": null,
|
1468 |
+
"margin": null,
|
1469 |
+
"max_height": null,
|
1470 |
+
"max_width": null,
|
1471 |
+
"min_height": null,
|
1472 |
+
"min_width": null,
|
1473 |
+
"object_fit": null,
|
1474 |
+
"object_position": null,
|
1475 |
+
"order": null,
|
1476 |
+
"overflow": null,
|
1477 |
+
"overflow_x": null,
|
1478 |
+
"overflow_y": null,
|
1479 |
+
"padding": null,
|
1480 |
+
"right": null,
|
1481 |
+
"top": null,
|
1482 |
+
"visibility": null,
|
1483 |
+
"width": null
|
1484 |
+
}
|
1485 |
+
}
|
1486 |
+
}
|
1487 |
+
}
|
1488 |
+
},
|
1489 |
+
"nbformat": 4,
|
1490 |
+
"nbformat_minor": 0
|
1491 |
+
}
|
notebooks/07-RAG_Improve_Chunking.ipynb
ADDED
The diff for this file is too large to render.
See raw diff
|
|
notebooks/08-Finetune_Embedding.ipynb
ADDED
The diff for this file is too large to render.
See raw diff
|
|
notebooks/09-Better_Embedding_Model.ipynb
ADDED
@@ -0,0 +1,1575 @@
|
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"colab_type": "text",
|
7 |
+
"id": "view-in-github"
|
8 |
+
},
|
9 |
+
"source": [
|
10 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/09-Better_Embedding_Model.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "markdown",
|
15 |
+
"metadata": {
|
16 |
+
"id": "-zE1h0uQV7uT"
|
17 |
+
},
|
18 |
+
"source": [
|
19 |
+
"# Install Packages and Setup Variables"
|
20 |
+
]
|
21 |
+
},
|
22 |
+
{
|
23 |
+
"cell_type": "code",
|
24 |
+
"execution_count": 14,
|
25 |
+
"metadata": {
|
26 |
+
"colab": {
|
27 |
+
"base_uri": "https://localhost:8080/"
|
28 |
+
},
|
29 |
+
"id": "QPJzr-I9XQ7l",
|
30 |
+
"outputId": "8e808cc4-4c21-474b-c5b7-f6841ee08020"
|
31 |
+
},
|
32 |
+
"outputs": [],
|
33 |
+
"source": [
|
34 |
+
"!pip install -q llama-index==0.10.11 openai==1.12.0 llama-index-finetuning llama-index-embeddings-huggingface llama-index-embeddings-cohere llama-index-readers-web cohere==4.47 tiktoken==0.6.0 chromadb==0.4.22 pandas==2.2.0 html2text sentence_transformers pydantic"
|
35 |
+
]
|
36 |
+
},
|
37 |
+
{
|
38 |
+
"cell_type": "code",
|
39 |
+
"execution_count": 11,
|
40 |
+
"metadata": {
|
41 |
+
"id": "riuXwpSPcvWC"
|
42 |
+
},
|
43 |
+
"outputs": [],
|
44 |
+
"source": [
|
45 |
+
"import os\n",
|
46 |
+
"\n",
|
47 |
+
"# Set the \"OPENAI_API_KEY\" and the \"CO_API_KEY\" (Cohere) in the Python environment. Will be used by OpenAI client later.\n",
|
48 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\"\n",
|
49 |
+
"os.environ[\"CO_API_KEY\"] = \"<YOUR_COHERE_KEY>\""
|
50 |
+
]
|
51 |
+
},
|
52 |
+
{
|
53 |
+
"cell_type": "code",
|
54 |
+
"execution_count": 2,
|
55 |
+
"metadata": {
|
56 |
+
"id": "jIEeZzqLbz0J"
|
57 |
+
},
|
58 |
+
"outputs": [],
|
59 |
+
"source": [
|
60 |
+
"# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
|
61 |
+
"\n",
|
62 |
+
"import nest_asyncio\n",
|
63 |
+
"\n",
|
64 |
+
"nest_asyncio.apply()"
|
65 |
+
]
|
66 |
+
},
|
67 |
+
{
|
68 |
+
"cell_type": "markdown",
|
69 |
+
"metadata": {
|
70 |
+
"id": "Bkgi2OrYzF7q"
|
71 |
+
},
|
72 |
+
"source": [
|
73 |
+
"# Load a Model"
|
74 |
+
]
|
75 |
+
},
|
76 |
+
{
|
77 |
+
"cell_type": "code",
|
78 |
+
"execution_count": 3,
|
79 |
+
"metadata": {
|
80 |
+
"id": "9oGT6crooSSj"
|
81 |
+
},
|
82 |
+
"outputs": [
|
83 |
+
{
|
84 |
+
"name": "stderr",
|
85 |
+
"output_type": "stream",
|
86 |
+
"text": [
|
87 |
+
"/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
88 |
+
" from .autonotebook import tqdm as notebook_tqdm\n"
|
89 |
+
]
|
90 |
+
}
|
91 |
+
],
|
92 |
+
"source": [
|
93 |
+
"from llama_index.llms.openai import OpenAI\n",
|
94 |
+
"\n",
|
95 |
+
"llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
96 |
+
]
|
97 |
+
},
|
98 |
+
{
|
99 |
+
"cell_type": "markdown",
|
100 |
+
"metadata": {
|
101 |
+
"id": "0BwVuJXlzHVL"
|
102 |
+
},
|
103 |
+
"source": [
|
104 |
+
"# Create a VectoreStore"
|
105 |
+
]
|
106 |
+
},
|
107 |
+
{
|
108 |
+
"cell_type": "code",
|
109 |
+
"execution_count": 4,
|
110 |
+
"metadata": {
|
111 |
+
"id": "SQP87lHczHKc"
|
112 |
+
},
|
113 |
+
"outputs": [],
|
114 |
+
"source": [
|
115 |
+
"import chromadb\n",
|
116 |
+
"\n",
|
117 |
+
"# create client and a new collection\n",
|
118 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
119 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
120 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
121 |
+
]
|
122 |
+
},
|
123 |
+
{
|
124 |
+
"cell_type": "code",
|
125 |
+
"execution_count": 5,
|
126 |
+
"metadata": {
|
127 |
+
"id": "zAaGcYMJzHAN"
|
128 |
+
},
|
129 |
+
"outputs": [],
|
130 |
+
"source": [
|
131 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
132 |
+
"\n",
|
133 |
+
"# Define a storage context object using the created vector database.\n",
|
134 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
135 |
+
]
|
136 |
+
},
|
137 |
+
{
|
138 |
+
"cell_type": "markdown",
|
139 |
+
"metadata": {
|
140 |
+
"id": "I9JbAzFcjkpn"
|
141 |
+
},
|
142 |
+
"source": [
|
143 |
+
"# Load the Dataset (CSV)"
|
144 |
+
]
|
145 |
+
},
|
146 |
+
{
|
147 |
+
"cell_type": "markdown",
|
148 |
+
"metadata": {
|
149 |
+
"id": "ceveDuYdWCYk"
|
150 |
+
},
|
151 |
+
"source": [
|
152 |
+
"## Download"
|
153 |
+
]
|
154 |
+
},
|
155 |
+
{
|
156 |
+
"cell_type": "markdown",
|
157 |
+
"metadata": {
|
158 |
+
"id": "eZwf6pv7WFmD"
|
159 |
+
},
|
160 |
+
"source": [
|
161 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a long string."
|
162 |
+
]
|
163 |
+
},
|
164 |
+
{
|
165 |
+
"cell_type": "code",
|
166 |
+
"execution_count": 6,
|
167 |
+
"metadata": {
|
168 |
+
"colab": {
|
169 |
+
"base_uri": "https://localhost:8080/"
|
170 |
+
},
|
171 |
+
"id": "wl_pbPvMlv1h",
|
172 |
+
"outputId": "bc9a0415-a1fb-4e89-a2b4-165420106b34"
|
173 |
+
},
|
174 |
+
"outputs": [
|
175 |
+
{
|
176 |
+
"name": "stdout",
|
177 |
+
"output_type": "stream",
|
178 |
+
"text": [
|
179 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
180 |
+
" Dload Upload Total Spent Left Speed\n",
|
181 |
+
"100 169k 100 169k 0 0 856k 0 --:--:-- --:--:-- --:--:-- 860k\n"
|
182 |
+
]
|
183 |
+
}
|
184 |
+
],
|
185 |
+
"source": [
|
186 |
+
"!curl -o ./mini-llama-articles.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
187 |
+
]
|
188 |
+
},
|
189 |
+
{
|
190 |
+
"cell_type": "markdown",
|
191 |
+
"metadata": {
|
192 |
+
"id": "VWBLtDbUWJfA"
|
193 |
+
},
|
194 |
+
"source": [
|
195 |
+
"## Read File"
|
196 |
+
]
|
197 |
+
},
|
198 |
+
{
|
199 |
+
"cell_type": "code",
|
200 |
+
"execution_count": 7,
|
201 |
+
"metadata": {
|
202 |
+
"colab": {
|
203 |
+
"base_uri": "https://localhost:8080/"
|
204 |
+
},
|
205 |
+
"id": "0Q9sxuW0g3Gd",
|
206 |
+
"outputId": "a8361aa6-522d-4def-e49b-ed08d9c8e7d1"
|
207 |
+
},
|
208 |
+
"outputs": [
|
209 |
+
{
|
210 |
+
"data": {
|
211 |
+
"text/plain": [
|
212 |
+
"14"
|
213 |
+
]
|
214 |
+
},
|
215 |
+
"execution_count": 7,
|
216 |
+
"metadata": {},
|
217 |
+
"output_type": "execute_result"
|
218 |
+
}
|
219 |
+
],
|
220 |
+
"source": [
|
221 |
+
"import csv\n",
|
222 |
+
"\n",
|
223 |
+
"rows = []\n",
|
224 |
+
"\n",
|
225 |
+
"# Load the file as a JSON\n",
|
226 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
227 |
+
" csv_reader = csv.reader(file)\n",
|
228 |
+
"\n",
|
229 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
230 |
+
" if idx == 0: continue; # Skip header row\n",
|
231 |
+
" rows.append( row )\n",
|
232 |
+
"\n",
|
233 |
+
"# The number of characters in the dataset.\n",
|
234 |
+
"len( rows )"
|
235 |
+
]
|
236 |
+
},
|
237 |
+
{
|
238 |
+
"cell_type": "markdown",
|
239 |
+
"metadata": {
|
240 |
+
"id": "S17g2RYOjmf2"
|
241 |
+
},
|
242 |
+
"source": [
|
243 |
+
"# Convert to Document obj"
|
244 |
+
]
|
245 |
+
},
|
246 |
+
{
|
247 |
+
"cell_type": "code",
|
248 |
+
"execution_count": 8,
|
249 |
+
"metadata": {
|
250 |
+
"id": "YizvmXPejkJE"
|
251 |
+
},
|
252 |
+
"outputs": [],
|
253 |
+
"source": [
|
254 |
+
"from llama_index.core import Document\n",
|
255 |
+
"\n",
|
256 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
257 |
+
"documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
|
258 |
+
]
|
259 |
+
},
|
260 |
+
{
|
261 |
+
"cell_type": "markdown",
|
262 |
+
"metadata": {
|
263 |
+
"id": "qjuLbmFuWsyl"
|
264 |
+
},
|
265 |
+
"source": [
|
266 |
+
"# Transforming"
|
267 |
+
]
|
268 |
+
},
|
269 |
+
{
|
270 |
+
"cell_type": "code",
|
271 |
+
"execution_count": 9,
|
272 |
+
"metadata": {
|
273 |
+
"id": "9z3t70DGWsjO"
|
274 |
+
},
|
275 |
+
"outputs": [],
|
276 |
+
"source": [
|
277 |
+
"from llama_index.core.text_splitter import TokenTextSplitter\n",
|
278 |
+
"\n",
|
279 |
+
"# Define the splitter object that split the text into segments with 512 tokens,\n",
|
280 |
+
"# with a 128 overlap between the segments.\n",
|
281 |
+
"text_splitter = TokenTextSplitter(\n",
|
282 |
+
" separator=\" \", chunk_size=512, chunk_overlap=128\n",
|
283 |
+
")"
|
284 |
+
]
|
285 |
+
},
|
286 |
+
{
|
287 |
+
"cell_type": "markdown",
|
288 |
+
"metadata": {
|
289 |
+
"id": "y28yMy0GxfGR"
|
290 |
+
},
|
291 |
+
"source": [
|
292 |
+
"There are two options to use the Cohere embeddings:\n",
|
293 |
+
"\n",
|
294 |
+
"- input_type=\"search_document\": Employ this option for texts (documents) intended for storage in your vector database.\n",
|
295 |
+
"\n",
|
296 |
+
"- input_type=\"search_query\": Use this when issuing search queries to locate the most related documents within your vector database."
|
297 |
+
]
|
298 |
+
},
|
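As a minimal illustration of the two modes (an addition, not part of the original notebook): the same Cohere model is instantiated once per role, and document and query embeddings must come from the same model for the similarity scores to be meaningful.

```python
from llama_index.embeddings.cohere import CohereEmbedding

# Embedder for indexing: documents stored in the vector database.
doc_embedder = CohereEmbedding(
    model_name="embed-english-v3.0", input_type="search_document"
)

# Embedder for retrieval: user queries issued against the same database.
query_embedder = CohereEmbedding(
    model_name="embed-english-v3.0", input_type="search_query"
)
```

The notebook follows exactly this split: the ingestion pipeline below embeds with `search_document`, while the query-time embedder defined later uses `search_query`.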
```python
from llama_index.core.extractors import (
    SummaryExtractor,
    QuestionsAnsweredExtractor,
    KeywordExtractor,
)
from llama_index.embeddings.cohere import CohereEmbedding
from llama_index.core.ingestion import IngestionPipeline

# Create the pipeline to apply the transformations to each chunk
# and store the embedded chunks in the Chroma vector store.
pipeline = IngestionPipeline(
    transformations=[
        text_splitter,
        QuestionsAnsweredExtractor(questions=3, llm=llm),
        SummaryExtractor(summaries=["prev", "self"], llm=llm),
        KeywordExtractor(keywords=10, llm=llm),
        CohereEmbedding(model_name="embed-english-v3.0", input_type="search_document"),
    ],
    vector_store=vector_store,
)

# Run the transformation pipeline.
nodes = pipeline.run(documents=documents, show_progress=True)
```

Output (progress bars): parsing the 14 documents into 108 nodes, three LLM extraction passes over the 108 nodes, then "Generating embeddings: 108/108".
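To see what the extractors attached, you can inspect a node's metadata. This check is an addition to the original notebook; the key names listed in the comment are the defaults these extractors used around this llama-index version and may differ in other releases.

```python
# Inspect the metadata the extractors added to the first chunk.
# Expected keys (version-dependent) include the original document fields plus
# "questions_this_excerpt_can_answer", "section_summary",
# "prev_section_summary", and "excerpt_keywords".
print(nodes[0].metadata.keys())
```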
```python
len(nodes)
```

Output: `108`

```python
len(nodes[0].embedding)
```

Output: `1024` (the dimensionality of Cohere's `embed-english-v3.0` vectors)

```python
# Compress the vector store directory to a zip file to be able to download and use later.
!zip -r vectorstore_cohere.zip mini-llama-articles
```

Output: the `mini-llama-articles` directory (the Chroma binary segment files and `chroma.sqlite3`) is added to `vectorstore_cohere.zip`.

# Load Indexes

If you have already uploaded the zip file for the vector store checkpoint, please uncomment the code in the following cell block to extract its contents. After doing so, you will be able to load the dataset from local storage.

```python
# !unzip vectorstore_cohere.zip
```

```python
# Load the vector store from the local storage.
db = chromadb.PersistentClient(path="./mini-llama-articles")
chroma_collection = db.get_or_create_collection("mini-llama-articles")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
```

```python
from llama_index.core import ServiceContext

# Define the Cohere embedding model, this time in query mode.
embed_model = CohereEmbedding(
    model_name="embed-english-v3.0",
    input_type="search_query",
)

# Define the ServiceContext object to tie together the LLM that generates the
# final answer and the embedding model that retrieves related nodes.
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model=embed_model
)
```

Running this cell emits a `DeprecationWarning`: `ServiceContext.from_defaults` is deprecated since llama-index 0.10.0 in favor of `llama_index.settings.Settings`.
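As a minimal sketch of the non-deprecated path (an addition, not from the original notebook): llama-index 0.10 replaces `ServiceContext` with the global `Settings` object, after which indexes and query engines pick up the defaults without an explicit context argument.

```python
from llama_index.core import Settings

# Global defaults used by subsequent index/query construction
# (replaces the deprecated ServiceContext in llama-index >= 0.10).
Settings.llm = llm
Settings.embed_model = embed_model
```

With these set, the next cell could be written as `VectorStoreIndex.from_vector_store(vector_store)`; the notebook keeps the `service_context` form, which still works in this version.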
```python
from llama_index.core import VectorStoreIndex

# Create the index based on the vector store.
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)
```

# Query Dataset

```python
# Define a query engine that is responsible for retrieving related pieces of text
# and using an LLM to formulate the final answer.
query_engine = index.as_query_engine()

res = query_engine.query("How many parameters LLaMA2 model has?")
```

```python
res.response
```

Output: `'LLaMA2 model has a total of 2 trillion parameters.'`

Note that this answer conflates the training-token count with the parameter count: the retrieved passage below states that pre-training reached 2T tokens, while the released model sizes are 7B, 13B, and 70B parameters.

```python
# Show the retrieved nodes.
for src in res.source_nodes:
    print("Node ID\t", src.node_id)
    print("Title\t", src.metadata["title"])
    print("Text\t", src.text)
    print("Score\t", src.score)
    print("-_" * 20)
```

Output:

```
Node ID	 0a3368de-02cc-4cb2-8579-3379e9c68101
Title	 Fine-Tuning a Llama-2 7B Model for Python Code Generation
Text	 New Llama-2 model In mid-July, Meta released its new family of pre-trained and finetuned models called Llama-2, with an open source and commercial character to facilitate its use and expansion. The base model was released with a chat version and sizes 7B, 13B, and 70B. Together with the models, the corresponding papers were published describing their characteristics and relevant points of the learning process, which provide very interesting information on the subject. For pre-training, 40% more tokens were used, reaching 2T, the context length was doubled and the grouped-query attention (GQA) technique was applied to speed up inference on the heavier 70B model. On the standard transformer architecture, RMSNorm normalization, SwiGLU activation, and rotatory positional embedding are used, the context length reaches 4096 tokens, and an Adam optimizer is applied with a cosine learning rate schedule, a weight decay of 0.1 and gradient clipping. The dataset for tuning For our tuning process, we will take a dataset containing about 18,000 examples where the model is asked to build a Python code that solves a given task. This is an extraction of the original dataset [2], where only the Python language examples are selected. Each row contains the description of the task to be solved, an example of data input to the task if applicable, and the generated code fragment that solves the task is provided [3]. Creating the prompt To carry out an instruction fine-tuning, we must transform each one of our data examples as if it were an instruction, outlining its main sections as follows: Output: Fine-tuning the model To carry out this stage, we have used the Google Colab environment, where we have developed a notebook that allows us to run the training in an interactive way and also a Python script to run the training in unattended mode. For the first test runs, a T4 instance with a high RAM capacity is enough, but when it comes to running the whole dataset and epochs, we have opted to use an A100 instance in order to speed up the training and ensure that its execution time is reasonable. In order to be able to
Score	 0.4173821910560196
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
Node ID	 b2b33887-2da1-4838-903e-8e126224095d
Title	 Fine-Tuning a Llama-2 7B Model for Python Code Generation
Text	 if it were an instruction, outlining its main sections as follows: Output: Fine-tuning the model To carry out this stage, we have used the Google Colab environment, where we have developed a notebook that allows us to run the training in an interactive way and also a Python script to run the training in unattended mode. For the first test runs, a T4 instance with a high RAM capacity is enough, but when it comes to running the whole dataset and epochs, we have opted to use an A100 instance in order to speed up the training and ensure that its execution time is reasonable. In order to be able to share the model, we will log in to the Huggingface hub using the appropriate token, so that at the end of the whole process, we will upload the model files so that they can be shared with the rest of the users. Fine-tuning techniques: PEFT, Lora, and QLora In recent months, some papers have appeared showing how PEFT techniques can be used to train large language models with a drastic reduction of RAM requirements and consequently allowing fine-tuning of these models on a single GPU of reasonable size. The usual steps to train an LLM consist, first, an intensive pre-training on billions or trillions of tokens to obtain a foundation model, and then a fine-tuning is performed on this model to specialize it on a downstream task. In this fine-tuning phase is where the PEFT technique has its purpose. Parameter Efficient Fine-Tuning (PEFT) allows us to considerably reduce RAM and storage requirements by only fine-tuning a small number of additional parameters, with virtually all model parameters remaining frozen. PEFT has been found to produce good generalization with relatively low-volume datasets. Furthermore, it enhances the reusability and portability of the model, as the small checkpoints obtained can be easily added to the base model, and the base model can be easily fine-tuned and reused in multiple scenarios by adding the PEFT parameters. Finally, since the base model is not adjusted, all the knowledge acquired in the pre-training phase is preserved, thus avoiding catastrophic forgetting. Most widely used PEFT techniques aim to keep the pre-trained base model untouched
Score	 0.4013547787636657
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
```

# Evaluate

```python
from llama_index.core.evaluation import generate_question_context_pairs
from llama_index.llms.openai import OpenAI

# Create questions for each segment. These questions will be used to
# assess whether the retriever can accurately identify and return the
# corresponding segment when queried.
llm = OpenAI(model="gpt-3.5-turbo-0125")
rag_eval_dataset = generate_question_context_pairs(
    nodes,
    llm=llm,
    num_questions_per_chunk=1,
)

# We can save the evaluation dataset as a JSON file for later use.
rag_eval_dataset.save_json("./rag_eval_dataset_cohere.json")
```

If you have uploaded the generated question JSON file, please uncomment the code in the next cell block. This avoids regenerating the questions, saving you time and effort.

```python
# from llama_index.finetuning.embeddings.common import (
#     EmbeddingQAFinetuneDataset,
# )
# rag_eval_dataset = EmbeddingQAFinetuneDataset.from_json(
#     "./rag_eval_dataset_cohere.json"
# )
```

```python
import pandas as pd

# A simple function to show the evaluation result.
def display_results_retriever(name, eval_results):
    """Display results from evaluate."""

    metric_dicts = []
    for eval_result in eval_results:
        metric_dict = eval_result.metric_vals_dict
        metric_dicts.append(metric_dict)

    full_df = pd.DataFrame(metric_dicts)

    hit_rate = full_df["hit_rate"].mean()
    mrr = full_df["mrr"].mean()

    metric_df = pd.DataFrame(
        {"Retriever Name": [name], "Hit Rate": [hit_rate], "MRR": [mrr]}
    )

    return metric_df
```

```python
from llama_index.core.evaluation import RetrieverEvaluator

# We can evaluate the retrievers with different top_k values.
for i in [2, 4, 6, 8, 10]:
    retriever = index.as_retriever(similarity_top_k=i)
    retriever_evaluator = RetrieverEvaluator.from_metric_names(
        ["mrr", "hit_rate"], retriever=retriever
    )
    eval_results = await retriever_evaluator.aevaluate_dataset(rag_eval_dataset)
    print(display_results_retriever(f"Retriever top_{i}", eval_results))
```

Output:

| Retriever Name  | Hit Rate | MRR      |
|-----------------|----------|----------|
| Retriever top_2 | 0.677355 | 0.562124 |
| Retriever top_4 | 0.815631 | 0.606045 |
| Retriever top_6 | 0.865731 | 0.615331 |
| Retriever top_8 | 0.887776 | 0.618301 |
| Retriever top_10| 0.8998   | 0.619592 |
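For readers unfamiliar with the two metrics reported above, here is a small self-contained sketch (an addition, not from the notebook) of how hit rate and MRR are computed from ranked retrieval results; the sample data is made up.

```python
# Hit rate: fraction of queries whose expected chunk appears in the top-k results.
# MRR: mean of 1/rank of the expected chunk (contributing 0 if it is not retrieved).
def hit_rate_and_mrr(expected_ids, ranked_results):
    hits, reciprocal_ranks = 0, 0.0
    for expected, ranked in zip(expected_ids, ranked_results):
        if expected in ranked:
            hits += 1
            reciprocal_ranks += 1.0 / (ranked.index(expected) + 1)
    n = len(expected_ids)
    return hits / n, reciprocal_ranks / n

# Three hypothetical queries; the expected chunk is found at rank 1, rank 2, and not at all.
expected = ["a", "b", "c"]
retrieved = [["a", "x"], ["y", "b"], ["x", "y"]]
print(hit_rate_and_mrr(expected, retrieved))  # (0.666..., 0.5)
```

This explains why hit rate climbs steadily with top_k (more chances to contain the expected chunk) while MRR saturates (late hits contribute only small reciprocal ranks).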
```python
from llama_index.core.evaluation import RelevancyEvaluator, FaithfulnessEvaluator, BatchEvalRunner
from llama_index.core import ServiceContext
from llama_index.llms.openai import OpenAI

for i in [2, 4, 6, 8, 10]:
    # Set up the query engine under test with the current top_k.
    query_engine = index.as_query_engine(similarity_top_k=i)

    # While we use GPT-3.5-Turbo to answer questions, we can use GPT-4 to evaluate the answers.
    llm_gpt4 = OpenAI(temperature=0, model="gpt-4-0125-preview")
    service_context_gpt4 = ServiceContext.from_defaults(llm=llm_gpt4)

    # Set the Faithfulness and Relevancy evaluators.
    faithfulness_evaluator = FaithfulnessEvaluator(service_context=service_context_gpt4)
    relevancy_evaluator = RelevancyEvaluator(service_context=service_context_gpt4)

    # Run evaluation on the first 20 generated questions.
    queries = list(rag_eval_dataset.queries.values())
    batch_eval_queries = queries[:20]

    runner = BatchEvalRunner(
        {"faithfulness": faithfulness_evaluator, "relevancy": relevancy_evaluator},
        workers=8,
    )
    eval_results = await runner.aevaluate_queries(
        query_engine, queries=batch_eval_queries
    )
    faithfulness_score = sum(result.passing for result in eval_results["faithfulness"]) / len(eval_results["faithfulness"])
    print(f"top_{i} faithfulness_score: {faithfulness_score}")

    # Note: the original cell mistakenly summed the faithfulness results here;
    # the relevancy score must be computed from eval_results["relevancy"].
    relevancy_score = sum(result.passing for result in eval_results["relevancy"]) / len(eval_results["relevancy"])
    print(f"top_{i} relevancy_score: {relevancy_score}")
    print("-_" * 10)
```

Output (after the same `ServiceContext` deprecation warning as above; the recorded relevancy numbers mirror the faithfulness numbers because they were produced by the original, buggy line):

```
top_2 faithfulness_score: 1.0
top_2 relevancy_score: 1.0
-_-_-_-_-_-_-_-_-_-_
top_4 faithfulness_score: 1.0
top_4 relevancy_score: 1.0
-_-_-_-_-_-_-_-_-_-_
top_6 faithfulness_score: 1.0
top_6 relevancy_score: 1.0
-_-_-_-_-_-_-_-_-_-_
top_8 faithfulness_score: 0.45
top_8 relevancy_score: 0.45
-_-_-_-_-_-_-_-_-_-_
top_10 faithfulness_score: 0.65
top_10 relevancy_score: 0.65
-_-_-_-_-_-_-_-_-_-_
```

The notebook ends with an empty code cell.
(Notebook metadata omitted: Colab provenance, Python 3 kernel (3.11.8), and the Jupyter widget display state for the progress bars shown above; nbformat 4.)
notebooks/10-Adding_Reranking.ipynb
ADDED
@@ -0,0 +1,1462 @@
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"colab_type": "text",
|
7 |
+
"id": "view-in-github"
|
8 |
+
},
|
9 |
+
"source": [
|
10 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/10-Adding_Reranking.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "markdown",
|
15 |
+
"metadata": {
|
16 |
+
"id": "-zE1h0uQV7uT"
|
17 |
+
},
|
18 |
+
"source": [
|
19 |
+
"# Install Packages and Setup Variables"
|
20 |
+
]
|
21 |
+
},
|
22 |
+
{
|
23 |
+
"cell_type": "code",
|
24 |
+
"execution_count": 1,
|
25 |
+
"metadata": {
|
26 |
+
"colab": {
|
27 |
+
"base_uri": "https://localhost:8080/"
|
28 |
+
},
|
29 |
+
"id": "QPJzr-I9XQ7l",
|
30 |
+
"outputId": "440f5d93-1cac-4a70-e244-5e8af314464e"
|
31 |
+
},
|
32 |
+
"outputs": [],
|
33 |
+
"source": [
|
34 |
+
"!pip install -q llama-index==0.10.11 openai==1.12.0 llama-index-finetuning llama-index-embeddings-huggingface llama-index-embeddings-cohere llama-index-readers-web cohere==4.47 tiktoken==0.6.0 chromadb==0.4.22 pandas==2.2.0 html2text sentence_transformers pydantic"
|
35 |
+
]
|
36 |
+
},
|
37 |
+
{
|
38 |
+
"cell_type": "code",
|
39 |
+
"execution_count": 21,
|
40 |
+
"metadata": {
|
41 |
+
"id": "riuXwpSPcvWC"
|
42 |
+
},
|
43 |
+
"outputs": [],
|
44 |
+
"source": [
|
45 |
+
"import os\n",
|
46 |
+
"\n",
|
47 |
+
"# Set the \"OPENAI_API_KEY\" and \"CO_API_KEY\" (Cohere) in the Python environment.\n",
|
48 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\"\n",
|
49 |
+
"os.environ[\"CO_API_KEY\"] = \"<YOUR_COHERE_KEY>\"\n",
|
50 |
+
"cohere_key = os.environ[\"CO_API_KEY\"]"
|
51 |
+
]
|
52 |
+
},
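As a side note, hard-coding keys in a notebook is easy to leak. A minimal sketch of an interactive alternative, assuming the notebook is run in an interactive session (same variable names as the cell above):

```python
import os
from getpass import getpass

# Prompt for the keys only if they are not already set in the environment.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")
if not os.environ.get("CO_API_KEY"):
    os.environ["CO_API_KEY"] = getpass("Cohere API key: ")

cohere_key = os.environ["CO_API_KEY"]
```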
|
53 |
+
{
|
54 |
+
"cell_type": "code",
|
55 |
+
"execution_count": 2,
|
56 |
+
"metadata": {
|
57 |
+
"id": "jIEeZzqLbz0J"
|
58 |
+
},
|
59 |
+
"outputs": [],
|
60 |
+
"source": [
|
61 |
+
"# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
|
62 |
+
"\n",
|
63 |
+
"import nest_asyncio\n",
|
64 |
+
"\n",
|
65 |
+
"nest_asyncio.apply()"
|
66 |
+
]
|
67 |
+
},
|
68 |
+
{
|
69 |
+
"cell_type": "markdown",
|
70 |
+
"metadata": {
|
71 |
+
"id": "Bkgi2OrYzF7q"
|
72 |
+
},
|
73 |
+
"source": [
|
74 |
+
"# Load a Model"
|
75 |
+
]
|
76 |
+
},
|
77 |
+
{
|
78 |
+
"cell_type": "code",
|
79 |
+
"execution_count": 3,
|
80 |
+
"metadata": {
|
81 |
+
"id": "9oGT6crooSSj"
|
82 |
+
},
|
83 |
+
"outputs": [
|
84 |
+
{
|
85 |
+
"name": "stderr",
|
86 |
+
"output_type": "stream",
|
87 |
+
"text": [
|
88 |
+
"/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
89 |
+
" from .autonotebook import tqdm as notebook_tqdm\n"
|
90 |
+
]
|
91 |
+
}
|
92 |
+
],
|
93 |
+
"source": [
|
94 |
+
"from llama_index.llms.openai import OpenAI\n",
|
95 |
+
"\n",
|
96 |
+
"llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
97 |
+
]
|
98 |
+
},
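Before building the rest of the pipeline, a quick sanity check that the model is reachable can save a confusing failure later; `complete` is the synchronous completion call on LlamaIndex LLM objects:

```python
# One-off completion to confirm the API key and model name are valid.
print(llm.complete("Reply with the single word: ready").text)
```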
|
99 |
+
{
|
100 |
+
"cell_type": "markdown",
|
101 |
+
"metadata": {
|
102 |
+
"id": "0BwVuJXlzHVL"
|
103 |
+
},
|
104 |
+
"source": [
|
105 |
+
"# Create a VectoreStore"
|
106 |
+
]
|
107 |
+
},
|
108 |
+
{
|
109 |
+
"cell_type": "code",
|
110 |
+
"execution_count": 4,
|
111 |
+
"metadata": {
|
112 |
+
"id": "SQP87lHczHKc"
|
113 |
+
},
|
114 |
+
"outputs": [],
|
115 |
+
"source": [
|
116 |
+
"import chromadb\n",
|
117 |
+
"\n",
|
118 |
+
"# create client and a new collection\n",
|
119 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
120 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
121 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
122 |
+
]
|
123 |
+
},
|
124 |
+
{
|
125 |
+
"cell_type": "code",
|
126 |
+
"execution_count": 5,
|
127 |
+
"metadata": {
|
128 |
+
"id": "zAaGcYMJzHAN"
|
129 |
+
},
|
130 |
+
"outputs": [],
|
131 |
+
"source": [
|
132 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
133 |
+
"\n",
|
134 |
+
"# Define a storage context object using the created vector database.\n",
|
135 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
136 |
+
]
|
137 |
+
},
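It can be worth verifying that the new collection starts empty, since `create_collection` raises an error if one with the same name already exists in that directory; a small check using the chromadb collection API:

```python
# The freshly created collection should report zero stored embeddings.
print("Embeddings in collection:", chroma_collection.count())
```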
|
138 |
+
{
|
139 |
+
"cell_type": "markdown",
|
140 |
+
"metadata": {
|
141 |
+
"id": "I9JbAzFcjkpn"
|
142 |
+
},
|
143 |
+
"source": [
|
144 |
+
"# Load the Dataset (CSV)"
|
145 |
+
]
|
146 |
+
},
|
147 |
+
{
|
148 |
+
"cell_type": "markdown",
|
149 |
+
"metadata": {
|
150 |
+
"id": "ceveDuYdWCYk"
|
151 |
+
},
|
152 |
+
"source": [
|
153 |
+
"## Download"
|
154 |
+
]
|
155 |
+
},
|
156 |
+
{
|
157 |
+
"cell_type": "markdown",
|
158 |
+
"metadata": {
|
159 |
+
"id": "eZwf6pv7WFmD"
|
160 |
+
},
|
161 |
+
"source": [
|
162 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a long string."
|
163 |
+
]
|
164 |
+
},
|
165 |
+
{
|
166 |
+
"cell_type": "code",
|
167 |
+
"execution_count": 6,
|
168 |
+
"metadata": {
|
169 |
+
"colab": {
|
170 |
+
"base_uri": "https://localhost:8080/"
|
171 |
+
},
|
172 |
+
"id": "wl_pbPvMlv1h",
|
173 |
+
"outputId": "f844a7a8-484b-4693-8715-42506778b1de"
|
174 |
+
},
|
175 |
+
"outputs": [
|
176 |
+
{
|
177 |
+
"name": "stdout",
|
178 |
+
"output_type": "stream",
|
179 |
+
"text": [
|
180 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
181 |
+
" Dload Upload Total Spent Left Speed\n",
|
182 |
+
"100 169k 100 169k 0 0 768k 0 --:--:-- --:--:-- --:--:-- 770k\n"
|
183 |
+
]
|
184 |
+
}
|
185 |
+
],
|
186 |
+
"source": [
|
187 |
+
"!curl -o ./mini-llama-articles.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
188 |
+
]
|
189 |
+
},
|
190 |
+
{
|
191 |
+
"cell_type": "markdown",
|
192 |
+
"metadata": {
|
193 |
+
"id": "VWBLtDbUWJfA"
|
194 |
+
},
|
195 |
+
"source": [
|
196 |
+
"## Read File"
|
197 |
+
]
|
198 |
+
},
|
199 |
+
{
|
200 |
+
"cell_type": "code",
|
201 |
+
"execution_count": 7,
|
202 |
+
"metadata": {
|
203 |
+
"colab": {
|
204 |
+
"base_uri": "https://localhost:8080/"
|
205 |
+
},
|
206 |
+
"id": "0Q9sxuW0g3Gd",
|
207 |
+
"outputId": "473050f8-0640-4e7c-91e7-3ea3485cfb51"
|
208 |
+
},
|
209 |
+
"outputs": [
|
210 |
+
{
|
211 |
+
"data": {
|
212 |
+
"text/plain": [
|
213 |
+
"14"
|
214 |
+
]
|
215 |
+
},
|
216 |
+
"execution_count": 7,
|
217 |
+
"metadata": {},
|
218 |
+
"output_type": "execute_result"
|
219 |
+
}
|
220 |
+
],
|
221 |
+
"source": [
|
222 |
+
"import csv\n",
|
223 |
+
"\n",
|
224 |
+
"rows = []\n",
|
225 |
+
"\n",
|
226 |
+
"# Load the file as a JSON\n",
|
227 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
228 |
+
" csv_reader = csv.reader(file)\n",
|
229 |
+
"\n",
|
230 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
231 |
+
" if idx == 0: continue; # Skip header row\n",
|
232 |
+
" rows.append( row )\n",
|
233 |
+
"\n",
|
234 |
+
"# The number of characters in the dataset.\n",
|
235 |
+
"len( rows )"
|
236 |
+
]
|
237 |
+
},
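The same file can also be loaded with pandas (pinned in the install cell above); an equivalent sketch, assuming the column order matches the indices used in the next cell (title, content, url, source_name):

```python
import pandas as pd

# Read the CSV; pandas handles the header row for us.
df = pd.read_csv("./mini-llama-articles.csv")
rows = df.values.tolist()
print(len(rows))  # 14 articles, matching the csv.reader cell
```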
|
238 |
+
{
|
239 |
+
"cell_type": "markdown",
|
240 |
+
"metadata": {
|
241 |
+
"id": "S17g2RYOjmf2"
|
242 |
+
},
|
243 |
+
"source": [
|
244 |
+
"# Convert to Document obj"
|
245 |
+
]
|
246 |
+
},
|
247 |
+
{
|
248 |
+
"cell_type": "code",
|
249 |
+
"execution_count": 8,
|
250 |
+
"metadata": {
|
251 |
+
"id": "YizvmXPejkJE"
|
252 |
+
},
|
253 |
+
"outputs": [],
|
254 |
+
"source": [
|
255 |
+
"from llama_index.core import Document\n",
|
256 |
+
"\n",
|
257 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
258 |
+
"documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
|
259 |
+
]
|
260 |
+
},
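A quick peek at the first `Document` confirms the row-to-field mapping before any tokens are spent on the transformations:

```python
# Inspect the metadata dict and the start of the article body.
print(documents[0].metadata)    # {'title': ..., 'url': ..., 'source_name': ...}
print(documents[0].text[:200])  # first 200 characters of the content
```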
|
261 |
+
{
|
262 |
+
"cell_type": "markdown",
|
263 |
+
"metadata": {
|
264 |
+
"id": "qjuLbmFuWsyl"
|
265 |
+
},
|
266 |
+
"source": [
|
267 |
+
"# Transforming"
|
268 |
+
]
|
269 |
+
},
|
270 |
+
{
|
271 |
+
"cell_type": "code",
|
272 |
+
"execution_count": 9,
|
273 |
+
"metadata": {
|
274 |
+
"id": "9z3t70DGWsjO"
|
275 |
+
},
|
276 |
+
"outputs": [],
|
277 |
+
"source": [
|
278 |
+
"from llama_index.core.text_splitter import TokenTextSplitter\n",
|
279 |
+
"\n",
|
280 |
+
"# Define the splitter object that split the text into segments with 512 tokens,\n",
|
281 |
+
"# with a 128 overlap between the segments.\n",
|
282 |
+
"text_splitter = TokenTextSplitter(\n",
|
283 |
+
" separator=\" \", chunk_size=512, chunk_overlap=128\n",
|
284 |
+
")"
|
285 |
+
]
|
286 |
+
},
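To see what the 128-token overlap means in practice, the splitter can be run directly on a synthetic string; consecutive chunks share their boundary tokens, which preserves context across chunk borders (illustrative input only):

```python
# Build a long fake document and split it with the splitter defined above.
sample = " ".join(f"tok{i}" for i in range(1500))
chunks = text_splitter.split_text(sample)

print(len(chunks))                            # number of 512-token chunks
print(chunks[0][-40:], "||", chunks[1][:40])  # the shared boundary region
```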
|
287 |
+
{
|
288 |
+
"cell_type": "code",
|
289 |
+
"execution_count": 10,
|
290 |
+
"metadata": {
|
291 |
+
"colab": {
|
292 |
+
"base_uri": "https://localhost:8080/",
|
293 |
+
"height": 413,
|
294 |
+
"referenced_widgets": [
|
295 |
+
"4bb1e341a77d41c9aca0e6680911fb43",
|
296 |
+
"1d1faa15f5564b68b948eaffa58626b3",
|
297 |
+
"df22a67ae80b4673b708eea74646be61",
|
298 |
+
"3657dc19b6ac477b9f05bb6519271473",
|
299 |
+
"9045e402f0344428acc085d63df7ff03",
|
300 |
+
"f57a9ac0d924408fbaaac795c172862e",
|
301 |
+
"4cb8ba074b254e91b8877cc87ae0d279",
|
302 |
+
"cbd3e1411b2c4eeb943243c9d45245c4",
|
303 |
+
"04af736f84044e37aa6599aa708a77bc",
|
304 |
+
"8d35ab8c65ba47e1be446b98f0942ac4",
|
305 |
+
"75e40756175f463e874630f229ef4066",
|
306 |
+
"a0dd5f2c99b2407f9f5705587976ae76",
|
307 |
+
"8728ca516bd0474586b19e0c9b457499",
|
308 |
+
"aac433a9a64c48dfb18d7a01f64d3b27",
|
309 |
+
"4802a63f700e48fca16b5d89fbab333d",
|
310 |
+
"3f55aef52aee4e77864d53e3197c3cc3",
|
311 |
+
"f41df4b6ab4c4132b0d20232002f0294",
|
312 |
+
"3a621edd23354ea5924189885c97dee4",
|
313 |
+
"73d34cae940e4748a7b3127351925e65",
|
314 |
+
"2dc4a6c935ac4ef38ed9030608bd4b2f",
|
315 |
+
"4fcebf4a9ef54729889cc6ad4cbe5d10",
|
316 |
+
"195aa202b03a42a3a674e9da2f13d878"
|
317 |
+
]
|
318 |
+
},
|
319 |
+
"id": "P9LDJ7o-Wsc-",
|
320 |
+
"outputId": "72b67575-2d55-4145-90be-a367f128fa44"
|
321 |
+
},
|
322 |
+
"outputs": [
|
323 |
+
{
|
324 |
+
"name": "stderr",
|
325 |
+
"output_type": "stream",
|
326 |
+
"text": [
|
327 |
+
"Parsing nodes: 100%|██████████| 14/14 [00:00<00:00, 28.69it/s]\n",
|
328 |
+
"100%|██████████| 108/108 [01:02<00:00, 1.72it/s]\n",
|
329 |
+
"100%|██████████| 108/108 [01:09<00:00, 1.55it/s]\n",
|
330 |
+
"100%|██████████| 108/108 [01:24<00:00, 1.29it/s]\n",
|
331 |
+
"Generating embeddings: 100%|██████████| 108/108 [00:01<00:00, 56.53it/s]\n"
|
332 |
+
]
|
333 |
+
}
|
334 |
+
],
|
335 |
+
"source": [
|
336 |
+
"from llama_index.core.extractors import (\n",
|
337 |
+
" SummaryExtractor,\n",
|
338 |
+
" QuestionsAnsweredExtractor,\n",
|
339 |
+
" KeywordExtractor,\n",
|
340 |
+
")\n",
|
341 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
342 |
+
"from llama_index.core.ingestion import IngestionPipeline\n",
|
343 |
+
"\n",
|
344 |
+
"# Create the pipeline to apply the transformation on each chunk,\n",
|
345 |
+
"# and store the transformed text in the chroma vector store.\n",
|
346 |
+
"pipeline = IngestionPipeline(\n",
|
347 |
+
" transformations=[\n",
|
348 |
+
" text_splitter,\n",
|
349 |
+
" QuestionsAnsweredExtractor(questions=3, llm=llm),\n",
|
350 |
+
" SummaryExtractor(summaries=[\"prev\", \"self\"], llm=llm),\n",
|
351 |
+
" KeywordExtractor(keywords=10, llm=llm),\n",
|
352 |
+
" OpenAIEmbedding(),\n",
|
353 |
+
" ],\n",
|
354 |
+
" vector_store=vector_store\n",
|
355 |
+
")\n",
|
356 |
+
"\n",
|
357 |
+
"# Run the transformation pipeline.\n",
|
358 |
+
"nodes = pipeline.run(documents=documents, show_progress=True);"
|
359 |
+
]
|
360 |
+
},
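Each extractor writes into `node.metadata`; a short loop shows what was attached to the first node (key names below are the LlamaIndex defaults for these extractors):

```python
# Print a truncated view of the metadata added by the three extractors.
node = nodes[0]
for key in ("questions_this_excerpt_can_answer", "section_summary", "excerpt_keywords"):
    print(f"{key}: {str(node.metadata.get(key))[:120]}")
```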
|
361 |
+
{
|
362 |
+
"cell_type": "code",
|
363 |
+
"execution_count": 12,
|
364 |
+
"metadata": {
|
365 |
+
"colab": {
|
366 |
+
"base_uri": "https://localhost:8080/"
|
367 |
+
},
|
368 |
+
"id": "mPGa85hM2P3P",
|
369 |
+
"outputId": "4586ad85-71bd-4407-a584-326941a5f474"
|
370 |
+
},
|
371 |
+
"outputs": [
|
372 |
+
{
|
373 |
+
"data": {
|
374 |
+
"text/plain": [
|
375 |
+
"108"
|
376 |
+
]
|
377 |
+
},
|
378 |
+
"execution_count": 12,
|
379 |
+
"metadata": {},
|
380 |
+
"output_type": "execute_result"
|
381 |
+
}
|
382 |
+
],
|
383 |
+
"source": [
|
384 |
+
"len( nodes )"
|
385 |
+
]
|
386 |
+
},
|
387 |
+
{
|
388 |
+
"cell_type": "code",
|
389 |
+
"execution_count": 13,
|
390 |
+
"metadata": {
|
391 |
+
"colab": {
|
392 |
+
"base_uri": "https://localhost:8080/"
|
393 |
+
},
|
394 |
+
"id": "OeeG3jxT0taW",
|
395 |
+
"outputId": "8a2e3c63-c346-4034-8147-f2f1f996c326"
|
396 |
+
},
|
397 |
+
"outputs": [
|
398 |
+
{
|
399 |
+
"name": "stdout",
|
400 |
+
"output_type": "stream",
|
401 |
+
"text": [
|
402 |
+
"updating: mini-llama-articles/ (stored 0%)\n",
|
403 |
+
"updating: mini-llama-articles/chroma.sqlite3 (deflated 65%)\n",
|
404 |
+
" adding: mini-llama-articles/0e0852fc-d2a0-47e2-9824-f77f2f6d1b14/ (stored 0%)\n",
|
405 |
+
" adding: mini-llama-articles/0e0852fc-d2a0-47e2-9824-f77f2f6d1b14/data_level0.bin (deflated 100%)\n",
|
406 |
+
" adding: mini-llama-articles/0e0852fc-d2a0-47e2-9824-f77f2f6d1b14/length.bin (deflated 48%)\n",
|
407 |
+
" adding: mini-llama-articles/0e0852fc-d2a0-47e2-9824-f77f2f6d1b14/link_lists.bin (stored 0%)\n",
|
408 |
+
" adding: mini-llama-articles/0e0852fc-d2a0-47e2-9824-f77f2f6d1b14/header.bin (deflated 61%)\n"
|
409 |
+
]
|
410 |
+
}
|
411 |
+
],
|
412 |
+
"source": [
|
413 |
+
"# Compress the vector store directory to a zip file to be able to download and use later.\n",
|
414 |
+
"!zip -r vectorstore.zip mini-llama-articles"
|
415 |
+
]
|
416 |
+
},
|
417 |
+
{
|
418 |
+
"cell_type": "markdown",
|
419 |
+
"metadata": {
|
420 |
+
"id": "OWaT6rL7ksp8"
|
421 |
+
},
|
422 |
+
"source": [
|
423 |
+
"# Load Indexes"
|
424 |
+
]
|
425 |
+
},
|
426 |
+
{
|
427 |
+
"cell_type": "markdown",
|
428 |
+
"metadata": {
|
429 |
+
"id": "6fFGWiz3hoTd"
|
430 |
+
},
|
431 |
+
"source": [
|
432 |
+
"If you have already uploaded the zip file for the vector store checkpoint, please uncomment the code in the following cell block to extract its contents. After doing so, you will be able to load the dataset from local storage."
|
433 |
+
]
|
434 |
+
},
|
435 |
+
{
|
436 |
+
"cell_type": "code",
|
437 |
+
"execution_count": 14,
|
438 |
+
"metadata": {
|
439 |
+
"colab": {
|
440 |
+
"base_uri": "https://localhost:8080/"
|
441 |
+
},
|
442 |
+
"id": "XxPMJ4tq06qx",
|
443 |
+
"outputId": "8445e40a-b3c6-44ff-dfde-37cd4c73ffa2"
|
444 |
+
},
|
445 |
+
"outputs": [],
|
446 |
+
"source": [
|
447 |
+
"# !unzip vectorstore.zip"
|
448 |
+
]
|
449 |
+
},
|
450 |
+
{
|
451 |
+
"cell_type": "code",
|
452 |
+
"execution_count": 15,
|
453 |
+
"metadata": {
|
454 |
+
"id": "mXi56KTXk2sp"
|
455 |
+
},
|
456 |
+
"outputs": [],
|
457 |
+
"source": [
|
458 |
+
"# Load the vector store from the local storage.\n",
|
459 |
+
"db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
460 |
+
"chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
|
461 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
462 |
+
]
|
463 |
+
},
|
464 |
+
{
|
465 |
+
"cell_type": "code",
|
466 |
+
"execution_count": 17,
|
467 |
+
"metadata": {
|
468 |
+
"id": "jKXURvLtkuTS"
|
469 |
+
},
|
470 |
+
"outputs": [],
|
471 |
+
"source": [
|
472 |
+
"from llama_index.core import VectorStoreIndex\n",
|
473 |
+
"\n",
|
474 |
+
"# Create the index based on the vector store.\n",
|
475 |
+
"index = VectorStoreIndex.from_vector_store(vector_store)"
|
476 |
+
]
|
477 |
+
},
|
478 |
+
{
|
479 |
+
"cell_type": "markdown",
|
480 |
+
"metadata": {
|
481 |
+
"id": "8JPD8yAinVSq"
|
482 |
+
},
|
483 |
+
"source": [
|
484 |
+
"# Query Dataset"
|
485 |
+
]
|
486 |
+
},
|
487 |
+
{
|
488 |
+
"cell_type": "code",
|
489 |
+
"execution_count": 22,
|
490 |
+
"metadata": {
|
491 |
+
"id": "BsFfFpVgn01h"
|
492 |
+
},
|
493 |
+
"outputs": [],
|
494 |
+
"source": [
|
495 |
+
"from llama_index.postprocessor.cohere_rerank import CohereRerank\n",
|
496 |
+
"\n",
|
497 |
+
"# Define the Cohere Reranking object to return only the first two highest ranking chunks.\n",
|
498 |
+
"cohere_rerank = CohereRerank(top_n=2, api_key=cohere_key)"
|
499 |
+
]
|
500 |
+
},
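The reranker is just a node postprocessor, so it can also be applied standalone to retrieved nodes, which makes it easy to inspect how Cohere reorders them before anything reaches the LLM (illustrative query):

```python
# Retrieve ten candidates, then let Cohere pick the two best for the query.
question = "How many parameters LLaMA2 model has?"
retrieved = index.as_retriever(similarity_top_k=10).retrieve(question)
reranked = cohere_rerank.postprocess_nodes(retrieved, query_str=question)

for n in reranked:
    print(round(n.score, 4), n.metadata["title"])
```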
|
501 |
+
{
|
502 |
+
"cell_type": "code",
|
503 |
+
"execution_count": 23,
|
504 |
+
"metadata": {
|
505 |
+
"id": "b0gue7cyctt1"
|
506 |
+
},
|
507 |
+
"outputs": [],
|
508 |
+
"source": [
|
509 |
+
"# Define the ServiceCotext object to tie the LLM for generating final answer,\n",
|
510 |
+
"# and the embedding model to help with retrieving related nodes.\n",
|
511 |
+
"# The `node_postprocessors` function will be applied to the retrieved nodes.\n",
|
512 |
+
"query_engine = index.as_query_engine(\n",
|
513 |
+
" similarity_top_k=10,\n",
|
514 |
+
" node_postprocessors=[cohere_rerank]\n",
|
515 |
+
")\n",
|
516 |
+
"\n",
|
517 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
518 |
+
]
|
519 |
+
},
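For contrast, the same question can be asked without the reranker; the LLM then has to synthesize over all ten retrieved chunks instead of the two highest-ranked ones:

```python
# Baseline query engine with no post-processing step.
baseline_engine = index.as_query_engine(similarity_top_k=10)
print(baseline_engine.query("How many parameters LLaMA2 model has?").response)
```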
|
520 |
+
{
|
521 |
+
"cell_type": "code",
|
522 |
+
"execution_count": 24,
|
523 |
+
"metadata": {
|
524 |
+
"colab": {
|
525 |
+
"base_uri": "https://localhost:8080/",
|
526 |
+
"height": 53
|
527 |
+
},
|
528 |
+
"id": "VKK3jMprctre",
|
529 |
+
"outputId": "3acce09e-faa2-4acd-ac8f-f62380d91567"
|
530 |
+
},
|
531 |
+
"outputs": [
|
532 |
+
{
|
533 |
+
"data": {
|
534 |
+
"text/plain": [
|
535 |
+
"'The Llama 2 model is available in four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
536 |
+
]
|
537 |
+
},
|
538 |
+
"execution_count": 24,
|
539 |
+
"metadata": {},
|
540 |
+
"output_type": "execute_result"
|
541 |
+
}
|
542 |
+
],
|
543 |
+
"source": [
|
544 |
+
"res.response"
|
545 |
+
]
|
546 |
+
},
|
547 |
+
{
|
548 |
+
"cell_type": "code",
|
549 |
+
"execution_count": 25,
|
550 |
+
"metadata": {
|
551 |
+
"colab": {
|
552 |
+
"base_uri": "https://localhost:8080/"
|
553 |
+
},
|
554 |
+
"id": "nvSmOtqBoCY2",
|
555 |
+
"outputId": "052a70df-d98d-4a87-bb7c-9e56d34db7f7"
|
556 |
+
},
|
557 |
+
"outputs": [
|
558 |
+
{
|
559 |
+
"name": "stdout",
|
560 |
+
"output_type": "stream",
|
561 |
+
"text": [
|
562 |
+
"Node ID\t 6fea54fa-138b-4931-9e37-42fe16fca62a\n",
|
563 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
564 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
565 |
+
"Score\t 0.90582335\n",
|
566 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
567 |
+
"Node ID\t 99774ac6-5d8e-492b-8c94-4e9717afd2fc\n",
|
568 |
+
"Title\t Exploring Large Language Models -Part 3\n",
|
569 |
+
"Text\t LM model training via UnSupervised learning). Note that this model was loaded in 4-bit, making it runnable on a single T4 GPU and trained with QLoRa. With QLoRA, only a fraction of the adapter weights are trained and summed with the existing frozen pre-trained weights of the model during inference. Here is an illustrative Colab notebook. You can see that training the model with just the text as is, does not result in proper output to questions. The answers are not affected by the training data. Take 2: Instruct Fine-tuning with QLoRa Instruction Tuning concept is a higher-level training concept introduced by this paper FineTuned Language Models Are Zero shot Learners (FLAN) We leverage the intuition that NLP tasks can be described via natural language instructions, such as \"Is the sentiment of this movie review positive or negative?\" or \"Translate 'how are you' into Chinese.\" We take a pre-trained language model of 137B parameters and perform instruction tuning ... Since we use QLoRa we are effectively closely following this paper - QLORA: Efficient Finetuning of Quantized LLMs concerning the training data set, the format that the authors used to train their Gauanco model This is the format for the Llama2 model and will be different for others. One of the hardest problems of training is finding or creating a good quality data set to train. In our case, converting the available training data set to the instruction data set. Since our use case is Closed Book QA, we need to convert this to a QA format. Using older NLP methods like NER (Named Entity Recognition) and then using that to create a QA dataset was not effective. This is where the Self-instruct concept could be used However previous to Llama2, the best-performing model was the GPT 3/4 model via ChatGPT or its API and using these models to do the same was expensive. The 7 billion model of Llama2 has sufficient NLU (Natural Language Understanding) to create output based on a particular format. Running this in 4-bit mode via Quantisation makes it feasible compute-wise to run this on a large data set and convert it to a QA dataset. This was the prompt used. The\n",
|
570 |
+
"Score\t 0.88363826\n",
|
571 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
572 |
+
]
|
573 |
+
}
|
574 |
+
],
|
575 |
+
"source": [
|
576 |
+
"# Show the retrieved nodes\n",
|
577 |
+
"for src in res.source_nodes:\n",
|
578 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
579 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
580 |
+
" print(\"Text\\t\", src.text)\n",
|
581 |
+
" print(\"Score\\t\", src.score)\n",
|
582 |
+
" print(\"-_\"*20)"
|
583 |
+
]
|
584 |
+
},
|
585 |
+
{
|
586 |
+
"cell_type": "markdown",
|
587 |
+
"metadata": {
|
588 |
+
"id": "iMkpzH7vvb09"
|
589 |
+
},
|
590 |
+
"source": [
|
591 |
+
"# Evaluate"
|
592 |
+
]
|
593 |
+
},
|
594 |
+
{
|
595 |
+
"cell_type": "code",
|
596 |
+
"execution_count": 26,
|
597 |
+
"metadata": {
|
598 |
+
"colab": {
|
599 |
+
"base_uri": "https://localhost:8080/"
|
600 |
+
},
|
601 |
+
"id": "H8a3eKgKvckU",
|
602 |
+
"outputId": "cb004dc9-6b49-4d10-a790-1d5257318cd7"
|
603 |
+
},
|
604 |
+
"outputs": [
|
605 |
+
{
|
606 |
+
"name": "stderr",
|
607 |
+
"output_type": "stream",
|
608 |
+
"text": [
|
609 |
+
"100%|██████████| 108/108 [04:30<00:00, 2.51s/it]\n"
|
610 |
+
]
|
611 |
+
}
|
612 |
+
],
|
613 |
+
"source": [
|
614 |
+
"from llama_index.core.evaluation import generate_question_context_pairs\n",
|
615 |
+
"from llama_index.llms.openai import OpenAI\n",
|
616 |
+
"\n",
|
617 |
+
"# Create questions for each segment. These questions will be used to\n",
|
618 |
+
"# assess whether the retriever can accurately identify and return the\n",
|
619 |
+
"# corresponding segment when queried.\n",
|
620 |
+
"llm = OpenAI(model=\"gpt-3.5-turbo-0125\")\n",
|
621 |
+
"rag_eval_dataset = generate_question_context_pairs(\n",
|
622 |
+
" nodes,\n",
|
623 |
+
" llm=llm,\n",
|
624 |
+
" num_questions_per_chunk=1\n",
|
625 |
+
")\n",
|
626 |
+
"\n",
|
627 |
+
"# We can save the evaluation dataset as a json file for later use.\n",
|
628 |
+
"rag_eval_dataset.save_json(\"./rag_eval_dataset_rerank.json\")"
|
629 |
+
]
|
630 |
+
},
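The generated dataset maps each synthetic question to the node id(s) it was produced from; a quick look at one pair makes the evaluation setup concrete:

```python
# Show one generated question and the node(s) the retriever should return for it.
first_query_id = next(iter(rag_eval_dataset.queries))
print(rag_eval_dataset.queries[first_query_id])
print(rag_eval_dataset.relevant_docs[first_query_id])
```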
|
631 |
+
{
|
632 |
+
"cell_type": "markdown",
|
633 |
+
"metadata": {
|
634 |
+
"id": "QvZBMpsXiWEw"
|
635 |
+
},
|
636 |
+
"source": [
|
637 |
+
"If you have uploaded the generated question JSON file, please uncomment the code in the next cell block. This will avoid the need to generate the questions manually, saving you time and effort."
|
638 |
+
]
|
639 |
+
},
|
640 |
+
{
|
641 |
+
"cell_type": "code",
|
642 |
+
"execution_count": 27,
|
643 |
+
"metadata": {
|
644 |
+
"id": "3sA1K84U254o"
|
645 |
+
},
|
646 |
+
"outputs": [],
|
647 |
+
"source": [
|
648 |
+
"# from llama_index.finetuning.embeddings.common import (\n",
|
649 |
+
"# EmbeddingQAFinetuneDataset,\n",
|
650 |
+
"# )\n",
|
651 |
+
"# rag_eval_dataset = EmbeddingQAFinetuneDataset.from_json(\n",
|
652 |
+
"# \"./rag_eval_dataset_rerank.json\"\n",
|
653 |
+
"# )"
|
654 |
+
]
|
655 |
+
},
|
656 |
+
{
|
657 |
+
"cell_type": "code",
|
658 |
+
"execution_count": 28,
|
659 |
+
"metadata": {
|
660 |
+
"id": "H7ubvcbk27vr"
|
661 |
+
},
|
662 |
+
"outputs": [],
|
663 |
+
"source": [
|
664 |
+
"import pandas as pd\n",
|
665 |
+
"\n",
|
666 |
+
"# A simple function to show the evaluation result.\n",
|
667 |
+
"def display_results_retriever(name, eval_results):\n",
|
668 |
+
" \"\"\"Display results from evaluate.\"\"\"\n",
|
669 |
+
"\n",
|
670 |
+
" metric_dicts = []\n",
|
671 |
+
" for eval_result in eval_results:\n",
|
672 |
+
" metric_dict = eval_result.metric_vals_dict\n",
|
673 |
+
" metric_dicts.append(metric_dict)\n",
|
674 |
+
"\n",
|
675 |
+
" full_df = pd.DataFrame(metric_dicts)\n",
|
676 |
+
"\n",
|
677 |
+
" hit_rate = full_df[\"hit_rate\"].mean()\n",
|
678 |
+
" mrr = full_df[\"mrr\"].mean()\n",
|
679 |
+
"\n",
|
680 |
+
" metric_df = pd.DataFrame(\n",
|
681 |
+
" {\"Retriever Name\": [name], \"Hit Rate\": [hit_rate], \"MRR\": [mrr]}\n",
|
682 |
+
" )\n",
|
683 |
+
"\n",
|
684 |
+
" return metric_df"
|
685 |
+
]
|
686 |
+
},
|
687 |
+
{
|
688 |
+
"cell_type": "code",
|
689 |
+
"execution_count": 29,
|
690 |
+
"metadata": {
|
691 |
+
"colab": {
|
692 |
+
"base_uri": "https://localhost:8080/"
|
693 |
+
},
|
694 |
+
"id": "uNLxDxoc2-Ac",
|
695 |
+
"outputId": "f42dc98d-789f-4779-c693-0603cd43e4c9"
|
696 |
+
},
|
697 |
+
"outputs": [
|
698 |
+
{
|
699 |
+
"name": "stdout",
|
700 |
+
"output_type": "stream",
|
701 |
+
"text": [
|
702 |
+
" Retriever Name Hit Rate MRR\n",
|
703 |
+
"0 Retriever top_2 0.665975 0.54668\n",
|
704 |
+
" Retriever Name Hit Rate MRR\n",
|
705 |
+
"0 Retriever top_4 0.782158 0.582815\n",
|
706 |
+
" Retriever Name Hit Rate MRR\n",
|
707 |
+
"0 Retriever top_6 0.8361 0.59305\n",
|
708 |
+
" Retriever Name Hit Rate MRR\n",
|
709 |
+
"0 Retriever top_8 0.854772 0.595606\n",
|
710 |
+
" Retriever Name Hit Rate MRR\n",
|
711 |
+
"0 Retriever top_10 0.871369 0.597404\n"
|
712 |
+
]
|
713 |
+
}
|
714 |
+
],
|
715 |
+
"source": [
|
716 |
+
"from llama_index.core.evaluation import RetrieverEvaluator\n",
|
717 |
+
"\n",
|
718 |
+
"# We can evaluate the retievers with different top_k values.\n",
|
719 |
+
"for i in [2, 4, 6, 8, 10]:\n",
|
720 |
+
" retriever = index.as_retriever(similarity_top_k=i, node_postprocessors=[cohere_rerank])\n",
|
721 |
+
" retriever_evaluator = RetrieverEvaluator.from_metric_names(\n",
|
722 |
+
" [\"mrr\", \"hit_rate\"], retriever=retriever\n",
|
723 |
+
" )\n",
|
724 |
+
" eval_results = await retriever_evaluator.aevaluate_dataset(rag_eval_dataset)\n",
|
725 |
+
" print(display_results_retriever(f\"Retriever top_{i}\", eval_results))"
|
726 |
+
]
|
727 |
+
},
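To attribute the gain to the reranker rather than to a larger top_k, the same loop can be run without the `node_postprocessors` argument as a baseline (a sketch reusing the helper defined above):

```python
# Baseline evaluation: pure vector retrieval, no Cohere reranking.
for i in [2, 10]:
    retriever = index.as_retriever(similarity_top_k=i)
    evaluator = RetrieverEvaluator.from_metric_names(
        ["mrr", "hit_rate"], retriever=retriever
    )
    results = await evaluator.aevaluate_dataset(rag_eval_dataset)
    print(display_results_retriever(f"Baseline top_{i}", results))
```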
|
728 |
+
{
|
729 |
+
"cell_type": "markdown",
|
730 |
+
"metadata": {
|
731 |
+
"id": "ikMYkBATFY3l"
|
732 |
+
},
|
733 |
+
"source": [
|
734 |
+
"It's important to keep in mind that all the results above are based on only two samples even when the retriever fetch 10 items from the vector store. So, it means that instead of passing 10 chunks of data which translates into more API usage and higher cost, we will get the same quality by passing 2 chunk of data.\n",
|
735 |
+
"\n",
|
736 |
+
"The bot's hit rate without Cohere Reranking using two chunks is 0.65, while we get the 0.87 hit rate using two chunks after the Cohere's post processing."
|
737 |
+
]
|
738 |
+
},
|
739 |
+
{
|
740 |
+
"cell_type": "code",
|
741 |
+
"execution_count": null,
|
742 |
+
"metadata": {
|
743 |
+
"id": "-DMSFJI8F6jl"
|
744 |
+
},
|
745 |
+
"outputs": [],
|
746 |
+
"source": []
|
747 |
+
}
|
748 |
+
],
|
749 |
+
"metadata": {
|
750 |
+
"colab": {
|
751 |
+
"authorship_tag": "ABX9TyNPhIDuwnBNGZxkxkMnLtTw",
|
752 |
+
"include_colab_link": true,
|
753 |
+
"provenance": []
|
754 |
+
},
|
755 |
+
"kernelspec": {
|
756 |
+
"display_name": "Python 3",
|
757 |
+
"name": "python3"
|
758 |
+
},
|
759 |
+
"language_info": {
|
760 |
+
"codemirror_mode": {
|
761 |
+
"name": "ipython",
|
762 |
+
"version": 3
|
763 |
+
},
|
764 |
+
"file_extension": ".py",
|
765 |
+
"mimetype": "text/x-python",
|
766 |
+
"name": "python",
|
767 |
+
"nbconvert_exporter": "python",
|
768 |
+
"pygments_lexer": "ipython3",
|
769 |
+
"version": "3.11.8"
|
770 |
+
},
|
771 |
+
"widgets": {
|
772 |
+
"application/vnd.jupyter.widget-state+json": {
|
773 |
+
"04af736f84044e37aa6599aa708a77bc": {
|
774 |
+
"model_module": "@jupyter-widgets/controls",
|
775 |
+
"model_module_version": "1.5.0",
|
776 |
+
"model_name": "ProgressStyleModel",
|
777 |
+
"state": {
|
778 |
+
"_model_module": "@jupyter-widgets/controls",
|
779 |
+
"_model_module_version": "1.5.0",
|
780 |
+
"_model_name": "ProgressStyleModel",
|
781 |
+
"_view_count": null,
|
782 |
+
"_view_module": "@jupyter-widgets/base",
|
783 |
+
"_view_module_version": "1.2.0",
|
784 |
+
"_view_name": "StyleView",
|
785 |
+
"bar_color": null,
|
786 |
+
"description_width": ""
|
787 |
+
}
|
788 |
+
},
|
789 |
+
"195aa202b03a42a3a674e9da2f13d878": {
|
790 |
+
"model_module": "@jupyter-widgets/controls",
|
791 |
+
"model_module_version": "1.5.0",
|
792 |
+
"model_name": "DescriptionStyleModel",
|
793 |
+
"state": {
|
794 |
+
"_model_module": "@jupyter-widgets/controls",
|
795 |
+
"_model_module_version": "1.5.0",
|
796 |
+
"_model_name": "DescriptionStyleModel",
|
797 |
+
"_view_count": null,
|
798 |
+
"_view_module": "@jupyter-widgets/base",
|
799 |
+
"_view_module_version": "1.2.0",
|
800 |
+
"_view_name": "StyleView",
|
801 |
+
"description_width": ""
|
802 |
+
}
|
803 |
+
},
|
804 |
+
"1d1faa15f5564b68b948eaffa58626b3": {
|
805 |
+
"model_module": "@jupyter-widgets/controls",
|
806 |
+
"model_module_version": "1.5.0",
|
807 |
+
"model_name": "HTMLModel",
|
808 |
+
"state": {
|
809 |
+
"_dom_classes": [],
|
810 |
+
"_model_module": "@jupyter-widgets/controls",
|
811 |
+
"_model_module_version": "1.5.0",
|
812 |
+
"_model_name": "HTMLModel",
|
813 |
+
"_view_count": null,
|
814 |
+
"_view_module": "@jupyter-widgets/controls",
|
815 |
+
"_view_module_version": "1.5.0",
|
816 |
+
"_view_name": "HTMLView",
|
817 |
+
"description": "",
|
818 |
+
"description_tooltip": null,
|
819 |
+
"layout": "IPY_MODEL_f57a9ac0d924408fbaaac795c172862e",
|
820 |
+
"placeholder": "",
|
821 |
+
"style": "IPY_MODEL_4cb8ba074b254e91b8877cc87ae0d279",
|
822 |
+
"value": "Parsing nodes: 100%"
|
823 |
+
}
|
824 |
+
},
|
825 |
+
"2dc4a6c935ac4ef38ed9030608bd4b2f": {
|
826 |
+
"model_module": "@jupyter-widgets/controls",
|
827 |
+
"model_module_version": "1.5.0",
|
828 |
+
"model_name": "ProgressStyleModel",
|
829 |
+
"state": {
|
830 |
+
"_model_module": "@jupyter-widgets/controls",
|
831 |
+
"_model_module_version": "1.5.0",
|
832 |
+
"_model_name": "ProgressStyleModel",
|
833 |
+
"_view_count": null,
|
834 |
+
"_view_module": "@jupyter-widgets/base",
|
835 |
+
"_view_module_version": "1.2.0",
|
836 |
+
"_view_name": "StyleView",
|
837 |
+
"bar_color": null,
|
838 |
+
"description_width": ""
|
839 |
+
}
|
840 |
+
},
|
841 |
+
"3657dc19b6ac477b9f05bb6519271473": {
|
842 |
+
"model_module": "@jupyter-widgets/controls",
|
843 |
+
"model_module_version": "1.5.0",
|
844 |
+
"model_name": "HTMLModel",
|
845 |
+
"state": {
|
846 |
+
"_dom_classes": [],
|
847 |
+
"_model_module": "@jupyter-widgets/controls",
|
848 |
+
"_model_module_version": "1.5.0",
|
849 |
+
"_model_name": "HTMLModel",
|
850 |
+
"_view_count": null,
|
851 |
+
"_view_module": "@jupyter-widgets/controls",
|
852 |
+
"_view_module_version": "1.5.0",
|
853 |
+
"_view_name": "HTMLView",
|
854 |
+
"description": "",
|
855 |
+
"description_tooltip": null,
|
856 |
+
"layout": "IPY_MODEL_8d35ab8c65ba47e1be446b98f0942ac4",
|
857 |
+
"placeholder": "",
|
858 |
+
"style": "IPY_MODEL_75e40756175f463e874630f229ef4066",
|
859 |
+
"value": " 14/14 [00:01<00:00, 10.94it/s]"
|
860 |
+
}
|
861 |
+
},
|
862 |
+
"3a621edd23354ea5924189885c97dee4": {
|
863 |
+
"model_module": "@jupyter-widgets/controls",
|
864 |
+
"model_module_version": "1.5.0",
|
865 |
+
"model_name": "DescriptionStyleModel",
|
866 |
+
"state": {
|
867 |
+
"_model_module": "@jupyter-widgets/controls",
|
868 |
+
"_model_module_version": "1.5.0",
|
869 |
+
"_model_name": "DescriptionStyleModel",
|
870 |
+
"_view_count": null,
|
871 |
+
"_view_module": "@jupyter-widgets/base",
|
872 |
+
"_view_module_version": "1.2.0",
|
873 |
+
"_view_name": "StyleView",
|
874 |
+
"description_width": ""
|
875 |
+
}
|
876 |
+
},
|
877 |
+
"3f55aef52aee4e77864d53e3197c3cc3": {
|
878 |
+
"model_module": "@jupyter-widgets/base",
|
879 |
+
"model_module_version": "1.2.0",
|
880 |
+
"model_name": "LayoutModel",
|
881 |
+
"state": {
|
882 |
+
"_model_module": "@jupyter-widgets/base",
|
883 |
+
"_model_module_version": "1.2.0",
|
884 |
+
"_model_name": "LayoutModel",
|
885 |
+
"_view_count": null,
|
886 |
+
"_view_module": "@jupyter-widgets/base",
|
887 |
+
"_view_module_version": "1.2.0",
|
888 |
+
"_view_name": "LayoutView",
|
889 |
+
"align_content": null,
|
890 |
+
"align_items": null,
|
891 |
+
"align_self": null,
|
892 |
+
"border": null,
|
893 |
+
"bottom": null,
|
894 |
+
"display": null,
|
895 |
+
"flex": null,
|
896 |
+
"flex_flow": null,
|
897 |
+
"grid_area": null,
|
898 |
+
"grid_auto_columns": null,
|
899 |
+
"grid_auto_flow": null,
|
900 |
+
"grid_auto_rows": null,
|
901 |
+
"grid_column": null,
|
902 |
+
"grid_gap": null,
|
903 |
+
"grid_row": null,
|
904 |
+
"grid_template_areas": null,
|
905 |
+
"grid_template_columns": null,
|
906 |
+
"grid_template_rows": null,
|
907 |
+
"height": null,
|
908 |
+
"justify_content": null,
|
909 |
+
"justify_items": null,
|
910 |
+
"left": null,
|
911 |
+
"margin": null,
|
912 |
+
"max_height": null,
|
913 |
+
"max_width": null,
|
914 |
+
"min_height": null,
|
915 |
+
"min_width": null,
|
916 |
+
"object_fit": null,
|
917 |
+
"object_position": null,
|
918 |
+
"order": null,
|
919 |
+
"overflow": null,
|
920 |
+
"overflow_x": null,
|
921 |
+
"overflow_y": null,
|
922 |
+
"padding": null,
|
923 |
+
"right": null,
|
924 |
+
"top": null,
|
925 |
+
"visibility": null,
|
926 |
+
"width": null
|
927 |
+
}
|
928 |
+
},
|
929 |
+
"4802a63f700e48fca16b5d89fbab333d": {
|
930 |
+
"model_module": "@jupyter-widgets/controls",
|
931 |
+
"model_module_version": "1.5.0",
|
932 |
+
"model_name": "HTMLModel",
|
933 |
+
"state": {
|
934 |
+
"_dom_classes": [],
|
935 |
+
"_model_module": "@jupyter-widgets/controls",
|
936 |
+
"_model_module_version": "1.5.0",
|
937 |
+
"_model_name": "HTMLModel",
|
938 |
+
"_view_count": null,
|
939 |
+
"_view_module": "@jupyter-widgets/controls",
|
940 |
+
"_view_module_version": "1.5.0",
|
941 |
+
"_view_name": "HTMLView",
|
942 |
+
"description": "",
|
943 |
+
"description_tooltip": null,
|
944 |
+
"layout": "IPY_MODEL_4fcebf4a9ef54729889cc6ad4cbe5d10",
|
945 |
+
"placeholder": "",
|
946 |
+
"style": "IPY_MODEL_195aa202b03a42a3a674e9da2f13d878",
|
947 |
+
"value": " 108/108 [00:07<00:00, 10.36it/s]"
|
948 |
+
}
|
949 |
+
},
|
950 |
+
"4bb1e341a77d41c9aca0e6680911fb43": {
|
951 |
+
"model_module": "@jupyter-widgets/controls",
|
952 |
+
"model_module_version": "1.5.0",
|
953 |
+
"model_name": "HBoxModel",
|
954 |
+
"state": {
|
955 |
+
"_dom_classes": [],
|
956 |
+
"_model_module": "@jupyter-widgets/controls",
|
957 |
+
"_model_module_version": "1.5.0",
|
958 |
+
"_model_name": "HBoxModel",
|
959 |
+
"_view_count": null,
|
960 |
+
"_view_module": "@jupyter-widgets/controls",
|
961 |
+
"_view_module_version": "1.5.0",
|
962 |
+
"_view_name": "HBoxView",
|
963 |
+
"box_style": "",
|
964 |
+
"children": [
|
965 |
+
"IPY_MODEL_1d1faa15f5564b68b948eaffa58626b3",
|
966 |
+
"IPY_MODEL_df22a67ae80b4673b708eea74646be61",
|
967 |
+
"IPY_MODEL_3657dc19b6ac477b9f05bb6519271473"
|
968 |
+
],
|
969 |
+
"layout": "IPY_MODEL_9045e402f0344428acc085d63df7ff03"
|
970 |
+
}
|
971 |
+
},
|
972 |
+
"4cb8ba074b254e91b8877cc87ae0d279": {
|
973 |
+
"model_module": "@jupyter-widgets/controls",
|
974 |
+
"model_module_version": "1.5.0",
|
975 |
+
"model_name": "DescriptionStyleModel",
|
976 |
+
"state": {
|
977 |
+
"_model_module": "@jupyter-widgets/controls",
|
978 |
+
"_model_module_version": "1.5.0",
|
979 |
+
"_model_name": "DescriptionStyleModel",
|
980 |
+
"_view_count": null,
|
981 |
+
"_view_module": "@jupyter-widgets/base",
|
982 |
+
"_view_module_version": "1.2.0",
|
983 |
+
"_view_name": "StyleView",
|
984 |
+
"description_width": ""
|
985 |
+
}
|
986 |
+
},
|
987 |
+
"4fcebf4a9ef54729889cc6ad4cbe5d10": {
|
988 |
+
"model_module": "@jupyter-widgets/base",
|
989 |
+
"model_module_version": "1.2.0",
|
990 |
+
"model_name": "LayoutModel",
|
991 |
+
"state": {
|
992 |
+
"_model_module": "@jupyter-widgets/base",
|
993 |
+
"_model_module_version": "1.2.0",
|
994 |
+
"_model_name": "LayoutModel",
|
995 |
+
"_view_count": null,
|
996 |
+
"_view_module": "@jupyter-widgets/base",
|
997 |
+
"_view_module_version": "1.2.0",
|
998 |
+
"_view_name": "LayoutView",
|
999 |
+
"align_content": null,
|
1000 |
+
"align_items": null,
|
1001 |
+
"align_self": null,
|
1002 |
+
"border": null,
|
1003 |
+
"bottom": null,
|
1004 |
+
"display": null,
|
1005 |
+
"flex": null,
|
1006 |
+
"flex_flow": null,
|
1007 |
+
"grid_area": null,
|
1008 |
+
"grid_auto_columns": null,
|
1009 |
+
"grid_auto_flow": null,
|
1010 |
+
"grid_auto_rows": null,
|
1011 |
+
"grid_column": null,
|
1012 |
+
"grid_gap": null,
|
1013 |
+
"grid_row": null,
|
1014 |
+
"grid_template_areas": null,
|
1015 |
+
"grid_template_columns": null,
|
1016 |
+
"grid_template_rows": null,
|
1017 |
+
"height": null,
|
1018 |
+
"justify_content": null,
|
1019 |
+
"justify_items": null,
|
1020 |
+
"left": null,
|
1021 |
+
"margin": null,
|
1022 |
+
"max_height": null,
|
1023 |
+
"max_width": null,
|
1024 |
+
"min_height": null,
|
1025 |
+
"min_width": null,
|
1026 |
+
"object_fit": null,
|
1027 |
+
"object_position": null,
|
1028 |
+
"order": null,
|
1029 |
+
"overflow": null,
|
1030 |
+
"overflow_x": null,
|
1031 |
+
"overflow_y": null,
|
1032 |
+
"padding": null,
|
1033 |
+
"right": null,
|
1034 |
+
"top": null,
|
1035 |
+
"visibility": null,
|
1036 |
+
"width": null
|
1037 |
+
}
|
1038 |
+
},
|
1039 |
+
"73d34cae940e4748a7b3127351925e65": {
|
1040 |
+
"model_module": "@jupyter-widgets/base",
|
1041 |
+
"model_module_version": "1.2.0",
|
1042 |
+
"model_name": "LayoutModel",
|
1043 |
+
"state": {
|
1044 |
+
"_model_module": "@jupyter-widgets/base",
|
1045 |
+
"_model_module_version": "1.2.0",
|
1046 |
+
"_model_name": "LayoutModel",
|
1047 |
+
"_view_count": null,
|
1048 |
+
"_view_module": "@jupyter-widgets/base",
|
1049 |
+
"_view_module_version": "1.2.0",
|
1050 |
+
"_view_name": "LayoutView",
|
1051 |
+
"align_content": null,
|
1052 |
+
"align_items": null,
|
1053 |
+
"align_self": null,
|
1054 |
+
"border": null,
|
1055 |
+
"bottom": null,
|
1056 |
+
"display": null,
|
1057 |
+
"flex": null,
|
1058 |
+
"flex_flow": null,
|
1059 |
+
"grid_area": null,
|
1060 |
+
"grid_auto_columns": null,
|
1061 |
+
"grid_auto_flow": null,
|
1062 |
+
"grid_auto_rows": null,
|
1063 |
+
"grid_column": null,
|
1064 |
+
"grid_gap": null,
|
1065 |
+
"grid_row": null,
|
1066 |
+
"grid_template_areas": null,
|
1067 |
+
"grid_template_columns": null,
|
1068 |
+
"grid_template_rows": null,
|
1069 |
+
"height": null,
|
1070 |
+
"justify_content": null,
|
1071 |
+
"justify_items": null,
|
1072 |
+
"left": null,
|
1073 |
+
"margin": null,
|
1074 |
+
"max_height": null,
|
1075 |
+
"max_width": null,
|
1076 |
+
"min_height": null,
|
1077 |
+
"min_width": null,
|
1078 |
+
"object_fit": null,
|
1079 |
+
"object_position": null,
|
1080 |
+
"order": null,
|
1081 |
+
"overflow": null,
|
1082 |
+
"overflow_x": null,
|
1083 |
+
"overflow_y": null,
|
1084 |
+
"padding": null,
|
1085 |
+
"right": null,
|
1086 |
+
"top": null,
|
1087 |
+
"visibility": null,
|
1088 |
+
"width": null
|
1089 |
+
}
|
1090 |
+
},
|
1091 |
+
"75e40756175f463e874630f229ef4066": {
|
1092 |
+
"model_module": "@jupyter-widgets/controls",
|
1093 |
+
"model_module_version": "1.5.0",
|
1094 |
+
"model_name": "DescriptionStyleModel",
|
1095 |
+
"state": {
|
1096 |
+
"_model_module": "@jupyter-widgets/controls",
|
1097 |
+
"_model_module_version": "1.5.0",
|
1098 |
+
"_model_name": "DescriptionStyleModel",
|
1099 |
+
"_view_count": null,
|
1100 |
+
"_view_module": "@jupyter-widgets/base",
|
1101 |
+
"_view_module_version": "1.2.0",
|
1102 |
+
"_view_name": "StyleView",
|
1103 |
+
"description_width": ""
|
1104 |
+
}
|
1105 |
+
},
|
1106 |
+
"8728ca516bd0474586b19e0c9b457499": {
|
1107 |
+
"model_module": "@jupyter-widgets/controls",
|
1108 |
+
"model_module_version": "1.5.0",
|
1109 |
+
"model_name": "HTMLModel",
|
1110 |
+
"state": {
|
1111 |
+
"_dom_classes": [],
|
1112 |
+
"_model_module": "@jupyter-widgets/controls",
|
1113 |
+
"_model_module_version": "1.5.0",
|
1114 |
+
"_model_name": "HTMLModel",
|
1115 |
+
"_view_count": null,
|
1116 |
+
"_view_module": "@jupyter-widgets/controls",
|
1117 |
+
"_view_module_version": "1.5.0",
|
1118 |
+
"_view_name": "HTMLView",
|
1119 |
+
"description": "",
|
1120 |
+
"description_tooltip": null,
|
1121 |
+
"layout": "IPY_MODEL_f41df4b6ab4c4132b0d20232002f0294",
|
1122 |
+
"placeholder": "",
|
1123 |
+
"style": "IPY_MODEL_3a621edd23354ea5924189885c97dee4",
|
1124 |
+
"value": "Generating embeddings: 100%"
|
1125 |
+
}
|
1126 |
+
},
|
1127 |
+
"8d35ab8c65ba47e1be446b98f0942ac4": {
|
1128 |
+
"model_module": "@jupyter-widgets/base",
|
1129 |
+
"model_module_version": "1.2.0",
|
1130 |
+
"model_name": "LayoutModel",
|
1131 |
+
"state": {
|
1132 |
+
"_model_module": "@jupyter-widgets/base",
|
1133 |
+
"_model_module_version": "1.2.0",
|
1134 |
+
"_model_name": "LayoutModel",
|
1135 |
+
"_view_count": null,
|
1136 |
+
"_view_module": "@jupyter-widgets/base",
|
1137 |
+
"_view_module_version": "1.2.0",
|
1138 |
+
"_view_name": "LayoutView",
|
1139 |
+
"align_content": null,
|
1140 |
+
"align_items": null,
|
1141 |
+
"align_self": null,
|
1142 |
+
"border": null,
|
1143 |
+
"bottom": null,
|
1144 |
+
"display": null,
|
1145 |
+
"flex": null,
|
1146 |
+
"flex_flow": null,
|
1147 |
+
"grid_area": null,
|
1148 |
+
"grid_auto_columns": null,
|
1149 |
+
"grid_auto_flow": null,
|
1150 |
+
"grid_auto_rows": null,
|
1151 |
+
"grid_column": null,
|
1152 |
+
"grid_gap": null,
|
1153 |
+
"grid_row": null,
|
1154 |
+
"grid_template_areas": null,
|
1155 |
+
"grid_template_columns": null,
|
1156 |
+
"grid_template_rows": null,
|
1157 |
+
"height": null,
|
1158 |
+
"justify_content": null,
|
1159 |
+
"justify_items": null,
|
1160 |
+
"left": null,
|
1161 |
+
"margin": null,
|
1162 |
+
"max_height": null,
|
1163 |
+
"max_width": null,
|
1164 |
+
"min_height": null,
|
1165 |
+
"min_width": null,
|
1166 |
+
"object_fit": null,
|
1167 |
+
"object_position": null,
|
1168 |
+
"order": null,
|
1169 |
+
"overflow": null,
|
1170 |
+
"overflow_x": null,
|
1171 |
+
"overflow_y": null,
|
1172 |
+
"padding": null,
|
1173 |
+
"right": null,
|
1174 |
+
"top": null,
|
1175 |
+
"visibility": null,
|
1176 |
+
"width": null
|
1177 |
+
}
|
1178 |
+
},
|
1179 |
+
"9045e402f0344428acc085d63df7ff03": {
|
1180 |
+
"model_module": "@jupyter-widgets/base",
|
1181 |
+
"model_module_version": "1.2.0",
|
1182 |
+
"model_name": "LayoutModel",
|
1183 |
+
"state": {
|
1184 |
+
"_model_module": "@jupyter-widgets/base",
|
1185 |
+
"_model_module_version": "1.2.0",
|
1186 |
+
"_model_name": "LayoutModel",
|
1187 |
+
"_view_count": null,
|
1188 |
+
"_view_module": "@jupyter-widgets/base",
|
1189 |
+
"_view_module_version": "1.2.0",
|
1190 |
+
"_view_name": "LayoutView",
|
1191 |
+
"align_content": null,
|
1192 |
+
"align_items": null,
|
1193 |
+
"align_self": null,
|
1194 |
+
"border": null,
|
1195 |
+
"bottom": null,
|
1196 |
+
"display": null,
|
1197 |
+
"flex": null,
|
1198 |
+
"flex_flow": null,
|
1199 |
+
"grid_area": null,
|
1200 |
+
"grid_auto_columns": null,
|
1201 |
+
"grid_auto_flow": null,
|
1202 |
+
"grid_auto_rows": null,
|
1203 |
+
"grid_column": null,
|
1204 |
+
"grid_gap": null,
|
1205 |
+
"grid_row": null,
|
1206 |
+
"grid_template_areas": null,
|
1207 |
+
"grid_template_columns": null,
|
1208 |
+
"grid_template_rows": null,
|
1209 |
+
"height": null,
|
1210 |
+
"justify_content": null,
|
1211 |
+
"justify_items": null,
|
1212 |
+
"left": null,
|
1213 |
+
"margin": null,
|
1214 |
+
"max_height": null,
|
1215 |
+
"max_width": null,
|
1216 |
+
"min_height": null,
|
1217 |
+
"min_width": null,
|
1218 |
+
"object_fit": null,
|
1219 |
+
"object_position": null,
|
1220 |
+
"order": null,
|
1221 |
+
"overflow": null,
|
1222 |
+
"overflow_x": null,
|
1223 |
+
"overflow_y": null,
|
1224 |
+
"padding": null,
|
1225 |
+
"right": null,
|
1226 |
+
"top": null,
|
1227 |
+
"visibility": null,
|
1228 |
+
"width": null
|
1229 |
+
}
|
1230 |
+
},
|
1231 |
+
"a0dd5f2c99b2407f9f5705587976ae76": {
|
1232 |
+
"model_module": "@jupyter-widgets/controls",
|
1233 |
+
"model_module_version": "1.5.0",
|
1234 |
+
"model_name": "HBoxModel",
|
1235 |
+
"state": {
|
1236 |
+
"_dom_classes": [],
|
1237 |
+
"_model_module": "@jupyter-widgets/controls",
|
1238 |
+
"_model_module_version": "1.5.0",
|
1239 |
+
"_model_name": "HBoxModel",
|
1240 |
+
"_view_count": null,
|
1241 |
+
"_view_module": "@jupyter-widgets/controls",
|
1242 |
+
"_view_module_version": "1.5.0",
|
1243 |
+
"_view_name": "HBoxView",
|
1244 |
+
"box_style": "",
|
1245 |
+
"children": [
|
1246 |
+
"IPY_MODEL_8728ca516bd0474586b19e0c9b457499",
|
1247 |
+
"IPY_MODEL_aac433a9a64c48dfb18d7a01f64d3b27",
|
1248 |
+
"IPY_MODEL_4802a63f700e48fca16b5d89fbab333d"
|
1249 |
+
],
|
1250 |
+
"layout": "IPY_MODEL_3f55aef52aee4e77864d53e3197c3cc3"
|
1251 |
+
}
|
1252 |
+
},
|
1253 |
+
"aac433a9a64c48dfb18d7a01f64d3b27": {
|
1254 |
+
"model_module": "@jupyter-widgets/controls",
|
1255 |
+
"model_module_version": "1.5.0",
|
1256 |
+
"model_name": "FloatProgressModel",
|
1257 |
+
"state": {
|
1258 |
+
"_dom_classes": [],
|
1259 |
+
"_model_module": "@jupyter-widgets/controls",
|
1260 |
+
"_model_module_version": "1.5.0",
|
1261 |
+
"_model_name": "FloatProgressModel",
|
1262 |
+
"_view_count": null,
|
1263 |
+
"_view_module": "@jupyter-widgets/controls",
|
1264 |
+
"_view_module_version": "1.5.0",
|
1265 |
+
"_view_name": "ProgressView",
|
1266 |
+
"bar_style": "success",
|
1267 |
+
"description": "",
|
1268 |
+
"description_tooltip": null,
|
1269 |
+
"layout": "IPY_MODEL_73d34cae940e4748a7b3127351925e65",
|
1270 |
+
"max": 108,
|
1271 |
+
"min": 0,
|
1272 |
+
"orientation": "horizontal",
|
1273 |
+
"style": "IPY_MODEL_2dc4a6c935ac4ef38ed9030608bd4b2f",
|
1274 |
+
"value": 108
|
1275 |
+
}
|
1276 |
+
},
|
1277 |
+
"cbd3e1411b2c4eeb943243c9d45245c4": {
|
1278 |
+
"model_module": "@jupyter-widgets/base",
|
1279 |
+
"model_module_version": "1.2.0",
|
1280 |
+
"model_name": "LayoutModel",
|
1281 |
+
"state": {
|
1282 |
+
"_model_module": "@jupyter-widgets/base",
|
1283 |
+
"_model_module_version": "1.2.0",
|
1284 |
+
"_model_name": "LayoutModel",
|
1285 |
+
"_view_count": null,
|
1286 |
+
"_view_module": "@jupyter-widgets/base",
|
1287 |
+
"_view_module_version": "1.2.0",
|
1288 |
+
"_view_name": "LayoutView",
|
1289 |
+
"align_content": null,
|
1290 |
+
"align_items": null,
|
1291 |
+
"align_self": null,
|
1292 |
+
"border": null,
|
1293 |
+
"bottom": null,
|
1294 |
+
"display": null,
|
1295 |
+
"flex": null,
|
1296 |
+
"flex_flow": null,
|
1297 |
+
"grid_area": null,
|
1298 |
+
"grid_auto_columns": null,
|
1299 |
+
"grid_auto_flow": null,
|
1300 |
+
"grid_auto_rows": null,
|
1301 |
+
"grid_column": null,
|
1302 |
+
"grid_gap": null,
|
1303 |
+
"grid_row": null,
|
1304 |
+
"grid_template_areas": null,
|
1305 |
+
"grid_template_columns": null,
|
1306 |
+
"grid_template_rows": null,
|
1307 |
+
"height": null,
|
1308 |
+
"justify_content": null,
|
1309 |
+
"justify_items": null,
|
1310 |
+
"left": null,
|
1311 |
+
"margin": null,
|
1312 |
+
"max_height": null,
|
1313 |
+
"max_width": null,
|
1314 |
+
"min_height": null,
|
1315 |
+
"min_width": null,
|
1316 |
+
"object_fit": null,
|
1317 |
+
"object_position": null,
|
1318 |
+
"order": null,
|
1319 |
+
"overflow": null,
|
1320 |
+
"overflow_x": null,
|
1321 |
+
"overflow_y": null,
|
1322 |
+
"padding": null,
|
1323 |
+
"right": null,
|
1324 |
+
"top": null,
|
1325 |
+
"visibility": null,
|
1326 |
+
"width": null
|
1327 |
+
}
|
1328 |
+
},
|
1329 |
+
"df22a67ae80b4673b708eea74646be61": {
|
1330 |
+
"model_module": "@jupyter-widgets/controls",
|
1331 |
+
"model_module_version": "1.5.0",
|
1332 |
+
"model_name": "FloatProgressModel",
|
1333 |
+
"state": {
|
1334 |
+
"_dom_classes": [],
|
1335 |
+
"_model_module": "@jupyter-widgets/controls",
|
1336 |
+
"_model_module_version": "1.5.0",
|
1337 |
+
"_model_name": "FloatProgressModel",
|
1338 |
+
"_view_count": null,
|
1339 |
+
"_view_module": "@jupyter-widgets/controls",
|
1340 |
+
"_view_module_version": "1.5.0",
|
1341 |
+
"_view_name": "ProgressView",
|
1342 |
+
"bar_style": "success",
|
1343 |
+
"description": "",
|
1344 |
+
"description_tooltip": null,
|
1345 |
+
"layout": "IPY_MODEL_cbd3e1411b2c4eeb943243c9d45245c4",
|
1346 |
+
"max": 14,
|
1347 |
+
"min": 0,
|
1348 |
+
"orientation": "horizontal",
|
1349 |
+
"style": "IPY_MODEL_04af736f84044e37aa6599aa708a77bc",
|
1350 |
+
"value": 14
|
1351 |
+
}
|
1352 |
+
},
|
1353 |
+
"f41df4b6ab4c4132b0d20232002f0294": {
|
1354 |
+
"model_module": "@jupyter-widgets/base",
|
1355 |
+
"model_module_version": "1.2.0",
|
1356 |
+
"model_name": "LayoutModel",
|
1357 |
+
"state": {
|
1358 |
+
"_model_module": "@jupyter-widgets/base",
|
1359 |
+
"_model_module_version": "1.2.0",
|
1360 |
+
"_model_name": "LayoutModel",
|
1361 |
+
"_view_count": null,
|
1362 |
+
"_view_module": "@jupyter-widgets/base",
|
1363 |
+
"_view_module_version": "1.2.0",
|
1364 |
+
"_view_name": "LayoutView",
|
1365 |
+
"align_content": null,
|
1366 |
+
"align_items": null,
|
1367 |
+
"align_self": null,
|
1368 |
+
"border": null,
|
1369 |
+
"bottom": null,
|
1370 |
+
"display": null,
|
1371 |
+
"flex": null,
|
1372 |
+
"flex_flow": null,
|
1373 |
+
"grid_area": null,
|
1374 |
+
"grid_auto_columns": null,
|
1375 |
+
"grid_auto_flow": null,
|
1376 |
+
"grid_auto_rows": null,
|
1377 |
+
"grid_column": null,
|
1378 |
+
"grid_gap": null,
|
1379 |
+
"grid_row": null,
|
1380 |
+
"grid_template_areas": null,
|
1381 |
+
"grid_template_columns": null,
|
1382 |
+
"grid_template_rows": null,
|
1383 |
+
"height": null,
|
1384 |
+
"justify_content": null,
|
1385 |
+
"justify_items": null,
|
1386 |
+
"left": null,
|
1387 |
+
"margin": null,
|
1388 |
+
"max_height": null,
|
1389 |
+
"max_width": null,
|
1390 |
+
"min_height": null,
|
1391 |
+
"min_width": null,
|
1392 |
+
"object_fit": null,
|
1393 |
+
"object_position": null,
|
1394 |
+
"order": null,
|
1395 |
+
"overflow": null,
|
1396 |
+
"overflow_x": null,
|
1397 |
+
"overflow_y": null,
|
1398 |
+
"padding": null,
|
1399 |
+
"right": null,
|
1400 |
+
"top": null,
|
1401 |
+
"visibility": null,
|
1402 |
+
"width": null
|
1403 |
+
}
|
1404 |
+
},
|
1405 |
+
"f57a9ac0d924408fbaaac795c172862e": {
|
1406 |
+
"model_module": "@jupyter-widgets/base",
|
1407 |
+
"model_module_version": "1.2.0",
|
1408 |
+
"model_name": "LayoutModel",
|
1409 |
+
"state": {
|
1410 |
+
"_model_module": "@jupyter-widgets/base",
|
1411 |
+
"_model_module_version": "1.2.0",
|
1412 |
+
"_model_name": "LayoutModel",
|
1413 |
+
"_view_count": null,
|
1414 |
+
"_view_module": "@jupyter-widgets/base",
|
1415 |
+
"_view_module_version": "1.2.0",
|
1416 |
+
"_view_name": "LayoutView",
|
1417 |
+
"align_content": null,
|
1418 |
+
"align_items": null,
|
1419 |
+
"align_self": null,
|
1420 |
+
"border": null,
|
1421 |
+
"bottom": null,
|
1422 |
+
"display": null,
|
1423 |
+
"flex": null,
|
1424 |
+
"flex_flow": null,
|
1425 |
+
"grid_area": null,
|
1426 |
+
"grid_auto_columns": null,
|
1427 |
+
"grid_auto_flow": null,
|
1428 |
+
"grid_auto_rows": null,
|
1429 |
+
"grid_column": null,
|
1430 |
+
"grid_gap": null,
|
1431 |
+
"grid_row": null,
|
1432 |
+
"grid_template_areas": null,
|
1433 |
+
"grid_template_columns": null,
|
1434 |
+
"grid_template_rows": null,
|
1435 |
+
"height": null,
|
1436 |
+
"justify_content": null,
|
1437 |
+
"justify_items": null,
|
1438 |
+
"left": null,
|
1439 |
+
"margin": null,
|
1440 |
+
"max_height": null,
|
1441 |
+
"max_width": null,
|
1442 |
+
"min_height": null,
|
1443 |
+
"min_width": null,
|
1444 |
+
"object_fit": null,
|
1445 |
+
"object_position": null,
|
1446 |
+
"order": null,
|
1447 |
+
"overflow": null,
|
1448 |
+
"overflow_x": null,
|
1449 |
+
"overflow_y": null,
|
1450 |
+
"padding": null,
|
1451 |
+
"right": null,
|
1452 |
+
"top": null,
|
1453 |
+
"visibility": null,
|
1454 |
+
"width": null
|
1455 |
+
}
|
1456 |
+
}
|
1457 |
+
}
|
1458 |
+
}
|
1459 |
+
},
|
1460 |
+
"nbformat": 4,
|
1461 |
+
"nbformat_minor": 0
|
1462 |
+
}
|
notebooks/11-Adding_Hybrid_Search.ipynb
ADDED
@@ -0,0 +1,1645 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "colab_type": "text",
+ "id": "view-in-github"
+ },
+ "source": [
+ "<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/11-Adding_Hybrid_Search.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-zE1h0uQV7uT"
+ },
+ "source": [
+ "# Install Packages and Setup Variables"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "QPJzr-I9XQ7l",
+ "outputId": "3115889a-14ee-457c-c0d5-271c1053a1e9"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q llama-index==0.10.11 openai==1.12.0 llama-index-finetuning llama-index-embeddings-huggingface llama-index-readers-web tiktoken==0.6.0 chromadb==0.4.22 pandas==2.2.0 html2text sentence_transformers pydantic"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "riuXwpSPcvWC"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
+ "os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "jIEeZzqLbz0J"
+ },
+ "outputs": [],
+ "source": [
+ "# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
+ "\n",
+ "import nest_asyncio\n",
+ "\n",
+ "nest_asyncio.apply()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Bkgi2OrYzF7q"
+ },
+ "source": [
+ "# Load a Model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "id": "9oGT6crooSSj"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
+ " from .autonotebook import tqdm as notebook_tqdm\n"
+ ]
+ }
+ ],
+ "source": [
+ "from llama_index.llms.openai import OpenAI\n",
+ "\n",
+ "llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0BwVuJXlzHVL"
+ },
+ "source": [
+ "# Create a VectorStore"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "id": "SQP87lHczHKc"
+ },
+ "outputs": [],
+ "source": [
+ "import chromadb\n",
+ "\n",
+ "# Create a client and a new collection.\n",
+ "# chromadb.EphemeralClient would keep the data in-memory instead.\n",
+ "chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
+ "chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "id": "zAaGcYMJzHAN"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.vector_stores.chroma import ChromaVectorStore\n",
+ "\n",
+ "# Define a storage context object using the created vector database.\n",
+ "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "I9JbAzFcjkpn"
+ },
+ "source": [
+ "# Load the Dataset (CSV)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ceveDuYdWCYk"
+ },
+ "source": [
+ "## Download"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "eZwf6pv7WFmD"
+ },
+ "source": [
+ "The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a list of rows."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "wl_pbPvMlv1h",
+ "outputId": "24342259-24f0-44fa-bd0d-21da798d0555"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " % Total % Received % Xferd Average Speed Time Time Time Current\n",
+ " Dload Upload Total Spent Left Speed\n",
+ "100 169k 100 169k 0 0 864k 0 --:--:-- --:--:-- --:--:-- 865k\n"
+ ]
+ }
+ ],
+ "source": [
+ "!curl -o ./mini-llama-articles.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "VWBLtDbUWJfA"
+ },
+ "source": [
+ "## Read File"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "0Q9sxuW0g3Gd",
+ "outputId": "889c1127-cf04-4ce7-d99c-d60826ffe92f"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "14"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import csv\n",
+ "\n",
+ "rows = []\n",
+ "\n",
+ "# Read the CSV file row by row.\n",
+ "with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
+ " csv_reader = csv.reader(file)\n",
+ "\n",
+ " for idx, row in enumerate( csv_reader ):\n",
+ " if idx == 0: continue # Skip the header row\n",
+ " rows.append( row )\n",
+ "\n",
+ "# The number of articles in the dataset.\n",
+ "len( rows )"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "S17g2RYOjmf2"
+ },
+ "source": [
+ "# Convert to Document obj"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "id": "YizvmXPejkJE"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.core import Document\n",
+ "\n",
+ "# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
+ "documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "qjuLbmFuWsyl"
+ },
+ "source": [
+ "# Transforming"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "id": "9z3t70DGWsjO"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.core.text_splitter import TokenTextSplitter\n",
+ "\n",
+ "# Define the splitter object that splits the text into segments of 512 tokens,\n",
+ "# with a 128-token overlap between segments.\n",
+ "text_splitter = TokenTextSplitter(\n",
+ " separator=\" \", chunk_size=512, chunk_overlap=128\n",
+ ")"
+ ]
+ },
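Editor's note: a minimal sketch (not part of the commit) of what the splitter above produces, using a hypothetical filler string; TokenTextSplitter.split_text returns the list of overlapping chunks:

    # Illustrate the 512-token chunks with a 128-token overlap on a toy input.
    from llama_index.core.text_splitter import TokenTextSplitter

    splitter = TokenTextSplitter(separator=" ", chunk_size=512, chunk_overlap=128)
    toy_text = "token " * 2000  # hypothetical filler, roughly 2000 tokens
    chunks = splitter.split_text(toy_text)
    print(len(chunks))  # a handful of overlapping ~512-token chunks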
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 331,
+ "referenced_widgets": [
+ "3fbabd8a8660461ba5e7bc08ef39139a",
+ "df2365556ae242a2ab1a119f9a31a561",
+ "5f4b9d32df8f446e858e4c289dc282f9",
+ "5b588f83a15d42d9aca888e06bbd95ff",
+ "ad073bca655540809e39f26538d2ec0d",
+ "13b9c5395bca4c3ba21265240cb936cf",
+ "47a4586384274577a726c57605e7f8d9",
+ "96a3bdece738481db57e811ccb74a974",
+ "5c7973afd79349ed997a69120d0629b2",
+ "af9b6ae927dd4764b9692507791bc67e",
+ "134210510d49476e959dd7d032bbdbdc",
+ "5f9bb065c2b74d2e8ded32e1306a7807",
+ "73a06bc546a64f7f99a9e4a135319dcd",
+ "ce48deaf4d8c49cdae92bfdbb3a78df0",
+ "4a172e8c6aa44e41a42fc1d9cf714fd0",
+ "0245f2604e4d49c8bd0210302746c47b",
+ "e956dfab55084a9cbe33c8e331b511e7",
+ "cb394578badd43a89850873ad2526542",
+ "193aef33d9184055bb9223f56d456de6",
+ "abfc9aa911ce4a5ea81c7c451f08295f",
+ "e7937a1bc68441a080374911a6563376",
+ "e532ed7bfef34f67b5fcacd9534eb789"
+ ]
+ },
+ "id": "P9LDJ7o-Wsc-",
+ "outputId": "01070c1f-dffa-4ab7-ad71-b07b76b12e03"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Parsing nodes: 0%| | 0/14 [00:00<?, ?it/s]"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Parsing nodes: 100%|██████████| 14/14 [00:00<00:00, 27.40it/s]\n",
+ "100%|██████████| 108/108 [00:59<00:00, 1.81it/s]\n",
+ "100%|██████████| 108/108 [01:08<00:00, 1.58it/s]\n",
+ "100%|██████████| 108/108 [00:27<00:00, 3.88it/s]\n",
+ "Generating embeddings: 100%|██████████| 108/108 [00:01<00:00, 77.68it/s]\n"
+ ]
+ }
+ ],
+ "source": [
+ "from llama_index.core.extractors import (\n",
+ " SummaryExtractor,\n",
+ " QuestionsAnsweredExtractor,\n",
+ " KeywordExtractor,\n",
+ ")\n",
+ "from llama_index.embeddings.openai import OpenAIEmbedding\n",
+ "from llama_index.core.ingestion import IngestionPipeline\n",
+ "\n",
+ "# Create the pipeline to apply the transformation on each chunk,\n",
+ "# and store the transformed text in the chroma vector store.\n",
+ "pipeline = IngestionPipeline(\n",
+ " transformations=[\n",
+ " text_splitter,\n",
+ " QuestionsAnsweredExtractor(questions=3, llm=llm),\n",
+ " SummaryExtractor(summaries=[\"prev\", \"self\"], llm=llm),\n",
+ " KeywordExtractor(keywords=10, llm=llm),\n",
+ " OpenAIEmbedding(),\n",
+ " ],\n",
+ " vector_store=vector_store\n",
+ ")\n",
+ "\n",
+ "# Run the transformation pipeline.\n",
+ "nodes = pipeline.run(documents=documents, show_progress=True);"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "mPGa85hM2P3P",
+ "outputId": "c106c463-2459-4b11-bbae-5bd5e2246011"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "108"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "len( nodes )"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "id": "23x20bL3_jRb"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "updating: mini-llama-articles/ (stored 0%)\n",
+ "updating: mini-llama-articles/chroma.sqlite3 (deflated 65%)\n",
+ " adding: mini-llama-articles/6059cb71-7dfb-4096-aaab-f06eaf1d0ace/ (stored 0%)\n",
+ " adding: mini-llama-articles/6059cb71-7dfb-4096-aaab-f06eaf1d0ace/data_level0.bin (deflated 97%)\n",
+ " adding: mini-llama-articles/6059cb71-7dfb-4096-aaab-f06eaf1d0ace/length.bin (deflated 23%)\n",
+ " adding: mini-llama-articles/6059cb71-7dfb-4096-aaab-f06eaf1d0ace/link_lists.bin (stored 0%)\n",
+ " adding: mini-llama-articles/6059cb71-7dfb-4096-aaab-f06eaf1d0ace/header.bin (deflated 61%)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Compress the vector store directory to a zip file to be able to download and use later.\n",
+ "!zip -r vectorstore.zip mini-llama-articles"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "OWaT6rL7ksp8"
+ },
+ "source": [
+ "# Load Indexes"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "d7mY7AdLjs4F"
+ },
+ "source": [
+ "If you have already uploaded the zip file for the vector store checkpoint, please uncomment the code in the following cell block to extract its contents. After doing so, you will be able to load the dataset from local storage."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "SodY2Xpf_kxg",
+ "outputId": "701258b4-ea35-46d1-df33-536a45752a28"
+ },
+ "outputs": [],
+ "source": [
+ "# !unzip vectorstore.zip"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {
+ "id": "mXi56KTXk2sp"
+ },
+ "outputs": [],
+ "source": [
+ "import chromadb\n",
+ "from llama_index.vector_stores.chroma import ChromaVectorStore\n",
+ "\n",
+ "# Load the vector store from the local storage.\n",
+ "db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
+ "chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
+ "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
+ "id": "jKXURvLtkuTS"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.core import VectorStoreIndex\n",
+ "\n",
+ "# Create the index based on the vector store.\n",
+ "vector_index = VectorStoreIndex.from_vector_store(vector_store)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "XjIQGo11j5N-"
+ },
+ "source": [
+ "# Retrieving All the Nodes"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "RZBPFntrj8tp"
+ },
+ "source": [
+ "To develop a custom retriever with a keyword index, we require access to all the nodes. By using the index as a retriever and requesting a very large number of documents, we can ensure that it returns every document stored in the vector store. (This method is a temporary workaround because LlamaIndex currently lacks the capability to fetch all documents from a chromadb collection. This limitation may be addressed in future updates.)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "Za6m06wpcJpN",
+ "outputId": "98806ea5-5c2d-4a87-97ea-ee37a890c7bf"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Number of requested results 100000000 is greater than number of elements in index 108, updating n_results = 108\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Set similarity_top_k to a large number to retrieve all the nodes\n",
+ "retriever = vector_index.as_retriever(similarity_top_k=100000000)\n",
+ "\n",
+ "# Retrieve all nodes\n",
+ "all_nodes = retriever.retrieve('Hello!')"
+ ]
+ },
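Editor's note: as an alternative to the huge similarity_top_k workaround above, chromadb itself can return every stored record; a minimal sketch (not the notebook's method) using the chroma_collection created earlier. Rebuilding LlamaIndex node objects from the raw records is left out, which is exactly the gap the markdown cell above describes:

    # Fetch every stored record directly from the chromadb collection.
    records = chroma_collection.get(include=["documents", "metadatas"])
    print(len(records["ids"]))  # should report the 108 ingested chunks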
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "id": "2Tz_n2MLj62B"
+ },
+ "outputs": [],
+ "source": [
+ "all_nodes = [item.node for item in all_nodes]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "mquOgF8UnXZi",
+ "outputId": "cd41e132-237e-4e4f-bb35-464dba9307ba"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "108"
+ ]
+ },
+ "execution_count": 20,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "len( all_nodes )"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {
+ "id": "hcmwBAsCZIwR"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.core import SimpleKeywordTableIndex\n",
+ "\n",
+ "# Define the keyword table index using all the nodes.\n",
+ "keyword_index = SimpleKeywordTableIndex(nodes=all_nodes)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "K3wtAa7Lo2Vh"
+ },
+ "source": [
+ "# Custom Retriever"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "metadata": {
+ "id": "txPFNOkUo2Kj"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.core import QueryBundle\n",
+ "from llama_index.core.schema import NodeWithScore\n",
+ "from llama_index.core.retrievers import (\n",
+ " BaseRetriever,\n",
+ " VectorIndexRetriever,\n",
+ " KeywordTableSimpleRetriever,\n",
+ ")\n",
+ "from typing import List\n",
+ "\n",
+ "# A custom retriever that can use both the vector index and the keyword index to retrieve nodes.\n",
+ "# It has two modes: \"AND\", which keeps only nodes retrieved by both indexes,\n",
+ "# and \"OR\", which merges the nodes retrieved by either index.\n",
+ "class CustomRetriever(BaseRetriever):\n",
+ " \"\"\"Custom retriever that performs both semantic search and hybrid search.\"\"\"\n",
+ "\n",
+ " def __init__(\n",
+ " self,\n",
+ " vector_retriever: VectorIndexRetriever,\n",
+ " keyword_retriever: KeywordTableSimpleRetriever,\n",
+ " mode: str = \"AND\",\n",
+ " ) -> None:\n",
+ " \"\"\"Init params.\"\"\"\n",
+ "\n",
+ " self._vector_retriever = vector_retriever\n",
+ " self._keyword_retriever = keyword_retriever\n",
+ " if mode not in (\"AND\", \"OR\"):\n",
+ " raise ValueError(\"Invalid mode.\")\n",
+ " self._mode = mode\n",
+ " super().__init__()\n",
+ "\n",
+ " def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:\n",
+ " \"\"\"Retrieve nodes given query.\"\"\"\n",
+ "\n",
+ " vector_nodes = self._vector_retriever.retrieve(query_bundle)\n",
+ " keyword_nodes = self._keyword_retriever.retrieve(query_bundle)\n",
+ "\n",
+ " vector_ids = {n.node.node_id for n in vector_nodes}\n",
+ " keyword_ids = {n.node.node_id for n in keyword_nodes}\n",
+ "\n",
+ " combined_dict = {n.node.node_id: n for n in vector_nodes}\n",
+ " combined_dict.update({n.node.node_id: n for n in keyword_nodes})\n",
+ "\n",
+ " if self._mode == \"AND\":\n",
+ " retrieve_ids = vector_ids.intersection(keyword_ids)\n",
+ " else:\n",
+ " retrieve_ids = vector_ids.union(keyword_ids)\n",
+ "\n",
+ " retrieve_nodes = [combined_dict[rid] for rid in retrieve_ids]\n",
+ "\n",
+ " return retrieve_nodes"
+ ]
+ },
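Editor's note: a small sketch (not in the commit) of the set logic the two modes implement, using plain Python sets with hypothetical node IDs:

    # "AND" keeps the intersection; "OR" keeps the union.
    vector_ids = {"n1", "n2", "n3"}   # hypothetical IDs from the vector retriever
    keyword_ids = {"n2", "n3", "n4"}  # hypothetical IDs from the keyword retriever

    and_ids = vector_ids & keyword_ids  # {"n2", "n3"}
    or_ids = vector_ids | keyword_ids   # {"n1", "n2", "n3", "n4"}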
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {
+ "id": "YWLckX40pii-"
+ },
+ "outputs": [],
+ "source": [
+ "from llama_index.core import get_response_synthesizer\n",
+ "from llama_index.core.query_engine import RetrieverQueryEngine\n",
+ "\n",
+ "# define custom retriever\n",
+ "vector_retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=2)\n",
+ "keyword_retriever = KeywordTableSimpleRetriever(index=keyword_index, max_keywords_per_query=2)\n",
+ "custom_retriever = CustomRetriever(vector_retriever, keyword_retriever, \"OR\")\n",
+ "\n",
+ "# define response synthesizer\n",
+ "response_synthesizer = get_response_synthesizer()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8JPD8yAinVSq"
+ },
+ "source": [
+ "# Query Dataset"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {
+ "id": "b0gue7cyctt1"
+ },
+ "outputs": [],
+ "source": [
+ "# Define a query engine that is responsible for retrieving related pieces of text,\n",
+ "# and using an LLM to formulate the final answer.\n",
+ "custom_query_engine = RetrieverQueryEngine(\n",
+ " retriever=custom_retriever,\n",
+ " response_synthesizer=response_synthesizer,\n",
+ ")\n",
+ "\n",
+ "res = custom_query_engine.query(\"How many parameters LLaMA2 model has?\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ },
+ "id": "VKK3jMprctre",
+ "outputId": "370a6a1a-133d-428f-80c7-28777f4349b3"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'The LLaMA2 model has 52 billion parameters.'"
+ ]
+ },
+ "execution_count": 25,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "res.response"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "465dH4yQc7Ct",
+ "outputId": "8f43f543-40b1-4f63-a433-d59b33545774"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Node ID\t 322a5cb0-5b0c-413f-bc5e-e72747b385d1\n",
+ "Title\t Building Intuition on the Concepts behind LLMs like ChatGPT - Part 1- Neural Networks, Transformers, Pretraining, and Fine Tuning\n",
+ "Text\t backpropagation, the degree of the error of the model (the loss value) is propagated backward through the neural network. It computes the derivative to the output of each individual weight and bias i.e. how sensitive the output is to changes in each specific parameter. For my people who didn't take on differential calculus in school (such as myself), think of the model parameters (weights/biases) as adjustable knobs. These knobs are arbitrary - in the sense that you can't tell in what specific way it governs the prediction ability of the model. The knobs, which can be rotated clockwise or counterclockwise have different effects on the behavior of the output. Knob A might increase the loss 3x when turned clockwise, knob B reduces the loss by 1/8 when turned counterclockwise (and so on). All these knobs are checked (all billions of them) and to get information on how sensitive the output is to adjustments of each knob - this numerical value is their derivative with respect to the output. Calculating these derivatives is called backpropagation. The output of backpropagation is a vector (a list of numbers) whose elements or dimensions consist of the parameters' individual derivatives. This vector is the gradient of the error with respect to the existing parameter values (or the current learnings) of the neural network. A vector has two properties: length or magnitude and direction. The gradient vector contains information on the direction in which the error or loss is increasing. The magnitude of the vector signifies the steepness or rate of increase. Think of the gradient vector as the map of a foggy hill you're descending from - gradient descent optimization is using the information about direction and steepness from the gradient vector to reach the bottom of the hill (the minimum loss value) as efficiently as possible by navigating to the path with the greatest downward incline (the opposite direction of the gradient vector). This involves iteratively adjusting the values of the weights and biases of the network (by subtracting small values to it i.e. the learning rate) en masse to reach this optimal state. After these steps, the hope\n",
+ "Score\t None\n",
+ "-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
+ "Node ID\t f097d19f-45bd-402b-9547-5482f57110ea\n",
+ "Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
+ "Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
+ "Score\t 0.7156515131319103\n",
+ "-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
+ "Node ID\t 22cea8a0-aea7-4405-b7e1-a2cb02ff10e8\n",
+ "Title\t The Generative AI Revolution: Exploring the Current Landscape\n",
+ "Text\t Cloud announced its partnership with Cohere. The company intends to use Cloud's TPU for the development and deployment of its products, and Sagemaker by Amazon also gives access to Cohere's language AI. Cohere powers Hyperwrite, which helps in quickly generating articles. AWS has also announced a partnership with Cohere AI. To date, Cohere has raised $170 million, and with the ongoing rush of funding in AI platforms, the Canadian startup is expected to be valued at $6 billion. Cohere is set to introduce a new dialogue model to aid enterprise users in generating text while engaging with the model to fine-tune the output. Cohere's Xlarge model resembles ChatGPT but provides developers and businesses with access to this technology. Cohere's base model has 52 billion parameters compared to OpenAI's GPT-3 DaVinci model, which has 175B parameters. Cohere stresses on accuracy, speed, safety, cost, and ease of use for its users and has paid much attention to the product and its design, developing a cohesive model. 8. Anthropic AI's Claude Anthropic is an American AI startup and public benefit corporation founded in 2021 by Daniela Amodei and Dario Amodei, former members of OpenAI. The company specializes in developing AI systems and language models, with a particular focus on transformer architecture. Anthropic's research on the interpretability of machine learning systems covers fields ranging from natural language and interpretability to human feedback, scaling laws, reinforcement learning, and code generation, among others. The company stresses the application of responsible AI and presents itself as an AI safety and research company working towards building reliable, steerable, and interpretable AI systems. By 2022, Google had invested nearly $400 million in Anthropic, resulting in a formal partnership between the two companies and giving Google a 10% stake in Anthropic. Outside backing amounted to $580 million, with total investments in Anthropic exceeding $1 billion to date. Anthropic has developed a conversational large language model AI chatbot named Claude, which uses a messaging interface and a technique called constitutional AI to better align AI systems with human intentions. AnthropicLM v4-s3 is a\n",
+ "Score\t None\n",
+ "-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
+ "Node ID\t 603fb039-960c-4c3e-a98a-a65c57ab6761\n",
+ "Title\t Building Intuition on the Concepts behind LLMs like ChatGPT - Part 1- Neural Networks, Transformers, Pretraining, and Fine Tuning\n",
+ "Text\t published by OpenAI, to train better models, increasing the number of parameters is 3x more important than increasing the size of the training data. (Note: DeepMind has since published a paper with a differing view.) This translates to a significant increase in computational requirements, as handling a larger number of parameters demands more complex calculations. Parallelization, which is the process of dividing a single task into multiple sub-tasks that can be processed simultaneously across multiple compute resources, becomes essential in dealing with this problem. Parallelization is difficult to achieve with RNNs given their sequential nature. This is not an issue for transformers as it computes relationships between all elements in a sequence simultaneously, rather than sequentially. It also means that they work well with GPUs or video cards. Graphics rendering requires a large number of simple calculations happening concurrently. The numerous, small, and efficient processing cores that a GPU has, which are designed for simultaneous operations, make it a good fit for tasks such as matrix and vector operations that are central to deep learning. AI going 'mainstream' and the mad scramble to build larger and better models is a boon to GPU manufacturers. NVIDIA- specifically - whose stock price has grown 200% YTD as of this writing, has made them the highest-performing stock this year and pushed their market cap to USD 1 trillion. They join megacaps like Apple, Google, Microsoft, and Amazon in this exclusive club. The Transformer is a decidedly complex topic and the explanation above wholesale left out important concepts in order to be more digestible to a broader audience. If you want to know more, I found these gentle yet significantly more fleshed-out introductions to the topic: Jay Allamar's illustrated transformer, Lili Jiang's potion analogy, or if you want something more advanced - Karpathy's nanoGPT that babbles in Shakepear-ish. Fine-tuning 'chat' models like ChatGPT The output of pretrainings are base models or foundation models. Examples of recently released text-generation foundation models are GPT-4, Bard, LLaMa 1 & 2, and Claude 1\n",
+ "Score\t None\n",
+ "-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
+ "Node ID\t 56881e5c-1c47-48bd-be19-df7ada6ab593\n",
+ "Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
+ "Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
+ "Score\t 0.7009231750702649\n",
+ "-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
+ "Node ID\t 4aada7f3-39f9-4911-ae2a-fb57876ee4a4\n",
+ "Title\t Exploring Large Language Models -Part 3\n",
+ "Text\t concept with toy datasets. The real trouble is making the model 'understand' the data first and not just parrot it out. Without understanding, it will parrot out the answer based on the similarity of the question in the training set, or both the question and answer. To prevent this, the authors have an intermediate step called 'Recite' where the model is made to recite/output the relevant passages and, after that, output the answer. Just to be clear, there is no doubt now (2023), especially with GPT3/4, LLAMA2 and similar models about the feasibility of this use case, that a model can understand the question, has some ability for causal reasoning, and can generalize to learn a world model from its training data, and to use both to create a well-formed answer to the question. Let's see the difficulties one by one however, of training a large model. First is the importance of the model size. This GIF from the Google AI blog illustrates this beautifully. It is relatively easy and cost-efficient to train or fine-tune a small model with our custom data, as the GPU and infrastructure requirements are very less. On the contrary, it needs huge fleets of GPUs and training infrastructure to load very large language models and fine-tune them (without quantisation) in a distributed way (e.g. see libraries like DeepSpeed) LLMs come in various sizes, based on the number of trainable parameters or weights. The smaller ones, which have less than 1 billion parameters (GPT2 124 M, Bloom 560M, Flan-T5 783 M ) etc can be trained on a laptop GPU with 8 to 15 GB GPU RAM ) For quite some time, this is what I tried. I tried to overfit a small test data set on decoder models like GPP2-small, GPT-Medium, and Bloom and encoder-decoder models like Flan-T5, thinking somehow that the understanding we see in ChatGPT ( see- unsupervised learning Part 1) may come in some form if we train on these smaller models. ( less than one billion parameters). As per the paper, I tried both Causal training, where the model is presented with only previous tokens, and Masked\n",
+ "Score\t None\n",
+ "-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Show the retrieved nodes\n",
+ "for src in res.source_nodes:\n",
+ " print(\"Node ID\\t\", src.node_id)\n",
+ " print(\"Title\\t\", src.metadata['title'])\n",
+ " print(\"Text\\t\", src.text)\n",
+ " print(\"Score\\t\", src.score)\n",
+ " print(\"-_\"*20)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iMkpzH7vvb09"
+ },
+ "source": [
+ "# Evaluate"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {
+ "id": "H8a3eKgKvckU"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "100%|██████████| 108/108 [06:17<00:00, 3.49s/it]\n"
+ ]
+ }
+ ],
+ "source": [
+ "from llama_index.core.evaluation import generate_question_context_pairs\n",
+ "from llama_index.llms.openai import OpenAI\n",
+ "\n",
+ "# Create questions for each segment. These questions will be used to\n",
+ "# assess whether the retriever can accurately identify and return the\n",
+ "# corresponding segment when queried.\n",
+ "llm = OpenAI(model=\"gpt-3.5-turbo-0125\")\n",
+ "rag_eval_dataset = generate_question_context_pairs(\n",
+ " nodes,\n",
+ " llm=llm,\n",
+ " num_questions_per_chunk=1\n",
+ ")\n",
+ "\n",
+ "# We can save the evaluation dataset as a json file for later use.\n",
+ "rag_eval_dataset.save_json(\"./rag_eval_dataset.json\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "0O7cLF_TlnZV"
+ },
+ "source": [
+ "If you have already uploaded the generated-questions JSON file, please uncomment the code in the next cell block. This avoids regenerating the questions, saving you time and effort."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "3sA1K84U254o"
+ },
+ "outputs": [],
+ "source": [
+ "# from llama_index.finetuning.embeddings.common import (\n",
+ "# EmbeddingQAFinetuneDataset,\n",
+ "# )\n",
+ "# rag_eval_dataset = EmbeddingQAFinetuneDataset.from_json(\n",
+ "# \"./rag_eval_dataset.json\"\n",
+ "# )"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {
+ "id": "H7ubvcbk27vr"
+ },
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# A simple function to show the evaluation result.\n",
+ "def display_results_retriever(name, eval_results):\n",
+ " \"\"\"Display results from evaluate.\"\"\"\n",
+ "\n",
+ " metric_dicts = []\n",
+ " for eval_result in eval_results:\n",
+ " metric_dict = eval_result.metric_vals_dict\n",
+ " metric_dicts.append(metric_dict)\n",
+ "\n",
+ " full_df = pd.DataFrame(metric_dicts)\n",
+ "\n",
+ " hit_rate = full_df[\"hit_rate\"].mean()\n",
+ " mrr = full_df[\"mrr\"].mean()\n",
+ "\n",
+ " metric_df = pd.DataFrame(\n",
+ " {\"Retriever Name\": [name], \"Hit Rate\": [hit_rate], \"MRR\": [mrr]}\n",
+ " )\n",
+ "\n",
+ " return metric_df"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 435
+ },
+ "id": "uNLxDxoc2-Ac",
+ "outputId": "93f03e7e-2590-46f0-fce0-3e8b29852a88"
+ },
+ "outputs": [
+ {
+ "ename": "ValidationError",
+ "evalue": "1 validation error for RetrieverEvaluator\nretriever\n instance of BaseRetriever expected (type=type_error.arbitrary_type; expected_arbitrary_type=BaseRetriever)",
+ "output_type": "error",
+ "traceback": [
+ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+ "\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
+ "Cell \u001b[0;32mIn[29], line 11\u001b[0m\n\u001b[1;32m 6\u001b[0m custom_retriever \u001b[38;5;241m=\u001b[39m CustomRetriever(vector_retriever, keyword_retriever, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mOR\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 7\u001b[0m custom_query_engine \u001b[38;5;241m=\u001b[39m RetrieverQueryEngine(\n\u001b[1;32m 8\u001b[0m retriever\u001b[38;5;241m=\u001b[39mcustom_retriever,\n\u001b[1;32m 9\u001b[0m response_synthesizer\u001b[38;5;241m=\u001b[39mresponse_synthesizer,\n\u001b[1;32m 10\u001b[0m )\n\u001b[0;32m---> 11\u001b[0m retriever_evaluator \u001b[38;5;241m=\u001b[39m \u001b[43mRetrieverEvaluator\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_metric_names\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 12\u001b[0m \u001b[43m \u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mmrr\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mhit_rate\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mretriever\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcustom_query_engine\u001b[49m\n\u001b[1;32m 13\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 14\u001b[0m eval_results \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mawait\u001b[39;00m retriever_evaluator\u001b[38;5;241m.\u001b[39maevaluate_dataset(rag_eval_dataset)\n\u001b[1;32m 15\u001b[0m \u001b[38;5;28mprint\u001b[39m(display_results_retriever(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRetriever top_\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mi\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m, eval_results))\n",
+ "File \u001b[0;32m~/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/llama_index/core/evaluation/retrieval/base.py:99\u001b[0m, in \u001b[0;36mBaseRetrievalEvaluator.from_metric_names\u001b[0;34m(cls, metric_names, **kwargs)\u001b[0m\n\u001b[1;32m 91\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Create evaluator from metric names.\u001b[39;00m\n\u001b[1;32m 92\u001b[0m \n\u001b[1;32m 93\u001b[0m \u001b[38;5;124;03mArgs:\u001b[39;00m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 96\u001b[0m \n\u001b[1;32m 97\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 98\u001b[0m metric_types \u001b[38;5;241m=\u001b[39m resolve_metrics(metric_names)\n\u001b[0;32m---> 99\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mmetrics\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[43mmetric\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mfor\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mmetric\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mmetric_types\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
+ "File \u001b[0;32m~/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/llama_index/core/evaluation/retrieval/evaluator.py:45\u001b[0m, in \u001b[0;36mRetrieverEvaluator.__init__\u001b[0;34m(self, metrics, retriever, node_postprocessors, **kwargs)\u001b[0m\n\u001b[1;32m 37\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__init__\u001b[39m(\n\u001b[1;32m 38\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 39\u001b[0m metrics: Sequence[BaseRetrievalMetric],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 42\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs: Any,\n\u001b[1;32m 43\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 44\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"Init params.\"\"\"\u001b[39;00m\n\u001b[0;32m---> 45\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 46\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetrics\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetrics\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 47\u001b[0m \u001b[43m \u001b[49m\u001b[43mretriever\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mretriever\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 48\u001b[0m \u001b[43m \u001b[49m\u001b[43mnode_postprocessors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnode_postprocessors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 49\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 50\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n",
+ "File \u001b[0;32m~/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/pydantic/main.py:341\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.__init__\u001b[0;34m()\u001b[0m\n",
+ "\u001b[0;31mValidationError\u001b[0m: 1 validation error for RetrieverEvaluator\nretriever\n instance of BaseRetriever expected (type=type_error.arbitrary_type; expected_arbitrary_type=BaseRetriever)"
+ ]
+ }
+ ],
904 |
+
"source": [
|
905 |
+
"from llama_index.core.evaluation import RetrieverEvaluator\n",
|
906 |
+
"\n",
|
907 |
+
"# We can evaluate the retievers with different top_k values.\n",
|
908 |
+
"for i in [2, 4, 6, 8, 10]:\n",
|
909 |
+
" vector_retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=i)\n",
|
910 |
+
" custom_retriever = CustomRetriever(vector_retriever, keyword_retriever, \"OR\")\n",
|
911 |
+
" custom_query_engine = RetrieverQueryEngine(\n",
|
912 |
+
" retriever=custom_retriever,\n",
|
913 |
+
" response_synthesizer=response_synthesizer,\n",
|
914 |
+
" )\n",
|
915 |
+
" retriever_evaluator = RetrieverEvaluator.from_metric_names(\n",
|
916 |
+
" [\"mrr\", \"hit_rate\"], retriever=custom_query_engine\n",
|
917 |
+
" )\n",
|
918 |
+
" eval_results = await retriever_evaluator.aevaluate_dataset(rag_eval_dataset)\n",
|
919 |
+
" print(display_results_retriever(f\"Retriever top_{i}\", eval_results))"
|
920 |
+
]
|
921 |
+
},
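The ValidationError above is raised because RetrieverEvaluator expects a BaseRetriever, while the cell passes the RetrieverQueryEngine wrapper built on top of it. A minimal sketch of the corrected loop, reusing the names from the cells above (it assumes CustomRetriever subclasses BaseRetriever, as in the earlier hybrid-search setup):

from llama_index.core.evaluation import RetrieverEvaluator

for i in [2, 4, 6, 8, 10]:
    vector_retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=i)
    custom_retriever = CustomRetriever(vector_retriever, keyword_retriever, "OR")
    # Pass the retriever itself, not the query engine built on top of it.
    retriever_evaluator = RetrieverEvaluator.from_metric_names(
        ["mrr", "hit_rate"], retriever=custom_retriever
    )
    eval_results = await retriever_evaluator.aevaluate_dataset(rag_eval_dataset)
    print(display_results_retriever(f"Retriever top_{i}", eval_results))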
|
922 |
+
{
|
923 |
+
"cell_type": "code",
|
924 |
+
"execution_count": null,
|
925 |
+
"metadata": {
|
926 |
+
"id": "1MB1YD1E3EKM"
|
927 |
+
},
|
928 |
+
"outputs": [],
|
929 |
+
"source": []
|
930 |
+
}
|
931 |
+
],
|
932 |
+
"metadata": {
|
933 |
+
"colab": {
|
934 |
+
"authorship_tag": "ABX9TyO362/noWgs82KNvLAlRlkT",
|
935 |
+
"include_colab_link": true,
|
936 |
+
"provenance": []
|
937 |
+
},
|
938 |
+
"kernelspec": {
|
939 |
+
"display_name": "Python 3",
|
940 |
+
"name": "python3"
|
941 |
+
},
|
942 |
+
"language_info": {
|
943 |
+
"codemirror_mode": {
|
944 |
+
"name": "ipython",
|
945 |
+
"version": 3
|
946 |
+
},
|
947 |
+
"file_extension": ".py",
|
948 |
+
"mimetype": "text/x-python",
|
949 |
+
"name": "python",
|
950 |
+
"nbconvert_exporter": "python",
|
951 |
+
"pygments_lexer": "ipython3",
|
952 |
+
"version": "3.11.8"
|
953 |
+
}
|
1642 |
+
},
|
1643 |
+
"nbformat": 4,
|
1644 |
+
"nbformat_minor": 0
|
1645 |
+
}
|
notebooks/12-Improve_Query.ipynb
ADDED
@@ -0,0 +1,1786 @@
|
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"colab_type": "text",
|
7 |
+
"id": "view-in-github"
|
8 |
+
},
|
9 |
+
"source": [
|
10 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/12-Improve_Query.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "markdown",
|
15 |
+
"metadata": {
|
16 |
+
"id": "-zE1h0uQV7uT"
|
17 |
+
},
|
18 |
+
"source": [
|
19 |
+
"# Install Packages and Setup Variables"
|
20 |
+
]
|
21 |
+
},
|
22 |
+
{
|
23 |
+
"cell_type": "code",
|
24 |
+
"execution_count": 1,
|
25 |
+
"metadata": {
|
26 |
+
"colab": {
|
27 |
+
"base_uri": "https://localhost:8080/"
|
28 |
+
},
|
29 |
+
"id": "QPJzr-I9XQ7l",
|
30 |
+
"outputId": "5d48c88b-a0a9-49ff-d788-e076d1cb4ead"
|
31 |
+
},
|
32 |
+
"outputs": [],
|
33 |
+
"source": [
|
34 |
+
"!pip install -q llama-index==0.10.11 openai==1.12.0 llama-index-finetuning llama-index-embeddings-huggingface llama-index-readers-web tiktoken==0.6.0 chromadb==0.4.22 pandas==2.2.0 html2text sentence_transformers pydantic kaleido==0.2.1"
|
35 |
+
]
|
36 |
+
},
|
37 |
+
{
|
38 |
+
"cell_type": "code",
|
39 |
+
"execution_count": 1,
|
40 |
+
"metadata": {
|
41 |
+
"id": "riuXwpSPcvWC"
|
42 |
+
},
|
43 |
+
"outputs": [],
|
44 |
+
"source": [
|
45 |
+
"import os\n",
|
46 |
+
"\n",
|
47 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
48 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
|
49 |
+
]
|
50 |
+
},
|
51 |
+
{
|
52 |
+
"cell_type": "code",
|
53 |
+
"execution_count": 2,
|
54 |
+
"metadata": {
|
55 |
+
"id": "jIEeZzqLbz0J"
|
56 |
+
},
|
57 |
+
"outputs": [],
|
58 |
+
"source": [
|
59 |
+
"# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
|
60 |
+
"\n",
|
61 |
+
"import nest_asyncio\n",
|
62 |
+
"\n",
|
63 |
+
"nest_asyncio.apply()"
|
64 |
+
]
|
65 |
+
},
|
66 |
+
{
|
67 |
+
"cell_type": "markdown",
|
68 |
+
"metadata": {
|
69 |
+
"id": "Bkgi2OrYzF7q"
|
70 |
+
},
|
71 |
+
"source": [
|
72 |
+
"# Load a Model"
|
73 |
+
]
|
74 |
+
},
|
75 |
+
{
|
76 |
+
"cell_type": "code",
|
77 |
+
"execution_count": 4,
|
78 |
+
"metadata": {
|
79 |
+
"id": "9oGT6crooSSj"
|
80 |
+
},
|
81 |
+
"outputs": [
|
82 |
+
{
|
83 |
+
"name": "stderr",
|
84 |
+
"output_type": "stream",
|
85 |
+
"text": [
|
86 |
+
"/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
87 |
+
" from .autonotebook import tqdm as notebook_tqdm\n"
|
88 |
+
]
|
89 |
+
}
|
90 |
+
],
|
91 |
+
"source": [
|
92 |
+
"from llama_index.llms.openai import OpenAI\n",
|
93 |
+
"\n",
|
94 |
+
"llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
95 |
+
]
|
96 |
+
},
|
97 |
+
{
|
98 |
+
"cell_type": "markdown",
|
99 |
+
"metadata": {
|
100 |
+
"id": "0BwVuJXlzHVL"
|
101 |
+
},
|
102 |
+
"source": [
|
103 |
+
"# Create a VectoreStore"
|
104 |
+
]
|
105 |
+
},
|
106 |
+
{
|
107 |
+
"cell_type": "code",
|
108 |
+
"execution_count": 5,
|
109 |
+
"metadata": {
|
110 |
+
"id": "SQP87lHczHKc"
|
111 |
+
},
|
112 |
+
"outputs": [],
|
113 |
+
"source": [
|
114 |
+
"import chromadb\n",
|
115 |
+
"\n",
|
116 |
+
"# create client and a new collection\n",
|
117 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
118 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
119 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
120 |
+
]
|
121 |
+
},
|
122 |
+
{
|
123 |
+
"cell_type": "code",
|
124 |
+
"execution_count": 7,
|
125 |
+
"metadata": {
|
126 |
+
"id": "zAaGcYMJzHAN"
|
127 |
+
},
|
128 |
+
"outputs": [],
|
129 |
+
"source": [
|
130 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
131 |
+
"\n",
|
132 |
+
"# Define a storage context object using the created vector database.\n",
|
133 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
134 |
+
]
|
135 |
+
},
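A PersistentClient writes the collection to the ./mini-llama-articles directory on disk, so it survives restarts. A quick sanity check before ingestion, using chromadb's count() method (the collection should be empty at this point):

# Number of embedded chunks currently stored in the collection.
print(chroma_collection.count())  # expected: 0 before the ingestion pipeline runs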
|
136 |
+
{
|
137 |
+
"cell_type": "markdown",
|
138 |
+
"metadata": {
|
139 |
+
"id": "I9JbAzFcjkpn"
|
140 |
+
},
|
141 |
+
"source": [
|
142 |
+
"# Load the Dataset (CSV)"
|
143 |
+
]
|
144 |
+
},
|
145 |
+
{
|
146 |
+
"cell_type": "markdown",
|
147 |
+
"metadata": {
|
148 |
+
"id": "ceveDuYdWCYk"
|
149 |
+
},
|
150 |
+
"source": [
|
151 |
+
"## Download"
|
152 |
+
]
|
153 |
+
},
|
154 |
+
{
|
155 |
+
"cell_type": "markdown",
|
156 |
+
"metadata": {
|
157 |
+
"id": "eZwf6pv7WFmD"
|
158 |
+
},
|
159 |
+
"source": [
|
160 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a long string."
|
161 |
+
]
|
162 |
+
},
|
163 |
+
{
|
164 |
+
"cell_type": "code",
|
165 |
+
"execution_count": 8,
|
166 |
+
"metadata": {
|
167 |
+
"colab": {
|
168 |
+
"base_uri": "https://localhost:8080/"
|
169 |
+
},
|
170 |
+
"id": "wl_pbPvMlv1h",
|
171 |
+
"outputId": "a453b612-20a8-4396-d22b-b19d2bc47816"
|
172 |
+
},
|
173 |
+
"outputs": [
|
174 |
+
{
|
175 |
+
"name": "stdout",
|
176 |
+
"output_type": "stream",
|
177 |
+
"text": [
|
178 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
179 |
+
" Dload Upload Total Spent Left Speed\n",
|
180 |
+
"100 169k 100 169k 0 0 915k 0 --:--:-- --:--:-- --:--:-- 911k\n"
|
181 |
+
]
|
182 |
+
}
|
183 |
+
],
|
184 |
+
"source": [
|
185 |
+
"!curl -o ./mini-llama-articles.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
186 |
+
]
|
187 |
+
},
|
188 |
+
{
|
189 |
+
"cell_type": "markdown",
|
190 |
+
"metadata": {
|
191 |
+
"id": "VWBLtDbUWJfA"
|
192 |
+
},
|
193 |
+
"source": [
|
194 |
+
"## Read File"
|
195 |
+
]
|
196 |
+
},
|
197 |
+
{
|
198 |
+
"cell_type": "code",
|
199 |
+
"execution_count": 9,
|
200 |
+
"metadata": {
|
201 |
+
"colab": {
|
202 |
+
"base_uri": "https://localhost:8080/"
|
203 |
+
},
|
204 |
+
"id": "0Q9sxuW0g3Gd",
|
205 |
+
"outputId": "49b27d8a-1f96-4e8d-fa0f-27afbf2c395c"
|
206 |
+
},
|
207 |
+
"outputs": [
|
208 |
+
{
|
209 |
+
"data": {
|
210 |
+
"text/plain": [
|
211 |
+
"14"
|
212 |
+
]
|
213 |
+
},
|
214 |
+
"execution_count": 9,
|
215 |
+
"metadata": {},
|
216 |
+
"output_type": "execute_result"
|
217 |
+
}
|
218 |
+
],
|
219 |
+
"source": [
|
220 |
+
"import csv\n",
|
221 |
+
"\n",
|
222 |
+
"rows = []\n",
|
223 |
+
"\n",
|
224 |
+
"# Load the file as a JSON\n",
|
225 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
226 |
+
" csv_reader = csv.reader(file)\n",
|
227 |
+
"\n",
|
228 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
229 |
+
" if idx == 0: continue; # Skip header row\n",
|
230 |
+
" rows.append( row )\n",
|
231 |
+
"\n",
|
232 |
+
"# The number of characters in the dataset.\n",
|
233 |
+
"len( rows )"
|
234 |
+
]
|
235 |
+
},
|
236 |
+
{
|
237 |
+
"cell_type": "markdown",
|
238 |
+
"metadata": {
|
239 |
+
"id": "S17g2RYOjmf2"
|
240 |
+
},
|
241 |
+
"source": [
|
242 |
+
"# Convert to Document obj"
|
243 |
+
]
|
244 |
+
},
|
245 |
+
{
|
246 |
+
"cell_type": "code",
|
247 |
+
"execution_count": 10,
|
248 |
+
"metadata": {
|
249 |
+
"id": "YizvmXPejkJE"
|
250 |
+
},
|
251 |
+
"outputs": [],
|
252 |
+
"source": [
|
253 |
+
"from llama_index.core import Document\n",
|
254 |
+
"\n",
|
255 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
256 |
+
"documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
|
257 |
+
]
|
258 |
+
},
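Each Document keeps the full article text plus the metadata attached above, which survives the later chunking step and can be used for filtering or display. A quick sanity check on the objects just created, using the names from the cells above:

# One Document per CSV row; metadata carries title, url, and source_name.
print(len(documents))          # 14 articles
print(documents[0].metadata)   # {'title': ..., 'url': ..., 'source_name': ...}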
|
259 |
+
{
|
260 |
+
"cell_type": "markdown",
|
261 |
+
"metadata": {
|
262 |
+
"id": "qjuLbmFuWsyl"
|
263 |
+
},
|
264 |
+
"source": [
|
265 |
+
"# Transforming"
|
266 |
+
]
|
267 |
+
},
|
268 |
+
{
|
269 |
+
"cell_type": "code",
|
270 |
+
"execution_count": 11,
|
271 |
+
"metadata": {
|
272 |
+
"id": "9z3t70DGWsjO"
|
273 |
+
},
|
274 |
+
"outputs": [],
|
275 |
+
"source": [
|
276 |
+
"from llama_index.core.text_splitter import TokenTextSplitter\n",
|
277 |
+
"\n",
|
278 |
+
"text_splitter = TokenTextSplitter(\n",
|
279 |
+
" separator=\" \", chunk_size=512, chunk_overlap=128\n",
|
280 |
+
")"
|
281 |
+
]
|
282 |
+
},
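With chunk_size=512 and chunk_overlap=128, consecutive chunks share roughly a quarter of their tokens, so sentences that straddle a chunk boundary stay retrievable from both sides. A small sketch of the splitter applied to a single article before running the full pipeline (names from the cells above):

# Split the first article on its own to inspect the chunking behaviour.
chunks = text_splitter.split_text(documents[0].text)
print(len(chunks), "chunks from the first article")
print(chunks[0][:200])  # preview the start of the first chunk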
|
283 |
+
{
|
284 |
+
"cell_type": "code",
|
285 |
+
"execution_count": 12,
|
286 |
+
"metadata": {
|
287 |
+
"colab": {
|
288 |
+
"base_uri": "https://localhost:8080/",
|
289 |
+
"height": 331,
|
290 |
+
"referenced_widgets": [
|
291 |
+
"3fbabd8a8660461ba5e7bc08ef39139a",
|
292 |
+
"df2365556ae242a2ab1a119f9a31a561",
|
293 |
+
"5f4b9d32df8f446e858e4c289dc282f9",
|
294 |
+
"5b588f83a15d42d9aca888e06bbd95ff",
|
295 |
+
"ad073bca655540809e39f26538d2ec0d",
|
296 |
+
"13b9c5395bca4c3ba21265240cb936cf",
|
297 |
+
"47a4586384274577a726c57605e7f8d9",
|
298 |
+
"96a3bdece738481db57e811ccb74a974",
|
299 |
+
"5c7973afd79349ed997a69120d0629b2",
|
300 |
+
"af9b6ae927dd4764b9692507791bc67e",
|
301 |
+
"134210510d49476e959dd7d032bbdbdc",
|
302 |
+
"5f9bb065c2b74d2e8ded32e1306a7807",
|
303 |
+
"73a06bc546a64f7f99a9e4a135319dcd",
|
304 |
+
"ce48deaf4d8c49cdae92bfdbb3a78df0",
|
305 |
+
"4a172e8c6aa44e41a42fc1d9cf714fd0",
|
306 |
+
"0245f2604e4d49c8bd0210302746c47b",
|
307 |
+
"e956dfab55084a9cbe33c8e331b511e7",
|
308 |
+
"cb394578badd43a89850873ad2526542",
|
309 |
+
"193aef33d9184055bb9223f56d456de6",
|
310 |
+
"abfc9aa911ce4a5ea81c7c451f08295f",
|
311 |
+
"e7937a1bc68441a080374911a6563376",
|
312 |
+
"e532ed7bfef34f67b5fcacd9534eb789"
|
313 |
+
]
|
314 |
+
},
|
315 |
+
"id": "P9LDJ7o-Wsc-",
|
316 |
+
"outputId": "01070c1f-dffa-4ab7-ad71-b07b76b12e03"
|
317 |
+
},
|
318 |
+
"outputs": [
|
319 |
+
{
|
320 |
+
"name": "stderr",
|
321 |
+
"output_type": "stream",
|
322 |
+
"text": [
|
323 |
+
"Parsing nodes: 0%| | 0/14 [00:00<?, ?it/s]"
|
324 |
+
]
|
325 |
+
},
|
326 |
+
{
|
327 |
+
"name": "stderr",
|
328 |
+
"output_type": "stream",
|
329 |
+
"text": [
|
330 |
+
"Parsing nodes: 100%|██████████| 14/14 [00:00<00:00, 28.28it/s]\n",
|
331 |
+
"100%|██████████| 108/108 [01:36<00:00, 1.12it/s]\n",
|
332 |
+
"100%|██████████| 108/108 [01:22<00:00, 1.30it/s]\n",
|
333 |
+
"100%|██████████| 108/108 [00:29<00:00, 3.72it/s]\n",
|
334 |
+
"Generating embeddings: 100%|█████████���| 108/108 [00:02<00:00, 38.77it/s]\n"
|
335 |
+
]
|
336 |
+
}
|
337 |
+
],
|
338 |
+
"source": [
|
339 |
+
"from llama_index.core.extractors import (\n",
|
340 |
+
" SummaryExtractor,\n",
|
341 |
+
" QuestionsAnsweredExtractor,\n",
|
342 |
+
" KeywordExtractor,\n",
|
343 |
+
")\n",
|
344 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
345 |
+
"from llama_index.core.ingestion import IngestionPipeline\n",
|
346 |
+
"\n",
|
347 |
+
"pipeline = IngestionPipeline(\n",
|
348 |
+
" transformations=[\n",
|
349 |
+
" text_splitter,\n",
|
350 |
+
" QuestionsAnsweredExtractor(questions=3, llm=llm),\n",
|
351 |
+
" SummaryExtractor(summaries=[\"prev\", \"self\"], llm=llm),\n",
|
352 |
+
" KeywordExtractor(keywords=10, llm=llm),\n",
|
353 |
+
" OpenAIEmbedding(),\n",
|
354 |
+
" ],\n",
|
355 |
+
" vector_store=vector_store\n",
|
356 |
+
")\n",
|
357 |
+
"\n",
|
358 |
+
"nodes = pipeline.run(documents=documents, show_progress=True);"
|
359 |
+
]
|
360 |
+
},
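Each extractor in the pipeline writes its output into the node metadata, which is then embedded and stored alongside the chunk text. A quick way to inspect what was attached, assuming the default metadata keys used by these LlamaIndex extractors (the exact key names are an assumption, not shown in this notebook):

node = nodes[0]
print(node.metadata.keys())
# Typically includes 'questions_this_excerpt_can_answer' (QuestionsAnsweredExtractor),
# 'section_summary' / 'prev_section_summary' (SummaryExtractor),
# and 'excerpt_keywords' (KeywordExtractor).
print(node.metadata.get("excerpt_keywords"))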
|
361 |
+
{
|
362 |
+
"cell_type": "code",
|
363 |
+
"execution_count": 13,
|
364 |
+
"metadata": {
|
365 |
+
"colab": {
|
366 |
+
"base_uri": "https://localhost:8080/"
|
367 |
+
},
|
368 |
+
"id": "mPGa85hM2P3P",
|
369 |
+
"outputId": "c106c463-2459-4b11-bbae-5bd5e2246011"
|
370 |
+
},
|
371 |
+
"outputs": [
|
372 |
+
{
|
373 |
+
"data": {
|
374 |
+
"text/plain": [
|
375 |
+
"108"
|
376 |
+
]
|
377 |
+
},
|
378 |
+
"execution_count": 13,
|
379 |
+
"metadata": {},
|
380 |
+
"output_type": "execute_result"
|
381 |
+
}
|
382 |
+
],
|
383 |
+
"source": [
|
384 |
+
"len( nodes )"
|
385 |
+
]
|
386 |
+
},
|
387 |
+
{
|
388 |
+
"cell_type": "code",
|
389 |
+
"execution_count": 14,
|
390 |
+
"metadata": {
|
391 |
+
"id": "23x20bL3_jRb"
|
392 |
+
},
|
393 |
+
"outputs": [
|
394 |
+
{
|
395 |
+
"name": "stdout",
|
396 |
+
"output_type": "stream",
|
397 |
+
"text": [
|
398 |
+
"updating: mini-llama-articles/ (stored 0%)\n",
|
399 |
+
"updating: mini-llama-articles/chroma.sqlite3 (deflated 65%)\n",
|
400 |
+
" adding: mini-llama-articles/aaac4d54-4f82-40da-b769-a6aecfa59eb0/ (stored 0%)\n",
|
401 |
+
" adding: mini-llama-articles/aaac4d54-4f82-40da-b769-a6aecfa59eb0/data_level0.bin (deflated 96%)\n",
|
402 |
+
" adding: mini-llama-articles/aaac4d54-4f82-40da-b769-a6aecfa59eb0/length.bin (deflated 35%)\n",
|
403 |
+
" adding: mini-llama-articles/aaac4d54-4f82-40da-b769-a6aecfa59eb0/link_lists.bin (stored 0%)\n",
|
404 |
+
" adding: mini-llama-articles/aaac4d54-4f82-40da-b769-a6aecfa59eb0/header.bin (deflated 61%)\n"
|
405 |
+
]
|
406 |
+
}
|
407 |
+
],
|
408 |
+
"source": [
|
409 |
+
"!zip -r vectorstore.zip mini-llama-articles"
|
410 |
+
]
|
411 |
+
},
|
412 |
+
{
|
413 |
+
"cell_type": "markdown",
|
414 |
+
"metadata": {
|
415 |
+
"id": "OWaT6rL7ksp8"
|
416 |
+
},
|
417 |
+
"source": [
|
418 |
+
"# Load Indexes"
|
419 |
+
]
|
420 |
+
},
|
421 |
+
{
|
422 |
+
"cell_type": "code",
|
423 |
+
"execution_count": 16,
|
424 |
+
"metadata": {
|
425 |
+
"colab": {
|
426 |
+
"base_uri": "https://localhost:8080/"
|
427 |
+
},
|
428 |
+
"id": "SodY2Xpf_kxg",
|
429 |
+
"outputId": "9f8b7153-ea58-4824-8363-c47e922612a8"
|
430 |
+
},
|
431 |
+
"outputs": [],
|
432 |
+
"source": [
|
433 |
+
"# !unzip vectorstore.zip"
|
434 |
+
]
|
435 |
+
},
|
436 |
+
{
|
437 |
+
"cell_type": "code",
|
438 |
+
"execution_count": 17,
|
439 |
+
"metadata": {
|
440 |
+
"id": "mXi56KTXk2sp"
|
441 |
+
},
|
442 |
+
"outputs": [],
|
443 |
+
"source": [
|
444 |
+
"import chromadb\n",
|
445 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
446 |
+
"\n",
|
447 |
+
"# Create your index\n",
|
448 |
+
"db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
449 |
+
"chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
|
450 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
451 |
+
]
|
452 |
+
},
|
453 |
+
{
|
454 |
+
"cell_type": "code",
|
455 |
+
"execution_count": 18,
|
456 |
+
"metadata": {
|
457 |
+
"id": "jKXURvLtkuTS"
|
458 |
+
},
|
459 |
+
"outputs": [],
|
460 |
+
"source": [
|
461 |
+
"# Create your index\n",
|
462 |
+
"from llama_index.core import VectorStoreIndex\n",
|
463 |
+
"\n",
|
464 |
+
"vector_index = VectorStoreIndex.from_vector_store(vector_store)"
|
465 |
+
]
|
466 |
+
},
|
467 |
+
{
|
468 |
+
"cell_type": "markdown",
|
469 |
+
"metadata": {
|
470 |
+
"id": "SLrn8A3jckmW"
|
471 |
+
},
|
472 |
+
"source": [
|
473 |
+
"# Multi-Step Query Engine"
|
474 |
+
]
|
475 |
+
},
|
476 |
+
{
|
477 |
+
"cell_type": "markdown",
|
478 |
+
"metadata": {
|
479 |
+
"id": "UmpfpVCje8h3"
|
480 |
+
},
|
481 |
+
"source": [
|
482 |
+
"## GPT-4"
|
483 |
+
]
|
484 |
+
},
|
485 |
+
{
|
486 |
+
"cell_type": "code",
|
487 |
+
"execution_count": 19,
|
488 |
+
"metadata": {
|
489 |
+
"id": "CaxFzDz4cRMd"
|
490 |
+
},
|
491 |
+
"outputs": [
|
492 |
+
{
|
493 |
+
"name": "stderr",
|
494 |
+
"output_type": "stream",
|
495 |
+
"text": [
|
496 |
+
"/var/folders/l7/9qcp7g5x5rl9x8ltw0t85qym0000gn/T/ipykernel_1424/226941912.py:4: DeprecationWarning: Call to deprecated class method from_defaults. (ServiceContext is deprecated, please use `llama_index.settings.Settings` instead.) -- Deprecated since version 0.10.0.\n",
|
497 |
+
" service_context_gpt4 = ServiceContext.from_defaults(llm=gpt4)\n"
|
498 |
+
]
|
499 |
+
}
|
500 |
+
],
|
501 |
+
"source": [
|
502 |
+
"from llama_index.core import ServiceContext\n",
|
503 |
+
"\n",
|
504 |
+
"gpt4 = OpenAI(temperature=0, model=\"gpt-4-0125-preview\")\n",
|
505 |
+
"service_context_gpt4 = ServiceContext.from_defaults(llm=gpt4)"
|
506 |
+
]
|
507 |
+
},
|
508 |
+
{
|
509 |
+
"cell_type": "code",
|
510 |
+
"execution_count": 20,
|
511 |
+
"metadata": {
|
512 |
+
"id": "8y-Ya3GyfcAk"
|
513 |
+
},
|
514 |
+
"outputs": [],
|
515 |
+
"source": [
|
516 |
+
"from llama_index.core.indices.query.query_transform.base import StepDecomposeQueryTransform\n",
|
517 |
+
"\n",
|
518 |
+
"step_decompose_transform_gpt4 = StepDecomposeQueryTransform(llm=gpt4, verbose=True)"
|
519 |
+
]
|
520 |
+
},
|
521 |
+
{
|
522 |
+
"cell_type": "code",
|
523 |
+
"execution_count": 21,
|
524 |
+
"metadata": {
|
525 |
+
"id": "zntXdSbGf_qF"
|
526 |
+
},
|
527 |
+
"outputs": [],
|
528 |
+
"source": [
|
529 |
+
"from llama_index.core.query_engine.multistep_query_engine import MultiStepQueryEngine\n",
|
530 |
+
"\n",
|
531 |
+
"query_engine_gpt4 = vector_index.as_query_engine(service_context=service_context_gpt4)\n",
|
532 |
+
"query_engine_gpt4 = MultiStepQueryEngine(\n",
|
533 |
+
" query_engine=query_engine_gpt4,\n",
|
534 |
+
" query_transform=step_decompose_transform_gpt4,\n",
|
535 |
+
" index_summary=\"Used to answer questions about the LLaMA2 Model\",\n",
|
536 |
+
")"
|
537 |
+
]
|
538 |
+
},
|
539 |
+
{
|
540 |
+
"cell_type": "markdown",
|
541 |
+
"metadata": {
|
542 |
+
"id": "8JPD8yAinVSq"
|
543 |
+
},
|
544 |
+
"source": [
|
545 |
+
"# Query Dataset"
|
546 |
+
]
|
547 |
+
},
|
548 |
+
{
|
549 |
+
"cell_type": "markdown",
|
550 |
+
"metadata": {
|
551 |
+
"id": "D2IByQ5-ox9U"
|
552 |
+
},
|
553 |
+
"source": [
|
554 |
+
"## Default"
|
555 |
+
]
|
556 |
+
},
|
557 |
+
{
|
558 |
+
"cell_type": "code",
|
559 |
+
"execution_count": 22,
|
560 |
+
"metadata": {
|
561 |
+
"id": "b0gue7cyctt1"
|
562 |
+
},
|
563 |
+
"outputs": [],
|
564 |
+
"source": [
|
565 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
566 |
+
"# and using a LLM to formulate the final answer.\n",
|
567 |
+
"query_engine = vector_index.as_query_engine()\n",
|
568 |
+
"\n",
|
569 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
570 |
+
]
|
571 |
+
},
|
572 |
+
{
|
573 |
+
"cell_type": "code",
|
574 |
+
"execution_count": 23,
|
575 |
+
"metadata": {
|
576 |
+
"colab": {
|
577 |
+
"base_uri": "https://localhost:8080/",
|
578 |
+
"height": 53
|
579 |
+
},
|
580 |
+
"id": "VKK3jMprctre",
|
581 |
+
"outputId": "b6ed346c-714b-44a6-b8fa-bfaca1b38deb"
|
582 |
+
},
|
583 |
+
"outputs": [
|
584 |
+
{
|
585 |
+
"data": {
|
586 |
+
"text/plain": [
|
587 |
+
"'The Llama 2 model is available in four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
588 |
+
]
|
589 |
+
},
|
590 |
+
"execution_count": 23,
|
591 |
+
"metadata": {},
|
592 |
+
"output_type": "execute_result"
|
593 |
+
}
|
594 |
+
],
|
595 |
+
"source": [
|
596 |
+
"res.response"
|
597 |
+
]
|
598 |
+
},
|
599 |
+
{
|
600 |
+
"cell_type": "code",
|
601 |
+
"execution_count": 24,
|
602 |
+
"metadata": {
|
603 |
+
"colab": {
|
604 |
+
"base_uri": "https://localhost:8080/"
|
605 |
+
},
|
606 |
+
"id": "465dH4yQc7Ct",
|
607 |
+
"outputId": "6f7eb440-cc24-4d20-ac35-fa747265d18d"
|
608 |
+
},
|
609 |
+
"outputs": [
|
610 |
+
{
|
611 |
+
"name": "stdout",
|
612 |
+
"output_type": "stream",
|
613 |
+
"text": [
|
614 |
+
"Node ID\t 63380d3f-7aff-47cd-b2c1-e4baaed70a7e\n",
|
615 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
616 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
617 |
+
"Score\t 0.7167442801500137\n",
|
618 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
619 |
+
"Node ID\t 77d0679b-6de4-4467-bc39-7932e18ae282\n",
|
620 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
621 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
622 |
+
"Score\t 0.6967843740521363\n",
|
623 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
624 |
+
]
|
625 |
+
}
|
626 |
+
],
|
627 |
+
"source": [
|
628 |
+
"for src in res.source_nodes:\n",
|
629 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
630 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
631 |
+
" print(\"Text\\t\", src.text)\n",
|
632 |
+
" print(\"Score\\t\", src.score)\n",
|
633 |
+
" print(\"-_\"*20)"
|
634 |
+
]
|
635 |
+
},
|
636 |
+
{
|
637 |
+
"cell_type": "markdown",
|
638 |
+
"metadata": {
|
639 |
+
"id": "2y2AiInmpz7g"
|
640 |
+
},
|
641 |
+
"source": [
|
642 |
+
"## GPT-4 Multi-Step"
|
643 |
+
]
|
644 |
+
},
|
645 |
+
{
|
646 |
+
"cell_type": "code",
|
647 |
+
"execution_count": 25,
|
648 |
+
"metadata": {
|
649 |
+
"colab": {
|
650 |
+
"base_uri": "https://localhost:8080/"
|
651 |
+
},
|
652 |
+
"id": "69kADAFilW1n",
|
653 |
+
"outputId": "8a847a58-539f-4ba7-ca07-ef80ceb8b3e2"
|
654 |
+
},
|
655 |
+
"outputs": [
|
656 |
+
{
|
657 |
+
"name": "stdout",
|
658 |
+
"output_type": "stream",
|
659 |
+
"text": [
|
660 |
+
"\u001b[1;3;33m> Current query: How many parameters LLaMA2 model has?\n",
|
661 |
+
"\u001b[0m\u001b[1;3;38;5;200m> New query: What is the LLaMA2 Model?\n",
|
662 |
+
"\u001b[0m\u001b[1;3;33m> Current query: How many parameters LLaMA2 model has?\n",
|
663 |
+
"\u001b[0m\u001b[1;3;38;5;200m> New query: None\n",
|
664 |
+
"\u001b[0m"
|
665 |
+
]
|
666 |
+
}
|
667 |
+
],
|
668 |
+
"source": [
|
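+
"# The multi-step engine decomposes the question into simpler sub-queries\n",
|
+
"# (verbose trace above) before composing a final answer from their results.\n",
|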
669 |
+
"response_gpt4 = query_engine_gpt4.query(\"How many parameters LLaMA2 model has?\")"
|
670 |
+
]
|
671 |
+
},
|
672 |
+
{
|
673 |
+
"cell_type": "code",
|
674 |
+
"execution_count": 26,
|
675 |
+
"metadata": {
|
676 |
+
"colab": {
|
677 |
+
"base_uri": "https://localhost:8080/",
|
678 |
+
"height": 35
|
679 |
+
},
|
680 |
+
"id": "_ul5p3AMldzk",
|
681 |
+
"outputId": "8c5cadda-8e06-4398-81bc-8571d4710b2a"
|
682 |
+
},
|
683 |
+
"outputs": [
|
684 |
+
{
|
685 |
+
"data": {
|
686 |
+
"text/plain": [
|
687 |
+
"'The LLaMA2 model has parameters ranging from 7 billion to 70 billion.'"
|
688 |
+
]
|
689 |
+
},
|
690 |
+
"execution_count": 26,
|
691 |
+
"metadata": {},
|
692 |
+
"output_type": "execute_result"
|
693 |
+
}
|
694 |
+
],
|
695 |
+
"source": [
|
696 |
+
"response_gpt4.response"
|
697 |
+
]
|
698 |
+
},
|
699 |
+
{
|
700 |
+
"cell_type": "code",
|
701 |
+
"execution_count": 27,
|
702 |
+
"metadata": {
|
703 |
+
"colab": {
|
704 |
+
"base_uri": "https://localhost:8080/"
|
705 |
+
},
|
706 |
+
"id": "k5pJPBPRqjbG",
|
707 |
+
"outputId": "0bdd8382-8392-483d-bb6a-51e7a146eeb3"
|
708 |
+
},
|
709 |
+
"outputs": [
|
710 |
+
{
|
711 |
+
"name": "stdout",
|
712 |
+
"output_type": "stream",
|
713 |
+
"text": [
|
714 |
+
"Node ID\t 3f7709ed-985e-417f-b88d-e3eac6ae8a06\n",
|
715 |
+
"Text\t \n",
|
716 |
+
"Question: What is the LLaMA2 Model?\n",
|
717 |
+
"Answer: The Llama 2 model is an open-source commercial language model developed by Meta, available in different sizes ranging from 7 billion to 70 billion parameters. It is designed to be integrated into AI-powered applications for businesses, with a focus on safety considerations in its design. The model's Ghost Attention feature enhances conversational continuity, and it possesses a groundbreaking temporal capability for organizing information based on time relevance to deliver contextually accurate responses.\n",
|
718 |
+
"Score\t None\n",
|
719 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
720 |
+
"Node ID\t 63380d3f-7aff-47cd-b2c1-e4baaed70a7e\n",
|
721 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
722 |
+
"Score\t 0.7149311149257048\n",
|
723 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
724 |
+
"Node ID\t 8a4bda7f-9e2a-44ff-a59b-84f36b8b3431\n",
|
725 |
+
"Text\t with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering strong competition to closed-source models. V. Ghost Attention: Enhancing Conversational Continuity One unique feature in Llama 2 is Ghost Attention, which ensures continuity in conversations. This means that even after multiple interactions, the model remembers its initial instructions, ensuring more coherent and consistent responses throughout the conversation. This feature significantly enhances the user experience and makes Llama 2 a more reliable language model for interactive applications. In the example below, on the left, it forgets to use an emoji after a few conversations. On the right, with Ghost Attention, even after having many conversations, it will remember the context and continue to use emojis in its response. VI. Temporal Capability: A Leap in Information Organization Meta reported a groundbreaking temporal capability, where the model organizes information based on time relevance. Each question posed to the model is associated with a date, and it responds accordingly by considering the event date before which the question becomes irrelevant. For example, if you ask the question, \"How long ago did Barack Obama become president?\", its only relevant after 2008. This temporal awareness allows Llama 2 to deliver more contextually accurate responses, enriching the user experience further. VII. Open Questions and Future Outlook Meta's open-sourcing of Llama 2 represents a seismic shift, now offering developers and researchers commercial access to a leading language model. With Llama 2 outperforming MosaicML's current MPT models, all eyes are on how Databricks will respond. Can MosaicML's next MPT iteration beat Llama 2? Is it worthwhile to compete\n",
|
726 |
+
"Score\t 0.714000324321046\n",
|
727 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
728 |
+
]
|
729 |
+
}
|
730 |
+
],
|
731 |
+
"source": [
|
732 |
+
"for src in response_gpt4.source_nodes:\n",
|
733 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
734 |
+
" print(\"Text\\t\", src.text)\n",
|
735 |
+
" print(\"Score\\t\", src.score)\n",
|
736 |
+
" print(\"-_\"*20)"
|
737 |
+
]
|
738 |
+
},
|
739 |
+
{
|
740 |
+
"cell_type": "markdown",
|
741 |
+
"metadata": {
|
742 |
+
"id": "jwcSCiMhp4Uh"
|
743 |
+
},
|
744 |
+
"source": [
|
745 |
+
"# Test GPT-3 Multi-Step"
|
746 |
+
]
|
747 |
+
},
|
748 |
+
{
|
749 |
+
"cell_type": "code",
|
750 |
+
"execution_count": 28,
|
751 |
+
"metadata": {
|
752 |
+
"id": "uH9gNfZuslHK"
|
753 |
+
},
|
754 |
+
"outputs": [
|
755 |
+
{
|
756 |
+
"name": "stderr",
|
757 |
+
"output_type": "stream",
|
758 |
+
"text": [
|
759 |
+
"/var/folders/l7/9qcp7g5x5rl9x8ltw0t85qym0000gn/T/ipykernel_1424/1136257440.py:6: DeprecationWarning: Call to deprecated class method from_defaults. (ServiceContext is deprecated, please use `llama_index.settings.Settings` instead.) -- Deprecated since version 0.10.0.\n",
|
760 |
+
" service_context_gpt3 = ServiceContext.from_defaults(llm=gpt3)\n"
|
761 |
+
]
|
762 |
+
}
|
763 |
+
],
|
764 |
+
"source": [
|
765 |
+
"from llama_index.core import ServiceContext\n",
|
766 |
+
"from llama_index.core.indices.query.query_transform.base import StepDecomposeQueryTransform\n",
|
767 |
+
"from llama_index.core.query_engine.multistep_query_engine import MultiStepQueryEngine\n",
|
768 |
+
"\n",
|
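+
"# Build a GPT-3.5 LLM and wrap it in a ServiceContext (deprecated since\n",
|
+
"# llama-index 0.10 in favor of Settings, as this cell's warning notes).\n",
|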
769 |
+
"gpt3 = OpenAI(temperature=0, model=\"gpt-3.5-turbo-0125\")\n",
|
770 |
+
"service_context_gpt3 = ServiceContext.from_defaults(llm=gpt3)\n",
|
771 |
+
"\n",
|
772 |
+
"step_decompose_transform_gpt3 = StepDecomposeQueryTransform(llm=gpt3, verbose=True)\n",
|
773 |
+
"\n",
|
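+
"# Wrap the basic query engine so every incoming question is first decomposed\n",
|
+
"# into sequential sub-queries by StepDecomposeQueryTransform.\n",
|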
774 |
+
"query_engine_gpt3 = vector_index.as_query_engine(service_context=service_context_gpt3)\n",
|
775 |
+
"query_engine_gpt3 = MultiStepQueryEngine(\n",
|
776 |
+
" query_engine=query_engine_gpt3,\n",
|
777 |
+
" query_transform=step_decompose_transform_gpt3,\n",
|
778 |
+
" index_summary=\"Used to answer questions about the LLaMA2 Model\",\n",
|
779 |
+
")"
|
780 |
+
]
|
781 |
+
},
|
782 |
+
{
|
783 |
+
"cell_type": "code",
|
784 |
+
"execution_count": 29,
|
785 |
+
"metadata": {
|
786 |
+
"colab": {
|
787 |
+
"base_uri": "https://localhost:8080/"
|
788 |
+
},
|
789 |
+
"id": "9s6SkHI0p6VZ",
|
790 |
+
"outputId": "1c87dbda-e026-4e28-f7eb-b01145c62b77"
|
791 |
+
},
|
792 |
+
"outputs": [
|
793 |
+
{
|
794 |
+
"name": "stdout",
|
795 |
+
"output_type": "stream",
|
796 |
+
"text": [
|
797 |
+
"\u001b[1;3;33m> Current query: How many parameters LLaMA2 model has?\n",
|
798 |
+
"\u001b[0m\u001b[1;3;38;5;200m> New query: What are the main components or features of the LLaMA2 model?\n",
|
799 |
+
"\u001b[0m\u001b[1;3;33m> Current query: How many parameters LLaMA2 model has?\n",
|
800 |
+
"\u001b[0m\u001b[1;3;38;5;200m> New query: What is the range of model sizes available for the LLaMA2 model?\n",
|
801 |
+
"\u001b[0m\u001b[1;3;33m> Current query: How many parameters LLaMA2 model has?\n",
|
802 |
+
"\u001b[0m\u001b[1;3;38;5;200m> New query: What are the safety considerations in the LLaMA2 model?\n",
|
803 |
+
"\u001b[0m"
|
804 |
+
]
|
805 |
+
}
|
806 |
+
],
|
807 |
+
"source": [
|
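+
"# GPT-3.5 decomposes the same question into three sub-queries here, versus\n",
|
+
"# the single reformulation GPT-4 produced above.\n",
|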
808 |
+
"response_gpt3 = query_engine_gpt3.query(\"How many parameters LLaMA2 model has?\")"
|
809 |
+
]
|
810 |
+
},
|
811 |
+
{
|
812 |
+
"cell_type": "code",
|
813 |
+
"execution_count": 30,
|
814 |
+
"metadata": {
|
815 |
+
"colab": {
|
816 |
+
"base_uri": "https://localhost:8080/",
|
817 |
+
"height": 35
|
818 |
+
},
|
819 |
+
"id": "FlgMkAhQsTIY",
|
820 |
+
"outputId": "0996e879-3914-44b1-cdec-e4f0b0ba7a4e"
|
821 |
+
},
|
822 |
+
"outputs": [
|
823 |
+
{
|
824 |
+
"data": {
|
825 |
+
"text/plain": [
|
826 |
+
"'The LLaMA2 model has model sizes ranging from 7 billion to 70 billion parameters.'"
|
827 |
+
]
|
828 |
+
},
|
829 |
+
"execution_count": 30,
|
830 |
+
"metadata": {},
|
831 |
+
"output_type": "execute_result"
|
832 |
+
}
|
833 |
+
],
|
834 |
+
"source": [
|
835 |
+
"response_gpt3.response"
|
836 |
+
]
|
837 |
+
},
|
838 |
+
{
|
839 |
+
"cell_type": "markdown",
|
840 |
+
"metadata": {
|
841 |
+
"id": "DxOF2qth1gUC"
|
842 |
+
},
|
843 |
+
"source": [
|
844 |
+
"# Test Retriever on Multistep"
|
845 |
+
]
|
846 |
+
},
|
847 |
+
{
|
848 |
+
"cell_type": "code",
|
849 |
+
"execution_count": 31,
|
850 |
+
"metadata": {
|
851 |
+
"id": "In9BZbU10KAz"
|
852 |
+
},
|
853 |
+
"outputs": [],
|
854 |
+
"source": [
|
855 |
+
"import llama_index"
|
856 |
+
]
|
857 |
+
},
|
858 |
+
{
|
859 |
+
"cell_type": "code",
|
860 |
+
"execution_count": 32,
|
861 |
+
"metadata": {
|
862 |
+
"id": "_-fBK2g2zkKb"
|
863 |
+
},
|
864 |
+
"outputs": [],
|
865 |
+
"source": [
|
866 |
+
"from llama_index.core.indices.query.schema import QueryBundle"
|
867 |
+
]
|
868 |
+
},
|
869 |
+
{
|
870 |
+
"cell_type": "code",
|
871 |
+
"execution_count": 33,
|
872 |
+
"metadata": {
|
873 |
+
"id": "wqT7mlhx1KGB"
|
874 |
+
},
|
875 |
+
"outputs": [],
|
876 |
+
"source": [
|
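+
"# Wrap the raw query string in a QueryBundle, the input type that retrieve() expects.\n",
|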
877 |
+
"t = QueryBundle(\"How many parameters LLaMA2 model has?\")"
|
878 |
+
]
|
879 |
+
},
|
880 |
+
{
|
881 |
+
"cell_type": "code",
|
882 |
+
"execution_count": 34,
|
883 |
+
"metadata": {
|
884 |
+
"colab": {
|
885 |
+
"base_uri": "https://localhost:8080/",
|
886 |
+
"height": 304
|
887 |
+
},
|
888 |
+
"id": "OHpa3MqXyyvd",
|
889 |
+
"outputId": "d9b39a47-751d-48a1-ce68-ebf0a50b938d"
|
890 |
+
},
|
891 |
+
"outputs": [
|
892 |
+
{
|
893 |
+
"ename": "NotImplementedError",
|
894 |
+
"evalue": "This query engine does not support retrieve, use query directly",
|
895 |
+
"output_type": "error",
|
896 |
+
"traceback": [
|
897 |
+
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
898 |
+
"\u001b[0;31mNotImplementedError\u001b[0m Traceback (most recent call last)",
|
899 |
+
"Cell \u001b[0;32mIn[34], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mquery_engine_gpt3\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mretrieve\u001b[49m\u001b[43m(\u001b[49m\u001b[43mt\u001b[49m\u001b[43m)\u001b[49m\n",
|
900 |
+
"File \u001b[0;32m~/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py:49\u001b[0m, in \u001b[0;36mBaseQueryEngine.retrieve\u001b[0;34m(self, query_bundle)\u001b[0m\n\u001b[1;32m 48\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mretrieve\u001b[39m(\u001b[38;5;28mself\u001b[39m, query_bundle: QueryBundle) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m List[NodeWithScore]:\n\u001b[0;32m---> 49\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mNotImplementedError\u001b[39;00m(\n\u001b[1;32m 50\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mThis query engine does not support retrieve, use query directly\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 51\u001b[0m )\n",
|
901 |
+
"\u001b[0;31mNotImplementedError\u001b[0m: This query engine does not support retrieve, use query directly"
|
902 |
+
]
|
903 |
+
}
|
904 |
+
],
|
905 |
+
"source": [
|
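+
"# MultiStepQueryEngine does not implement retrieve(); the call below raises\n",
|
+
"# NotImplementedError, as the traceback shows.\n",
|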
906 |
+
"query_engine_gpt3.retrieve(t)"
|
907 |
+
]
|
908 |
+
},
|
909 |
+
{
|
910 |
+
"cell_type": "markdown",
|
911 |
+
"metadata": {
|
912 |
+
"id": "FCdPwVAQ6ixg"
|
913 |
+
},
|
914 |
+
"source": [
|
915 |
+
"# HyDE Transform"
|
916 |
+
]
|
917 |
+
},
|
918 |
+
{
|
919 |
+
"cell_type": "code",
|
920 |
+
"execution_count": 35,
|
921 |
+
"metadata": {
|
922 |
+
"id": "1x6He0T961Kg"
|
923 |
+
},
|
924 |
+
"outputs": [],
|
925 |
+
"source": [
|
926 |
+
"query_engine = vector_index.as_query_engine()"
|
927 |
+
]
|
928 |
+
},
|
929 |
+
{
|
930 |
+
"cell_type": "code",
|
931 |
+
"execution_count": 36,
|
932 |
+
"metadata": {
|
933 |
+
"id": "0GgtfeBC6m0H"
|
934 |
+
},
|
935 |
+
"outputs": [],
|
936 |
+
"source": [
|
937 |
+
"from llama_index.core.indices.query.query_transform import HyDEQueryTransform\n",
|
938 |
+
"from llama_index.core.query_engine.transform_query_engine import TransformQueryEngine\n",
|
939 |
+
"\n",
|
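+
"# HyDE has the LLM write a hypothetical answer document and embeds it for\n",
|
+
"# retrieval; include_original=True keeps the original query string as well.\n",
|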
940 |
+
"hyde = HyDEQueryTransform(include_original=True)\n",
|
941 |
+
"hyde_query_engine = TransformQueryEngine(query_engine, hyde)"
|
942 |
+
]
|
943 |
+
},
|
944 |
+
{
|
945 |
+
"cell_type": "code",
|
946 |
+
"execution_count": 37,
|
947 |
+
"metadata": {
|
948 |
+
"id": "mm3nYnIE6mwl"
|
949 |
+
},
|
950 |
+
"outputs": [],
|
951 |
+
"source": [
|
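+
"# Retrieval now matches against the embedding of the hypothetical document\n",
|
+
"# rather than the raw question alone.\n",
|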
952 |
+
"response = hyde_query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
953 |
+
]
|
954 |
+
},
|
955 |
+
{
|
956 |
+
"cell_type": "code",
|
957 |
+
"execution_count": 38,
|
958 |
+
"metadata": {
|
959 |
+
"colab": {
|
960 |
+
"base_uri": "https://localhost:8080/",
|
961 |
+
"height": 53
|
962 |
+
},
|
963 |
+
"id": "PjTJ2poc6mt5",
|
964 |
+
"outputId": "32fc89c2-474d-4791-e4b0-2a1de262b571"
|
965 |
+
},
|
966 |
+
"outputs": [
|
967 |
+
{
|
968 |
+
"data": {
|
969 |
+
"text/plain": [
|
970 |
+
"'The LLaMA 2 model has four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
971 |
+
]
|
972 |
+
},
|
973 |
+
"execution_count": 38,
|
974 |
+
"metadata": {},
|
975 |
+
"output_type": "execute_result"
|
976 |
+
}
|
977 |
+
],
|
978 |
+
"source": [
|
979 |
+
"response.response"
|
980 |
+
]
|
981 |
+
},
|
982 |
+
{
|
983 |
+
"cell_type": "code",
|
984 |
+
"execution_count": 39,
|
985 |
+
"metadata": {
|
986 |
+
"colab": {
|
987 |
+
"base_uri": "https://localhost:8080/"
|
988 |
+
},
|
989 |
+
"id": "StgikqWZ6mrl",
|
990 |
+
"outputId": "f0552af4-524e-444b-b8cb-67a665fad474"
|
991 |
+
},
|
992 |
+
"outputs": [
|
993 |
+
{
|
994 |
+
"name": "stdout",
|
995 |
+
"output_type": "stream",
|
996 |
+
"text": [
|
997 |
+
"Node ID\t 63380d3f-7aff-47cd-b2c1-e4baaed70a7e\n",
|
998 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
999 |
+
"Score\t 0.7504822493620628\n",
|
1000 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
1001 |
+
"Node ID\t 5f575ba5-e68d-4c50-90bf-1125c4bd51f8\n",
|
1002 |
+
"Text\t Models Meta AI, formerly known as Facebook Artificial Intelligence Research (FAIR), is an artificial intelligence laboratory that aims to share open-source frameworks, tools, libraries, and models for research exploration and large-scale production deployment. In 2018, they released the open-source PyText, a modeling framework focused on NLP systems. Then, in August 2022, they announced the release of BlenderBot 3, a chatbot designed to improve conversational skills and safety. In November 2022, Meta developed a large language model called Galactica, which assists scientists with tasks such as summarizing academic papers and annotating molecules and proteins. Released in February 2023, LLaMA (Large Language Model Meta AI) is a transformer-based foundational large language model by Meta that ventures into both the AI and academic spaces. The model aims to help researchers, scientists, and engineers advance their work in exploring AI applications. It will be released under a non-commercial license to prevent misuse, and access will be granted to academic researchers, individuals, and organizations affiliated with the government, civil society, academia, and industry research facilities on a selective case-by-case basis. The sharing of codes and weights allows other researchers to test new approaches in LLMs. The LLaMA models have a range of 7 billion to 65 billion parameters. LLaMA-65B can be compared to DeepMind's Chinchilla and Google's PaLM. Publicly available unlabeled data was used to train these models, and training smaller foundational models require less computing power and resources. LLaMA 65B and 33B have been trained on 1.4 trillion tokens in 20 different languages, and according to the Facebook Artificial Intelligence Research (FAIR) team, the model's performance varies across languages. The data sources used for training included CCNet (67%), GitHub, Wikipedia, ArXiv, Stack Exchange, and books. LLaMA, like other large scale language models, has issues related to biased & toxic generation and hallucination. 6. Eleuther's GPT-Neo Models Founded in July 2020 by Connor Leahy, Sid Black, and Leo Gao, EleutherAI is a non-profit AI research lab\n",
|
1003 |
+
"Score\t 0.7375396701691563\n",
|
1004 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
1005 |
+
]
|
1006 |
+
}
|
1007 |
+
],
|
1008 |
+
"source": [
|
1009 |
+
"for src in response.source_nodes:\n",
|
1010 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
1011 |
+
" print(\"Text\\t\", src.text)\n",
|
1012 |
+
" print(\"Score\\t\", src.score)\n",
|
1013 |
+
" print(\"-_\"*20)"
|
1014 |
+
]
|
1015 |
+
},
|
1016 |
+
{
|
1017 |
+
"cell_type": "code",
|
1018 |
+
"execution_count": 40,
|
1019 |
+
"metadata": {
|
1020 |
+
"id": "17Jbo1FH6mjH"
|
1021 |
+
},
|
1022 |
+
"outputs": [],
|
1023 |
+
"source": [
|
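+
"# Calling the transform directly returns a QueryBundle that carries the\n",
|
+
"# generated hypothetical document alongside the original query.\n",
|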
1024 |
+
"query_bundle = hyde(\"How many parameters LLaMA2 model has?\")"
|
1025 |
+
]
|
1026 |
+
},
|
1027 |
+
{
|
1028 |
+
"cell_type": "code",
|
1029 |
+
"execution_count": 41,
|
1030 |
+
"metadata": {
|
1031 |
+
"id": "UZEK63K77W7X"
|
1032 |
+
},
|
1033 |
+
"outputs": [],
|
1034 |
+
"source": [
|
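+
"# The first embedding string is the LLM-generated hypothetical document.\n",
|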
1035 |
+
"hyde_doc = query_bundle.embedding_strs[0]"
|
1036 |
+
]
|
1037 |
+
},
|
1038 |
+
{
|
1039 |
+
"cell_type": "code",
|
1040 |
+
"execution_count": 42,
|
1041 |
+
"metadata": {
|
1042 |
+
"colab": {
|
1043 |
+
"base_uri": "https://localhost:8080/",
|
1044 |
+
"height": 214
|
1045 |
+
},
|
1046 |
+
"id": "wyzwkpSn7Yi1",
|
1047 |
+
"outputId": "9b03f8dc-a26e-45e4-eec1-22366bd68dd2"
|
1048 |
+
},
|
1049 |
+
"outputs": [
|
1050 |
+
{
|
1051 |
+
"data": {
|
1052 |
+
"text/plain": [
|
1053 |
+
"'The LLaMA2 model has a total of 12 parameters. These parameters include the weights and biases of the neural network layers, as well as the hyperparameters such as learning rate, batch size, and number of epochs. Additionally, the model may also include regularization parameters such as L1 or L2 regularization coefficients. Overall, these parameters are crucial in determining the performance and behavior of the LLaMA2 model in various machine learning tasks.\"'"
|
1054 |
+
]
|
1055 |
+
},
|
1056 |
+
"execution_count": 42,
|
1057 |
+
"metadata": {},
|
1058 |
+
"output_type": "execute_result"
|
1059 |
+
}
|
1060 |
+
],
|
1061 |
+
"source": [
|
1062 |
+
"hyde_doc"
|
1063 |
+
]
|
1064 |
+
},
|
1065 |
+
{
|
1066 |
+
"cell_type": "code",
|
1067 |
+
"execution_count": null,
|
1068 |
+
"metadata": {},
|
1069 |
+
"outputs": [],
|
1070 |
+
"source": []
|
1071 |
+
}
|
1072 |
+
],
|
1073 |
+
"metadata": {
|
1074 |
+
"colab": {
|
1075 |
+
"authorship_tag": "ABX9TyMcBonOXFUEEHJsKREchiOp",
|
1076 |
+
"include_colab_link": true,
|
1077 |
+
"provenance": []
|
1078 |
+
},
|
1079 |
+
"kernelspec": {
|
1080 |
+
"display_name": "Python 3",
|
1081 |
+
"name": "python3"
|
1082 |
+
},
|
1083 |
+
"language_info": {
|
1084 |
+
"codemirror_mode": {
|
1085 |
+
"name": "ipython",
|
1086 |
+
"version": 3
|
1087 |
+
},
|
1088 |
+
"file_extension": ".py",
|
1089 |
+
"mimetype": "text/x-python",
|
1090 |
+
"name": "python",
|
1091 |
+
"nbconvert_exporter": "python",
|
1092 |
+
"pygments_lexer": "ipython3",
|
1093 |
+
"version": "3.11.8"
|
1094 |
+
},
|
1095 |
+
"widgets": {
|
1096 |
+
"application/vnd.jupyter.widget-state+json": {
|
1097 |
+
"0245f2604e4d49c8bd0210302746c47b": {
|
1098 |
+
"model_module": "@jupyter-widgets/base",
|
1099 |
+
"model_module_version": "1.2.0",
|
1100 |
+
"model_name": "LayoutModel",
|
1101 |
+
"state": {
|
1102 |
+
"_model_module": "@jupyter-widgets/base",
|
1103 |
+
"_model_module_version": "1.2.0",
|
1104 |
+
"_model_name": "LayoutModel",
|
1105 |
+
"_view_count": null,
|
1106 |
+
"_view_module": "@jupyter-widgets/base",
|
1107 |
+
"_view_module_version": "1.2.0",
|
1108 |
+
"_view_name": "LayoutView",
|
1109 |
+
"align_content": null,
|
1110 |
+
"align_items": null,
|
1111 |
+
"align_self": null,
|
1112 |
+
"border": null,
|
1113 |
+
"bottom": null,
|
1114 |
+
"display": null,
|
1115 |
+
"flex": null,
|
1116 |
+
"flex_flow": null,
|
1117 |
+
"grid_area": null,
|
1118 |
+
"grid_auto_columns": null,
|
1119 |
+
"grid_auto_flow": null,
|
1120 |
+
"grid_auto_rows": null,
|
1121 |
+
"grid_column": null,
|
1122 |
+
"grid_gap": null,
|
1123 |
+
"grid_row": null,
|
1124 |
+
"grid_template_areas": null,
|
1125 |
+
"grid_template_columns": null,
|
1126 |
+
"grid_template_rows": null,
|
1127 |
+
"height": null,
|
1128 |
+
"justify_content": null,
|
1129 |
+
"justify_items": null,
|
1130 |
+
"left": null,
|
1131 |
+
"margin": null,
|
1132 |
+
"max_height": null,
|
1133 |
+
"max_width": null,
|
1134 |
+
"min_height": null,
|
1135 |
+
"min_width": null,
|
1136 |
+
"object_fit": null,
|
1137 |
+
"object_position": null,
|
1138 |
+
"order": null,
|
1139 |
+
"overflow": null,
|
1140 |
+
"overflow_x": null,
|
1141 |
+
"overflow_y": null,
|
1142 |
+
"padding": null,
|
1143 |
+
"right": null,
|
1144 |
+
"top": null,
|
1145 |
+
"visibility": null,
|
1146 |
+
"width": null
|
1147 |
+
}
|
1148 |
+
},
|
1149 |
+
"134210510d49476e959dd7d032bbdbdc": {
|
1150 |
+
"model_module": "@jupyter-widgets/controls",
|
1151 |
+
"model_module_version": "1.5.0",
|
1152 |
+
"model_name": "DescriptionStyleModel",
|
1153 |
+
"state": {
|
1154 |
+
"_model_module": "@jupyter-widgets/controls",
|
1155 |
+
"_model_module_version": "1.5.0",
|
1156 |
+
"_model_name": "DescriptionStyleModel",
|
1157 |
+
"_view_count": null,
|
1158 |
+
"_view_module": "@jupyter-widgets/base",
|
1159 |
+
"_view_module_version": "1.2.0",
|
1160 |
+
"_view_name": "StyleView",
|
1161 |
+
"description_width": ""
|
1162 |
+
}
|
1163 |
+
},
|
1164 |
+
"13b9c5395bca4c3ba21265240cb936cf": {
|
1165 |
+
"model_module": "@jupyter-widgets/base",
|
1166 |
+
"model_module_version": "1.2.0",
|
1167 |
+
"model_name": "LayoutModel",
|
1168 |
+
"state": {
|
1169 |
+
"_model_module": "@jupyter-widgets/base",
|
1170 |
+
"_model_module_version": "1.2.0",
|
1171 |
+
"_model_name": "LayoutModel",
|
1172 |
+
"_view_count": null,
|
1173 |
+
"_view_module": "@jupyter-widgets/base",
|
1174 |
+
"_view_module_version": "1.2.0",
|
1175 |
+
"_view_name": "LayoutView",
|
1176 |
+
"align_content": null,
|
1177 |
+
"align_items": null,
|
1178 |
+
"align_self": null,
|
1179 |
+
"border": null,
|
1180 |
+
"bottom": null,
|
1181 |
+
"display": null,
|
1182 |
+
"flex": null,
|
1183 |
+
"flex_flow": null,
|
1184 |
+
"grid_area": null,
|
1185 |
+
"grid_auto_columns": null,
|
1186 |
+
"grid_auto_flow": null,
|
1187 |
+
"grid_auto_rows": null,
|
1188 |
+
"grid_column": null,
|
1189 |
+
"grid_gap": null,
|
1190 |
+
"grid_row": null,
|
1191 |
+
"grid_template_areas": null,
|
1192 |
+
"grid_template_columns": null,
|
1193 |
+
"grid_template_rows": null,
|
1194 |
+
"height": null,
|
1195 |
+
"justify_content": null,
|
1196 |
+
"justify_items": null,
|
1197 |
+
"left": null,
|
1198 |
+
"margin": null,
|
1199 |
+
"max_height": null,
|
1200 |
+
"max_width": null,
|
1201 |
+
"min_height": null,
|
1202 |
+
"min_width": null,
|
1203 |
+
"object_fit": null,
|
1204 |
+
"object_position": null,
|
1205 |
+
"order": null,
|
1206 |
+
"overflow": null,
|
1207 |
+
"overflow_x": null,
|
1208 |
+
"overflow_y": null,
|
1209 |
+
"padding": null,
|
1210 |
+
"right": null,
|
1211 |
+
"top": null,
|
1212 |
+
"visibility": null,
|
1213 |
+
"width": null
|
1214 |
+
}
|
1215 |
+
},
|
1216 |
+
"193aef33d9184055bb9223f56d456de6": {
|
1217 |
+
"model_module": "@jupyter-widgets/base",
|
1218 |
+
"model_module_version": "1.2.0",
|
1219 |
+
"model_name": "LayoutModel",
|
1220 |
+
"state": {
|
1221 |
+
"_model_module": "@jupyter-widgets/base",
|
1222 |
+
"_model_module_version": "1.2.0",
|
1223 |
+
"_model_name": "LayoutModel",
|
1224 |
+
"_view_count": null,
|
1225 |
+
"_view_module": "@jupyter-widgets/base",
|
1226 |
+
"_view_module_version": "1.2.0",
|
1227 |
+
"_view_name": "LayoutView",
|
1228 |
+
"align_content": null,
|
1229 |
+
"align_items": null,
|
1230 |
+
"align_self": null,
|
1231 |
+
"border": null,
|
1232 |
+
"bottom": null,
|
1233 |
+
"display": null,
|
1234 |
+
"flex": null,
|
1235 |
+
"flex_flow": null,
|
1236 |
+
"grid_area": null,
|
1237 |
+
"grid_auto_columns": null,
|
1238 |
+
"grid_auto_flow": null,
|
1239 |
+
"grid_auto_rows": null,
|
1240 |
+
"grid_column": null,
|
1241 |
+
"grid_gap": null,
|
1242 |
+
"grid_row": null,
|
1243 |
+
"grid_template_areas": null,
|
1244 |
+
"grid_template_columns": null,
|
1245 |
+
"grid_template_rows": null,
|
1246 |
+
"height": null,
|
1247 |
+
"justify_content": null,
|
1248 |
+
"justify_items": null,
|
1249 |
+
"left": null,
|
1250 |
+
"margin": null,
|
1251 |
+
"max_height": null,
|
1252 |
+
"max_width": null,
|
1253 |
+
"min_height": null,
|
1254 |
+
"min_width": null,
|
1255 |
+
"object_fit": null,
|
1256 |
+
"object_position": null,
|
1257 |
+
"order": null,
|
1258 |
+
"overflow": null,
|
1259 |
+
"overflow_x": null,
|
1260 |
+
"overflow_y": null,
|
1261 |
+
"padding": null,
|
1262 |
+
"right": null,
|
1263 |
+
"top": null,
|
1264 |
+
"visibility": null,
|
1265 |
+
"width": null
|
1266 |
+
}
|
1267 |
+
},
|
1268 |
+
"3fbabd8a8660461ba5e7bc08ef39139a": {
|
1269 |
+
"model_module": "@jupyter-widgets/controls",
|
1270 |
+
"model_module_version": "1.5.0",
|
1271 |
+
"model_name": "HBoxModel",
|
1272 |
+
"state": {
|
1273 |
+
"_dom_classes": [],
|
1274 |
+
"_model_module": "@jupyter-widgets/controls",
|
1275 |
+
"_model_module_version": "1.5.0",
|
1276 |
+
"_model_name": "HBoxModel",
|
1277 |
+
"_view_count": null,
|
1278 |
+
"_view_module": "@jupyter-widgets/controls",
|
1279 |
+
"_view_module_version": "1.5.0",
|
1280 |
+
"_view_name": "HBoxView",
|
1281 |
+
"box_style": "",
|
1282 |
+
"children": [
|
1283 |
+
"IPY_MODEL_df2365556ae242a2ab1a119f9a31a561",
|
1284 |
+
"IPY_MODEL_5f4b9d32df8f446e858e4c289dc282f9",
|
1285 |
+
"IPY_MODEL_5b588f83a15d42d9aca888e06bbd95ff"
|
1286 |
+
],
|
1287 |
+
"layout": "IPY_MODEL_ad073bca655540809e39f26538d2ec0d"
|
1288 |
+
}
|
1289 |
+
},
|
1290 |
+
"47a4586384274577a726c57605e7f8d9": {
|
1291 |
+
"model_module": "@jupyter-widgets/controls",
|
1292 |
+
"model_module_version": "1.5.0",
|
1293 |
+
"model_name": "DescriptionStyleModel",
|
1294 |
+
"state": {
|
1295 |
+
"_model_module": "@jupyter-widgets/controls",
|
1296 |
+
"_model_module_version": "1.5.0",
|
1297 |
+
"_model_name": "DescriptionStyleModel",
|
1298 |
+
"_view_count": null,
|
1299 |
+
"_view_module": "@jupyter-widgets/base",
|
1300 |
+
"_view_module_version": "1.2.0",
|
1301 |
+
"_view_name": "StyleView",
|
1302 |
+
"description_width": ""
|
1303 |
+
}
|
1304 |
+
},
|
1305 |
+
"4a172e8c6aa44e41a42fc1d9cf714fd0": {
|
1306 |
+
"model_module": "@jupyter-widgets/controls",
|
1307 |
+
"model_module_version": "1.5.0",
|
1308 |
+
"model_name": "HTMLModel",
|
1309 |
+
"state": {
|
1310 |
+
"_dom_classes": [],
|
1311 |
+
"_model_module": "@jupyter-widgets/controls",
|
1312 |
+
"_model_module_version": "1.5.0",
|
1313 |
+
"_model_name": "HTMLModel",
|
1314 |
+
"_view_count": null,
|
1315 |
+
"_view_module": "@jupyter-widgets/controls",
|
1316 |
+
"_view_module_version": "1.5.0",
|
1317 |
+
"_view_name": "HTMLView",
|
1318 |
+
"description": "",
|
1319 |
+
"description_tooltip": null,
|
1320 |
+
"layout": "IPY_MODEL_e7937a1bc68441a080374911a6563376",
|
1321 |
+
"placeholder": "",
|
1322 |
+
"style": "IPY_MODEL_e532ed7bfef34f67b5fcacd9534eb789",
|
1323 |
+
"value": " 108/108 [00:03<00:00, 33.70it/s]"
|
1324 |
+
}
|
1325 |
+
},
|
1326 |
+
"5b588f83a15d42d9aca888e06bbd95ff": {
|
1327 |
+
"model_module": "@jupyter-widgets/controls",
|
1328 |
+
"model_module_version": "1.5.0",
|
1329 |
+
"model_name": "HTMLModel",
|
1330 |
+
"state": {
|
1331 |
+
"_dom_classes": [],
|
1332 |
+
"_model_module": "@jupyter-widgets/controls",
|
1333 |
+
"_model_module_version": "1.5.0",
|
1334 |
+
"_model_name": "HTMLModel",
|
1335 |
+
"_view_count": null,
|
1336 |
+
"_view_module": "@jupyter-widgets/controls",
|
1337 |
+
"_view_module_version": "1.5.0",
|
1338 |
+
"_view_name": "HTMLView",
|
1339 |
+
"description": "",
|
1340 |
+
"description_tooltip": null,
|
1341 |
+
"layout": "IPY_MODEL_af9b6ae927dd4764b9692507791bc67e",
|
1342 |
+
"placeholder": "",
|
1343 |
+
"style": "IPY_MODEL_134210510d49476e959dd7d032bbdbdc",
|
1344 |
+
"value": " 14/14 [00:00<00:00, 21.41it/s]"
|
1345 |
+
}
|
1346 |
+
},
|
1347 |
+
"5c7973afd79349ed997a69120d0629b2": {
|
1348 |
+
"model_module": "@jupyter-widgets/controls",
|
1349 |
+
"model_module_version": "1.5.0",
|
1350 |
+
"model_name": "ProgressStyleModel",
|
1351 |
+
"state": {
|
1352 |
+
"_model_module": "@jupyter-widgets/controls",
|
1353 |
+
"_model_module_version": "1.5.0",
|
1354 |
+
"_model_name": "ProgressStyleModel",
|
1355 |
+
"_view_count": null,
|
1356 |
+
"_view_module": "@jupyter-widgets/base",
|
1357 |
+
"_view_module_version": "1.2.0",
|
1358 |
+
"_view_name": "StyleView",
|
1359 |
+
"bar_color": null,
|
1360 |
+
"description_width": ""
|
1361 |
+
}
|
1362 |
+
},
|
1363 |
+
"5f4b9d32df8f446e858e4c289dc282f9": {
|
1364 |
+
"model_module": "@jupyter-widgets/controls",
|
1365 |
+
"model_module_version": "1.5.0",
|
1366 |
+
"model_name": "FloatProgressModel",
|
1367 |
+
"state": {
|
1368 |
+
"_dom_classes": [],
|
1369 |
+
"_model_module": "@jupyter-widgets/controls",
|
1370 |
+
"_model_module_version": "1.5.0",
|
1371 |
+
"_model_name": "FloatProgressModel",
|
1372 |
+
"_view_count": null,
|
1373 |
+
"_view_module": "@jupyter-widgets/controls",
|
1374 |
+
"_view_module_version": "1.5.0",
|
1375 |
+
"_view_name": "ProgressView",
|
1376 |
+
"bar_style": "success",
|
1377 |
+
"description": "",
|
1378 |
+
"description_tooltip": null,
|
1379 |
+
"layout": "IPY_MODEL_96a3bdece738481db57e811ccb74a974",
|
1380 |
+
"max": 14,
|
1381 |
+
"min": 0,
|
1382 |
+
"orientation": "horizontal",
|
1383 |
+
"style": "IPY_MODEL_5c7973afd79349ed997a69120d0629b2",
|
1384 |
+
"value": 14
|
1385 |
+
}
|
1386 |
+
},
|
1387 |
+
"5f9bb065c2b74d2e8ded32e1306a7807": {
|
1388 |
+
"model_module": "@jupyter-widgets/controls",
|
1389 |
+
"model_module_version": "1.5.0",
|
1390 |
+
"model_name": "HBoxModel",
|
1391 |
+
"state": {
|
1392 |
+
"_dom_classes": [],
|
1393 |
+
"_model_module": "@jupyter-widgets/controls",
|
1394 |
+
"_model_module_version": "1.5.0",
|
1395 |
+
"_model_name": "HBoxModel",
|
1396 |
+
"_view_count": null,
|
1397 |
+
"_view_module": "@jupyter-widgets/controls",
|
1398 |
+
"_view_module_version": "1.5.0",
|
1399 |
+
"_view_name": "HBoxView",
|
1400 |
+
"box_style": "",
|
1401 |
+
"children": [
|
1402 |
+
"IPY_MODEL_73a06bc546a64f7f99a9e4a135319dcd",
|
1403 |
+
"IPY_MODEL_ce48deaf4d8c49cdae92bfdbb3a78df0",
|
1404 |
+
"IPY_MODEL_4a172e8c6aa44e41a42fc1d9cf714fd0"
|
1405 |
+
],
|
1406 |
+
"layout": "IPY_MODEL_0245f2604e4d49c8bd0210302746c47b"
|
1407 |
+
}
|
1408 |
+
},
|
1409 |
+
"73a06bc546a64f7f99a9e4a135319dcd": {
|
1410 |
+
"model_module": "@jupyter-widgets/controls",
|
1411 |
+
"model_module_version": "1.5.0",
|
1412 |
+
"model_name": "HTMLModel",
|
1413 |
+
"state": {
|
1414 |
+
"_dom_classes": [],
|
1415 |
+
"_model_module": "@jupyter-widgets/controls",
|
1416 |
+
"_model_module_version": "1.5.0",
|
1417 |
+
"_model_name": "HTMLModel",
|
1418 |
+
"_view_count": null,
|
1419 |
+
"_view_module": "@jupyter-widgets/controls",
|
1420 |
+
"_view_module_version": "1.5.0",
|
1421 |
+
"_view_name": "HTMLView",
|
1422 |
+
"description": "",
|
1423 |
+
"description_tooltip": null,
|
1424 |
+
"layout": "IPY_MODEL_e956dfab55084a9cbe33c8e331b511e7",
|
1425 |
+
"placeholder": "",
|
1426 |
+
"style": "IPY_MODEL_cb394578badd43a89850873ad2526542",
|
1427 |
+
"value": "Generating embeddings: 100%"
|
1428 |
+
}
|
1429 |
+
},
|
1430 |
+
"96a3bdece738481db57e811ccb74a974": {
|
1431 |
+
"model_module": "@jupyter-widgets/base",
|
1432 |
+
"model_module_version": "1.2.0",
|
1433 |
+
"model_name": "LayoutModel",
|
1434 |
+
"state": {
|
1435 |
+
"_model_module": "@jupyter-widgets/base",
|
1436 |
+
"_model_module_version": "1.2.0",
|
1437 |
+
"_model_name": "LayoutModel",
|
1438 |
+
"_view_count": null,
|
1439 |
+
"_view_module": "@jupyter-widgets/base",
|
1440 |
+
"_view_module_version": "1.2.0",
|
1441 |
+
"_view_name": "LayoutView",
|
1442 |
+
"align_content": null,
|
1443 |
+
"align_items": null,
|
1444 |
+
"align_self": null,
|
1445 |
+
"border": null,
|
1446 |
+
"bottom": null,
|
1447 |
+
"display": null,
|
1448 |
+
"flex": null,
|
1449 |
+
"flex_flow": null,
|
1450 |
+
"grid_area": null,
|
1451 |
+
"grid_auto_columns": null,
|
1452 |
+
"grid_auto_flow": null,
|
1453 |
+
"grid_auto_rows": null,
|
1454 |
+
"grid_column": null,
|
1455 |
+
"grid_gap": null,
|
1456 |
+
"grid_row": null,
|
1457 |
+
"grid_template_areas": null,
|
1458 |
+
"grid_template_columns": null,
|
1459 |
+
"grid_template_rows": null,
|
1460 |
+
"height": null,
|
1461 |
+
"justify_content": null,
|
1462 |
+
"justify_items": null,
|
1463 |
+
"left": null,
|
1464 |
+
"margin": null,
|
1465 |
+
"max_height": null,
|
1466 |
+
"max_width": null,
|
1467 |
+
"min_height": null,
|
1468 |
+
"min_width": null,
|
1469 |
+
"object_fit": null,
|
1470 |
+
"object_position": null,
|
1471 |
+
"order": null,
|
1472 |
+
"overflow": null,
|
1473 |
+
"overflow_x": null,
|
1474 |
+
"overflow_y": null,
|
1475 |
+
"padding": null,
|
1476 |
+
"right": null,
|
1477 |
+
"top": null,
|
1478 |
+
"visibility": null,
|
1479 |
+
"width": null
|
1480 |
+
}
|
1481 |
+
},
|
1482 |
+
"abfc9aa911ce4a5ea81c7c451f08295f": {
|
1483 |
+
"model_module": "@jupyter-widgets/controls",
|
1484 |
+
"model_module_version": "1.5.0",
|
1485 |
+
"model_name": "ProgressStyleModel",
|
1486 |
+
"state": {
|
1487 |
+
"_model_module": "@jupyter-widgets/controls",
|
1488 |
+
"_model_module_version": "1.5.0",
|
1489 |
+
"_model_name": "ProgressStyleModel",
|
1490 |
+
"_view_count": null,
|
1491 |
+
"_view_module": "@jupyter-widgets/base",
|
1492 |
+
"_view_module_version": "1.2.0",
|
1493 |
+
"_view_name": "StyleView",
|
1494 |
+
"bar_color": null,
|
1495 |
+
"description_width": ""
|
1496 |
+
}
|
1497 |
+
},
|
1498 |
+
"ad073bca655540809e39f26538d2ec0d": {
|
1499 |
+
"model_module": "@jupyter-widgets/base",
|
1500 |
+
"model_module_version": "1.2.0",
|
1501 |
+
"model_name": "LayoutModel",
|
1502 |
+
"state": {
|
1503 |
+
"_model_module": "@jupyter-widgets/base",
|
1504 |
+
"_model_module_version": "1.2.0",
|
1505 |
+
"_model_name": "LayoutModel",
|
1506 |
+
"_view_count": null,
|
1507 |
+
"_view_module": "@jupyter-widgets/base",
|
1508 |
+
"_view_module_version": "1.2.0",
|
1509 |
+
"_view_name": "LayoutView",
|
1510 |
+
"align_content": null,
|
1511 |
+
"align_items": null,
|
1512 |
+
"align_self": null,
|
1513 |
+
"border": null,
|
1514 |
+
"bottom": null,
|
1515 |
+
"display": null,
|
1516 |
+
"flex": null,
|
1517 |
+
"flex_flow": null,
|
1518 |
+
"grid_area": null,
|
1519 |
+
"grid_auto_columns": null,
|
1520 |
+
"grid_auto_flow": null,
|
1521 |
+
"grid_auto_rows": null,
|
1522 |
+
"grid_column": null,
|
1523 |
+
"grid_gap": null,
|
1524 |
+
"grid_row": null,
|
1525 |
+
"grid_template_areas": null,
|
1526 |
+
"grid_template_columns": null,
|
1527 |
+
"grid_template_rows": null,
|
1528 |
+
"height": null,
|
1529 |
+
"justify_content": null,
|
1530 |
+
"justify_items": null,
|
1531 |
+
"left": null,
|
1532 |
+
"margin": null,
|
1533 |
+
"max_height": null,
|
1534 |
+
"max_width": null,
|
1535 |
+
"min_height": null,
|
1536 |
+
"min_width": null,
|
1537 |
+
"object_fit": null,
|
1538 |
+
"object_position": null,
|
1539 |
+
"order": null,
|
1540 |
+
"overflow": null,
|
1541 |
+
"overflow_x": null,
|
1542 |
+
"overflow_y": null,
|
1543 |
+
"padding": null,
|
1544 |
+
"right": null,
|
1545 |
+
"top": null,
|
1546 |
+
"visibility": null,
|
1547 |
+
"width": null
|
1548 |
+
}
|
1549 |
+
},
|
1550 |
+
"af9b6ae927dd4764b9692507791bc67e": {
|
1551 |
+
"model_module": "@jupyter-widgets/base",
|
1552 |
+
"model_module_version": "1.2.0",
|
1553 |
+
"model_name": "LayoutModel",
|
1554 |
+
"state": {
|
1555 |
+
"_model_module": "@jupyter-widgets/base",
|
1556 |
+
"_model_module_version": "1.2.0",
|
1557 |
+
"_model_name": "LayoutModel",
|
1558 |
+
"_view_count": null,
|
1559 |
+
"_view_module": "@jupyter-widgets/base",
|
1560 |
+
"_view_module_version": "1.2.0",
|
1561 |
+
"_view_name": "LayoutView",
|
1562 |
+
"align_content": null,
|
1563 |
+
"align_items": null,
|
1564 |
+
"align_self": null,
|
1565 |
+
"border": null,
|
1566 |
+
"bottom": null,
|
1567 |
+
"display": null,
|
1568 |
+
"flex": null,
|
1569 |
+
"flex_flow": null,
|
1570 |
+
"grid_area": null,
|
1571 |
+
"grid_auto_columns": null,
|
1572 |
+
"grid_auto_flow": null,
|
1573 |
+
"grid_auto_rows": null,
|
1574 |
+
"grid_column": null,
|
1575 |
+
"grid_gap": null,
|
1576 |
+
"grid_row": null,
|
1577 |
+
"grid_template_areas": null,
|
1578 |
+
"grid_template_columns": null,
|
1579 |
+
"grid_template_rows": null,
|
1580 |
+
"height": null,
|
1581 |
+
"justify_content": null,
|
1582 |
+
"justify_items": null,
|
1583 |
+
"left": null,
|
1584 |
+
"margin": null,
|
1585 |
+
"max_height": null,
|
1586 |
+
"max_width": null,
|
1587 |
+
"min_height": null,
|
1588 |
+
"min_width": null,
|
1589 |
+
"object_fit": null,
|
1590 |
+
"object_position": null,
|
1591 |
+
"order": null,
|
1592 |
+
"overflow": null,
|
1593 |
+
"overflow_x": null,
|
1594 |
+
"overflow_y": null,
|
1595 |
+
"padding": null,
|
1596 |
+
"right": null,
|
1597 |
+
"top": null,
|
1598 |
+
"visibility": null,
|
1599 |
+
"width": null
|
1600 |
+
}
|
1601 |
+
},
|
1602 |
+
"cb394578badd43a89850873ad2526542": {
|
1603 |
+
"model_module": "@jupyter-widgets/controls",
|
1604 |
+
"model_module_version": "1.5.0",
|
1605 |
+
"model_name": "DescriptionStyleModel",
|
1606 |
+
"state": {
|
1607 |
+
"_model_module": "@jupyter-widgets/controls",
|
1608 |
+
"_model_module_version": "1.5.0",
|
1609 |
+
"_model_name": "DescriptionStyleModel",
|
1610 |
+
"_view_count": null,
|
1611 |
+
"_view_module": "@jupyter-widgets/base",
|
1612 |
+
"_view_module_version": "1.2.0",
|
1613 |
+
"_view_name": "StyleView",
|
1614 |
+
"description_width": ""
|
1615 |
+
}
|
1616 |
+
},
|
1617 |
+
"ce48deaf4d8c49cdae92bfdbb3a78df0": {
|
1618 |
+
"model_module": "@jupyter-widgets/controls",
|
1619 |
+
"model_module_version": "1.5.0",
|
1620 |
+
"model_name": "FloatProgressModel",
|
1621 |
+
"state": {
|
1622 |
+
"_dom_classes": [],
|
1623 |
+
"_model_module": "@jupyter-widgets/controls",
|
1624 |
+
"_model_module_version": "1.5.0",
|
1625 |
+
"_model_name": "FloatProgressModel",
|
1626 |
+
"_view_count": null,
|
1627 |
+
"_view_module": "@jupyter-widgets/controls",
|
1628 |
+
"_view_module_version": "1.5.0",
|
1629 |
+
"_view_name": "ProgressView",
|
1630 |
+
"bar_style": "success",
|
1631 |
+
"description": "",
|
1632 |
+
"description_tooltip": null,
|
1633 |
+
"layout": "IPY_MODEL_193aef33d9184055bb9223f56d456de6",
|
1634 |
+
"max": 108,
|
1635 |
+
"min": 0,
|
1636 |
+
"orientation": "horizontal",
|
1637 |
+
"style": "IPY_MODEL_abfc9aa911ce4a5ea81c7c451f08295f",
|
1638 |
+
"value": 108
|
1639 |
+
}
|
1640 |
+
},
|
1641 |
+
"df2365556ae242a2ab1a119f9a31a561": {
|
1642 |
+
"model_module": "@jupyter-widgets/controls",
|
1643 |
+
"model_module_version": "1.5.0",
|
1644 |
+
"model_name": "HTMLModel",
|
1645 |
+
"state": {
|
1646 |
+
"_dom_classes": [],
|
1647 |
+
"_model_module": "@jupyter-widgets/controls",
|
1648 |
+
"_model_module_version": "1.5.0",
|
1649 |
+
"_model_name": "HTMLModel",
|
1650 |
+
"_view_count": null,
|
1651 |
+
"_view_module": "@jupyter-widgets/controls",
|
1652 |
+
"_view_module_version": "1.5.0",
|
1653 |
+
"_view_name": "HTMLView",
|
1654 |
+
"description": "",
|
1655 |
+
"description_tooltip": null,
|
1656 |
+
"layout": "IPY_MODEL_13b9c5395bca4c3ba21265240cb936cf",
|
1657 |
+
"placeholder": "",
|
1658 |
+
"style": "IPY_MODEL_47a4586384274577a726c57605e7f8d9",
|
1659 |
+
"value": "Parsing nodes: 100%"
|
1660 |
+
}
|
1661 |
+
},
|
1662 |
+
"e532ed7bfef34f67b5fcacd9534eb789": {
|
1663 |
+
"model_module": "@jupyter-widgets/controls",
|
1664 |
+
"model_module_version": "1.5.0",
|
1665 |
+
"model_name": "DescriptionStyleModel",
|
1666 |
+
"state": {
|
1667 |
+
"_model_module": "@jupyter-widgets/controls",
|
1668 |
+
"_model_module_version": "1.5.0",
|
1669 |
+
"_model_name": "DescriptionStyleModel",
|
1670 |
+
"_view_count": null,
|
1671 |
+
"_view_module": "@jupyter-widgets/base",
|
1672 |
+
"_view_module_version": "1.2.0",
|
1673 |
+
"_view_name": "StyleView",
|
1674 |
+
"description_width": ""
|
1675 |
+
}
|
1676 |
+
},
|
1677 |
+
"e7937a1bc68441a080374911a6563376": {
|
1678 |
+
"model_module": "@jupyter-widgets/base",
|
1679 |
+
"model_module_version": "1.2.0",
|
1680 |
+
"model_name": "LayoutModel",
|
1681 |
+
"state": {
|
1682 |
+
"_model_module": "@jupyter-widgets/base",
|
1683 |
+
"_model_module_version": "1.2.0",
|
1684 |
+
"_model_name": "LayoutModel",
|
1685 |
+
"_view_count": null,
|
1686 |
+
"_view_module": "@jupyter-widgets/base",
|
1687 |
+
"_view_module_version": "1.2.0",
|
1688 |
+
"_view_name": "LayoutView",
|
1689 |
+
"align_content": null,
|
1690 |
+
"align_items": null,
|
1691 |
+
"align_self": null,
|
1692 |
+
"border": null,
|
1693 |
+
"bottom": null,
|
1694 |
+
"display": null,
|
1695 |
+
"flex": null,
|
1696 |
+
"flex_flow": null,
|
1697 |
+
"grid_area": null,
|
1698 |
+
"grid_auto_columns": null,
|
1699 |
+
"grid_auto_flow": null,
|
1700 |
+
"grid_auto_rows": null,
|
1701 |
+
"grid_column": null,
|
1702 |
+
"grid_gap": null,
|
1703 |
+
"grid_row": null,
|
1704 |
+
"grid_template_areas": null,
|
1705 |
+
"grid_template_columns": null,
|
1706 |
+
"grid_template_rows": null,
|
1707 |
+
"height": null,
|
1708 |
+
"justify_content": null,
|
1709 |
+
"justify_items": null,
|
1710 |
+
"left": null,
|
1711 |
+
"margin": null,
|
1712 |
+
"max_height": null,
|
1713 |
+
"max_width": null,
|
1714 |
+
"min_height": null,
|
1715 |
+
"min_width": null,
|
1716 |
+
"object_fit": null,
|
1717 |
+
"object_position": null,
|
1718 |
+
"order": null,
|
1719 |
+
"overflow": null,
|
1720 |
+
"overflow_x": null,
|
1721 |
+
"overflow_y": null,
|
1722 |
+
"padding": null,
|
1723 |
+
"right": null,
|
1724 |
+
"top": null,
|
1725 |
+
"visibility": null,
|
1726 |
+
"width": null
|
1727 |
+
}
|
1728 |
+
},
|
1729 |
+
"e956dfab55084a9cbe33c8e331b511e7": {
|
1730 |
+
"model_module": "@jupyter-widgets/base",
|
1731 |
+
"model_module_version": "1.2.0",
|
1732 |
+
"model_name": "LayoutModel",
|
1733 |
+
"state": {
|
1734 |
+
"_model_module": "@jupyter-widgets/base",
|
1735 |
+
"_model_module_version": "1.2.0",
|
1736 |
+
"_model_name": "LayoutModel",
|
1737 |
+
"_view_count": null,
|
1738 |
+
"_view_module": "@jupyter-widgets/base",
|
1739 |
+
"_view_module_version": "1.2.0",
|
1740 |
+
"_view_name": "LayoutView",
|
1741 |
+
"align_content": null,
|
1742 |
+
"align_items": null,
|
1743 |
+
"align_self": null,
|
1744 |
+
"border": null,
|
1745 |
+
"bottom": null,
|
1746 |
+
"display": null,
|
1747 |
+
"flex": null,
|
1748 |
+
"flex_flow": null,
|
1749 |
+
"grid_area": null,
|
1750 |
+
"grid_auto_columns": null,
|
1751 |
+
"grid_auto_flow": null,
|
1752 |
+
"grid_auto_rows": null,
|
1753 |
+
"grid_column": null,
|
1754 |
+
"grid_gap": null,
|
1755 |
+
"grid_row": null,
|
1756 |
+
"grid_template_areas": null,
|
1757 |
+
"grid_template_columns": null,
|
1758 |
+
"grid_template_rows": null,
|
1759 |
+
"height": null,
|
1760 |
+
"justify_content": null,
|
1761 |
+
"justify_items": null,
|
1762 |
+
"left": null,
|
1763 |
+
"margin": null,
|
1764 |
+
"max_height": null,
|
1765 |
+
"max_width": null,
|
1766 |
+
"min_height": null,
|
1767 |
+
"min_width": null,
|
1768 |
+
"object_fit": null,
|
1769 |
+
"object_position": null,
|
1770 |
+
"order": null,
|
1771 |
+
"overflow": null,
|
1772 |
+
"overflow_x": null,
|
1773 |
+
"overflow_y": null,
|
1774 |
+
"padding": null,
|
1775 |
+
"right": null,
|
1776 |
+
"top": null,
|
1777 |
+
"visibility": null,
|
1778 |
+
"width": null
|
1779 |
+
}
|
1780 |
+
}
|
1781 |
+
}
|
1782 |
+
}
|
1783 |
+
},
|
1784 |
+
"nbformat": 4,
|
1785 |
+
"nbformat_minor": 0
|
1786 |
+
}
|
notebooks/13-Adding_Router.ipynb
ADDED
The diff for this file is too large to render.
See raw diff
|
|
notebooks/14-Adding_Chat.ipynb
ADDED
@@ -0,0 +1,1618 @@
1 |
+
{
|
2 |
+
"cells": [
|
3 |
+
{
|
4 |
+
"cell_type": "markdown",
|
5 |
+
"metadata": {
|
6 |
+
"colab_type": "text",
|
7 |
+
"id": "view-in-github"
|
8 |
+
},
|
9 |
+
"source": [
|
10 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/14-Adding_Chat.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
11 |
+
]
|
12 |
+
},
|
13 |
+
{
|
14 |
+
"cell_type": "markdown",
|
15 |
+
"metadata": {
|
16 |
+
"id": "-zE1h0uQV7uT"
|
17 |
+
},
|
18 |
+
"source": [
|
19 |
+
"# Install Packages and Setup Variables"
|
20 |
+
]
|
21 |
+
},
|
22 |
+
{
|
23 |
+
"cell_type": "code",
|
24 |
+
"execution_count": 1,
|
25 |
+
"metadata": {
|
26 |
+
"colab": {
|
27 |
+
"base_uri": "https://localhost:8080/"
|
28 |
+
},
|
29 |
+
"id": "QPJzr-I9XQ7l",
|
30 |
+
"outputId": "19864102-680b-446b-fb38-7fad066cee09"
|
31 |
+
},
|
32 |
+
"outputs": [],
|
33 |
+
"source": [
|
34 |
+
"!pip install -q llama-index==0.10.11 openai==1.12.0 llama-index-finetuning llama-index-embeddings-huggingface llama-index-readers-web tiktoken==0.6.0 chromadb==0.4.22 pandas==2.2.0 html2text sentence_transformers pydantic kaleido==0.2.1"
|
35 |
+
]
|
36 |
+
},
|
37 |
+
{
|
38 |
+
"cell_type": "code",
|
39 |
+
"execution_count": 1,
|
40 |
+
"metadata": {
|
41 |
+
"id": "riuXwpSPcvWC"
|
42 |
+
},
|
43 |
+
"outputs": [],
|
44 |
+
"source": [
|
45 |
+
"import os\n",
|
46 |
+
"\n",
|
47 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
48 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_KEY>\""
|
49 |
+
]
|
50 |
+
},
|
51 |
+
{
|
52 |
+
"cell_type": "code",
|
53 |
+
"execution_count": 2,
|
54 |
+
"metadata": {
|
55 |
+
"id": "jIEeZzqLbz0J"
|
56 |
+
},
|
57 |
+
"outputs": [],
|
58 |
+
"source": [
|
59 |
+
"# Allows running asyncio in environments with an existing event loop, like Jupyter notebooks.\n",
|
60 |
+
"\n",
|
61 |
+
"import nest_asyncio\n",
|
62 |
+
"\n",
|
63 |
+
"nest_asyncio.apply()"
|
64 |
+
]
|
65 |
+
},
|
66 |
+
{
|
67 |
+
"cell_type": "markdown",
|
68 |
+
"metadata": {
|
69 |
+
"id": "Bkgi2OrYzF7q"
|
70 |
+
},
|
71 |
+
"source": [
|
72 |
+
"# Load a Model"
|
73 |
+
]
|
74 |
+
},
|
75 |
+
{
|
76 |
+
"cell_type": "code",
|
77 |
+
"execution_count": 3,
|
78 |
+
"metadata": {
|
79 |
+
"id": "9oGT6crooSSj"
|
80 |
+
},
|
81 |
+
"outputs": [
|
82 |
+
{
|
83 |
+
"name": "stderr",
|
84 |
+
"output_type": "stream",
|
85 |
+
"text": [
|
86 |
+
"/Users/louis/Documents/GitHub/ai-tutor-rag-system/.conda/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
87 |
+
" from .autonotebook import tqdm as notebook_tqdm\n"
|
88 |
+
]
|
89 |
+
}
|
90 |
+
],
|
91 |
+
"source": [
|
92 |
+
"from llama_index.llms.openai import OpenAI\n",
|
93 |
+
"\n",
|
94 |
+
"llm = OpenAI(temperature=0, model=\"gpt-3.5-turbo-0125\", max_tokens=512)"
|
95 |
+
]
|
96 |
+
},
|
97 |
+
{
|
98 |
+
"cell_type": "markdown",
|
99 |
+
"metadata": {
|
100 |
+
"id": "0BwVuJXlzHVL"
|
101 |
+
},
|
102 |
+
"source": [
|
103 |
+
"# Create a VectoreStore"
|
104 |
+
]
|
105 |
+
},
|
106 |
+
{
|
107 |
+
"cell_type": "code",
|
108 |
+
"execution_count": 4,
|
109 |
+
"metadata": {
|
110 |
+
"id": "SQP87lHczHKc"
|
111 |
+
},
|
112 |
+
"outputs": [],
|
113 |
+
"source": [
|
114 |
+
"import chromadb\n",
|
115 |
+
"\n",
|
116 |
+
"# create client and a new collection\n",
|
117 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
118 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
119 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
120 |
+
]
|
121 |
+
},
|
122 |
+
{
|
123 |
+
"cell_type": "code",
|
124 |
+
"execution_count": 5,
|
125 |
+
"metadata": {
|
126 |
+
"id": "zAaGcYMJzHAN"
|
127 |
+
},
|
128 |
+
"outputs": [],
|
129 |
+
"source": [
|
130 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
131 |
+
"\n",
|
132 |
+
"# Define a storage context object using the created vector database.\n",
|
133 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
134 |
+
]
|
135 |
+
},
|
136 |
+
{
|
137 |
+
"cell_type": "markdown",
|
138 |
+
"metadata": {
|
139 |
+
"id": "I9JbAzFcjkpn"
|
140 |
+
},
|
141 |
+
"source": [
|
142 |
+
"# Load the Dataset (CSV)"
|
143 |
+
]
|
144 |
+
},
|
145 |
+
{
|
146 |
+
"cell_type": "markdown",
|
147 |
+
"metadata": {
|
148 |
+
"id": "ceveDuYdWCYk"
|
149 |
+
},
|
150 |
+
"source": [
|
151 |
+
"## Download"
|
152 |
+
]
|
153 |
+
},
|
154 |
+
{
|
155 |
+
"cell_type": "markdown",
|
156 |
+
"metadata": {
|
157 |
+
"id": "eZwf6pv7WFmD"
|
158 |
+
},
|
159 |
+
"source": [
|
160 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a long string."
|
161 |
+
]
|
162 |
+
},
|
163 |
+
{
|
164 |
+
"cell_type": "code",
|
165 |
+
"execution_count": 6,
|
166 |
+
"metadata": {
|
167 |
+
"colab": {
|
168 |
+
"base_uri": "https://localhost:8080/"
|
169 |
+
},
|
170 |
+
"id": "wl_pbPvMlv1h",
|
171 |
+
"outputId": "5418de57-b95b-4b90-b7d0-a801ea3c73f7"
|
172 |
+
},
|
173 |
+
"outputs": [
|
174 |
+
{
|
175 |
+
"name": "stdout",
|
176 |
+
"output_type": "stream",
|
177 |
+
"text": [
|
178 |
+
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
|
179 |
+
" Dload Upload Total Spent Left Speed\n",
|
180 |
+
"100 169k 100 169k 0 0 784k 0 --:--:-- --:--:-- --:--:-- 785k\n"
|
181 |
+
]
|
182 |
+
}
|
183 |
+
],
|
184 |
+
"source": [
|
185 |
+
"!curl -o ./mini-llama-articles.csv https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
186 |
+
]
|
187 |
+
},
|
188 |
+
{
|
189 |
+
"cell_type": "markdown",
|
190 |
+
"metadata": {
|
191 |
+
"id": "VWBLtDbUWJfA"
|
192 |
+
},
|
193 |
+
"source": [
|
194 |
+
"## Read File"
|
195 |
+
]
|
196 |
+
},
|
197 |
+
{
|
198 |
+
"cell_type": "code",
|
199 |
+
"execution_count": 7,
|
200 |
+
"metadata": {
|
201 |
+
"colab": {
|
202 |
+
"base_uri": "https://localhost:8080/"
|
203 |
+
},
|
204 |
+
"id": "0Q9sxuW0g3Gd",
|
205 |
+
"outputId": "801f2ba8-b498-4923-c1cc-c17d3208850c"
|
206 |
+
},
|
207 |
+
"outputs": [
|
208 |
+
{
|
209 |
+
"data": {
|
210 |
+
"text/plain": [
|
211 |
+
"14"
|
212 |
+
]
|
213 |
+
},
|
214 |
+
"execution_count": 7,
|
215 |
+
"metadata": {},
|
216 |
+
"output_type": "execute_result"
|
217 |
+
}
|
218 |
+
],
|
219 |
+
"source": [
|
220 |
+
"import csv\n",
|
221 |
+
"\n",
|
222 |
+
"rows = []\n",
|
223 |
+
"\n",
|
224 |
+
"# Load the file as a JSON\n",
|
225 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
226 |
+
" csv_reader = csv.reader(file)\n",
|
227 |
+
"\n",
|
228 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
229 |
+
" if idx == 0: continue; # Skip header row\n",
|
230 |
+
" rows.append( row )\n",
|
231 |
+
"\n",
|
232 |
+
"# The number of characters in the dataset.\n",
|
233 |
+
"len( rows )"
|
234 |
+
]
|
235 |
+
},
|
236 |
+
{
|
237 |
+
"cell_type": "markdown",
|
238 |
+
"metadata": {
|
239 |
+
"id": "S17g2RYOjmf2"
|
240 |
+
},
|
241 |
+
"source": [
|
242 |
+
"# Convert to Document obj"
|
243 |
+
]
|
244 |
+
},
|
245 |
+
{
|
246 |
+
"cell_type": "code",
|
247 |
+
"execution_count": 8,
|
248 |
+
"metadata": {
|
249 |
+
"id": "YizvmXPejkJE"
|
250 |
+
},
|
251 |
+
"outputs": [],
|
252 |
+
"source": [
|
253 |
+
"from llama_index.core import Document\n",
|
254 |
+
"\n",
|
255 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
256 |
+
"documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
|
257 |
+
]
|
258 |
+
},
|
259 |
+
{
|
260 |
+
"cell_type": "markdown",
|
261 |
+
"metadata": {
|
262 |
+
"id": "qjuLbmFuWsyl"
|
263 |
+
},
|
264 |
+
"source": [
|
265 |
+
"# Transforming"
|
266 |
+
]
|
267 |
+
},
|
268 |
+
{
|
269 |
+
"cell_type": "code",
|
270 |
+
"execution_count": 9,
|
271 |
+
"metadata": {
|
272 |
+
"id": "9z3t70DGWsjO"
|
273 |
+
},
|
274 |
+
"outputs": [],
|
275 |
+
"source": [
|
276 |
+
"from llama_index.core.text_splitter import TokenTextSplitter\n",
|
277 |
+
"\n",
|
278 |
+
"# Define the splitter object that split the text into segments with 512 tokens,\n",
|
279 |
+
"# with a 128 overlap between the segments.\n",
|
280 |
+
"text_splitter = TokenTextSplitter(\n",
|
281 |
+
" separator=\" \", chunk_size=512, chunk_overlap=128\n",
|
282 |
+
")"
|
283 |
+
]
|
284 |
+
},
|
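Before wiring the splitter into a pipeline, it can help to see what it produces on its own. Below is a minimal sketch reusing the text_splitter defined above; the sample string is purely illustrative:

# Apply the splitter directly to a sample string and inspect the chunks.
# With 512-token chunks and a 128-token overlap, consecutive chunks
# share their boundary text.
sample_text = "Llama 2 is an open-source large language model. " * 300
sample_chunks = text_splitter.split_text(sample_text)
print(len(sample_chunks))
print(sample_chunks[0][:120])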
285 |
+
{
|
286 |
+
"cell_type": "code",
|
287 |
+
"execution_count": 10,
|
288 |
+
"metadata": {
|
289 |
+
"colab": {
|
290 |
+
"base_uri": "https://localhost:8080/",
|
291 |
+
"height": 331,
|
292 |
+
"referenced_widgets": [
|
293 |
+
"3fbabd8a8660461ba5e7bc08ef39139a",
|
294 |
+
"df2365556ae242a2ab1a119f9a31a561",
|
295 |
+
"5f4b9d32df8f446e858e4c289dc282f9",
|
296 |
+
"5b588f83a15d42d9aca888e06bbd95ff",
|
297 |
+
"ad073bca655540809e39f26538d2ec0d",
|
298 |
+
"13b9c5395bca4c3ba21265240cb936cf",
|
299 |
+
"47a4586384274577a726c57605e7f8d9",
|
300 |
+
"96a3bdece738481db57e811ccb74a974",
|
301 |
+
"5c7973afd79349ed997a69120d0629b2",
|
302 |
+
"af9b6ae927dd4764b9692507791bc67e",
|
303 |
+
"134210510d49476e959dd7d032bbdbdc",
|
304 |
+
"5f9bb065c2b74d2e8ded32e1306a7807",
|
305 |
+
"73a06bc546a64f7f99a9e4a135319dcd",
|
306 |
+
"ce48deaf4d8c49cdae92bfdbb3a78df0",
|
307 |
+
"4a172e8c6aa44e41a42fc1d9cf714fd0",
|
308 |
+
"0245f2604e4d49c8bd0210302746c47b",
|
309 |
+
"e956dfab55084a9cbe33c8e331b511e7",
|
310 |
+
"cb394578badd43a89850873ad2526542",
|
311 |
+
"193aef33d9184055bb9223f56d456de6",
|
312 |
+
"abfc9aa911ce4a5ea81c7c451f08295f",
|
313 |
+
"e7937a1bc68441a080374911a6563376",
|
314 |
+
"e532ed7bfef34f67b5fcacd9534eb789"
|
315 |
+
]
|
316 |
+
},
|
317 |
+
"id": "P9LDJ7o-Wsc-",
|
318 |
+
"outputId": "01070c1f-dffa-4ab7-ad71-b07b76b12e03"
|
319 |
+
},
|
320 |
+
"outputs": [
|
321 |
+
{
|
322 |
+
"name": "stderr",
|
323 |
+
"output_type": "stream",
|
324 |
+
"text": [
|
325 |
+
"Parsing nodes: 0%| | 0/14 [00:00<?, ?it/s]"
|
326 |
+
]
|
327 |
+
},
|
328 |
+
{
|
329 |
+
"name": "stderr",
|
330 |
+
"output_type": "stream",
|
331 |
+
"text": [
|
332 |
+
"Parsing nodes: 100%|██████████| 14/14 [00:00<00:00, 28.48it/s]\n",
|
333 |
+
"100%|██████████| 108/108 [00:59<00:00, 1.82it/s]\n",
|
334 |
+
"100%|██████████| 108/108 [01:46<00:00, 1.02it/s]\n",
|
335 |
+
"100%|██████████| 108/108 [00:28<00:00, 3.75it/s]\n",
|
336 |
+
"Generating embeddings: 100%|██████████| 108/108 [00:01<00:00, 59.76it/s]\n"
|
337 |
+
]
|
338 |
+
}
|
339 |
+
],
|
340 |
+
"source": [
|
341 |
+
"from llama_index.core.extractors import (\n",
|
342 |
+
" SummaryExtractor,\n",
|
343 |
+
" QuestionsAnsweredExtractor,\n",
|
344 |
+
" KeywordExtractor,\n",
|
345 |
+
")\n",
|
346 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
347 |
+
"from llama_index.core.ingestion import IngestionPipeline\n",
|
348 |
+
"\n",
|
349 |
+
"# Create the pipeline to apply the transformation on each chunk,\n",
|
350 |
+
"# and store the transformed text in the chroma vector store.\n",
|
351 |
+
"pipeline = IngestionPipeline(\n",
|
352 |
+
" transformations=[\n",
|
353 |
+
" text_splitter,\n",
|
354 |
+
" QuestionsAnsweredExtractor(questions=3, llm=llm),\n",
|
355 |
+
" SummaryExtractor(summaries=[\"prev\", \"self\"], llm=llm),\n",
|
356 |
+
" KeywordExtractor(keywords=10, llm=llm),\n",
|
357 |
+
" OpenAIEmbedding(),\n",
|
358 |
+
" ],\n",
|
359 |
+
" vector_store=vector_store\n",
|
360 |
+
")\n",
|
361 |
+
"\n",
|
362 |
+
"nodes = pipeline.run(documents=documents, show_progress=True);"
|
363 |
+
]
|
364 |
+
},
|
365 |
+
{
|
366 |
+
"cell_type": "code",
|
367 |
+
"execution_count": 11,
|
368 |
+
"metadata": {
|
369 |
+
"colab": {
|
370 |
+
"base_uri": "https://localhost:8080/"
|
371 |
+
},
|
372 |
+
"id": "mPGa85hM2P3P",
|
373 |
+
"outputId": "c106c463-2459-4b11-bbae-5bd5e2246011"
|
374 |
+
},
|
375 |
+
"outputs": [
|
376 |
+
{
|
377 |
+
"data": {
|
378 |
+
"text/plain": [
|
379 |
+
"108"
|
380 |
+
]
|
381 |
+
},
|
382 |
+
"execution_count": 11,
|
383 |
+
"metadata": {},
|
384 |
+
"output_type": "execute_result"
|
385 |
+
}
|
386 |
+
],
|
387 |
+
"source": [
|
388 |
+
"len( nodes )"
|
389 |
+
]
|
390 |
+
},
|
391 |
+
{
|
392 |
+
"cell_type": "code",
|
393 |
+
"execution_count": 12,
|
394 |
+
"metadata": {
|
395 |
+
"id": "23x20bL3_jRb"
|
396 |
+
},
|
397 |
+
"outputs": [
|
398 |
+
{
|
399 |
+
"name": "stdout",
|
400 |
+
"output_type": "stream",
|
401 |
+
"text": [
|
402 |
+
"updating: mini-llama-articles/ (stored 0%)\n",
|
403 |
+
"updating: mini-llama-articles/chroma.sqlite3 (deflated 65%)\n",
|
404 |
+
" adding: mini-llama-articles/1a47984b-079a-4e72-809a-387c43e980b6/ (stored 0%)\n",
|
405 |
+
" adding: mini-llama-articles/1a47984b-079a-4e72-809a-387c43e980b6/data_level0.bin (deflated 100%)\n",
|
406 |
+
" adding: mini-llama-articles/1a47984b-079a-4e72-809a-387c43e980b6/length.bin (deflated 63%)\n",
|
407 |
+
" adding: mini-llama-articles/1a47984b-079a-4e72-809a-387c43e980b6/link_lists.bin (stored 0%)\n",
|
408 |
+
" adding: mini-llama-articles/1a47984b-079a-4e72-809a-387c43e980b6/header.bin (deflated 61%)\n"
|
409 |
+
]
|
410 |
+
}
|
411 |
+
],
|
412 |
+
"source": [
|
413 |
+
"# Compress the vector store directory to a zip file to be able to download and use later.\n",
|
414 |
+
"!zip -r vectorstore.zip mini-llama-articles"
|
415 |
+
]
|
416 |
+
},
|
417 |
+
{
|
418 |
+
"cell_type": "markdown",
|
419 |
+
"metadata": {
|
420 |
+
"id": "OWaT6rL7ksp8"
|
421 |
+
},
|
422 |
+
"source": [
|
423 |
+
"# Load Indexes"
|
424 |
+
]
|
425 |
+
},
|
426 |
+
{
|
427 |
+
"cell_type": "markdown",
|
428 |
+
"metadata": {
|
429 |
+
"id": "BLkmv3Yxp9mu"
|
430 |
+
},
|
431 |
+
"source": [
|
432 |
+
"If you have already uploaded the zip file for the vector store checkpoint, please uncomment the code in the following cell block to extract its contents. After doing so, you will be able to load the dataset from local storage."
|
433 |
+
]
|
434 |
+
},
|
435 |
+
{
|
436 |
+
"cell_type": "code",
|
437 |
+
"execution_count": 13,
|
438 |
+
"metadata": {
|
439 |
+
"colab": {
|
440 |
+
"base_uri": "https://localhost:8080/"
|
441 |
+
},
|
442 |
+
"id": "SodY2Xpf_kxg",
|
443 |
+
"outputId": "a6f7ae4a-447c-4222-e400-0fe55e7e26d9"
|
444 |
+
},
|
445 |
+
"outputs": [],
|
446 |
+
"source": [
|
447 |
+
"# !unzip vectorstore.zip"
|
448 |
+
]
|
449 |
+
},
|
450 |
+
{
|
451 |
+
"cell_type": "code",
|
452 |
+
"execution_count": 14,
|
453 |
+
"metadata": {
|
454 |
+
"id": "mXi56KTXk2sp"
|
455 |
+
},
|
456 |
+
"outputs": [],
|
457 |
+
"source": [
|
458 |
+
"import chromadb\n",
|
459 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
460 |
+
"\n",
|
461 |
+
"# Load the vector store from the local storage.\n",
|
462 |
+
"db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
463 |
+
"chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
|
464 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
465 |
+
]
|
466 |
+
},
|
467 |
+
{
|
468 |
+
"cell_type": "code",
|
469 |
+
"execution_count": 15,
|
470 |
+
"metadata": {
|
471 |
+
"id": "jKXURvLtkuTS"
|
472 |
+
},
|
473 |
+
"outputs": [],
|
474 |
+
"source": [
|
475 |
+
"from llama_index.core import VectorStoreIndex\n",
|
476 |
+
"\n",
|
477 |
+
"# Create the index based on the vector store.\n",
|
478 |
+
"vector_index = VectorStoreIndex.from_vector_store(vector_store)"
|
479 |
+
]
|
480 |
+
},
|
481 |
+
{
|
482 |
+
"cell_type": "markdown",
|
483 |
+
"metadata": {
|
484 |
+
"id": "q0m5rl195bcz"
|
485 |
+
},
|
486 |
+
"source": [
|
487 |
+
"# Disply result"
|
488 |
+
]
|
489 |
+
},
|
490 |
+
{
|
491 |
+
"cell_type": "code",
|
492 |
+
"execution_count": 16,
|
493 |
+
"metadata": {
|
494 |
+
"id": "4JpaHEmF5dSS"
|
495 |
+
},
|
496 |
+
"outputs": [],
|
497 |
+
"source": [
|
498 |
+
"# A simple function to show the response and the sources.\n",
|
499 |
+
"def display_res(response):\n",
|
500 |
+
" print(\"Response:\\n\\t\", response.response.replace(\"\\n\", \"\") )\n",
|
501 |
+
"\n",
|
502 |
+
" print(\"Sources:\")\n",
|
503 |
+
" if response.source_nodes:\n",
|
504 |
+
" for src in response.source_nodes:\n",
|
505 |
+
" print(\"\\tNode ID\\t\", src.node_id)\n",
|
506 |
+
" print(\"\\tText\\t\", src.text)\n",
|
507 |
+
" print(\"\\tScore\\t\", src.score)\n",
|
508 |
+
" print(\"\\t\" + \"-_\"*20)\n",
|
509 |
+
" else:\n",
|
510 |
+
" print(\"\\tNo sources used!\")"
|
511 |
+
]
|
512 |
+
},
|
513 |
+
{
|
514 |
+
"cell_type": "markdown",
|
515 |
+
"metadata": {
|
516 |
+
"id": "hbStjvUJ1cft"
|
517 |
+
},
|
518 |
+
"source": [
|
519 |
+
"# Chat Engine"
|
520 |
+
]
|
521 |
+
},
|
522 |
+
{
|
523 |
+
"cell_type": "code",
|
524 |
+
"execution_count": 17,
|
525 |
+
"metadata": {
|
526 |
+
"id": "kwWlDpoR1cRI"
|
527 |
+
},
|
528 |
+
"outputs": [],
|
529 |
+
"source": [
|
530 |
+
"# define the chat_engine by using the index\n",
|
531 |
+
"chat_engine = vector_index.as_chat_engine() #chat_mode=\"best\""
|
532 |
+
]
|
533 |
+
},
|
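As a note, calling as_chat_engine() with no arguments lets LlamaIndex pick the default "best" mode, which builds an agent when the LLM supports function calling. A minimal sketch, assuming LlamaIndex 0.10.x, of choosing a mode explicitly; the system prompt text is an illustrative placeholder:

# "context" mode retrieves nodes from the index on every user message
# and inserts them into the system prompt before the LLM answers.
context_engine = vector_index.as_chat_engine(
    chat_mode="context",
    llm=llm,
    system_prompt="You answer questions about the mini-llama-articles dataset.",
)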
534 |
+
{
|
535 |
+
"cell_type": "code",
|
536 |
+
"execution_count": 18,
|
537 |
+
"metadata": {
|
538 |
+
"colab": {
|
539 |
+
"base_uri": "https://localhost:8080/"
|
540 |
+
},
|
541 |
+
"id": "ER3Lb-oN46lJ",
|
542 |
+
"outputId": "8b34da39-622f-43f2-cb45-01a1ff37efd7"
|
543 |
+
},
|
544 |
+
"outputs": [
|
545 |
+
{
|
546 |
+
"name": "stdout",
|
547 |
+
"output_type": "stream",
|
548 |
+
"text": [
|
549 |
+
"Response:\n",
|
550 |
+
"\t The LLaMA2 model has four different model sizes with varying parameters: 7 billion, 13 billion, 34 billion, and 70 billion parameters.\n",
|
551 |
+
"Sources:\n",
|
552 |
+
"\tNode ID\t c3239b40-e206-4a80-b020-eea87cf471cc\n",
|
553 |
+
"\tText\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
554 |
+
"\tScore\t 0.7031083612095066\n",
|
555 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
556 |
+
"\tNode ID\t bc123b3d-b031-4c09-9400-d60ba9a161d6\n",
|
557 |
+
"\tText\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
558 |
+
"\tScore\t 0.7004323686791223\n",
|
559 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
560 |
+
]
|
561 |
+
}
|
562 |
+
],
|
563 |
+
"source": [
|
564 |
+
"# First Question:\n",
|
565 |
+
"response = chat_engine.chat(\"Use the tool to answer, How many parameters LLaMA2 model has?\")\n",
|
566 |
+
"display_res(response)"
|
567 |
+
]
|
568 |
+
},
|
569 |
+
{
|
570 |
+
"cell_type": "code",
|
571 |
+
"execution_count": 19,
|
572 |
+
"metadata": {
|
573 |
+
"colab": {
|
574 |
+
"base_uri": "https://localhost:8080/"
|
575 |
+
},
|
576 |
+
"id": "3RRmiJEQ5R1Q",
|
577 |
+
"outputId": "15efcc9b-583f-4efe-8e36-fa8b5160da16"
|
578 |
+
},
|
579 |
+
"outputs": [
|
580 |
+
{
|
581 |
+
"name": "stdout",
|
582 |
+
"output_type": "stream",
|
583 |
+
"text": [
|
584 |
+
"Response:\n",
|
585 |
+
"\t Why did the scarecrow win an award? Because he was outstanding in his field!\n",
|
586 |
+
"Sources:\n",
|
587 |
+
"\tNode ID\t 8685e48d-1fdb-4f55-8f62-6f2ea4cfaf5d\n",
|
588 |
+
"\tText\t with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering strong competition to closed-source models. V. Ghost Attention: Enhancing Conversational Continuity One unique feature in Llama 2 is Ghost Attention, which ensures continuity in conversations. This means that even after multiple interactions, the model remembers its initial instructions, ensuring more coherent and consistent responses throughout the conversation. This feature significantly enhances the user experience and makes Llama 2 a more reliable language model for interactive applications. In the example below, on the left, it forgets to use an emoji after a few conversations. On the right, with Ghost Attention, even after having many conversations, it will remember the context and continue to use emojis in its response. VI. Temporal Capability: A Leap in Information Organization Meta reported a groundbreaking temporal capability, where the model organizes information based on time relevance. Each question posed to the model is associated with a date, and it responds accordingly by considering the event date before which the question becomes irrelevant. For example, if you ask the question, \"How long ago did Barack Obama become president?\", its only relevant after 2008. This temporal awareness allows Llama 2 to deliver more contextually accurate responses, enriching the user experience further. VII. Open Questions and Future Outlook Meta's open-sourcing of Llama 2 represents a seismic shift, now offering developers and researchers commercial access to a leading language model. With Llama 2 outperforming MosaicML's current MPT models, all eyes are on how Databricks will respond. Can MosaicML's next MPT iteration beat Llama 2? Is it worthwhile to compete\n",
|
589 |
+
"\tScore\t 0.5624851990178006\n",
|
590 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
591 |
+
"\tNode ID\t e03eb322-1360-4c76-b461-236ec8312de1\n",
|
592 |
+
"\tText\t Introduce GPT4All GPT4All is a large language model (LLM) chatbot developed by Nomic AI, the world's first information cartography company. It was fine-tuned from LLaMA 7B model, the leaked large language model from Meta (aka Facebook). GPT4All is trained on a massive dataset of text and code, and it can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way. GPT4All is available to the public on GitHub. LLaMA is available for commercial use under the GPL-3.0 license - while the LLaMA code is available for commercial use, the WEIGHTS are not. This effectively puts it in the same license class as GPT4All. Nomic is working on a GPT-J-based version of GPT4All with an open commercial license. GPT4All is not going to have a subscription fee ever. GPT4All is Free4All. Although GPT4All is still in its early stages, it has already left a notable mark on the AI landscape. Its popularity and capabilities are expected to expand further in the future. How to Run GPT4All Locally GPT4All Readme provides some details about its usage. Here will briefly demonstrate to run GPT4All locally on M1 CPU Mac. Download gpt4all-lora-quantized.bin from the-eye.Clone this repository, navigate to chat, and place the downloaded file there. Simply run the following command for M1 Mac: Now, it's ready to run locally. Please see a few snapshots below: Similar to ChatGPT, GPT4All has the ability to comprehend Chinese, a feature that Bard lacks. If you want to interact with GPT4All programmatically, you can install the nomic client as follows. Install the nomic client using pip install nomic.Use the following Python script to interact with GPT4All: Chat4All Demystified GPT4All aims to provide a cost-effective and fine-tuned model for high-quality LLM results. The GPT4All model was fine-tuned using an instance of LLaMA\n",
|
593 |
+
"\tScore\t 0.5615408071202241\n",
|
594 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
595 |
+
]
|
596 |
+
}
|
597 |
+
],
|
598 |
+
"source": [
|
599 |
+
"# Second Question:\n",
|
600 |
+
"response = chat_engine.chat(\"Tell me a joke?\")\n",
|
601 |
+
"display_res(response)"
|
602 |
+
]
|
603 |
+
},
|
604 |
+
{
|
605 |
+
"cell_type": "code",
|
606 |
+
"execution_count": 20,
|
607 |
+
"metadata": {
|
608 |
+
"colab": {
|
609 |
+
"base_uri": "https://localhost:8080/"
|
610 |
+
},
|
611 |
+
"id": "8eOzp5Xc5Vbj",
|
612 |
+
"outputId": "13bc6714-dd89-45b3-a86b-759806245241"
|
613 |
+
},
|
614 |
+
"outputs": [
|
615 |
+
{
|
616 |
+
"name": "stdout",
|
617 |
+
"output_type": "stream",
|
618 |
+
"text": [
|
619 |
+
"Response:\n",
|
620 |
+
"\t The first question you asked was \"How many parameters LLaMA2 model has?\"\n",
|
621 |
+
"Sources:\n",
|
622 |
+
"\tNo sources used!\n"
|
623 |
+
]
|
624 |
+
}
|
625 |
+
],
|
626 |
+
"source": [
|
627 |
+
"# Third Question: (check if it can recall previous interactions)\n",
|
628 |
+
"response = chat_engine.chat(\"What was the first question I asked?\")\n",
|
629 |
+
"display_res(response)"
|
630 |
+
]
|
631 |
+
},
|
632 |
+
{
|
633 |
+
"cell_type": "code",
|
634 |
+
"execution_count": 21,
|
635 |
+
"metadata": {
|
636 |
+
"id": "7jfiLpru5VZT"
|
637 |
+
},
|
638 |
+
"outputs": [],
|
639 |
+
"source": [
|
640 |
+
"# Reset the session to clear the memory\n",
|
641 |
+
"chat_engine.reset()"
|
642 |
+
]
|
643 |
+
},
|
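Beyond reset(), the amount of history carried between turns can be bounded. A minimal sketch, assuming LlamaIndex 0.10.x's ChatMemoryBuffer; the token limit is an illustrative value:

from llama_index.core.memory import ChatMemoryBuffer

# Cap the chat history; older messages are dropped once the buffer
# exceeds the token limit.
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
chat_engine = vector_index.as_chat_engine(memory=memory)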
644 |
+
{
|
645 |
+
"cell_type": "code",
|
646 |
+
"execution_count": 22,
|
647 |
+
"metadata": {
|
648 |
+
"colab": {
|
649 |
+
"base_uri": "https://localhost:8080/"
|
650 |
+
},
|
651 |
+
"id": "Jt0q8RW25VXN",
|
652 |
+
"outputId": "0e2d0d4e-c0ff-48bf-8df3-478fcdc66abd"
|
653 |
+
},
|
654 |
+
"outputs": [
|
655 |
+
{
|
656 |
+
"name": "stdout",
|
657 |
+
"output_type": "stream",
|
658 |
+
"text": [
|
659 |
+
"Response:\n",
|
660 |
+
"\t The first question you asked was \"How can a Q&A bot be built over private documents using OpenAI and LangChain?\"\n",
|
661 |
+
"Sources:\n",
|
662 |
+
"\tNode ID\t baa8a99c-f38b-4818-b854-5741598c0776\n",
|
663 |
+
"\tText\t Private data to be used The example provided can be used with any dataset. I am using a data set that has Analyst recommendations from various stocks. For the purpose of demonstration, I have gathered publicly available analyst recommendations to showcase its capabilities. You can replace this with your own information to try this. Below is a partial extract of the information commonly found in these documents. If you wish to try it yourself, you can download analyst recommendations for your preferred stocks from online sources or access them through subscription platforms like Barron's. Although the example provided focuses on analyst recommendations, the underlying structure can be utilized to query various other types of documents in any industry as well. I have assembled such data for a few stocks for demonstration purposes. This includes Google, Microsoft, Meta, and Tesla. To facilitate easy access and updating of analysts' recommendations, all the recommendations can be organized into a designated folder. Each stock corresponds to a separate file within this folder. For example, if there are recommendations for 20 stocks, there will be 20 individual files. This organization enables convenient updating of information for each stock as new recommendations arrive, streamlining the process of managing and maintaining the most up-to-date data for each stock. Questions this Q&A bot application can answer The data we have for this application is stock market analyst recommendations for many stocks. Let's say you are looking for insight about Microsoft stock. You can ask any of the following questions as an example: What is the median target price for Microsoft (MSFT)?What is the highest price estimate for Microsoft (MSFT)?What is the lowest price estimate for Microsoft (MSFT)?How much percentage increase is expected in the stock price of Microsoft (MSFT)?How many analysts provided price forecasts for Microsoft (MSFT)?What is the current consensus among investment analysts regarding Microsoft (MSFT)?Has the consensus rating for Microsoft (MSFT) changed recently?When was the consensus rating last updated for Microsoft (MSFT)?Is the current recommendation for Microsoft (MSFT) to buy, sell, or hold the stock?Are there any recent analyst reports available for Microsoft (MSFT)? These questions cover various aspects of the stock analysis, including price forecasts, analyst recommendations, and recent changes in ratings. The\n",
|
664 |
+
"\tScore\t 0.5990934490336279\n",
|
665 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
666 |
+
"\tNode ID\t d03de2fa-70aa-4b32-8760-3af0dd0ebb24\n",
|
667 |
+
"\tText\t you want to specify exact documents, you can do it the following way. To load the files you want to ingest, you can specify the path to each file individually. The loaded files can then be saved into a list. This list serves as the input that is sent to the vector database to store the data. The alternative approach is a more versatile method in which we can load all pertinent documents from a designated folder and store the file locations in a list for subsequent processing. This approach offers flexibility and allows for the efficient handling of multiple documents by capturing their locations in a centralized list, enabling seamless data retrieval and analysis. Load the documents into the vector store. When dealing with a vast number of documents, it becomes inefficient to send all documents (analyst recommendations) to your large language model (LLM) when seeking answers to specific questions. For instance, if your question pertains to MSFT, it would be more cost-effective to only send document extracts that reference MSFT to your LLM for answering the question. This approach helps optimize resource utilization. To achieve this, all documents are split into chunks and stored in a vector database in a numeric format (embeddings). When a new question is posed, the system queries the vector database for relevant text chunks related to this question, which is then shared with the LLM to generate an appropriate response. Within the LangChain framework, the VectorstoreIndexCreator class serves as a utility for creating a vector store index. This index stores vector representations of the documents (in chromadb), enabling various text operations, such as finding similar documents based on a specific question. When a user asks a question, a similarity search is performed in the vector store to get document chunks relevant to the question. The question, along with the chunks are sent to OpenAI to get the response back. Now we are ready to query these documents. Setting up the web application The application is presented in the browser using Streamlit, providing a user-friendly interface. Within the application, a text box is available for users to enter their questions. Upon submitting the question by pressing enter, the application processes the input and generates a corresponding response. This response is then displayed below the text box, allowing users to conveniently view the relevant\n",
|
668 |
+
"\tScore\t 0.5904441993661576\n",
|
669 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
670 |
+
]
|
671 |
+
}
|
672 |
+
],
|
673 |
+
"source": [
|
674 |
+
"# Fourth Question: (don't recall the previous interactions.)\n",
|
675 |
+
"response = chat_engine.chat(\"What was the first question I asked?\")\n",
|
676 |
+
"display_res(response)"
|
677 |
+
]
|
678 |
+
},
|
679 |
+
{
|
680 |
+
"cell_type": "markdown",
|
681 |
+
"metadata": {
|
682 |
+
"id": "0Egsib7yPJGR"
|
683 |
+
},
|
684 |
+
"source": [
|
685 |
+
"# Streaming"
|
686 |
+
]
|
687 |
+
},
|
688 |
+
{
|
689 |
+
"cell_type": "code",
|
690 |
+
"execution_count": 23,
|
691 |
+
"metadata": {
|
692 |
+
"colab": {
|
693 |
+
"base_uri": "https://localhost:8080/"
|
694 |
+
},
|
695 |
+
"id": "zanJeMbaPJcq",
|
696 |
+
"outputId": "de7f0905-c1b1-49ac-fb66-d1578da35cad"
|
697 |
+
},
|
698 |
+
"outputs": [
|
699 |
+
{
|
700 |
+
"name": "stdout",
|
701 |
+
"output_type": "stream",
|
702 |
+
"text": [
|
703 |
+
"Here is a paragraph about the LLaMA2 model's capabilities:\n",
|
704 |
+
"\n",
|
705 |
+
"\"The Llama 2 model showcases impressive capabilities in the realm of open-source language models. It introduces innovative features like Ghost Attention, which enhances conversational continuity by ensuring consistent responses throughout interactions. Additionally, Llama 2 boasts a groundbreaking temporal capability that organizes information based on time relevance, leading to more contextually accurate responses. Despite facing challenges in coding and math problems compared to larger models like Chat GPT 4, Llama 2 demonstrates efficiency and potential in the market, competing well with both open-source and closed-source models. Its ability to balance helpfulness and safety in optimizing responses further solidifies its position as a reliable and advanced language model for commercial use.\""
|
706 |
+
]
|
707 |
+
}
|
708 |
+
],
|
709 |
+
"source": [
|
710 |
+
"# Stream the words as soon as they are available instead of waiting for the model to finish generation.\n",
|
711 |
+
"streaming_response = chat_engine.stream_chat(\"Write a paragraph about the LLaMA2 model's capabilities.\")\n",
|
712 |
+
"for token in streaming_response.response_gen:\n",
|
713 |
+
" print(token, end=\"\")"
|
714 |
+
]
|
715 |
+
},
|
716 |
+
{
|
717 |
+
"cell_type": "markdown",
|
718 |
+
"metadata": {
|
719 |
+
"id": "DuRgOJ2AHMJh"
|
720 |
+
},
|
721 |
+
"source": [
|
722 |
+
"## Condense Question"
|
723 |
+
]
|
724 |
+
},
|
725 |
+
{
|
726 |
+
"cell_type": "markdown",
|
727 |
+
"metadata": {
|
728 |
+
"id": "Yb2Lt41jq145"
|
729 |
+
},
|
730 |
+
"source": [
|
731 |
+
"Enhance the input prompt by looking at the previous chat history along with the present question. The refined prompt can then be used to fetch the nodes."
|
732 |
+
]
|
733 |
+
},
|
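For reference, this is roughly what as_chat_engine(chat_mode="condense_question") assembles under the hood. A minimal sketch, assuming LlamaIndex 0.10.x; the condense prompt wording is illustrative:

from llama_index.core import PromptTemplate
from llama_index.core.chat_engine import CondenseQuestionChatEngine

# An illustrative prompt that rewrites a follow-up into a standalone question.
condense_prompt = PromptTemplate(
    "Given the conversation below and a follow-up question, rewrite the "
    "follow-up as a standalone question.\n"
    "Chat history:\n{chat_history}\n"
    "Follow-up question: {question}\n"
    "Standalone question: "
)

query_engine = vector_index.as_query_engine(llm=gpt4)
chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=query_engine,
    condense_question_prompt=condense_prompt,
    llm=gpt4,
    verbose=True,
)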
734 |
+
{
|
735 |
+
"cell_type": "code",
|
736 |
+
"execution_count": 24,
|
737 |
+
"metadata": {
|
738 |
+
"id": "v0gmM5LGIaRl"
|
739 |
+
},
|
740 |
+
"outputs": [],
|
741 |
+
"source": [
|
742 |
+
"# Define GPT-4 model that will be used by the chat_engine to improve the query.\n",
|
743 |
+
"gpt4 = OpenAI(temperature=0.9, model=\"gpt-4-0125-preview\")"
|
744 |
+
]
|
745 |
+
},
|
746 |
+
{
|
747 |
+
"cell_type": "code",
|
748 |
+
"execution_count": 25,
|
749 |
+
"metadata": {
|
750 |
+
"id": "EDWsaBTBIhK7"
|
751 |
+
},
|
752 |
+
"outputs": [],
|
753 |
+
"source": [
|
754 |
+
"chat_engine = vector_index.as_chat_engine(chat_mode=\"condense_question\", llm=gpt4, verbose=True)"
|
755 |
+
]
|
756 |
+
},
|
757 |
+
{
|
758 |
+
"cell_type": "code",
|
759 |
+
"execution_count": 26,
|
760 |
+
"metadata": {
|
761 |
+
"colab": {
|
762 |
+
"base_uri": "https://localhost:8080/"
|
763 |
+
},
|
764 |
+
"id": "h4c--hJ75VU2",
|
765 |
+
"outputId": "e80fd9bf-e6d5-4532-8771-8cbf781e782e"
|
766 |
+
},
|
767 |
+
"outputs": [
|
768 |
+
{
|
769 |
+
"name": "stdout",
|
770 |
+
"output_type": "stream",
|
771 |
+
"text": [
|
772 |
+
"Querying with: Using the tool at your disposal, can you please determine which company released the LLaMA2 model and explain what specific functionality or purpose this model is known for?\n",
|
773 |
+
"Response:\n",
|
774 |
+
"\t The LLaMA2 model was released by Meta. The model is known for its temporal awareness feature which enhances the accuracy of its responses by delivering more contextually accurate responses based on time relevance. For example, for the question, \"How long ago did Barack Obama become president?\", it only considers information relevant after 2008. Meta's open-sourcing of LLaMA2 provides developers and researchers with commercial access to the advanced language model, which represents a significant shift in the AI industry.\n",
|
775 |
+
"Sources:\n",
|
776 |
+
"\tNode ID\t 7adec56f-6714-4376-8ebf-180b694c4d59\n",
|
777 |
+
"\tText\t LLaMA: Meta's new AI tool According to the official release, LLaMA is a foundational language model developed to assist 'researchers and academics' in their work (as opposed to the average web user) to understand and study these NLP models. Leveraging AI in such a way could give researchers an edge in terms of time spent. You may not know this, but this would be Meta's third LLM after Blender Bot 3 and Galactica. However, the two LLMs were shut down soon, and Meta stopped their further development, as it produced erroneous results. Before moving further, it is important to emphasize that LLaMA is NOT a chatbot like ChatGPT. As I mentioned before, it is a 'research tool' for researchers. We can expect the initial versions of LLaMA to be a bit more technical and indirect to use as opposed to the case with ChatGPT, which was very direct, interactive, and a lot easy to use. \"Smaller, more performant models such as LLaMA enable ... research community who don't have access to large amounts of infrastructure to study these models.. further democratizing access in this important, fast-changing field,\" said Meta in its official blog. Meta's effort of \"democratizing\" access to the public could shed light on one of the critical issues of Generative AI - toxicity and bias. ChatGPT and other LLMs (obviously, I am referring to Bing) have a track record of responding in a way that is toxic and, well... evil. The Verge and major critics have covered it in much detail. Oh and the community did get the access, but not in the way Meta anticipated. On March 3rd, a downloadable torrent of the LLaMA system was posted on 4chan. 4chan is an anonymous online forum known for its controversial content and diverse range of discussions, which has nearly 222 million unique monthly visitors. LLaMA is currently not in use on any of Meta's products. But Meta has plans to make it available to researchers before they can use them in their own products. It's worth mentioning that Meta did not release\n",
|
778 |
+
"\tScore\t 0.696738130166742\n",
|
779 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
780 |
+
"\tNode ID\t d19bbfc9-9ff8-4c21-ba5a-ef78e5db2d87\n",
|
781 |
+
"\tText\t the question, \"How long ago did Barack Obama become president?\", its only relevant after 2008. This temporal awareness allows Llama 2 to deliver more contextually accurate responses, enriching the user experience further. VII. Open Questions and Future Outlook Meta's open-sourcing of Llama 2 represents a seismic shift, now offering developers and researchers commercial access to a leading language model. With Llama 2 outperforming MosaicML's current MPT models, all eyes are on how Databricks will respond. Can MosaicML's next MPT iteration beat Llama 2? Is it worthwhile to compete with Llama 2 or join hands with the open-source community to make the open-source models better? Meanwhile, Microsoft's move to host Llama 2 on Azure despite having significant investment in ChatGPT raises interesting questions. Will users prefer the capabilities and transparency of an open-source model like Llama 2 over closed, proprietary options? The stakes are high, as Meta's bold democratization play stands to reshape preferences and partnerships in the AI space. One thing is certain - the era of open language model competition has begun. VIII. Conclusion With the launch of Llama 2, Meta has achieved a landmark breakthrough in open-source language models, unleashing new potential through its commercial accessibility. Llama 2's formidable capabilities in natural language processing, along with robust safety protocols and temporal reasoning, set new benchmarks for the field. While select limitations around math and coding exist presently, Llama 2's strengths far outweigh its weaknesses. As Meta continues honing Llama technology, this latest innovation promises to be truly transformative. By open-sourcing such an advanced model, Meta is propelling democratization and proliferation of AI across industries. From healthcare to education and beyond, Llama 2 stands to shape the landscape by putting groundbreaking language modeling into the hands of all developers and researchers. The possibilities unlocked by this open-source approach signal a shift towards a more collaborative, creative AI future.\n",
|
782 |
+
"\tScore\t 0.692383316770113\n",
|
783 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
784 |
+
]
|
785 |
+
}
|
786 |
+
],
|
787 |
+
"source": [
|
788 |
+
"response = chat_engine.chat(\"Use the tool to answer, which company released LLaMA2 model? What is the model useful for?\")\n",
|
789 |
+
"display_res(response)"
|
790 |
+
]
|
791 |
+
},
|
792 |
+
{
|
793 |
+
"cell_type": "markdown",
|
794 |
+
"metadata": {
|
795 |
+
"id": "ysL9ONePOsGB"
|
796 |
+
},
|
797 |
+
"source": [
|
798 |
+
"## REACT"
|
799 |
+
]
|
800 |
+
},
|
801 |
+
{
|
802 |
+
"cell_type": "markdown",
|
803 |
+
"metadata": {
|
804 |
+
"id": "KiEFmxAtrmF-"
|
805 |
+
},
|
806 |
+
"source": [
|
807 |
+
"ReAct is an agent-based chat mode that uses a loop to decide on querying a data engine during interactions, offering flexibility but relying on the Large Language Model's quality for effective responses, requiring careful management to avoid inaccurate answers."
|
808 |
+
]
|
809 |
+
},
|
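For reference, this is roughly what chat_mode="react" builds internally: the index's query engine is wrapped as a tool and handed to a ReAct agent. A minimal sketch, assuming LlamaIndex 0.10.x; the tool name and description are illustrative:

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# Wrap the query engine as a tool the agent can decide to call in its
# reasoning-and-acting loop.
query_tool = QueryEngineTool(
    query_engine=vector_index.as_query_engine(llm=llm),
    metadata=ToolMetadata(
        name="llama2_articles",
        description="Answers questions about the LLaMA2 articles.",
    ),
)
react_agent = ReActAgent.from_tools(tools=[query_tool], llm=llm, verbose=True)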
810 |
+
{
|
811 |
+
"cell_type": "code",
|
812 |
+
"execution_count": 27,
|
813 |
+
"metadata": {
|
814 |
+
"id": "-M1jWoKXOs2t"
|
815 |
+
},
|
816 |
+
"outputs": [],
|
817 |
+
"source": [
|
818 |
+
"chat_engine = vector_index.as_chat_engine(chat_mode=\"react\", verbose=True)"
|
819 |
+
]
|
820 |
+
},
|
821 |
+
{
|
822 |
+
"cell_type": "code",
|
823 |
+
"execution_count": 28,
|
824 |
+
"metadata": {
|
825 |
+
"colab": {
|
826 |
+
"base_uri": "https://localhost:8080/"
|
827 |
+
},
|
828 |
+
"id": "UZkEW1SSOs0H",
|
829 |
+
"outputId": "4869c5fc-e0e1-44c6-e7f0-87db92bb2eb6"
|
830 |
+
},
|
831 |
+
"outputs": [
|
832 |
+
{
|
833 |
+
"name": "stdout",
|
834 |
+
"output_type": "stream",
|
835 |
+
"text": [
|
836 |
+
"Added user message to memory: Which company released LLaMA2 model? What is the model useful for?\n",
|
837 |
+
"=== Calling Function ===\n",
|
838 |
+
"Calling function: query_engine_tool with args: {\"input\": \"Which company released LLaMA2 model?\"}\n",
|
839 |
+
"Got output: Meta released the LLaMA2 model.\n",
|
840 |
+
"========================\n",
|
841 |
+
"\n",
|
842 |
+
"=== Calling Function ===\n",
|
843 |
+
"Calling function: query_engine_tool with args: {\"input\": \"What is the LLaMA2 model useful for?\"}\n",
|
844 |
+
"Got output: The Llama 2 model is useful for businesses to integrate into products to create AI-powered applications.\n",
|
845 |
+
"========================\n",
|
846 |
+
"\n"
|
847 |
+
]
|
848 |
+
}
|
849 |
+
],
|
850 |
+
"source": [
|
851 |
+
"response = chat_engine.chat(\"Which company released LLaMA2 model? What is the model useful for?\")"
|
852 |
+
]
|
853 |
+
},
|
854 |
+
{
|
855 |
+
"cell_type": "code",
|
856 |
+
"execution_count": 29,
|
857 |
+
"metadata": {
|
858 |
+
"colab": {
|
859 |
+
"base_uri": "https://localhost:8080/"
|
860 |
+
},
|
861 |
+
"id": "eW5P1lD4Osxf",
|
862 |
+
"outputId": "b128bc94-081b-49aa-c549-7d7d7be90b63"
|
863 |
+
},
|
864 |
+
"outputs": [
|
865 |
+
{
|
866 |
+
"name": "stdout",
|
867 |
+
"output_type": "stream",
|
868 |
+
"text": [
|
869 |
+
"Response:\n",
|
870 |
+
"\t The LLaMA2 model was released by Meta. It is useful for businesses to integrate into products to create AI-powered applications.\n",
|
871 |
+
"Sources:\n",
|
872 |
+
"\tNode ID\t 7adec56f-6714-4376-8ebf-180b694c4d59\n",
|
873 |
+
"\tText\t LLaMA: Meta's new AI tool According to the official release, LLaMA is a foundational language model developed to assist 'researchers and academics' in their work (as opposed to the average web user) to understand and study these NLP models. Leveraging AI in such a way could give researchers an edge in terms of time spent. You may not know this, but this would be Meta's third LLM after Blender Bot 3 and Galactica. However, the two LLMs were shut down soon, and Meta stopped their further development, as it produced erroneous results. Before moving further, it is important to emphasize that LLaMA is NOT a chatbot like ChatGPT. As I mentioned before, it is a 'research tool' for researchers. We can expect the initial versions of LLaMA to be a bit more technical and indirect to use as opposed to the case with ChatGPT, which was very direct, interactive, and a lot easy to use. \"Smaller, more performant models such as LLaMA enable ... research community who don't have access to large amounts of infrastructure to study these models.. further democratizing access in this important, fast-changing field,\" said Meta in its official blog. Meta's effort of \"democratizing\" access to the public could shed light on one of the critical issues of Generative AI - toxicity and bias. ChatGPT and other LLMs (obviously, I am referring to Bing) have a track record of responding in a way that is toxic and, well... evil. The Verge and major critics have covered it in much detail. Oh and the community did get the access, but not in the way Meta anticipated. On March 3rd, a downloadable torrent of the LLaMA system was posted on 4chan. 4chan is an anonymous online forum known for its controversial content and diverse range of discussions, which has nearly 222 million unique monthly visitors. LLaMA is currently not in use on any of Meta's products. But Meta has plans to make it available to researchers before they can use them in their own products. It's worth mentioning that Meta did not release\n",
|
874 |
+
"\tScore\t 0.6701682333186606\n",
|
875 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
876 |
+
"\tNode ID\t d19bbfc9-9ff8-4c21-ba5a-ef78e5db2d87\n",
|
877 |
+
"\tText\t the question, \"How long ago did Barack Obama become president?\", its only relevant after 2008. This temporal awareness allows Llama 2 to deliver more contextually accurate responses, enriching the user experience further. VII. Open Questions and Future Outlook Meta's open-sourcing of Llama 2 represents a seismic shift, now offering developers and researchers commercial access to a leading language model. With Llama 2 outperforming MosaicML's current MPT models, all eyes are on how Databricks will respond. Can MosaicML's next MPT iteration beat Llama 2? Is it worthwhile to compete with Llama 2 or join hands with the open-source community to make the open-source models better? Meanwhile, Microsoft's move to host Llama 2 on Azure despite having significant investment in ChatGPT raises interesting questions. Will users prefer the capabilities and transparency of an open-source model like Llama 2 over closed, proprietary options? The stakes are high, as Meta's bold democratization play stands to reshape preferences and partnerships in the AI space. One thing is certain - the era of open language model competition has begun. VIII. Conclusion With the launch of Llama 2, Meta has achieved a landmark breakthrough in open-source language models, unleashing new potential through its commercial accessibility. Llama 2's formidable capabilities in natural language processing, along with robust safety protocols and temporal reasoning, set new benchmarks for the field. While select limitations around math and coding exist presently, Llama 2's strengths far outweigh its weaknesses. As Meta continues honing Llama technology, this latest innovation promises to be truly transformative. By open-sourcing such an advanced model, Meta is propelling democratization and proliferation of AI across industries. From healthcare to education and beyond, Llama 2 stands to shape the landscape by putting groundbreaking language modeling into the hands of all developers and researchers. The possibilities unlocked by this open-source approach signal a shift towards a more collaborative, creative AI future.\n",
|
878 |
+
"\tScore\t 0.6696485090138802\n",
|
879 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
880 |
+
"\tNode ID\t bc123b3d-b031-4c09-9400-d60ba9a161d6\n",
|
881 |
+
"\tText\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
882 |
+
"\tScore\t 0.7141285410107295\n",
|
883 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
884 |
+
"\tNode ID\t c3239b40-e206-4a80-b020-eea87cf471cc\n",
|
885 |
+
"\tText\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
886 |
+
"\tScore\t 0.7116485926146265\n",
|
887 |
+
"\t-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
888 |
+
]
|
889 |
+
}
|
890 |
+
],
|
891 |
+
"source": [
|
892 |
+
"display_res(response)"
|
893 |
+
]
|
894 |
+
},
|
895 |
+
{
|
896 |
+
"cell_type": "code",
|
897 |
+
"execution_count": null,
|
898 |
+
"metadata": {
|
899 |
+
"id": "zf6r2AmFOsca"
|
900 |
+
},
|
901 |
+
"outputs": [],
|
902 |
+
"source": []
|
903 |
+
}
|
904 |
+
],
|
905 |
+
"metadata": {
|
906 |
+
"colab": {
|
907 |
+
"authorship_tag": "ABX9TyNlfE1aMk+m6avCgDavT2ZF",
|
908 |
+
"include_colab_link": true,
|
909 |
+
"provenance": []
|
910 |
+
},
|
911 |
+
"kernelspec": {
|
912 |
+
"display_name": "Python 3",
|
913 |
+
"name": "python3"
|
914 |
+
},
|
915 |
+
"language_info": {
|
916 |
+
"codemirror_mode": {
|
917 |
+
"name": "ipython",
|
918 |
+
"version": 3
|
919 |
+
},
|
920 |
+
"file_extension": ".py",
|
921 |
+
"mimetype": "text/x-python",
|
922 |
+
"name": "python",
|
923 |
+
"nbconvert_exporter": "python",
|
924 |
+
"pygments_lexer": "ipython3",
|
925 |
+
"version": "3.11.8"
|
926 |
+
},
|
927 |
+
"widgets": {
|
928 |
+
"application/vnd.jupyter.widget-state+json": {
|
929 |
+
"0245f2604e4d49c8bd0210302746c47b": {
|
930 |
+
"model_module": "@jupyter-widgets/base",
|
931 |
+
"model_module_version": "1.2.0",
|
932 |
+
"model_name": "LayoutModel",
|
933 |
+
"state": {
|
934 |
+
"_model_module": "@jupyter-widgets/base",
|
935 |
+
"_model_module_version": "1.2.0",
|
936 |
+
"_model_name": "LayoutModel",
|
937 |
+
"_view_count": null,
|
938 |
+
"_view_module": "@jupyter-widgets/base",
|
939 |
+
"_view_module_version": "1.2.0",
|
940 |
+
"_view_name": "LayoutView",
|
941 |
+
"align_content": null,
|
942 |
+
"align_items": null,
|
943 |
+
"align_self": null,
|
944 |
+
"border": null,
|
945 |
+
"bottom": null,
|
946 |
+
"display": null,
|
947 |
+
"flex": null,
|
948 |
+
"flex_flow": null,
|
949 |
+
"grid_area": null,
|
950 |
+
"grid_auto_columns": null,
|
951 |
+
"grid_auto_flow": null,
|
952 |
+
"grid_auto_rows": null,
|
953 |
+
"grid_column": null,
|
954 |
+
"grid_gap": null,
|
955 |
+
"grid_row": null,
|
956 |
+
"grid_template_areas": null,
|
957 |
+
"grid_template_columns": null,
|
958 |
+
"grid_template_rows": null,
|
959 |
+
"height": null,
|
960 |
+
"justify_content": null,
|
961 |
+
"justify_items": null,
|
962 |
+
"left": null,
|
963 |
+
"margin": null,
|
964 |
+
"max_height": null,
|
965 |
+
"max_width": null,
|
966 |
+
"min_height": null,
|
967 |
+
"min_width": null,
|
968 |
+
"object_fit": null,
|
969 |
+
"object_position": null,
|
970 |
+
"order": null,
|
971 |
+
"overflow": null,
|
972 |
+
"overflow_x": null,
|
973 |
+
"overflow_y": null,
|
974 |
+
"padding": null,
|
975 |
+
"right": null,
|
976 |
+
"top": null,
|
977 |
+
"visibility": null,
|
978 |
+
"width": null
|
979 |
+
}
|
980 |
+
},
|
981 |
+
"134210510d49476e959dd7d032bbdbdc": {
|
982 |
+
"model_module": "@jupyter-widgets/controls",
|
983 |
+
"model_module_version": "1.5.0",
|
984 |
+
"model_name": "DescriptionStyleModel",
|
985 |
+
"state": {
|
986 |
+
"_model_module": "@jupyter-widgets/controls",
|
987 |
+
"_model_module_version": "1.5.0",
|
988 |
+
"_model_name": "DescriptionStyleModel",
|
989 |
+
"_view_count": null,
|
990 |
+
"_view_module": "@jupyter-widgets/base",
|
991 |
+
"_view_module_version": "1.2.0",
|
992 |
+
"_view_name": "StyleView",
|
993 |
+
"description_width": ""
|
994 |
+
}
|
995 |
+
},
|
996 |
+
"13b9c5395bca4c3ba21265240cb936cf": {
|
997 |
+
"model_module": "@jupyter-widgets/base",
|
998 |
+
"model_module_version": "1.2.0",
|
999 |
+
"model_name": "LayoutModel",
|
1000 |
+
"state": {
|
1001 |
+
"_model_module": "@jupyter-widgets/base",
|
1002 |
+
"_model_module_version": "1.2.0",
|
1003 |
+
"_model_name": "LayoutModel",
|
1004 |
+
"_view_count": null,
|
1005 |
+
"_view_module": "@jupyter-widgets/base",
|
1006 |
+
"_view_module_version": "1.2.0",
|
1007 |
+
"_view_name": "LayoutView",
|
1008 |
+
"align_content": null,
|
1009 |
+
"align_items": null,
|
1010 |
+
"align_self": null,
|
1011 |
+
"border": null,
|
1012 |
+
"bottom": null,
|
1013 |
+
"display": null,
|
1014 |
+
"flex": null,
|
1015 |
+
"flex_flow": null,
|
1016 |
+
"grid_area": null,
|
1017 |
+
"grid_auto_columns": null,
|
1018 |
+
"grid_auto_flow": null,
|
1019 |
+
"grid_auto_rows": null,
|
1020 |
+
"grid_column": null,
|
1021 |
+
"grid_gap": null,
|
1022 |
+
"grid_row": null,
|
1023 |
+
"grid_template_areas": null,
|
1024 |
+
"grid_template_columns": null,
|
1025 |
+
"grid_template_rows": null,
|
1026 |
+
"height": null,
|
1027 |
+
"justify_content": null,
|
1028 |
+
"justify_items": null,
|
1029 |
+
"left": null,
|
1030 |
+
"margin": null,
|
1031 |
+
"max_height": null,
|
1032 |
+
"max_width": null,
|
1033 |
+
"min_height": null,
|
1034 |
+
"min_width": null,
|
1035 |
+
"object_fit": null,
|
1036 |
+
"object_position": null,
|
1037 |
+
"order": null,
|
1038 |
+
"overflow": null,
|
1039 |
+
"overflow_x": null,
|
1040 |
+
"overflow_y": null,
|
1041 |
+
"padding": null,
|
1042 |
+
"right": null,
|
1043 |
+
"top": null,
|
1044 |
+
"visibility": null,
|
1045 |
+
"width": null
|
1046 |
+
}
|
1047 |
+
},
|
1048 |
+
"193aef33d9184055bb9223f56d456de6": {
|
1049 |
+
"model_module": "@jupyter-widgets/base",
|
1050 |
+
"model_module_version": "1.2.0",
|
1051 |
+
"model_name": "LayoutModel",
|
1052 |
+
"state": {
|
1053 |
+
"_model_module": "@jupyter-widgets/base",
|
1054 |
+
"_model_module_version": "1.2.0",
|
1055 |
+
"_model_name": "LayoutModel",
|
1056 |
+
"_view_count": null,
|
1057 |
+
"_view_module": "@jupyter-widgets/base",
|
1058 |
+
"_view_module_version": "1.2.0",
|
1059 |
+
"_view_name": "LayoutView",
|
1060 |
+
"align_content": null,
|
1061 |
+
"align_items": null,
|
1062 |
+
"align_self": null,
|
1063 |
+
"border": null,
|
1064 |
+
"bottom": null,
|
1065 |
+
"display": null,
|
1066 |
+
"flex": null,
|
1067 |
+
"flex_flow": null,
|
1068 |
+
"grid_area": null,
|
1069 |
+
"grid_auto_columns": null,
|
1070 |
+
"grid_auto_flow": null,
|
1071 |
+
"grid_auto_rows": null,
|
1072 |
+
"grid_column": null,
|
1073 |
+
"grid_gap": null,
|
1074 |
+
"grid_row": null,
|
1075 |
+
"grid_template_areas": null,
|
1076 |
+
"grid_template_columns": null,
|
1077 |
+
"grid_template_rows": null,
|
1078 |
+
"height": null,
|
1079 |
+
"justify_content": null,
|
1080 |
+
"justify_items": null,
|
1081 |
+
"left": null,
|
1082 |
+
"margin": null,
|
1083 |
+
"max_height": null,
|
1084 |
+
"max_width": null,
|
1085 |
+
"min_height": null,
|
1086 |
+
"min_width": null,
|
1087 |
+
"object_fit": null,
|
1088 |
+
"object_position": null,
|
1089 |
+
"order": null,
|
1090 |
+
"overflow": null,
|
1091 |
+
"overflow_x": null,
|
1092 |
+
"overflow_y": null,
|
1093 |
+
"padding": null,
|
1094 |
+
"right": null,
|
1095 |
+
"top": null,
|
1096 |
+
"visibility": null,
|
1097 |
+
"width": null
|
1098 |
+
}
|
1099 |
+
},
|
1100 |
+
"3fbabd8a8660461ba5e7bc08ef39139a": {
|
1101 |
+
"model_module": "@jupyter-widgets/controls",
|
1102 |
+
"model_module_version": "1.5.0",
|
1103 |
+
"model_name": "HBoxModel",
|
1104 |
+
"state": {
|
1105 |
+
"_dom_classes": [],
|
1106 |
+
"_model_module": "@jupyter-widgets/controls",
|
1107 |
+
"_model_module_version": "1.5.0",
|
1108 |
+
"_model_name": "HBoxModel",
|
1109 |
+
"_view_count": null,
|
1110 |
+
"_view_module": "@jupyter-widgets/controls",
|
1111 |
+
"_view_module_version": "1.5.0",
|
1112 |
+
"_view_name": "HBoxView",
|
1113 |
+
"box_style": "",
|
1114 |
+
"children": [
|
1115 |
+
"IPY_MODEL_df2365556ae242a2ab1a119f9a31a561",
|
1116 |
+
"IPY_MODEL_5f4b9d32df8f446e858e4c289dc282f9",
|
1117 |
+
"IPY_MODEL_5b588f83a15d42d9aca888e06bbd95ff"
|
1118 |
+
],
|
1119 |
+
"layout": "IPY_MODEL_ad073bca655540809e39f26538d2ec0d"
|
1120 |
+
}
|
1121 |
+
},
|
1122 |
+
"47a4586384274577a726c57605e7f8d9": {
|
1123 |
+
"model_module": "@jupyter-widgets/controls",
|
1124 |
+
"model_module_version": "1.5.0",
|
1125 |
+
"model_name": "DescriptionStyleModel",
|
1126 |
+
"state": {
|
1127 |
+
"_model_module": "@jupyter-widgets/controls",
|
1128 |
+
"_model_module_version": "1.5.0",
|
1129 |
+
"_model_name": "DescriptionStyleModel",
|
1130 |
+
"_view_count": null,
|
1131 |
+
"_view_module": "@jupyter-widgets/base",
|
1132 |
+
"_view_module_version": "1.2.0",
|
1133 |
+
"_view_name": "StyleView",
|
1134 |
+
"description_width": ""
|
1135 |
+
}
|
1136 |
+
},
|
1137 |
+
"4a172e8c6aa44e41a42fc1d9cf714fd0": {
|
1138 |
+
"model_module": "@jupyter-widgets/controls",
|
1139 |
+
"model_module_version": "1.5.0",
|
1140 |
+
"model_name": "HTMLModel",
|
1141 |
+
"state": {
|
1142 |
+
"_dom_classes": [],
|
1143 |
+
"_model_module": "@jupyter-widgets/controls",
|
1144 |
+
"_model_module_version": "1.5.0",
|
1145 |
+
"_model_name": "HTMLModel",
|
1146 |
+
"_view_count": null,
|
1147 |
+
"_view_module": "@jupyter-widgets/controls",
|
1148 |
+
"_view_module_version": "1.5.0",
|
1149 |
+
"_view_name": "HTMLView",
|
1150 |
+
"description": "",
|
1151 |
+
"description_tooltip": null,
|
1152 |
+
"layout": "IPY_MODEL_e7937a1bc68441a080374911a6563376",
|
1153 |
+
"placeholder": "",
|
1154 |
+
"style": "IPY_MODEL_e532ed7bfef34f67b5fcacd9534eb789",
|
1155 |
+
"value": " 108/108 [00:03<00:00, 33.70it/s]"
|
1156 |
+
}
|
1157 |
+
},
|
1158 |
+
"5b588f83a15d42d9aca888e06bbd95ff": {
|
1159 |
+
"model_module": "@jupyter-widgets/controls",
|
1160 |
+
"model_module_version": "1.5.0",
|
1161 |
+
"model_name": "HTMLModel",
|
1162 |
+
"state": {
|
1163 |
+
"_dom_classes": [],
|
1164 |
+
"_model_module": "@jupyter-widgets/controls",
|
1165 |
+
"_model_module_version": "1.5.0",
|
1166 |
+
"_model_name": "HTMLModel",
|
1167 |
+
"_view_count": null,
|
1168 |
+
"_view_module": "@jupyter-widgets/controls",
|
1169 |
+
"_view_module_version": "1.5.0",
|
1170 |
+
"_view_name": "HTMLView",
|
1171 |
+
"description": "",
|
1172 |
+
"description_tooltip": null,
|
1173 |
+
"layout": "IPY_MODEL_af9b6ae927dd4764b9692507791bc67e",
|
1174 |
+
"placeholder": "",
|
1175 |
+
"style": "IPY_MODEL_134210510d49476e959dd7d032bbdbdc",
|
1176 |
+
"value": " 14/14 [00:00<00:00, 21.41it/s]"
|
1177 |
+
}
|
1178 |
+
},
|
1179 |
+
"5c7973afd79349ed997a69120d0629b2": {
|
1180 |
+
"model_module": "@jupyter-widgets/controls",
|
1181 |
+
"model_module_version": "1.5.0",
|
1182 |
+
"model_name": "ProgressStyleModel",
|
1183 |
+
"state": {
|
1184 |
+
"_model_module": "@jupyter-widgets/controls",
|
1185 |
+
"_model_module_version": "1.5.0",
|
1186 |
+
"_model_name": "ProgressStyleModel",
|
1187 |
+
"_view_count": null,
|
1188 |
+
"_view_module": "@jupyter-widgets/base",
|
1189 |
+
"_view_module_version": "1.2.0",
|
1190 |
+
"_view_name": "StyleView",
|
1191 |
+
"bar_color": null,
|
1192 |
+
"description_width": ""
|
1193 |
+
}
|
1194 |
+
},
|
1195 |
+
"5f4b9d32df8f446e858e4c289dc282f9": {
|
1196 |
+
"model_module": "@jupyter-widgets/controls",
|
1197 |
+
"model_module_version": "1.5.0",
|
1198 |
+
"model_name": "FloatProgressModel",
|
1199 |
+
"state": {
|
1200 |
+
"_dom_classes": [],
|
1201 |
+
"_model_module": "@jupyter-widgets/controls",
|
1202 |
+
"_model_module_version": "1.5.0",
|
1203 |
+
"_model_name": "FloatProgressModel",
|
1204 |
+
"_view_count": null,
|
1205 |
+
"_view_module": "@jupyter-widgets/controls",
|
1206 |
+
"_view_module_version": "1.5.0",
|
1207 |
+
"_view_name": "ProgressView",
|
1208 |
+
"bar_style": "success",
|
1209 |
+
"description": "",
|
1210 |
+
"description_tooltip": null,
|
1211 |
+
"layout": "IPY_MODEL_96a3bdece738481db57e811ccb74a974",
|
1212 |
+
"max": 14,
|
1213 |
+
"min": 0,
|
1214 |
+
"orientation": "horizontal",
|
1215 |
+
"style": "IPY_MODEL_5c7973afd79349ed997a69120d0629b2",
|
1216 |
+
"value": 14
|
1217 |
+
}
|
1218 |
+
},
|
1219 |
+
"5f9bb065c2b74d2e8ded32e1306a7807": {
|
1220 |
+
"model_module": "@jupyter-widgets/controls",
|
1221 |
+
"model_module_version": "1.5.0",
|
1222 |
+
"model_name": "HBoxModel",
|
1223 |
+
"state": {
|
1224 |
+
"_dom_classes": [],
|
1225 |
+
"_model_module": "@jupyter-widgets/controls",
|
1226 |
+
"_model_module_version": "1.5.0",
|
1227 |
+
"_model_name": "HBoxModel",
|
1228 |
+
"_view_count": null,
|
1229 |
+
"_view_module": "@jupyter-widgets/controls",
|
1230 |
+
"_view_module_version": "1.5.0",
|
1231 |
+
"_view_name": "HBoxView",
|
1232 |
+
"box_style": "",
|
1233 |
+
"children": [
|
1234 |
+
"IPY_MODEL_73a06bc546a64f7f99a9e4a135319dcd",
|
1235 |
+
"IPY_MODEL_ce48deaf4d8c49cdae92bfdbb3a78df0",
|
1236 |
+
"IPY_MODEL_4a172e8c6aa44e41a42fc1d9cf714fd0"
|
1237 |
+
],
|
1238 |
+
"layout": "IPY_MODEL_0245f2604e4d49c8bd0210302746c47b"
|
1239 |
+
}
|
1240 |
+
},
|
1241 |
+
"73a06bc546a64f7f99a9e4a135319dcd": {
|
1242 |
+
"model_module": "@jupyter-widgets/controls",
|
1243 |
+
"model_module_version": "1.5.0",
|
1244 |
+
"model_name": "HTMLModel",
|
1245 |
+
"state": {
|
1246 |
+
"_dom_classes": [],
|
1247 |
+
"_model_module": "@jupyter-widgets/controls",
|
1248 |
+
"_model_module_version": "1.5.0",
|
1249 |
+
"_model_name": "HTMLModel",
|
1250 |
+
"_view_count": null,
|
1251 |
+
"_view_module": "@jupyter-widgets/controls",
|
1252 |
+
"_view_module_version": "1.5.0",
|
1253 |
+
"_view_name": "HTMLView",
|
1254 |
+
"description": "",
|
1255 |
+
"description_tooltip": null,
|
1256 |
+
"layout": "IPY_MODEL_e956dfab55084a9cbe33c8e331b511e7",
|
1257 |
+
"placeholder": "",
|
1258 |
+
"style": "IPY_MODEL_cb394578badd43a89850873ad2526542",
|
1259 |
+
"value": "Generating embeddings: 100%"
|
1260 |
+
}
|
1261 |
+
},
|
1262 |
+
"96a3bdece738481db57e811ccb74a974": {
|
1263 |
+
"model_module": "@jupyter-widgets/base",
|
1264 |
+
"model_module_version": "1.2.0",
|
1265 |
+
"model_name": "LayoutModel",
|
1266 |
+
"state": {
|
1267 |
+
"_model_module": "@jupyter-widgets/base",
|
1268 |
+
"_model_module_version": "1.2.0",
|
1269 |
+
"_model_name": "LayoutModel",
|
1270 |
+
"_view_count": null,
|
1271 |
+
"_view_module": "@jupyter-widgets/base",
|
1272 |
+
"_view_module_version": "1.2.0",
|
1273 |
+
"_view_name": "LayoutView",
|
1274 |
+
"align_content": null,
|
1275 |
+
"align_items": null,
|
1276 |
+
"align_self": null,
|
1277 |
+
"border": null,
|
1278 |
+
"bottom": null,
|
1279 |
+
"display": null,
|
1280 |
+
"flex": null,
|
1281 |
+
"flex_flow": null,
|
1282 |
+
"grid_area": null,
|
1283 |
+
"grid_auto_columns": null,
|
1284 |
+
"grid_auto_flow": null,
|
1285 |
+
"grid_auto_rows": null,
|
1286 |
+
"grid_column": null,
|
1287 |
+
"grid_gap": null,
|
1288 |
+
"grid_row": null,
|
1289 |
+
"grid_template_areas": null,
|
1290 |
+
"grid_template_columns": null,
|
1291 |
+
"grid_template_rows": null,
|
1292 |
+
"height": null,
|
1293 |
+
"justify_content": null,
|
1294 |
+
"justify_items": null,
|
1295 |
+
"left": null,
|
1296 |
+
"margin": null,
|
1297 |
+
"max_height": null,
|
1298 |
+
"max_width": null,
|
1299 |
+
"min_height": null,
|
1300 |
+
"min_width": null,
|
1301 |
+
"object_fit": null,
|
1302 |
+
"object_position": null,
|
1303 |
+
"order": null,
|
1304 |
+
"overflow": null,
|
1305 |
+
"overflow_x": null,
|
1306 |
+
"overflow_y": null,
|
1307 |
+
"padding": null,
|
1308 |
+
"right": null,
|
1309 |
+
"top": null,
|
1310 |
+
"visibility": null,
|
1311 |
+
"width": null
|
1312 |
+
}
|
1313 |
+
},
|
1314 |
+
"abfc9aa911ce4a5ea81c7c451f08295f": {
|
1315 |
+
"model_module": "@jupyter-widgets/controls",
|
1316 |
+
"model_module_version": "1.5.0",
|
1317 |
+
"model_name": "ProgressStyleModel",
|
1318 |
+
"state": {
|
1319 |
+
"_model_module": "@jupyter-widgets/controls",
|
1320 |
+
"_model_module_version": "1.5.0",
|
1321 |
+
"_model_name": "ProgressStyleModel",
|
1322 |
+
"_view_count": null,
|
1323 |
+
"_view_module": "@jupyter-widgets/base",
|
1324 |
+
"_view_module_version": "1.2.0",
|
1325 |
+
"_view_name": "StyleView",
|
1326 |
+
"bar_color": null,
|
1327 |
+
"description_width": ""
|
1328 |
+
}
|
1329 |
+
},
|
1330 |
+
"ad073bca655540809e39f26538d2ec0d": {
|
1331 |
+
"model_module": "@jupyter-widgets/base",
|
1332 |
+
"model_module_version": "1.2.0",
|
1333 |
+
"model_name": "LayoutModel",
|
1334 |
+
"state": {
|
1335 |
+
"_model_module": "@jupyter-widgets/base",
|
1336 |
+
"_model_module_version": "1.2.0",
|
1337 |
+
"_model_name": "LayoutModel",
|
1338 |
+
"_view_count": null,
|
1339 |
+
"_view_module": "@jupyter-widgets/base",
|
1340 |
+
"_view_module_version": "1.2.0",
|
1341 |
+
"_view_name": "LayoutView",
|
1342 |
+
"align_content": null,
|
1343 |
+
"align_items": null,
|
1344 |
+
"align_self": null,
|
1345 |
+
"border": null,
|
1346 |
+
"bottom": null,
|
1347 |
+
"display": null,
|
1348 |
+
"flex": null,
|
1349 |
+
"flex_flow": null,
|
1350 |
+
"grid_area": null,
|
1351 |
+
"grid_auto_columns": null,
|
1352 |
+
"grid_auto_flow": null,
|
1353 |
+
"grid_auto_rows": null,
|
1354 |
+
"grid_column": null,
|
1355 |
+
"grid_gap": null,
|
1356 |
+
"grid_row": null,
|
1357 |
+
"grid_template_areas": null,
|
1358 |
+
"grid_template_columns": null,
|
1359 |
+
"grid_template_rows": null,
|
1360 |
+
"height": null,
|
1361 |
+
"justify_content": null,
|
1362 |
+
"justify_items": null,
|
1363 |
+
"left": null,
|
1364 |
+
"margin": null,
|
1365 |
+
"max_height": null,
|
1366 |
+
"max_width": null,
|
1367 |
+
"min_height": null,
|
1368 |
+
"min_width": null,
|
1369 |
+
"object_fit": null,
|
1370 |
+
"object_position": null,
|
1371 |
+
"order": null,
|
1372 |
+
"overflow": null,
|
1373 |
+
"overflow_x": null,
|
1374 |
+
"overflow_y": null,
|
1375 |
+
"padding": null,
|
1376 |
+
"right": null,
|
1377 |
+
"top": null,
|
1378 |
+
"visibility": null,
|
1379 |
+
"width": null
|
1380 |
+
}
|
1381 |
+
},
|
1382 |
+
"af9b6ae927dd4764b9692507791bc67e": {
|
1383 |
+
"model_module": "@jupyter-widgets/base",
|
1384 |
+
"model_module_version": "1.2.0",
|
1385 |
+
"model_name": "LayoutModel",
|
1386 |
+
"state": {
|
1387 |
+
"_model_module": "@jupyter-widgets/base",
|
1388 |
+
"_model_module_version": "1.2.0",
|
1389 |
+
"_model_name": "LayoutModel",
|
1390 |
+
"_view_count": null,
|
1391 |
+
"_view_module": "@jupyter-widgets/base",
|
1392 |
+
"_view_module_version": "1.2.0",
|
1393 |
+
"_view_name": "LayoutView",
|
1394 |
+
"align_content": null,
|
1395 |
+
"align_items": null,
|
1396 |
+
"align_self": null,
|
1397 |
+
"border": null,
|
1398 |
+
"bottom": null,
|
1399 |
+
"display": null,
|
1400 |
+
"flex": null,
|
1401 |
+
"flex_flow": null,
|
1402 |
+
"grid_area": null,
|
1403 |
+
"grid_auto_columns": null,
|
1404 |
+
"grid_auto_flow": null,
|
1405 |
+
"grid_auto_rows": null,
|
1406 |
+
"grid_column": null,
|
1407 |
+
"grid_gap": null,
|
1408 |
+
"grid_row": null,
|
1409 |
+
"grid_template_areas": null,
|
1410 |
+
"grid_template_columns": null,
|
1411 |
+
"grid_template_rows": null,
|
1412 |
+
"height": null,
|
1413 |
+
"justify_content": null,
|
1414 |
+
"justify_items": null,
|
1415 |
+
"left": null,
|
1416 |
+
"margin": null,
|
1417 |
+
"max_height": null,
|
1418 |
+
"max_width": null,
|
1419 |
+
"min_height": null,
|
1420 |
+
"min_width": null,
|
1421 |
+
"object_fit": null,
|
1422 |
+
"object_position": null,
|
1423 |
+
"order": null,
|
1424 |
+
"overflow": null,
|
1425 |
+
"overflow_x": null,
|
1426 |
+
"overflow_y": null,
|
1427 |
+
"padding": null,
|
1428 |
+
"right": null,
|
1429 |
+
"top": null,
|
1430 |
+
"visibility": null,
|
1431 |
+
"width": null
|
1432 |
+
}
|
1433 |
+
},
|
1434 |
+
"cb394578badd43a89850873ad2526542": {
|
1435 |
+
"model_module": "@jupyter-widgets/controls",
|
1436 |
+
"model_module_version": "1.5.0",
|
1437 |
+
"model_name": "DescriptionStyleModel",
|
1438 |
+
"state": {
|
1439 |
+
"_model_module": "@jupyter-widgets/controls",
|
1440 |
+
"_model_module_version": "1.5.0",
|
1441 |
+
"_model_name": "DescriptionStyleModel",
|
1442 |
+
"_view_count": null,
|
1443 |
+
"_view_module": "@jupyter-widgets/base",
|
1444 |
+
"_view_module_version": "1.2.0",
|
1445 |
+
"_view_name": "StyleView",
|
1446 |
+
"description_width": ""
|
1447 |
+
}
|
1448 |
+
},
|
1449 |
+
"ce48deaf4d8c49cdae92bfdbb3a78df0": {
|
1450 |
+
"model_module": "@jupyter-widgets/controls",
|
1451 |
+
"model_module_version": "1.5.0",
|
1452 |
+
"model_name": "FloatProgressModel",
|
1453 |
+
"state": {
|
1454 |
+
"_dom_classes": [],
|
1455 |
+
"_model_module": "@jupyter-widgets/controls",
|
1456 |
+
"_model_module_version": "1.5.0",
|
1457 |
+
"_model_name": "FloatProgressModel",
|
1458 |
+
"_view_count": null,
|
1459 |
+
"_view_module": "@jupyter-widgets/controls",
|
1460 |
+
"_view_module_version": "1.5.0",
|
1461 |
+
"_view_name": "ProgressView",
|
1462 |
+
"bar_style": "success",
|
1463 |
+
"description": "",
|
1464 |
+
"description_tooltip": null,
|
1465 |
+
"layout": "IPY_MODEL_193aef33d9184055bb9223f56d456de6",
|
1466 |
+
"max": 108,
|
1467 |
+
"min": 0,
|
1468 |
+
"orientation": "horizontal",
|
1469 |
+
"style": "IPY_MODEL_abfc9aa911ce4a5ea81c7c451f08295f",
|
1470 |
+
"value": 108
|
1471 |
+
}
|
1472 |
+
},
|
1473 |
+
"df2365556ae242a2ab1a119f9a31a561": {
|
1474 |
+
"model_module": "@jupyter-widgets/controls",
|
1475 |
+
"model_module_version": "1.5.0",
|
1476 |
+
"model_name": "HTMLModel",
|
1477 |
+
"state": {
|
1478 |
+
"_dom_classes": [],
|
1479 |
+
"_model_module": "@jupyter-widgets/controls",
|
1480 |
+
"_model_module_version": "1.5.0",
|
1481 |
+
"_model_name": "HTMLModel",
|
1482 |
+
"_view_count": null,
|
1483 |
+
"_view_module": "@jupyter-widgets/controls",
|
1484 |
+
"_view_module_version": "1.5.0",
|
1485 |
+
"_view_name": "HTMLView",
|
1486 |
+
"description": "",
|
1487 |
+
"description_tooltip": null,
|
1488 |
+
"layout": "IPY_MODEL_13b9c5395bca4c3ba21265240cb936cf",
|
1489 |
+
"placeholder": "",
|
1490 |
+
"style": "IPY_MODEL_47a4586384274577a726c57605e7f8d9",
|
1491 |
+
"value": "Parsing nodes: 100%"
|
1492 |
+
}
|
1493 |
+
},
|
1494 |
+
"e532ed7bfef34f67b5fcacd9534eb789": {
|
1495 |
+
"model_module": "@jupyter-widgets/controls",
|
1496 |
+
"model_module_version": "1.5.0",
|
1497 |
+
"model_name": "DescriptionStyleModel",
|
1498 |
+
"state": {
|
1499 |
+
"_model_module": "@jupyter-widgets/controls",
|
1500 |
+
"_model_module_version": "1.5.0",
|
1501 |
+
"_model_name": "DescriptionStyleModel",
|
1502 |
+
"_view_count": null,
|
1503 |
+
"_view_module": "@jupyter-widgets/base",
|
1504 |
+
"_view_module_version": "1.2.0",
|
1505 |
+
"_view_name": "StyleView",
|
1506 |
+
"description_width": ""
|
1507 |
+
}
|
1508 |
+
},
|
1509 |
+
"e7937a1bc68441a080374911a6563376": {
|
1510 |
+
"model_module": "@jupyter-widgets/base",
|
1511 |
+
"model_module_version": "1.2.0",
|
1512 |
+
"model_name": "LayoutModel",
|
1513 |
+
"state": {
|
1514 |
+
"_model_module": "@jupyter-widgets/base",
|
1515 |
+
"_model_module_version": "1.2.0",
|
1516 |
+
"_model_name": "LayoutModel",
|
1517 |
+
"_view_count": null,
|
1518 |
+
"_view_module": "@jupyter-widgets/base",
|
1519 |
+
"_view_module_version": "1.2.0",
|
1520 |
+
"_view_name": "LayoutView",
|
1521 |
+
"align_content": null,
|
1522 |
+
"align_items": null,
|
1523 |
+
"align_self": null,
|
1524 |
+
"border": null,
|
1525 |
+
"bottom": null,
|
1526 |
+
"display": null,
|
1527 |
+
"flex": null,
|
1528 |
+
"flex_flow": null,
|
1529 |
+
"grid_area": null,
|
1530 |
+
"grid_auto_columns": null,
|
1531 |
+
"grid_auto_flow": null,
|
1532 |
+
"grid_auto_rows": null,
|
1533 |
+
"grid_column": null,
|
1534 |
+
"grid_gap": null,
|
1535 |
+
"grid_row": null,
|
1536 |
+
"grid_template_areas": null,
|
1537 |
+
"grid_template_columns": null,
|
1538 |
+
"grid_template_rows": null,
|
1539 |
+
"height": null,
|
1540 |
+
"justify_content": null,
|
1541 |
+
"justify_items": null,
|
1542 |
+
"left": null,
|
1543 |
+
"margin": null,
|
1544 |
+
"max_height": null,
|
1545 |
+
"max_width": null,
|
1546 |
+
"min_height": null,
|
1547 |
+
"min_width": null,
|
1548 |
+
"object_fit": null,
|
1549 |
+
"object_position": null,
|
1550 |
+
"order": null,
|
1551 |
+
"overflow": null,
|
1552 |
+
"overflow_x": null,
|
1553 |
+
"overflow_y": null,
|
1554 |
+
"padding": null,
|
1555 |
+
"right": null,
|
1556 |
+
"top": null,
|
1557 |
+
"visibility": null,
|
1558 |
+
"width": null
|
1559 |
+
}
|
1560 |
+
},
|
1561 |
+
"e956dfab55084a9cbe33c8e331b511e7": {
|
1562 |
+
"model_module": "@jupyter-widgets/base",
|
1563 |
+
"model_module_version": "1.2.0",
|
1564 |
+
"model_name": "LayoutModel",
|
1565 |
+
"state": {
|
1566 |
+
"_model_module": "@jupyter-widgets/base",
|
1567 |
+
"_model_module_version": "1.2.0",
|
1568 |
+
"_model_name": "LayoutModel",
|
1569 |
+
"_view_count": null,
|
1570 |
+
"_view_module": "@jupyter-widgets/base",
|
1571 |
+
"_view_module_version": "1.2.0",
|
1572 |
+
"_view_name": "LayoutView",
|
1573 |
+
"align_content": null,
|
1574 |
+
"align_items": null,
|
1575 |
+
"align_self": null,
|
1576 |
+
"border": null,
|
1577 |
+
"bottom": null,
|
1578 |
+
"display": null,
|
1579 |
+
"flex": null,
|
1580 |
+
"flex_flow": null,
|
1581 |
+
"grid_area": null,
|
1582 |
+
"grid_auto_columns": null,
|
1583 |
+
"grid_auto_flow": null,
|
1584 |
+
"grid_auto_rows": null,
|
1585 |
+
"grid_column": null,
|
1586 |
+
"grid_gap": null,
|
1587 |
+
"grid_row": null,
|
1588 |
+
"grid_template_areas": null,
|
1589 |
+
"grid_template_columns": null,
|
1590 |
+
"grid_template_rows": null,
|
1591 |
+
"height": null,
|
1592 |
+
"justify_content": null,
|
1593 |
+
"justify_items": null,
|
1594 |
+
"left": null,
|
1595 |
+
"margin": null,
|
1596 |
+
"max_height": null,
|
1597 |
+
"max_width": null,
|
1598 |
+
"min_height": null,
|
1599 |
+
"min_width": null,
|
1600 |
+
"object_fit": null,
|
1601 |
+
"object_position": null,
|
1602 |
+
"order": null,
|
1603 |
+
"overflow": null,
|
1604 |
+
"overflow_x": null,
|
1605 |
+
"overflow_y": null,
|
1606 |
+
"padding": null,
|
1607 |
+
"right": null,
|
1608 |
+
"top": null,
|
1609 |
+
"visibility": null,
|
1610 |
+
"width": null
|
1611 |
+
}
|
1612 |
+
}
|
1613 |
+
}
|
1614 |
+
}
|
1615 |
+
},
|
1616 |
+
"nbformat": 4,
|
1617 |
+
"nbformat_minor": 0
|
1618 |
+
}
|
notebooks/15-Use_OpenSource_Models.ipynb
ADDED
The diff for this file is too large to render.
See raw diff
notebooks/17-Using_LLMs_to_rank_chunks_as_the_Judge.ipynb
ADDED
@@ -0,0 +1,830 @@
1 |
+
{
|
2 |
+
"nbformat": 4,
|
3 |
+
"nbformat_minor": 0,
|
4 |
+
"metadata": {
|
5 |
+
"colab": {
|
6 |
+
"provenance": [],
|
7 |
+
"authorship_tag": "ABX9TyMhd0xkjZD3StMhSoQIPv+w",
|
8 |
+
"include_colab_link": true
|
9 |
+
},
|
10 |
+
"kernelspec": {
|
11 |
+
"name": "python3",
|
12 |
+
"display_name": "Python 3"
|
13 |
+
},
|
14 |
+
"language_info": {
|
15 |
+
"name": "python"
|
16 |
+
}
|
17 |
+
},
|
18 |
+
"cells": [
|
19 |
+
{
|
20 |
+
"cell_type": "markdown",
|
21 |
+
"metadata": {
|
22 |
+
"id": "view-in-github",
|
23 |
+
"colab_type": "text"
|
24 |
+
},
|
25 |
+
"source": [
|
26 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/17-Using_LLMs_to_rank_chunks_as_the_Judge.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
27 |
+
]
|
28 |
+
},
|
29 |
+
{
|
30 |
+
"cell_type": "markdown",
|
31 |
+
"source": [
|
32 |
+
"# Install Packages and Setup Variables"
|
33 |
+
],
|
34 |
+
"metadata": {
|
35 |
+
"id": "0FbELaf7TrW7"
|
36 |
+
}
|
37 |
+
},
|
38 |
+
{
|
39 |
+
"cell_type": "code",
|
40 |
+
"execution_count": null,
|
41 |
+
"metadata": {
|
42 |
+
"id": "Yubz8AanRRSW",
|
43 |
+
"colab": {
|
44 |
+
"base_uri": "https://localhost:8080/"
|
45 |
+
},
|
46 |
+
"outputId": "2487c4fd-0fb5-4894-ffe6-c747f4adb952"
|
47 |
+
},
|
48 |
+
"outputs": [
|
49 |
+
{
|
50 |
+
"output_type": "stream",
|
51 |
+
"name": "stdout",
|
52 |
+
"text": [
|
53 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m226.7/226.7 kB\u001b[0m \u001b[31m4.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
54 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.8/1.8 MB\u001b[0m \u001b[31m11.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
55 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m211.1/211.1 kB\u001b[0m \u001b[31m6.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
56 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m508.6/508.6 kB\u001b[0m \u001b[31m7.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
57 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m15.4/15.4 MB\u001b[0m \u001b[31m15.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
58 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.0/2.0 MB\u001b[0m \u001b[31m23.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
59 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.6/75.6 kB\u001b[0m \u001b[31m2.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
60 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m81.3/81.3 kB\u001b[0m \u001b[31m6.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
61 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m97.6/97.6 kB\u001b[0m \u001b[31m5.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
62 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
63 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.4/7.4 MB\u001b[0m \u001b[31m28.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
64 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
65 |
+
" Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
66 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.4/2.4 MB\u001b[0m \u001b[31m55.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
67 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m91.8/91.8 kB\u001b[0m \u001b[31m7.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
68 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m60.8/60.8 kB\u001b[0m \u001b[31m5.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
69 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m41.3/41.3 kB\u001b[0m \u001b[31m3.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
70 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.4/5.4 MB\u001b[0m \u001b[31m56.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
71 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.8/6.8 MB\u001b[0m \u001b[31m47.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
72 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m60.1/60.1 kB\u001b[0m \u001b[31m4.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
73 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m106.1/106.1 kB\u001b[0m \u001b[31m1.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
74 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.3/67.3 kB\u001b[0m \u001b[31m4.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
75 |
+
"\u001b[?25h Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
|
76 |
+
" Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
|
77 |
+
" Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
|
78 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m698.9/698.9 kB\u001b[0m \u001b[31m55.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
79 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m80.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
80 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.6/67.6 kB\u001b[0m \u001b[31m9.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
81 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m71.9/71.9 kB\u001b[0m \u001b[31m8.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
82 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
83 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.9/77.9 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
84 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m6.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
85 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m141.9/141.9 kB\u001b[0m \u001b[31m16.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
86 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m290.4/290.4 kB\u001b[0m \u001b[31m26.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
87 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m46.0/46.0 kB\u001b[0m \u001b[31m5.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
88 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m50.8/50.8 kB\u001b[0m \u001b[31m6.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
89 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m341.4/341.4 kB\u001b[0m \u001b[31m35.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
90 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.4/3.4 MB\u001b[0m \u001b[31m92.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
91 |
+
"\u001b[2K \u001b[90m━━���━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m78.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
92 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m130.2/130.2 kB\u001b[0m \u001b[31m16.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
93 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.8/86.8 kB\u001b[0m \u001b[31m11.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
94 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.4/49.4 kB\u001b[0m \u001b[31m5.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
95 |
+
"\u001b[?25h Building wheel for tinysegmenter (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
96 |
+
" Building wheel for feedfinder2 (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
97 |
+
" Building wheel for jieba3k (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
98 |
+
" Building wheel for pypika (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
|
99 |
+
" Building wheel for sgmllib3k (setup.py) ... \u001b[?25l\u001b[?25hdone\n"
|
100 |
+
]
|
101 |
+
}
|
102 |
+
],
|
103 |
+
"source": [
|
104 |
+
"!pip install -q llama-index==0.10.30 openai==1.12.0 tiktoken==0.6.0 chromadb==0.4.21 llama-index-vector-stores-chroma==0.1.7"
|
105 |
+
]
|
106 |
+
},
|
107 |
+
{
|
108 |
+
"cell_type": "code",
|
109 |
+
"source": [
|
110 |
+
"import os\n",
|
111 |
+
"\n",
|
112 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
113 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"[OPENAI_API_KEY]\""
|
114 |
+
],
|
115 |
+
"metadata": {
|
116 |
+
"id": "xLXFuRW-TpUu"
|
117 |
+
},
|
118 |
+
"execution_count": null,
|
119 |
+
"outputs": []
|
120 |
+
},
|
121 |
+
{
|
122 |
+
"cell_type": "markdown",
|
123 |
+
"source": [
|
124 |
+
"# Load a Model"
|
125 |
+
],
|
126 |
+
"metadata": {
|
127 |
+
"id": "r6GCYYqqTuMc"
|
128 |
+
}
|
129 |
+
},
|
130 |
+
{
|
131 |
+
"cell_type": "code",
|
132 |
+
"source": [
|
133 |
+
"from llama_index.llms.openai import OpenAI\n",
|
134 |
+
"\n",
|
135 |
+
"llm = OpenAI(temperature=0.9, model=\"gpt-3.5-turbo\", max_tokens=512)"
|
136 |
+
],
|
137 |
+
"metadata": {
|
138 |
+
"id": "pupJpdZaTu5m"
|
139 |
+
},
|
140 |
+
"execution_count": null,
|
141 |
+
"outputs": []
|
142 |
+
},
|
143 |
+
{
|
144 |
+
"cell_type": "markdown",
|
145 |
+
"source": [
|
146 |
+
"# Create a Vector Store"
|
147 |
+
],
|
148 |
+
"metadata": {
|
149 |
+
"id": "gaKYO-KrTwsn"
|
150 |
+
}
|
151 |
+
},
|
152 |
+
{
|
153 |
+
"cell_type": "code",
|
154 |
+
"source": [
|
155 |
+
"import chromadb\n",
|
156 |
+
"\n",
|
157 |
+
"# create client and a new collection\n",
|
158 |
+
"# chromadb.EphemeralClient saves data in-memory.\n",
|
159 |
+
"chroma_client = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
160 |
+
"chroma_collection = chroma_client.create_collection(\"mini-llama-articles\")"
|
161 |
+
],
|
162 |
+
"metadata": {
|
163 |
+
"id": "npCqCZSPZKR0"
|
164 |
+
},
|
165 |
+
"execution_count": null,
|
166 |
+
"outputs": []
|
167 |
+
},
|
168 |
+
{
|
169 |
+
"cell_type": "code",
|
170 |
+
"source": [
|
171 |
+
"from llama_index.vector_stores.chroma import ChromaVectorStore\n",
|
172 |
+
"\n",
|
173 |
+
"# Define a storage context object using the created vector database.\n",
|
174 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
175 |
+
],
|
176 |
+
"metadata": {
|
177 |
+
"id": "dG9eKSVrZMs1"
|
178 |
+
},
|
179 |
+
"execution_count": null,
|
180 |
+
"outputs": []
|
181 |
+
},
|
182 |
+
{
|
183 |
+
"cell_type": "markdown",
|
184 |
+
"source": [
|
185 |
+
"# Load the Dataset (CSV)"
|
186 |
+
],
|
187 |
+
"metadata": {
|
188 |
+
"id": "HmiFENBdZMAk"
|
189 |
+
}
|
190 |
+
},
|
191 |
+
{
|
192 |
+
"cell_type": "markdown",
|
193 |
+
"source": [
|
194 |
+
"## Download"
|
195 |
+
],
|
196 |
+
"metadata": {
|
197 |
+
"id": "X-20isiTZRIa"
|
198 |
+
}
|
199 |
+
},
|
200 |
+
{
|
201 |
+
"cell_type": "markdown",
|
202 |
+
"source": [
|
203 |
+
"The dataset includes several articles from the TowardsAI blog, which provide an in-depth explanation of the LLaMA2 model. Read the dataset as a long string."
|
204 |
+
],
|
205 |
+
"metadata": {
|
206 |
+
"id": "-lWKX814ZURc"
|
207 |
+
}
|
208 |
+
},
|
209 |
+
{
|
210 |
+
"cell_type": "code",
|
211 |
+
"source": [
|
212 |
+
"!wget https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv"
|
213 |
+
],
|
214 |
+
"metadata": {
|
215 |
+
"id": "fmlEL849ZPrH",
|
216 |
+
"colab": {
|
217 |
+
"base_uri": "https://localhost:8080/"
|
218 |
+
},
|
219 |
+
"outputId": "63039988-ab7a-4ecf-deb0-d9510628ecb8"
|
220 |
+
},
|
221 |
+
"execution_count": null,
|
222 |
+
"outputs": [
|
223 |
+
{
|
224 |
+
"output_type": "stream",
|
225 |
+
"name": "stdout",
|
226 |
+
"text": [
|
227 |
+
"--2024-04-30 18:37:36-- https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/mini-llama-articles.csv\n",
|
228 |
+
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.111.133, ...\n",
|
229 |
+
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.\n",
|
230 |
+
"HTTP request sent, awaiting response... 200 OK\n",
|
231 |
+
"Length: 173646 (170K) [text/plain]\n",
|
232 |
+
"Saving to: ‘mini-llama-articles.csv’\n",
|
233 |
+
"\n",
|
234 |
+
"mini-llama-articles 100%[===================>] 169.58K --.-KB/s in 0.01s \n",
|
235 |
+
"\n",
|
236 |
+
"2024-04-30 18:37:37 (11.3 MB/s) - ‘mini-llama-articles.csv’ saved [173646/173646]\n",
|
237 |
+
"\n"
|
238 |
+
]
|
239 |
+
}
|
240 |
+
]
|
241 |
+
},
|
242 |
+
{
|
243 |
+
"cell_type": "markdown",
|
244 |
+
"source": [
|
245 |
+
"# Read File"
|
246 |
+
],
|
247 |
+
"metadata": {
|
248 |
+
"id": "r9PL_eiTZW7y"
|
249 |
+
}
|
250 |
+
},
|
251 |
+
{
|
252 |
+
"cell_type": "code",
|
253 |
+
"source": [
|
254 |
+
"import csv\n",
|
255 |
+
"\n",
|
256 |
+
"rows = []\n",
|
257 |
+
"\n",
|
258 |
+
"# Load the file as a JSON\n",
|
259 |
+
"with open(\"./mini-llama-articles.csv\", mode=\"r\", encoding=\"utf-8\") as file:\n",
|
260 |
+
" csv_reader = csv.reader(file)\n",
|
261 |
+
"\n",
|
262 |
+
" for idx, row in enumerate( csv_reader ):\n",
|
263 |
+
" if idx == 0: continue; # Skip header row\n",
|
264 |
+
" rows.append( row )\n",
|
265 |
+
"\n",
|
266 |
+
"# The number of characters in the dataset.\n",
|
267 |
+
"len( rows )"
|
268 |
+
],
|
269 |
+
"metadata": {
|
270 |
+
"id": "x5IwXJi8ZQGh"
|
271 |
+
},
|
272 |
+
"execution_count": null,
|
273 |
+
"outputs": []
|
274 |
+
},
|
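Each CSV row is expected to carry four columns (title, text, url, and source_name); the next cell maps these onto a Document's text and metadata fields.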
275 |
+
{
|
276 |
+
"cell_type": "markdown",
|
277 |
+
"source": [
|
278 |
+
"# Convert to Document obj"
|
279 |
+
],
|
280 |
+
"metadata": {
|
281 |
+
"id": "ktYUZzzSZaDW"
|
282 |
+
}
|
283 |
+
},
|
284 |
+
{
|
285 |
+
"cell_type": "code",
|
286 |
+
"source": [
|
287 |
+
"from llama_index.core.schema import Document\n",
|
288 |
+
"\n",
|
289 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
290 |
+
"documents = [Document(text=row[1], metadata={\"title\": row[0], \"url\": row[2], \"source_name\": row[3]}) for row in rows]"
|
291 |
+
],
|
292 |
+
"metadata": {
|
293 |
+
"id": "oO10Q-UyZQEB"
|
294 |
+
},
|
295 |
+
"execution_count": null,
|
296 |
+
"outputs": []
|
297 |
+
},
|
298 |
+
{
|
299 |
+
"cell_type": "markdown",
|
300 |
+
"source": [
|
301 |
+
"# Transforming"
|
302 |
+
],
|
303 |
+
"metadata": {
|
304 |
+
"id": "0PnovZ0tZdAT"
|
305 |
+
}
|
306 |
+
},
|
307 |
+
{
|
308 |
+
"cell_type": "code",
|
309 |
+
"source": [
|
310 |
+
"from llama_index.core.node_parser import TokenTextSplitter\n",
|
311 |
+
"\n",
|
312 |
+
"# Define the splitter object that split the text into segments with 512 tokens,\n",
|
313 |
+
"# with a 128 overlap between the segments.\n",
|
314 |
+
"text_splitter = TokenTextSplitter(\n",
|
315 |
+
" separator=\" \", chunk_size=512, chunk_overlap=128\n",
|
316 |
+
")"
|
317 |
+
],
|
318 |
+
"metadata": {
|
319 |
+
"id": "wzOQZH6VZQBm"
|
320 |
+
},
|
321 |
+
"execution_count": null,
|
322 |
+
"outputs": []
|
323 |
+
},
|
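As a rough illustration (the exact count depends on the tokenizer), with chunk_size=512 and chunk_overlap=128 each new chunk starts 512 - 128 = 384 tokens after the previous one, so an article of about 2,000 tokens yields roughly (2000 - 512) / 384 + 1 ≈ 5 chunks.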
324 |
+
{
|
325 |
+
"cell_type": "code",
|
326 |
+
"source": [
|
327 |
+
"from llama_index.core.extractors import (\n",
|
328 |
+
" SummaryExtractor,\n",
|
329 |
+
" QuestionsAnsweredExtractor,\n",
|
330 |
+
" KeywordExtractor,\n",
|
331 |
+
")\n",
|
332 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
333 |
+
"from llama_index.core.ingestion import IngestionPipeline\n",
|
334 |
+
"\n",
|
335 |
+
"# Create the pipeline to apply the transformation on each chunk,\n",
|
336 |
+
"# and store the transformed text in the chroma vector store.\n",
|
337 |
+
"pipeline = IngestionPipeline(\n",
|
338 |
+
" transformations=[\n",
|
339 |
+
" text_splitter,\n",
|
340 |
+
" QuestionsAnsweredExtractor(questions=3, llm=llm),\n",
|
341 |
+
" SummaryExtractor(summaries=[\"prev\", \"self\"], llm=llm),\n",
|
342 |
+
" KeywordExtractor(keywords=10, llm=llm),\n",
|
343 |
+
" OpenAIEmbedding(),\n",
|
344 |
+
" ],\n",
|
345 |
+
" vector_store=vector_store\n",
|
346 |
+
")\n",
|
347 |
+
"\n",
|
348 |
+
"# Run the transformation pipeline.\n",
|
349 |
+
"nodes = pipeline.run(documents=documents, show_progress=True);"
|
350 |
+
],
|
351 |
+
"metadata": {
|
352 |
+
"id": "l6UP7M_rZeXS"
|
353 |
+
},
|
354 |
+
"execution_count": null,
|
355 |
+
"outputs": []
|
356 |
+
},
|
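The transformations run in sequence on every document: the splitter produces the nodes, each extractor calls the LLM to attach questions, summaries, and keywords as node metadata, and the embedding step computes the vectors that are finally written into the Chroma store.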
357 |
+
{
|
358 |
+
"cell_type": "code",
|
359 |
+
"source": [
|
360 |
+
"len( nodes )"
|
361 |
+
],
|
362 |
+
"metadata": {
|
363 |
+
"id": "GcUUhs88ZeUs"
|
364 |
+
},
|
365 |
+
"execution_count": null,
|
366 |
+
"outputs": []
|
367 |
+
},
|
368 |
+
{
|
369 |
+
"cell_type": "code",
|
370 |
+
"source": [
|
371 |
+
"# Compress the vector store directory to a zip file to be able to download and use later.\n",
|
372 |
+
"!zip -r vectorstore.zip mini-llama-articles"
|
373 |
+
],
|
374 |
+
"metadata": {
|
375 |
+
"id": "B_P8Cil-ZeQM"
|
376 |
+
},
|
377 |
+
"execution_count": null,
|
378 |
+
"outputs": []
|
379 |
+
},
|
380 |
+
{
|
381 |
+
"cell_type": "markdown",
|
382 |
+
"source": [
|
383 |
+
"# Load Indexes"
|
384 |
+
],
|
385 |
+
"metadata": {
|
386 |
+
"id": "YSGHsZMMZj4E"
|
387 |
+
}
|
388 |
+
},
|
389 |
+
{
|
390 |
+
"cell_type": "markdown",
|
391 |
+
"source": [
|
392 |
+
"If you have already uploaded the zip file for the vector store checkpoint, please uncomment the code in the following cell block to extract its contents. After doing so, you will be able to load the dataset from local storage."
|
393 |
+
],
|
394 |
+
"metadata": {
|
395 |
+
"id": "J81Yvj0AZlvK"
|
396 |
+
}
|
397 |
+
},
|
398 |
+
{
|
399 |
+
"cell_type": "code",
|
400 |
+
"source": [
|
401 |
+
"# !unzip vectorstore.zip"
|
402 |
+
],
|
403 |
+
"metadata": {
|
404 |
+
"id": "M8iaOOGyZeNp",
|
405 |
+
"colab": {
|
406 |
+
"base_uri": "https://localhost:8080/"
|
407 |
+
},
|
408 |
+
"outputId": "6a117a0b-161a-4889-daf6-baf94ae00d2a"
|
409 |
+
},
|
410 |
+
"execution_count": null,
|
411 |
+
"outputs": [
|
412 |
+
{
|
413 |
+
"output_type": "stream",
|
414 |
+
"name": "stdout",
|
415 |
+
"text": [
|
416 |
+
"Archive: vectorstore.zip\n",
|
417 |
+
" creating: mini-llama-articles/\n",
|
418 |
+
" creating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/\n",
|
419 |
+
" inflating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/data_level0.bin \n",
|
420 |
+
" inflating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/header.bin \n",
|
421 |
+
" extracting: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/link_lists.bin \n",
|
422 |
+
" inflating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/length.bin \n",
|
423 |
+
" inflating: mini-llama-articles/chroma.sqlite3 \n"
|
424 |
+
]
|
425 |
+
}
|
426 |
+
]
|
427 |
+
},
|
428 |
+
{
|
429 |
+
"cell_type": "code",
|
430 |
+
"source": [
|
431 |
+
"# Load the vector store from the local storage.\n",
|
432 |
+
"db = chromadb.PersistentClient(path=\"./mini-llama-articles\")\n",
|
433 |
+
"chroma_collection = db.get_or_create_collection(\"mini-llama-articles\")\n",
|
434 |
+
"vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
|
435 |
+
],
|
436 |
+
"metadata": {
|
437 |
+
"id": "6tzS_EKPZeLS"
|
438 |
+
},
|
439 |
+
"execution_count": null,
|
440 |
+
"outputs": []
|
441 |
+
},
|
442 |
+
{
|
443 |
+
"cell_type": "code",
|
444 |
+
"source": [
|
445 |
+
"from llama_index.core import VectorStoreIndex\n",
|
446 |
+
"\n",
|
447 |
+
"# Create the index based on the vector store.\n",
|
448 |
+
"index = VectorStoreIndex.from_vector_store(vector_store)"
|
449 |
+
],
|
450 |
+
"metadata": {
|
451 |
+
"id": "0T6FL7J3ZrNK"
|
452 |
+
},
|
453 |
+
"execution_count": null,
|
454 |
+
"outputs": []
|
455 |
+
},
|
456 |
+
{
|
457 |
+
"cell_type": "markdown",
|
458 |
+
"source": [
|
459 |
+
"# RankGPT"
|
460 |
+
],
|
461 |
+
"metadata": {
|
462 |
+
"id": "w2XBkzNwLle5"
|
463 |
+
}
|
464 |
+
},
|
465 |
+
{
|
466 |
+
"cell_type": "code",
|
467 |
+
"source": [
|
468 |
+
"from llama_index.core.postprocessor.rankGPT_rerank import RankGPTRerank\n",
|
469 |
+
"\n",
|
470 |
+
"rankGPT = RankGPTRerank(top_n=3, llm=OpenAI(model=\"gpt-3.5-turbo\"))"
|
471 |
+
],
|
472 |
+
"metadata": {
|
473 |
+
"id": "_it2CxTtLmHT"
|
474 |
+
},
|
475 |
+
"execution_count": null,
|
476 |
+
"outputs": []
|
477 |
+
},
|
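RankGPT treats reranking as a listwise task: the LLM is shown the query together with the candidate passages and asked to produce a relevance ordering, from which only the top_n=3 nodes are kept.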
478 |
+
{
|
479 |
+
"cell_type": "code",
|
480 |
+
"source": [
|
481 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
482 |
+
"# and using a LLM to formulate the final answer.\n",
|
483 |
+
"# The `node_postprocessors` function will be applied to the retrieved nodes.\n",
|
484 |
+
"query_engine = index.as_query_engine(\n",
|
485 |
+
" similarity_top_k=10,\n",
|
486 |
+
" node_postprocessors=[rankGPT]\n",
|
487 |
+
")\n",
|
488 |
+
"\n",
|
489 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
490 |
+
],
|
491 |
+
"metadata": {
|
492 |
+
"id": "YA3M9m9CL6AJ"
|
493 |
+
},
|
494 |
+
"execution_count": null,
|
495 |
+
"outputs": []
|
496 |
+
},
|
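Here similarity_top_k=10 first retrieves ten candidate nodes from the vector store by embedding similarity; the RankGPT postprocessor then re-ranks those candidates with the LLM and passes only its top three on to answer synthesis.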
497 |
+
{
|
498 |
+
"cell_type": "code",
|
499 |
+
"source": [
|
500 |
+
"res.response"
|
501 |
+
],
|
502 |
+
"metadata": {
|
503 |
+
"colab": {
|
504 |
+
"base_uri": "https://localhost:8080/",
|
505 |
+
"height": 53
|
506 |
+
},
|
507 |
+
"id": "wgyjv9e6MCVm",
|
508 |
+
"outputId": "70723d5e-9d16-4123-884b-0d65cd91a405"
|
509 |
+
},
|
510 |
+
"execution_count": null,
|
511 |
+
"outputs": [
|
512 |
+
{
|
513 |
+
"output_type": "execute_result",
|
514 |
+
"data": {
|
515 |
+
"text/plain": [
|
516 |
+
"'The Llama 2 model has four different parameter sizes: 7 billion, 13 billion, 34 billion, and 70 billion.'"
|
517 |
+
],
|
518 |
+
"application/vnd.google.colaboratory.intrinsic+json": {
|
519 |
+
"type": "string"
|
520 |
+
}
|
521 |
+
},
|
522 |
+
"metadata": {},
|
523 |
+
"execution_count": 12
|
524 |
+
}
|
525 |
+
]
|
526 |
+
},
|
527 |
+
{
|
528 |
+
"cell_type": "code",
|
529 |
+
"source": [
|
530 |
+
"# Show the retrieved nodes\n",
|
531 |
+
"for src in res.source_nodes:\n",
|
532 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
533 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
534 |
+
" print(\"Text\\t\", src.text)\n",
|
535 |
+
" print(\"Score\\t\", src.score)\n",
|
536 |
+
" print(\"-_\"*20)"
|
537 |
+
],
|
538 |
+
"metadata": {
|
539 |
+
"colab": {
|
540 |
+
"base_uri": "https://localhost:8080/"
|
541 |
+
},
|
542 |
+
"id": "wUhOlwWcMEUT",
|
543 |
+
"outputId": "eae3754b-5cb8-4c5d-d739-c42c9686006d"
|
544 |
+
},
|
545 |
+
"execution_count": null,
|
546 |
+
"outputs": [
|
547 |
+
{
|
548 |
+
"output_type": "stream",
|
549 |
+
"name": "stdout",
|
550 |
+
"text": [
|
551 |
+
"Node ID\t d6f533e5-fef8-469c-a313-def19fd38efe\n",
|
552 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
553 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
554 |
+
"Score\t 0.7077337819711658\n",
|
555 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
556 |
+
"Node ID\t 2f3b7c34-8fd0-4134-af38-ef1b77e32cd8\n",
|
557 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
558 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
559 |
+
"Score\t 0.7025566634608498\n",
|
560 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
561 |
+
"Node ID\t 1c7a8637-6f65-401e-be33-26886c828a34\n",
|
562 |
+
"Title\t Inside Code Llama: Meta AI's Entrance in the Code LLM Space\n",
|
563 |
+
"Text\t Inside Code Llama The release of Code Llama does not include a single model but three different variants, characterized by their parameter sizes of 7B, 13B, and 34B. Each of these models has been trained on an extensive pool of 500B tokens encompassing code and code-related information. Notably, the 7B and 13B base and instruct models have been endowed with fill-in-the-middle (FIM) competence, empowering them to seamlessly insert code into existing code structures. This attribute equips them to handle tasks like code completion right from the outset.The trio of models caters to distinct requisites concerning serving and latency. For instance, the 7B model boasts the ability to operate on a single GPU. While the 34B model stands out for yielding optimal outcomes and elevating coding assistance, the smaller 7B and 13B versions excel in speed, making them fitting for low-latency tasks such as real-time code completion. Meta AI's innovations further extend to two nuanced adaptations of Code Llama: Code Llama - Python and Code Llama - Instruct. Code Llama - Python is a specialized derivation, meticulously honed on a substantial volume of Python code spanning 100B tokens. Given Python's central role in code generation benchmarks and its significance within the AI community, this focused model augments utility.Code Llama - Instruct represents an alignment and refinement of Code Llama through instructional fine-tuning. This novel training approach entails furnishing the model with \"natural language instruction\" inputs paired with anticipated outputs. This strategic methodology enhances the model's capacity to grasp human expectations in prompts. For endeavors involving code generation, it is advised to opt for Code Llama - Instruct versions, as they have been calibrated to yield useful and secure natural language responses. Deep diving into the Code Llama training and fine-tuning, there are a few aspects that are worth highlighting 1) DatasetLlama's training rests on a meticulously curated dataset enriched with publicly available code, offering a near-duplicate-free landscape. The dataset consists of 500B tokens during the initial phase, starting from the 7B, 13B, and 34B\n",
|
564 |
+
"Score\t 0.6889534709415898\n",
|
565 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
566 |
+
]
|
567 |
+
}
|
568 |
+
]
|
569 |
+
},
|
570 |
+
{
|
571 |
+
"cell_type": "markdown",
|
572 |
+
"source": [
|
573 |
+
"# Custom Postprocessor"
|
574 |
+
],
|
575 |
+
"metadata": {
|
576 |
+
"id": "5mcAcZqhQluE"
|
577 |
+
}
|
578 |
+
},
|
579 |
+
{
|
580 |
+
"cell_type": "markdown",
|
581 |
+
"source": [
|
582 |
+
"## The `Judger` Function"
|
583 |
+
],
|
584 |
+
"metadata": {
|
585 |
+
"id": "7v7vmJblQrN6"
|
586 |
+
}
|
587 |
+
},
|
588 |
+
{
|
589 |
+
"cell_type": "markdown",
|
590 |
+
"source": [
|
591 |
+
"The following function will query GPT-4 to retrieve the top three nodes that has highest similarity to the asked question."
|
592 |
+
],
|
593 |
+
"metadata": {
|
594 |
+
"id": "6k8IKlN9QvU7"
|
595 |
+
}
|
596 |
+
},
|
597 |
+
{
|
598 |
+
"cell_type": "code",
|
599 |
+
"source": [
|
600 |
+
"from pydantic import BaseModel\n",
|
601 |
+
"from llama_index.llms.openai import OpenAI\n",
|
602 |
+
"from llama_index.core.prompts import PromptTemplate\n",
|
603 |
+
"\n",
|
604 |
+
"\n",
|
605 |
+
"def judger(nodes, query):\n",
|
606 |
+
"\n",
|
607 |
+
" # The model's output template\n",
|
608 |
+
" class OrderedNodes(BaseModel):\n",
|
609 |
+
" \"\"\"A node with the id and assigned score.\"\"\"\n",
|
610 |
+
" node_id: list\n",
|
611 |
+
" score: list\n",
|
612 |
+
"\n",
|
613 |
+
" # Prepare the nodes and wrap them in <NODE></NODE> identifier, as well as the query\n",
|
614 |
+
" the_nodes=\"\"\n",
|
615 |
+
" for idx, item in enumerate(nodes):\n",
|
616 |
+
" the_nodes += f\"<NODE{idx+1}>\\nNode ID: {item.node_id}\\nText: {item.text}\\n</NODE{idx+1}>\\n\"\n",
|
617 |
+
"\n",
|
618 |
+
" query = \"<QUERY>\\n{}\\n</QUERY>\".format(query)\n",
|
619 |
+
"\n",
|
620 |
+
" # Define the prompt template\n",
|
621 |
+
" prompt_tmpl = PromptTemplate(\n",
|
622 |
+
" \"\"\"\n",
|
623 |
+
" You receive a qurey along with a list of nodes' text and their ids. Your task is to assign score\n",
|
624 |
+
" to each node based on its contextually closeness to the given query. The final output is each\n",
|
625 |
+
" node id along with its proximity score.\n",
|
626 |
+
" Here is the list of nodes:\n",
|
627 |
+
" {nodes_list}\n",
|
628 |
+
"\n",
|
629 |
+
" And the following is the query:\n",
|
630 |
+
" {user_query}\n",
|
631 |
+
"\n",
|
632 |
+
" Score each of the nodes based on their text and their relevancy to the provided query.\n",
|
633 |
+
" The score must be a decimal number between 0 an 1 so we can rank them.\"\"\"\n",
|
634 |
+
" )\n",
|
635 |
+
"\n",
|
636 |
+
" # Define the an instance of GPT-4 and send the request\n",
|
637 |
+
" llm = OpenAI(model=\"gpt-4\")\n",
|
638 |
+
" ordered_nodes = llm.structured_predict(\n",
|
639 |
+
" OrderedNodes, prompt_tmpl, nodes_list=the_nodes, user_query=query\n",
|
640 |
+
" )\n",
|
641 |
+
"\n",
|
642 |
+
" return ordered_nodes"
|
643 |
+
],
|
644 |
+
"metadata": {
|
645 |
+
"id": "WhtJ1OeF9L3G"
|
646 |
+
},
|
647 |
+
"execution_count": null,
|
648 |
+
"outputs": []
|
649 |
+
},
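Before wiring `judger` into a postprocessor, it can be sanity-checked directly on a handful of retrieved nodes. A minimal sketch, assuming the `index` built earlier in this notebook; the query string and `similarity_top_k` value are illustrative:

```python
# Retrieve a few candidate nodes for a sample question.
retriever = index.as_retriever(similarity_top_k=5)
sample_query = "How many parameters LLaMA2 model has?"
nodes = retriever.retrieve(sample_query)

# Ask GPT-4 to score each node's relevance to the question.
ordered = judger(nodes, sample_query)

# `ordered` is an OrderedNodes instance with parallel `node_id` and `score` lists.
for node_id, score in zip(ordered.node_id, ordered.score):
    print(node_id, score)
```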
|
650 |
+
{
|
651 |
+
"cell_type": "markdown",
|
652 |
+
"source": [
|
653 |
+
"## Define Postprocessor"
|
654 |
+
],
|
655 |
+
"metadata": {
|
656 |
+
"id": "Q5f1GrBKZprO"
|
657 |
+
}
|
658 |
+
},
|
659 |
+
{
|
660 |
+
"cell_type": "markdown",
|
661 |
+
"source": [
|
662 |
+
"The following class will use the `judger` function to rank the nodes, and filter them based on the ranks."
|
663 |
+
],
|
664 |
+
"metadata": {
|
665 |
+
"id": "yZujUJTvQ6Yu"
|
666 |
+
}
|
667 |
+
},
|
668 |
+
{
|
669 |
+
"cell_type": "code",
|
670 |
+
"source": [
|
671 |
+
"from typing import (\n",
|
672 |
+
" List,\n",
|
673 |
+
" Optional\n",
|
674 |
+
")\n",
|
675 |
+
"from llama_index.core import QueryBundle\n",
|
676 |
+
"from llama_index.core.postprocessor.types import BaseNodePostprocessor\n",
|
677 |
+
"from llama_index.core.schema import NodeWithScore\n",
|
678 |
+
"\n",
|
679 |
+
"\n",
|
680 |
+
"class OpenaiAsJudgePostprocessor(BaseNodePostprocessor):\n",
|
681 |
+
" def _postprocess_nodes(\n",
|
682 |
+
" self, nodes: List[NodeWithScore], query_bundle: Optional[QueryBundle]\n",
|
683 |
+
" ) -> List[NodeWithScore]:\n",
|
684 |
+
"\n",
|
685 |
+
" r = judger(nodes, query_bundle)\n",
|
686 |
+
"\n",
|
687 |
+
" node_ids = r.node_id\n",
|
688 |
+
" scores = r.score\n",
|
689 |
+
"\n",
|
690 |
+
" sorted_scores = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)\n",
|
691 |
+
" top_three_nodes = [sorted_scores[i][0] for i in range(3)]\n",
|
692 |
+
"\n",
|
693 |
+
" selected_nodes_id = [node_ids[item] for item in top_three_nodes]\n",
|
694 |
+
"\n",
|
695 |
+
" final_nodes = []\n",
|
696 |
+
" for item in nodes:\n",
|
697 |
+
" if item.node_id in selected_nodes_id:\n",
|
698 |
+
" final_nodes.append( item )\n",
|
699 |
+
"\n",
|
700 |
+
" return final_nodes"
|
701 |
+
],
|
702 |
+
"metadata": {
|
703 |
+
"id": "-QtyuC8fZun0"
|
704 |
+
},
|
705 |
+
"execution_count": null,
|
706 |
+
"outputs": []
|
707 |
+
},
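The index handling in `_postprocess_nodes` is easy to misread: `sorted(enumerate(scores), ...)` pairs each score with its *position* in the original list, so `top_three_nodes` holds positions into `node_ids`, not node ids themselves. A standalone check of that selection logic, with illustrative values:

```python
# Illustrative values: four node ids with their judge-assigned scores.
node_ids = ["a", "b", "c", "d"]
scores = [0.2, 0.9, 0.5, 0.7]

# Sort (position, score) pairs by score, descending.
sorted_scores = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
top_three_nodes = [sorted_scores[i][0] for i in range(3)]         # positions [1, 3, 2]
selected_nodes_id = [node_ids[item] for item in top_three_nodes]  # ids ['b', 'd', 'c']
print(selected_nodes_id)
```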
|
708 |
+
{
|
709 |
+
"cell_type": "code",
|
710 |
+
"source": [
|
711 |
+
"judge = OpenaiAsJudgePostprocessor()"
|
712 |
+
],
|
713 |
+
"metadata": {
|
714 |
+
"id": "jk-lqYlYLipi"
|
715 |
+
},
|
716 |
+
"execution_count": null,
|
717 |
+
"outputs": []
|
718 |
+
},
|
719 |
+
{
|
720 |
+
"cell_type": "markdown",
|
721 |
+
"source": [
|
722 |
+
"## Query Engine with Postprocessor"
|
723 |
+
],
|
724 |
+
"metadata": {
|
725 |
+
"id": "cgtsvxR7SflP"
|
726 |
+
}
|
727 |
+
},
|
728 |
+
{
|
729 |
+
"cell_type": "code",
|
730 |
+
"source": [
|
731 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
732 |
+
"# and using a LLM to formulate the final answer.\n",
|
733 |
+
"# The `node_postprocessors` function will be applied to the retrieved nodes.\n",
|
734 |
+
"query_engine = index.as_query_engine(\n",
|
735 |
+
" similarity_top_k=10,\n",
|
736 |
+
" node_postprocessors=[judge]\n",
|
737 |
+
")\n",
|
738 |
+
"\n",
|
739 |
+
"res = query_engine.query(\"How many parameters LLaMA2 model has?\")"
|
740 |
+
],
|
741 |
+
"metadata": {
|
742 |
+
"id": "1Hh3RLCeLfXZ"
|
743 |
+
},
|
744 |
+
"execution_count": null,
|
745 |
+
"outputs": []
|
746 |
+
},
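Since `similarity_top_k=10` hands ten candidate chunks to the judge and the postprocessor keeps only the three highest-scored ones, `res.source_nodes` should contain at most three entries (the output further below shows exactly three). A quick check, assuming the `res` object from the cell above:

```python
# The judge postprocessor trims the 10 retrieved candidates down to at most 3.
print(len(res.source_nodes), "nodes survived the judge")
assert len(res.source_nodes) <= 3
```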
|
747 |
+
{
|
748 |
+
"cell_type": "code",
|
749 |
+
"source": [
|
750 |
+
"res.response"
|
751 |
+
],
|
752 |
+
"metadata": {
|
753 |
+
"colab": {
|
754 |
+
"base_uri": "https://localhost:8080/",
|
755 |
+
"height": 53
|
756 |
+
},
|
757 |
+
"id": "zmZv0EIyF0wG",
|
758 |
+
"outputId": "7ff1b3bf-1b5f-4985-ea0d-3048d94c8da1"
|
759 |
+
},
|
760 |
+
"execution_count": null,
|
761 |
+
"outputs": [
|
762 |
+
{
|
763 |
+
"output_type": "execute_result",
|
764 |
+
"data": {
|
765 |
+
"text/plain": [
|
766 |
+
"'The Llama 2 model is available in four different sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'"
|
767 |
+
],
|
768 |
+
"application/vnd.google.colaboratory.intrinsic+json": {
|
769 |
+
"type": "string"
|
770 |
+
}
|
771 |
+
},
|
772 |
+
"metadata": {},
|
773 |
+
"execution_count": 29
|
774 |
+
}
|
775 |
+
]
|
776 |
+
},
|
777 |
+
{
|
778 |
+
"cell_type": "code",
|
779 |
+
"source": [
|
780 |
+
"# Show the retrieved nodes\n",
|
781 |
+
"for src in res.source_nodes:\n",
|
782 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
783 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
784 |
+
" print(\"Text\\t\", src.text)\n",
|
785 |
+
" print(\"Score\\t\", src.score)\n",
|
786 |
+
" print(\"-_\"*20)"
|
787 |
+
],
|
788 |
+
"metadata": {
|
789 |
+
"id": "bBMaG6yaZzjA",
|
790 |
+
"colab": {
|
791 |
+
"base_uri": "https://localhost:8080/"
|
792 |
+
},
|
793 |
+
"outputId": "8a173ef7-e66f-4f9b-a979-c88a17028ef0"
|
794 |
+
},
|
795 |
+
"execution_count": null,
|
796 |
+
"outputs": [
|
797 |
+
{
|
798 |
+
"output_type": "stream",
|
799 |
+
"name": "stdout",
|
800 |
+
"text": [
|
801 |
+
"Node ID\t d6f533e5-fef8-469c-a313-def19fd38efe\n",
|
802 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
803 |
+
"Text\t I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models. II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 tokens, twice the size of its predecessor Llama 1. Meta also released a Llama 2 fine-tuned model for chat applications that was trained on over 1 million human annotations. Such extensive training comes at a cost, with the 70B model taking a staggering 1720320 GPU hours to train. The context window's length determines the amount of content the model can process at once, making Llama 2 a powerful language model in terms of scale and efficiency. III. Safety Considerations: A Top Priority for Meta Meta's commitment to safety and alignment shines through in Llama 2's design. The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving\n",
|
804 |
+
"Score\t 0.7077337819711658\n",
|
805 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
806 |
+
"Node ID\t 2f3b7c34-8fd0-4134-af38-ef1b77e32cd8\n",
|
807 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
808 |
+
"Text\t The model demonstrates exceptionally low AI safety violation percentages, surpassing even ChatGPT in safety benchmarks. Finding the right balance between helpfulness and safety when optimizing a model poses significant challenges. While a highly helpful model may be capable of answering any question, including sensitive ones like \"How do I build a bomb?\", it also raises concerns about potential misuse. Thus, striking the perfect equilibrium between providing useful information and ensuring safety is paramount. However, prioritizing safety to an extreme extent can lead to a model that struggles to effectively address a diverse range of questions. This limitation could hinder the model's practical applicability and user experience. Thus, achieving an optimum balance that allows the model to be both helpful and safe is of utmost importance. To strike the right balance between helpfulness and safety, Meta employed two reward models - one for helpfulness and another for safety - to optimize the model's responses. The 34B parameter model has reported higher safety violations than other variants, possibly contributing to the delay in its release. IV. Helpfulness Comparison: Llama 2 Outperforms Competitors Llama 2 emerges as a strong contender in the open-source language model arena, outperforming its competitors in most categories. The 70B parameter model outperforms all other open-source models, while the 7B and 34B models outshine Falcon in all categories and MPT in all categories except coding. Despite being smaller, Llam a2's performance rivals that of Chat GPT 3.5, a significantly larger closed-source model. While GPT 4 and PalM-2-L, with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering\n",
|
809 |
+
"Score\t 0.7025566634608498\n",
|
810 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
811 |
+
"Node ID\t 021c859e-809b-49b8-8d0d-38cc326c1203\n",
|
812 |
+
"Title\t Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use\n",
|
813 |
+
"Text\t with their larger size, outperform Llama 2, this is expected due to their capacity for handling complex language tasks. Llama 2's impressive ability to compete with larger models highlights its efficiency and potential in the market. However, Llama 2 does face challenges in coding and math problems, where models like Chat GPT 4 excel, given their significantly larger size. Chat GPT 4 performed significantly better than Llama 2 for coding (HumanEval benchmark)and math problem tasks (GSM8k benchmark). Open-source AI technologies, like Llama 2, continue to advance, offering strong competition to closed-source models. V. Ghost Attention: Enhancing Conversational Continuity One unique feature in Llama 2 is Ghost Attention, which ensures continuity in conversations. This means that even after multiple interactions, the model remembers its initial instructions, ensuring more coherent and consistent responses throughout the conversation. This feature significantly enhances the user experience and makes Llama 2 a more reliable language model for interactive applications. In the example below, on the left, it forgets to use an emoji after a few conversations. On the right, with Ghost Attention, even after having many conversations, it will remember the context and continue to use emojis in its response. VI. Temporal Capability: A Leap in Information Organization Meta reported a groundbreaking temporal capability, where the model organizes information based on time relevance. Each question posed to the model is associated with a date, and it responds accordingly by considering the event date before which the question becomes irrelevant. For example, if you ask the question, \"How long ago did Barack Obama become president?\", its only relevant after 2008. This temporal awareness allows Llama 2 to deliver more contextually accurate responses, enriching the user experience further. VII. Open Questions and Future Outlook Meta's open-sourcing of Llama 2 represents a seismic shift, now offering developers and researchers commercial access to a leading language model. With Llama 2 outperforming MosaicML's current MPT models, all eyes are on how Databricks will respond. Can MosaicML's next MPT iteration beat Llama 2? Is it worthwhile to compete\n",
|
814 |
+
"Score\t 0.691486848320407\n",
|
815 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
816 |
+
]
|
817 |
+
}
|
818 |
+
]
|
819 |
+
},
|
820 |
+
{
|
821 |
+
"cell_type": "code",
|
822 |
+
"source": [],
|
823 |
+
"metadata": {
|
824 |
+
"id": "J7sIPpFFTep3"
|
825 |
+
},
|
826 |
+
"execution_count": null,
|
827 |
+
"outputs": []
|
828 |
+
}
|
829 |
+
]
|
830 |
+
}
|
notebooks/Crawl_a_Website.ipynb
ADDED
@@ -0,0 +1,574 @@
1 |
+
{
|
2 |
+
"nbformat": 4,
|
3 |
+
"nbformat_minor": 0,
|
4 |
+
"metadata": {
|
5 |
+
"colab": {
|
6 |
+
"provenance": [],
|
7 |
+
"toc_visible": true,
|
8 |
+
"authorship_tag": "ABX9TyOUem37lhhg0mJYauho+pvb",
|
9 |
+
"include_colab_link": true
|
10 |
+
},
|
11 |
+
"kernelspec": {
|
12 |
+
"name": "python3",
|
13 |
+
"display_name": "Python 3"
|
14 |
+
},
|
15 |
+
"language_info": {
|
16 |
+
"name": "python"
|
17 |
+
}
|
18 |
+
},
|
19 |
+
"cells": [
|
20 |
+
{
|
21 |
+
"cell_type": "markdown",
|
22 |
+
"metadata": {
|
23 |
+
"id": "view-in-github",
|
24 |
+
"colab_type": "text"
|
25 |
+
},
|
26 |
+
"source": [
|
27 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/Crawl_a_Website.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
28 |
+
]
|
29 |
+
},
|
30 |
+
{
|
31 |
+
"cell_type": "code",
|
32 |
+
"source": [
|
33 |
+
"!pip install -q llama-index==0.10.30 openai==1.12.0 cohere==4.47 tiktoken==0.6.0 newspaper3k==0.2.8"
|
34 |
+
],
|
35 |
+
"metadata": {
|
36 |
+
"id": "4CW8ux1RSdem",
|
37 |
+
"colab": {
|
38 |
+
"base_uri": "https://localhost:8080/"
|
39 |
+
},
|
40 |
+
"outputId": "155feab4-8ae6-43da-a07f-8a1f4b677c2b"
|
41 |
+
},
|
42 |
+
"execution_count": null,
|
43 |
+
"outputs": [
|
44 |
+
{
|
45 |
+
"output_type": "stream",
|
46 |
+
"name": "stdout",
|
47 |
+
"text": [
|
48 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m211.1/211.1 kB\u001b[0m \u001b[31m4.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
49 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m81.3/81.3 kB\u001b[0m \u001b[31m8.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
50 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m97.6/97.6 kB\u001b[0m \u001b[31m9.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
51 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
52 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.4/7.4 MB\u001b[0m \u001b[31m43.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
53 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
54 |
+
" Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
55 |
+
" Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
56 |
+
" Building wheel for tinysegmenter (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
57 |
+
" Building wheel for feedfinder2 (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
58 |
+
" Building wheel for jieba3k (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
59 |
+
" Building wheel for sgmllib3k (setup.py) ... \u001b[?25l\u001b[?25hdone\n"
|
60 |
+
]
|
61 |
+
}
|
62 |
+
]
|
63 |
+
},
|
64 |
+
{
|
65 |
+
"cell_type": "code",
|
66 |
+
"source": [
|
67 |
+
"import os\n",
|
68 |
+
"\n",
|
69 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
70 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"[OPENAI_API_KEY]\"\n",
|
71 |
+
"USESCRAPER_API_KEY = \"[USESCRAPER_API_KEY]\""
|
72 |
+
],
|
73 |
+
"metadata": {
|
74 |
+
"id": "wxDPsVXSAj6_"
|
75 |
+
},
|
76 |
+
"execution_count": null,
|
77 |
+
"outputs": []
|
78 |
+
},
|
79 |
+
{
|
80 |
+
"cell_type": "markdown",
|
81 |
+
"source": [
|
82 |
+
"There are two primary methods for extracting webpage content. The first method involves having a list of URLs; one can iterate through this list to retrieve the content of each page. The second method, web crawling, requires using a script or service to extract page URLs from a sitemap or manually following links on the page to access all the content. Initially, we will explore web scraping techniques before discussing how to use a service like usescraper.com to perform web crawling."
|
83 |
+
],
|
84 |
+
"metadata": {
|
85 |
+
"id": "VSc7-1mljmrp"
|
86 |
+
}
|
87 |
+
},
|
88 |
+
{
|
89 |
+
"cell_type": "markdown",
|
90 |
+
"source": [
|
91 |
+
"# 1. Scraping using `newspaper` Library"
|
92 |
+
],
|
93 |
+
"metadata": {
|
94 |
+
"id": "D3r2tYHgeIK9"
|
95 |
+
}
|
96 |
+
},
|
97 |
+
{
|
98 |
+
"cell_type": "markdown",
|
99 |
+
"source": [
|
100 |
+
"## Define URLs"
|
101 |
+
],
|
102 |
+
"metadata": {
|
103 |
+
"id": "it43ZQf8jatw"
|
104 |
+
}
|
105 |
+
},
|
106 |
+
{
|
107 |
+
"cell_type": "code",
|
108 |
+
"source": [
|
109 |
+
"urls = [\n",
|
110 |
+
" \"https://docs.llamaindex.ai/en/stable/understanding\",\n",
|
111 |
+
" \"https://docs.llamaindex.ai/en/stable/understanding/using_llms/using_llms/\",\n",
|
112 |
+
" \"https://docs.llamaindex.ai/en/stable/understanding/indexing/indexing/\",\n",
|
113 |
+
" \"https://docs.llamaindex.ai/en/stable/understanding/querying/querying/\"\n",
|
114 |
+
"]"
|
115 |
+
],
|
116 |
+
"metadata": {
|
117 |
+
"id": "x74PqfQ7eIzD"
|
118 |
+
},
|
119 |
+
"execution_count": null,
|
120 |
+
"outputs": []
|
121 |
+
},
|
122 |
+
{
|
123 |
+
"cell_type": "markdown",
|
124 |
+
"source": [
|
125 |
+
"## Get Page Contents"
|
126 |
+
],
|
127 |
+
"metadata": {
|
128 |
+
"id": "tgxfpfSsjcMC"
|
129 |
+
}
|
130 |
+
},
|
131 |
+
{
|
132 |
+
"cell_type": "code",
|
133 |
+
"source": [
|
134 |
+
"import newspaper\n",
|
135 |
+
"\n",
|
136 |
+
"pages_content = []\n",
|
137 |
+
"\n",
|
138 |
+
"# Retrieve the Content\n",
|
139 |
+
"for url in urls:\n",
|
140 |
+
"\ttry:\n",
|
141 |
+
"\t\tarticle = newspaper.Article( url )\n",
|
142 |
+
"\t\tarticle.download()\n",
|
143 |
+
"\t\tarticle.parse()\n",
|
144 |
+
"\t\tif len(article.text) > 0:\n",
|
145 |
+
"\t\t\tpages_content.append({ \"url\": url, \"title\": article.title, \"text\": article.text })\n",
|
146 |
+
"\texcept:\n",
|
147 |
+
"\t\tcontinue"
|
148 |
+
],
|
149 |
+
"metadata": {
|
150 |
+
"id": "Q6Xs1OhUfVQV"
|
151 |
+
},
|
152 |
+
"execution_count": null,
|
153 |
+
"outputs": []
|
154 |
+
},
|
155 |
+
{
|
156 |
+
"cell_type": "code",
|
157 |
+
"source": [
|
158 |
+
"pages_content[0]"
|
159 |
+
],
|
160 |
+
"metadata": {
|
161 |
+
"colab": {
|
162 |
+
"base_uri": "https://localhost:8080/"
|
163 |
+
},
|
164 |
+
"id": "3cNdJNi2g1ly",
|
165 |
+
"outputId": "f5184c15-6b55-47ee-98ee-646a06290a4c"
|
166 |
+
},
|
167 |
+
"execution_count": null,
|
168 |
+
"outputs": [
|
169 |
+
{
|
170 |
+
"output_type": "execute_result",
|
171 |
+
"data": {
|
172 |
+
"text/plain": [
|
173 |
+
"{'url': 'https://docs.llamaindex.ai/en/stable/understanding',\n",
|
174 |
+
" 'title': 'Building an LLM Application',\n",
|
175 |
+
" 'text': \"Building an LLM application#\\n\\nWelcome to the beginning of Understanding LlamaIndex. This is a series of short, bite-sized tutorials on every stage of building an LLM application to get you acquainted with how to use LlamaIndex before diving into more advanced and subtle strategies. If you're an experienced programmer new to LlamaIndex, this is the place to start.\\n\\nKey steps in building an LLM application#\\n\\nTip If you've already read our high-level concepts page you'll recognize several of these steps.\\n\\nThere are a series of key steps involved in building any LLM-powered application, whether it's answering questions about your data, creating a chatbot, or an autonomous agent. Throughout our documentation, you'll notice sections are arranged roughly in the order you'll perform these steps while building your app. You'll learn about:\\n\\nUsing LLMs : whether it's OpenAI or any number of hosted LLMs or a locally-run model of your own, LLMs are used at every step of the way, from indexing and storing to querying and parsing your data. LlamaIndex comes with a huge number of reliable, tested prompts and we'll also show you how to customize your own.\\n\\nLoading : getting your data from wherever it lives, whether that's unstructured text, PDFs, databases, or APIs to other applications. LlamaIndex has hundreds of connectors to every data source over at LlamaHub.\\n\\nIndexing : once you've got your data there are an infinite number of ways to structure access to that data to ensure your applications is always working with the most relevant data. LlamaIndex has a huge number of these strategies built-in and can help you select the best ones.\\n\\nStoring : you will probably find it more efficient to store your data in indexed form, or pre-processed summaries provided by an LLM, often in a specialized database known as a Vector Store (see below). You can also store your indexes, metadata and more.\\n\\nQuerying : every indexing strategy has a corresponding querying strategy and there are lots of ways to improve the relevance, speed and accuracy of what you retrieve and what the LLM does with it before returning it to you, including turning it into structured responses such as an API.\\n\\nPutting it all together : whether you are building question & answering, chatbots, an API, or an autonomous agent, we show you how to get your application into production.\\n\\nTracing and debugging : also called observability , it's especially important with LLM applications to be able to look into the inner workings of what's going on to help you debug problems and spot places to improve.\\n\\nEvaluating: every strategy has pros and cons and a key part of building, shipping and evolving your application is evaluating whether your change has improved your application in terms of accuracy, performance, clarity, cost and more. Reliably evaluating your changes is a crucial part of LLM application development.\\n\\nReady to dive in? Head to using LLMs.\"}"
|
176 |
+
]
|
177 |
+
},
|
178 |
+
"metadata": {},
|
179 |
+
"execution_count": 57
|
180 |
+
}
|
181 |
+
]
|
182 |
+
},
|
183 |
+
{
|
184 |
+
"cell_type": "code",
|
185 |
+
"source": [
|
186 |
+
"len( pages_content )"
|
187 |
+
],
|
188 |
+
"metadata": {
|
189 |
+
"colab": {
|
190 |
+
"base_uri": "https://localhost:8080/"
|
191 |
+
},
|
192 |
+
"id": "WleP60A3gkQM",
|
193 |
+
"outputId": "8c79ab53-e47b-4227-eb6f-0286b8ba2d15"
|
194 |
+
},
|
195 |
+
"execution_count": null,
|
196 |
+
"outputs": [
|
197 |
+
{
|
198 |
+
"output_type": "execute_result",
|
199 |
+
"data": {
|
200 |
+
"text/plain": [
|
201 |
+
"5"
|
202 |
+
]
|
203 |
+
},
|
204 |
+
"metadata": {},
|
205 |
+
"execution_count": 38
|
206 |
+
}
|
207 |
+
]
|
208 |
+
},
|
209 |
+
{
|
210 |
+
"cell_type": "markdown",
|
211 |
+
"source": [
|
212 |
+
"## Convert to Document"
|
213 |
+
],
|
214 |
+
"metadata": {
|
215 |
+
"id": "i5mCiRfGjfNx"
|
216 |
+
}
|
217 |
+
},
|
218 |
+
{
|
219 |
+
"cell_type": "code",
|
220 |
+
"source": [
|
221 |
+
"from llama_index.core.schema import Document\n",
|
222 |
+
"\n",
|
223 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
224 |
+
"documents = [Document(text=row['text'], metadata={\"title\": row['title'], \"url\": row['url']}) for row in pages_content]"
|
225 |
+
],
|
226 |
+
"metadata": {
|
227 |
+
"id": "TOJ3K-CBfVDR"
|
228 |
+
},
|
229 |
+
"execution_count": null,
|
230 |
+
"outputs": []
|
231 |
+
},
|
232 |
+
{
|
233 |
+
"cell_type": "markdown",
|
234 |
+
"source": [
|
235 |
+
"# 2. Submit the Crawler Job"
|
236 |
+
],
|
237 |
+
"metadata": {
|
238 |
+
"id": "CkjEyEmkJevT"
|
239 |
+
}
|
240 |
+
},
|
241 |
+
{
|
242 |
+
"cell_type": "code",
|
243 |
+
"execution_count": null,
|
244 |
+
"metadata": {
|
245 |
+
"colab": {
|
246 |
+
"base_uri": "https://localhost:8080/"
|
247 |
+
},
|
248 |
+
"id": "tYpchBo5-brp",
|
249 |
+
"outputId": "927f84c5-c13a-408c-8802-df90bc05c733"
|
250 |
+
},
|
251 |
+
"outputs": [
|
252 |
+
{
|
253 |
+
"output_type": "stream",
|
254 |
+
"name": "stdout",
|
255 |
+
"text": [
|
256 |
+
"{'org': '581', 'id': '7YE3T8VSPJVSCYE6EDQ90DJNFT', 'urls': ['https://docs.llamaindex.ai/en/stable/understanding/'], 'exclude_globs': [], 'exclude_elements': 'nav, header, footer, script, style, noscript, svg, [role=\"alert\"], [role=\"banner\"], [role=\"dialog\"], [role=\"alertdialog\"], [role=\"region\"][aria-label*=\"skip\" i], [aria-modal=\"true\"]', 'output_format': 'markdown', 'output_expiry': 604800, 'min_length': 50, 'page_limit': 10000, 'force_crawling_mode': 'link', 'block_resources': True, 'include_linked_files': False, 'createdAt': 1713883978029, 'status': 'starting', 'use_browser': True, 'sitemapPageCount': 0, 'notices': []}\n"
|
257 |
+
]
|
258 |
+
}
|
259 |
+
],
|
260 |
+
"source": [
|
261 |
+
"import requests\n",
|
262 |
+
"import json\n",
|
263 |
+
"\n",
|
264 |
+
"payload = {\n",
|
265 |
+
" \"urls\": [\"https://docs.llamaindex.ai/en/stable/understanding/\"], # list of urls to crawl\n",
|
266 |
+
" \"output_format\": \"markdown\", # text, html, markdown\n",
|
267 |
+
" \"output_expiry\": 604800, # Automatically delete after X seconds\n",
|
268 |
+
" \"min_length\": 50, # Skip pages with less than X characters\n",
|
269 |
+
" \"page_limit\": 10000, # Maximum number of pages to crawl\n",
|
270 |
+
" \"force_crawling_mode\": \"link\", # \"link\" follows links in the page reccursively, or \"sitemap\" to find pages from website's sitemap\n",
|
271 |
+
" \"block_resources\": True, # skip loading images, stylesheets, or scripts\n",
|
272 |
+
" \"include_linked_files\": False # include files (PDF, text, ...) in output\n",
|
273 |
+
"}\n",
|
274 |
+
"headers = {\n",
|
275 |
+
" \"Authorization\": \"Bearer \" + USESCRAPER_API_KEY,\n",
|
276 |
+
" \"Content-Type\": \"application/json\"\n",
|
277 |
+
"}\n",
|
278 |
+
"\n",
|
279 |
+
"response = requests.request(\"POST\", \"https://api.usescraper.com/crawler/jobs\", json=payload, headers=headers)\n",
|
280 |
+
"\n",
|
281 |
+
"response = json.loads( response.text )\n",
|
282 |
+
"\n",
|
283 |
+
"print(response)"
|
284 |
+
]
|
285 |
+
},
|
286 |
+
{
|
287 |
+
"cell_type": "markdown",
|
288 |
+
"source": [
|
289 |
+
"## Get the Status"
|
290 |
+
],
|
291 |
+
"metadata": {
|
292 |
+
"id": "nx_4MjHxJgxh"
|
293 |
+
}
|
294 |
+
},
|
295 |
+
{
|
296 |
+
"cell_type": "code",
|
297 |
+
"source": [
|
298 |
+
"url = \"https://api.usescraper.com/crawler/jobs/{}\".format(response['id'])\n",
|
299 |
+
"\n",
|
300 |
+
"status_res = requests.request(\"GET\", url, headers=headers)\n",
|
301 |
+
"\n",
|
302 |
+
"status_res = json.loads( status_res.text )\n",
|
303 |
+
"\n",
|
304 |
+
"print( status_res['status'] )\n",
|
305 |
+
"print( status_res['progress'] )"
|
306 |
+
],
|
307 |
+
"metadata": {
|
308 |
+
"colab": {
|
309 |
+
"base_uri": "https://localhost:8080/"
|
310 |
+
},
|
311 |
+
"id": "ZLJ0BUR8c1a8",
|
312 |
+
"outputId": "cfd3aee9-68bf-4171-9340-abe2d03fa5ac"
|
313 |
+
},
|
314 |
+
"execution_count": null,
|
315 |
+
"outputs": [
|
316 |
+
{
|
317 |
+
"output_type": "stream",
|
318 |
+
"name": "stdout",
|
319 |
+
"text": [
|
320 |
+
"running\n",
|
321 |
+
"{'scraped': 9, 'discarded': 0, 'failed': 0}\n"
|
322 |
+
]
|
323 |
+
}
|
324 |
+
]
|
325 |
+
},
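Crawl jobs are asynchronous, so in practice this status endpoint is polled until the job leaves the `starting`/`running` states rather than checked once. A minimal sketch, reusing the `response` and `headers` objects from the cells above; the polling interval and the assumption that any other status is terminal are ours, not the API's documented contract:

```python
import time

# Poll the crawler job until it is no longer starting or running.
status_url = "https://api.usescraper.com/crawler/jobs/{}".format(response['id'])
while True:
    status_res = json.loads(requests.request("GET", status_url, headers=headers).text)
    print(status_res['status'], status_res.get('progress'))
    if status_res['status'] not in ("starting", "running"):
        break
    time.sleep(10)  # wait before checking the status again
```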
|
326 |
+
{
|
327 |
+
"cell_type": "markdown",
|
328 |
+
"source": [
|
329 |
+
"## Get the Data"
|
330 |
+
],
|
331 |
+
"metadata": {
|
332 |
+
"id": "vHcRJIDsJh2i"
|
333 |
+
}
|
334 |
+
},
|
335 |
+
{
|
336 |
+
"cell_type": "code",
|
337 |
+
"source": [
|
338 |
+
"url = \"https://api.usescraper.com/crawler/jobs/{}/data\".format(response['id'])\n",
|
339 |
+
"\n",
|
340 |
+
"data_res = requests.request(\"GET\", url, headers=headers)\n",
|
341 |
+
"\n",
|
342 |
+
"data_res = json.loads( data_res.text )\n",
|
343 |
+
"\n",
|
344 |
+
"print( data_res )"
|
345 |
+
],
|
346 |
+
"metadata": {
|
347 |
+
"id": "J4dUn4cmGGab"
|
348 |
+
},
|
349 |
+
"execution_count": null,
|
350 |
+
"outputs": []
|
351 |
+
},
|
352 |
+
{
|
353 |
+
"cell_type": "code",
|
354 |
+
"source": [
|
355 |
+
"print( \"URL:\", data_res['data'][0]['meta']['url'] )\n",
|
356 |
+
"print( \"Title:\", data_res['data'][0]['meta']['meta']['title'] )\n",
|
357 |
+
"print( \"Content:\", data_res['data'][0]['text'][0:500], \"...\" )"
|
358 |
+
],
|
359 |
+
"metadata": {
|
360 |
+
"colab": {
|
361 |
+
"base_uri": "https://localhost:8080/"
|
362 |
+
},
|
363 |
+
"id": "F8VEQvJkITLJ",
|
364 |
+
"outputId": "b54ec108-7221-4230-8b61-d0a4be503a66"
|
365 |
+
},
|
366 |
+
"execution_count": null,
|
367 |
+
"outputs": [
|
368 |
+
{
|
369 |
+
"output_type": "stream",
|
370 |
+
"name": "stdout",
|
371 |
+
"text": [
|
372 |
+
"URL: https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/graphs/\n",
|
373 |
+
"Title: Knowledge Graphs - LlamaIndex\n",
|
374 |
+
"Content: \n",
|
375 |
+
"[ Skip to content ](https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/graphs/#knowledge-graphs)\n",
|
376 |
+
"#Knowledge Graphs[#](https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/graphs/#knowledge-graphs)\n",
|
377 |
+
"LlamaIndex contains some fantastic guides for building with knowledge graphs.\n",
|
378 |
+
"\n",
|
379 |
+
"Check out the end-to-end tutorials/workshops below. Also check out our [knowledge graph query engine guides](https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_ ...\n"
|
380 |
+
]
|
381 |
+
}
|
382 |
+
]
|
383 |
+
},
|
384 |
+
{
|
385 |
+
"cell_type": "markdown",
|
386 |
+
"source": [
|
387 |
+
"## Convert to Document"
|
388 |
+
],
|
389 |
+
"metadata": {
|
390 |
+
"id": "rt2nyuLhSYLR"
|
391 |
+
}
|
392 |
+
},
|
393 |
+
{
|
394 |
+
"cell_type": "code",
|
395 |
+
"source": [
|
396 |
+
"from llama_index.core.schema import Document\n",
|
397 |
+
"\n",
|
398 |
+
"# Convert the chunks to Document objects so the LlamaIndex framework can process them.\n",
|
399 |
+
"documents = [Document(text=row['text'], metadata={\"title\": row['meta']['meta']['title'], \"url\": row['meta']['url']}) for row in data_res['data']]"
|
400 |
+
],
|
401 |
+
"metadata": {
|
402 |
+
"id": "YEieGzSFSXas"
|
403 |
+
},
|
404 |
+
"execution_count": null,
|
405 |
+
"outputs": []
|
406 |
+
},
|
407 |
+
{
|
408 |
+
"cell_type": "markdown",
|
409 |
+
"source": [
|
410 |
+
"# Create RAG Pipeline"
|
411 |
+
],
|
412 |
+
"metadata": {
|
413 |
+
"id": "vqbJG5a1i3Jo"
|
414 |
+
}
|
415 |
+
},
|
416 |
+
{
|
417 |
+
"cell_type": "code",
|
418 |
+
"source": [
|
419 |
+
"from llama_index.llms.openai import OpenAI\n",
|
420 |
+
"\n",
|
421 |
+
"llm = OpenAI(model=\"gpt-3.5-turbo\")"
|
422 |
+
],
|
423 |
+
"metadata": {
|
424 |
+
"id": "wxmiQDv3SXV6"
|
425 |
+
},
|
426 |
+
"execution_count": null,
|
427 |
+
"outputs": []
|
428 |
+
},
|
429 |
+
{
|
430 |
+
"cell_type": "code",
|
431 |
+
"source": [
|
432 |
+
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
|
433 |
+
"\n",
|
434 |
+
"embed_model = OpenAIEmbedding(model=\"text-embedding-3-large\")"
|
435 |
+
],
|
436 |
+
"metadata": {
|
437 |
+
"id": "tCVhv4OkSXTV"
|
438 |
+
},
|
439 |
+
"execution_count": null,
|
440 |
+
"outputs": []
|
441 |
+
},
|
442 |
+
{
|
443 |
+
"cell_type": "code",
|
444 |
+
"source": [
|
445 |
+
"from llama_index.core.node_parser import SentenceSplitter\n",
|
446 |
+
"\n",
|
447 |
+
"text_splitter = SentenceSplitter(chunk_size=512, chunk_overlap=30)"
|
448 |
+
],
|
449 |
+
"metadata": {
|
450 |
+
"id": "quwJI61dNVr-"
|
451 |
+
},
|
452 |
+
"execution_count": null,
|
453 |
+
"outputs": []
|
454 |
+
},
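The splitter's behavior can be previewed before the index build by applying it to the crawled documents directly. A small sketch, assuming the `documents` and `text_splitter` objects defined above:

```python
# Preview how the 512-token / 30-overlap splitter chunks the crawled pages.
nodes = text_splitter.get_nodes_from_documents(documents)
print(len(nodes), "chunks")
print(nodes[0].text[:200])
```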
|
455 |
+
{
|
456 |
+
"cell_type": "code",
|
457 |
+
"source": [
|
458 |
+
"from llama_index.core import Settings\n",
|
459 |
+
"\n",
|
460 |
+
"Settings.llm = llm\n",
|
461 |
+
"Settings.embed_model = embed_model\n",
|
462 |
+
"Settings.text_splitter = text_splitter"
|
463 |
+
],
|
464 |
+
"metadata": {
|
465 |
+
"id": "6KpeCRMBUgup"
|
466 |
+
},
|
467 |
+
"execution_count": null,
|
468 |
+
"outputs": []
|
469 |
+
},
|
470 |
+
{
|
471 |
+
"cell_type": "code",
|
472 |
+
"source": [
|
473 |
+
"from llama_index.core import VectorStoreIndex\n",
|
474 |
+
"\n",
|
475 |
+
"index = VectorStoreIndex.from_documents( documents )"
|
476 |
+
],
|
477 |
+
"metadata": {
|
478 |
+
"id": "nWTBidwoZSO0"
|
479 |
+
},
|
480 |
+
"execution_count": null,
|
481 |
+
"outputs": []
|
482 |
+
},
|
483 |
+
{
|
484 |
+
"cell_type": "code",
|
485 |
+
"source": [
|
486 |
+
"query_engine = index.as_query_engine()"
|
487 |
+
],
|
488 |
+
"metadata": {
|
489 |
+
"id": "RUuJO0IIYSeU"
|
490 |
+
},
|
491 |
+
"execution_count": null,
|
492 |
+
"outputs": []
|
493 |
+
},
|
494 |
+
{
|
495 |
+
"cell_type": "code",
|
496 |
+
"source": [
|
497 |
+
"res = query_engine.query(\"What is a query engine?\")"
|
498 |
+
],
|
499 |
+
"metadata": {
|
500 |
+
"id": "6_s2LkH6YX1V"
|
501 |
+
},
|
502 |
+
"execution_count": null,
|
503 |
+
"outputs": []
|
504 |
+
},
|
505 |
+
{
|
506 |
+
"cell_type": "code",
|
507 |
+
"source": [
|
508 |
+
"res.response"
|
509 |
+
],
|
510 |
+
"metadata": {
|
511 |
+
"colab": {
|
512 |
+
"base_uri": "https://localhost:8080/",
|
513 |
+
"height": 71
|
514 |
+
},
|
515 |
+
"id": "02zdJNqIZKep",
|
516 |
+
"outputId": "76340610-0d98-4fd0-d237-ddb9f1752391"
|
517 |
+
},
|
518 |
+
"execution_count": null,
|
519 |
+
"outputs": [
|
520 |
+
{
|
521 |
+
"output_type": "execute_result",
|
522 |
+
"data": {
|
523 |
+
"text/plain": [
|
524 |
+
"'A query engine is a fundamental component used in querying processes. It is responsible for retrieving the most relevant documents from an index based on a query, postprocessing the retrieved nodes if needed, and then synthesizing a response by combining the query, relevant data, and prompt to be sent to the language model for generating an answer.'"
|
525 |
+
],
|
526 |
+
"application/vnd.google.colaboratory.intrinsic+json": {
|
527 |
+
"type": "string"
|
528 |
+
}
|
529 |
+
},
|
530 |
+
"metadata": {},
|
531 |
+
"execution_count": 28
|
532 |
+
}
|
533 |
+
]
|
534 |
+
},
|
535 |
+
{
|
536 |
+
"cell_type": "code",
|
537 |
+
"source": [
|
538 |
+
"# Show the retrieved nodes\n",
|
539 |
+
"for src in res.source_nodes:\n",
|
540 |
+
" print(\"Node ID\\t\", src.node_id)\n",
|
541 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
542 |
+
" print(\"URL\\t\", src.metadata['url'])\n",
|
543 |
+
" print(\"Score\\t\", src.score)\n",
|
544 |
+
" print(\"-_\"*20)"
|
545 |
+
],
|
546 |
+
"metadata": {
|
547 |
+
"colab": {
|
548 |
+
"base_uri": "https://localhost:8080/"
|
549 |
+
},
|
550 |
+
"id": "PuCcgP0nZSIl",
|
551 |
+
"outputId": "e136cdbb-2ee4-4dfb-f532-f6c9365e519e"
|
552 |
+
},
|
553 |
+
"execution_count": null,
|
554 |
+
"outputs": [
|
555 |
+
{
|
556 |
+
"output_type": "stream",
|
557 |
+
"name": "stdout",
|
558 |
+
"text": [
|
559 |
+
"Node ID\t 081b6c8c-d9ea-4476-bac0-1008facd3db8\n",
|
560 |
+
"Title\t Querying - LlamaIndex\n",
|
561 |
+
"URL\t https://docs.llamaindex.ai/en/stable/understanding/querying/querying/\n",
|
562 |
+
"Score\t 0.46212738505767387\n",
|
563 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
564 |
+
"Node ID\t 3786c195-c5de-4bba-98b6-996031349a88\n",
|
565 |
+
"Title\t Querying - LlamaIndex\n",
|
566 |
+
"URL\t https://docs.llamaindex.ai/en/stable/understanding/querying/querying/\n",
|
567 |
+
"Score\t 0.43141762602042416\n",
|
568 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
569 |
+
]
|
570 |
+
}
|
571 |
+
]
|
572 |
+
}
|
573 |
+
]
|
574 |
+
}
|
notebooks/Web_Search_API.ipynb
ADDED
@@ -0,0 +1,491 @@
1 |
+
{
|
2 |
+
"nbformat": 4,
|
3 |
+
"nbformat_minor": 0,
|
4 |
+
"metadata": {
|
5 |
+
"colab": {
|
6 |
+
"provenance": [],
|
7 |
+
"authorship_tag": "ABX9TyNH2OsWaT8fcT3tgDhO3NQn",
|
8 |
+
"include_colab_link": true
|
9 |
+
},
|
10 |
+
"kernelspec": {
|
11 |
+
"name": "python3",
|
12 |
+
"display_name": "Python 3"
|
13 |
+
},
|
14 |
+
"language_info": {
|
15 |
+
"name": "python"
|
16 |
+
}
|
17 |
+
},
|
18 |
+
"cells": [
|
19 |
+
{
|
20 |
+
"cell_type": "markdown",
|
21 |
+
"metadata": {
|
22 |
+
"id": "view-in-github",
|
23 |
+
"colab_type": "text"
|
24 |
+
},
|
25 |
+
"source": [
|
26 |
+
"<a href=\"https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/Web_Search_API.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
27 |
+
]
|
28 |
+
},
|
29 |
+
{
|
30 |
+
"cell_type": "code",
|
31 |
+
"execution_count": null,
|
32 |
+
"metadata": {
|
33 |
+
"colab": {
|
34 |
+
"base_uri": "https://localhost:8080/"
|
35 |
+
},
|
36 |
+
"id": "JboB5VaCJUrb",
|
37 |
+
"outputId": "b7221d06-8783-4586-f98a-72af45cae54f"
|
38 |
+
},
|
39 |
+
"outputs": [
|
40 |
+
{
|
41 |
+
"output_type": "stream",
|
42 |
+
"name": "stdout",
|
43 |
+
"text": [
|
44 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m211.1/211.1 kB\u001b[0m \u001b[31m4.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
45 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m81.3/81.3 kB\u001b[0m \u001b[31m8.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
46 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m97.6/97.6 kB\u001b[0m \u001b[31m10.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
47 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
48 |
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.4/7.4 MB\u001b[0m \u001b[31m24.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
|
49 |
+
"\u001b[?25h Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
50 |
+
" Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
51 |
+
" Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
52 |
+
" Building wheel for tinysegmenter (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
53 |
+
" Building wheel for feedfinder2 (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
54 |
+
" Building wheel for jieba3k (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
|
55 |
+
" Building wheel for sgmllib3k (setup.py) ... \u001b[?25l\u001b[?25hdone\n"
|
56 |
+
]
|
57 |
+
}
|
58 |
+
],
|
59 |
+
"source": [
|
60 |
+
"!pip install -q llama-index==0.10.5 openai==1.12.0 tiktoken==0.6.0 llama-index-tools-google==0.1.3 newspaper3k==0.2.8"
|
61 |
+
]
|
62 |
+
},
|
63 |
+
{
|
64 |
+
"cell_type": "code",
|
65 |
+
"source": [
|
66 |
+
"import os\n",
|
67 |
+
"\n",
|
68 |
+
"# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by OpenAI client later.\n",
|
69 |
+
"os.environ[\"OPENAI_API_KEY\"] = \"[OPENAI_API_KEY]\"\n",
|
70 |
+
"GOOGLE_SEARCH_KEY = \"[GOOGLE_SEARCH_KEY]\"\n",
|
71 |
+
"GOOGLE_SEARCH_ENGINE = \"[GOOGLE_SEARCH_ENGINE]\""
|
72 |
+
],
|
73 |
+
"metadata": {
|
74 |
+
"id": "1NKAn5scN_g9"
|
75 |
+
},
|
76 |
+
"execution_count": null,
|
77 |
+
"outputs": []
|
78 |
+
},
|
79 |
+
{
|
80 |
+
"cell_type": "markdown",
|
81 |
+
"source": [
|
82 |
+
"# Using Agents/Tools"
|
83 |
+
],
|
84 |
+
"metadata": {
|
85 |
+
"id": "ex1gQVHvITMI"
|
86 |
+
}
|
87 |
+
},
|
88 |
+
{
|
89 |
+
"cell_type": "markdown",
|
90 |
+
"source": [
|
91 |
+
"## Define Google Search Tool"
|
92 |
+
],
|
93 |
+
"metadata": {
|
94 |
+
"id": "0LMypoqUyuXq"
|
95 |
+
}
|
96 |
+
},
|
97 |
+
{
|
98 |
+
"cell_type": "code",
|
99 |
+
"source": [
|
100 |
+
"from llama_index.tools.google import GoogleSearchToolSpec\n",
|
101 |
+
"\n",
|
102 |
+
"tool_spec = GoogleSearchToolSpec(key=GOOGLE_SEARCH_KEY, engine=GOOGLE_SEARCH_ENGINE)"
|
103 |
+
],
|
104 |
+
"metadata": {
|
105 |
+
"id": "4Q7sc69nJvWI"
|
106 |
+
},
|
107 |
+
"execution_count": null,
|
108 |
+
"outputs": []
|
109 |
+
},
|
110 |
+
{
|
111 |
+
"cell_type": "code",
|
112 |
+
"source": [
|
113 |
+
"# Import and initialize our tool spec\n",
|
114 |
+
"from llama_index.core.tools.tool_spec.load_and_search import LoadAndSearchToolSpec\n",
|
115 |
+
"\n",
|
116 |
+
"# Wrap the google search tool to create an index on top of the returned Google search\n",
|
117 |
+
"wrapped_tool = LoadAndSearchToolSpec.from_defaults(\n",
|
118 |
+
" tool_spec.to_tool_list()[0],\n",
|
119 |
+
").to_tool_list()"
|
120 |
+
],
|
121 |
+
"metadata": {
|
122 |
+
"id": "VrbuIOaMeOIf"
|
123 |
+
},
|
124 |
+
"execution_count": null,
|
125 |
+
"outputs": []
|
126 |
+
},
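Wrapping the search tool this way yields two tools rather than one: the original `google_search` tool, whose results are now loaded into an in-memory index, and a companion `read_google_search` tool that queries that index (both names show up later in `res.sources`). A quick way to inspect what the agent will receive, assuming only the `wrapped_tool` list above:

```python
# List the tools the agent will be given after wrapping.
for tool in wrapped_tool:
    print(tool.metadata.name, "-", tool.metadata.description)
```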
|
127 |
+
{
|
128 |
+
"cell_type": "markdown",
|
129 |
+
"source": [
|
130 |
+
"## Create the Agent"
|
131 |
+
],
|
132 |
+
"metadata": {
|
133 |
+
"id": "T3ENpLyBy7UL"
|
134 |
+
}
|
135 |
+
},
|
136 |
+
{
|
137 |
+
"cell_type": "code",
|
138 |
+
"source": [
|
139 |
+
"from llama_index.agent.openai import OpenAIAgent\n",
|
140 |
+
"\n",
|
141 |
+
"agent = OpenAIAgent.from_tools(wrapped_tool, verbose=False)"
|
142 |
+
],
|
143 |
+
"metadata": {
|
144 |
+
"id": "-_Ab47ppK8b2"
|
145 |
+
},
|
146 |
+
"execution_count": null,
|
147 |
+
"outputs": []
|
148 |
+
},
|
149 |
+
{
|
150 |
+
"cell_type": "code",
|
151 |
+
"source": [
|
152 |
+
"res = agent.chat(\"How many parameters LLaMA2 model has?\")"
|
153 |
+
],
|
154 |
+
"metadata": {
|
155 |
+
"id": "YcUyz1-FlCQ8"
|
156 |
+
},
|
157 |
+
"execution_count": null,
|
158 |
+
"outputs": []
|
159 |
+
},
|
160 |
+
{
|
161 |
+
"cell_type": "code",
|
162 |
+
"source": [
|
163 |
+
"res.response"
|
164 |
+
],
|
165 |
+
"metadata": {
|
166 |
+
"colab": {
|
167 |
+
"base_uri": "https://localhost:8080/",
|
168 |
+
"height": 35
|
169 |
+
},
|
170 |
+
"id": "w4wK5sY-lOOv",
|
171 |
+
"outputId": "8090a106-6fac-4514-fdbd-c72a01b28169"
|
172 |
+
},
|
173 |
+
"execution_count": null,
|
174 |
+
"outputs": [
|
175 |
+
{
|
176 |
+
"output_type": "execute_result",
|
177 |
+
"data": {
|
178 |
+
"text/plain": [
|
179 |
+
"'The LLaMA2 model has parameters available in three different sizes: 7 billion, 13 billion, and 70 billion.'"
|
180 |
+
],
|
181 |
+
"application/vnd.google.colaboratory.intrinsic+json": {
|
182 |
+
"type": "string"
|
183 |
+
}
|
184 |
+
},
|
185 |
+
"metadata": {},
|
186 |
+
"execution_count": 72
|
187 |
+
}
|
188 |
+
]
|
189 |
+
},
|
190 |
+
{
|
191 |
+
"cell_type": "code",
|
192 |
+
"source": [
|
193 |
+
"res.sources"
|
194 |
+
],
|
195 |
+
"metadata": {
|
196 |
+
"colab": {
|
197 |
+
"base_uri": "https://localhost:8080/"
|
198 |
+
},
|
199 |
+
"id": "TM_cvBA1nTJM",
|
200 |
+
"outputId": "0bf3533a-c62d-4d0d-bd76-76c043477042"
|
201 |
+
},
|
202 |
+
"execution_count": null,
|
203 |
+
"outputs": [
|
204 |
+
{
|
205 |
+
"output_type": "execute_result",
|
206 |
+
"data": {
|
207 |
+
"text/plain": [
|
208 |
+
"[ToolOutput(content='Content loaded! You can now search the information using read_google_search', tool_name='google_search', raw_input={'args': (), 'kwargs': {'query': 'parameters of LLaMA2 model'}}, raw_output='Content loaded! You can now search the information using read_google_search', is_error=False),\n",
|
209 |
+
" ToolOutput(content='Answer: The parameters of the LLaMA2 model are available in three different sizes: 7 billion, 13 billion, and 70 billion.', tool_name='read_google_search', raw_input={'args': (), 'kwargs': {'query': 'parameters of LLaMA2 model'}}, raw_output='Answer: The parameters of the LLaMA2 model are available in three different sizes: 7 billion, 13 billion, and 70 billion.', is_error=False)]"
|
210 |
+
]
|
211 |
+
},
|
212 |
+
"metadata": {},
|
213 |
+
"execution_count": 73
|
214 |
+
}
|
215 |
+
]
|
216 |
+
},
|
217 |
+
{
|
218 |
+
"cell_type": "markdown",
|
219 |
+
"source": [
|
220 |
+
"# Using Tools w/ VectorStoreIndex"
|
221 |
+
],
|
222 |
+
"metadata": {
|
223 |
+
"id": "who-NM4pIhPn"
|
224 |
+
}
|
225 |
+
},
|
226 |
+
{
|
227 |
+
"cell_type": "markdown",
|
228 |
+
"source": [
|
229 |
+
"A limitation of the current agent/tool in LlamaIndex is that it **relies solely on the page description from the retrieved pages** to answer questions. This approach will miss answers that are not visible in the page's description tag. To address this, a possible workaround is to fetch the page results, extract the page content using the newspaper3k library, and then create an index based on the downloaded content. Also, the previous method stacks all retrieved items from the search engine into a single document, making it **difficult to pinpoint the exact source** of the response. However, the following method will enable us to present the sources easily."
|
230 |
+
],
|
231 |
+
"metadata": {
|
232 |
+
"id": "9g9cTM9GI-19"
|
233 |
+
}
|
234 |
+
},
|
235 |
+
{
|
236 |
+
"cell_type": "markdown",
|
237 |
+
"source": [
|
238 |
+
"## Define Google Search Tool"
|
239 |
+
],
|
240 |
+
"metadata": {
|
241 |
+
"id": "31G_fxxJIsbC"
|
242 |
+
}
|
243 |
+
},
|
244 |
+
{
|
245 |
+
"cell_type": "code",
|
246 |
+
"source": [
|
247 |
+
"from llama_index.tools.google import GoogleSearchToolSpec\n",
|
248 |
+
"\n",
|
249 |
+
"tool_spec = GoogleSearchToolSpec(key=GOOGLE_SEARCH_KEY, engine=GOOGLE_SEARCH_ENGINE)"
|
250 |
+
],
|
251 |
+
"metadata": {
|
252 |
+
"id": "lwRmj2odIHxt"
|
253 |
+
},
|
254 |
+
"execution_count": null,
|
255 |
+
"outputs": []
|
256 |
+
},
|
257 |
+
{
|
258 |
+
"cell_type": "code",
|
259 |
+
"source": [
|
260 |
+
"search_results = tool_spec.google_search(\"LLaMA2 model details\")"
|
261 |
+
],
|
262 |
+
"metadata": {
|
263 |
+
"id": "UVIxdj04Bsf2"
|
264 |
+
},
|
265 |
+
"execution_count": null,
|
266 |
+
"outputs": []
|
267 |
+
},
|
268 |
+
{
|
269 |
+
"cell_type": "code",
|
270 |
+
"source": [
|
271 |
+
"import json\n",
|
272 |
+
"\n",
|
273 |
+
"search_results = json.loads( search_results[0].text )"
|
274 |
+
],
|
275 |
+
"metadata": {
|
276 |
+
"id": "AlYDNfg2BsdQ"
|
277 |
+
},
|
278 |
+
"execution_count": null,
|
279 |
+
"outputs": []
|
280 |
+
},
|
281 |
+
{
|
282 |
+
"cell_type": "markdown",
|
283 |
+
"source": [
|
284 |
+
"## Read Each URL Contents"
|
285 |
+
],
|
286 |
+
"metadata": {
|
287 |
+
"id": "pHALd3uhIxtQ"
|
288 |
+
}
|
289 |
+
},
|
290 |
+
{
|
291 |
+
"cell_type": "code",
|
292 |
+
"source": [
|
293 |
+
"import newspaper\n",
|
294 |
+
"pages_content = []\n",
|
295 |
+
"\n",
|
296 |
+
"for item in search_results['items']:\n",
|
297 |
+
"\n",
|
298 |
+
" try:\n",
|
299 |
+
" article = newspaper.Article( item['link'] )\n",
|
300 |
+
" article.download()\n",
|
301 |
+
" article.parse()\n",
|
302 |
+
" if len(article.text) > 0:\n",
|
303 |
+
" pages_content.append({ \"url\": item['link'], \"text\": article.text, \"title\": item['title'] })\n",
|
304 |
+
" except:\n",
|
305 |
+
" continue\n",
|
306 |
+
"\n",
|
307 |
+
"print(len(pages_content))"
|
308 |
+
],
|
309 |
+
"metadata": {
|
310 |
+
"colab": {
|
311 |
+
"base_uri": "https://localhost:8080/"
|
312 |
+
},
|
313 |
+
"id": "jXz3JFduBsaq",
|
314 |
+
"outputId": "1b795423-26a6-4a61-a878-cca5e27dd5d1"
|
315 |
+
},
|
316 |
+
"execution_count": null,
|
317 |
+
"outputs": [
|
318 |
+
{
|
319 |
+
"output_type": "stream",
|
320 |
+
"name": "stdout",
|
321 |
+
"text": [
|
322 |
+
"8\n"
|
323 |
+
]
|
324 |
+
}
|
325 |
+
]
|
326 |
+
},
|
327 |
+
{
|
328 |
+
"cell_type": "markdown",
|
329 |
+
"source": [
|
330 |
+
"## Create the Index"
|
331 |
+
],
|
332 |
+
"metadata": {
|
333 |
+
"id": "iqxa_qRVI3G0"
|
334 |
+
}
|
335 |
+
},
|
336 |
+
{
|
337 |
+
"cell_type": "code",
|
338 |
+
"source": [
|
339 |
+
"from llama_index.core import Document\n",
|
340 |
+
"\n",
|
341 |
+
"# Convert the texts to Document objects so the LlamaIndex framework can process them.\n",
|
342 |
+
"documents = [Document(text=row[\"text\"], metadata={\"title\": row[\"title\"], \"url\": row[\"url\"]}) for row in pages_content]"
|
343 |
+
],
|
344 |
+
"metadata": {
|
345 |
+
"id": "O4PkK8DuBsZT"
|
346 |
+
},
|
347 |
+
"execution_count": null,
|
348 |
+
"outputs": []
|
349 |
+
},
|
350 |
+
{
|
351 |
+
"cell_type": "code",
|
352 |
+
"source": [
|
353 |
+
"from llama_index.core import VectorStoreIndex\n",
|
354 |
+
"from llama_index.core.node_parser import SentenceSplitter\n",
|
355 |
+
"\n",
|
356 |
+
"# Build index / generate embeddings using OpenAI.\n",
|
357 |
+
"index = VectorStoreIndex.from_documents(\n",
|
358 |
+
" documents,\n",
|
359 |
+
" transformations=[SentenceSplitter(chunk_size=512, chunk_overlap=64)],\n",
|
360 |
+
")"
|
361 |
+
],
|
362 |
+
"metadata": {
|
363 |
+
"id": "2RtMBWpgBsWX"
|
364 |
+
},
|
365 |
+
"execution_count": null,
|
366 |
+
"outputs": []
|
367 |
+
},
|
368 |
+
{
|
369 |
+
"cell_type": "code",
|
370 |
+
"source": [
|
371 |
+
"# Define a query engine that is responsible for retrieving related pieces of text,\n",
|
372 |
+
"# and using a LLM to formulate the final answer.\n",
|
373 |
+
"query_engine = index.as_query_engine()"
|
374 |
+
],
|
375 |
+
"metadata": {
|
376 |
+
"id": "xV_ibEZ_BsM4"
|
377 |
+
},
|
378 |
+
"execution_count": null,
|
379 |
+
"outputs": []
|
380 |
+
},
|
381 |
+
{
|
382 |
+
"cell_type": "markdown",
|
383 |
+
"source": [
|
384 |
+
"## Query"
|
385 |
+
],
|
386 |
+
"metadata": {
|
387 |
+
"id": "nziwu27MI6ih"
|
388 |
+
}
|
389 |
+
},
|
390 |
+
{
|
391 |
+
"cell_type": "code",
|
392 |
+
"source": [
|
393 |
+
"response = query_engine.query(\n",
|
394 |
+
" \"How many parameters LLaMA2 model has?\"\n",
|
395 |
+
")\n",
|
396 |
+
"print(response)"
|
397 |
+
],
|
398 |
+
"metadata": {
|
399 |
+
"colab": {
|
400 |
+
"base_uri": "https://localhost:8080/"
|
401 |
+
},
|
402 |
+
"id": "5K1h2_t-HNPe",
|
403 |
+
"outputId": "58ce5d66-eddc-43fe-e7c8-d78bc0cb8c32"
|
404 |
+
},
|
405 |
+
"execution_count": null,
|
406 |
+
"outputs": [
|
407 |
+
{
|
408 |
+
"output_type": "stream",
|
409 |
+
"name": "stdout",
|
410 |
+
"text": [
|
411 |
+
"LLaMA2 model has sizes ranging from 7 to 70 billion parameters.\n"
|
412 |
+
]
|
413 |
+
}
|
414 |
+
]
|
415 |
+
},
|
416 |
+
{
|
417 |
+
"cell_type": "code",
|
418 |
+
"source": [
|
419 |
+
"response = query_engine.query(\n",
|
420 |
+
" \"How many parameters LLaMA2 model has? list exact sizes.\"\n",
|
421 |
+
")\n",
|
422 |
+
"print(response)"
|
423 |
+
],
|
424 |
+
"metadata": {
|
425 |
+
"colab": {
|
426 |
+
"base_uri": "https://localhost:8080/"
|
427 |
+
},
|
428 |
+
"id": "Xea7ZeidH27i",
|
429 |
+
"outputId": "d455c379-9c91-4c9e-e9c1-6bd2deb7342e"
|
430 |
+
},
|
431 |
+
"execution_count": null,
|
432 |
+
"outputs": [
|
433 |
+
{
|
434 |
+
"output_type": "stream",
|
435 |
+
"name": "stdout",
|
436 |
+
"text": [
|
437 |
+
"The LLaMA2 model comes in several sizes with different numbers of parameters:\n",
|
438 |
+
"- LLaMA2 7B\n",
|
439 |
+
"- LLaMA2 13B\n",
|
440 |
+
"- LLaMA2 33B\n",
|
441 |
+
"- LLaMA2 65B\n"
|
442 |
+
]
|
443 |
+
}
|
444 |
+
]
|
445 |
+
},
|
446 |
+
{
|
447 |
+
"cell_type": "code",
|
448 |
+
"source": [
|
449 |
+
"# Show the retrieved nodes\n",
|
450 |
+
"for src in response.source_nodes:\n",
|
451 |
+
" print(\"Title\\t\", src.metadata['title'])\n",
|
452 |
+
" print(\"Source\\t\", src.metadata['url'])\n",
|
453 |
+
" print(\"Score\\t\", src.score)\n",
|
454 |
+
" print(\"-_\"*20)"
|
455 |
+
],
|
456 |
+
"metadata": {
|
457 |
+
"colab": {
|
458 |
+
"base_uri": "https://localhost:8080/"
|
459 |
+
},
|
460 |
+
"id": "4QpGPD5nHORP",
|
461 |
+
"outputId": "8f9fc185-7745-4357-8471-25d34726cdd8"
|
462 |
+
},
|
463 |
+
"execution_count": null,
|
464 |
+
"outputs": [
|
465 |
+
{
|
466 |
+
"output_type": "stream",
|
467 |
+
"name": "stdout",
|
468 |
+
"text": [
|
469 |
+
"Title\t Introducing LLaMA: A foundational, 65-billion-parameter language ...\n",
|
470 |
+
"Source\t https://ai.meta.com/blog/large-language-model-llama-meta-ai/\n",
|
471 |
+
"Score\t 0.8124383491026671\n",
|
472 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n",
|
473 |
+
"Title\t Llama 2 follow-up: too much RLHF, GPU sizing, technical details\n",
|
474 |
+
"Source\t https://www.interconnects.ai/p/llama-2-part-2\n",
|
475 |
+
"Score\t 0.8046542892214631\n",
|
476 |
+
"-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\n"
|
477 |
+
]
|
478 |
+
}
|
479 |
+
]
|
480 |
+
},
|
481 |
+
{
|
482 |
+
"cell_type": "code",
|
483 |
+
"source": [],
|
484 |
+
"metadata": {
|
485 |
+
"id": "B5b4nZ-qHpdP"
|
486 |
+
},
|
487 |
+
"execution_count": null,
|
488 |
+
"outputs": []
|
489 |
+
}
|
490 |
+
]
|
491 |
+
}
|
requirements.txt
ADDED
@@ -0,0 +1,18 @@
+openai
+llama-index
+llama-index-vector-stores-chroma
+pydantic
+numpy
+cohere
+tiktoken
+chromadb
+kaleido
+python-multipart
+html2text
+sentence_transformers
+ipykernel
+gradio
+instructor
+pydantic
+pyarrow
+pymongo
scripts/basic_tutor.py
ADDED
@@ -0,0 +1,60 @@
+import sys
+import os
+from openai import OpenAI
+
+# Retrieve your OpenAI API key from the environment variables and activate the OpenAI client
+openai_api_key = os.environ.get("OPENAI_API_KEY")
+client = OpenAI(api_key=openai_api_key)
+
+
+def ask_ai_tutor(question):
+
+    # Check if the OpenAI key has been correctly added
+    if not openai_api_key:
+        return "OpenAI API key not found in environment variables."
+
+    try:
+
+        # Formulate the system prompt
+        system_prompt = (
+            "You are an AI tutor specialized in answering artificial intelligence-related questions. "
+            "Only answer AI-related questions, else say that you cannot answer this question."
+        )
+
+        # Wrap the user's question in an instruction prompt
+        prompt = f"Please provide an informative and accurate answer to the following question.\nQuestion: {question}\nAnswer:"
+
+        # Call the OpenAI API
+        response = client.chat.completions.create(
+            model="gpt-3.5-turbo-0125",
+            messages=[
+                {"role": "system", "content": system_prompt},
+                {"role": "user", "content": prompt},
+            ],
+        )
+
+        # Return the AI's response
+        return response.choices[0].message.content.strip()
+
+    except Exception as e:
+        return f"An error occurred: {e}"
+
+
+def main():
+    # Check if a question was provided as a command-line argument
+    if len(sys.argv) != 2:
+        print("Usage: python script_name.py 'Your AI-related question'")
+        sys.exit(1)
+
+    # The user's question is the first command-line argument
+    user_question = sys.argv[1]
+
+    # Get the AI's response
+    ai_response = ask_ai_tutor(user_question)
+
+    # Print the AI's response
+    print(f"AI Tutor says: {ai_response}")
+
+
+if __name__ == "__main__":
+    main()
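
For quick testing, the tutor can also be imported as a module instead of run from the command line. A minimal sketch, assuming OPENAI_API_KEY is set in the environment and the working directory is scripts/ (the question string is only an example):

from basic_tutor import ask_ai_tutor

# Prints the tutor's answer, or an error message if the API call fails.
print(ask_ai_tutor("What is the difference between fine-tuning and prompting?"))
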
scripts/call_openai.py
ADDED
@@ -0,0 +1,79 @@
+import os
+import logging
+
+import instructor
+import openai
+from openai import OpenAI, AsyncOpenAI
+from dotenv import load_dotenv
+
+logger = logging.getLogger(__name__)
+logging.basicConfig(level=logging.INFO)
+
+load_dotenv(".env")
+OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
+
+
+def api_function_call(
+    system_message,
+    query: str,
+    model: str = "gpt-4-0125-preview",
+    response_model=None,
+    max_retries: int = 0,
+    stream: bool = False,
+):
+
+    client = instructor.patch(OpenAI())
+    try:
+        message_data = {
+            "model": model,
+            "messages": [
+                {"role": "system", "content": system_message},
+                {"role": "user", "content": query},
+            ],
+            "max_retries": max_retries,
+            "stream": stream,
+        }
+        if response_model is not None:
+            message_data["response_model"] = response_model
+
+        response = client.chat.completions.create(**message_data)
+        error = False
+
+    except openai.BadRequestError:
+        error = True
+        logger.exception("Invalid request to OpenAI API. See traceback:")
+        error_message = (
+            "Something went wrong while connecting with OpenAI, try again soon!"
+        )
+        return error_message, error
+
+    except openai.RateLimitError:
+        error = True
+        logger.exception("RateLimit error from OpenAI. See traceback:")
+        error_message = "OpenAI servers seem to be overloaded, try again later!"
+        return error_message, error
+
+    except Exception:
+        error = True
+        logger.exception(
+            "Something went wrong while generating the response. See traceback:"
+        )
+        error_message = (
+            "Something went wrong while connecting with OpenAI, try again soon!"
+        )
+        return error_message, error
+
+    if stream is True and response_model is None:
+
+        def answer_generator():
+            for chunk in response:
+                token = chunk.choices[0].delta.content
+
+                token = "" if token is None else token
+
+                yield token
+
+        return answer_generator(), error
+
+    else:
+        return response, error
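
A minimal sketch of how api_function_call can be combined with an instructor response model; the Answer Pydantic model below is hypothetical and only illustrates the response_model path (with stream=False, the parsed model instance is returned):

from pydantic import BaseModel, Field

from call_openai import api_function_call


class Answer(BaseModel):
    # Hypothetical structured output, not part of this repository.
    answer: str = Field(description="The model's answer to the question.")


response, error = api_function_call(
    system_message="You are a helpful AI tutor.",
    query="What is an embedding?",
    response_model=Answer,
    max_retries=1,
)
if not error:
    print(response.answer)
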
scripts/create_db.ipynb
ADDED
@@ -0,0 +1,380 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Create AI-Tutor vector database"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "# Set the \"OPENAI_API_KEY\" in the Python environment. Will be used by the OpenAI client later.\n",
+    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import nest_asyncio\n",
+    "\n",
+    "nest_asyncio.apply()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import chromadb\n",
+    "\n",
+    "# Create the client and a new collection.\n",
+    "# (chromadb.EphemeralClient would save data in-memory instead.)\n",
+    "chroma_client = chromadb.PersistentClient(path=\"./ai-tutor-db\")\n",
+    "chroma_collection = chroma_client.create_collection(\"ai-tutor-db\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from llama_index.vector_stores.chroma import ChromaVectorStore\n",
+    "from llama_index.core import StorageContext\n",
+    "\n",
+    "# Define a storage context object using the created vector database.\n",
+    "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)\n",
+    "storage_context = StorageContext.from_defaults(vector_store=vector_store)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import json\n",
+    "from llama_index.core.schema import TextNode\n",
+    "\n",
+    "\n",
+    "def load_jsonl_create_nodes(filepath):\n",
+    "    nodes = []  # List to hold the created node objects\n",
+    "    with open(filepath, \"r\") as file:\n",
+    "        for line in file:\n",
+    "            # Load each line as a JSON object\n",
+    "            json_obj = json.loads(line)\n",
+    "            # Extract required information\n",
+    "            title = json_obj.get(\"title\")\n",
+    "            url = json_obj.get(\"url\")\n",
+    "            content = json_obj.get(\"content\")\n",
+    "            source = json_obj.get(\"source\")\n",
+    "            # Create a TextNode object and append to the list\n",
+    "            node = TextNode(\n",
+    "                text=content,\n",
+    "                metadata={\"title\": title, \"url\": url, \"source\": source},\n",
+    "                excluded_embed_metadata_keys=[\"title\", \"url\", \"source\"],\n",
+    "                excluded_llm_metadata_keys=[\"title\", \"url\", \"source\"],\n",
+    "            )\n",
+    "            nodes.append(node)\n",
+    "    return nodes"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "filepath = \"../data/ai-tutor-csv-files/combined_data_lines.jsonl\"\n",
+    "nodes = load_jsonl_create_nodes(filepath)\n",
+    "\n",
+    "print(f\"Loaded {len(nodes)} nodes/chunks from the JSONL file\\n\")\n",
+    "\n",
+    "node = nodes[0]\n",
+    "print(f\"ID: {node.id_} \\nText: {node.text}, \\nMetadata: {node.metadata}\")\n",
+    "\n",
+    "print(\"\\n\")\n",
+    "\n",
+    "node = nodes[-10000]\n",
+    "print(f\"ID: {node.id_} \\nText: {node.text}, \\nMetadata: {node.metadata}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# # Create the pipeline to apply the transformation on each chunk,\n",
+    "# # and store the transformed text in the chroma vector store.\n",
+    "# pipeline = IngestionPipeline(\n",
+    "#     transformations=[\n",
+    "#         text_splitter,\n",
+    "#         QuestionsAnsweredExtractor(questions=3, llm=llm),\n",
+    "#         SummaryExtractor(summaries=[\"prev\", \"self\"], llm=llm),\n",
+    "#         KeywordExtractor(keywords=10, llm=llm),\n",
+    "#         OpenAIEmbedding(),\n",
+    "#     ],\n",
+    "#     vector_store=vector_store\n",
+    "# )\n",
+    "\n",
+    "# nodes = pipeline.run(documents=documents, show_progress=True);"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
+    "from llama_index.core import VectorStoreIndex\n",
+    "\n",
+    "# embeds = OpenAIEmbedding(model=\"text-embedding-3-small\", mode=\"similarity\")\n",
+    "# embeds = OpenAIEmbedding(model=\"text-embedding-3-large\", mode=\"similarity\")\n",
+    "embeds = OpenAIEmbedding(model=\"text-embedding-3-large\", mode=\"text_search\")\n",
+    "# embeds = OpenAIEmbedding(model=\"text-embedding-ada-002\", mode=\"similarity\")\n",
+    "\n",
+    "# Build the index / generate embeddings using OpenAI.\n",
+    "index = VectorStoreIndex(\n",
+    "    nodes=nodes,\n",
+    "    show_progress=True,\n",
+    "    use_async=True,\n",
+    "    storage_context=storage_context,\n",
+    "    embed_model=embeds,\n",
+    "    insert_batch_size=3000,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from llama_index.llms.openai import OpenAI\n",
+    "\n",
+    "llm = OpenAI(temperature=0, model=\"gpt-3.5-turbo-0125\", max_tokens=None)\n",
+    "query_engine = index.as_query_engine(llm=llm, similarity_top_k=5, embed_model=embeds)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "res = query_engine.query(\"What is the LLama model?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "res.response"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for src in res.source_nodes:\n",
+    "    print(\"Node ID\\t\", src.node_id)\n",
+    "    print(\"Title\\t\", src.metadata['title'])\n",
+    "    print(\"Text\\t\", src.text)\n",
+    "    print(\"Score\\t\", src.score)\n",
+    "    print(\"Metadata\\t\", src.metadata)\n",
+    "    print(\"-_\" * 20)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Load DB from disk"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import chromadb\n",
+    "from llama_index.vector_stores.chroma import ChromaVectorStore\n",
+    "\n",
+    "# Load the persisted vector database from disk.\n",
+    "db2 = chromadb.PersistentClient(path=\"./ai-tutor-db\")\n",
+    "chroma_collection = db2.get_or_create_collection(\"ai-tutor-db\")\n",
+    "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create your index\n",
+    "from llama_index.core import VectorStoreIndex\n",
+    "\n",
+    "index = VectorStoreIndex.from_vector_store(vector_store=vector_store)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
+    "from llama_index.llms.openai import OpenAI\n",
+    "from llama_index.core.vector_stores import (\n",
+    "    ExactMatchFilter,\n",
+    "    MetadataFilters,\n",
+    "    MetadataFilter,\n",
+    "    FilterOperator,\n",
+    "    FilterCondition,\n",
+    ")\n",
+    "\n",
+    "\n",
+    "filters = MetadataFilters(\n",
+    "    filters=[\n",
+    "        MetadataFilter(key=\"source\", value=\"langchain_course\"),\n",
+    "        MetadataFilter(key=\"source\", value=\"langchain_docs\"),\n",
+    "    ],\n",
+    "    condition=FilterCondition.OR,\n",
+    ")\n",
+    "\n",
+    "llm = OpenAI(temperature=0, model=\"gpt-3.5-turbo-0125\", max_tokens=None)\n",
+    "embeds = OpenAIEmbedding(model=\"text-embedding-3-large\", mode=\"text_search\")\n",
+    "# query_engine = index.as_query_engine(\n",
+    "#     llm=llm, similarity_top_k=5, embed_model=embeds, verbose=True, streaming=True, filters=filters\n",
+    "# )\n",
+    "query_engine = index.as_query_engine(\n",
+    "    llm=llm, similarity_top_k=5, embed_model=embeds, verbose=True,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "res = query_engine.query(\"What is the LLama model?\")\n",
+    "\n",
+    "# history = \"\"\n",
+    "# for token in res.response_gen:\n",
+    "#     history += token\n",
+    "#     print(history)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "res.response"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for src in res.source_nodes:\n",
+    "    print(\"Node ID\\t\", src.node_id)\n",
+    "    print(\"Source\\t\", src.metadata['source'])\n",
+    "    print(\"Title\\t\", src.metadata['title'])\n",
+    "    print(\"Text\\t\", src.text)\n",
+    "    print(\"Score\\t\", src.score)\n",
+    "    print(\"-_\" * 20)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from IPython.display import Markdown, display\n",
+    "\n",
+    "\n",
+    "# Define a prompt viewing function.\n",
+    "def display_prompt_dict(prompts_dict):\n",
+    "    for k, p in prompts_dict.items():\n",
+    "        text_md = f\"**Prompt Key**: {k}<br>\" f\"**Text:** <br>\"\n",
+    "        display(Markdown(text_md))\n",
+    "        print(p.get_template())\n",
+    "        display(Markdown(\"<br><br>\"))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "prompts_dict = query_engine.get_prompts()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "display_prompt_dict(prompts_dict)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "env",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
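
Note that the notebook defines a MetadataFilters object but the active as_query_engine call does not pass it. A minimal sketch of how the filters could be applied, assuming the index, llm, embeds, and filters objects from the notebook above (the notebook's own commented-out cell shows the same keyword argument):

# Restrict retrieval to the sources selected by the OR-ed metadata filters.
query_engine_filtered = index.as_query_engine(
    llm=llm,
    similarity_top_k=5,
    embed_model=embeds,
    filters=filters,
)
res = query_engine_filtered.query("What is the LLama model?")
print(res.response)
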
scripts/gradio-ui.py
ADDED
@@ -0,0 +1,295 @@
+import os
+import logging
+from typing import Optional
+from datetime import datetime
+
+import chromadb
+from llama_index.core.tools import QueryEngineTool, FunctionTool, ToolMetadata
+from llama_index.agent.openai import OpenAIAgent
+from llama_index.vector_stores.chroma import ChromaVectorStore
+from llama_index.core import VectorStoreIndex
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.llms.openai import OpenAI
+from llama_index.core.vector_stores import (
+    MetadataFilters,
+    MetadataFilter,
+    FilterCondition,
+)
+import gradio as gr
+from gradio.themes.utils import (
+    fonts,
+)
+
+from utils import init_mongo_db
+from tutor_prompts import (
+    TEXT_QA_TEMPLATE,
+    QueryValidation,
+    system_message_validation,
+    system_message_openai_agent,
+)
+from call_openai import api_function_call
+
+logger = logging.getLogger(__name__)
+logging.basicConfig(level=logging.INFO)
+logging.getLogger("httpx").setLevel(logging.WARNING)
+
+# # These variables are used to intercept API calls
+# # launch mitmweb
+# cert_file = "/Users/omar/Downloads/mitmproxy-ca-cert.pem"
+# os.environ["REQUESTS_CA_BUNDLE"] = cert_file
+# os.environ["SSL_CERT_FILE"] = cert_file
+# os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"
+
+CONCURRENCY_COUNT = int(os.getenv("CONCURRENCY_COUNT", 64))
+MONGODB_URI = os.getenv("MONGODB_URI")
+
+AVAILABLE_SOURCES_UI = [
+    "Gen AI 360: LLMs",
+    "Gen AI 360: LangChain",
+    "Gen AI 360: Advanced RAG",
+    "Towards AI Blog",
+    "Activeloop Docs",
+    "HF Transformers Docs",
+    "Wikipedia",
+    "OpenAI Docs",
+    "LangChain Docs",
+]
+
+AVAILABLE_SOURCES = [
+    "llm_course",
+    "langchain_course",
+    "advanced_rag_course",
+    "towards_ai",
+    "activeloop",
+    "hf_transformers",
+    "wikipedia",
+    "openai",
+    "langchain_docs",
+]
+
+# Initialize MongoDB
+mongo_db = (
+    init_mongo_db(uri=MONGODB_URI, db_name="towardsai-buster")
+    if MONGODB_URI
+    else logger.warning("No mongodb uri found, you will not be able to save data.")
+)
+
+# Initialize vector store and index
+db2 = chromadb.PersistentClient(path="scripts/ai-tutor-db")
+chroma_collection = db2.get_or_create_collection("ai-tutor-db")
+vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
+index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
+
+# Initialize OpenAI models
+llm = OpenAI(temperature=0, model="gpt-3.5-turbo-0125", max_tokens=None)
+# embeds = OpenAIEmbedding(model="text-embedding-3-large", mode="text_search")
+embeds = OpenAIEmbedding(model="text-embedding-3-large", mode="similarity")
+
+query_engine = index.as_query_engine(
+    llm=llm,
+    similarity_top_k=5,
+    embed_model=embeds,
+    streaming=True,
+    text_qa_template=TEXT_QA_TEMPLATE,
+)
+
+query_engine_tools = [
+    QueryEngineTool(
+        query_engine=query_engine,
+        metadata=ToolMetadata(
+            name="AI_information",
+            description="""The 'AI_information' tool serves as a comprehensive repository for insights into the field of artificial intelligence. When utilizing this tool, the input should be the user's complete question. The input can also be adapted to focus on specific aspects or further details of the current topic under discussion. This dynamic input approach allows for a tailored exploration of AI subjects, ensuring that responses are relevant and informative. Employ this tool to fetch nuanced information on topics such as model training, fine-tuning, LLM augmentation, and more, thereby facilitating a rich, context-aware dialogue.""",
+        ),
+    )
+]
+
+
+def initialize_agent():
+    agent = OpenAIAgent.from_tools(
+        query_engine_tools,
+        llm=llm,
+        verbose=True,
+        system_prompt=system_message_openai_agent,
+    )
+    return agent
+
+
+def reset_agent(agent_state):
+    agent_state = initialize_agent()  # Reset the agent by reassigning a new instance
+    chatbot = [[None, None]]
+    return agent_state, chatbot
+
+
+def log_emails(email: gr.Textbox):
+    collection = "email_data-test"
+
+    logger.info(f"User reported {email=}")
+    email_document = {"email": email}
+
+    try:
+        mongo_db[collection].insert_one(email_document)
+        logger.info("Email saved to the database.")
+    except Exception:
+        logger.info("Something went wrong logging the email.")
+
+    return ""
+
+
+def format_sources(completion) -> str:
+    if len(completion.source_nodes) == 0:
+        return ""
+
+    # Mapping of source system names to user-friendly names
+    display_source_to_ui = {
+        src: ui for src, ui in zip(AVAILABLE_SOURCES, AVAILABLE_SOURCES_UI)
+    }
+
+    documents_answer_template: str = (
+        "📝 Here are the sources I used to answer your question:\n\n{documents}"
+    )
+    document_template: str = "[🔗 {source}: {title}]({url}), relevance: {score:2.2f}"
+
+    documents = "\n".join(
+        [
+            document_template.format(
+                title=src.metadata["title"],
+                score=src.score,
+                source=display_source_to_ui.get(
+                    src.metadata["source"], src.metadata["source"]
+                ),
+                url=src.metadata["url"],
+            )
+            for src in completion.source_nodes
+        ]
+    )
+
+    return documents_answer_template.format(documents=documents)
+
+
+def add_sources(history, completion):
+    if completion is None:
+        yield history
+        return
+
+    formatted_sources = format_sources(completion)
+    if formatted_sources == "":
+        yield history
+        return
+
+    history[-1][1] += "\n\n" + formatted_sources
+    yield history
+
+
+def user(user_input, history, agent_state):
+    return "", history + [[user_input, None]]
+
+
+def get_answer(history, agent_state):
+    user_input = history[-1][0]
+    history[-1][1] = ""
+
+    completion = agent_state.stream_chat(user_input)
+
+    for token in completion.response_gen:
+        history[-1][1] += token
+        yield history, completion
+
+    logger.info(f"completion: {history[-1][1]=}")
+
+
+example_questions = [
+    "What is the LLama model?",
+    "What is a Large Language Model?",
+    "What is an embedding?",
+]
+
+with gr.Blocks(
+    theme=gr.themes.Soft(
+        primary_hue="blue",
+        secondary_hue="blue",
+        font=fonts.GoogleFont("Source Sans Pro"),
+        font_mono=fonts.GoogleFont("IBM Plex Mono"),
+    ),
+    fill_height=True,
+) as demo:
+
+    agent_state = gr.State(initialize_agent())
+
+    with gr.Row():
+        gr.HTML(
+            "<h3><center>Towards AI 🤖: A Question-Answering Bot for anything AI-related</center></h3>"
+        )
+
+    chatbot = gr.Chatbot(
+        elem_id="chatbot",
+        show_copy_button=True,
+        scale=2,
+        likeable=True,
+        show_label=False,
+    )
+
+    with gr.Row():
+        question = gr.Textbox(
+            label="What's your question?",
+            placeholder="Ask a question to the AI tutor here...",
+            lines=1,
+            scale=7,
+            show_label=False,
+        )
+        submit = gr.Button(value="Send", variant="primary", scale=1)
+        reset_button = gr.Button("Reset Chat", variant="secondary", scale=1)
+
+    with gr.Row():
+        examples = gr.Examples(
+            examples=example_questions,
+            inputs=question,
+        )
+    with gr.Row():
+        email = gr.Textbox(
+            label="Want to receive updates about our AI tutor?",
+            placeholder="Enter your email here...",
+            lines=1,
+            scale=6,
+        )
+        submit_email = gr.Button(value="Submit", variant="secondary", scale=1)
+
+    gr.Markdown(
+        "This application uses GPT3.5-Turbo to search the docs for relevant information and answer questions."
+    )
+
+    completion = gr.State()
+
+    submit.click(
+        user, [question, chatbot, agent_state], [question, chatbot], queue=False
+    ).then(
+        get_answer,
+        inputs=[chatbot, agent_state],
+        outputs=[chatbot, completion],
+    ).then(
+        add_sources, inputs=[chatbot, completion], outputs=[chatbot]
+    )
+    # .then(
+    #     save_completion, inputs=[completion, chatbot]
+    # )
+
+    question.submit(
+        user, [question, chatbot, agent_state], [question, chatbot], queue=False
+    ).then(
+        get_answer,
+        inputs=[chatbot, agent_state],
+        outputs=[chatbot, completion],
+    ).then(
+        add_sources, inputs=[chatbot, completion], outputs=[chatbot]
+    )
+    # .then(
+    #     save_completion, inputs=[completion, chatbot]
+    # )
+
+    reset_button.click(
+        reset_agent, inputs=[agent_state], outputs=[agent_state, chatbot]
+    )
+    submit_email.click(log_emails, email, email)
+    email.submit(log_emails, email, email)
+
+demo.queue(default_concurrency_limit=CONCURRENCY_COUNT)
+demo.launch(debug=False, share=False)
scripts/tutor_prompts.py
ADDED
@@ -0,0 +1,100 @@
+from llama_index.core.llms import ChatMessage, MessageRole
+from llama_index.core import ChatPromptTemplate
+from pydantic import BaseModel, Field
+
+default_user_prompt = (
+    "Context information is below.\n"
+    "---------------------\n"
+    "{context_str}\n"
+    "---------------------\n"
+    "Given the context information and not prior knowledge, "
+    "answer the question: {query_str}\n"
+)
+
+system_prompt = (
+    "You are a witty AI teacher, helpfully answering questions from students of an applied artificial intelligence course on Large Language Models (LLMs or llm). Topics covered include training models, fine-tuning models, giving memory to LLMs, prompting, hallucinations and bias, vector databases, transformer architectures, embeddings, Langchain, making LLMs interact with tool use, AI agents, and reinforcement learning with human feedback. Questions should be understood in this context. "
+    "You are provided information found in the json documentation. "
+    "Only respond with information inside the json documentation. DO NOT use additional information, even if you know the answer. "
+    "If the answer is in the documentation, answer the question (depending on the question and the variety of relevant information in the json documentation, answer in up to 5 paragraphs). "
+    "If the documentation does not discuss the topic related to the question, kindly respond that you cannot answer the question because it is not part of your knowledge. "
+    "Here is the information you can use, in order: \n"
+    "---------------------\n"
+    "{context_str}\n"
+    "---------------------\n"
+    "REMEMBER:\n"
+    "You are a witty AI teacher, helpfully answering questions from students of an applied artificial intelligence course on Large Language Models (LLMs or llm). Topics covered include training models, fine-tuning models, giving memory to LLMs, prompting, hallucinations and bias, vector databases, transformer architectures, embeddings, Langchain, making LLMs interact with tool use, AI agents, and reinforcement learning with human feedback. Questions should be understood in this context. "
+    "You are provided information found in the json documentation. "
+    "Here are the rules you must follow:\n"
+    "* Only respond with information inside the json documentation. DO NOT provide additional information, even if you know the answer. "
+    "* If the answer is in the documentation, answer the question (depending on the question and the variety of relevant information in the json documentation). Your answer needs to be pertinent and not redundant, giving a clear explanation as if you were a teacher. "
+    "* If the documentation does not discuss the topic related to the question, kindly respond that you cannot answer the question because it is not part of your knowledge. "
+    "* Only use information summarized from the json documentation, do not respond otherwise. "
+    "* Do not refer to the json documentation directly, but use the instructions provided within it to answer questions. "
+    "* Do not reference any links, urls or hyperlinks in your answers.\n"
+    "* Make sure to format your answers in Markdown format, including code blocks and snippets.\n"
+    "* If you do not know the answer to a question, or if it is completely irrelevant to the AI courses, simply reply with:\n"
+    "'I'm sorry, but I couldn't find the information that answers your question. Is there anything else I can assist you with?'\n"
+    "For example:\n"
+    "What is the meaning of life for a qa bot?\n"
+    "I'm sorry, but I couldn't find the information that answers your question. Is there anything else I can assist you with?\n"
+    "Now answer the following question: \n"
+)
+
+chat_text_qa_msgs: list[ChatMessage] = [
+    ChatMessage(role=MessageRole.SYSTEM, content=system_prompt),
+    ChatMessage(
+        role=MessageRole.USER,
+        content="{query_str}",
+    ),
+]
+
+TEXT_QA_TEMPLATE = ChatPromptTemplate(chat_text_qa_msgs)
+
+
+system_message_validation = """- You are a witty AI teacher, helpfully answering questions from students studying the field of applied artificial intelligence.
+- Your job is to determine whether the user's question is valid or not. Users will not always submit a question either.
+- Users will ask all sorts of questions, and some might be tangentially related to artificial intelligence (AI), machine learning (ML), natural language processing (NLP), computer vision (CV) or generative AI.
+- Users can ask how to build LLM-powered apps, with LangChain, LlamaIndex, Deep Lake, Chroma DB among other technologies including OpenAI, RAG and more.
+- As long as a question is somewhat related to the topic of AI, ML, NLP, RAG, data and techniques used in AI like vector embeddings, memories, embeddings, tokenization, encoding, databases, RAG (Retrieval-Augmented Generation), Langchain, LlamaIndex, LLMs (Large Language Models), Preprocessing techniques, Document loading, Chunking, Indexing of document segments, Embedding models, Chains, Memory modules, Vector stores, Chat models, Sequential chains, Information Retrieval, Data connectors, LlamaHub, Node objects, Query engines, Fine-tuning, Activeloop’s Deep Memory, Prompt engineering, Synthetic training dataset, Inference, Recall rates, Query construction, Query expansion, Query transformation, Re-ranking, Cohere Reranker, Recursive retrieval, Small-to-big retrieval, Hybrid searches, Hit Rate, Mean Reciprocal Rank (MRR), GPT-4, Agents, OpenGPTs, Zero-shot ReAct, Conversational Agent, OpenAI Assistants API, Hugging Face Inference API, Code Interpreter, Knowledge Retrieval, Function Calling, Whisper, Dall-E 3, GPT-4 Vision, Unstructured, Deep Lake, FaithfulnessEvaluator, RAGAS, LangSmith, LangChain Hub, LangServe, REST API, respond 'true'. If a question is on a different subject or unrelated, respond 'false'.
+- Make sure the question is a valid question.
+
+Here is a list of acronyms and concepts related to Artificial Intelligence AI that are valid. The following terms can be Uppercase or Lowercase:
+You are case insensitive.
+'TQL', 'Deep Memory', 'LLM', 'Llama', 'llamaindex', 'llama-index', 'lang chain', 'langchain', 'llama index', 'GPT', 'NLP', 'RLHF', 'RLAIF', 'Mistral', 'SFT', 'Cohere', 'NanoGPT', 'ReAct', 'LoRA', 'QLoRA', 'LMMOps', 'Alpaca', 'Flan', 'Weights and Biases', 'W&B', 'IDEFICS', 'Flamingo', 'LLaVA', 'BLIP', 'Falcon', 'Gemini'
+
+"""
+
+
+class QueryValidation(BaseModel):
+    """
+    Validate the user query. Use the guidelines given to you.
+    """
+
+    user_query: str = Field(
+        description="The user query to validate.",
+    )
+    chain_of_thought: str = Field(
+        description="Is the user query valid given the above guidelines? Think step-by-step. Write down your reasoning here.",
+    )
+    is_valid: bool = Field(
+        description="Based on the previous reasoning, answer with True if the query is related to AI. Answer False otherwise.",
+    )
+    reason: str = Field(
+        description="Explain why the query was valid or not. What are the keywords that make it valid or invalid?",
+    )
+
+
+system_message_openai_agent = """You are a witty AI teacher, adeptly responding to students' inquiries within the realm of applied artificial intelligence. The scope encompasses training models, fine-tuning models, augmenting LLMs with memory, crafting effective prompts, addressing hallucinations and biases, exploring vector databases, understanding transformer architectures, utilizing embeddings, discovering Langchain, integrating tool use in LLMs, deploying AI agents, and employing reinforcement learning with human feedback. To navigate these discussions:
+
+Utilize the AI_information tool to gather insights pertinent to the field of AI. This function accepts a string (the complete user question) and returns informative content regarding the domain of AI.
+
+AI_information: A tool for acquiring knowledge about AI. Directly forward the user's question or a refined version focusing on the current discussion topic to this tool.
+
+Your responses are exclusively based on the output provided by the AI_information tool. Refrain from incorporating external knowledge or information not directly obtained from the tool's responses.
+
+When the conversation deepens or shifts focus within a topic, adapt your inquiries to the AI_information tool to reflect these nuances. This means if a user requests further elaboration on a specific aspect of a previously discussed topic, you should reformulate your input to the tool to capture this new angle or a more profound layer of inquiry.
+
+Provide comprehensive answers, ideally structured in up to ten paragraphs, drawing from the variety of relevant details furnished by the tool. The depth and breadth of your responses should align with the scope and specificity of the information retrieved.
+
+Should the AI_information tool's repository lack information on the queried topic, politely inform the user that the question transcends the bounds of your current knowledge base, citing the absence of relevant content in the tool's documentation.
+"""
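
A minimal sketch of how QueryValidation and system_message_validation might be wired to api_function_call from scripts/call_openai.py (the question string is arbitrary):

from call_openai import api_function_call
from tutor_prompts import QueryValidation, system_message_validation

# instructor parses the model's reply into a QueryValidation instance.
validation, error = api_function_call(
    system_message=system_message_validation,
    query="How does RLHF differ from supervised fine-tuning?",
    response_model=QueryValidation,
)
if not error:
    print(validation.is_valid, "-", validation.reason)
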
scripts/utils.py
ADDED
@@ -0,0 +1,16 @@
+from pymongo.mongo_client import MongoClient
+from pymongo.server_api import ServerApi
+
+
+def init_mongo_db(uri: str, db_name: str):
+    """Initialize the mongodb database."""
+
+    try:
+        assert uri is not None, "No URI passed"
+        client = MongoClient(uri, server_api=ServerApi("1"))
+        database = client[db_name]
+        print("Connected to MongoDB")
+        return database
+    except Exception as e:
+        print(f"Something went wrong connecting to mongodb: {e}")
+        return