{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 3: LLM Model Evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook, you'll deploy the Meta Llama 2 7B model and evaluate it's text generation capabilities and domain knowledge. You'll use the SageMaker Python SDK for Foundation Models and deploy the model for inference. \n",
    "\n",
    "The Llama 2 7B Foundation model performs the task of text generation. It takes a text string as input and predicts next words in the sequence. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Set Up\n",
    "There are some initial steps required for setup. If you recieve warnings after running these cells, you can ignore them as they won't impact the code running in the notebook. Run the cell below to ensure you're using the latest version of the Sagemaker Python client library. Restart the Kernel after you run this cell. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip install ipywidgets==7.0.0 --quiet\n",
    "!pip install --upgrade sagemaker datasets --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***! Restart the notebook kernel now after running the above cell and before you run any cells below !*** "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To deploy the model on Amazon Sagemaker, we need to setup and authenticate the use of AWS services. Yo'll uuse the execution role associated with the current notebook instance as the AWS account role with SageMaker access. Validate your role is the Sagemaker IAM role you created for the project by running the next cell. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml\n",
      "sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml\n",
      "arn:aws:iam::558778471579:role/service-role/SageMaker-ProjectSagemakerRole\n",
      "us-west-2\n",
      "<sagemaker.session.Session object at 0x7ff7ca50a530>\n"
     ]
    }
   ],
   "source": [
    "import sagemaker, boto3, json\n",
    "from sagemaker.session import Session\n",
    "\n",
    "sagemaker_session = Session()\n",
    "aws_role = sagemaker_session.get_caller_identity_arn()\n",
    "aws_region = boto3.Session().region_name\n",
    "sess = sagemaker.Session()\n",
    "print(aws_role)\n",
    "print(aws_region)\n",
    "print(sess)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Select Text Generation Model Meta Llama 2 7B\n",
    "Run the next cell to set variables that contain the values of the name of the model we want to load and the version of the model ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "(model_id, model_version,) = (\"meta-textgeneration-llama-2-7b\",\"2.*\",)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Running the next cell deploys the model\n",
    "This Python code is used to deploy a machine learning model using Amazon SageMaker's JumpStart library. \n",
    "\n",
    "1. Import the `JumpStartModel` class from the `sagemaker.jumpstart.model` module.\n",
    "\n",
    "2. Create an instance of the `JumpStartModel` class using the `model_id` and `model_version` variables created in the previous cell. This object represents the machine learning model you want to deploy.\n",
    "\n",
    "3. Call the `deploy` method on the `JumpStartModel` instance. This method deploys the model on Amazon SageMaker and returns a `Predictor` object.\n",
    "\n",
    "The `Predictor` object (`predictor`) can be used to make predictions with the deployed model. The `deploy` method will automatically choose an endpoint name, instance type, and other deployment parameters. If you want to specify these parameters, you can pass them as arguments to the `deploy` method.\n",
    "\n",
    "**The next cell will take some time to run.**  It is deploying a large language model, and that takes time.  You'll see dashes (--) while it is being deployed.  Please be patient! You'll see an exclamation point at the end of the dashes (---!) when the model is deployed and then you can continue running the next cells.  \n",
    "\n",
    "You might see a warning \"For forward compatibility, pin to model_version...\" You can ignore this warning, just wait for the model to deploy. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "For forward compatibility, pin to model_version='2.*' in your JumpStartModel or JumpStartEstimator definitions. Note that major version upgrades may have different EULA acceptance terms and input/output signatures.\n",
      "Using vulnerable JumpStart model 'meta-textgeneration-llama-2-7b' and version '2.1.8'.\n",
      "Using model 'meta-textgeneration-llama-2-7b' with wildcard version identifier '2.*'. You can pin to version '2.1.8' for more stable results. Note that models may have different input/output signatures after a major version upgrade.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "----------------!"
     ]
    }
   ],
   "source": [
    "from sagemaker.jumpstart.model import JumpStartModel\n",
    "\n",
    "model = JumpStartModel(model_id=model_id, model_version=model_version, instance_type=\"ml.g5.2xlarge\")\n",
    "predictor = model.deploy()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Invoke the endpoint, query and parse response\n",
    "The next step is to invoke the model endpoint, send a query to the endpoint, and recieve a response from the model. \n",
    "\n",
    "Running the next cell defines a function that will be used to parse and print the response from the model. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def print_response(payload, response):\n",
    "    print(payload[\"inputs\"])\n",
    "    print(f\"> {response[0]['generation']}\")\n",
    "    print(\"\\n==================================\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model takes a text string as input and predicts next words in the sequence, the input we send it is the prompt. \n",
    "\n",
    "The prompt we send the model should relate to the domain we'd like to fine-tune the model on.  This way we'll identify the model's domain knowledge before it's fine-tuned, and then we can run the same prompts on the fine-tuned model.   \n",
    "\n",
    "**Replace \"inputs\"** in the next cell with the input to send the model based on the domain you've chosen. \n",
    "\n",
    "**For financial domain:**\n",
    "\n",
    "  \"inputs\": \"Replace with sentence below\"  \n",
    "- \"The  investment  tests  performed  indicate\"\n",
    "- \"the  relative  volume  for  the  long  out  of  the  money  options, indicates\"\n",
    "- \"The  results  for  the  short  in  the  money  options\"\n",
    "- \"The  results  are  encouraging  for  aggressive  investors\"\n",
    "\n",
    "**For medical domain:** \n",
    "\n",
    "  \"inputs\": \"Replace with sentence below\" \n",
    "- \"Myeloid neoplasms and acute leukemias derive from\"\n",
    "- \"Genomic characterization is essential for\"\n",
    "- \"Certain germline disorders may be associated with\"\n",
    "- \"In contrast to targeted approaches, genome-wide sequencing\"\n",
    "\n",
    "**For IT domain:** \n",
    "\n",
    "  \"inputs\": \"Replace with sentence below\" \n",
    "- \"Traditional approaches to data management such as\"\n",
    "- \"A second important aspect of ubiquitous computing environments is\"\n",
    "- \"because ubiquitous computing is intended to\" \n",
    "- \"outline the key aspects of ubiquitous computing from a data management perspective.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "outline the key aspects of ubiquitous computing from a data management perspective.\n",
      ">  The data management aspects of ubiquitous computing are classified into three categories: (1) data management in ubiquitous environments, (2) data management in ubiquitous applications, and (3) data management in ubiquitous services. We discuss the data management aspects of each of these\n",
      "\n",
      "==================================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "payload = {\n",
    "    \"inputs\": \"outline the key aspects of ubiquitous computing from a data management perspective.\",\n",
    "    \"parameters\": {\n",
    "        \"max_new_tokens\": 64,\n",
    "        \"top_p\": 0.9,\n",
    "        \"temperature\": 0.6,\n",
    "        \"return_full_text\": False,\n",
    "    },\n",
    "}\n",
    "try:\n",
    "    response = predictor.predict(payload, custom_attributes=\"accept_eula=true\")\n",
    "    print_response(payload, response)\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The prompt is related to the domain you want to fine-tune your model on. You will see the outputs from the model without fine-tuning are limited in providing insightful or relevant content.\n",
    "\n",
    "**Use the output from this notebook to fill out the \"model evaluation\" section of the project documentation report**\n",
    "\n",
    "Take a screenshot of this file with the cell output for your project documentation report. Download it with cell output by making sure you used Save on the notebook before downloading \n",
    "\n",
    "**After you've filled out the report, run the cells below to delete the model deployment** \n",
    "\n",
    "`IF YOU FAIL TO RUN THE CELLS BELOW YOU WILL RUN OUT OF BUDGET TO COMPLETE THE PROJECT`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Delete the SageMaker endpoint and the attached resources\n",
    "predictor.delete_model()\n",
    "predictor.delete_endpoint()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Verify your model endpoint was deleted by visiting the Sagemaker dashboard and choosing `endpoints` under 'Inference' in the left navigation menu.  If you see your endpoint still there, choose the endpoint, and then under \"Actions\" select **Delete**"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_pytorch_p310",
   "language": "python",
   "name": "conda_pytorch_p310"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}