{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "njb_ProuHiOe" }, "source": [ "# Unit 1: Train your first Deep Reinforcement Learning Agent 🤖\n", "\n", "![Cover](https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/thumbnail.jpg)\n", "\n", "In this notebook, you'll train your **first Deep Reinforcement Learning agent**, a Lunar Lander agent that will learn to **land correctly on the Moon 🌕**, using [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/), a Deep Reinforcement Learning library. You'll then share it with the community and experiment with different configurations.\n", "\n", "⬇️ Here is an example of what **you will achieve in just a couple of minutes.** ⬇️\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PF46MwbZD00b", "colab": { "base_uri": "https://localhost:8080/", "height": 421 }, "outputId": "1826db31-4146-41f5-b6da-8a511cd7dd17" }, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n" ] }, "metadata": {} } ], "source": [ "%%html\n", "" ] }, { "cell_type": "markdown", "source": [ "### The environment 🎮\n", "\n", "- [LunarLander-v2](https://gymnasium.farama.org/environments/box2d/lunar_lander/)\n", "\n", "### The library used 📚\n", "\n", "- [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/)" ], "metadata": { "id": "x7oR6R-ZIbeS" } }, { "cell_type": "markdown", "source": [ "We're constantly trying to improve our tutorials, so **if you find any issues in this notebook**, please [open an issue on the GitHub repo](https://github.com/huggingface/deep-rl-class/issues)." 
], "metadata": { "id": "OwEcFHe9RRZW" } }, { "cell_type": "markdown", "metadata": { "id": "4i6tjI2tHQ8j" }, "source": [ "## Objectives of this notebook 🏆\n", "\n", "At the end of the notebook, you will:\n", "\n", "- Be able to use **Gymnasium**, the environment library.\n", "- Be able to use **Stable-Baselines3**, the deep reinforcement learning library.\n", "- Be able to **push your trained agent to the Hub** with a nice video replay and an evaluation score 🔥.\n", "\n", "\n" ] }, { "cell_type": "markdown", "source": [ "## This notebook is from the Deep Reinforcement Learning Course" ], "metadata": { "id": "Ff-nyJdzJPND" } }, { "cell_type": "markdown", "metadata": { "id": "6p5HnEefISCB" }, "source": [ "In this free course, you will:\n", "\n", "- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n", "- 🧑‍💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n", "- 🤖 Train **agents in unique environments**.\n", "- 🎓 **Earn a certificate of completion** by completing 80% of the assignments.\n", "\n", "And more!\n", "\n", "Check 📚 the syllabus 👉 https://simoninithomas.github.io/deep-rl-course\n", "\n", "Don’t forget to **sign up for the course** (we are collecting your email to be able to **send you the links when each Unit is published and give you information about the challenges and updates).**\n", "\n", "The best way to keep in touch and ask questions is **to join our Discord server** to exchange with the community and with us 👉🏻 https://discord.gg/ydHrjt3WP5" ] }, { "cell_type": "markdown", "metadata": { "id": "Y-mo_6rXIjRi" }, "source": [ "## Prerequisites 🏗️\n", "\n", "Before diving into the notebook, you need to:\n", "\n", "🔲 📝 **[Read Unit 0](https://huggingface.co/deep-rl-course/unit0/introduction)**, which gives you all the **information about the course and helps you onboard** 🤗\n", "\n", "🔲 📚 **Develop an 
understanding of the foundations of Reinforcement Learning** (RL process, reward hypothesis...) by [reading Unit 1](https://huggingface.co/deep-rl-course/unit1/introduction)." ] }, { "cell_type": "markdown", "source": [ "## A small recap of Deep Reinforcement Learning 📚" ], "metadata": { "id": "HoeqMnr5LuYE" } }, { "cell_type": "markdown", "metadata": { "id": "xcQYx9ynaFMD" }, "source": [ "Let's do a small recap of what we learned in the first Unit:\n", "\n", "- Reinforcement Learning is a **computational approach to learning from actions**. We build an agent that learns from the environment by **interacting with it through trial and error** and receiving rewards (negative or positive) as feedback.\n", "\n", "- The goal of any RL agent is to **maximize its expected cumulative reward** (also called expected return) because RL is based on the _reward hypothesis_, which states that all goals can be described as the maximization of an expected cumulative reward.\n", "\n", "- The RL process is a **loop that outputs a sequence of state, action, reward, and next state**.\n", "\n", "- To calculate the expected cumulative reward (expected return), **we discount the rewards**: the rewards that come sooner (at the beginning of the game) are more likely to happen since they are more predictable than the long-term future reward.\n", "\n", "- To solve an RL problem, you want to **find an optimal policy**; the policy is the \"brain\" of your agent, which tells it what action to take given a state. 
The optimal policy is the one that gives you the actions that maximize the expected return.\n", "\n", "There are **two** ways to find your optimal policy:\n", "\n", "- By **training your policy directly**: policy-based methods.\n", "- By **training a value function** that tells us the expected return the agent will get at each state and using this function to define our policy: value-based methods.\n", "\n", "- Finally, we spoke about Deep RL because **we introduce deep neural networks to estimate the action to take (policy-based) or to estimate the value of a state (value-based), hence the name \"deep.\"**" ] }, { "cell_type": "markdown", "source": [ "# Let's train our first Deep Reinforcement Learning agent and upload it to the Hub 🚀\n", "\n", "## Get a certificate 🎓\n", "\n", "To validate this hands-on for the [certification process](https://huggingface.co/deep-rl-course/en/unit0/introduction#certification-process), you need to push your trained model to the Hub and **get a result of >= 200**.\n", "\n", "To find your result, go to the [leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) and find your model: **the result = mean_reward - std of reward**.\n", "\n", "For more information about the certification process, check this section 👉 https://huggingface.co/deep-rl-course/en/unit0/introduction#certification-process" ], "metadata": { "id": "qDploC3jSH99" } }, { "cell_type": "markdown", "source": [ "## Set the GPU 💪\n", "\n", "- To **accelerate the agent's training, we'll use a GPU**. 
To do that, go to `Runtime > Change Runtime type`." ], "metadata": { "id": "HqzznTzhNfAC" } }, { "cell_type": "markdown", "metadata": { "id": "38HBd3t1SHJ8" }, "source": [ "- `Hardware Accelerator > GPU`" ] }, { "cell_type": "markdown", "metadata": { "id": "jeDAH0h0EBiG" }, "source": [ "## Install dependencies and create a virtual screen 🔽\n", "\n", "The first step is to install the dependencies. We’ll install several:\n", "\n", "- `gymnasium[box2d]`: Contains the LunarLander-v2 environment 🌛\n", "- `stable-baselines3[extra]`: The deep reinforcement learning library.\n", "- `huggingface_sb3`: Additional code for Stable-Baselines3 to load and upload models from the Hugging Face 🤗 Hub.\n", "\n", "To make things easier, we created a requirements file that installs all these dependencies." ] }, { "cell_type": "code", "source": [ "!apt install swig cmake" ], "metadata": { "id": "yQIGLPDkGhgG", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "afe338d1-5d22-485f-d4b7-4f162093e512" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Reading package lists... Done\n", "Building dependency tree... Done\n", "Reading state information... Done\n", "cmake is already the newest version (3.22.1-1ubuntu1.22.04.2).\n", "Suggested packages:\n", " swig-doc swig-examples swig4.0-examples swig4.0-doc\n", "The following NEW packages will be installed:\n", " swig swig4.0\n", "0 upgraded, 2 newly installed, 0 to remove and 45 not upgraded.\n", "Need to get 1,116 kB of archives.\n", "After this operation, 5,542 kB of additional disk space will be used.\n", "Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 swig4.0 amd64 4.0.2-1ubuntu1 [1,110 kB]\n", "Get:2 http://archive.ubuntu.com/ubuntu jammy/universe amd64 swig all 4.0.2-1ubuntu1 [5,632 B]\n", "Fetched 1,116 kB in 2s (452 kB/s)\n", "Selecting previously unselected package swig4.0.\n", "(Reading database ... 
121925 files and directories currently installed.)\n", "Preparing to unpack .../swig4.0_4.0.2-1ubuntu1_amd64.deb ...\n", "Unpacking swig4.0 (4.0.2-1ubuntu1) ...\n", "Selecting previously unselected package swig.\n", "Preparing to unpack .../swig_4.0.2-1ubuntu1_all.deb ...\n", "Unpacking swig (4.0.2-1ubuntu1) ...\n", "Setting up swig4.0 (4.0.2-1ubuntu1) ...\n", "Setting up swig (4.0.2-1ubuntu1) ...\n", "Processing triggers for man-db (2.10.2-1) ...\n" ] } ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9XaULfDZDvrC", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "97c3f94a-3205-4c82-aa9f-89b91013f7b8" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Collecting stable-baselines3==2.0.0a5 (from -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading stable_baselines3-2.0.0a5-py3-none-any.whl (177 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m177.5/177.5 kB\u001b[0m \u001b[31m4.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting swig (from -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 2))\n", " Downloading swig-4.2.1-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.9 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.9/1.9 MB\u001b[0m \u001b[31m27.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting gymnasium[box2d] (from -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 3))\n", " Downloading gymnasium-0.29.1-py3-none-any.whl (953 kB)\n", "\u001b[2K 
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m953.9/953.9 kB\u001b[0m \u001b[31m32.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting huggingface_sb3 (from -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4))\n", " Downloading huggingface_sb3-3.0-py3-none-any.whl (9.7 kB)\n", "Collecting gymnasium==0.28.1 (from stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading gymnasium-0.28.1-py3-none-any.whl (925 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m925.5/925.5 kB\u001b[0m \u001b[31m32.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (1.25.2)\n", "Requirement already satisfied: torch>=1.11 in /usr/local/lib/python3.10/dist-packages (from stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2.3.0+cu121)\n", "Requirement already satisfied: cloudpickle in /usr/local/lib/python3.10/dist-packages (from stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2.2.1)\n", "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2.0.3)\n", "Requirement already 
satisfied: matplotlib in /usr/local/lib/python3.10/dist-packages (from stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (3.7.1)\n", "Collecting jax-jumpy>=1.0.0 (from gymnasium==0.28.1->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading jax_jumpy-1.0.0-py3-none-any.whl (20 kB)\n", "Requirement already satisfied: typing-extensions>=4.3.0 in /usr/local/lib/python3.10/dist-packages (from gymnasium==0.28.1->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (4.12.2)\n", "Collecting farama-notifications>=0.0.1 (from gymnasium==0.28.1->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading Farama_Notifications-0.0.4-py3-none-any.whl (2.5 kB)\n", "INFO: pip is looking at multiple versions of gymnasium[box2d] to determine which version is compatible with other requirements. 
This could take a while.\n", "Collecting gymnasium[box2d] (from -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 3))\n", " Downloading gymnasium-0.29.0-py3-none-any.whl (953 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m953.8/953.8 kB\u001b[0m \u001b[31m51.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting box2d-py==2.3.5 (from gymnasium==0.28.1->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading box2d-py-2.3.5.tar.gz (374 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m374.4/374.4 kB\u001b[0m \u001b[31m31.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25h Preparing metadata (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", "Collecting pygame==2.1.3 (from gymnasium==0.28.1->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading pygame-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.7 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m13.7/13.7 MB\u001b[0m \u001b[31m40.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: huggingface-hub~=0.8 in /usr/local/lib/python3.10/dist-packages (from huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (0.23.4)\n", "Requirement already satisfied: pyyaml~=6.0 in /usr/local/lib/python3.10/dist-packages (from huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (6.0.1)\n", "Requirement already satisfied: wasabi in /usr/local/lib/python3.10/dist-packages (from huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (1.1.3)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (3.15.1)\n", "Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (2023.6.0)\n", "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub~=0.8->huggingface_sb3->-r 
https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (24.1)\n", "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (2.31.0)\n", "Requirement already satisfied: tqdm>=4.42.1 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (4.66.4)\n", "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (1.12.1)\n", "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (3.3)\n", "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (3.1.4)\n", "Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)\n", "Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached 
nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)\n", "Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)\n", "Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)\n", "Collecting nvidia-cublas-cu12==12.1.3.1 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)\n", "Collecting nvidia-cufft-cu12==11.0.2.54 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)\n", "Collecting nvidia-curand-cu12==10.3.2.106 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)\n", "Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)\n", "Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch>=1.11->stable-baselines3==2.0.0a5->-r 
https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)\n", "Collecting nvidia-nccl-cu12==2.20.5 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)\n", "Collecting nvidia-nvtx-cu12==12.1.105 (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Using cached nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)\n", "Requirement already satisfied: triton==2.3.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2.3.0)\n", "Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1))\n", " Downloading nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl (21.3 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m21.3/21.3 MB\u001b[0m \u001b[31m57.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (1.2.1)\n", "Requirement already satisfied: cycler>=0.10 in 
/usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (0.12.1)\n", "Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (4.53.0)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (1.4.5)\n", "Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (9.4.0)\n", "Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (3.1.2)\n", "Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2.8.2)\n", "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2023.4)\n", "Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->stable-baselines3==2.0.0a5->-r 
https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2024.1)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (1.16.0)\n", "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.11->stable-baselines3==2.0.0a5->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (2.1.5)\n", "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (3.3.2)\n", "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (3.7)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (2.0.7)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface-hub~=0.8->huggingface_sb3->-r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 4)) (2024.6.2)\n", "Requirement already satisfied: mpmath<1.4.0,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.11->stable-baselines3==2.0.0a5->-r 
https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt (line 1)) (1.3.0)\n", "Building wheels for collected packages: box2d-py\n", " Building wheel for box2d-py (setup.py) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for box2d-py: filename=box2d_py-2.3.5-cp310-cp310-linux_x86_64.whl size=2349146 sha256=3c3e693f3256758e0671725ea8beb9ba7af9b6d7be4ad8563ce31790db09cf10\n", " Stored in directory: /root/.cache/pip/wheels/db/8f/6a/eaaadf056fba10a98d986f6dce954e6201ba3126926fc5ad9e\n", "Successfully built box2d-py\n", "Installing collected packages: swig, farama-notifications, box2d-py, pygame, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, jax-jumpy, nvidia-cusparse-cu12, nvidia-cudnn-cu12, gymnasium, nvidia-cusolver-cu12, huggingface_sb3, stable-baselines3\n", " Attempting uninstall: pygame\n", " Found existing installation: pygame 2.5.2\n", " Uninstalling pygame-2.5.2:\n", " Successfully uninstalled pygame-2.5.2\n", "Successfully installed box2d-py-2.3.5 farama-notifications-0.0.4 gymnasium-0.28.1 huggingface_sb3-3.0 jax-jumpy-1.0.0 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 pygame-2.1.3 stable-baselines3-2.0.0a5 swig-4.2.1\n" ] } ], "source": [ "!pip install -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit1/requirements-unit1.txt" ] }, { "cell_type": "markdown", "source": [ "During the notebook, we'll need to generate a replay video. 
To do so in Colab, **we need a virtual screen to render the environment** (and thus record the frames).\n", "\n", "The following cell installs the virtual screen libraries, then creates and runs a virtual screen 🖥" ], "metadata": { "id": "BEKeXQJsQCYm" } }, { "cell_type": "code", "source": [ "!sudo apt-get update\n", "!sudo apt-get install -y python3-opengl\n", "!apt install ffmpeg\n", "!apt install xvfb\n", "!pip3 install pyvirtualdisplay" ], "metadata": { "id": "j5f2cGkdP-mb", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "b89c7883-ea83-4cda-b2d7-1bcd5f4fade7" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\r0% [Working]\r \rGet:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,626 B]\n", "Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 InRelease [1,581 B]\n", "Get:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 Packages [929 kB]\n", "Get:4 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]\n", "Hit:5 http://archive.ubuntu.com/ubuntu jammy InRelease\n", "Get:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]\n", "Hit:7 https://ppa.launchpadcontent.net/c2d4u.team/c2d4u4.0+/ubuntu jammy InRelease\n", "Get:8 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [1,922 kB]\n", "Hit:9 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease\n", "Hit:10 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease\n", "Hit:11 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease\n", "Hit:12 http://archive.ubuntu.com/ubuntu jammy-backports InRelease\n", "Get:13 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages [1,392 kB]\n", "Get:14 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages [2,474 kB]\n", "Get:15 http://security.ubuntu.com/ubuntu 
jammy-security/universe amd64 Packages [1,093 kB]\n", "Get:16 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages [2,187 kB]\n", "Get:17 http://archive.ubuntu.com/ubuntu jammy-updates/multiverse amd64 Packages [51.8 kB]\n", "Get:18 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages [2,552 kB]\n", "Fetched 12.9 MB in 5s (2,815 kB/s)\n", "Reading package lists... Done\n", "Reading package lists... Done\n", "Building dependency tree... Done\n", "Reading state information... Done\n", "The following additional packages will be installed:\n", " freeglut3 libglu1-mesa\n", "Suggested packages:\n", " libgle3 python3-numpy\n", "The following NEW packages will be installed:\n", " freeglut3 libglu1-mesa python3-opengl\n", "0 upgraded, 3 newly installed, 0 to remove and 45 not upgraded.\n", "Need to get 824 kB of archives.\n", "After this operation, 8,092 kB of additional disk space will be used.\n", "Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 freeglut3 amd64 2.8.1-6 [74.0 kB]\n", "Get:2 http://archive.ubuntu.com/ubuntu jammy/main amd64 libglu1-mesa amd64 9.0.2-1 [145 kB]\n", "Get:3 http://archive.ubuntu.com/ubuntu jammy/universe amd64 python3-opengl all 3.1.5+dfsg-1 [605 kB]\n", "Fetched 824 kB in 2s (403 kB/s)\n", "debconf: unable to initialize frontend: Dialog\n", "debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 78, <> line 3.)\n", "debconf: falling back to frontend: Readline\n", "debconf: unable to initialize frontend: Readline\n", "debconf: (This frontend requires a controlling tty.)\n", "debconf: falling back to frontend: Teletype\n", "dpkg-preconfigure: unable to re-open stdin: \n", "Selecting previously unselected package freeglut3:amd64.\n", "(Reading database ... 
122678 files and directories currently installed.)\n", "Preparing to unpack .../freeglut3_2.8.1-6_amd64.deb ...\n", "Unpacking freeglut3:amd64 (2.8.1-6) ...\n", "Selecting previously unselected package libglu1-mesa:amd64.\n", "Preparing to unpack .../libglu1-mesa_9.0.2-1_amd64.deb ...\n", "Unpacking libglu1-mesa:amd64 (9.0.2-1) ...\n", "Selecting previously unselected package python3-opengl.\n", "Preparing to unpack .../python3-opengl_3.1.5+dfsg-1_all.deb ...\n", "Unpacking python3-opengl (3.1.5+dfsg-1) ...\n", "Setting up freeglut3:amd64 (2.8.1-6) ...\n", "Setting up libglu1-mesa:amd64 (9.0.2-1) ...\n", "Setting up python3-opengl (3.1.5+dfsg-1) ...\n", "Processing triggers for libc-bin (2.35-0ubuntu3.4) ...\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link\n", "\n", "Reading package lists... Done\n", "Building dependency tree... Done\n", "Reading state information... Done\n", "ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).\n", "0 upgraded, 0 newly installed, 0 to remove and 45 not upgraded.\n", "Reading package lists... Done\n", "Building dependency tree... Done\n", "Reading state information... 
Done\n", "The following additional packages will be installed:\n", " libfontenc1 libxfont2 libxkbfile1 x11-xkb-utils xfonts-base xfonts-encodings xfonts-utils\n", " xserver-common\n", "The following NEW packages will be installed:\n", " libfontenc1 libxfont2 libxkbfile1 x11-xkb-utils xfonts-base xfonts-encodings xfonts-utils\n", " xserver-common xvfb\n", "0 upgraded, 9 newly installed, 0 to remove and 45 not upgraded.\n", "Need to get 7,813 kB of archives.\n", "After this operation, 11.9 MB of additional disk space will be used.\n", "Get:1 http://archive.ubuntu.com/ubuntu jammy/main amd64 libfontenc1 amd64 1:1.1.4-1build3 [14.7 kB]\n", "Get:2 http://archive.ubuntu.com/ubuntu jammy/main amd64 libxfont2 amd64 1:2.0.5-1build1 [94.5 kB]\n", "Get:3 http://archive.ubuntu.com/ubuntu jammy/main amd64 libxkbfile1 amd64 1:1.1.0-1build3 [71.8 kB]\n", "Get:4 http://archive.ubuntu.com/ubuntu jammy/main amd64 x11-xkb-utils amd64 7.7+5build4 [172 kB]\n", "Get:5 http://archive.ubuntu.com/ubuntu jammy/main amd64 xfonts-encodings all 1:1.0.5-0ubuntu2 [578 kB]\n", "Get:6 http://archive.ubuntu.com/ubuntu jammy/main amd64 xfonts-utils amd64 1:7.7+6build2 [94.6 kB]\n", "Get:7 http://archive.ubuntu.com/ubuntu jammy/main amd64 xfonts-base all 1:1.0.5 [5,896 kB]\n", "Get:8 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 xserver-common all 2:21.1.4-2ubuntu1.7~22.04.10 [28.5 kB]\n", "Get:9 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 xvfb amd64 2:21.1.4-2ubuntu1.7~22.04.10 [863 kB]\n", "Fetched 7,813 kB in 3s (2,578 kB/s)\n", "Selecting previously unselected package libfontenc1:amd64.\n", "(Reading database ... 
125762 files and directories currently installed.)\n", "Preparing to unpack .../0-libfontenc1_1%3a1.1.4-1build3_amd64.deb ...\n", "Unpacking libfontenc1:amd64 (1:1.1.4-1build3) ...\n", "Selecting previously unselected package libxfont2:amd64.\n", "Preparing to unpack .../1-libxfont2_1%3a2.0.5-1build1_amd64.deb ...\n", "Unpacking libxfont2:amd64 (1:2.0.5-1build1) ...\n", "Selecting previously unselected package libxkbfile1:amd64.\n", "Preparing to unpack .../2-libxkbfile1_1%3a1.1.0-1build3_amd64.deb ...\n", "Unpacking libxkbfile1:amd64 (1:1.1.0-1build3) ...\n", "Selecting previously unselected package x11-xkb-utils.\n", "Preparing to unpack .../3-x11-xkb-utils_7.7+5build4_amd64.deb ...\n", "Unpacking x11-xkb-utils (7.7+5build4) ...\n", "Selecting previously unselected package xfonts-encodings.\n", "Preparing to unpack .../4-xfonts-encodings_1%3a1.0.5-0ubuntu2_all.deb ...\n", "Unpacking xfonts-encodings (1:1.0.5-0ubuntu2) ...\n", "Selecting previously unselected package xfonts-utils.\n", "Preparing to unpack .../5-xfonts-utils_1%3a7.7+6build2_amd64.deb ...\n", "Unpacking xfonts-utils (1:7.7+6build2) ...\n", "Selecting previously unselected package xfonts-base.\n", "Preparing to unpack .../6-xfonts-base_1%3a1.0.5_all.deb ...\n", "Unpacking xfonts-base (1:1.0.5) ...\n", "Selecting previously unselected package xserver-common.\n", "Preparing to unpack .../7-xserver-common_2%3a21.1.4-2ubuntu1.7~22.04.10_all.deb ...\n", "Unpacking xserver-common (2:21.1.4-2ubuntu1.7~22.04.10) ...\n", "Selecting previously unselected package xvfb.\n", "Preparing to unpack .../8-xvfb_2%3a21.1.4-2ubuntu1.7~22.04.10_amd64.deb ...\n", "Unpacking xvfb (2:21.1.4-2ubuntu1.7~22.04.10) ...\n", "Setting up libfontenc1:amd64 (1:1.1.4-1build3) ...\n", "Setting up xfonts-encodings (1:1.0.5-0ubuntu2) ...\n", "Setting up libxkbfile1:amd64 (1:1.1.0-1build3) ...\n", "Setting up libxfont2:amd64 (1:2.0.5-1build1) ...\n", "Setting up x11-xkb-utils (7.7+5build4) ...\n", "Setting up xfonts-utils (1:7.7+6build2) 
...\n", "Setting up xfonts-base (1:1.0.5) ...\n", "Setting up xserver-common (2:21.1.4-2ubuntu1.7~22.04.10) ...\n", "Setting up xvfb (2:21.1.4-2ubuntu1.7~22.04.10) ...\n", "Processing triggers for man-db (2.10.2-1) ...\n", "Processing triggers for fontconfig (2.13.1-4.2ubuntu5) ...\n", "Processing triggers for libc-bin (2.35-0ubuntu3.4) ...\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link\n", "\n", "/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link\n", "\n", "Collecting pyvirtualdisplay\n", " Downloading PyVirtualDisplay-3.0-py3-none-any.whl (15 kB)\n", "Installing collected packages: pyvirtualdisplay\n", "Successfully installed pyvirtualdisplay-3.0\n" ] } ] }, { "cell_type": "code", "source": [ "!pip install optuna\n", "\n", "import numpy as np\n", "from stable_baselines3 import PPO\n", "\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "PszQ9ncB-sGY", "outputId": "8adb2197-760f-47a6-fcdf-2dff9d9bb235" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Collecting optuna\n", " Downloading optuna-3.6.1-py3-none-any.whl (380 kB)\n", "Collecting alembic>=1.5.0 (from optuna)\n", " Downloading alembic-1.13.1-py3-none-any.whl (233 kB)\n", "Collecting colorlog (from optuna)\n", " Downloading colorlog-6.8.2-py3-none-any.whl (11 kB)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from optuna) (1.25.2)\n", "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from optuna) (24.1)\n", "Requirement already satisfied: sqlalchemy>=1.3.0 in /usr/local/lib/python3.10/dist-packages (from optuna) (2.0.30)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from optuna) (4.66.4)\n", "Requirement already satisfied: PyYAML in /usr/local/lib/python3.10/dist-packages (from optuna) (6.0.1)\n", "Collecting Mako (from alembic>=1.5.0->optuna)\n", " Downloading Mako-1.3.5-py3-none-any.whl (78 kB)\n", "Requirement already satisfied: typing-extensions>=4 in /usr/local/lib/python3.10/dist-packages (from alembic>=1.5.0->optuna) (4.12.2)\n", "Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from sqlalchemy>=1.3.0->optuna) (3.0.3)\n", "Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.10/dist-packages (from Mako->alembic>=1.5.0->optuna) (2.1.5)\n", "Installing collected packages: Mako, colorlog, alembic, optuna\n", "Successfully installed Mako-1.3.5 alembic-1.13.1 colorlog-6.8.2 optuna-3.6.1\n" ] } ] }, { "cell_type": "markdown", "source": [ "To make sure the newly installed libraries are used, **it is sometimes necessary to restart the notebook runtime**. 
The next cell deliberately **kills the runtime process, so you'll need to reconnect and resume from the cell that follows it**. Thanks to this trick, **we will be able to run our virtual screen.**" ], "metadata": { "id": "TCwBTAwAW9JJ" } }, { "cell_type": "code", "source": [ "import os\n", "\n", "# Send signal 9 (SIGKILL) to the current kernel process to force a runtime restart\n", "os.kill(os.getpid(), 9)" ], "metadata": { "id": "cYvkbef7XEMi" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Virtual display\n", "from pyvirtualdisplay import Display\n", "\n", "virtual_display = Display(visible=0, size=(1400, 900))\n", "virtual_display.start()" ], "metadata": { "id": "BE5JWP5rQIKf", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "51d34ff2-73f9-42da-fb61-be840e8d9fa4" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": {}, "execution_count": 1 } ] }, { "cell_type": "markdown", "metadata": { "id": "wrgpVFqyENVf" }, "source": [ "## Import the packages 📦\n", "\n", "One additional library we import is `huggingface_hub`, **to be able to upload and download trained models from the Hub**.\n", "\n", "\n", "The Hugging Face Hub 🤗 works as a central place where anyone can share and explore models and datasets. 
It has versioning, metrics, visualizations and other features that will allow you to easily collaborate with others.\n", "\n", "You can see all the Deep Reinforcement Learning models available here 👉 https://huggingface.co/models?pipeline_tag=reinforcement-learning&sort=downloads\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "cygWLPGsEQ0m" }, "outputs": [], "source": [ "import gymnasium\n", "\n", "from huggingface_sb3 import load_from_hub, package_to_hub\n", "from huggingface_hub import notebook_login # To log in to our Hugging Face account so we can upload models to the Hub.\n", "\n", "from stable_baselines3 import PPO\n", "from stable_baselines3.common.env_util import make_vec_env\n", "from stable_baselines3.common.evaluation import evaluate_policy\n", "from stable_baselines3.common.monitor import Monitor" ] }, { "cell_type": "markdown", "metadata": { "id": "MRqRuRUl8CsB" }, "source": [ "## Understand Gymnasium and how it works 🤖\n", "\n", "🏋 The library containing our environment is called Gymnasium.\n", "**You'll use Gymnasium a lot in Deep Reinforcement Learning.**\n", "\n", "Gymnasium is the **new version of the Gym library**, [maintained by the Farama Foundation](https://farama.org/).\n", "\n", "The Gymnasium library provides two things:\n", "\n", "- An interface that allows you to **create RL environments**.\n", "- A **collection of environments** (classic control, Atari, Box2D...).\n", "\n", "Let's look at an example, but first let's recall the RL loop.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "-TzNN0bQ_j-3" }, "source": [ "At each step:\n", "- Our Agent receives a **state (S0)** from the **Environment**: we receive the first frame of our game (Environment).\n", "- Based on that **state (S0)**, the Agent takes an **action (A0)**: our Agent will move to the right.\n", "- The environment transitions to a **new state (S1)**: a new frame.\n", "- The environment gives 
some **reward (R1)** to the Agent: we're not dead *(Positive Reward +1)*.\n", "\n", "\n", "With Gymnasium:\n", "\n", "1️⃣ We create our environment using `gymnasium.make()`\n", "\n", "2️⃣ We reset the environment to its initial state with `observation, info = env.reset()`\n", "\n", "At each step:\n", "\n", "3️⃣ Get an action using our model (in our example we take a random action)\n", "\n", "4️⃣ Using `env.step(action)`, we perform this action in the environment and get:\n", "- `observation`: The new state ($s_{t+1}$)\n", "- `reward`: The reward we get after executing the action\n", "- `terminated`: Indicates whether the episode terminated (the agent reached a terminal state)\n", "- `truncated`: Introduced with this new version, it indicates that the episode was cut short, for instance by a time limit or because the agent went out of the bounds of the environment.\n", "- `info`: A dictionary that provides additional information (depends on the environment).\n", "\n", "For more explanations check this 👉 https://gymnasium.farama.org/api/env/#gymnasium.Env.step\n", "\n", "If the episode is terminated or truncated:\n", "- We reset the environment to its initial state with `observation, info = env.reset()`\n", "\n", "**Let's look at an example!** Make sure to read the code\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "w7vOFlpA_ONz", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "455e58bc-f348-4ced-cfc0-76b12cb9dbad" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Action taken: 2\n", "Action taken: 2\n", "Action taken: 2\n", "Action taken: 0\n", "Action taken: 1\n", "Action taken: 0\n", "Action taken: 1\n", "Action taken: 1\n", "Action taken: 1\n", "Action taken: 2\n", "Action taken: 2\n", "Action taken: 0\n", "Action taken: 0\n", "Action taken: 3\n", "Action taken: 2\n", "Action taken: 0\n", "Action taken: 0\n", "Action taken: 0\n", "Action taken: 2\n", "Action taken: 0\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ 
"/usr/local/lib/python3.10/dist-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n", " and should_run_async(code)\n" ] } ], "source": [ "import gymnasium as gym\n", "\n", "# First, we create our environment called LunarLander-v2\n", "env = gym.make(\"LunarLander-v2\")\n", "\n", "# Then we reset this environment\n", "observation, info = env.reset()\n", "\n", "for _ in range(20):\n", "    # Take a random action\n", "    action = env.action_space.sample()\n", "    print(\"Action taken:\", action)\n", "\n", "    # Do this action in the environment and get\n", "    # next_state, reward, terminated, truncated and info\n", "    observation, reward, terminated, truncated, info = env.step(action)\n", "\n", "    # If the episode is terminated (in our case the lander landed or crashed) or truncated (timeout)\n", "    if terminated or truncated:\n", "        # Reset the environment\n", "        print(\"Environment is reset\")\n", "        observation, info = env.reset()\n", "\n", "env.close()" ] }, { "cell_type": "code", "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from IPython import display as ipythondisplay\n", "\n", "from pyvirtualdisplay import Display\n", "\n", "# Start a small virtual display so frames can be rendered off-screen\n", "render_display = Display(visible=0, size=(400, 300))\n", "render_display.start()\n", "\n", "# A second environment that returns each frame as an RGB array\n", "envr = gym.make(\"LunarLander-v2\", render_mode=\"rgb_array\")\n", "envr.reset()\n", "prev_screen = envr.render()\n", "plt.imshow(prev_screen)\n", "\n", "for i in range(50):\n", "    action = envr.action_space.sample()\n", "    obs, reward, terminated, truncated, info = envr.step(action)\n", "    screen = envr.render()\n", "\n", "    # Show the latest frame in place of the previous one\n", "    plt.imshow(screen)\n", "    ipythondisplay.clear_output(wait=True)\n", "    ipythondisplay.display(plt.gcf())\n", "\n", "    if terminated or truncated:\n", "        break\n", "\n", "ipythondisplay.clear_output(wait=True)\n", 
"envr.close()\n" ], "metadata": { "id": "rTVF15l-oesw", "colab": { "base_uri": "https://localhost:8080/", "height": 590 }, "outputId": "b9135ca8-b0d0-4ec9-ba29-c749e6c62d31" }, "execution_count": null, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "iVBORw0KGgoAAAANSUhEUgAAAigAAAF7CAYAAAD4/3BBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA5NUlEQVR4nO3deXiU5b3/8c9MliELk5BAMgQS9sXIIgWNU9ujLSmriEotKrWUQ7VQsFVaa+kRLD16YvV3tbWnirUqek5FVAouCFoMEkDDFojsYSmQIJkEiJlJAtlm7t8fKXOMgiYQmCfJ+3Vd34uZ57nnme/cAeaTZ5mxGWOMAAAALMQe6gYAAAA+j4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsJ6QB5amnnlLPnj3VoUMHZWRkaPPmzaFsBwAAWETIAsqrr76qOXPm6OGHH9a2bds0dOhQjR49WqWlpaFqCQAAWIQtVF8WmJGRoauvvlp//vOfJUmBQECpqam699579atf/SoULQEAAIsID8WT1tbWKi8vT3Pnzg0us9vtyszMVG5u7hfG19TUqKamJng/EAiorKxMiYmJstlsl6VnAABwcYwxqqioUEpKiuz2Lz+IE5KAcvLkSfn9fiUnJzdanpycrH379n1hfFZWlhYsWHC52gMAAJdQUVGRunfv/qVjQhJQmmvu3LmaM2dO8L7X61VaWloIOwLarx/etFTV9lPq1vFqRYTF6NMzR3S09EO9+8Fvg2MyrvmBvt5/thxhTuky7OU85tuodz6Yp08/LVTHaJe+MWy24jonKaXj1xQZFqvTdSe1seBZ5e9cqtra05e8HwBfrmPHjl85JiQBpXPnzgoLC1NJSUmj5SUlJXK5XF8Y73A45HA4Lld7AM7jm1fdK0XWq0uHAeoY5VKd/4xq5dWOPcsbjYsIdygyIlaO8I6X5TBsZES07PYwSVLlmVL989g6Xd15ik4HTqhjVFdFRkYrvdd4FX2yVcWevZJCcuodgH9pyv8LIbmKJzIyUsOHD1d2dnZwWSAQUHZ2ttxudyhaAvAVbLYwxcV1U8BWp/gOvWSMka/2E3lK9ujUp0fO8YjLGQJs/yrJmIAOFn0g36cn5D1TpFp/hey2cHWJHagBfb6jiIgOl7EvABcqZJcZz5kzR3/961/10ksvae/evZo5c6aqqqo0bdq0ULUE4Etcc+U0de06UIlR/WW3hasucFqV1R7l712qmtqKz42+/Cevf/Y3Mn+gTrk7nlVkwClP5ccKGL+iIzqrX+q3lZw84LL3BqD5QnYOyuTJk3XixAnNnz9fHo9HV111ld59990vnDgLIPRiojorPj5Fkl3REZ0lGZ2uO6kjx3NVUVFyjkcYVdWVqj5w5l/7UUyjdY1Hms8tPs/6zz3efGZ8rd8XPMRz1ony/dpZ8LYGXTlWFTXFiuvQXZ2jB2rQgPEqLT2g2tqqr37hAEImpCfJzp49W7Nnzw5lCwCaIMHZQ73TrpMjvKPC7Q75Ta1KfXv1yfEdqjx94gvjO4THq6beqzp/U05IPffeFtv5b3xBVHii7OfYIbzjwFJ17zpMEeFRionsLEd4R/VOvkGH0tbpwMH1TegNQKi0iqt4AISOI7KjvpXxC9UFzighoo8kmyprPTrx6X4d/eTcX0+R9/Fr2r33XdnOFSq+5OS4c44/b4D5v+VGRqfKjn5hTL2/Rpt3LNKYG+bLV/OJEqL6KCG6rwb2GaNPyz/RyZP/PG8vAEKLgALgK9kcAdlkk80WJiO/jpVvVvEne89x7kmDT71fDAuhcrL8oHbse0PDB9+umoguqqotVUA1iotLIaAAFkZAAfDVaiJUE1GhT3ybZLdFqq66VrsOvhnqrpqk3l+jI8c2Kimpr07E7FNZWaE2bl3E56EAFhey7+K5GD6fT3FxcaFuA2g3ohzxGtz3VqWmDFdsX
Cd9uPlZHSxcG+q2mqVb1yGKie6sQ0c2yO+vDXU7QLvm9XrldDq/dAwBBUCTxcd2V2ry1dp9+C0FAv5QtwOglWpKQOEQD4AmK688pvLKY6FuA0A7ELIPagMAADgfAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALCcFg8ov/nNb2Sz2RrVwIEDg+urq6s1a9YsJSYmKjY2VpMmTVJJSUlLtwEAAFqxS7IH5corr1RxcXGwNmzYEFx3//336+2339brr7+unJwcHT9+XLfeeuulaAMAALRS4Zdko+HhcrlcX1ju9Xr1/PPPa/Hixfr2t78tSVq0aJGuuOIKbdy4Uddee+2laAcAALQyl2QPyoEDB5SSkqLevXtrypQpKiwslCTl5eWprq5OmZmZwbEDBw5UWlqacnNzz7u9mpoa+Xy+RgUAANquFg8oGRkZevHFF/Xuu+9q4cKFOnz4sL75zW+qoqJCHo9HkZGRio+Pb/SY5ORkeTye824zKytLcXFxwUpNTW3ptgEAgIW0+CGesWPHBm8PGTJEGRkZ6tGjh1577TVFRUVd0Dbnzp2rOXPmBO/7fD5CCgAAbdglv8w4Pj5e/fv318GDB+VyuVRbW6vy8vJGY0pKSs55zspZDodDTqezUQEAgLbrkgeUyspKHTp0SF27dtXw4cMVERGh7Ozs4PqCggIVFhbK7XZf6lYAAEAr0eKHeH7xi19owoQJ6tGjh44fP66HH35YYWFhuuOOOxQXF6fp06drzpw5SkhIkNPp1L333iu3280VPAAAIKjFA8qxY8d0xx136NSpU+rSpYu+8Y1vaOPGjerSpYsk6Q9/+IPsdrsmTZqkmpoajR49Wk8//XRLtwEAAFoxmzHGhLqJ5vL5fIqLiwt1GwAA4AJ4vd6vPJ+U7+IBAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACWQ0ABAACW0+yAsm7dOk2YMEEpKSmy2Wx64403Gq03xmj+/Pnq2rWroqKilJmZqQMHDjQaU1ZWpilTpsjpdCo+Pl7Tp09XZWXlRb0QAADQdjQ7oFRVVWno0KF66qmnzrn+8ccf15/+9Cc988wz2rRpk2JiYjR69GhVV1cHx0yZMkW7d+/W6tWrtWLFCq1bt0733HPPhb8KAADQtpiLIMksX748eD8QCBiXy2WeeOKJ4LLy8nLjcDjMK6+8YowxZs+ePUaS2bJlS3DMqlWrjM1mM5988kmTntfr9RpJFEVRFEW1wvJ6vV/5Xt+i56AcPnxYHo9HmZmZwWVxcXHKyMhQbm6uJCk3N1fx8fEaMWJEcExmZqbsdrs2bdp0zu3W1NTI5/M1KgAA0Ha1aEDxeDySpOTk5EbLk5OTg+s8Ho+SkpIarQ8PD1dCQkJwzOdlZWUpLi4uWKmpqS3ZNgAAsJhWcRXP3Llz5fV6g1VUVBTqlgAAwCXUogHF5XJJkkpKShotLykpCa5zuVwqLS1ttL6+vl5lZWXBMZ/ncDjkdDobFQAAaLtaNKD06tVLLpdL2dnZwWU+n0+bNm2S2+2WJLndbpWXlysvLy84Zs2aNQoEAsrIyGjJdgAAQCsV3twHVFZW6uDBg8H7hw8fVn5+vhISEpSWlqb77rtPjzzyiPr166devXpp3rx5SklJ0c033yxJuuKKK
zRmzBjdfffdeuaZZ1RXV6fZs2fr9ttvV0pKSou9MAAA0Io18YrioA8++OCclwxNnTrVGNNwqfG8efNMcnKycTgcZuTIkaagoKDRNk6dOmXuuOMOExsba5xOp5k2bZqpqKhocg9cZkxRFEVRrbeacpmxzRhj1Mr4fD7FxcWFug0AAHABvF7vV55P2iqu4gEAAO0LAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFgOAQUAAFhOswPKunXrNGHCBKWkpMhms+mNN95otP6HP/yhbDZboxozZkyjMWVlZZoyZYqcTqfi4+M1ffp0VVZWXtQLAQAAbUezA0pVVZWGDh2qp5566rxjxowZo+Li4mC98sorjdZPmTJFu3fv1urVq7VixQqtW7dO99xzT/O7BwAAbZO5CJLM8uXLGy2bOnWqmThx4nkfs2fPHiPJbNmyJbhs1apVxmazmU8++aRJz+v1eo0kiqIoiqJaYXm93q98r78k56CsXbtWSUlJGjBggGbOnKlTp04F1+Xm5io+Pl4jRowILsvMzJTdbtemTZvOub2amhr5fL5GBQAA2q4WDyhjxozR//zP/yg7O1u/+93vlJOTo7Fjx8rv90uSPB6PkpKSGj0mPDxcCQkJ8ng859xmVlaW4uLigpWamtrSbQMAAAsJb+kN3n777cHbgwcP1pAhQ9SnTx+tXbtWI0eOvKBtzp07V3PmzAne9/l8hBQAANqwS36Zce/evdW5c2cdPHhQkuRyuVRaWtpoTH19vcrKyuRyuc65DYfDIafT2agAAEDbdckDyrFjx3Tq1Cl17dpVkuR2u1VeXq68vLzgmDVr1igQCCgjI+NStwMAAFqBZh/iqaysDO4NkaTDhw8rPz9fCQkJSkhI0IIFCzRp0iS5XC4dOnRIv/zlL9W3b1+NHj1aknTFFVdozJgxuvvuu/XMM8+orq5Os2fP1u23366UlJSWe2UAAKD1atJ1vZ/xwQcfnPOSoalTp5rTp0+bUaNGmS5dupiIiAjTo0cPc/fddxuPx9NoG6dOnTJ33HGHiY2NNU6n00ybNs1UVFQ0uQcuM6YoiqKo1ltNuczYZowxamV8Pp/i4uJC3QYAALgAXq/3K88n5bt4AACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5RBQAACA5TQroGRlZenqq69Wx44dlZSUpJtvvlkFBQWNxlRXV2vWrFlKTExUbGysJk2apJKSkkZjCgsLNX78eEVHRyspKUkPPPCA6uvrL/7VAACANqFZASUnJ0ezZs3Sxo0btXr1atXV1WnUqFGqqqoKjrn//vv19ttv6/XXX1dOTo6OHz+uW2+9Nbje7/dr/Pjxqq2t1UcffaSXXnpJL774oubPn99yrwoAALRu5iKUlpYaSSYnJ8cYY0x5ebmJiIgwr7/+enDM3r17jSSTm5trjDFm5cqVxm63G4/HExyzcOFC43Q6TU1NTZOe1+v1GkkURVEURbXC8nq9X/lef1HnoHi9XklSQkKCJCkvL091dXXKzMwMjhk4cKDS0tKUm5srScrNzdXgwYOVnJwcHDN69Gj5fD7t3r37nM9TU1Mjn8/XqAAAQNt1wQElEAjovvvu03XXXadBgwZJk
jwejyIjIxUfH99obHJysjweT3DMZ8PJ2fVn151LVlaW4uLigpWamnqhbQMAgFbgggPKrFmztGvXLi1ZsqQl+zmnuXPnyuv1BquoqOiSPycAAAid8At50OzZs7VixQqtW7dO3bt3Dy53uVyqra1VeXl5o70oJSUlcrlcwTGbN29utL2zV/mcHfN5DodDDofjQloFAACtULP2oBhjNHv2bC1fvlxr1qxRr169Gq0fPny4IiIilJ2dHVxWUFCgwsJCud1uSZLb7dbOnTtVWloaHLN69Wo5nU6lp6dfzGsBAABtRTMu2jEzZ840cXFxZu3ataa4uDhYp0+fDo6ZMWOGSUtLM2vWrDFbt241brfbuN3u4Pr6+nozaNAgM2rUKJOfn2/effdd06VLFzN37twm98FVPBRFURTVeqspV/E0K6Cc74kWLVoUHHPmzBnzk5/8xHTq1MlER0ebW265xRQXFzfazpEjR8zYsWNNVFSU6dy5s/n5z39u6urqmtwHAYWiKIqiWm81JaDY/hU8WhWfz6e4uLhQtwEAAC6A1+uV0+n80jF8Fw8AALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALCc8FA30F7ZbDZFR0crIiJCxpiLKgAA2hoCSgjExcVpxIgR+uUvf6lvfetbqqysbFRVVVWqrKxURUVF8Pa51p+t+vp61dfXy+/3N/rzfLc/vywQCIR6SgAAaISAchlFRERoxIgR+t73vqe77rpLiYmJkqROnTqpU6dOF7RNY4xqamp05syZYFVXV3/h/unTp7+w/GzV1NSopqZGtbW1F3wbAICW1KyAkpWVpWXLlmnfvn2KiorS17/+df3ud7/TgAEDgmNuuOEG5eTkNHrcj3/8Yz3zzDPB+4WFhZo5c6Y++OADxcbGaurUqcrKylJ4eNvNS926ddM999yjW265Renp6QoLC2uR7dpsNnXo0EEdOnS44JDj9/tVV1en2tpa1dbWBm+fa9n5bn/wwQfKzs5WYWFhi7wuAEA7Z5ph9OjRZtGiRWbXrl0mPz/fjBs3zqSlpZnKysrgmOuvv97cfffdpri4OFherze4vr6+3gwaNMhkZmaa7du3m5UrV5rOnTubuXPnNrkPr9drJLWKstvt5vbbbzfbtm1rNE9tSSAQMKdOnTK7du0yzzzzjLnuuutMREREyOeeoiiKsmZ9NhecT7MCyueVlpYaSSYnJye47Prrrzc/+9nPzvuYlStXGrvdbjweT3DZwoULjdPpNDU1NU163tYQUBwOhxk6dKh54403TE1NjQkEAhc8z61FIBAw9fX1prq62mzbts3ce++9pkePHiY2NtbY7faQ/0zaQ/361zIbNshkZ8u88ILMuHEyiYkyCQky8fEyDkfoe2wvNX58w89i7VqZV1+VmTat4WeRmCjTqZNMdHToe6SoUFVTAorNmAu/DOTgwYPq16+fdu7cqUGDBklqOMSze/duGWPkcrk0YcIEzZs3T9HR0ZKk+fPn66233lJ+fn5wO4cPH1bv3r21bds2DRs27AvP8/nzHHw+n1JTUy+07UsqLCxM/fr106233qqf/vSnSk5ODnVLIXXq1CmtXLlSb7zxhvbt26eioiJVVFSEuq02a/586aabGi8zRqqvl0pLpbVrpQ8/bFhWWyudOCEdPx6SVtu8m25q+Hl8ljGS3y95vdKWLdJbbzUsq6uTysulI0dC0Slw+Xm9Xjmdzi8dc8EnfQQCAd1333267rrrguFEku6880716NFDKSkp2rFjhx588EEVFBRo2bJlkiSPx/OFN+2z9z0ezzmfKysrSwsWLLjQV
i+blJQU3XLLLfr+97+va6+9NtTtWEJiYqLuuusu3Xnnndq+fbvWr1+vTZs2acuWLTp69Kj8fn+oW2zzbDYpIkLq1k2aMkW6804pEJAqK6W9e6WtWxvunzkjHT0qbd4c6o7bLptNCg+XEhOlMWOk0aMbAsrp0w1zv3Ztw8+iuloqLpZycxvCC9AeXXBAmTVrlnbt2qUNGzY0Wn7PPfcEbw8ePFhdu3bVyJEjdejQIfXp0+eCnmvu3LmaM2dO8L7V9qBERkZq3Lhxuvvuu/WNb3zjK1NhexQWFqYRI0Zo2LBhOnnypA4cOKCtW7fqnXfe0dq1a1VfXx/qFtsNm00KC5Pi4qRrr5UyMhrvUdm/v+FNsqqq4fbf/97wWz9ans3WULGx0pVXSunpDcvr6qSyMqmgoOH2mTMNe1feekv69NOQtgxcNhcUUGbPnq0VK1Zo3bp16t69+5eOzcjIkNRwOKhPnz5yuVza/Llf0UpKSiRJLpfrnNtwOBxyOBwX0uolZbPZ1L17dz3yyCMaN26cEhISZLfz4bxfJiwsTMnJyUpOTtbVV1+tO++8U4WFhVq8eLFeffVVeTwePpflMjv7Jtmhg5SaKp39J+33N/wmP3asNG1aaHtsL2y2hj8jIyWXSzq7szkQkGpqpFtuadgLVlUVuh6By6VZAcUYo3vvvVfLly/X2rVr1atXr698zNlzTbp27SpJcrvdevTRR1VaWqqkpCRJ0urVq+V0OpV+9tcHiwsLC1OnTp30/e9/X/PmzVN8fDzB5AI4HA4lJSWpS5cuGjZsmBYsWKBVq1bppZdeUl5ennw+n6qrq/m03Evs7PSePczj9Tbc9vmk/Hxp4cKQtteufPZnUVPT8LPw+xsCyd690gsvEE7QfjQroMyaNUuLFy/Wm2++qY4dOwbPGYmLi1NUVJQOHTqkxYsXa9y4cUpMTNSOHTt0//3369/+7d80ZMgQSdKoUaOUnp6uu+66S48//rg8Ho8eeughzZo1y5J7ST4vISFB3/rWtzR79mx985vfbLHPM2nPbDabwsLC1LFjR33ve9/TpEmTtGvXLr311ltav369Dhw4oCOcPdhijGmomhrJ45EKCxveBCsqpB07pDffDHWH7cfZn8XZQzqfPbx24IC0dGnDzwloj5oVUBb+61epG264odHyRYsW6Yc//KEiIyP1/vvv649//KOqqqqUmpqqSZMm6aGHHgqODQsL04oVKzRz5ky53W7FxMRo6tSp+u1vf3vxr+YSstvtyszM1J133qlx48apS5cuoW6pzQoLC9PQoUM1dOhQlZSUaNu2bcrNzVVubq62bt2q8vLyULfYqhjT8KZXXi7t3Cnt2dP4HJOPPw51h+3H2Z9FVZV08KC0adP/7bkqKmq4wgpAg4u6zDhUfD6f4uLiLtvz9e7dW3PmzNHYsWPVu3fvy/a8+D/19fU6evSo9u/frzVr1uiNN97QwYMHQ92W5cyfL02Y0PAb+fHj0urVDXtFjGl4E/R4pH+d8oVL7KabpHnzGvZOnTrVED7OXqVTWyudPNmw9wpoj5pymTEB5UvExMRo8uTJ+sUvfqE+ffooIiJCtrNnsSEkAoGAzpw5o4qKCq1bt06LFi3Shg0bdPr0aU6ulfTss/9Pixc/r92796q+vuE39draUHfVPk2dOlmJiRH63//9m/z+hoB45kyouwKs4ZJ+Dkpb1rFjR1155ZVasGCBvvOd70gSwcQi7Ha7YmJiFB0drdtuu03f/e53dfDgQb366qt67bXXVFJSovLyctW10w+PCA9P0KefRurEiVB3Ars9WlVV/CyAC0VA+QyHw6FBgwbp9ttv14wZMxQbGxvqlnAeZwOjzWZT//79NW/ePD3wwAN69913tXr1am3btk379+9XWVlZiDsFAFwIAsq/dO3aVdOmTdNtt92mwYMHc3VOK9ShQwfdfPPNGjt2rP75z38qLy9PH330kdatW6e9e/dyCAgAWhECiqS77rpLP/3pT
zVgwAB17Ngx1O3gIjkcDl1xxRUaOHCgbrzxRh07dkwff/yx/v73v2vVqlWqrq4OdYsAgK/QbgNKeHi4Bg0apEceeUTXX389h3PaIJvNpvj4eMXHx2vgwIGaOHGiTpw4ob/97W9asmSJDh06pLq6OvasAIAFtbuAYrfb1bt3b02ePFn33nuvkpKSOAG2HQgPD1dsbKxiYmL00EMPac6cOVq/fr1efvllbd26VcXFxfJ6vaFuEwDwL+0qoHTr1k2jR4/Wj370I7nd7lC3gxA4G0ZjYmI0ZswYjRkzRgUFBXr//feVnZ2t9evX6+TJkyHuEgDQLgJKdHS0Ro0apbvuuktjxoxRdHR0qFuChQwYMEADBgzQ5MmT9f777+vVV1/VihUr+IZlAAihNv8NdykpKfrTn/6kp556ShMnTiSc4Lw6d+6s733ve3r66af16quv6pprrgl1SwDQbrXJgGK32xUbG6v7779f+/bt09SpU5WSksKlw/hKdrtdXbt21cSJE/XOO+/o6aefVv/+/RURERHq1gCgXWlzASUhIUETJkzQe++9pyeeeEIdO3ZUeHi7OJKFFhQWFqbOnTtr5syZeu+99zRnzhwNHDhQkZGRoW4NANqFNvPObbPZ9O1vf1uTJ0/WzTffzLcNo8X07NlTWVlZ+u53v6tXXnlFK1eu1P79+7k8GQAuoTYRUAYMGKAZM2Zo3Lhx6t+/f6jbQRtks9k0YsQIXXnllbr11lu1fPlyPfvss6qoqAh1awDQJrXqgBITE6M77rhDP/nJT3TllVdyngAuuaioKF133XUaPHiwpk2bpqysLL388suhbgsA2pxWfQ7K//7v/+rpp5/WVVddpcjISD5wDZeN0+lUenq6nn/+eX344YfKzMxUbGwsfwcBoIW06oAycuRIRURE8KaAkLDZbHI4HPr617+u5cuX66mnntL111+v+Pj4ULcGAK1eqw4ogFXExsbqBz/4gRYvXqz//M//1MiRIxUTExPqtgCg1SKgAC2oa9eumjFjhhYuXKjHH39cQ4cOld3OPzMAaC7+5wRaWHh4uPr166fp06dr5cqVysrKUlxcHIciAaAZCCjAJeJwOJSSkqJf/OIX2rdvn2bMmKGkpCQ+0RgAmoCAAlxidrtdLpdLTz/9tN58803deeed6tmzZ6jbAgBLa9WfgwK0Ntdee62GDRum999/X3//+9+1atUqeTyeULcFAJbDHhTgMnM4HBo3bpyeeOIJPf/88/rud7+rDh06hLotALAUAgoQAjabTYmJiRo9erSee+45LV26VMOHD+f8FAD4FwIKEEJhYWGKi4vTuHHjtH79ej311FMaNmyYoqOjQ90aAIQUAQWwAJvNpqioKP34xz/WsmXL9OCDD+qaa64JdVsAEDIEFMBievbsqfnz5+vZZ5/VI488osGDB4e6JQC47LiKB7CooUOHKj09XTfeeKOWLVumhQsX6sSJE6FuCwAuC/agABYWERGhIUOG6Ne//rXWr1+vH/7wh4qKiuLj8wG0ec36X27hwoUaMmSInE6nnE6n3G63Vq1aFVxfXV2tWbNmKTExUbGxsZo0aZJKSkoabaOwsFDjx49XdHS0kpKS9MADD6i+vr5lXg3QBp391uT+/fvr+eef1z/+8Q/ddNNNSkpKIqgAaLOa9b9b9+7d9dhjjykvL09bt27Vt7/9bU2cOFG7d++WJN1///16++239frrrysnJ0fHjx/XrbfeGny83+/X+PHjVVtbq48++kgvvfSSXnzxRc2fP79lXxXQBtlsNtntdn3jG9/QokWL9Mc//lETJ05UfHx8qFsDgJZnLlKnTp3Mc889Z8rLy01ERIR5/fXXg+v27t1rJJnc3FxjjDErV640drvdeDye4JiFCxcap9NpampqmvycXq/XSDJer/di2wdatWPHjpmXX37ZjBs3zkRGRpoXXnjBDB061EiiQlzTpk0zP/7xj0PeB0VZsZry/n3B+4f9fr+WLFmiqqoqu
d1u5eXlqa6uTpmZmcExAwcOVFpamnJzcyVJubm5Gjx4sJKTk4NjRo8eLZ/PF9wLcy41NTXy+XyNCoDUrVs33X777frLX/6iRYsWaenSpdq/f3+o24Kkt99+W8uWLQt1G0Cr1eyreHbu3Cm3263q6mrFxsZq+fLlSk9PV35+viIjI7+wuzk5OTn4XSMej6dRODm7/uy688nKytKCBQua2yrQLtjtdnXv3l2TJ0/WxIkT5ff7Q90S/uXYsWPKzc3Vhg0btGnTJh0/flx1dXXBAnB+zQ4oAwYMUH5+vrxer5YuXaqpU6cqJyfnUvQWNHfuXM2ZMyd43+fzKTU19ZI+J9DahIWFKSYmJtRt4DPS09OVnp6u6dOnq76+XoWFhcrLy9O2bduUl5en0tJSlZeXq7y8XD6fT8aYULcMWEazA0pkZKT69u0rSRo+fLi2bNmiJ598UpMnT1Ztba3Ky8sb7UUpKSmRy+WSJLlcLm3evLnR9s5e5XN2zLk4HA45HI7mtgoAlhEeHq7evXurd+/euu222+T3+1VUVKT9+/dr//79OnjwoI4ePaqioiIVFRXp1KlT7A1Du3bRH9QWCARUU1Oj4cOHKyIiQtnZ2Zo0aZIkqaCgQIWFhXK73ZIkt9utRx99VKWlpUpKSpIkrV69Wk6nU+np6RfbCgC0GmFhYerZs6d69uypUaNGye/3q6SkRB6PRx6PR0VFRdq7d6927dqlPXv2yOPxsIcF7UqzAsrcuXM1duxYpaWlqaKiQosXL9batWv13nvvKS4uTtOnT9ecOXOUkJAgp9Ope++9V263W9dee60kadSoUUpPT9ddd92lxx9/XB6PRw899JBmzZrFHhIA7VpYWJhSUlKUkpIiSaqvr9fp06dVWVmpyspKFRYWasuWLdq4caO2bt2q4uJiAgvatGYFlNLSUv3gBz9QcXGx4uLiNGTIEL333nv6zne+I0n6wx/+ILvdrkmTJqmmpkajR4/W008/HXx8WFiYVqxYoZkzZ8rtdismJkZTp07Vb3/725Z9VQDQyoWHhwc/FFOS+vXrp29961vy+/2qq6vT0aNH9dFHHyk3N1e5ubk6ceKEamtrVVNTo5qamhB3D1w8m2mFEdzn8ykuLk5erzf4jxcA2iu/369Dhw4pPz9f+fn52rFjh0pKSlRWVqZTp06poqJCgUAg1G0CQU15/yagAEAbEwgEdOTIER06dEiHDh3SP//5Tx05ckSHDx/W0aNHdfLkSQ4PIaQIKAAA1dbWqqysTKWlpTpx4oQKCwu1a9cu7dixQzt37vzCd6YBl1pT3r8v+ioeAIC1RUZGyuVyBT/Oob6+XmfOnNGZM2dUVVWlo0ePKjc3Vxs3btTGjRt18uRJSZIxhj0tCBn2oABAO3b2LeBsGKmvr9f+/fu1cePG4Am4Xq9Xp0+f1pkzZ1RbWxvijtEWcIgHAHBRjDHavXu3tm/frm3btmn//v3yeDwqKSnRyZMnuWIIF4SAAgBoUT6fT0eOHNHBgwd16NAhHTx4UAcOHNCBAwd07NixULeHVoKAAgC4ZPx+vyoqKnTixAmdPHlSx44d09atW7V161bl5eXJ6/WGukVYFAEFAHBZnD2Hpbq6OvhhcR9//LFycnK0bt067dq1S6dPn1YgEOA7hkBAAQCEzmdPwC0rK9O2bdu0bt06bdy4UYcPH1ZlZaWqqqpUVVUV4k5xuRFQAACW4/f7dfz4ce3YsSNYxcXFOn78uI4fP05gaQcIKAAAy6uvr9fRo0d19OjR4Am4e/fu1b59+3TkyBFVV1eHukW0MD6oDQBgeeHh4erTp4/69OkjSTp9+rTKysr06aefyuPx6OOPP9amTZu0efNmFRYWhrhbXC7sQQEAWFYgEFB9fb3q6upUU1OjoqIi5eTkKCcnRxs2bFBFRYX8fr/q6+v5QsRWhEM8AIA24/NvV36/X9u3b9fGjRv10UcfqaCgQJ9++qm8Xq8qKipUX18fok7xVQgoAIB2o
6ysTHv27NGuXbu0Z88eHTlyREVFRSoqKtKpU6dC3R4+g4ACAGiX6uvrgx8eV1hYqB07duiVV17RgQMH+AJECyCgAAAg6cyZMzp69KiWL1+u3//+98FvbEZoEFAAAPiMuro6VVZW6sknn9QLL7ygkpISvqE5BAgoAACcR2Fhof7617/qnXfeUUFBgU6fPh3qltoNAgoAAF9h3759Wr58ud577z3l5eWpsrIy1C21eQQUAACawO/368CBA1q3bp2WLFmiDz/8kEM/lxABBQCAZqitrdXJkye1fv16/f73v9fmzZtD3VKb1JT3b/tl6gUAAMuLjIxUSkqKbrvtNmVnZ2vJkiVyu92KjY2VzWYLdXvtCntQAAD4Ep9++qnefPNNLV68WB9//LFKS0tD3VKrxyEeAABayKlTp7Ry5UqtWrVK2dnZBJWLQEABAKAFBQIBnThxQnl5eVq6dKlefvllTqa9AAQUAAAugUAgIJ/Pp8LCQj366KP6+9//Lr/fH+q2Wg1OkgUA4BKw2+2Kj4/X4MGDtXjxYm3dulW33HKLkpKSZLfz1toS2IMCAEALWbt2rZ5//nlt2rRJ//znP9mrch4c4gEA4DKrr69XTk6O3n33Xb399ts6cOCAAoFAqNuylBY/xLNw4UINGTJETqdTTqdTbrdbq1atCq6/4YYbZLPZGtWMGTMabaOwsFDjx49XdHS0kpKS9MADD6i+vr45bQAAYFnh4eEaOXKk5s+fr8WLF+s3v/mNunfvHuq2Wp3w5gzu3r27HnvsMfXr10/GGL300kuaOHGitm/friuvvFKSdPfdd+u3v/1t8DHR0dHB236/X+PHj5fL5dJHH32k4uJi/eAHP1BERIT+67/+q4VeEgAAodexY0d97WtfU3p6uqZOnarnn39ezz//vIqLi9vNHhWbzSa73R78U1KTr3q66EM8CQkJeuKJJzR9+nTdcMMNuuqqq/THP/7xnGNXrVqlG2+8UcePH1dycrIk6ZlnntGDDz6oEydOKDIysknPySEeAEBrcvat9ujRo3r22We1bNkyFRYW6syZMyHurGVEREQoKipKHTp0UIcOHeRwOBQREaGrr75aAwYMCFZycrK6dOlyac9B8fv9ev311zV16lRt375d6enpuuGGG7R7924ZY+RyuTRhwgTNmzcvuBdl/vz5euutt5Sfnx/czuHDh9W7d29t27ZNw4YNO+dz1dTUqKamJnjf5/MpNTWVgAIAaJUKCgr0t7/9TatXr9bu3btb1Tcod+rUSYmJiUpISFB8fLzi4+PVrVs39ejRQz169FBaWprS0tKUkJDwhSuamrODoVmHeCRp586dcrvdqq6uVmxsrJYvX6709HRJ0p133qkePXooJSVFO3bs0IMPPqiCggItW7ZMkuTxeIJ7Ts46e9/j8Zz3ObOysrRgwYLmtgoAgCUNGDBADz/8sG6//Xa9//77evPNN7V+/XpLnZPZsWNHpaamqlu3burWrZtSUlKUmJgol8sll8ulzp07q0uXLkpMTGzyEZDmaPYelNraWhUWFsrr9Wrp0qV67rnnlJOTEwwpn7VmzRqNHDlSBw8eVJ8+fXTPPffo6NGjeu+994JjTp8+rZiYGK1cuVJjx44953OyBwUA0FbV1NTok08+UW5urv785z9r48aNl/X5IyIi1KNHD/Xt21d9+/ZVv3791KtXL3Xp0kUdO3ZUTExMsDp06HBRn/NySfegREZGqm/fvpKk4cOHa8uWLXryySf1l7/85QtjMzIyJCkYUFwu1xe+urqkpESS5HK5zvucDodDDoejua0CAGB5DodDvXv3VlpamsaMGaPVq1fr8ccf1+7duy/qY/TtdnuwwsLCZLfbFRMTo+HDh+uKK67QFVdcofT0dPXp00cOh0Ph4eEKCwsL/hnqD5xrdkD5vEAg0GjvxmedPdeka9eukiS3261HH31UpaWlSkpKkiStXr1aTqfznHtgAABoL8LDw5WQkKDJkyfrxhtv1
NKlS/X000/r0KFDKisrO+/jwsLCFBUVpejo6OCf0dHR6t27t/r27av+/furX79+6tevn7p06XLObdhstkv1si5Ysw7xzJ07V2PHjlVaWpoqKiq0ePFi/e53v9N7772n3r17a/HixRo3bpwSExO1Y8cO3X///erevbtycnIkNZxYe9VVVyklJUWPP/64PB6P7rrrLv3oRz9q1mXGXMUDAGgPfD6fXnvtNb399tvasGGD6urqlJiYGDxJNTExUV26dFFqaqq6d++u7t27B88bCQ8Pt1zwaM77d7MCyvTp05Wdna3i4mLFxcVpyJAhevDBB/Wd73xHRUVF+v73v69du3apqqpKqampuuWWW/TQQw81auLo0aOaOXOm1q5dq5iYGE2dOlWPPfaYwsObvjOHgAIAaC+MMSopKdFHH32k2traLwSUmJiYkB+OaapLFlCsgoACAEDr05z379YRuQAAQLtCQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJZDQAEAAJYTHuoGLoQxRpLk8/lC3AkAAGiqs+/bZ9/Hv0yrDCgVFRWSpNTU1BB3AgAAmquiokJxcXFfOsZmmhJjLCYQCKigoEDp6ekqKiqS0+kMdUutls/nU2pqKvPYApjLlsNctgzmseUwly3DGKOKigqlpKTIbv/ys0xa5R4Uu92ubt26SZKcTid/WVoA89hymMuWw1y2DOax5TCXF++r9pycxUmyAADAcggoAADAclptQHE4HHr44YflcDhC3Uqrxjy2HOay5TCXLYN5bDnM5eXXKk+SBQAAbVur3YMCAADaLgIKAACwHAIKAACwHAIKAACwnFYZUJ566in17NlTHTp0UEZGhjZv3hzqlixn3bp1mjBhglJSUmSz2fTGG280Wm+M0fz589W1a1dFRUUpMzNTBw4caDSmrKxMU6ZMkdPpVHx8vKZPn67KysrL+CpCLysrS1dffbU6duyopKQk3XzzzSooKGg0prq6WrNmzVJiYqJiY2M1adIklZSUNBpTWFio8ePHKzo6WklJSXrggQdUX19/OV9KSC1cuFBDhgwJfsiV2+3WqlWrguuZwwv32GOPyWaz6b777gsuYz6b5je/+Y1sNlujGjhwYHA98xhippVZsmSJiYyMNC+88ILZvXu3ufvuu018fLwpKSkJdWuWsnLlSvMf//EfZtmyZUaSWb58eaP1jz32mImLizNvvPGG+fjjj81NN91kevXqZc6cORMcM2bMGDN06FCzceNGs379etO3b19zxx13XOZXElqjR482ixYtMrt27TL5+flm3LhxJi0tzVRWVgbHzJgxw6Smpprs7GyzdetWc+2115qvf/3rwfX19fVm0KBBJjMz02zfvt2sXLnSdO7c2cydOzcULykk3nrrLfPOO++Y/fv3m4KCAvPrX//aREREmF27dhljmMMLtXnzZtOzZ08zZMgQ87Of/Sy4nPlsmocffthceeWVpri4OFgnTpwIrmceQ6vVBZRrrrnGzJo1K3jf7/eblJQUk5WVFcKurO3zASUQCBiXy2WeeOKJ4LLy8nLjcDjMK6+8YowxZs+ePUaS2bJlS3DMqlWrjM1mM5988sll691qSktLjSSTk5NjjGmYt4iICPP6668Hx+zdu9dIMrm5ucaYhrBot9uNx+MJjlm4cKFxOp2mpqbm8r4AC+nUqZN57rnnmMMLVFFRYfr162dWr15trr/++mBAYT6b7uGHHzZDhw495zrmMfRa1SGe2tpa5eXlKTMzM7jMbrcrMzNTubm5IeysdTl8+LA8H
k+jeYyLi1NGRkZwHnNzcxUfH68RI0YEx2RmZsput2vTpk2XvWer8Hq9kqSEhARJUl5enurq6hrN5cCBA5WWltZoLgcPHqzk5OTgmNGjR8vn82n37t2XsXtr8Pv9WrJkiaqqquR2u5nDCzRr1iyNHz++0bxJ/J1srgMHDiglJUW9e/fWlClTVFhYKIl5tIJW9WWBJ0+elN/vb/SXQZKSk5O1b9++EHXV+ng8Hkk65zyeXefxeJSUlNRofXh4uBISEoJj2ptAIKD77rtP1113nQYNGiSpYZ4iIyMVHx/faOzn5/Jcc312XXuxc+dOud1uVVdXKzY2VsuXL1d6erry8/OZw2ZasmSJtm3bpi1btnxhHX8nmy4jI0MvvviiBgwYoOLiYi1YsEDf/OY3tWvXLubRAlpVQAFCadasWdq1a5c2bNgQ6lZapQEDBig/P19er1dLly7V1KlTlZOTE+q2Wp2ioiL97Gc/0+rVq9WhQ4dQt9OqjR07Nnh7yJAhysjIUI8ePfTaa68pKioqhJ1BamVX8XTu3FlhYWFfOIu6pKRELpcrRF21Pmfn6svm0eVyqbS0tNH6+vp6lZWVtcu5nj17tlasWKEPPvhA3bt3Dy53uVyqra1VeXl5o/Gfn8tzzfXZde1FZGSk+vbtq+HDhysrK0tDhw7Vk08+yRw2U15enkpLS/W1r31N4eHhCg8PV05Ojv70pz8pPDxcycnJzOcFio+PV//+/XXw4EH+XlpAqwookZGRGj58uLKzs4PLAoGAsrOz5Xa7Q9hZ69KrVy+5XK5G8+jz+bRp06bgPLrdbpWXlysvLy84Zs2aNQoEAsrIyLjsPYeKMUazZ8/W8uXLtWbNGvXq1avR+uHDhysiIqLRXBYUFKiwsLDRXO7cubNR4Fu9erWcTqfS09MvzwuxoEAgoJqaGuawmUaOHKmdO3cqPz8/WCNGjNCUKVOCt5nPC1NZWalDhw6pa9eu/L20glCfpdtcS5YsMQ6Hw7z44otmz5495p577jHx8fGNzqJGwxn+27dvN9u3bzeSzO9//3uzfft2c/ToUWNMw2XG8fHx5s033zQ7duwwEydOPOdlxsOGDTObNm0yGzZsMP369Wt3lxnPnDnTxMXFmbVr1za6FPH06dPBMTNmzDBpaWlmzZo1ZuvWrcbtdhu32x1cf/ZSxFGjRpn8/Hzz7rvvmi5durSrSxF/9atfmZycHHP48GGzY8cO86tf/crYbDbzj3/8wxjDHF6sz17FYwzz2VQ///nPzdq1a83hw4fNhx9+aDIzM03nzp1NaWmpMYZ5DLVWF1CMMea///u/TVpamomMjDTXXHON2bhxY6hbspwPPvjASPpCTZ061RjTcKnxvHnzTHJysnE4HGbkyJGmoKCg0TZOnTpl7rjjDhMbG2ucTqeZNm2aqaioCMGrCZ1zzaEks2jRouCYM2fOmJ/85CemU6dOJjo62txyyy2muLi40XaOHDlixo4da6Kiokznzp3Nz3/+c1NXV3eZX03o/Pu//7vp0aOHiYyMNF26dDEjR44MhhNjmMOL9fmAwnw2zeTJk03Xrl1NZGSk6datm5k8ebI5ePBgcD3zGFo2Y4wJzb4bAACAc2tV56AAAID2gYACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAs5/8DDrVdBLR0dasAAAAASUVORK5CYII=\n" }, "metadata": {} } ] }, { "cell_type": "code", "source": [ "action = envr.action_space.sample()\n", "obs, reward, done, done2, info = envr.step(action)\n", "screen = envr.render()\n", "\n", "plt.imshow(screen)" ], "metadata": { "id": "QRCUBcLcqkyq", 
"colab": { "base_uri": "https://localhost:8080/", "height": 413 }, "outputId": "0c8daef4-9189-40d7-84ff-89de15983dde" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": {}, "execution_count": 9 }, { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "iVBORw0KGgoAAAANSUhEUgAAAigAAAF7CAYAAAD4/3BBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA5MklEQVR4nO3deXiU5b3/8c9MliELk5BAMgQSdsHIIgXFqVptSVlFVGpRqaUcqoWCrdJaS49g6dETq7+rrT1FbCtFz6mISsEFWYpBAkjYApE9LAIJkkmAmJkkkHXu3x9ppkZACQTmSfJ+Xdf3yszz3DPznTuB53M9y4zNGGMEAABgIfZgNwAAAPBFBBQAAGA5BBQAAGA5BBQAAGA5BBQAAGA5BBQAAGA5BBQAAGA5BBQAAGA5BBQAAGA5BBQAAGA5QQ0oc+fOVdeuXdWmTRsNGTJEW7ZsCWY7AADAIoIWUN544w3NmDFDTz31lLZv364BAwZo+PDhKioqClZLAADAImzB+rLAIUOG6IYbbtCf/vQnSZLf71dycrIeeeQR/fKXvwxGSwAAwCJCg/GiVVVVys7O1syZMwPL7Ha70tLSlJWVdc74yspKVVZWBu77/X4VFxcrPj5eNpvtqvQMAAAujzFGpaWlSkpKkt3+5QdxghJQTp06pdraWiUmJjZYnpiYqP37958zPj09XXPmzLla7QEAgCsoPz9fnTt3/tIxQQkojTVz5kzNmDEjcN/r9SolJSWIHQGt05hbn1NUfKwSo/oqIixeZ6tP6+jJ9VqZ+YwqK32SpNQ+I/SNAY8ptk2KZLvyp7l9duaQVm/7b33ySZZCQ8J1y/XT1Tm5n5xtkuV0JKnGX6GcYwuVlT1f5eWnr3g/AL5a27Ztv3JMUAJK+/btFRISosLCwgbLCwsL5XK5zhnvcDjkcDiuVnsAzuPabqOVkNhL9lC7YiNTZGRUUnVE+w6vUnVVeWBcSEiYwsOj5AhvK9tVCCjhNdEKsYfJZrOpprZC+4+ulCuxj2oizig0zCGHLVq9Ow9TXsFWHTy0TsbUXvGeAHy5izk9IyhX8YSHh2vQoEHKyMgILPP7/crIyJDb7Q5GSwC+hE02xUR3lN9epbiInpJsqqgpUdFn+1RcfFT+Bht9m6Sre+69zWZT/f93hcX7dDRvkyqqS3S2+jNJNjnbJOuarmmKjo6/qn0BuHRBu8x4xowZ+utf/6pXX31V+/bt09SpU1VeXq5JkyYFqyUAF5DiulFDrv+B2oS2U3hItIz8Kq08oX2Hl+vUZ580GBuc09b//arG1GrvkeUq/eyUTp3ZK7+pUXhItLomfl2dOg6Q3R4SlA4BNE7QzkEZP368Tp48qdmzZ8vj8ej666/XypUrzzlxFkBwhdjD1aXTTTpbUyKnI0l2W6gqa33KP7lFp04dPWe8kV9nqk/Lbjvffy8m8MOcs5fFfO7WhddJdVcC1CurLvzX1QD/Dim+8hPae2iFBg64R97KPMVF9FCMI1nX9hiu4ydyVFZ28qvfOICgCupJstOnT9f06dOD2QKArxAeFqXBfb+nsqpCRYTGSZJOlx+Qp2i/ThYfPGe8IzRWNklnqz9Tw8PMF7Nv5dwx53/Uv5e2CYlRmC1aX/xIp4P5Gers+prCwtsoOryjwkMildL+JnXv5tbOXe9eRC8AgqlZXMUDIHjGDf0f+SrzFePoohB7uCprfDpZul/7Dqw87/gDhz7QiYJdsgVCxIUihu2LCy5+bMMHqdh75JyTX/3+Gm3eNV/jkv6o4rOH5IruL6cjWX26D1dh0X4VFh64wHMCsAICCoAv1SYqWmeri5UQ1V9+UytP+U4VeQ6r/Oz5L9n1lXvkK/dc5S7Pr+xMkTbtmK+v3/hDnakuVq2/UjX+Crlc1xJQAIsjoAD4UuVlnykswqHjvixFhMar7GzdRr+5OHpikzp/OlDeuOPy+k5oW85Ceb0FwW4LwFcgoAD4Um+selh9e
4xVj+RbZWJt+uTIRzLGH+y2LlplVam27XpNnTr205Fjm3S2oiTYLQG4CEH7ssDL4fP5FBMTE+w2gFYlsk28Ulw3KM+zWWcqPgt2OwCaMa/XK6fT+aVjCCgAAOCqupiAErQPagMAALgQAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALAcAgoAALCcJg8ov/71r2Wz2RpUnz59AusrKio0bdo0xcfHKzo6WuPGjVNhYWFTtwEAAJqxK7IH5brrrlNBQUGgNmzYEFj32GOP6b333tNbb72lzMxMnThxQvfcc8+VaAMAADRToVfkSUND5XK5zlnu9Xo1f/58LVy4UN/61rckSQsWLNC1116rTZs26aabbroS7QAAgGbmiuxBOXjwoJKSktS9e3dNmDBBeXl5kqTs7GxVV1crLS0tMLZPnz5KSUlRVlbWBZ+vsrJSPp+vQQEAgJaryQPKkCFD9Morr2jlypWaN2+ejhw5oltvvVWlpaXyeDwKDw9XbGxsg8ckJibK4/Fc8DnT09MVExMTqOTk5KZuGwAAWEiTH+IZOXJk4Hb//v01ZMgQdenSRW+++aYiIiIu6TlnzpypGTNmBO77fD5CCgAALdgVv8w4NjZW11xzjQ4dOiSXy6WqqiqVlJQ0GFNYWHjec1bqORwOOZ3OBgUAAFquKx5QysrKdPjwYXXs2FGDBg1SWFiYMjIyAutzc3OVl5cnt9t9pVsBAADNRJMf4vn5z3+uMWPGqEuXLjpx4oSeeuophYSE6P7771dMTIwmT56sGTNmKC4uTk6nU4888ojcbjdX8AAAgIAmDyjHjx/X/fffr9OnT6tDhw665ZZbtGnTJnXo0EGS9Pvf/152u13jxo1TZWWlhg8frhdffLGp2wAAAM2YzRhjgt1EY/l8PsXExAS7DQAAcAm8Xu9Xnk/Kd/EAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLIaAAAADLaXRAWbduncaMGaOkpCTZbDa9/fbbDdYbYzR79mx17NhRERERSktL08GDBxuMKS4u1oQJE+R0OhUbG6vJkyerrKzsst4IAABoORodUMrLyzVgwADNnTv3vOufe+45/fGPf9RLL72kzZs3KyoqSsOHD1dFRUVgzIQJE7Rnzx6tXr1ay5Yt07p16/Twww9f+rsAAAAti7kMkszSpUsD9/1+v3G5XOb5558PLCspKTEOh8O8/vrrxhhj9u7daySZrVu3BsasWLHC2Gw28+mnn17U63q9XiOJoiiKoqhmWF6v9yu39U16DsqRI0fk8XiUlpYWWBYTE6MhQ4YoKytLkpSVlaXY2FgNHjw4MCYtLU12u12bN28+7/NWVlbK5/M1KAAA0HI1aUDxeDySpMTExAbLExMTA+s8Ho8SEhIarA8NDVVcXFxgzBelp6crJiYmUMnJyU3ZNgAAsJhmcRXPzJkz5fV6A5Wfnx/slgAAwBXUpAHF5XJJkgoLCxssLywsDKxzuVwqKipqsL6mpkbFxcWBMV/kcDjkdDobFAAAaLmaNKB069ZNLpdLGRkZgWU+n0+bN2+W2+2WJLndbpWUlCg7OzswZs2aNfL7/RoyZEhTtgMAAJqp0MY+oKysTIcOHQrcP3LkiHJychQXF6eUlBQ9+uijevrpp9WrVy9169ZNs2bNU
lJSku666y5J0rXXXqsRI0booYce0ksvvaTq6mpNnz5d9913n5KSkprsjQEAgGbsIq8oDvjwww/Pe8nQxIkTjTF1lxrPmjXLJCYmGofDYYYOHWpyc3MbPMfp06fN/fffb6Kjo43T6TSTJk0ypaWlF90DlxlTFEVRVPOti7nM2GaMMWpmfD6fYmJigt0GAAC4BF6v9yvPJ20WV/EAAIDWhYACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsp9EBZd26dRozZoySkpJks9n09ttvN1j/gx/8QDabrUGNGDGiwZji4mJNmDBBTqdTsbGxmjx5ssrKyi7rjQAAgJaj0QGlvLxcAwYM0Ny5cy84ZsSIESooKAjU66+/3mD9hAkTtGfPHq1evVrLli3TunXr9PDDDze+ewAA0DKZyyDJLF26tMGyiRMnmrFjx17wMXv37jWSzNatWwPLVqxYYWw2m/n0008v6nW9Xq+RRFEURVFUMyyv1/uV2/orcg7K2rVrlZCQoN69e2vq1Kk6ffp0YF1WVpZiY2M1ePDgwLK0tDTZ7XZt3rz5vM9XWVkpn8/XoAAAQMvV5AFlxIgR+t///V9lZGTot7/9rTIzMzVy5EjV1tZKkjwejxISEho8JjQ0VHFxcfJ4POd9zvT0dMXExAQqOTm5qdsGAAAWEtrUT3jfffcFbvfr10/9+/dXjx49tHbtWg0dOvSSnnPmzJmaMWNG4L7P5yOkAADQgl3xy4y7d++u9u3b69ChQ5Ikl8uloqKiBmNqampUXFwsl8t13udwOBxyOp0NCgAAtFxXPKAcP35cp0+fVseOHSVJbrdbJSUlys7ODoxZs2aN/H6/hgwZcqXbAQAAzUCjD/GUlZUF9oZI0pEjR5STk6O4uDjFxcVpzpw5GjdunFwulw4fPqxf/OIX6tmzp4YPHy5JuvbaazVixAg99NBDeumll1RdXa3p06frvvvuU1JSUtO9MwAA0Hxd1HW9n/Phhx+e95KhiRMnmjNnzphhw4aZDh06mLCwMNOlSxfz0EMPGY/H0+A5Tp8+be6//34THR1tnE6nmTRpkiktLb3oHrjMmKIoiqKab13MZcY2Y4xRM+Pz+RQTExPsNgAAwCXwer1feT4p38UDAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsh4ACAAAsp1EBJT09XTfccIPatm2rhIQE3XXXXcrNzW0wpqKiQtOmTVN8fLyio6M1btw4FRYWNhiTl5en0aNHKzIyUgkJCXr88cdVU1Nz+e8GAAC0CI0KKJmZmZo2bZo2bdqk1atXq7q6WsOGDVN5eXlgzGOPPab33ntPb731ljIzM3XixAndc889gfW1tbUaPXq0qqqqtHHjRr366qt65ZVXNHv27KZ7VwAAoHkzl6GoqMhIMpmZmcYYY0pKSkxYWJh56623AmP27dtnJJmsrCxjjDHLly83drvdeDyewJh58+YZp9NpKisrL+p1vV6vkURRFEVRVDMsr9f7ldv6yzoHxev1SpLi4uIkSdnZ2aqurlZaWlpgTJ8+fZSSkqKsrCxJUlZWlvr166fExMTAmOHDh8vn82nPnj3nfZ3Kykr5fL4GBQAAWq5LDih+v
1+PPvqobr75ZvXt21eS5PF4FB4ertjY2AZjExMT5fF4AmM+H07q19evO5/09HTFxMQEKjk5+VLbBgAAzcAlB5Rp06Zp9+7dWrRoUVP2c14zZ86U1+sNVH5+/hV/TQAAEDyhl/Kg6dOna9myZVq3bp06d+4cWO5yuVRVVaWSkpIGe1EKCwvlcrkCY7Zs2dLg+eqv8qkf80UOh0MOh+NSWgUAAM1Qo/agGGM0ffp0LV26VGvWrFG3bt0arB80aJDCwsKUkZERWJabm6u8vDy53W5Jktvt1q5du1RUVBQYs3r1ajmdTqWmpl7OewEAAC1FIy7aMVOnTjUxMTFm7dq1pqCgIFBnzpwJjJkyZYpJSUkxa9asMdu2bTNut9u43e7A+pqaGtO3b18zbNgwk5OTY1auXGk6dOhgZs6cedF9cBUPRVEURTXfupireBoVUC70QgsWLAiMOXv2rPnxj39s2rVrZyIjI83dd99tCgoKGjzP0aNHzciRI01ERIRp3769+dnPfmaqq6svug8CCkVRFEU137qYgGL7V/BoVnw+n2JiYoLdBgAAuARer1dOp/NLx/BdPAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHIIKAAAwHJCg91Aa2Wz2RQZGamwsDAZYy6rAABoaQgoQRATE6PBgwfrF7/4hb75zW+qrKysQZWXl6usrEylpaWB2+dbX181NTWqqalRbW1tg58Xuv3FZX6/P9hTAgBAAwSUqygsLEyDBw/Wd7/7XT344IOKj4+XJLVr107t2rW7pOc0xqiyslJnz54NVEVFxTn3z5w5c87y+qqsrFRlZaWqqqou+TYAAE2pUQElPT1dS5Ys0f79+xUREaGvf/3r+u1vf6vevXsHxtx+++3KzMxs8Lgf/ehHeumllwL38/LyNHXqVH344YeKjo7WxIkTlZ6ertDQlpuXOnXqpIcfflh33323UlNTFRIS0iTPa7PZ1KZNG7Vp0+aSQ05tba2qq6tVVVWlqqqqwO3zLbvQ7Q8//FAZGRnKy8trkvcFAGjlTCMMHz7cLFiwwOzevdvk5OSYUaNGmZSUFFNWVhYYc9ttt5mHHnrIFBQUBMrr9QbW19TUmL59+5q0tDSzY8cOs3z5ctO+fXszc+bMi+7D6/UaSc2i7Ha7ue+++8z27dsbzFNL4vf7zenTp83u3bvNSy+9ZG6++WYTFhYW9LmnKIqirFmfzwUX0qiA8kVFRUVGksnMzAwsu+2228xPf/rTCz5m+fLlxm63G4/HE1g2b94843Q6TWVl5UW9bnMIKA6HwwwYMMC8/fbbprKy0vj9/kue5+bC7/ebmpoaU1FRYbZv324eeeQR06VLFxMdHW3sdnvQfyetoX71K5kNG2QyM2XeeEPm7rtl4uPrql07mTZtgt9ja6nRo+t+F+vWybz7rsz06f/+XcTFyURFBb9HigpWXUxAsRlz6ZeBHDp0SL169dKuXbvUt29fSXWHePbs2SNjjFwul8aMGaNZs2YpMjJSkjR79my9++67ysnJCTzPkSNH1L17d23fvl0DBw4853W+eJ6Dz+dTcnLypbZ9RYWEhKhXr16655579JOf/ESJiYnBbimoTp8+reXLl+vtt9/W/v37lZ+fr9LS0mC31WLNni3deWfDZcZIfr/k9Upr10offFC3rKZGOnVKys8PSqst3p131v0+Ps+YujpzRtqxQ3r99brltbV1v59Dh65+n0AweL1eOZ3OLx1zySd9+P1+Pfroo7r55psD4USSHnjgAXXp0kVJSUnauXOnnnjiCeXm5mrJkiWSJI/Hc85Gu/6+x+M572ulp
6drzpw5l9rqVZOUlKS7775b3/ve93TTTTcFux1LiI+P14MPPqgHHnhAO3bs0Pr167V582Zt3bpVx44dU21tbbBbbPFsNikkRIqLk+65R7r77rqNZEWFlJsrffRR3f2qKun4cWn9+mB33HLZbHUVHS3deqt0yy3/nvtPP5VWrqwLkzU1UlGRlJlZtw5ojS45oEybNk27d+/Whg0bGix/+OGHA7f79eunjh07aujQoTp8+LB69OhxSa81c+ZMzZgxI3DfantQwsPDNWrUKD300EO65ZZbvjIVtkYhISEaPHiwBg4cqFOnTungwYPatm2b3n//fa1du1Y1NTXBbrHVqN9IRkZKAwdK119ft7x+j8odd9RtNCsrpSNHpL//vW4dml7976JNG6lHD+nHP65bXr9HZcQIqbq6rj79VFq8uO53BLQGlxRQpk+frmXLlmndunXq3Lnzl44dMmSIpLrDQT169JDL5dKWLVsajCksLJQkuVyu8z6Hw+GQw+G4lFavKJvNps6dO+vpp5/WqFGjFBcXJ7udD+f9MiEhIUpMTFRiYqJuuOEGPfDAA8rLy9PChQv1xhtvyOPx8LksV5nNVvczLEzq2FGq/2dYv5flttukSZOC119rUv+7CA2V4uOlb3yj7n79XpaRI6UHHpDKy4PXI3C1NCqgGGP0yCOPaOnSpVq7dq26dev2lY+pP9ekY8eOkiS3261nnnlGRUVFSkhIkCStXr1aTqdTqampjWw/OEJCQtSuXTt973vf06xZsxQbG0swuQQOh0MJCQnq0KGDBg4cqDlz5mjFihV69dVXlZ2dLZ/Pp4qKCj4t9wqrn976QPLZZ3W3z56V9u2TnnsuuP21Jp//XVRV1f0uamvrbn/yiTR3LuEErUejAsq0adO0cOFCvfPOO2rbtm3gnJGYmBhFRETo8OHDWrhwoUaNGqX4+Hjt3LlTjz32mL7xjW+of//+kqRhw4YpNTVVDz74oJ577jl5PB49+eSTmjZtmiX3knxRXFycvvnNb2r69Om69dZbm+zzTFozm82mkJAQtW3bVt/97nc1btw47d69W++++67Wr1+vgwcP6ujRo8Fus8Wo3whWV0uFhXUbPr+/Lpzs2yctXBjc/lqT+t9FTY1UUiLt3fvvw2vHjkn/+791vxegNWpUQJk3b56kuit1Pm/BggX6wQ9+oPDwcH3wwQf6wx/+oPLyciUnJ2vcuHF68sknA2NDQkK0bNkyTZ06VW63W1FRUZo4caJ+85vfXP67uYLsdrvS0tL0wAMPaNSoUerQoUOwW2qxQkJCNGDAAA0YMECFhYXavn27srKylJWVpW3btqmkpCTYLTYr9VeOlJdLu3ZJOTn/3ggeOiR94YgrrqD630VlZV0wXLfu33tLCgqkjIxgdwhYx2VdZhwsPp9PMTExV+31unfvrhkzZmjkyJHq3r37VXtd/FtNTY2OHTumAwcOaM2aNXr77bd1iGsyzzF7tjRmTN1hgdOn664K2batbl39RvDEieD22Frceac0a1ZdACktlTZulFas+Pcl3qdP14UUoDW6mMuMCShfIioqSuPHj9fPf/5z9ejRQ2FhYbLVn8WGoPD7/Tp79qxKS0u1bt06LViwQBs2bNCZM2c4uVbSX/7y/7Rw4Xzt2bNPtbV1e034qqTgmDhxvOLjw/T3v/9dfn/dOT2cPwLUuaKfg9KStW3bVtddd53mzJmjb3/725JEMLEIu92uqKgoRUZG6t5779V3vvMdHTp0SG+88YbefPNNFRYWqqSkRNXV1cFuNShCQ+P02WfhOnky2J3Abo9UeXm4ioqC3QnQPBFQPsfhcKhv37667777NGXKFEVHRwe7JVxAfWC02Wy65pprNGvWLD3++ONauXKlVq9ere3bt+vAgQMqLi4OcqcAgEtBQPmXjh07atKkSbr33nvVr18/rs5phtq0aaO77rpLI0eO1CeffKLs7Gxt3LhR69at0759+zgEBADNCAFF0oMPPqif/
OQn6t27t9q2bRvsdnCZHA6Hrr32WvXp00d33HGHjh8/ro8//lj/+Mc/tGLFClVw3SYAWF6rDSihoaHq27evnn76ad12220czmmBbDabYmNjFRsbqz59+mjs2LE6efKk/v73v2vRokU6fPiwqqur2bMCABbU6gKK3W5X9+7dNX78eD3yyCNKSEjgBNhWIDQ0VNHR0YqKitKTTz6pGTNmaP369Xrttde0bds2FRQUyOv1BrtNAMC/tKqA0qlTJw0fPlw//OEP5Xa7g90OgqA+jEZFRWnEiBEaMWKEcnNz9cEHHygjI0Pr16/XKb6NDQCCrlUElMjISA0bNkwPPvigRowYocjIyGC3BAvp3bu3evfurfHjx+uDDz7QG2+8oWXLlvENywAQRC3+G+6SkpL0xz/+UXPnztXYsWMJJ7ig9u3b67vf/a5efPFFvfHGG7rxxhuD3RIAtFotMqDY7XZFR0frscce0/79+zVx4kQlJSVx6TC+kt1uV8eOHTV27Fi9//77evHFF3XNNdcoLCws2K0BQKvS4gJKXFycxowZo1WrVun5559X27ZtFRraKo5koQmFhISoffv2mjp1qlatWqUZM2aoT58+Cg8PD3ZrANAqtJgtt81m07e+9S2NHz9ed911F982jCbTtWtXpaen6zvf+Y5ef/11LV++XAcOHODyZAC4glpEQOndu7emTJmiUaNG6Zprrgl2O2iBbDabBg8erOuuu0733HOPli5dqr/85S8qLS0NdmsA0CI164ASFRWl+++/Xz/+8Y913XXXcZ4ArriIiAjdfPPN6tevnyZNmqT09HS99tprwW4LAFqcZn0Oyv/93//pxRdf1PXXX6/w8HA+cA1XjdPpVGpqqubPn6+PPvpIaWlpio6O5m8QAJpIsw4oQ4cOVVhYGBsFBIXNZpPD4dDXv/51LV26VHPnztVtt92m2NjYYLcGAM1esw4ogFVER0fr+9//vhYuXKj/+q//0tChQxUVFRXstgCg2SKgAE2oY8eOmjJliubNm6fnnntOAwYMkN3OPzMAaCz+5wSaWGhoqHr16qXJkydr+fLlSk9PV0xMDIciAaARCCjAFeJwOJSUlKSf//zn2r9/v6ZMmaKEhAQ+0RgALgIBBbjC7Ha7XC6XXnzxRb3zzjt64IEH1LVr12C3BQCW1qw/BwVobm666SYNHDhQH3zwgf7xj39oxYoV8ng8wW4LACyHPSjAVeZwODRq1Cg9//zzmj9/vr7zne+oTZs2wW4LACyFgAIEgc1mU3x8vIYPH66XX35Zixcv1qBBgzg/BQD+hYACBFFISIhiYmI0atQorV+/XnPnztXAgQMVGRkZ7NYAIKgIKIAF2Gw2RURE6Ec/+pGWLFmiJ554QjfeeGOw2wKAoCGgABbTtWtXzZ49W3/5y1/09NNPq1+/fsFuCQCuOq7iASxqwIABSk1N1R133KElS5Zo3rx5OnnyZLDbAoCrgj0ogIWFhYWpf//++tWvfqX169frBz/4gSIiIvj4fAAtXqP+l5s3b5769+8vp9Mpp9Mpt9utFStWBNZXVFRo2rRpio+PV3R0tMaNG6fCwsIGz5GXl6fRo0crMjJSCQkJevzxx1VTU9M07wZogeq/Nfmaa67R/Pnz9c9//lN33nmnEhISCCoAWqxG/e/WuXNnPfvss8rOzta2bdv0rW99S2PHjtWePXskSY899pjee+89vfXWW8rMzNSJEyd0zz33BB5fW1ur0aNHq6qqShs3btSrr76qV155RbNnz27adwW0QDabTXa7XbfccosWLFigP/zhDxo7dqxiY2OD3RoAND1zmdq1a2defvllU1JSYsLCwsxbb70VWLdv3z4jyWRlZRljjFm+fLmx2+3G4/EExsybN884nU5TWVl50a/p9XqNJOP1ei+3faBZO378uHnttdfMqFGjTHh4uPnb3/5mBgwYYCRRQa5JkyaZH/3oR0Hvg6KsWBez/b7k/cO1tbVatGiRysvL5Xa7l
Z2drerqaqWlpQXG9OnTRykpKcrKypIkZWVlqV+/fkpMTAyMGT58uHw+X2AvzPlUVlbK5/M1KABSp06ddN999+nPf/6zFixYoMWLF+vAgQPBbguS3nvvPS1ZsiTYbQDNVqOv4tm1a5fcbrcqKioUHR2tpUuXKjU1VTk5OQoPDz9nd3NiYmLgu0Y8Hk+DcFK/vn7dhaSnp2vOnDmNbRVoFex2uzp37qzx48dr7Nixqq2tDXZL+Jfjx48rKytLGzZs0ObNm3XixAlVV1cHCsCFNTqg9O7dWzk5OfJ6vVq8eLEmTpyozMzMK9FbwMyZMzVjxozAfZ/Pp+Tk5Cv6mkBzExISoqioqGC3gc9JTU1VamqqJk+erJqaGuXl5Sk7O1vbt29Xdna2ioqKVFJSopKSEvl8Phljgt0yYBmNDijh4eHq2bOnJGnQoEHaunWrXnjhBY0fP15VVVUqKSlpsBelsLBQLpdLkuRyubRly5YGz1d/lU/9mPNxOBxyOByNbRUALCM0NFTdu3dX9+7dde+996q2tlb5+fk6cOCADhw4oEOHDunYsWPKz89Xfn6+Tp8+zd4wtGqX/UFtfr9flZWVGjRokMLCwpSRkaFx48ZJknJzc5WXlye32y1JcrvdeuaZZ1RUVKSEhARJ0urVq+V0OpWamnq5rQBAsxESEqKuXbuqa9euGjZsmGpra1VYWCiPxyOPx6P8/Hzt27dPu3fv1t69e+XxeNjDglalUQFl5syZGjlypFJSUlRaWqqFCxdq7dq1WrVqlWJiYjR58mTNmDFDcXFxcjqdeuSRR+R2u3XTTTdJkoYNG6bU1FQ9+OCDeu655+TxePTkk09q2rRp7CEB0KqFhIQoKSlJSUlJkqSamhqdOXNGZWVlKisrU15enrZu3apNmzZp27ZtKigoILCgRWtUQCkqKtL3v/99FRQUKCYmRv3799eqVav07W9/W5L0+9//Xna7XePGjVNlZaWGDx+uF198MfD4kJAQLVu2TFOnTpXb7VZUVJQmTpyo3/zmN037rgCgmQsNDQ18KKYk9erVS9/85jdVW1ur6upqHTt2TBs3blRWVpaysrJ08uRJVVVVqbKyUpWVlUHuHrh8NtMMI7jP51NMTIy8Xm/gHy8AtFa1tbU6fPiwcnJylJOTo507d6qwsFDFxcU6ffq0SktL5ff7g90mEHAx228CCgC0MH6/X0ePHtXhw4d1+PBhffLJJzp69KiOHDmiY8eO6dSpUxweQlARUAAAqqqqUnFxsYqKinTy5Enl5eVp9+7d2rlzp3bt2nXOd6YBV9rFbL8v+yoeAIC1hYeHy+VyBT7OoaamRmfPntXZs2dVXl6uY8eOKSsrS5s2bdKmTZt06tQpSZIxhj0tCBr2oABAK1a/CagPIzU1NTpw4IA2bdoUOAHX6/XqzJkzOnv2rKqqqoLcMVoCDvEAAC6LMUZ79uzRjh07tH37dh04cEAej0eFhYU6deoUVwzhkhBQAABNyufz6ejRozp06JAOHz6sQ4cO6eDBgzp48KCOHz8e7PbQTBBQAABXTG1trUpLS3Xy5EmdOnVKx48f17Zt27Rt2zZlZ2fL6/UGu0VYFAEFAHBV1J/DUlFREfiwuI8//liZmZlat26ddu/erTNnzsjv9/MdQyCgAACC5/Mn4BYXF2v79u1at26dNm3apCNHjqisrEzl5eUqLy8Pcqe42ggoAADLqa2t1YkTJ7Rz585AFRQU6MSJEzpx4gSBpRUgoAAALK+mpkbHjh3TsWPHAifg7tu3T/v379fRo0dVUVER7BbRxPigNgCA5YWGhqpHjx7q0aOHJOnMmTMqLi7WZ599Jo/Ho48//libN2/Wli1blJeXF+RucbWwBwUAYFl+v181NTWqrq5WZWWl8vPzlZmZqczMTG3YsEGlpaWqra1VTU0NX4jYjHCIBwDQYnxxc1VbW6sdO3Zo06ZN2rhxo3Jzc/XZZ5/J6/WqtLRUNTU1QeoUX4WAAgBoNYqLi
7V3717t3r1be/fu1dGjR5Wfn6/8/HydPn062O3hcwgoAIBWqaamJvDhcXl5edq5c6def/11HTx4kC9AtAACCgAAks6ePatjx45p6dKl+t3vfhf4xmYEBwEFAIDPqa6uVllZmV544QX97W9/U2FhId/QHAQEFAAALiAvL09//etf9f777ys3N1dnzpwJdkutBgEFAICvsH//fi1dulSrVq1Sdna2ysrKgt1Si0dAAQDgItTW1urgwYNat26dFi1apI8++ohDP1cQAQUAgEaoqqrSqVOntH79ev3ud7/Tli1bgt1Si3Qx22/7VeoFAADLCw8PV1JSku69915lZGRo0aJFcrvdio6Ols1mC3Z7rQp7UAAA+BKfffaZ3nnnHS1cuFAff/yxioqKgt1Ss8chHgAAmsjp06e1fPlyrVixQhkZGQSVy0BAAQCgCfn9fp08eVLZ2dlavHixXnvtNU6mvQQEFAAArgC/3y+fz6e8vDw988wz+sc//qHa2tpgt9VscJIsAABXgN1uV2xsrPr166eFCxdq27Ztuvvuu5WQkCC7nU1rU2APCgAATWTt2rWaP3++Nm/erE8++YS9KhfAIR4AAK6ympoaZWZmauXKlXrvvfd08OBB+f3+YLdlKU1+iGfevHnq37+/nE6nnE6n3G63VqxYEVh/++23y2azNagpU6Y0eI68vDyNHj1akZGRSkhI0OOPP66amprGtAEAgGWFhoZq6NChmj17thYuXKhf//rX6ty5c7DbanZCGzO4c+fOevbZZ9WrVy8ZY/Tqq69q7Nix2rFjh6677jpJ0kMPPaTf/OY3gcdERkYGbtfW1mr06NFyuVzauHGjCgoK9P3vf19hYWH67//+7yZ6SwAABF/btm31ta99TampqZo4caLmz5+v+fPnq6CgoNXsUbHZbLLb7YGfki76qqfLPsQTFxen559/XpMnT9btt9+u66+/Xn/4wx/OO3bFihW64447dOLECSUmJkqSXnrpJT3xxBM6efKkwsPDL+o1OcQDAGhO6je1x44d01/+8hctWbJEeXl5Onv2bJA7axphYWGKiIhQmzZt1KZNGzkcDoWFhemGG25Q7969A5WYmKgOHTpc2XNQamtr9dZbb2nixInasWOHUlNTdfvtt2vPnj0yxsjlcmnMmDGaNWtWYC/K7Nmz9e677yonJyfwPEeOHFH37t21fft2DRw48LyvVVlZqcrKysB9n8+n5ORkAgoAoFnKzc3V3//+d61evVp79uxpVt+g3K5dO8XHxysuLk6xsbGKjY1Vp06d1KVLF3Xp0kUpKSlKSUlRXFzcOVc0NWYHQ6MO8UjSrl275Ha7VVFRoejoaC1dulSpqamSpAceeEBdunRRUlKSdu7cqSeeeEK5ublasmSJJMnj8QT2nNSrv+/xeC74munp6ZozZ05jWwUAwJJ69+6tp556Svfdd58++OADvfPOO1q/fr2lzsls27atkpOT1alTJ3Xq1ElJSUmKj4+Xy+WSy+VS+/bt1aFDB8XHx1/0EZDGaPQelKqqKuXl5cnr9Wrx4sV6+eWXlZmZGQgpn7dmzRoNHTpUhw4dUo8ePfTwww/r2LFjWrVqVWDMmTNnFBUVpeXLl2vkyJHnfU32oAAAWqrKykp9+umnysrK0p/+9Cdt2rTpqr5+WFiYunTpop49e6pnz57q1auXunXrpg4dOqht27aKiooKVJs2bS7rc16u6B6U8PBw9ezZU5I0aNAgbd26VS+88IL+/Oc/nzN2yJAhkhQIKC6X65yvri4sLJQkuVyuC76mw+GQw+FobKsAAFiew+FQ9+7dlZKSohEjRmj16tV67rnntGfPnsv6GH273R6okJAQ2e12RUVFadCgQbr22mt17bXXKjU1VT169JDD4VBoaKhCQkICP4P9gXONDihf5Pf7G+zd+Lz6c006duwoSXK73XrmmWdUVFSkhIQESdLq1avldDrPuwcGAIDWIjQ0VHFxcRo/frzuuOMOLV68W
C+++KIOHz6s4uLiCz4uJCREERERioyMDPyMjIxU9+7d1bNnT11zzTXq1auXevXqpQ4dOpz3OWw225V6W5esUYd4Zs6cqZEjRyolJUWlpaVauHChfvvb32rVqlXq3r27Fi5cqFGjRik+Pl47d+7UY489ps6dOyszM1NS3Ym1119/vZKSkvTcc8/J4/HowQcf1A9/+MNGXWbMVTwAgNbA5/PpzTff1HvvvacNGzaourpa8fHxgZNU4+Pj1aFDByUnJ6tz587q3Llz4LyR0NBQywWPxmy/GxVQJk+erIyMDBUUFCgmJkb9+/fXE088oW9/+9vKz8/X9773Pe3evVvl5eVKTk7W3XffrSeffLJBE8eOHdPUqVO1du1aRUVFaeLEiXr22WcVGnrxO3MIKACA1sIYo8LCQm3cuFFVVVXnBJSoqKigH465WFcsoFgFAQUAgOanMdvv5hG5AABAq0JAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlkNAAQAAlhMa7AYuhTFGkuTz+YLcCQAAuFj12+367fiXaZYBpbS0VJKUnJwc5E4AAEBjlZaWKiYm5kvH2MzFxBiL8fv9ys3NVWpqqvLz8+V0OoPdUrPl8/mUnJzMPDYB5rLpMJdNg3lsOsxl0zDGqLS0VElJSbLbv/wsk2a5B8Vut6tTp06SJKfTyR9LE2Aemw5z2XSYy6bBPDYd5vLyfdWek3qcJAsAACyHgAIAACyn2QYUh8Ohp556Sg6HI9itNGvMY9NhLpsOc9k0mMemw1xefc3yJFkAANCyNds9KAAAoOUioAAAAMshoAAAAMshoAAAAMtplgFl7ty56tq1q9q0aaMhQ4Zoy5YtwW7JctatW6cxY8YoKSlJNptNb7/9doP1xhjNnj1bHTt2VEREhNLS0nTw4MEGY4qLizVhwgQ5nU7FxsZq8uTJKisru4rvIvjS09N1ww03qG3btkpISNBdd92l3NzcBmMqKio0bdo0xcfHKzo6WuPGjVNhYWGDMXl5eRo9erQiIyOVkJCgxx9/XDU1NVfzrQTVvHnz1L9//8CHXLndbq1YsSKwnjm8dM8++6xsNpseffTRwDLm8+L8+te/ls1ma1B9+vQJrGceg8w0M4sWLTLh4eHmb3/7m9mzZ4956KGHTGxsrCksLAx2a5ayfPly85//+Z9myZIlRpJZunRpg/XPPvusiYmJMW+//bb5+OOPzZ133mm6detmzp49GxgzYsQIM2DAALNp0yazfv1607NnT3P//fdf5XcSXMOHDzcLFiwwu3fvNjk5OWbUqFEmJSXFlJWVBcZMmTLFJCcnm4yMDLNt2zZz0003ma9//euB9TU1NaZv374mLS3N7Nixwyxfvty0b9/ezJw5MxhvKSjeffdd8/7775sDBw6Y3Nxc86tf/cqEhYWZ3bt3G2OYw0u1ZcsW07VrV9O/f3/z05/+NLCc+bw4Tz31lLnuuutMQUFBoE6ePBlYzzwGV7MLKDfeeKOZNm1a4H5tba1JSkoy6enpQezK2r4YUPx+v3G5XOb5558PLCspKTEOh8O8/vrrxhhj9u7daySZrVu3BsasWLHC2Gw28+mnn1613q2mqKjISDKZmZnGmLp5CwsLM2+99VZgzL59+4wkk5WVZYypC4t2u914PJ7AmHnz5hmn02kqKyuv7huwkHbt2pmXX36ZObxEpaWlplevXmb16tXmtttuCwQU5vPiPfXUU2bAgAHnXcc8Bl+zOsRTVVWl7OxspaWlBZbZ7XalpaUpKysriJ01L0eOHJHH42kwj
zExMRoyZEhgHrOyshQbG6vBgwcHxqSlpclut2vz5s1XvWer8Hq9kqS4uDhJUnZ2tqqrqxvMZZ8+fZSSktJgLvv166fExMTAmOHDh8vn82nPnj1XsXtrqK2t1aJFi1ReXi63280cXqJp06Zp9OjRDeZN4m+ysQ4ePKikpCR1795dEyZMUF5eniTm0Qqa1ZcFnjp1SrW1tQ3+GCQpMTFR+/fvD1JXzY/H45Gk885j/TqPx6OEhIQG60NDQxUXFxcY09r4/X49+uijuvnmm9W3b19JdfMUHh6u2NjYBmO/OJfnm+v6da3Frl275Ha7VVFRoejoaC1dulSpqanKyclhDhtp0aJF2r59u7Zu3XrOOv4mL96QIUP0yiuvqHfv3iooKNCcOXN06623avfu3cyjBTSrgAIE07Rp07R7925t2LAh2K00S71791ZOTo68Xq8WL16siRMnKjMzM9htNTv5+fn66U9/qtWrV6tNmzbBbqdZGzlyZOB2//79NWTIEHXp0kVvvvmmIiIigtgZpGZ2FU/79u0VEhJyzlnUhYWFcrlcQeqq+amfqy+bR5fLpaKiogbra2pqVFxc3Crnevr06Vq2bJk+/PBDde7cObDc5XKpqqpKJSUlDcZ/cS7PN9f161qL8PBw9ezZU4MGDVJ6eroGDBigF154gTlspOzsbBUVFelrX/uaQkNDFRoaqszMTP3xj39UaGioEhMTmc9LFBsbq2uuuUaHDh3i79ICmlVACQ8P16BBg5SRkRFY5vf7lZGRIbfbHcTOmpdu3brJ5XI1mEefz6fNmzcH5tHtdqukpETZ2dmBMWvWrJHf79eQIUOues/BYozR9OnTtXTpUq1Zs0bdunVrsH7QoEEKCwtrMJe5ubnKy8trMJe7du1qEPhWr14tp9Op1NTUq/NGLMjv96uyspI5bKShQ4dq165dysnJCdTgwYM1YcKEwG3m89KUlZXp8OHD6tixI3+XVhDss3Qba9GiRcbhcJhXXnnF7N271zz88MMmNja2wVnUqDvDf8eOHWbHjh1Gkvnd735nduzYYY4dO2aMqbvMODY21rzzzjtm586dZuzYsee9zHjgwIFm8+bNZsOGDaZXr16t7jLjqVOnmpiYGLN27doGlyKeOXMmMGbKlCkmJSXFrFmzxmzbts243W7jdrsD6+svRRw2bJjJyckxK1euNB06dGhVlyL+8pe/NJmZmebIkSNm586d5pe//KWx2Wzmn//8pzGGObxcn7+Kxxjm82L97Gc/M2vXrjVHjhwxH330kUlLSzPt27c3RUVFxhjmMdiaXUAxxpj/+Z//MSkpKSY8PNzceOONZtOmTcFuyXI+/PBDI+mcmjhxojGm7lLjWbNmmcTERONwOMzQoUNNbm5ug+c4ffq0uf/++010dLRxOp1m0qRJprS0NAjvJnjON4eSzIIFCwJjzp49a3784x+bdu3amcjISHP33XebgoKCBs9z9OhRM3LkSBMREWHat29vfvazn5nq6uqr/G6C5z/+4z9Mly5dTHh4uOnQoYMZOnRoIJwYwxxeri8GFObz4owfP9507NjRhIeHm06dOpnx48ebQ4cOBdYzj8FlM8aY4Oy7AQAAOL9mdQ4KAABoHQgoAADAcggoAADAcggoAADAcggoAADAcggoAADAcggoAADAcggoAADAcggoAADAcggoAADAcggoAADAcggoAADAcv4/j89a/ZauyUoAAAAASUVORK5CYII=\n" }, "metadata": {} } ] }, { "cell_type": "markdown", "metadata": { "id": "XIrKGGSlENZB" }, "source": [ "## Create the LunarLander environment ๐ŸŒ› and understand how it works\n", "\n", "### [The environment 
🎮](https://gymnasium.farama.org/environments/box2d/lunar_lander/)\n", "\n", "In this first tutorial, we're going to train our agent, a [Lunar Lander](https://gymnasium.farama.org/environments/box2d/lunar_lander/), **to land correctly on the moon**. To do that, the agent needs to learn **to adapt its speed and position (horizontal, vertical, and angular) to land correctly.**\n", "\n", "---\n", "\n", "\n", "💡 A good habit when you start to use an environment is to check its documentation.\n", "\n", "👉 https://gymnasium.farama.org/environments/box2d/lunar_lander/\n", "\n", "---\n" ] }, { "cell_type": "markdown", "metadata": { "id": "poLBgRocF9aT" }, "source": [ "Let's see what the environment looks like:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZNPG0g_UGCfh", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "c6435444-b016-4e86-80fe-b0c980d673e8" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "_____OBSERVATION SPACE_____ \n", "\n", "Observation Space Shape (8,)\n", "Sample observation [83.86315 41.9888 4.1694527 -0.48541385 -1.7971293 4.2255435\n", " 0.27640635 0.5078781 ]\n" ] } ], "source": [ "# Create the environment by passing its id to gym.make()\n", "env = gym.make(\"LunarLander-v2\")\n", "env.reset()\n", "print(\"_____OBSERVATION SPACE_____ \\n\")\n", "print(\"Observation Space Shape\", env.observation_space.shape)\n", "print(\"Sample observation\", env.observation_space.sample()) # Sample a random point from the observation space" ] }, { "cell_type": "markdown", "metadata": { "id": "2MXc15qFE0M9" }, "source": [ "We see with `Observation Space Shape (8,)` that the observation is a vector of size 8, where each value contains different information about the lander:\n", "- Horizontal coordinate of the lander relative to the landing pad (x)\n", "- Vertical coordinate of the lander relative to the landing pad (y)\n", "- Horizontal speed (x)\n", "- Vertical speed (y)\n", "- Angle\n", "- Angular speed\n", "- If the left leg contact point has touched the land (boolean)\n", "- If the right leg 
contact point has touched the land (boolean)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "We5WqOBGLoSm", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "02bfb6d5-6e6f-48ba-9a45-e88b082a3954" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\n", " _____ACTION SPACE_____ \n", "\n", "Action Space Shape 4\n", "Action Space Sample 2\n" ] } ], "source": [ "print(\"\\n _____ACTION SPACE_____ \\n\")\n", "print(\"Action Space Shape\", env.action_space.n)\n", "print(\"Action Space Sample\", env.action_space.sample()) # Take a random action" ] }, { "cell_type": "markdown", "metadata": { "id": "MyxXwkI2Magx" }, "source": [ "The action space (the set of possible actions the agent can take) is discrete with 4 actions available 🎮:\n", "\n", "- Action 0: Do nothing,\n", "- Action 1: Fire left orientation engine,\n", "- Action 2: Fire the main engine,\n", "- Action 3: Fire right orientation engine.\n", "\n", "Reward function (the function that will give a reward at each timestep) 💰:\n", "\n", "After every step a reward is granted. 
The total reward of an episode is the **sum of the rewards for all the steps within that episode**.\n", "\n", "For each step, the reward:\n", "\n", "- Is increased/decreased the closer/further the lander is to the landing pad.\n", "- Is increased/decreased the slower/faster the lander is moving.\n", "- Is decreased the more the lander is tilted (angle not horizontal).\n", "- Is increased by 10 points for each leg that is in contact with the ground.\n", "- Is decreased by 0.03 points each frame a side engine is firing.\n", "- Is decreased by 0.3 points each frame the main engine is firing.\n", "\n", "The episode receives an **additional reward of -100 or +100 points for crashing or landing safely, respectively.**\n", "\n", "An episode is **considered a solution if it scores at least 200 points.**" ] }, { "cell_type": "markdown", "metadata": { "id": "dFD9RAFjG8aq" }, "source": [ "#### Vectorized Environment\n", "\n", "- We create a vectorized environment (a method for stacking multiple independent environments into a single environment) of 16 environments. This way, **we'll have more diverse experiences during training.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "99hqQ_etEy1N" }, "outputs": [], "source": [ "# Create the environment\n", "env = make_vec_env('LunarLander-v2', n_envs=16)" ] }, { "cell_type": "markdown", "metadata": { "id": "VgrE86r5E5IK" }, "source": [ "## Create the Model 🤖\n", "- We have studied our environment and understand the problem: **being able to land the Lunar Lander on the landing pad correctly by controlling the left and right orientation engines and the main engine**. 
Now let's build the algorithm we're going to use to solve this problem 🚀.\n", "\n", "- To do so, we're going to use our first Deep RL library, [Stable Baselines3 (SB3)](https://stable-baselines3.readthedocs.io/en/master/).\n", "\n", "- SB3 is a set of **reliable implementations of reinforcement learning algorithms in PyTorch**.\n", "\n", "---\n", "\n", "💡 A good habit when using a new library is to dive into the documentation first: https://stable-baselines3.readthedocs.io/en/master/ and then try some tutorials.\n", "\n", "----" ] }, { "cell_type": "markdown", "source": [], "metadata": { "id": "HLlClRW37Q7e" } }, { "cell_type": "markdown", "metadata": { "id": "HV4yiUM_9_Ka" }, "source": [ "To solve this problem, we're going to use SB3 **PPO**. [PPO (aka Proximal Policy Optimization) is one of the SOTA (state of the art) Deep Reinforcement Learning algorithms that you'll study during this course](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html#example%5D).\n", "\n", "PPO is a combination of:\n", "- *Value-based reinforcement learning method*: learning an action-value function that will tell us the **most valuable action to take in a given state**.\n", "- *Policy-based reinforcement learning method*: learning a policy that will **give us a probability distribution over actions**." 
] }, { "cell_type": "markdown", "metadata": { "id": "5qL_4HeIOrEJ" }, "source": [ "Stable-Baselines3 is easy to set up:\n", "\n", "1️⃣ You **create your environment** (in our case it was done above)\n", "\n", "2️⃣ You define the **model you want to use and instantiate this model** `model = PPO(\"MlpPolicy\", env)`\n", "\n", "3️⃣ You **train the agent** with `model.learn` and define the number of training timesteps\n", "\n", "```\n", "# Create environment\n", "env = gym.make('LunarLander-v2')\n", "\n", "# Instantiate the agent\n", "model = PPO('MlpPolicy', env, verbose=1)\n", "# Train the agent\n", "model.learn(total_timesteps=int(2e5))\n", "```\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "nxI6hT1GE4-A", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "41425b77-2159-4a31-b6c9-115bc663bcef" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\u001b[1;30;43mThe last 5000 lines of the streaming output were truncated.\u001b[0m\n", "| value_loss | 53.9 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 499 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 496 |\n", "| time_elapsed | 2378 |\n", "| total_timesteps | 1015808 |\n", "| train/ | |\n", "| approx_kl | 0.008301543 |\n", "| clip_fraction | 0.112 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.506 |\n", "| explained_variance | 0.848 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.74 |\n", "| n_updates | 4950 |\n", "| policy_gradient_loss | 0.0021 |\n", "| value_loss | 31.9 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 494 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 497 |\n", "| time_elapsed | 2383 |\n", "| total_timesteps | 1017856 |\n", "| train/ | |\n", "| approx_kl | 
0.0056571467 |\n", "| clip_fraction | 0.0535 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.427 |\n", "| explained_variance | 0.776 |\n", "| learning_rate | 0.0003 |\n", "| loss | 24.4 |\n", "| n_updates | 4960 |\n", "| policy_gradient_loss | -0.00165 |\n", "| value_loss | 57.6 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 492 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 498 |\n", "| time_elapsed | 2388 |\n", "| total_timesteps | 1019904 |\n", "| train/ | |\n", "| approx_kl | 0.006539243 |\n", "| clip_fraction | 0.0342 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.4 |\n", "| explained_variance | 0.602 |\n", "| learning_rate | 0.0003 |\n", "| loss | 45.7 |\n", "| n_updates | 4970 |\n", "| policy_gradient_loss | -0.00162 |\n", "| value_loss | 165 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 488 |\n", "| ep_rew_mean | 146 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 499 |\n", "| time_elapsed | 2392 |\n", "| total_timesteps | 1021952 |\n", "| train/ | |\n", "| approx_kl | 0.004461292 |\n", "| clip_fraction | 0.0331 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.407 |\n", "| explained_variance | 0.826 |\n", "| learning_rate | 0.0003 |\n", "| loss | 20.1 |\n", "| n_updates | 4980 |\n", "| policy_gradient_loss | -0.00213 |\n", "| value_loss | 46.3 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 487 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 500 |\n", "| time_elapsed | 2397 |\n", "| total_timesteps | 1024000 |\n", "| train/ | |\n", "| approx_kl | 0.0037206972 |\n", "| clip_fraction | 0.0657 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.449 |\n", "| explained_variance | 0.875 |\n", "| 
learning_rate | 0.0003 |\n", "| loss | 7.22 |\n", "| n_updates | 4990 |\n", "| policy_gradient_loss | -0.000565 |\n", "| value_loss | 50 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 481 |\n", "| ep_rew_mean | 140 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 501 |\n", "| time_elapsed | 2402 |\n", "| total_timesteps | 1026048 |\n", "| train/ | |\n", "| approx_kl | 0.005997791 |\n", "| clip_fraction | 0.0662 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.454 |\n", "| explained_variance | 0.778 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10.7 |\n", "| n_updates | 5000 |\n", "| policy_gradient_loss | -0.00115 |\n", "| value_loss | 131 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 142 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 502 |\n", "| time_elapsed | 2406 |\n", "| total_timesteps | 1028096 |\n", "| train/ | |\n", "| approx_kl | 0.008173959 |\n", "| clip_fraction | 0.0615 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.465 |\n", "| explained_variance | 0.842 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.7 |\n", "| n_updates | 5010 |\n", "| policy_gradient_loss | -0.00238 |\n", "| value_loss | 75 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 482 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 503 |\n", "| time_elapsed | 2411 |\n", "| total_timesteps | 1030144 |\n", "| train/ | |\n", "| approx_kl | 0.012109974 |\n", "| clip_fraction | 0.0654 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.391 |\n", "| explained_variance | 0.625 |\n", "| learning_rate | 0.0003 |\n", "| loss | 50.3 |\n", "| n_updates | 5020 |\n", "| policy_gradient_loss | -0.00338 |\n", "| value_loss | 74.2 |\n", 
"-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 485 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 504 |\n", "| time_elapsed | 2416 |\n", "| total_timesteps | 1032192 |\n", "| train/ | |\n", "| approx_kl | 0.0056428784 |\n", "| clip_fraction | 0.0532 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.425 |\n", "| explained_variance | 0.864 |\n", "| learning_rate | 0.0003 |\n", "| loss | 26.2 |\n", "| n_updates | 5030 |\n", "| policy_gradient_loss | 0.000837 |\n", "| value_loss | 88.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 505 |\n", "| time_elapsed | 2420 |\n", "| total_timesteps | 1034240 |\n", "| train/ | |\n", "| approx_kl | 0.013307229 |\n", "| clip_fraction | 0.0717 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.454 |\n", "| explained_variance | 0.897 |\n", "| learning_rate | 0.0003 |\n", "| loss | 87.4 |\n", "| n_updates | 5040 |\n", "| policy_gradient_loss | 0.00081 |\n", "| value_loss | 59 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 484 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 506 |\n", "| time_elapsed | 2425 |\n", "| total_timesteps | 1036288 |\n", "| train/ | |\n", "| approx_kl | 0.0047076233 |\n", "| clip_fraction | 0.0721 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.415 |\n", "| explained_variance | 0.951 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.84 |\n", "| n_updates | 5050 |\n", "| policy_gradient_loss | 0.000487 |\n", "| value_loss | 16.4 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 478 |\n", "| 
ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 507 |\n", "| time_elapsed | 2430 |\n", "| total_timesteps | 1038336 |\n", "| train/ | |\n", "| approx_kl | 0.004532858 |\n", "| clip_fraction | 0.0487 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.495 |\n", "| explained_variance | 0.738 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.91 |\n", "| n_updates | 5060 |\n", "| policy_gradient_loss | -0.000681 |\n", "| value_loss | 63.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 474 |\n", "| ep_rew_mean | 130 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 508 |\n", "| time_elapsed | 2435 |\n", "| total_timesteps | 1040384 |\n", "| train/ | |\n", "| approx_kl | 0.005483578 |\n", "| clip_fraction | 0.0621 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.461 |\n", "| explained_variance | 0.917 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.2 |\n", "| n_updates | 5070 |\n", "| policy_gradient_loss | -0.00195 |\n", "| value_loss | 25.9 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 470 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 509 |\n", "| time_elapsed | 2440 |\n", "| total_timesteps | 1042432 |\n", "| train/ | |\n", "| approx_kl | 0.0054120137 |\n", "| clip_fraction | 0.0611 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.442 |\n", "| explained_variance | 0.859 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.1 |\n", "| n_updates | 5080 |\n", "| policy_gradient_loss | -0.000845 |\n", "| value_loss | 62.5 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 474 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 510 |\n", "| time_elapsed | 2444 |\n", "| total_timesteps | 1044480 
|\n", "| train/ | |\n", "| approx_kl | 0.011429458 |\n", "| clip_fraction | 0.0962 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.463 |\n", "| explained_variance | 0.675 |\n", "| learning_rate | 0.0003 |\n", "| loss | 23.8 |\n", "| n_updates | 5090 |\n", "| policy_gradient_loss | -0.00188 |\n", "| value_loss | 90.2 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 467 |\n", "| ep_rew_mean | 129 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 511 |\n", "| time_elapsed | 2449 |\n", "| total_timesteps | 1046528 |\n", "| train/ | |\n", "| approx_kl | 0.00515816 |\n", "| clip_fraction | 0.0602 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.482 |\n", "| explained_variance | 0.922 |\n", "| learning_rate | 0.0003 |\n", "| loss | 3.13 |\n", "| n_updates | 5100 |\n", "| policy_gradient_loss | 0.000621 |\n", "| value_loss | 19.4 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 467 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 512 |\n", "| time_elapsed | 2454 |\n", "| total_timesteps | 1048576 |\n", "| train/ | |\n", "| approx_kl | 0.0064999023 |\n", "| clip_fraction | 0.0872 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.472 |\n", "| explained_variance | 0.883 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.26 |\n", "| n_updates | 5110 |\n", "| policy_gradient_loss | -0.00259 |\n", "| value_loss | 26.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 460 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 513 |\n", "| time_elapsed | 2458 |\n", "| total_timesteps | 1050624 |\n", "| train/ | |\n", "| approx_kl | 0.010653468 |\n", "| clip_fraction | 0.0903 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.572 |\n", "| 
explained_variance | 0.788 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.76 |\n", "| n_updates | 5120 |\n", "| policy_gradient_loss | 0.000836 |\n", "| value_loss | 26.3 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 445 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 514 |\n", "| time_elapsed | 2463 |\n", "| total_timesteps | 1052672 |\n", "| train/ | |\n", "| approx_kl | 0.005393303 |\n", "| clip_fraction | 0.0725 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.463 |\n", "| explained_variance | 0.701 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.38 |\n", "| n_updates | 5130 |\n", "| policy_gradient_loss | -0.00304 |\n", "| value_loss | 108 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 454 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 515 |\n", "| time_elapsed | 2468 |\n", "| total_timesteps | 1054720 |\n", "| train/ | |\n", "| approx_kl | 0.0052941935 |\n", "| clip_fraction | 0.0557 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.445 |\n", "| explained_variance | 0.868 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.8 |\n", "| n_updates | 5140 |\n", "| policy_gradient_loss | -0.0012 |\n", "| value_loss | 57.8 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 455 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 516 |\n", "| time_elapsed | 2472 |\n", "| total_timesteps | 1056768 |\n", "| train/ | |\n", "| approx_kl | 0.0060110656 |\n", "| clip_fraction | 0.0501 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.458 |\n", "| explained_variance | 0.526 |\n", "| learning_rate | 0.0003 |\n", "| loss | 46.3 |\n", "| n_updates | 5150 |\n", "| policy_gradient_loss | 
-0.000451 |\n", "| value_loss | 105 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 458 |\n", "| ep_rew_mean | 130 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 517 |\n", "| time_elapsed | 2478 |\n", "| total_timesteps | 1058816 |\n", "| train/ | |\n", "| approx_kl | 0.006670952 |\n", "| clip_fraction | 0.076 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.434 |\n", "| explained_variance | 0.808 |\n", "| learning_rate | 0.0003 |\n", "| loss | 39.7 |\n", "| n_updates | 5160 |\n", "| policy_gradient_loss | -0.00273 |\n", "| value_loss | 71.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 457 |\n", "| ep_rew_mean | 125 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 518 |\n", "| time_elapsed | 2482 |\n", "| total_timesteps | 1060864 |\n", "| train/ | |\n", "| approx_kl | 0.0070992587 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.501 |\n", "| explained_variance | 0.877 |\n", "| learning_rate | 0.0003 |\n", "| loss | 37.8 |\n", "| n_updates | 5170 |\n", "| policy_gradient_loss | -0.00143 |\n", "| value_loss | 17.9 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 463 |\n", "| ep_rew_mean | 125 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 519 |\n", "| time_elapsed | 2487 |\n", "| total_timesteps | 1062912 |\n", "| train/ | |\n", "| approx_kl | 0.0024157302 |\n", "| clip_fraction | 0.0399 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.422 |\n", "| explained_variance | 0.75 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.1 |\n", "| n_updates | 5180 |\n", "| policy_gradient_loss | -0.00116 |\n", "| value_loss | 96.2 |\n", "------------------------------------------\n", "------------------------------------------\n", "| 
rollout/ | |\n", "| ep_len_mean | 463 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 520 |\n", "| time_elapsed | 2492 |\n", "| total_timesteps | 1064960 |\n", "| train/ | |\n", "| approx_kl | 0.0037717647 |\n", "| clip_fraction | 0.0534 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.512 |\n", "| explained_variance | 0.653 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10.9 |\n", "| n_updates | 5190 |\n", "| policy_gradient_loss | 0.00118 |\n", "| value_loss | 40.3 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 467 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 521 |\n", "| time_elapsed | 2496 |\n", "| total_timesteps | 1067008 |\n", "| train/ | |\n", "| approx_kl | 0.005939821 |\n", "| clip_fraction | 0.0647 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.425 |\n", "| explained_variance | 0.802 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.6 |\n", "| n_updates | 5200 |\n", "| policy_gradient_loss | -0.000256 |\n", "| value_loss | 50.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 467 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 522 |\n", "| time_elapsed | 2501 |\n", "| total_timesteps | 1069056 |\n", "| train/ | |\n", "| approx_kl | 0.005414643 |\n", "| clip_fraction | 0.059 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.443 |\n", "| explained_variance | 0.874 |\n", "| learning_rate | 0.0003 |\n", "| loss | 26.4 |\n", "| n_updates | 5210 |\n", "| policy_gradient_loss | -0.000349 |\n", "| value_loss | 44.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 471 |\n", "| ep_rew_mean | 136 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 523 |\n", "| 
time_elapsed | 2506 |\n", "| total_timesteps | 1071104 |\n", "| train/ | |\n", "| approx_kl | 0.004514157 |\n", "| clip_fraction | 0.0214 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.454 |\n", "| explained_variance | 0.739 |\n", "| learning_rate | 0.0003 |\n", "| loss | 56.8 |\n", "| n_updates | 5220 |\n", "| policy_gradient_loss | -0.000676 |\n", "| value_loss | 93.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 472 |\n", "| ep_rew_mean | 143 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 524 |\n", "| time_elapsed | 2510 |\n", "| total_timesteps | 1073152 |\n", "| train/ | |\n", "| approx_kl | 0.0045242887 |\n", "| clip_fraction | 0.0487 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.426 |\n", "| explained_variance | 0.882 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.81 |\n", "| n_updates | 5230 |\n", "| policy_gradient_loss | -0.000772 |\n", "| value_loss | 26.8 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 469 |\n", "| ep_rew_mean | 143 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 525 |\n", "| time_elapsed | 2515 |\n", "| total_timesteps | 1075200 |\n", "| train/ | |\n", "| approx_kl | 0.002891739 |\n", "| clip_fraction | 0.0568 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.471 |\n", "| explained_variance | 0.762 |\n", "| learning_rate | 0.0003 |\n", "| loss | 37.4 |\n", "| n_updates | 5240 |\n", "| policy_gradient_loss | -0.000138 |\n", "| value_loss | 54.5 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 475 |\n", "| ep_rew_mean | 150 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 526 |\n", "| time_elapsed | 2520 |\n", "| total_timesteps | 1077248 |\n", "| train/ | |\n", "| approx_kl | 0.0076642334 |\n", "| clip_fraction | 0.0672 
|\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.458 |\n", "| explained_variance | 0.831 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.35 |\n", "| n_updates | 5250 |\n", "| policy_gradient_loss | -0.00156 |\n", "| value_loss | 33.3 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 471 |\n", "| ep_rew_mean | 156 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 527 |\n", "| time_elapsed | 2524 |\n", "| total_timesteps | 1079296 |\n", "| train/ | |\n", "| approx_kl | 0.004282906 |\n", "| clip_fraction | 0.0358 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.42 |\n", "| explained_variance | 0.621 |\n", "| learning_rate | 0.0003 |\n", "| loss | 36.1 |\n", "| n_updates | 5260 |\n", "| policy_gradient_loss | -0.00418 |\n", "| value_loss | 101 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 464 |\n", "| ep_rew_mean | 157 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 528 |\n", "| time_elapsed | 2529 |\n", "| total_timesteps | 1081344 |\n", "| train/ | |\n", "| approx_kl | 0.0035337412 |\n", "| clip_fraction | 0.0499 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.428 |\n", "| explained_variance | 0.657 |\n", "| learning_rate | 0.0003 |\n", "| loss | 111 |\n", "| n_updates | 5270 |\n", "| policy_gradient_loss | -0.00215 |\n", "| value_loss | 130 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 457 |\n", "| ep_rew_mean | 157 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 529 |\n", "| time_elapsed | 2534 |\n", "| total_timesteps | 1083392 |\n", "| train/ | |\n", "| approx_kl | 0.0052973675 |\n", "| clip_fraction | 0.0561 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.443 |\n", "| explained_variance | 0.672 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.77 
|\n", "| n_updates | 5280 |\n", "| policy_gradient_loss | -0.00231 |\n", "| value_loss | 102 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 457 |\n", "| ep_rew_mean | 160 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 530 |\n", "| time_elapsed | 2538 |\n", "| total_timesteps | 1085440 |\n", "| train/ | |\n", "| approx_kl | 0.0063736076 |\n", "| clip_fraction | 0.0683 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.464 |\n", "| explained_variance | 0.776 |\n", "| learning_rate | 0.0003 |\n", "| loss | 24.5 |\n", "| n_updates | 5290 |\n", "| policy_gradient_loss | -0.000645 |\n", "| value_loss | 74.6 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 454 |\n", "| ep_rew_mean | 165 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 531 |\n", "| time_elapsed | 2543 |\n", "| total_timesteps | 1087488 |\n", "| train/ | |\n", "| approx_kl | 0.007256862 |\n", "| clip_fraction | 0.0622 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.465 |\n", "| explained_variance | 0.796 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.56 |\n", "| n_updates | 5300 |\n", "| policy_gradient_loss | -0.000339 |\n", "| value_loss | 38.4 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 448 |\n", "| ep_rew_mean | 165 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 532 |\n", "| time_elapsed | 2548 |\n", "| total_timesteps | 1089536 |\n", "| train/ | |\n", "| approx_kl | 0.0073278416 |\n", "| clip_fraction | 0.0629 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.454 |\n", "| explained_variance | 0.885 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.67 |\n", "| n_updates | 5310 |\n", "| policy_gradient_loss | -0.000978 |\n", "| value_loss | 29.3 |\n", 
"------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 448 |\n", "| ep_rew_mean | 161 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 533 |\n", "| time_elapsed | 2552 |\n", "| total_timesteps | 1091584 |\n", "| train/ | |\n", "| approx_kl | 0.005328345 |\n", "| clip_fraction | 0.0432 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.458 |\n", "| explained_variance | 0.674 |\n", "| learning_rate | 0.0003 |\n", "| loss | 17.5 |\n", "| n_updates | 5320 |\n", "| policy_gradient_loss | -0.00175 |\n", "| value_loss | 108 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 445 |\n", "| ep_rew_mean | 167 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 534 |\n", "| time_elapsed | 2557 |\n", "| total_timesteps | 1093632 |\n", "| train/ | |\n", "| approx_kl | 0.00842984 |\n", "| clip_fraction | 0.089 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.424 |\n", "| explained_variance | 0.648 |\n", "| learning_rate | 0.0003 |\n", "| loss | 42.2 |\n", "| n_updates | 5330 |\n", "| policy_gradient_loss | 0.00179 |\n", "| value_loss | 114 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 437 |\n", "| ep_rew_mean | 169 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 535 |\n", "| time_elapsed | 2562 |\n", "| total_timesteps | 1095680 |\n", "| train/ | |\n", "| approx_kl | 0.010399334 |\n", "| clip_fraction | 0.0453 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.452 |\n", "| explained_variance | 0.685 |\n", "| learning_rate | 0.0003 |\n", "| loss | 47.8 |\n", "| n_updates | 5340 |\n", "| policy_gradient_loss | -0.00378 |\n", "| value_loss | 76.3 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 441 |\n", "| 
ep_rew_mean | 171 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 536 |\n", "| time_elapsed | 2567 |\n", "| total_timesteps | 1097728 |\n", "| train/ | |\n", "| approx_kl | 0.008099465 |\n", "| clip_fraction | 0.081 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.47 |\n", "| explained_variance | 0.829 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10.7 |\n", "| n_updates | 5350 |\n", "| policy_gradient_loss | 0.00151 |\n", "| value_loss | 27.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 444 |\n", "| ep_rew_mean | 177 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 537 |\n", "| time_elapsed | 2571 |\n", "| total_timesteps | 1099776 |\n", "| train/ | |\n", "| approx_kl | 0.012389656 |\n", "| clip_fraction | 0.0731 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.441 |\n", "| explained_variance | 0.909 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10.4 |\n", "| n_updates | 5360 |\n", "| policy_gradient_loss | -0.00315 |\n", "| value_loss | 23.5 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 445 |\n", "| ep_rew_mean | 170 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 538 |\n", "| time_elapsed | 2576 |\n", "| total_timesteps | 1101824 |\n", "| train/ | |\n", "| approx_kl | 0.0068274755 |\n", "| clip_fraction | 0.0667 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.503 |\n", "| explained_variance | 0.756 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.19 |\n", "| n_updates | 5370 |\n", "| policy_gradient_loss | 4.59e-05 |\n", "| value_loss | 24.4 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 444 |\n", "| ep_rew_mean | 166 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 539 |\n", "| time_elapsed | 2581 |\n", "| total_timesteps | 1103872 
|\n", "| train/ | |\n", "| approx_kl | 0.005098396 |\n", "| clip_fraction | 0.0624 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.536 |\n", "| explained_variance | 0.704 |\n", "| learning_rate | 0.0003 |\n", "| loss | 28.6 |\n", "| n_updates | 5380 |\n", "| policy_gradient_loss | -0.00052 |\n", "| value_loss | 90.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 440 |\n", "| ep_rew_mean | 169 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 540 |\n", "| time_elapsed | 2585 |\n", "| total_timesteps | 1105920 |\n", "| train/ | |\n", "| approx_kl | 0.0050902776 |\n", "| clip_fraction | 0.0606 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.411 |\n", "| explained_variance | 0.633 |\n", "| learning_rate | 0.0003 |\n", "| loss | 23.5 |\n", "| n_updates | 5390 |\n", "| policy_gradient_loss | -0.0017 |\n", "| value_loss | 64 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 445 |\n", "| ep_rew_mean | 167 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 541 |\n", "| time_elapsed | 2590 |\n", "| total_timesteps | 1107968 |\n", "| train/ | |\n", "| approx_kl | 0.009280775 |\n", "| clip_fraction | 0.0801 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.491 |\n", "| explained_variance | 0.759 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.83 |\n", "| n_updates | 5400 |\n", "| policy_gradient_loss | -0.00237 |\n", "| value_loss | 67.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 442 |\n", "| ep_rew_mean | 165 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 542 |\n", "| time_elapsed | 2595 |\n", "| total_timesteps | 1110016 |\n", "| train/ | |\n", "| approx_kl | 0.0029693714 |\n", "| clip_fraction | 0.0259 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.366 |\n", 
"| explained_variance | 0.757 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.02 |\n", "| n_updates | 5410 |\n", "| policy_gradient_loss | -0.00102 |\n", "| value_loss | 56.7 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 445 |\n", "| ep_rew_mean | 164 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 543 |\n", "| time_elapsed | 2599 |\n", "| total_timesteps | 1112064 |\n", "| train/ | |\n", "| approx_kl | 0.00690945 |\n", "| clip_fraction | 0.0866 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.543 |\n", "| explained_variance | 0.906 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.32 |\n", "| n_updates | 5420 |\n", "| policy_gradient_loss | 0.000484 |\n", "| value_loss | 24.7 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 452 |\n", "| ep_rew_mean | 162 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 544 |\n", "| time_elapsed | 2604 |\n", "| total_timesteps | 1114112 |\n", "| train/ | |\n", "| approx_kl | 0.0036692936 |\n", "| clip_fraction | 0.0421 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.494 |\n", "| explained_variance | 0.807 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.58 |\n", "| n_updates | 5430 |\n", "| policy_gradient_loss | 0.000265 |\n", "| value_loss | 50 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 464 |\n", "| ep_rew_mean | 159 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 545 |\n", "| time_elapsed | 2609 |\n", "| total_timesteps | 1116160 |\n", "| train/ | |\n", "| approx_kl | 0.0068115145 |\n", "| clip_fraction | 0.0474 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.536 |\n", "| explained_variance | 0.846 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.18 |\n", "| n_updates | 5440 |\n", "| policy_gradient_loss | 
0.000361 |\n", "| value_loss | 17.9 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 464 |\n", "| ep_rew_mean | 159 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 546 |\n", "| time_elapsed | 2614 |\n", "| total_timesteps | 1118208 |\n", "| train/ | |\n", "| approx_kl | 0.011959298 |\n", "| clip_fraction | 0.0914 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.538 |\n", "| explained_variance | 0.77 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.01 |\n", "| n_updates | 5450 |\n", "| policy_gradient_loss | 0.000449 |\n", "| value_loss | 46 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 466 |\n", "| ep_rew_mean | 156 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 547 |\n", "| time_elapsed | 2619 |\n", "| total_timesteps | 1120256 |\n", "| train/ | |\n", "| approx_kl | 0.004132043 |\n", "| clip_fraction | 0.0497 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.428 |\n", "| explained_variance | 0.685 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.2 |\n", "| n_updates | 5460 |\n", "| policy_gradient_loss | -0.000349 |\n", "| value_loss | 106 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 457 |\n", "| ep_rew_mean | 153 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 548 |\n", "| time_elapsed | 2623 |\n", "| total_timesteps | 1122304 |\n", "| train/ | |\n", "| approx_kl | 0.0057540955 |\n", "| clip_fraction | 0.0654 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.455 |\n", "| explained_variance | 0.744 |\n", "| learning_rate | 0.0003 |\n", "| loss | 108 |\n", "| n_updates | 5470 |\n", "| policy_gradient_loss | -0.00146 |\n", "| value_loss | 93.9 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ 
| |\n", "| ep_len_mean | 462 |\n", "| ep_rew_mean | 154 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 549 |\n", "| time_elapsed | 2628 |\n", "| total_timesteps | 1124352 |\n", "| train/ | |\n", "| approx_kl | 0.0041732434 |\n", "| clip_fraction | 0.0522 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.431 |\n", "| explained_variance | 0.878 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.43 |\n", "| n_updates | 5480 |\n", "| policy_gradient_loss | -0.00454 |\n", "| value_loss | 93 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 466 |\n", "| ep_rew_mean | 149 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 550 |\n", "| time_elapsed | 2633 |\n", "| total_timesteps | 1126400 |\n", "| train/ | |\n", "| approx_kl | 0.0044614472 |\n", "| clip_fraction | 0.0734 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.538 |\n", "| explained_variance | 0.753 |\n", "| learning_rate | 0.0003 |\n", "| loss | 15.8 |\n", "| n_updates | 5490 |\n", "| policy_gradient_loss | -0.00121 |\n", "| value_loss | 44.4 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 466 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 551 |\n", "| time_elapsed | 2637 |\n", "| total_timesteps | 1128448 |\n", "| train/ | |\n", "| approx_kl | 0.01227468 |\n", "| clip_fraction | 0.0924 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.471 |\n", "| explained_variance | 0.845 |\n", "| learning_rate | 0.0003 |\n", "| loss | 21.2 |\n", "| n_updates | 5500 |\n", "| policy_gradient_loss | -0.0026 |\n", "| value_loss | 81.2 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 474 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 552 |\n", "| time_elapsed | 2642 
|\n", "| total_timesteps | 1130496 |\n", "| train/ | |\n", "| approx_kl | 0.0061698486 |\n", "| clip_fraction | 0.0491 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.437 |\n", "| explained_variance | 0.847 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.46 |\n", "| n_updates | 5510 |\n", "| policy_gradient_loss | -0.00399 |\n", "| value_loss | 59.5 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 474 |\n", "| ep_rew_mean | 144 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 553 |\n", "| time_elapsed | 2647 |\n", "| total_timesteps | 1132544 |\n", "| train/ | |\n", "| approx_kl | 0.006071769 |\n", "| clip_fraction | 0.0852 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.526 |\n", "| explained_variance | 0.81 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.42 |\n", "| n_updates | 5520 |\n", "| policy_gradient_loss | 0.000307 |\n", "| value_loss | 43 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 483 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 554 |\n", "| time_elapsed | 2651 |\n", "| total_timesteps | 1134592 |\n", "| train/ | |\n", "| approx_kl | 0.009480523 |\n", "| clip_fraction | 0.0375 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.453 |\n", "| explained_variance | 0.577 |\n", "| learning_rate | 0.0003 |\n", "| loss | 46.2 |\n", "| n_updates | 5530 |\n", "| policy_gradient_loss | -0.00143 |\n", "| value_loss | 89.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 483 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 555 |\n", "| time_elapsed | 2656 |\n", "| total_timesteps | 1136640 |\n", "| train/ | |\n", "| approx_kl | 0.007384347 |\n", "| clip_fraction | 0.0536 |\n", "| clip_range | 0.2 |\n", 
"| entropy_loss | -0.41 |\n", "| explained_variance | 0.832 |\n", "| learning_rate | 0.0003 |\n", "| loss | 33 |\n", "| n_updates | 5540 |\n", "| policy_gradient_loss | -0.00135 |\n", "| value_loss | 86.6 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 479 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 556 |\n", "| time_elapsed | 2661 |\n", "| total_timesteps | 1138688 |\n", "| train/ | |\n", "| approx_kl | 0.0037082974 |\n", "| clip_fraction | 0.0351 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.435 |\n", "| explained_variance | 0.729 |\n", "| learning_rate | 0.0003 |\n", "| loss | 63.7 |\n", "| n_updates | 5550 |\n", "| policy_gradient_loss | -0.00133 |\n", "| value_loss | 90.8 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 474 |\n", "| ep_rew_mean | 125 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 557 |\n", "| time_elapsed | 2665 |\n", "| total_timesteps | 1140736 |\n", "| train/ | |\n", "| approx_kl | 0.004881219 |\n", "| clip_fraction | 0.0453 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.441 |\n", "| explained_variance | 0.644 |\n", "| learning_rate | 0.0003 |\n", "| loss | 67.9 |\n", "| n_updates | 5560 |\n", "| policy_gradient_loss | -0.00332 |\n", "| value_loss | 159 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 484 |\n", "| ep_rew_mean | 124 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 558 |\n", "| time_elapsed | 2670 |\n", "| total_timesteps | 1142784 |\n", "| train/ | |\n", "| approx_kl | 0.0033078282 |\n", "| clip_fraction | 0.05 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.432 |\n", "| explained_variance | 0.774 |\n", "| learning_rate | 0.0003 |\n", "| loss | 45 |\n", "| n_updates | 5570 |\n", "| 
policy_gradient_loss | -0.00148 |\n", "| value_loss | 141 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 480 |\n", "| ep_rew_mean | 122 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 559 |\n", "| time_elapsed | 2675 |\n", "| total_timesteps | 1144832 |\n", "| train/ | |\n", "| approx_kl | 0.016727371 |\n", "| clip_fraction | 0.0709 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.433 |\n", "| explained_variance | 0.845 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10 |\n", "| n_updates | 5580 |\n", "| policy_gradient_loss | -0.00391 |\n", "| value_loss | 126 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 471 |\n", "| ep_rew_mean | 114 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 560 |\n", "| time_elapsed | 2679 |\n", "| total_timesteps | 1146880 |\n", "| train/ | |\n", "| approx_kl | 0.006921467 |\n", "| clip_fraction | 0.0561 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.397 |\n", "| explained_variance | 0.779 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.2 |\n", "| n_updates | 5590 |\n", "| policy_gradient_loss | -0.00156 |\n", "| value_loss | 104 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 472 |\n", "| ep_rew_mean | 116 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 561 |\n", "| time_elapsed | 2684 |\n", "| total_timesteps | 1148928 |\n", "| train/ | |\n", "| approx_kl | 0.0072271586 |\n", "| clip_fraction | 0.0633 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.41 |\n", "| explained_variance | 0.848 |\n", "| learning_rate | 0.0003 |\n", "| loss | 22.9 |\n", "| n_updates | 5600 |\n", "| policy_gradient_loss | -0.00162 |\n", "| value_loss | 82.2 |\n", "------------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 462 |\n", "| ep_rew_mean | 120 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 562 |\n", "| time_elapsed | 2689 |\n", "| total_timesteps | 1150976 |\n", "| train/ | |\n", "| approx_kl | 0.010032705 |\n", "| clip_fraction | 0.163 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.492 |\n", "| explained_variance | 0.883 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10 |\n", "| n_updates | 5610 |\n", "| policy_gradient_loss | -0.000757 |\n", "| value_loss | 40.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 464 |\n", "| ep_rew_mean | 119 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 563 |\n", "| time_elapsed | 2693 |\n", "| total_timesteps | 1153024 |\n", "| train/ | |\n", "| approx_kl | 0.004457718 |\n", "| clip_fraction | 0.0421 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.409 |\n", "| explained_variance | 0.863 |\n", "| learning_rate | 0.0003 |\n", "| loss | 21.9 |\n", "| n_updates | 5620 |\n", "| policy_gradient_loss | -0.0036 |\n", "| value_loss | 57.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 454 |\n", "| ep_rew_mean | 118 |\n", "| time/ | |\n", "| fps | 427 |\n", "| iterations | 564 |\n", "| time_elapsed | 2698 |\n", "| total_timesteps | 1155072 |\n", "| train/ | |\n", "| approx_kl | 0.008716529 |\n", "| clip_fraction | 0.0625 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.46 |\n", "| explained_variance | 0.814 |\n", "| learning_rate | 0.0003 |\n", "| loss | 26.7 |\n", "| n_updates | 5630 |\n", "| policy_gradient_loss | -0.00172 |\n", "| value_loss | 122 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 461 |\n", "| ep_rew_mean | 116 |\n", "| time/ | |\n", "| fps | 428 
|\n", "| iterations | 565 |\n", "| time_elapsed | 2703 |\n", "| total_timesteps | 1157120 |\n", "| train/ | |\n", "| approx_kl | 0.002977648 |\n", "| clip_fraction | 0.051 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.392 |\n", "| explained_variance | 0.866 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.7 |\n", "| n_updates | 5640 |\n", "| policy_gradient_loss | -0.00118 |\n", "| value_loss | 72.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 467 |\n", "| ep_rew_mean | 114 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 566 |\n", "| time_elapsed | 2707 |\n", "| total_timesteps | 1159168 |\n", "| train/ | |\n", "| approx_kl | 0.013689946 |\n", "| clip_fraction | 0.154 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.527 |\n", "| explained_variance | 0.894 |\n", "| learning_rate | 0.0003 |\n", "| loss | 40.2 |\n", "| n_updates | 5650 |\n", "| policy_gradient_loss | 0.00328 |\n", "| value_loss | 34.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 461 |\n", "| ep_rew_mean | 116 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 567 |\n", "| time_elapsed | 2712 |\n", "| total_timesteps | 1161216 |\n", "| train/ | |\n", "| approx_kl | 0.0119428355 |\n", "| clip_fraction | 0.116 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.536 |\n", "| explained_variance | 0.886 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.66 |\n", "| n_updates | 5660 |\n", "| policy_gradient_loss | -0.00133 |\n", "| value_loss | 17.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 462 |\n", "| ep_rew_mean | 116 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 568 |\n", "| time_elapsed | 2717 |\n", "| total_timesteps | 1163264 |\n", "| train/ | |\n", "| approx_kl | 0.005788869 |\n", "| 
clip_fraction | 0.0647 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.491 |\n", "| explained_variance | 0.817 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.4 |\n", "| n_updates | 5670 |\n", "| policy_gradient_loss | -0.000802 |\n", "| value_loss | 20.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 460 |\n", "| ep_rew_mean | 116 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 569 |\n", "| time_elapsed | 2722 |\n", "| total_timesteps | 1165312 |\n", "| train/ | |\n", "| approx_kl | 0.010089215 |\n", "| clip_fraction | 0.0567 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.429 |\n", "| explained_variance | 0.599 |\n", "| learning_rate | 0.0003 |\n", "| loss | 35.1 |\n", "| n_updates | 5680 |\n", "| policy_gradient_loss | -0.00187 |\n", "| value_loss | 143 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 461 |\n", "| ep_rew_mean | 119 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 570 |\n", "| time_elapsed | 2726 |\n", "| total_timesteps | 1167360 |\n", "| train/ | |\n", "| approx_kl | 0.0042023426 |\n", "| clip_fraction | 0.0417 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.444 |\n", "| explained_variance | 0.791 |\n", "| learning_rate | 0.0003 |\n", "| loss | 26.4 |\n", "| n_updates | 5690 |\n", "| policy_gradient_loss | -0.00133 |\n", "| value_loss | 104 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 459 |\n", "| ep_rew_mean | 121 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 571 |\n", "| time_elapsed | 2731 |\n", "| total_timesteps | 1169408 |\n", "| train/ | |\n", "| approx_kl | 0.004941967 |\n", "| clip_fraction | 0.0475 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.398 |\n", "| explained_variance | 0.882 |\n", "| learning_rate | 0.0003 
|\n", "| loss | 13.9 |\n", "| n_updates | 5700 |\n", "| policy_gradient_loss | -0.00401 |\n", "| value_loss | 60.8 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 456 |\n", "| ep_rew_mean | 117 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 572 |\n", "| time_elapsed | 2736 |\n", "| total_timesteps | 1171456 |\n", "| train/ | |\n", "| approx_kl | 0.0035473355 |\n", "| clip_fraction | 0.0385 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.402 |\n", "| explained_variance | 0.868 |\n", "| learning_rate | 0.0003 |\n", "| loss | 56.9 |\n", "| n_updates | 5710 |\n", "| policy_gradient_loss | -0.000825 |\n", "| value_loss | 77.7 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 465 |\n", "| ep_rew_mean | 113 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 573 |\n", "| time_elapsed | 2740 |\n", "| total_timesteps | 1173504 |\n", "| train/ | |\n", "| approx_kl | 0.0022912868 |\n", "| clip_fraction | 0.0731 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.474 |\n", "| explained_variance | 0.859 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.9 |\n", "| n_updates | 5720 |\n", "| policy_gradient_loss | 0.000227 |\n", "| value_loss | 103 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 461 |\n", "| ep_rew_mean | 116 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 574 |\n", "| time_elapsed | 2745 |\n", "| total_timesteps | 1175552 |\n", "| train/ | |\n", "| approx_kl | 0.0077647544 |\n", "| clip_fraction | 0.072 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.458 |\n", "| explained_variance | 0.806 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.32 |\n", "| n_updates | 5730 |\n", "| policy_gradient_loss | -0.0013 |\n", "| value_loss | 100 |\n", 
"------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 464 |\n", "| ep_rew_mean | 120 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 575 |\n", "| time_elapsed | 2750 |\n", "| total_timesteps | 1177600 |\n", "| train/ | |\n", "| approx_kl | 0.0058889235 |\n", "| clip_fraction | 0.0571 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.417 |\n", "| explained_variance | 0.709 |\n", "| learning_rate | 0.0003 |\n", "| loss | 37.7 |\n", "| n_updates | 5740 |\n", "| policy_gradient_loss | -0.00258 |\n", "| value_loss | 96.1 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 461 |\n", "| ep_rew_mean | 124 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 576 |\n", "| time_elapsed | 2754 |\n", "| total_timesteps | 1179648 |\n", "| train/ | |\n", "| approx_kl | 0.0057556145 |\n", "| clip_fraction | 0.0472 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.405 |\n", "| explained_variance | 0.797 |\n", "| learning_rate | 0.0003 |\n", "| loss | 46.4 |\n", "| n_updates | 5750 |\n", "| policy_gradient_loss | -0.0027 |\n", "| value_loss | 87.9 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 457 |\n", "| ep_rew_mean | 129 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 577 |\n", "| time_elapsed | 2759 |\n", "| total_timesteps | 1181696 |\n", "| train/ | |\n", "| approx_kl | 0.003787918 |\n", "| clip_fraction | 0.0511 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.334 |\n", "| explained_variance | 0.869 |\n", "| learning_rate | 0.0003 |\n", "| loss | 17.2 |\n", "| n_updates | 5760 |\n", "| policy_gradient_loss | -0.00255 |\n", "| value_loss | 48 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 458 |\n", "| 
ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 578 |\n", "| time_elapsed | 2764 |\n", "| total_timesteps | 1183744 |\n", "| train/ | |\n", "| approx_kl | 0.006271085 |\n", "| clip_fraction | 0.0633 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.395 |\n", "| explained_variance | 0.915 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.34 |\n", "| n_updates | 5770 |\n", "| policy_gradient_loss | -0.00046 |\n", "| value_loss | 33.3 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 464 |\n", "| ep_rew_mean | 130 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 579 |\n", "| time_elapsed | 2768 |\n", "| total_timesteps | 1185792 |\n", "| train/ | |\n", "| approx_kl | 0.00771052 |\n", "| clip_fraction | 0.0635 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.392 |\n", "| explained_variance | 0.823 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.4 |\n", "| n_updates | 5780 |\n", "| policy_gradient_loss | -0.00172 |\n", "| value_loss | 77.9 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 472 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 580 |\n", "| time_elapsed | 2773 |\n", "| total_timesteps | 1187840 |\n", "| train/ | |\n", "| approx_kl | 0.005082637 |\n", "| clip_fraction | 0.0476 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.389 |\n", "| explained_variance | 0.856 |\n", "| learning_rate | 0.0003 |\n", "| loss | 30.7 |\n", "| n_updates | 5790 |\n", "| policy_gradient_loss | -0.0026 |\n", "| value_loss | 75.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 463 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 581 |\n", "| time_elapsed | 2778 |\n", "| total_timesteps | 1189888 |\n", "| 
train/ | |\n", "| approx_kl | 0.102977335 |\n", "| clip_fraction | 0.257 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.546 |\n", "| explained_variance | 0.836 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.5 |\n", "| n_updates | 5800 |\n", "| policy_gradient_loss | -0.0303 |\n", "| value_loss | 41.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 470 |\n", "| ep_rew_mean | 122 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 582 |\n", "| time_elapsed | 2782 |\n", "| total_timesteps | 1191936 |\n", "| train/ | |\n", "| approx_kl | 0.0037267106 |\n", "| clip_fraction | 0.0342 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.378 |\n", "| explained_variance | 0.638 |\n", "| learning_rate | 0.0003 |\n", "| loss | 100 |\n", "| n_updates | 5810 |\n", "| policy_gradient_loss | -0.000241 |\n", "| value_loss | 139 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 476 |\n", "| ep_rew_mean | 124 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 583 |\n", "| time_elapsed | 2787 |\n", "| total_timesteps | 1193984 |\n", "| train/ | |\n", "| approx_kl | 0.0035090386 |\n", "| clip_fraction | 0.0407 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.383 |\n", "| explained_variance | 0.536 |\n", "| learning_rate | 0.0003 |\n", "| loss | 90.3 |\n", "| n_updates | 5820 |\n", "| policy_gradient_loss | 0.000554 |\n", "| value_loss | 256 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 486 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 584 |\n", "| time_elapsed | 2792 |\n", "| total_timesteps | 1196032 |\n", "| train/ | |\n", "| approx_kl | 0.009374836 |\n", "| clip_fraction | 0.0588 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.357 |\n", "| 
explained_variance | 0.833 |\n", "| learning_rate | 0.0003 |\n", "| loss | 29.7 |\n", "| n_updates | 5830 |\n", "| policy_gradient_loss | -0.00339 |\n", "| value_loss | 53.5 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 480 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 585 |\n", "| time_elapsed | 2796 |\n", "| total_timesteps | 1198080 |\n", "| train/ | |\n", "| approx_kl | 0.0050649224 |\n", "| clip_fraction | 0.0918 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.466 |\n", "| explained_variance | 0.911 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10.7 |\n", "| n_updates | 5840 |\n", "| policy_gradient_loss | 0.00159 |\n", "| value_loss | 23.1 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 480 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 586 |\n", "| time_elapsed | 2802 |\n", "| total_timesteps | 1200128 |\n", "| train/ | |\n", "| approx_kl | 0.0058341874 |\n", "| clip_fraction | 0.0526 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.358 |\n", "| explained_variance | 0.797 |\n", "| learning_rate | 0.0003 |\n", "| loss | 81.2 |\n", "| n_updates | 5850 |\n", "| policy_gradient_loss | -0.0015 |\n", "| value_loss | 81.1 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 482 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 587 |\n", "| time_elapsed | 2806 |\n", "| total_timesteps | 1202176 |\n", "| train/ | |\n", "| approx_kl | 0.00538499 |\n", "| clip_fraction | 0.0724 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.421 |\n", "| explained_variance | 0.864 |\n", "| learning_rate | 0.0003 |\n", "| loss | 17.3 |\n", "| n_updates | 5860 |\n", "| policy_gradient_loss | -0.00209 
|\n", "| value_loss | 48.9 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 488 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 588 |\n", "| time_elapsed | 2810 |\n", "| total_timesteps | 1204224 |\n", "| train/ | |\n", "| approx_kl | 0.0053222985 |\n", "| clip_fraction | 0.0702 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.382 |\n", "| explained_variance | 0.822 |\n", "| learning_rate | 0.0003 |\n", "| loss | 28.7 |\n", "| n_updates | 5870 |\n", "| policy_gradient_loss | -0.00258 |\n", "| value_loss | 108 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 484 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 589 |\n", "| time_elapsed | 2816 |\n", "| total_timesteps | 1206272 |\n", "| train/ | |\n", "| approx_kl | 0.012854487 |\n", "| clip_fraction | 0.0964 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.506 |\n", "| explained_variance | 0.887 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.41 |\n", "| n_updates | 5880 |\n", "| policy_gradient_loss | 0.000489 |\n", "| value_loss | 9.37 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 473 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 590 |\n", "| time_elapsed | 2820 |\n", "| total_timesteps | 1208320 |\n", "| train/ | |\n", "| approx_kl | 0.004895383 |\n", "| clip_fraction | 0.0618 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.46 |\n", "| explained_variance | 0.855 |\n", "| learning_rate | 0.0003 |\n", "| loss | 3.2 |\n", "| n_updates | 5890 |\n", "| policy_gradient_loss | -0.000632 |\n", "| value_loss | 32.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", 
"| ep_len_mean | 471 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 591 |\n", "| time_elapsed | 2824 |\n", "| total_timesteps | 1210368 |\n", "| train/ | |\n", "| approx_kl | 0.009725379 |\n", "| clip_fraction | 0.0728 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.428 |\n", "| explained_variance | 0.715 |\n", "| learning_rate | 0.0003 |\n", "| loss | 35.9 |\n", "| n_updates | 5900 |\n", "| policy_gradient_loss | -0.00595 |\n", "| value_loss | 138 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 477 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 592 |\n", "| time_elapsed | 2830 |\n", "| total_timesteps | 1212416 |\n", "| train/ | |\n", "| approx_kl | 0.0047201477 |\n", "| clip_fraction | 0.049 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.383 |\n", "| explained_variance | 0.517 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.5 |\n", "| n_updates | 5910 |\n", "| policy_gradient_loss | -0.00178 |\n", "| value_loss | 122 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 481 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 593 |\n", "| time_elapsed | 2834 |\n", "| total_timesteps | 1214464 |\n", "| train/ | |\n", "| approx_kl | 0.012734052 |\n", "| clip_fraction | 0.112 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.491 |\n", "| explained_variance | 0.854 |\n", "| learning_rate | 0.0003 |\n", "| loss | 3.05 |\n", "| n_updates | 5920 |\n", "| policy_gradient_loss | -0.00163 |\n", "| value_loss | 27.2 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 485 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 594 |\n", "| time_elapsed | 2839 |\n", "| 
total_timesteps | 1216512 |\n", "| train/ | |\n", "| approx_kl | 0.0031703059 |\n", "| clip_fraction | 0.0391 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.393 |\n", "| explained_variance | 0.655 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.68 |\n", "| n_updates | 5930 |\n", "| policy_gradient_loss | -0.00311 |\n", "| value_loss | 86.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 480 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 595 |\n", "| time_elapsed | 2844 |\n", "| total_timesteps | 1218560 |\n", "| train/ | |\n", "| approx_kl | 0.005896183 |\n", "| clip_fraction | 0.0627 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.489 |\n", "| explained_variance | 0.824 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.79 |\n", "| n_updates | 5940 |\n", "| policy_gradient_loss | -0.000458 |\n", "| value_loss | 32.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 478 |\n", "| ep_rew_mean | 140 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 596 |\n", "| time_elapsed | 2848 |\n", "| total_timesteps | 1220608 |\n", "| train/ | |\n", "| approx_kl | 0.0035374167 |\n", "| clip_fraction | 0.0587 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.411 |\n", "| explained_variance | 0.857 |\n", "| learning_rate | 0.0003 |\n", "| loss | 60.5 |\n", "| n_updates | 5950 |\n", "| policy_gradient_loss | -0.00282 |\n", "| value_loss | 82.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 477 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 597 |\n", "| time_elapsed | 2853 |\n", "| total_timesteps | 1222656 |\n", "| train/ | |\n", "| approx_kl | 0.005633903 |\n", "| clip_fraction | 0.0634 |\n", "| clip_range | 0.2 |\n", 
"| entropy_loss | -0.468 |\n", "| explained_variance | 0.775 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.62 |\n", "| n_updates | 5960 |\n", "| policy_gradient_loss | -0.000128 |\n", "| value_loss | 27.5 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 484 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 598 |\n", "| time_elapsed | 2858 |\n", "| total_timesteps | 1224704 |\n", "| train/ | |\n", "| approx_kl | 0.0032009394 |\n", "| clip_fraction | 0.0372 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.395 |\n", "| explained_variance | 0.554 |\n", "| learning_rate | 0.0003 |\n", "| loss | 24 |\n", "| n_updates | 5970 |\n", "| policy_gradient_loss | -0.00115 |\n", "| value_loss | 72.6 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 489 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 599 |\n", "| time_elapsed | 2862 |\n", "| total_timesteps | 1226752 |\n", "| train/ | |\n", "| approx_kl | 0.02550316 |\n", "| clip_fraction | 0.099 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.493 |\n", "| explained_variance | 0.799 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.42 |\n", "| n_updates | 5980 |\n", "| policy_gradient_loss | -0.00327 |\n", "| value_loss | 22.8 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 489 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 600 |\n", "| time_elapsed | 2867 |\n", "| total_timesteps | 1228800 |\n", "| train/ | |\n", "| approx_kl | 0.0058284197 |\n", "| clip_fraction | 0.0624 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.507 |\n", "| explained_variance | 0.772 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.12 |\n", "| n_updates | 5990 |\n", "| 
policy_gradient_loss | -0.000728 |\n", "| value_loss | 13 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 489 |\n", "| ep_rew_mean | 134 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 601 |\n", "| time_elapsed | 2872 |\n", "| total_timesteps | 1230848 |\n", "| train/ | |\n", "| approx_kl | 0.0077087083 |\n", "| clip_fraction | 0.0707 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.437 |\n", "| explained_variance | 0.704 |\n", "| learning_rate | 0.0003 |\n", "| loss | 27.2 |\n", "| n_updates | 6000 |\n", "| policy_gradient_loss | -0.00381 |\n", "| value_loss | 86.9 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 494 |\n", "| ep_rew_mean | 135 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 602 |\n", "| time_elapsed | 2876 |\n", "| total_timesteps | 1232896 |\n", "| train/ | |\n", "| approx_kl | 0.0056669745 |\n", "| clip_fraction | 0.0667 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.395 |\n", "| explained_variance | 0.894 |\n", "| learning_rate | 0.0003 |\n", "| loss | 41.2 |\n", "| n_updates | 6010 |\n", "| policy_gradient_loss | -0.0015 |\n", "| value_loss | 31.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 501 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 603 |\n", "| time_elapsed | 2881 |\n", "| total_timesteps | 1234944 |\n", "| train/ | |\n", "| approx_kl | 0.010163366 |\n", "| clip_fraction | 0.0703 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.446 |\n", "| explained_variance | 0.607 |\n", "| learning_rate | 0.0003 |\n", "| loss | 13.6 |\n", "| n_updates | 6020 |\n", "| policy_gradient_loss | 0.000159 |\n", "| value_loss | 45.9 |\n", "-----------------------------------------\n", 
"----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 494 |\n", "| ep_rew_mean | 139 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 604 |\n", "| time_elapsed | 2886 |\n", "| total_timesteps | 1236992 |\n", "| train/ | |\n", "| approx_kl | 0.01251073 |\n", "| clip_fraction | 0.0959 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.473 |\n", "| explained_variance | 0.884 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.97 |\n", "| n_updates | 6030 |\n", "| policy_gradient_loss | 0.000617 |\n", "| value_loss | 15.5 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 499 |\n", "| ep_rew_mean | 142 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 605 |\n", "| time_elapsed | 2890 |\n", "| total_timesteps | 1239040 |\n", "| train/ | |\n", "| approx_kl | 0.0052781473 |\n", "| clip_fraction | 0.0551 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.42 |\n", "| explained_variance | 0.851 |\n", "| learning_rate | 0.0003 |\n", "| loss | 32.2 |\n", "| n_updates | 6040 |\n", "| policy_gradient_loss | -7.33e-05 |\n", "| value_loss | 107 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 497 |\n", "| ep_rew_mean | 149 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 606 |\n", "| time_elapsed | 2895 |\n", "| total_timesteps | 1241088 |\n", "| train/ | |\n", "| approx_kl | 0.004059146 |\n", "| clip_fraction | 0.0352 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.488 |\n", "| explained_variance | 0.729 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.3 |\n", "| n_updates | 6050 |\n", "| policy_gradient_loss | -0.00118 |\n", "| value_loss | 35.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 498 |\n", "| ep_rew_mean | 150 |\n", "| time/ | |\n", "| fps | 
428 |\n", "| iterations | 607 |\n", "| time_elapsed | 2900 |\n", "| total_timesteps | 1243136 |\n", "| train/ | |\n", "| approx_kl | 0.004273815 |\n", "| clip_fraction | 0.0427 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.394 |\n", "| explained_variance | 0.832 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.75 |\n", "| n_updates | 6060 |\n", "| policy_gradient_loss | -0.000906 |\n", "| value_loss | 21.4 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 141 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 608 |\n", "| time_elapsed | 2904 |\n", "| total_timesteps | 1245184 |\n", "| train/ | |\n", "| approx_kl | 0.0032682677 |\n", "| clip_fraction | 0.0336 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.388 |\n", "| explained_variance | 0.605 |\n", "| learning_rate | 0.0003 |\n", "| loss | 40.7 |\n", "| n_updates | 6070 |\n", "| policy_gradient_loss | -0.00312 |\n", "| value_loss | 77.6 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 492 |\n", "| ep_rew_mean | 143 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 609 |\n", "| time_elapsed | 2909 |\n", "| total_timesteps | 1247232 |\n", "| train/ | |\n", "| approx_kl | 0.007157719 |\n", "| clip_fraction | 0.0453 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.382 |\n", "| explained_variance | 0.611 |\n", "| learning_rate | 0.0003 |\n", "| loss | 101 |\n", "| n_updates | 6080 |\n", "| policy_gradient_loss | -0.00127 |\n", "| value_loss | 160 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 498 |\n", "| ep_rew_mean | 142 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 610 |\n", "| time_elapsed | 2914 |\n", "| total_timesteps | 1249280 |\n", "| train/ | |\n", "| approx_kl | 0.0047901347 
|\n", "| clip_fraction | 0.0478 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.399 |\n", "| explained_variance | 0.746 |\n", "| learning_rate | 0.0003 |\n", "| loss | 13.5 |\n", "| n_updates | 6090 |\n", "| policy_gradient_loss | -0.000469 |\n", "| value_loss | 42.6 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 502 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 611 |\n", "| time_elapsed | 2919 |\n", "| total_timesteps | 1251328 |\n", "| train/ | |\n", "| approx_kl | 0.0068886247 |\n", "| clip_fraction | 0.0715 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.477 |\n", "| explained_variance | 0.777 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.57 |\n", "| n_updates | 6100 |\n", "| policy_gradient_loss | 0.000135 |\n", "| value_loss | 22 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 498 |\n", "| ep_rew_mean | 143 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 612 |\n", "| time_elapsed | 2923 |\n", "| total_timesteps | 1253376 |\n", "| train/ | |\n", "| approx_kl | 0.037974022 |\n", "| clip_fraction | 0.213 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.525 |\n", "| explained_variance | 0.747 |\n", "| learning_rate | 0.0003 |\n", "| loss | 8.44 |\n", "| n_updates | 6110 |\n", "| policy_gradient_loss | -0.0238 |\n", "| value_loss | 31.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 489 |\n", "| ep_rew_mean | 140 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 613 |\n", "| time_elapsed | 2928 |\n", "| total_timesteps | 1255424 |\n", "| train/ | |\n", "| approx_kl | 0.013029137 |\n", "| clip_fraction | 0.0765 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.472 |\n", "| explained_variance | 0.717 |\n", "| learning_rate 
| 0.0003 |\n", "| loss | 34.4 |\n", "| n_updates | 6120 |\n", "| policy_gradient_loss | -0.000919 |\n", "| value_loss | 108 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 483 |\n", "| ep_rew_mean | 145 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 614 |\n", "| time_elapsed | 2933 |\n", "| total_timesteps | 1257472 |\n", "| train/ | |\n", "| approx_kl | 0.0060887155 |\n", "| clip_fraction | 0.0766 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.415 |\n", "| explained_variance | 0.782 |\n", "| learning_rate | 0.0003 |\n", "| loss | 51.3 |\n", "| n_updates | 6130 |\n", "| policy_gradient_loss | -0.000629 |\n", "| value_loss | 75.3 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 490 |\n", "| ep_rew_mean | 146 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 615 |\n", "| time_elapsed | 2938 |\n", "| total_timesteps | 1259520 |\n", "| train/ | |\n", "| approx_kl | 0.008930445 |\n", "| clip_fraction | 0.036 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.412 |\n", "| explained_variance | 0.763 |\n", "| learning_rate | 0.0003 |\n", "| loss | 49.9 |\n", "| n_updates | 6140 |\n", "| policy_gradient_loss | -0.00287 |\n", "| value_loss | 98.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 496 |\n", "| ep_rew_mean | 148 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 616 |\n", "| time_elapsed | 2942 |\n", "| total_timesteps | 1261568 |\n", "| train/ | |\n", "| approx_kl | 0.0031336215 |\n", "| clip_fraction | 0.0652 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.459 |\n", "| explained_variance | 0.858 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.41 |\n", "| n_updates | 6150 |\n", "| policy_gradient_loss | 0.00114 |\n", "| value_loss | 34.4 |\n", 
"------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 495 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 617 |\n", "| time_elapsed | 2947 |\n", "| total_timesteps | 1263616 |\n", "| train/ | |\n", "| approx_kl | 0.0033063334 |\n", "| clip_fraction | 0.071 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.458 |\n", "| explained_variance | 0.698 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.8 |\n", "| n_updates | 6160 |\n", "| policy_gradient_loss | -0.00216 |\n", "| value_loss | 56.9 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 497 |\n", "| ep_rew_mean | 145 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 618 |\n", "| time_elapsed | 2951 |\n", "| total_timesteps | 1265664 |\n", "| train/ | |\n", "| approx_kl | 0.0051268823 |\n", "| clip_fraction | 0.0502 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.423 |\n", "| explained_variance | 0.63 |\n", "| learning_rate | 0.0003 |\n", "| loss | 24.9 |\n", "| n_updates | 6170 |\n", "| policy_gradient_loss | -0.000333 |\n", "| value_loss | 82.9 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 496 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 619 |\n", "| time_elapsed | 2956 |\n", "| total_timesteps | 1267712 |\n", "| train/ | |\n", "| approx_kl | 0.0044256123 |\n", "| clip_fraction | 0.0482 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.386 |\n", "| explained_variance | 0.768 |\n", "| learning_rate | 0.0003 |\n", "| loss | 8.18 |\n", "| n_updates | 6180 |\n", "| policy_gradient_loss | -0.000619 |\n", "| value_loss | 34.1 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 490 
|\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 620 |\n", "| time_elapsed | 2961 |\n", "| total_timesteps | 1269760 |\n", "| train/ | |\n", "| approx_kl | 0.0068230964 |\n", "| clip_fraction | 0.0875 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.41 |\n", "| explained_variance | 0.694 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.08 |\n", "| n_updates | 6190 |\n", "| policy_gradient_loss | -0.0059 |\n", "| value_loss | 96.9 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 501 |\n", "| ep_rew_mean | 143 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 621 |\n", "| time_elapsed | 2965 |\n", "| total_timesteps | 1271808 |\n", "| train/ | |\n", "| approx_kl | 0.0037400583 |\n", "| clip_fraction | 0.0409 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.348 |\n", "| explained_variance | 0.812 |\n", "| learning_rate | 0.0003 |\n", "| loss | 32.6 |\n", "| n_updates | 6200 |\n", "| policy_gradient_loss | -0.0013 |\n", "| value_loss | 63.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 499 |\n", "| ep_rew_mean | 145 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 622 |\n", "| time_elapsed | 2970 |\n", "| total_timesteps | 1273856 |\n", "| train/ | |\n", "| approx_kl | 0.018568423 |\n", "| clip_fraction | 0.0936 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.472 |\n", "| explained_variance | 0.883 |\n", "| learning_rate | 0.0003 |\n", "| loss | 23.5 |\n", "| n_updates | 6210 |\n", "| policy_gradient_loss | 0.00226 |\n", "| value_loss | 39.6 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 484 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 623 |\n", "| time_elapsed | 2975 |\n", "| total_timesteps | 
1275904 |\n", "| train/ | |\n", "| approx_kl | 0.0032491507 |\n", "| clip_fraction | 0.0408 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.388 |\n", "| explained_variance | 0.764 |\n", "| learning_rate | 0.0003 |\n", "| loss | 8.47 |\n", "| n_updates | 6220 |\n", "| policy_gradient_loss | 0.00046 |\n", "| value_loss | 58.4 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 493 |\n", "| ep_rew_mean | 142 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 624 |\n", "| time_elapsed | 2979 |\n", "| total_timesteps | 1277952 |\n", "| train/ | |\n", "| approx_kl | 0.004373729 |\n", "| clip_fraction | 0.036 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.405 |\n", "| explained_variance | 0.788 |\n", "| learning_rate | 0.0003 |\n", "| loss | 16.8 |\n", "| n_updates | 6230 |\n", "| policy_gradient_loss | -0.000971 |\n", "| value_loss | 66.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 495 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 625 |\n", "| time_elapsed | 2984 |\n", "| total_timesteps | 1280000 |\n", "| train/ | |\n", "| approx_kl | 0.0072266324 |\n", "| clip_fraction | 0.0432 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.359 |\n", "| explained_variance | 0.708 |\n", "| learning_rate | 0.0003 |\n", "| loss | 71.7 |\n", "| n_updates | 6240 |\n", "| policy_gradient_loss | -0.00151 |\n", "| value_loss | 132 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 626 |\n", "| time_elapsed | 2989 |\n", "| total_timesteps | 1282048 |\n", "| train/ | |\n", "| approx_kl | 0.01223236 |\n", "| clip_fraction | 0.103 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.486 
|\n", "| explained_variance | 0.859 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.27 |\n", "| n_updates | 6250 |\n", "| policy_gradient_loss | 0.00264 |\n", "| value_loss | 45.8 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 490 |\n", "| ep_rew_mean | 136 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 627 |\n", "| time_elapsed | 2993 |\n", "| total_timesteps | 1284096 |\n", "| train/ | |\n", "| approx_kl | 0.0048354594 |\n", "| clip_fraction | 0.0487 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.406 |\n", "| explained_variance | 0.805 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.5 |\n", "| n_updates | 6260 |\n", "| policy_gradient_loss | -0.00148 |\n", "| value_loss | 80.8 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 494 |\n", "| ep_rew_mean | 135 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 628 |\n", "| time_elapsed | 2998 |\n", "| total_timesteps | 1286144 |\n", "| train/ | |\n", "| approx_kl | 0.008309221 |\n", "| clip_fraction | 0.0757 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.433 |\n", "| explained_variance | 0.81 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.6 |\n", "| n_updates | 6270 |\n", "| policy_gradient_loss | -4.57e-05 |\n", "| value_loss | 97.3 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 501 |\n", "| ep_rew_mean | 139 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 629 |\n", "| time_elapsed | 3003 |\n", "| total_timesteps | 1288192 |\n", "| train/ | |\n", "| approx_kl | 0.0064479243 |\n", "| clip_fraction | 0.0616 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.397 |\n", "| explained_variance | 0.782 |\n", "| learning_rate | 0.0003 |\n", "| loss | 48.8 |\n", "| n_updates | 6280 |\n", "| policy_gradient_loss 
| -0.00219 |\n", "| value_loss | 89.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 489 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 630 |\n", "| time_elapsed | 3007 |\n", "| total_timesteps | 1290240 |\n", "| train/ | |\n", "| approx_kl | 0.010094101 |\n", "| clip_fraction | 0.0892 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.459 |\n", "| explained_variance | 0.879 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.37 |\n", "| n_updates | 6290 |\n", "| policy_gradient_loss | -0.000634 |\n", "| value_loss | 19.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 490 |\n", "| ep_rew_mean | 140 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 631 |\n", "| time_elapsed | 3012 |\n", "| total_timesteps | 1292288 |\n", "| train/ | |\n", "| approx_kl | 0.004461647 |\n", "| clip_fraction | 0.0464 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.403 |\n", "| explained_variance | 0.838 |\n", "| learning_rate | 0.0003 |\n", "| loss | 19 |\n", "| n_updates | 6300 |\n", "| policy_gradient_loss | -0.00107 |\n", "| value_loss | 48.2 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 144 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 632 |\n", "| time_elapsed | 3016 |\n", "| total_timesteps | 1294336 |\n", "| train/ | |\n", "| approx_kl | 0.0046276534 |\n", "| clip_fraction | 0.0554 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.457 |\n", "| explained_variance | 0.885 |\n", "| learning_rate | 0.0003 |\n", "| loss | 16.8 |\n", "| n_updates | 6310 |\n", "| policy_gradient_loss | 0.000712 |\n", "| value_loss | 23.3 |\n", "------------------------------------------\n", "------------------------------------------\n", "| 
rollout/ | |\n", "| ep_len_mean | 485 |\n", "| ep_rew_mean | 139 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 633 |\n", "| time_elapsed | 3021 |\n", "| total_timesteps | 1296384 |\n", "| train/ | |\n", "| approx_kl | 0.0068013687 |\n", "| clip_fraction | 0.0648 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.379 |\n", "| explained_variance | 0.817 |\n", "| learning_rate | 0.0003 |\n", "| loss | 84.1 |\n", "| n_updates | 6320 |\n", "| policy_gradient_loss | 0.000287 |\n", "| value_loss | 91.1 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 634 |\n", "| time_elapsed | 3026 |\n", "| total_timesteps | 1298432 |\n", "| train/ | |\n", "| approx_kl | 0.0030641935 |\n", "| clip_fraction | 0.0415 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.367 |\n", "| explained_variance | 0.734 |\n", "| learning_rate | 0.0003 |\n", "| loss | 29.5 |\n", "| n_updates | 6330 |\n", "| policy_gradient_loss | -0.00154 |\n", "| value_loss | 123 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 138 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 635 |\n", "| time_elapsed | 3030 |\n", "| total_timesteps | 1300480 |\n", "| train/ | |\n", "| approx_kl | 0.00809389 |\n", "| clip_fraction | 0.0664 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.43 |\n", "| explained_variance | 0.835 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.48 |\n", "| n_updates | 6340 |\n", "| policy_gradient_loss | -0.000473 |\n", "| value_loss | 30.7 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 491 |\n", "| ep_rew_mean | 140 |\n", "| time/ | |\n", "| fps | 428 |\n", "| iterations | 636 |\n", "| 
time_elapsed | 3036 |\n", "| total_timesteps | 1302528 |\n", "| train/ | |\n", "| approx_kl | 0.0056022285 |\n", "| clip_fraction | 0.0567 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.454 |\n", "| explained_variance | 0.845 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.26 |\n", "| n_updates | 6350 |\n", "| policy_gradient_loss | 0.000377 |\n", "| value_loss | 27.5 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 502 |\n", "| ep_rew_mean | 141 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 637 |\n", "| time_elapsed | 3040 |\n", "| total_timesteps | 1304576 |\n", "| train/ | |\n", "| approx_kl | 0.0051720664 |\n", "| clip_fraction | 0.0548 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.46 |\n", "| explained_variance | 0.815 |\n", "| learning_rate | 0.0003 |\n", "| loss | 3.35 |\n", "| n_updates | 6360 |\n", "| policy_gradient_loss | 0.000951 |\n", "| value_loss | 63.3 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 507 |\n", "| ep_rew_mean | 145 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 638 |\n", "| time_elapsed | 3045 |\n", "| total_timesteps | 1306624 |\n", "| train/ | |\n", "| approx_kl | 0.0022787186 |\n", "| clip_fraction | 0.0307 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.321 |\n", "| explained_variance | 0.752 |\n", "| learning_rate | 0.0003 |\n", "| loss | 32.5 |\n", "| n_updates | 6370 |\n", "| policy_gradient_loss | -0.000312 |\n", "| value_loss | 135 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 506 |\n", "| ep_rew_mean | 143 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 639 |\n", "| time_elapsed | 3050 |\n", "| total_timesteps | 1308672 |\n", "| train/ | |\n", "| approx_kl | 0.0040058363 |\n", "| clip_fraction | 0.0433 
|\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.379 |\n", "| explained_variance | 0.776 |\n", "| learning_rate | 0.0003 |\n", "| loss | 65.9 |\n", "| n_updates | 6380 |\n", "| policy_gradient_loss | -0.00173 |\n", "| value_loss | 87.8 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 512 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 640 |\n", "| time_elapsed | 3054 |\n", "| total_timesteps | 1310720 |\n", "| train/ | |\n", "| approx_kl | 0.0050199903 |\n", "| clip_fraction | 0.052 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.384 |\n", "| explained_variance | 0.654 |\n", "| learning_rate | 0.0003 |\n", "| loss | 57.1 |\n", "| n_updates | 6390 |\n", "| policy_gradient_loss | -0.00168 |\n", "| value_loss | 97 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 510 |\n", "| ep_rew_mean | 135 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 641 |\n", "| time_elapsed | 3059 |\n", "| total_timesteps | 1312768 |\n", "| train/ | |\n", "| approx_kl | 0.0058750836 |\n", "| clip_fraction | 0.0675 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.461 |\n", "| explained_variance | 0.482 |\n", "| learning_rate | 0.0003 |\n", "| loss | 21.9 |\n", "| n_updates | 6400 |\n", "| policy_gradient_loss | -0.00428 |\n", "| value_loss | 113 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 504 |\n", "| ep_rew_mean | 132 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 642 |\n", "| time_elapsed | 3064 |\n", "| total_timesteps | 1314816 |\n", "| train/ | |\n", "| approx_kl | 0.008813405 |\n", "| clip_fraction | 0.0865 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.48 |\n", "| explained_variance | 0.5 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.87 
|\n", "| n_updates | 6410 |\n", "| policy_gradient_loss | -0.00109 |\n", "| value_loss | 73.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 501 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 643 |\n", "| time_elapsed | 3068 |\n", "| total_timesteps | 1316864 |\n", "| train/ | |\n", "| approx_kl | 0.005574503 |\n", "| clip_fraction | 0.0554 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.4 |\n", "| explained_variance | 0.579 |\n", "| learning_rate | 0.0003 |\n", "| loss | 13.2 |\n", "| n_updates | 6420 |\n", "| policy_gradient_loss | -0.000229 |\n", "| value_loss | 153 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 507 |\n", "| ep_rew_mean | 129 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 644 |\n", "| time_elapsed | 3073 |\n", "| total_timesteps | 1318912 |\n", "| train/ | |\n", "| approx_kl | 0.0050408356 |\n", "| clip_fraction | 0.0675 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.391 |\n", "| explained_variance | 0.651 |\n", "| learning_rate | 0.0003 |\n", "| loss | 43.4 |\n", "| n_updates | 6430 |\n", "| policy_gradient_loss | -0.00163 |\n", "| value_loss | 133 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 517 |\n", "| ep_rew_mean | 130 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 645 |\n", "| time_elapsed | 3078 |\n", "| total_timesteps | 1320960 |\n", "| train/ | |\n", "| approx_kl | 0.0059750034 |\n", "| clip_fraction | 0.0803 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.47 |\n", "| explained_variance | 0.468 |\n", "| learning_rate | 0.0003 |\n", "| loss | 15.3 |\n", "| n_updates | 6440 |\n", "| policy_gradient_loss | 0.0011 |\n", "| value_loss | 78.5 |\n", "------------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 515 |\n", "| ep_rew_mean | 129 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 646 |\n", "| time_elapsed | 3083 |\n", "| total_timesteps | 1323008 |\n", "| train/ | |\n", "| approx_kl | 0.010442553 |\n", "| clip_fraction | 0.0934 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.497 |\n", "| explained_variance | 0.814 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.38 |\n", "| n_updates | 6450 |\n", "| policy_gradient_loss | 0.00254 |\n", "| value_loss | 15.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 516 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 647 |\n", "| time_elapsed | 3088 |\n", "| total_timesteps | 1325056 |\n", "| train/ | |\n", "| approx_kl | 0.0037268298 |\n", "| clip_fraction | 0.0355 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.428 |\n", "| explained_variance | 0.798 |\n", "| learning_rate | 0.0003 |\n", "| loss | 27.6 |\n", "| n_updates | 6460 |\n", "| policy_gradient_loss | -0.00278 |\n", "| value_loss | 58.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 517 |\n", "| ep_rew_mean | 122 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 648 |\n", "| time_elapsed | 3092 |\n", "| total_timesteps | 1327104 |\n", "| train/ | |\n", "| approx_kl | 0.004259568 |\n", "| clip_fraction | 0.0542 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.352 |\n", "| explained_variance | 0.673 |\n", "| learning_rate | 0.0003 |\n", "| loss | 103 |\n", "| n_updates | 6470 |\n", "| policy_gradient_loss | -0.00336 |\n", "| value_loss | 150 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 510 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 
429 |\n", "| iterations | 649 |\n", "| time_elapsed | 3096 |\n", "| total_timesteps | 1329152 |\n", "| train/ | |\n", "| approx_kl | 0.0074136867 |\n", "| clip_fraction | 0.0515 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.371 |\n", "| explained_variance | 0.561 |\n", "| learning_rate | 0.0003 |\n", "| loss | 41.3 |\n", "| n_updates | 6480 |\n", "| policy_gradient_loss | -0.00159 |\n", "| value_loss | 142 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 504 |\n", "| ep_rew_mean | 130 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 650 |\n", "| time_elapsed | 3102 |\n", "| total_timesteps | 1331200 |\n", "| train/ | |\n", "| approx_kl | 0.01160535 |\n", "| clip_fraction | 0.0877 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.425 |\n", "| explained_variance | 0.81 |\n", "| learning_rate | 0.0003 |\n", "| loss | 67.4 |\n", "| n_updates | 6490 |\n", "| policy_gradient_loss | -0.00424 |\n", "| value_loss | 46 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 514 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 651 |\n", "| time_elapsed | 3106 |\n", "| total_timesteps | 1333248 |\n", "| train/ | |\n", "| approx_kl | 0.0052286293 |\n", "| clip_fraction | 0.0612 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.386 |\n", "| explained_variance | 0.843 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.52 |\n", "| n_updates | 6500 |\n", "| policy_gradient_loss | -0.00106 |\n", "| value_loss | 55 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 518 |\n", "| ep_rew_mean | 123 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 652 |\n", "| time_elapsed | 3111 |\n", "| total_timesteps | 1335296 |\n", "| train/ | |\n", "| approx_kl | 0.010203233 |\n", 
"| clip_fraction | 0.0793 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.48 |\n", "| explained_variance | 0.837 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.32 |\n", "| n_updates | 6510 |\n", "| policy_gradient_loss | 0.000281 |\n", "| value_loss | 16.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 520 |\n", "| ep_rew_mean | 124 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 653 |\n", "| time_elapsed | 3116 |\n", "| total_timesteps | 1337344 |\n", "| train/ | |\n", "| approx_kl | 0.010623779 |\n", "| clip_fraction | 0.0998 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.438 |\n", "| explained_variance | 0.829 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.3 |\n", "| n_updates | 6520 |\n", "| policy_gradient_loss | -0.00316 |\n", "| value_loss | 57.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 517 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 654 |\n", "| time_elapsed | 3120 |\n", "| total_timesteps | 1339392 |\n", "| train/ | |\n", "| approx_kl | 0.0056538745 |\n", "| clip_fraction | 0.0474 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.344 |\n", "| explained_variance | 0.87 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.42 |\n", "| n_updates | 6530 |\n", "| policy_gradient_loss | -0.00235 |\n", "| value_loss | 36.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 520 |\n", "| ep_rew_mean | 123 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 655 |\n", "| time_elapsed | 3124 |\n", "| total_timesteps | 1341440 |\n", "| train/ | |\n", "| approx_kl | 0.007547711 |\n", "| clip_fraction | 0.0698 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.459 |\n", "| explained_variance | 0.886 |\n", "| learning_rate | 
0.0003 |\n", "| loss | 7.33 |\n", "| n_updates | 6540 |\n", "| policy_gradient_loss | -0.00171 |\n", "| value_loss | 11 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 525 |\n", "| ep_rew_mean | 124 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 656 |\n", "| time_elapsed | 3130 |\n", "| total_timesteps | 1343488 |\n", "| train/ | |\n", "| approx_kl | 0.004894417 |\n", "| clip_fraction | 0.0527 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.452 |\n", "| explained_variance | 0.906 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.41 |\n", "| n_updates | 6550 |\n", "| policy_gradient_loss | -0.000228 |\n", "| value_loss | 18.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 525 |\n", "| ep_rew_mean | 121 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 657 |\n", "| time_elapsed | 3134 |\n", "| total_timesteps | 1345536 |\n", "| train/ | |\n", "| approx_kl | 0.008808415 |\n", "| clip_fraction | 0.0756 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.431 |\n", "| explained_variance | 0.867 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.72 |\n", "| n_updates | 6560 |\n", "| policy_gradient_loss | -0.00175 |\n", "| value_loss | 15.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 529 |\n", "| ep_rew_mean | 122 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 658 |\n", "| time_elapsed | 3139 |\n", "| total_timesteps | 1347584 |\n", "| train/ | |\n", "| approx_kl | 0.003836998 |\n", "| clip_fraction | 0.04 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.316 |\n", "| explained_variance | 0.606 |\n", "| learning_rate | 0.0003 |\n", "| loss | 55.4 |\n", "| n_updates | 6570 |\n", "| policy_gradient_loss | -0.00255 |\n", "| value_loss | 132 |\n", 
"-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 529 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 659 |\n", "| time_elapsed | 3144 |\n", "| total_timesteps | 1349632 |\n", "| train/ | |\n", "| approx_kl | 0.0052296016 |\n", "| clip_fraction | 0.0649 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.356 |\n", "| explained_variance | 0.759 |\n", "| learning_rate | 0.0003 |\n", "| loss | 21.4 |\n", "| n_updates | 6580 |\n", "| policy_gradient_loss | -0.00247 |\n", "| value_loss | 79.9 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 530 |\n", "| ep_rew_mean | 124 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 660 |\n", "| time_elapsed | 3148 |\n", "| total_timesteps | 1351680 |\n", "| train/ | |\n", "| approx_kl | 0.006044627 |\n", "| clip_fraction | 0.0507 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.342 |\n", "| explained_variance | 0.694 |\n", "| learning_rate | 0.0003 |\n", "| loss | 66.2 |\n", "| n_updates | 6590 |\n", "| policy_gradient_loss | -0.00133 |\n", "| value_loss | 77 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 525 |\n", "| ep_rew_mean | 125 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 661 |\n", "| time_elapsed | 3153 |\n", "| total_timesteps | 1353728 |\n", "| train/ | |\n", "| approx_kl | 0.0063030953 |\n", "| clip_fraction | 0.0512 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.317 |\n", "| explained_variance | 0.842 |\n", "| learning_rate | 0.0003 |\n", "| loss | 17 |\n", "| n_updates | 6600 |\n", "| policy_gradient_loss | -0.00252 |\n", "| value_loss | 50.7 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 527 |\n", "| 
ep_rew_mean | 129 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 662 |\n", "| time_elapsed | 3158 |\n", "| total_timesteps | 1355776 |\n", "| train/ | |\n", "| approx_kl | 0.005034018 |\n", "| clip_fraction | 0.0384 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.385 |\n", "| explained_variance | 0.843 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.7 |\n", "| n_updates | 6610 |\n", "| policy_gradient_loss | -0.0022 |\n", "| value_loss | 22.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 526 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 663 |\n", "| time_elapsed | 3162 |\n", "| total_timesteps | 1357824 |\n", "| train/ | |\n", "| approx_kl | 0.004498868 |\n", "| clip_fraction | 0.0527 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.372 |\n", "| explained_variance | 0.954 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.41 |\n", "| n_updates | 6620 |\n", "| policy_gradient_loss | 0.00139 |\n", "| value_loss | 13.6 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 530 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 664 |\n", "| time_elapsed | 3167 |\n", "| total_timesteps | 1359872 |\n", "| train/ | |\n", "| approx_kl | 0.0022981986 |\n", "| clip_fraction | 0.0284 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.319 |\n", "| explained_variance | 0.434 |\n", "| learning_rate | 0.0003 |\n", "| loss | 18.6 |\n", "| n_updates | 6630 |\n", "| policy_gradient_loss | -0.000301 |\n", "| value_loss | 164 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 531 |\n", "| ep_rew_mean | 119 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 665 |\n", "| time_elapsed | 3172 |\n", "| total_timesteps | 1361920 
|\n", "| train/ | |\n", "| approx_kl | 0.008487377 |\n", "| clip_fraction | 0.0689 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.387 |\n", "| explained_variance | 0.736 |\n", "| learning_rate | 0.0003 |\n", "| loss | 51.7 |\n", "| n_updates | 6640 |\n", "| policy_gradient_loss | 0.00196 |\n", "| value_loss | 62.5 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 525 |\n", "| ep_rew_mean | 125 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 666 |\n", "| time_elapsed | 3176 |\n", "| total_timesteps | 1363968 |\n", "| train/ | |\n", "| approx_kl | 0.0029568546 |\n", "| clip_fraction | 0.0554 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.365 |\n", "| explained_variance | 0.667 |\n", "| learning_rate | 0.0003 |\n", "| loss | 36.7 |\n", "| n_updates | 6650 |\n", "| policy_gradient_loss | -0.00122 |\n", "| value_loss | 246 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 523 |\n", "| ep_rew_mean | 128 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 667 |\n", "| time_elapsed | 3181 |\n", "| total_timesteps | 1366016 |\n", "| train/ | |\n", "| approx_kl | 0.003722764 |\n", "| clip_fraction | 0.0596 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.359 |\n", "| explained_variance | 0.819 |\n", "| learning_rate | 0.0003 |\n", "| loss | 35.6 |\n", "| n_updates | 6660 |\n", "| policy_gradient_loss | 0.000168 |\n", "| value_loss | 75.2 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 531 |\n", "| ep_rew_mean | 127 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 668 |\n", "| time_elapsed | 3186 |\n", "| total_timesteps | 1368064 |\n", "| train/ | |\n", "| approx_kl | 0.00535537 |\n", "| clip_fraction | 0.0549 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.372 |\n", "| 
explained_variance | 0.75 |\n", "| learning_rate | 0.0003 |\n", "| loss | 26 |\n", "| n_updates | 6670 |\n", "| policy_gradient_loss | -0.000477 |\n", "| value_loss | 40 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 540 |\n", "| ep_rew_mean | 130 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 669 |\n", "| time_elapsed | 3190 |\n", "| total_timesteps | 1370112 |\n", "| train/ | |\n", "| approx_kl | 0.0031132577 |\n", "| clip_fraction | 0.0468 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.399 |\n", "| explained_variance | 0.658 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.4 |\n", "| n_updates | 6680 |\n", "| policy_gradient_loss | -0.00124 |\n", "| value_loss | 53 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 534 |\n", "| ep_rew_mean | 131 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 670 |\n", "| time_elapsed | 3195 |\n", "| total_timesteps | 1372160 |\n", "| train/ | |\n", "| approx_kl | 0.0026808484 |\n", "| clip_fraction | 0.0418 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.387 |\n", "| explained_variance | 0.644 |\n", "| learning_rate | 0.0003 |\n", "| loss | 36.5 |\n", "| n_updates | 6690 |\n", "| policy_gradient_loss | -0.000526 |\n", "| value_loss | 51.4 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 528 |\n", "| ep_rew_mean | 133 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 671 |\n", "| time_elapsed | 3200 |\n", "| total_timesteps | 1374208 |\n", "| train/ | |\n", "| approx_kl | 0.00489672 |\n", "| clip_fraction | 0.0563 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.392 |\n", "| explained_variance | 0.844 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.5 |\n", "| n_updates | 6700 |\n", "| policy_gradient_loss | 0.00104 |\n", 
"| value_loss | 41.3 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 525 |\n", "| ep_rew_mean | 139 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 672 |\n", "| time_elapsed | 3204 |\n", "| total_timesteps | 1376256 |\n", "| train/ | |\n", "| approx_kl | 0.007224474 |\n", "| clip_fraction | 0.0635 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.451 |\n", "| explained_variance | 0.841 |\n", "| learning_rate | 0.0003 |\n", "| loss | 9.35 |\n", "| n_updates | 6710 |\n", "| policy_gradient_loss | 0.000287 |\n", "| value_loss | 20.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 530 |\n", "| ep_rew_mean | 142 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 673 |\n", "| time_elapsed | 3209 |\n", "| total_timesteps | 1378304 |\n", "| train/ | |\n", "| approx_kl | 0.0075524854 |\n", "| clip_fraction | 0.0643 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.344 |\n", "| explained_variance | 0.917 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.89 |\n", "| n_updates | 6720 |\n", "| policy_gradient_loss | -0.00228 |\n", "| value_loss | 10.8 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 536 |\n", "| ep_rew_mean | 141 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 674 |\n", "| time_elapsed | 3214 |\n", "| total_timesteps | 1380352 |\n", "| train/ | |\n", "| approx_kl | 0.006372129 |\n", "| clip_fraction | 0.0571 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.37 |\n", "| explained_variance | 0.745 |\n", "| learning_rate | 0.0003 |\n", "| loss | 21.8 |\n", "| n_updates | 6730 |\n", "| policy_gradient_loss | -0.00105 |\n", "| value_loss | 53.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| 
ep_len_mean | 544 |\n", "| ep_rew_mean | 144 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 675 |\n", "| time_elapsed | 3219 |\n", "| total_timesteps | 1382400 |\n", "| train/ | |\n", "| approx_kl | 0.009212567 |\n", "| clip_fraction | 0.0601 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.411 |\n", "| explained_variance | 0.765 |\n", "| learning_rate | 0.0003 |\n", "| loss | 32.7 |\n", "| n_updates | 6740 |\n", "| policy_gradient_loss | -0.00158 |\n", "| value_loss | 75.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 544 |\n", "| ep_rew_mean | 144 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 676 |\n", "| time_elapsed | 3223 |\n", "| total_timesteps | 1384448 |\n", "| train/ | |\n", "| approx_kl | 0.004939094 |\n", "| clip_fraction | 0.0411 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.422 |\n", "| explained_variance | 0.897 |\n", "| learning_rate | 0.0003 |\n", "| loss | 8.01 |\n", "| n_updates | 6750 |\n", "| policy_gradient_loss | -6.3e-05 |\n", "| value_loss | 28.3 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 541 |\n", "| ep_rew_mean | 148 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 677 |\n", "| time_elapsed | 3228 |\n", "| total_timesteps | 1386496 |\n", "| train/ | |\n", "| approx_kl | 0.0076215724 |\n", "| clip_fraction | 0.0789 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.372 |\n", "| explained_variance | 0.9 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.31 |\n", "| n_updates | 6760 |\n", "| policy_gradient_loss | -0.00214 |\n", "| value_loss | 20.3 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 540 |\n", "| ep_rew_mean | 152 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 678 |\n", "| time_elapsed | 3233 |\n", "| 
total_timesteps | 1388544 |\n", "| train/ | |\n", "| approx_kl | 0.005975459 |\n", "| clip_fraction | 0.0562 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.351 |\n", "| explained_variance | 0.925 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.42 |\n", "| n_updates | 6770 |\n", "| policy_gradient_loss | -0.00241 |\n", "| value_loss | 19.3 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 546 |\n", "| ep_rew_mean | 151 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 679 |\n", "| time_elapsed | 3237 |\n", "| total_timesteps | 1390592 |\n", "| train/ | |\n", "| approx_kl | 0.00945029 |\n", "| clip_fraction | 0.0869 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.435 |\n", "| explained_variance | 0.818 |\n", "| learning_rate | 0.0003 |\n", "| loss | 40.4 |\n", "| n_updates | 6780 |\n", "| policy_gradient_loss | 6.23e-05 |\n", "| value_loss | 33.1 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 557 |\n", "| ep_rew_mean | 150 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 680 |\n", "| time_elapsed | 3241 |\n", "| total_timesteps | 1392640 |\n", "| train/ | |\n", "| approx_kl | 0.007817095 |\n", "| clip_fraction | 0.0761 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.537 |\n", "| explained_variance | 0.84 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.61 |\n", "| n_updates | 6790 |\n", "| policy_gradient_loss | -0.000382 |\n", "| value_loss | 12.9 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 554 |\n", "| ep_rew_mean | 147 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 681 |\n", "| time_elapsed | 3247 |\n", "| total_timesteps | 1394688 |\n", "| train/ | |\n", "| approx_kl | 0.00922651 |\n", "| clip_fraction | 0.0652 |\n", "| clip_range | 0.2 |\n", "| 
entropy_loss | -0.511 |\n", "| explained_variance | 0.846 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.09 |\n", "| n_updates | 6800 |\n", "| policy_gradient_loss | 0.000971 |\n", "| value_loss | 5.98 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 563 |\n", "| ep_rew_mean | 150 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 682 |\n", "| time_elapsed | 3251 |\n", "| total_timesteps | 1396736 |\n", "| train/ | |\n", "| approx_kl | 0.0040961727 |\n", "| clip_fraction | 0.0294 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.333 |\n", "| explained_variance | 0.648 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.96 |\n", "| n_updates | 6810 |\n", "| policy_gradient_loss | -0.0018 |\n", "| value_loss | 90.8 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 556 |\n", "| ep_rew_mean | 144 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 683 |\n", "| time_elapsed | 3255 |\n", "| total_timesteps | 1398784 |\n", "| train/ | |\n", "| approx_kl | 0.0046532224 |\n", "| clip_fraction | 0.0692 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.448 |\n", "| explained_variance | 0.849 |\n", "| learning_rate | 0.0003 |\n", "| loss | 20.7 |\n", "| n_updates | 6820 |\n", "| policy_gradient_loss | -0.00112 |\n", "| value_loss | 34 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 543 |\n", "| ep_rew_mean | 149 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 684 |\n", "| time_elapsed | 3261 |\n", "| total_timesteps | 1400832 |\n", "| train/ | |\n", "| approx_kl | 0.0054458007 |\n", "| clip_fraction | 0.0425 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.312 |\n", "| explained_variance | 0.571 |\n", "| learning_rate | 0.0003 |\n", "| loss | 33.3 |\n", "| n_updates | 6830 |\n", 
"| policy_gradient_loss | -0.00107 |\n", "| value_loss | 143 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 550 |\n", "| ep_rew_mean | 152 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 685 |\n", "| time_elapsed | 3265 |\n", "| total_timesteps | 1402880 |\n", "| train/ | |\n", "| approx_kl | 0.004553126 |\n", "| clip_fraction | 0.0446 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.343 |\n", "| explained_variance | 0.901 |\n", "| learning_rate | 0.0003 |\n", "| loss | 3.55 |\n", "| n_updates | 6840 |\n", "| policy_gradient_loss | -0.0012 |\n", "| value_loss | 25.1 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 557 |\n", "| ep_rew_mean | 152 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 686 |\n", "| time_elapsed | 3270 |\n", "| total_timesteps | 1404928 |\n", "| train/ | |\n", "| approx_kl | 0.0067462837 |\n", "| clip_fraction | 0.0728 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.422 |\n", "| explained_variance | 0.81 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.34 |\n", "| n_updates | 6850 |\n", "| policy_gradient_loss | -0.00794 |\n", "| value_loss | 37.1 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 554 |\n", "| ep_rew_mean | 154 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 687 |\n", "| time_elapsed | 3275 |\n", "| total_timesteps | 1406976 |\n", "| train/ | |\n", "| approx_kl | 0.0055950633 |\n", "| clip_fraction | 0.0534 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.371 |\n", "| explained_variance | 0.911 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.91 |\n", "| n_updates | 6860 |\n", "| policy_gradient_loss | -0.0019 |\n", "| value_loss | 21.9 |\n", "------------------------------------------\n", 
"------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 559 |\n", "| ep_rew_mean | 154 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 688 |\n", "| time_elapsed | 3279 |\n", "| total_timesteps | 1409024 |\n", "| train/ | |\n", "| approx_kl | 0.0042046653 |\n", "| clip_fraction | 0.0455 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.348 |\n", "| explained_variance | 0.715 |\n", "| learning_rate | 0.0003 |\n", "| loss | 3.76 |\n", "| n_updates | 6870 |\n", "| policy_gradient_loss | -0.000329 |\n", "| value_loss | 68.3 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 557 |\n", "| ep_rew_mean | 154 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 689 |\n", "| time_elapsed | 3284 |\n", "| total_timesteps | 1411072 |\n", "| train/ | |\n", "| approx_kl | 0.0052424176 |\n", "| clip_fraction | 0.0758 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.455 |\n", "| explained_variance | 0.869 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.63 |\n", "| n_updates | 6880 |\n", "| policy_gradient_loss | 0.00253 |\n", "| value_loss | 7.46 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 551 |\n", "| ep_rew_mean | 156 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 690 |\n", "| time_elapsed | 3289 |\n", "| total_timesteps | 1413120 |\n", "| train/ | |\n", "| approx_kl | 0.007667018 |\n", "| clip_fraction | 0.0539 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.424 |\n", "| explained_variance | 0.863 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.71 |\n", "| n_updates | 6890 |\n", "| policy_gradient_loss | -0.000562 |\n", "| value_loss | 18.4 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 548 |\n", "| ep_rew_mean | 159 |\n", "| time/ | |\n", "| 
fps | 429 |\n", "| iterations | 691 |\n", "| time_elapsed | 3293 |\n", "| total_timesteps | 1415168 |\n", "| train/ | |\n", "| approx_kl | 0.0043840054 |\n", "| clip_fraction | 0.0531 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.344 |\n", "| explained_variance | 0.762 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.2 |\n", "| n_updates | 6900 |\n", "| policy_gradient_loss | -0.00183 |\n", "| value_loss | 38.6 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 552 |\n", "| ep_rew_mean | 162 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 692 |\n", "| time_elapsed | 3298 |\n", "| total_timesteps | 1417216 |\n", "| train/ | |\n", "| approx_kl | 0.003601518 |\n", "| clip_fraction | 0.0359 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.373 |\n", "| explained_variance | 0.661 |\n", "| learning_rate | 0.0003 |\n", "| loss | 44.3 |\n", "| n_updates | 6910 |\n", "| policy_gradient_loss | -0.00225 |\n", "| value_loss | 49.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 557 |\n", "| ep_rew_mean | 168 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 693 |\n", "| time_elapsed | 3303 |\n", "| total_timesteps | 1419264 |\n", "| train/ | |\n", "| approx_kl | 0.010243534 |\n", "| clip_fraction | 0.0858 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.425 |\n", "| explained_variance | 0.805 |\n", "| learning_rate | 0.0003 |\n", "| loss | 13.6 |\n", "| n_updates | 6920 |\n", "| policy_gradient_loss | -0.000623 |\n", "| value_loss | 16.7 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 554 |\n", "| ep_rew_mean | 168 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 694 |\n", "| time_elapsed | 3307 |\n", "| total_timesteps | 1421312 |\n", "| train/ | |\n", "| approx_kl | 
0.0051258677 |\n", "| clip_fraction | 0.046 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.341 |\n", "| explained_variance | 0.831 |\n", "| learning_rate | 0.0003 |\n", "| loss | 13.9 |\n", "| n_updates | 6930 |\n", "| policy_gradient_loss | 0.000321 |\n", "| value_loss | 32.6 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 545 |\n", "| ep_rew_mean | 174 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 695 |\n", "| time_elapsed | 3312 |\n", "| total_timesteps | 1423360 |\n", "| train/ | |\n", "| approx_kl | 0.0030902482 |\n", "| clip_fraction | 0.0365 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.33 |\n", "| explained_variance | 0.864 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.75 |\n", "| n_updates | 6940 |\n", "| policy_gradient_loss | -0.00197 |\n", "| value_loss | 36.9 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 545 |\n", "| ep_rew_mean | 173 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 696 |\n", "| time_elapsed | 3317 |\n", "| total_timesteps | 1425408 |\n", "| train/ | |\n", "| approx_kl | 0.003730912 |\n", "| clip_fraction | 0.0494 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.328 |\n", "| explained_variance | 0.839 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.77 |\n", "| n_updates | 6950 |\n", "| policy_gradient_loss | -0.00259 |\n", "| value_loss | 40 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 549 |\n", "| ep_rew_mean | 173 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 697 |\n", "| time_elapsed | 3321 |\n", "| total_timesteps | 1427456 |\n", "| train/ | |\n", "| approx_kl | 0.0056591285 |\n", "| clip_fraction | 0.0367 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.317 |\n", "| explained_variance | 0.827 |\n", "| 
learning_rate | 0.0003 |\n", "| loss | 19.5 |\n", "| n_updates | 6960 |\n", "| policy_gradient_loss | -0.00122 |\n", "| value_loss | 32.3 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 555 |\n", "| ep_rew_mean | 172 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 698 |\n", "| time_elapsed | 3326 |\n", "| total_timesteps | 1429504 |\n", "| train/ | |\n", "| approx_kl | 0.004573603 |\n", "| clip_fraction | 0.0423 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.257 |\n", "| explained_variance | 0.6 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4 |\n", "| n_updates | 6970 |\n", "| policy_gradient_loss | -0.000102 |\n", "| value_loss | 23.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 546 |\n", "| ep_rew_mean | 167 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 699 |\n", "| time_elapsed | 3331 |\n", "| total_timesteps | 1431552 |\n", "| train/ | |\n", "| approx_kl | 0.007859121 |\n", "| clip_fraction | 0.095 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.479 |\n", "| explained_variance | 0.818 |\n", "| learning_rate | 0.0003 |\n", "| loss | 16.5 |\n", "| n_updates | 6980 |\n", "| policy_gradient_loss | 0.000627 |\n", "| value_loss | 9.09 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 548 |\n", "| ep_rew_mean | 171 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 700 |\n", "| time_elapsed | 3335 |\n", "| total_timesteps | 1433600 |\n", "| train/ | |\n", "| approx_kl | 0.0074086105 |\n", "| clip_fraction | 0.0559 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.351 |\n", "| explained_variance | 0.657 |\n", "| learning_rate | 0.0003 |\n", "| loss | 40.4 |\n", "| n_updates | 6990 |\n", "| policy_gradient_loss | -0.00244 |\n", "| value_loss | 127 |\n", 
"------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 546 |\n", "| ep_rew_mean | 172 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 701 |\n", "| time_elapsed | 3340 |\n", "| total_timesteps | 1435648 |\n", "| train/ | |\n", "| approx_kl | 0.0023174845 |\n", "| clip_fraction | 0.0364 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.296 |\n", "| explained_variance | 0.49 |\n", "| learning_rate | 0.0003 |\n", "| loss | 39.4 |\n", "| n_updates | 7000 |\n", "| policy_gradient_loss | -0.003 |\n", "| value_loss | 150 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 545 |\n", "| ep_rew_mean | 170 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 702 |\n", "| time_elapsed | 3345 |\n", "| total_timesteps | 1437696 |\n", "| train/ | |\n", "| approx_kl | 0.007208988 |\n", "| clip_fraction | 0.0509 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.377 |\n", "| explained_variance | 0.686 |\n", "| learning_rate | 0.0003 |\n", "| loss | 12.2 |\n", "| n_updates | 7010 |\n", "| policy_gradient_loss | -0.0013 |\n", "| value_loss | 44.6 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 550 |\n", "| ep_rew_mean | 169 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 703 |\n", "| time_elapsed | 3350 |\n", "| total_timesteps | 1439744 |\n", "| train/ | |\n", "| approx_kl | 0.0074947923 |\n", "| clip_fraction | 0.0518 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.335 |\n", "| explained_variance | 0.707 |\n", "| learning_rate | 0.0003 |\n", "| loss | 42.1 |\n", "| n_updates | 7020 |\n", "| policy_gradient_loss | -0.00272 |\n", "| value_loss | 38.6 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 551 |\n", "| 
ep_rew_mean | 168 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 704 |\n", "| time_elapsed | 3354 |\n", "| total_timesteps | 1441792 |\n", "| train/ | |\n", "| approx_kl | 0.00439015 |\n", "| clip_fraction | 0.0273 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.264 |\n", "| explained_variance | 0.548 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.93 |\n", "| n_updates | 7030 |\n", "| policy_gradient_loss | -0.000712 |\n", "| value_loss | 90 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 547 |\n", "| ep_rew_mean | 169 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 705 |\n", "| time_elapsed | 3359 |\n", "| total_timesteps | 1443840 |\n", "| train/ | |\n", "| approx_kl | 0.005838719 |\n", "| clip_fraction | 0.0566 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.334 |\n", "| explained_variance | 0.881 |\n", "| learning_rate | 0.0003 |\n", "| loss | 28.9 |\n", "| n_updates | 7040 |\n", "| policy_gradient_loss | -0.00211 |\n", "| value_loss | 33.2 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 530 |\n", "| ep_rew_mean | 172 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 706 |\n", "| time_elapsed | 3364 |\n", "| total_timesteps | 1445888 |\n", "| train/ | |\n", "| approx_kl | 0.0087067345 |\n", "| clip_fraction | 0.0808 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.43 |\n", "| explained_variance | 0.829 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.1 |\n", "| n_updates | 7050 |\n", "| policy_gradient_loss | -0.000353 |\n", "| value_loss | 25.7 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 524 |\n", "| ep_rew_mean | 176 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 707 |\n", "| time_elapsed | 3369 |\n", "| total_timesteps | 1447936 |\n", 
"| train/ | |\n", "| approx_kl | 0.00368839 |\n", "| clip_fraction | 0.0475 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.324 |\n", "| explained_variance | 0.834 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.5 |\n", "| n_updates | 7060 |\n", "| policy_gradient_loss | -0.000359 |\n", "| value_loss | 54.9 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 522 |\n", "| ep_rew_mean | 181 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 708 |\n", "| time_elapsed | 3373 |\n", "| total_timesteps | 1449984 |\n", "| train/ | |\n", "| approx_kl | 0.0048118625 |\n", "| clip_fraction | 0.0517 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.299 |\n", "| explained_variance | 0.858 |\n", "| learning_rate | 0.0003 |\n", "| loss | 8.58 |\n", "| n_updates | 7070 |\n", "| policy_gradient_loss | -0.000623 |\n", "| value_loss | 31.4 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 526 |\n", "| ep_rew_mean | 182 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 709 |\n", "| time_elapsed | 3378 |\n", "| total_timesteps | 1452032 |\n", "| train/ | |\n", "| approx_kl | 0.007132729 |\n", "| clip_fraction | 0.052 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.335 |\n", "| explained_variance | 0.852 |\n", "| learning_rate | 0.0003 |\n", "| loss | 32.2 |\n", "| n_updates | 7080 |\n", "| policy_gradient_loss | -0.000645 |\n", "| value_loss | 35.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 534 |\n", "| ep_rew_mean | 181 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 710 |\n", "| time_elapsed | 3383 |\n", "| total_timesteps | 1454080 |\n", "| train/ | |\n", "| approx_kl | 0.004435127 |\n", "| clip_fraction | 0.0834 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.395 |\n", "| 
explained_variance | 0.843 |\n", "| learning_rate | 0.0003 |\n", "| loss | 16.3 |\n", "| n_updates | 7090 |\n", "| policy_gradient_loss | 0.000368 |\n", "| value_loss | 19.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 523 |\n", "| ep_rew_mean | 181 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 711 |\n", "| time_elapsed | 3387 |\n", "| total_timesteps | 1456128 |\n", "| train/ | |\n", "| approx_kl | 0.008041816 |\n", "| clip_fraction | 0.0756 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.412 |\n", "| explained_variance | 0.875 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.87 |\n", "| n_updates | 7100 |\n", "| policy_gradient_loss | -6.96e-05 |\n", "| value_loss | 20.8 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 521 |\n", "| ep_rew_mean | 180 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 712 |\n", "| time_elapsed | 3392 |\n", "| total_timesteps | 1458176 |\n", "| train/ | |\n", "| approx_kl | 0.0027350239 |\n", "| clip_fraction | 0.0387 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.319 |\n", "| explained_variance | 0.604 |\n", "| learning_rate | 0.0003 |\n", "| loss | 14.2 |\n", "| n_updates | 7110 |\n", "| policy_gradient_loss | -0.000576 |\n", "| value_loss | 79.5 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 517 |\n", "| ep_rew_mean | 183 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 713 |\n", "| time_elapsed | 3397 |\n", "| total_timesteps | 1460224 |\n", "| train/ | |\n", "| approx_kl | 0.003070236 |\n", "| clip_fraction | 0.0432 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.318 |\n", "| explained_variance | 0.944 |\n", "| learning_rate | 0.0003 |\n", "| loss | 10.6 |\n", "| n_updates | 7120 |\n", "| policy_gradient_loss | 
0.000213 |\n", "| value_loss | 18.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 516 |\n", "| ep_rew_mean | 180 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 714 |\n", "| time_elapsed | 3401 |\n", "| total_timesteps | 1462272 |\n", "| train/ | |\n", "| approx_kl | 0.006038408 |\n", "| clip_fraction | 0.0606 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.328 |\n", "| explained_variance | 0.876 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.12 |\n", "| n_updates | 7130 |\n", "| policy_gradient_loss | -0.00107 |\n", "| value_loss | 33.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 514 |\n", "| ep_rew_mean | 177 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 715 |\n", "| time_elapsed | 3406 |\n", "| total_timesteps | 1464320 |\n", "| train/ | |\n", "| approx_kl | 0.004517485 |\n", "| clip_fraction | 0.0445 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.365 |\n", "| explained_variance | 0.86 |\n", "| learning_rate | 0.0003 |\n", "| loss | 11.7 |\n", "| n_updates | 7140 |\n", "| policy_gradient_loss | 0.00068 |\n", "| value_loss | 38.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 516 |\n", "| ep_rew_mean | 176 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 716 |\n", "| time_elapsed | 3411 |\n", "| total_timesteps | 1466368 |\n", "| train/ | |\n", "| approx_kl | 0.005280545 |\n", "| clip_fraction | 0.0707 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.329 |\n", "| explained_variance | 0.661 |\n", "| learning_rate | 0.0003 |\n", "| loss | 16.1 |\n", "| n_updates | 7150 |\n", "| policy_gradient_loss | -0.00261 |\n", "| value_loss | 139 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | 
|\n", "| ep_len_mean | 504 |\n", "| ep_rew_mean | 175 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 717 |\n", "| time_elapsed | 3416 |\n", "| total_timesteps | 1468416 |\n", "| train/ | |\n", "| approx_kl | 0.0070655555 |\n", "| clip_fraction | 0.0409 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.263 |\n", "| explained_variance | 0.851 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.76 |\n", "| n_updates | 7160 |\n", "| policy_gradient_loss | -0.000188 |\n", "| value_loss | 20.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 503 |\n", "| ep_rew_mean | 173 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 718 |\n", "| time_elapsed | 3421 |\n", "| total_timesteps | 1470464 |\n", "| train/ | |\n", "| approx_kl | 0.005839767 |\n", "| clip_fraction | 0.065 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.295 |\n", "| explained_variance | 0.585 |\n", "| learning_rate | 0.0003 |\n", "| loss | 156 |\n", "| n_updates | 7170 |\n", "| policy_gradient_loss | -0.00349 |\n", "| value_loss | 124 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 504 |\n", "| ep_rew_mean | 173 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 719 |\n", "| time_elapsed | 3425 |\n", "| total_timesteps | 1472512 |\n", "| train/ | |\n", "| approx_kl | 0.006070084 |\n", "| clip_fraction | 0.0559 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.331 |\n", "| explained_variance | 0.669 |\n", "| learning_rate | 0.0003 |\n", "| loss | 114 |\n", "| n_updates | 7180 |\n", "| policy_gradient_loss | -0.00186 |\n", "| value_loss | 84.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 514 |\n", "| ep_rew_mean | 169 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 720 |\n", "| time_elapsed | 3430 
|\n", "| total_timesteps | 1474560 |\n", "| train/ | |\n", "| approx_kl | 0.013455278 |\n", "| clip_fraction | 0.102 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.417 |\n", "| explained_variance | 0.845 |\n", "| learning_rate | 0.0003 |\n", "| loss | 6.15 |\n", "| n_updates | 7190 |\n", "| policy_gradient_loss | 0.00402 |\n", "| value_loss | 7.92 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 518 |\n", "| ep_rew_mean | 163 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 721 |\n", "| time_elapsed | 3435 |\n", "| total_timesteps | 1476608 |\n", "| train/ | |\n", "| approx_kl | 0.008661312 |\n", "| clip_fraction | 0.1 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.397 |\n", "| explained_variance | 0.671 |\n", "| learning_rate | 0.0003 |\n", "| loss | 82.9 |\n", "| n_updates | 7200 |\n", "| policy_gradient_loss | -0.00602 |\n", "| value_loss | 88.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 518 |\n", "| ep_rew_mean | 164 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 722 |\n", "| time_elapsed | 3439 |\n", "| total_timesteps | 1478656 |\n", "| train/ | |\n", "| approx_kl | 0.006998668 |\n", "| clip_fraction | 0.0761 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.414 |\n", "| explained_variance | 0.632 |\n", "| learning_rate | 0.0003 |\n", "| loss | 4.6 |\n", "| n_updates | 7210 |\n", "| policy_gradient_loss | -0.000438 |\n", "| value_loss | 90.3 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 509 |\n", "| ep_rew_mean | 168 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 723 |\n", "| time_elapsed | 3444 |\n", "| total_timesteps | 1480704 |\n", "| train/ | |\n", "| approx_kl | 0.0040886123 |\n", "| clip_fraction | 0.0592 |\n", "| clip_range | 0.2 |\n", 
"| entropy_loss | -0.412 |\n", "| explained_variance | 0.771 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.27 |\n", "| n_updates | 7220 |\n", "| policy_gradient_loss | 0.00137 |\n", "| value_loss | 27 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 507 |\n", "| ep_rew_mean | 171 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 724 |\n", "| time_elapsed | 3448 |\n", "| total_timesteps | 1482752 |\n", "| train/ | |\n", "| approx_kl | 0.0073271873 |\n", "| clip_fraction | 0.0358 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.308 |\n", "| explained_variance | 0.584 |\n", "| learning_rate | 0.0003 |\n", "| loss | 15.3 |\n", "| n_updates | 7230 |\n", "| policy_gradient_loss | -0.00263 |\n", "| value_loss | 91.3 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 502 |\n", "| ep_rew_mean | 163 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 725 |\n", "| time_elapsed | 3453 |\n", "| total_timesteps | 1484800 |\n", "| train/ | |\n", "| approx_kl | 0.007110974 |\n", "| clip_fraction | 0.0533 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.284 |\n", "| explained_variance | 0.904 |\n", "| learning_rate | 0.0003 |\n", "| loss | 2.76 |\n", "| n_updates | 7240 |\n", "| policy_gradient_loss | -0.000323 |\n", "| value_loss | 25.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 501 |\n", "| ep_rew_mean | 164 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 726 |\n", "| time_elapsed | 3458 |\n", "| total_timesteps | 1486848 |\n", "| train/ | |\n", "| approx_kl | 0.002882342 |\n", "| clip_fraction | 0.0281 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.303 |\n", "| explained_variance | 0.68 |\n", "| learning_rate | 0.0003 |\n", "| loss | 90.8 |\n", "| n_updates | 7250 |\n", 
"| policy_gradient_loss | -0.00461 |\n", "| value_loss | 291 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 502 |\n", "| ep_rew_mean | 163 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 727 |\n", "| time_elapsed | 3462 |\n", "| total_timesteps | 1488896 |\n", "| train/ | |\n", "| approx_kl | 0.01011255 |\n", "| clip_fraction | 0.0746 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.402 |\n", "| explained_variance | 0.721 |\n", "| learning_rate | 0.0003 |\n", "| loss | 5.08 |\n", "| n_updates | 7260 |\n", "| policy_gradient_loss | -0.00265 |\n", "| value_loss | 33.3 |\n", "----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 509 |\n", "| ep_rew_mean | 163 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 728 |\n", "| time_elapsed | 3467 |\n", "| total_timesteps | 1490944 |\n", "| train/ | |\n", "| approx_kl | 0.0037286947 |\n", "| clip_fraction | 0.0428 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.357 |\n", "| explained_variance | 0.844 |\n", "| learning_rate | 0.0003 |\n", "| loss | 23.8 |\n", "| n_updates | 7270 |\n", "| policy_gradient_loss | 0.000588 |\n", "| value_loss | 22.5 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 509 |\n", "| ep_rew_mean | 163 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 729 |\n", "| time_elapsed | 3472 |\n", "| total_timesteps | 1492992 |\n", "| train/ | |\n", "| approx_kl | 0.0097370185 |\n", "| clip_fraction | 0.0729 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.435 |\n", "| explained_variance | 0.803 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.71 |\n", "| n_updates | 7280 |\n", "| policy_gradient_loss | 0.000102 |\n", "| value_loss | 10.9 |\n", "------------------------------------------\n", 
"------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 519 |\n", "| ep_rew_mean | 161 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 730 |\n", "| time_elapsed | 3476 |\n", "| total_timesteps | 1495040 |\n", "| train/ | |\n", "| approx_kl | 0.0029288619 |\n", "| clip_fraction | 0.0337 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.3 |\n", "| explained_variance | 0.751 |\n", "| learning_rate | 0.0003 |\n", "| loss | 47.6 |\n", "| n_updates | 7290 |\n", "| policy_gradient_loss | 0.0017 |\n", "| value_loss | 48.6 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 516 |\n", "| ep_rew_mean | 164 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 731 |\n", "| time_elapsed | 3481 |\n", "| total_timesteps | 1497088 |\n", "| train/ | |\n", "| approx_kl | 0.0042147515 |\n", "| clip_fraction | 0.0861 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.492 |\n", "| explained_variance | 0.659 |\n", "| learning_rate | 0.0003 |\n", "| loss | 0.999 |\n", "| n_updates | 7300 |\n", "| policy_gradient_loss | 0.00347 |\n", "| value_loss | 3.12 |\n", "------------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 521 |\n", "| ep_rew_mean | 164 |\n", "| time/ | |\n", "| fps | 429 |\n", "| iterations | 732 |\n", "| time_elapsed | 3486 |\n", "| total_timesteps | 1499136 |\n", "| train/ | |\n", "| approx_kl | 0.0039165737 |\n", "| clip_fraction | 0.0345 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.261 |\n", "| explained_variance | 0.883 |\n", "| learning_rate | 0.0003 |\n", "| loss | 7.91 |\n", "| n_updates | 7310 |\n", "| policy_gradient_loss | -0.0016 |\n", "| value_loss | 25.7 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 523 |\n", "| ep_rew_mean | 163 |\n", "| time/ | |\n", "| fps 
| 430 |\n", "| iterations | 733 |\n", "| time_elapsed | 3490 |\n", "| total_timesteps | 1501184 |\n", "| train/ | |\n", "| approx_kl | 0.008056066 |\n", "| clip_fraction | 0.065 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.399 |\n", "| explained_variance | 0.882 |\n", "| learning_rate | 0.0003 |\n", "| loss | 1.44 |\n", "| n_updates | 7320 |\n", "| policy_gradient_loss | -0.00138 |\n", "| value_loss | 9.35 |\n", "-----------------------------------------\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": {}, "execution_count": 27 } ], "source": [ "# Define a PPO agent with an MlpPolicy architecture\n", "# We use a multilayer perceptron (MlpPolicy) because the input is a vector;\n", "# if we had frames as input we would use CnnPolicy\n", "model = PPO('MlpPolicy', env, verbose=1)\n", "model.learn(total_timesteps=1500000)" ] }, { "cell_type": "code", "source": [ "# Save the model\n", "model_name = \"ppo-LunarLander-v2\"\n", "model.save(model_name)" ], "metadata": { "id": "KZt4GfK3auci" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Train a second PPO agent with non-default hyperparameters\n", "model_tuned = PPO('MlpPolicy', env, verbose=1,\n", "                  n_steps=2**9,\n", "                  gamma=1 - 0.006075594024321983,\n", "                  max_grad_norm=1.8559426752164974,\n", "                  learning_rate=0.0011176199638550707)\n", "model_tuned.learn(total_timesteps=1500000)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "fCKw74-Kjbn9", "outputId": "00bd8151-d9f5-4927-9fa0-68585dc634a6" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Using cuda device\n", "---------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 87.7 |\n", "| ep_rew_mean | -178 |\n", "| time/ | |\n", "| fps | 2010 |\n", "| iterations | 1 |\n", "| time_elapsed | 4 |\n", "| total_timesteps | 8192 |\n", "---------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 98.4 |\n", 
"| ep_rew_mean | -158 |\n", "| time/ | |\n", "| fps | 1340 |\n", "| iterations | 2 |\n", "| time_elapsed | 12 |\n", "| total_timesteps | 16384 |\n", "| train/ | |\n", "| approx_kl | 0.009433961 |\n", "| clip_fraction | 0.083 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.38 |\n", "| explained_variance | -0.00253 |\n", "| learning_rate | 0.00112 |\n", "| loss | 271 |\n", "| n_updates | 10 |\n", "| policy_gradient_loss | -0.00621 |\n", "| value_loss | 818 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 100 |\n", "| ep_rew_mean | -108 |\n", "| time/ | |\n", "| fps | 1204 |\n", "| iterations | 3 |\n", "| time_elapsed | 20 |\n", "| total_timesteps | 24576 |\n", "| train/ | |\n", "| approx_kl | 0.01248903 |\n", "| clip_fraction | 0.174 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.35 |\n", "| explained_variance | 0.516 |\n", "| learning_rate | 0.00112 |\n", "| loss | 87.4 |\n", "| n_updates | 20 |\n", "| policy_gradient_loss | -0.0163 |\n", "| value_loss | 252 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 105 |\n", "| ep_rew_mean | -88.3 |\n", "| time/ | |\n", "| fps | 1135 |\n", "| iterations | 4 |\n", "| time_elapsed | 28 |\n", "| total_timesteps | 32768 |\n", "| train/ | |\n", "| approx_kl | 0.013888482 |\n", "| clip_fraction | 0.236 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.32 |\n", "| explained_variance | 0.699 |\n", "| learning_rate | 0.00112 |\n", "| loss | 68.6 |\n", "| n_updates | 30 |\n", "| policy_gradient_loss | -0.021 |\n", "| value_loss | 168 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 113 |\n", "| ep_rew_mean | -58.8 |\n", "| time/ | |\n", "| fps | 1119 |\n", "| iterations | 5 |\n", "| time_elapsed | 36 |\n", "| total_timesteps | 40960 |\n", "| train/ | |\n", "| 
approx_kl | 0.013373473 |\n", "| clip_fraction | 0.238 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.27 |\n", "| explained_variance | 0.791 |\n", "| learning_rate | 0.00112 |\n", "| loss | 37.2 |\n", "| n_updates | 40 |\n", "| policy_gradient_loss | -0.0221 |\n", "| value_loss | 84 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 124 |\n", "| ep_rew_mean | -42.8 |\n", "| time/ | |\n", "| fps | 1071 |\n", "| iterations | 6 |\n", "| time_elapsed | 45 |\n", "| total_timesteps | 49152 |\n", "| train/ | |\n", "| approx_kl | 0.01669573 |\n", "| clip_fraction | 0.276 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.2 |\n", "| explained_variance | 0.84 |\n", "| learning_rate | 0.00112 |\n", "| loss | 20.6 |\n", "| n_updates | 50 |\n", "| policy_gradient_loss | -0.0237 |\n", "| value_loss | 62 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 158 |\n", "| ep_rew_mean | -35.6 |\n", "| time/ | |\n", "| fps | 998 |\n", "| iterations | 7 |\n", "| time_elapsed | 57 |\n", "| total_timesteps | 57344 |\n", "| train/ | |\n", "| approx_kl | 0.015557964 |\n", "| clip_fraction | 0.244 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.14 |\n", "| explained_variance | 0.892 |\n", "| learning_rate | 0.00112 |\n", "| loss | 20.6 |\n", "| n_updates | 60 |\n", "| policy_gradient_loss | -0.0188 |\n", "| value_loss | 48.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 202 |\n", "| ep_rew_mean | -32.3 |\n", "| time/ | |\n", "| fps | 945 |\n", "| iterations | 8 |\n", "| time_elapsed | 69 |\n", "| total_timesteps | 65536 |\n", "| train/ | |\n", "| approx_kl | 0.015042961 |\n", "| clip_fraction | 0.211 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.12 |\n", "| explained_variance | 0.937 |\n", "| learning_rate | 0.00112 
|\n", "| loss | 9.6 |\n", "| n_updates | 70 |\n", "| policy_gradient_loss | -0.0162 |\n", "| value_loss | 29.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 235 |\n", "| ep_rew_mean | -25.8 |\n", "| time/ | |\n", "| fps | 893 |\n", "| iterations | 9 |\n", "| time_elapsed | 82 |\n", "| total_timesteps | 73728 |\n", "| train/ | |\n", "| approx_kl | 0.013015391 |\n", "| clip_fraction | 0.173 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.09 |\n", "| explained_variance | 0.937 |\n", "| learning_rate | 0.00112 |\n", "| loss | 9.45 |\n", "| n_updates | 80 |\n", "| policy_gradient_loss | -0.0117 |\n", "| value_loss | 33.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 288 |\n", "| ep_rew_mean | -16.4 |\n", "| time/ | |\n", "| fps | 842 |\n", "| iterations | 10 |\n", "| time_elapsed | 97 |\n", "| total_timesteps | 81920 |\n", "| train/ | |\n", "| approx_kl | 0.009876698 |\n", "| clip_fraction | 0.147 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.03 |\n", "| explained_variance | 0.913 |\n", "| learning_rate | 0.00112 |\n", "| loss | 33.8 |\n", "| n_updates | 90 |\n", "| policy_gradient_loss | -0.00702 |\n", "| value_loss | 48.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 368 |\n", "| ep_rew_mean | -3.64 |\n", "| time/ | |\n", "| fps | 809 |\n", "| iterations | 11 |\n", "| time_elapsed | 111 |\n", "| total_timesteps | 90112 |\n", "| train/ | |\n", "| approx_kl | 0.010058949 |\n", "| clip_fraction | 0.132 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.04 |\n", "| explained_variance | 0.861 |\n", "| learning_rate | 0.00112 |\n", "| loss | 8.21 |\n", "| n_updates | 100 |\n", "| policy_gradient_loss | -0.00594 |\n", "| value_loss | 62.2 |\n", "-----------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 407 |\n", "| ep_rew_mean | 1.11 |\n", "| time/ | |\n", "| fps | 777 |\n", "| iterations | 12 |\n", "| time_elapsed | 126 |\n", "| total_timesteps | 98304 |\n", "| train/ | |\n", "| approx_kl | 0.012389878 |\n", "| clip_fraction | 0.146 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.01 |\n", "| explained_variance | 0.939 |\n", "| learning_rate | 0.00112 |\n", "| loss | 8.92 |\n", "| n_updates | 110 |\n", "| policy_gradient_loss | -0.00553 |\n", "| value_loss | 34.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 482 |\n", "| ep_rew_mean | 8.83 |\n", "| time/ | |\n", "| fps | 758 |\n", "| iterations | 13 |\n", "| time_elapsed | 140 |\n", "| total_timesteps | 106496 |\n", "| train/ | |\n", "| approx_kl | 0.011692753 |\n", "| clip_fraction | 0.145 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.96 |\n", "| explained_variance | 0.86 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.98 |\n", "| n_updates | 120 |\n", "| policy_gradient_loss | -0.00677 |\n", "| value_loss | 41.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 508 |\n", "| ep_rew_mean | 14.3 |\n", "| time/ | |\n", "| fps | 733 |\n", "| iterations | 14 |\n", "| time_elapsed | 156 |\n", "| total_timesteps | 114688 |\n", "| train/ | |\n", "| approx_kl | 0.010017732 |\n", "| clip_fraction | 0.124 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.01 |\n", "| explained_variance | 0.925 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.22 |\n", "| n_updates | 130 |\n", "| policy_gradient_loss | -0.00546 |\n", "| value_loss | 18.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 580 |\n", "| ep_rew_mean | 30.1 |\n", "| time/ | |\n", "| fps | 722 |\n", "| 
iterations | 15 |\n", "| time_elapsed | 170 |\n", "| total_timesteps | 122880 |\n", "| train/ | |\n", "| approx_kl | 0.018968552 |\n", "| clip_fraction | 0.213 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -1.01 |\n", "| explained_variance | 0.961 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.61 |\n", "| n_updates | 140 |\n", "| policy_gradient_loss | -0.0072 |\n", "| value_loss | 5.93 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 621 |\n", "| ep_rew_mean | 41 |\n", "| time/ | |\n", "| fps | 709 |\n", "| iterations | 16 |\n", "| time_elapsed | 184 |\n", "| total_timesteps | 131072 |\n", "| train/ | |\n", "| approx_kl | 0.00793292 |\n", "| clip_fraction | 0.096 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.975 |\n", "| explained_variance | 0.939 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.53 |\n", "| n_updates | 150 |\n", "| policy_gradient_loss | -0.00597 |\n", "| value_loss | 17.5 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 693 |\n", "| ep_rew_mean | 59.5 |\n", "| time/ | |\n", "| fps | 700 |\n", "| iterations | 17 |\n", "| time_elapsed | 198 |\n", "| total_timesteps | 139264 |\n", "| train/ | |\n", "| approx_kl | 0.010770112 |\n", "| clip_fraction | 0.121 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.942 |\n", "| explained_variance | 0.939 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.08 |\n", "| n_updates | 160 |\n", "| policy_gradient_loss | -0.00436 |\n", "| value_loss | 14.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 729 |\n", "| ep_rew_mean | 72.2 |\n", "| time/ | |\n", "| fps | 694 |\n", "| iterations | 18 |\n", "| time_elapsed | 212 |\n", "| total_timesteps | 147456 |\n", "| train/ | |\n", "| approx_kl | 0.010736944 |\n", "| clip_fraction | 0.107 
|\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.953 |\n", "| explained_variance | 0.961 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.4 |\n", "| n_updates | 170 |\n", "| policy_gradient_loss | -0.00328 |\n", "| value_loss | 13.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 777 |\n", "| ep_rew_mean | 78 |\n", "| time/ | |\n", "| fps | 686 |\n", "| iterations | 19 |\n", "| time_elapsed | 226 |\n", "| total_timesteps | 155648 |\n", "| train/ | |\n", "| approx_kl | 0.009378614 |\n", "| clip_fraction | 0.113 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.913 |\n", "| explained_variance | 0.935 |\n", "| learning_rate | 0.00112 |\n", "| loss | 0.843 |\n", "| n_updates | 180 |\n", "| policy_gradient_loss | -0.0037 |\n", "| value_loss | 26.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 812 |\n", "| ep_rew_mean | 89.9 |\n", "| time/ | |\n", "| fps | 679 |\n", "| iterations | 20 |\n", "| time_elapsed | 241 |\n", "| total_timesteps | 163840 |\n", "| train/ | |\n", "| approx_kl | 0.011721934 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.882 |\n", "| explained_variance | 0.976 |\n", "| learning_rate | 0.00112 |\n", "| loss | 0.515 |\n", "| n_updates | 190 |\n", "| policy_gradient_loss | -0.000511 |\n", "| value_loss | 5.02 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 817 |\n", "| ep_rew_mean | 98.3 |\n", "| time/ | |\n", "| fps | 673 |\n", "| iterations | 21 |\n", "| time_elapsed | 255 |\n", "| total_timesteps | 172032 |\n", "| train/ | |\n", "| approx_kl | 0.0080495905 |\n", "| clip_fraction | 0.106 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.856 |\n", "| explained_variance | 0.975 |\n", "| learning_rate | 0.00112 |\n", "| loss | 0.856 |\n", "| 
n_updates | 200 |\n", "| policy_gradient_loss | -0.00237 |\n", "| value_loss | 7.41 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 817 |\n", "| ep_rew_mean | 106 |\n", "| time/ | |\n", "| fps | 668 |\n", "| iterations | 22 |\n", "| time_elapsed | 269 |\n", "| total_timesteps | 180224 |\n", "| train/ | |\n", "| approx_kl | 0.010070408 |\n", "| clip_fraction | 0.108 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.817 |\n", "| explained_variance | 0.9 |\n", "| learning_rate | 0.00112 |\n", "| loss | 25.7 |\n", "| n_updates | 210 |\n", "| policy_gradient_loss | -0.00291 |\n", "| value_loss | 48.1 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 825 |\n", "| ep_rew_mean | 123 |\n", "| time/ | |\n", "| fps | 665 |\n", "| iterations | 23 |\n", "| time_elapsed | 283 |\n", "| total_timesteps | 188416 |\n", "| train/ | |\n", "| approx_kl | 0.00600619 |\n", "| clip_fraction | 0.0928 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.828 |\n", "| explained_variance | 0.924 |\n", "| learning_rate | 0.00112 |\n", "| loss | 70.5 |\n", "| n_updates | 220 |\n", "| policy_gradient_loss | -0.00244 |\n", "| value_loss | 23.1 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 833 |\n", "| ep_rew_mean | 126 |\n", "| time/ | |\n", "| fps | 662 |\n", "| iterations | 24 |\n", "| time_elapsed | 296 |\n", "| total_timesteps | 196608 |\n", "| train/ | |\n", "| approx_kl | 0.009083592 |\n", "| clip_fraction | 0.096 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.794 |\n", "| explained_variance | 0.938 |\n", "| learning_rate | 0.00112 |\n", "| loss | 26 |\n", "| n_updates | 230 |\n", "| policy_gradient_loss | -0.00204 |\n", "| value_loss | 33 |\n", "-----------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 841 |\n", "| ep_rew_mean | 137 |\n", "| time/ | |\n", "| fps | 662 |\n", "| iterations | 25 |\n", "| time_elapsed | 309 |\n", "| total_timesteps | 204800 |\n", "| train/ | |\n", "| approx_kl | 0.009901471 |\n", "| clip_fraction | 0.116 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.799 |\n", "| explained_variance | 0.964 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.96 |\n", "| n_updates | 240 |\n", "| policy_gradient_loss | -0.000674 |\n", "| value_loss | 6.14 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 837 |\n", "| ep_rew_mean | 140 |\n", "| time/ | |\n", "| fps | 661 |\n", "| iterations | 26 |\n", "| time_elapsed | 321 |\n", "| total_timesteps | 212992 |\n", "| train/ | |\n", "| approx_kl | 0.007379354 |\n", "| clip_fraction | 0.064 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.749 |\n", "| explained_variance | 0.891 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.46 |\n", "| n_updates | 250 |\n", "| policy_gradient_loss | -0.00277 |\n", "| value_loss | 55.6 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 842 |\n", "| ep_rew_mean | 149 |\n", "| time/ | |\n", "| fps | 661 |\n", "| iterations | 27 |\n", "| time_elapsed | 334 |\n", "| total_timesteps | 221184 |\n", "| train/ | |\n", "| approx_kl | 0.0057649757 |\n", "| clip_fraction | 0.102 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.756 |\n", "| explained_variance | 0.963 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.46 |\n", "| n_updates | 260 |\n", "| policy_gradient_loss | -0.000331 |\n", "| value_loss | 18.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 841 |\n", "| ep_rew_mean | 154 |\n", "| time/ | |\n", "| fps | 661 
|\n", "| iterations | 28 |\n", "| time_elapsed | 347 |\n", "| total_timesteps | 229376 |\n", "| train/ | |\n", "| approx_kl | 0.005553289 |\n", "| clip_fraction | 0.0809 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.746 |\n", "| explained_variance | 0.91 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.28 |\n", "| n_updates | 270 |\n", "| policy_gradient_loss | -0.00363 |\n", "| value_loss | 46.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 840 |\n", "| ep_rew_mean | 159 |\n", "| time/ | |\n", "| fps | 661 |\n", "| iterations | 29 |\n", "| time_elapsed | 359 |\n", "| total_timesteps | 237568 |\n", "| train/ | |\n", "| approx_kl | 0.008921631 |\n", "| clip_fraction | 0.0904 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.73 |\n", "| explained_variance | 0.915 |\n", "| learning_rate | 0.00112 |\n", "| loss | 9.48 |\n", "| n_updates | 280 |\n", "| policy_gradient_loss | -0.00178 |\n", "| value_loss | 42.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 824 |\n", "| ep_rew_mean | 162 |\n", "| time/ | |\n", "| fps | 661 |\n", "| iterations | 30 |\n", "| time_elapsed | 371 |\n", "| total_timesteps | 245760 |\n", "| train/ | |\n", "| approx_kl | 0.011628082 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.719 |\n", "| explained_variance | 0.961 |\n", "| learning_rate | 0.00112 |\n", "| loss | 0.186 |\n", "| n_updates | 290 |\n", "| policy_gradient_loss | -0.00063 |\n", "| value_loss | 9.54 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 825 |\n", "| ep_rew_mean | 169 |\n", "| time/ | |\n", "| fps | 662 |\n", "| iterations | 31 |\n", "| time_elapsed | 383 |\n", "| total_timesteps | 253952 |\n", "| train/ | |\n", "| approx_kl | 0.008334557 |\n", "| 
clip_fraction | 0.0847 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.705 |\n", "| explained_variance | 0.927 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.84 |\n", "| n_updates | 300 |\n", "| policy_gradient_loss | -0.0026 |\n", "| value_loss | 30.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 796 |\n", "| ep_rew_mean | 173 |\n", "| time/ | |\n", "| fps | 662 |\n", "| iterations | 32 |\n", "| time_elapsed | 395 |\n", "| total_timesteps | 262144 |\n", "| train/ | |\n", "| approx_kl | 0.007569379 |\n", "| clip_fraction | 0.0691 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.681 |\n", "| explained_variance | 0.92 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.36 |\n", "| n_updates | 310 |\n", "| policy_gradient_loss | -0.00288 |\n", "| value_loss | 35.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 772 |\n", "| ep_rew_mean | 175 |\n", "| time/ | |\n", "| fps | 662 |\n", "| iterations | 33 |\n", "| time_elapsed | 408 |\n", "| total_timesteps | 270336 |\n", "| train/ | |\n", "| approx_kl | 0.012198856 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.721 |\n", "| explained_variance | 0.92 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.99 |\n", "| n_updates | 320 |\n", "| policy_gradient_loss | -0.00182 |\n", "| value_loss | 48.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 760 |\n", "| ep_rew_mean | 179 |\n", "| time/ | |\n", "| fps | 661 |\n", "| iterations | 34 |\n", "| time_elapsed | 420 |\n", "| total_timesteps | 278528 |\n", "| train/ | |\n", "| approx_kl | 0.009193368 |\n", "| clip_fraction | 0.0721 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.64 |\n", "| explained_variance | 0.925 |\n", "| learning_rate | 0.00112 |\n", "| loss 
| 11.6 |\n", "| n_updates | 330 |\n", "| policy_gradient_loss | -0.00167 |\n", "| value_loss | 49.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 729 |\n", "| ep_rew_mean | 175 |\n", "| time/ | |\n", "| fps | 662 |\n", "| iterations | 35 |\n", "| time_elapsed | 432 |\n", "| total_timesteps | 286720 |\n", "| train/ | |\n", "| approx_kl | 0.014393084 |\n", "| clip_fraction | 0.106 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.65 |\n", "| explained_variance | 0.924 |\n", "| learning_rate | 0.00112 |\n", "| loss | 55.9 |\n", "| n_updates | 340 |\n", "| policy_gradient_loss | -0.00448 |\n", "| value_loss | 46 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 665 |\n", "| ep_rew_mean | 176 |\n", "| time/ | |\n", "| fps | 666 |\n", "| iterations | 36 |\n", "| time_elapsed | 442 |\n", "| total_timesteps | 294912 |\n", "| train/ | |\n", "| approx_kl | 0.007863423 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.666 |\n", "| explained_variance | 0.929 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.92 |\n", "| n_updates | 350 |\n", "| policy_gradient_loss | -0.00136 |\n", "| value_loss | 54.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 567 |\n", "| ep_rew_mean | 189 |\n", "| time/ | |\n", "| fps | 668 |\n", "| iterations | 37 |\n", "| time_elapsed | 453 |\n", "| total_timesteps | 303104 |\n", "| train/ | |\n", "| approx_kl | 0.013041699 |\n", "| clip_fraction | 0.104 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.682 |\n", "| explained_variance | 0.855 |\n", "| learning_rate | 0.00112 |\n", "| loss | 35.4 |\n", "| n_updates | 360 |\n", "| policy_gradient_loss | -0.00437 |\n", "| value_loss | 97.9 |\n", "-----------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 517 |\n", "| ep_rew_mean | 197 |\n", "| time/ | |\n", "| fps | 670 |\n", "| iterations | 38 |\n", "| time_elapsed | 464 |\n", "| total_timesteps | 311296 |\n", "| train/ | |\n", "| approx_kl | 0.018412486 |\n", "| clip_fraction | 0.109 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.655 |\n", "| explained_variance | 0.842 |\n", "| learning_rate | 0.00112 |\n", "| loss | 45.9 |\n", "| n_updates | 370 |\n", "| policy_gradient_loss | -0.00408 |\n", "| value_loss | 141 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 401 |\n", "| ep_rew_mean | 216 |\n", "| time/ | |\n", "| fps | 674 |\n", "| iterations | 39 |\n", "| time_elapsed | 473 |\n", "| total_timesteps | 319488 |\n", "| train/ | |\n", "| approx_kl | 0.015856426 |\n", "| clip_fraction | 0.112 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.692 |\n", "| explained_variance | 0.85 |\n", "| learning_rate | 0.00112 |\n", "| loss | 79.5 |\n", "| n_updates | 380 |\n", "| policy_gradient_loss | -0.00422 |\n", "| value_loss | 126 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 340 |\n", "| ep_rew_mean | 219 |\n", "| time/ | |\n", "| fps | 677 |\n", "| iterations | 40 |\n", "| time_elapsed | 483 |\n", "| total_timesteps | 327680 |\n", "| train/ | |\n", "| approx_kl | 0.01049629 |\n", "| clip_fraction | 0.1 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.767 |\n", "| explained_variance | 0.846 |\n", "| learning_rate | 0.00112 |\n", "| loss | 65 |\n", "| n_updates | 390 |\n", "| policy_gradient_loss | -0.00197 |\n", "| value_loss | 79.2 |\n", "----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 331 |\n", "| ep_rew_mean | 223 |\n", "| time/ | |\n", "| fps | 680 |\n", "| iterations 
| 41 |\n", "| time_elapsed | 493 |\n", "| total_timesteps | 335872 |\n", "| train/ | |\n", "| approx_kl | 0.01153439 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.764 |\n", "| explained_variance | 0.855 |\n", "| learning_rate | 0.00112 |\n", "| loss | 19 |\n", "| n_updates | 400 |\n", "| policy_gradient_loss | -0.00381 |\n", "| value_loss | 88 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 311 |\n", "| ep_rew_mean | 227 |\n", "| time/ | |\n", "| fps | 682 |\n", "| iterations | 42 |\n", "| time_elapsed | 504 |\n", "| total_timesteps | 344064 |\n", "| train/ | |\n", "| approx_kl | 0.011043092 |\n", "| clip_fraction | 0.133 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.775 |\n", "| explained_variance | 0.851 |\n", "| learning_rate | 0.00112 |\n", "| loss | 35.2 |\n", "| n_updates | 410 |\n", "| policy_gradient_loss | -0.00298 |\n", "| value_loss | 81.3 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 312 |\n", "| ep_rew_mean | 224 |\n", "| time/ | |\n", "| fps | 685 |\n", "| iterations | 43 |\n", "| time_elapsed | 514 |\n", "| total_timesteps | 352256 |\n", "| train/ | |\n", "| approx_kl | 0.01578068 |\n", "| clip_fraction | 0.122 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.773 |\n", "| explained_variance | 0.901 |\n", "| learning_rate | 0.00112 |\n", "| loss | 42.1 |\n", "| n_updates | 420 |\n", "| policy_gradient_loss | -0.0028 |\n", "| value_loss | 53 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 300 |\n", "| ep_rew_mean | 240 |\n", "| time/ | |\n", "| fps | 687 |\n", "| iterations | 44 |\n", "| time_elapsed | 524 |\n", "| total_timesteps | 360448 |\n", "| train/ | |\n", "| approx_kl | 0.009789484 |\n", "| clip_fraction | 0.123 |\n", "| clip_range | 
0.2 |\n", "| entropy_loss | -0.749 |\n", "| explained_variance | 0.87 |\n", "| learning_rate | 0.00112 |\n", "| loss | 14 |\n", "| n_updates | 430 |\n", "| policy_gradient_loss | -0.00445 |\n", "| value_loss | 39 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 294 |\n", "| ep_rew_mean | 236 |\n", "| time/ | |\n", "| fps | 690 |\n", "| iterations | 45 |\n", "| time_elapsed | 533 |\n", "| total_timesteps | 368640 |\n", "| train/ | |\n", "| approx_kl | 0.009179481 |\n", "| clip_fraction | 0.0893 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.743 |\n", "| explained_variance | 0.937 |\n", "| learning_rate | 0.00112 |\n", "| loss | 17.6 |\n", "| n_updates | 440 |\n", "| policy_gradient_loss | -0.00201 |\n", "| value_loss | 35.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 291 |\n", "| ep_rew_mean | 241 |\n", "| time/ | |\n", "| fps | 691 |\n", "| iterations | 46 |\n", "| time_elapsed | 544 |\n", "| total_timesteps | 376832 |\n", "| train/ | |\n", "| approx_kl | 0.010585489 |\n", "| clip_fraction | 0.0946 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.716 |\n", "| explained_variance | 0.934 |\n", "| learning_rate | 0.00112 |\n", "| loss | 110 |\n", "| n_updates | 450 |\n", "| policy_gradient_loss | -0.00494 |\n", "| value_loss | 43.4 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 311 |\n", "| ep_rew_mean | 242 |\n", "| time/ | |\n", "| fps | 692 |\n", "| iterations | 47 |\n", "| time_elapsed | 555 |\n", "| total_timesteps | 385024 |\n", "| train/ | |\n", "| approx_kl | 0.0127665745 |\n", "| clip_fraction | 0.126 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.712 |\n", "| explained_variance | 0.959 |\n", "| learning_rate | 0.00112 |\n", "| loss | 10.8 |\n", "| n_updates | 460 |\n", "| 
policy_gradient_loss | -0.00334 |\n", "| value_loss | 16.8 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 335 |\n", "| ep_rew_mean | 242 |\n", "| time/ | |\n", "| fps | 694 |\n", "| iterations | 48 |\n", "| time_elapsed | 566 |\n", "| total_timesteps | 393216 |\n", "| train/ | |\n", "| approx_kl | 0.008233515 |\n", "| clip_fraction | 0.103 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.688 |\n", "| explained_variance | 0.965 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.15 |\n", "| n_updates | 470 |\n", "| policy_gradient_loss | -0.00182 |\n", "| value_loss | 13.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 350 |\n", "| ep_rew_mean | 248 |\n", "| time/ | |\n", "| fps | 695 |\n", "| iterations | 49 |\n", "| time_elapsed | 576 |\n", "| total_timesteps | 401408 |\n", "| train/ | |\n", "| approx_kl | 0.010485147 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.701 |\n", "| explained_variance | 0.966 |\n", "| learning_rate | 0.00112 |\n", "| loss | 8.88 |\n", "| n_updates | 480 |\n", "| policy_gradient_loss | -0.00358 |\n", "| value_loss | 27.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 348 |\n", "| ep_rew_mean | 249 |\n", "| time/ | |\n", "| fps | 697 |\n", "| iterations | 50 |\n", "| time_elapsed | 587 |\n", "| total_timesteps | 409600 |\n", "| train/ | |\n", "| approx_kl | 0.009453544 |\n", "| clip_fraction | 0.0938 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.687 |\n", "| explained_variance | 0.966 |\n", "| learning_rate | 0.00112 |\n", "| loss | 13.4 |\n", "| n_updates | 490 |\n", "| policy_gradient_loss | -0.000721 |\n", "| value_loss | 22.1 |\n", "-----------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 314 |\n", "| ep_rew_mean | 257 |\n", "| time/ | |\n", "| fps | 700 |\n", "| iterations | 51 |\n", "| time_elapsed | 596 |\n", "| total_timesteps | 417792 |\n", "| train/ | |\n", "| approx_kl | 0.011446089 |\n", "| clip_fraction | 0.113 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.708 |\n", "| explained_variance | 0.988 |\n", "| learning_rate | 0.00112 |\n", "| loss | 7.33 |\n", "| n_updates | 500 |\n", "| policy_gradient_loss | -0.00305 |\n", "| value_loss | 9.94 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 301 |\n", "| ep_rew_mean | 260 |\n", "| time/ | |\n", "| fps | 702 |\n", "| iterations | 52 |\n", "| time_elapsed | 606 |\n", "| total_timesteps | 425984 |\n", "| train/ | |\n", "| approx_kl | 0.008446146 |\n", "| clip_fraction | 0.116 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.696 |\n", "| explained_variance | 0.976 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.4 |\n", "| n_updates | 510 |\n", "| policy_gradient_loss | -0.00287 |\n", "| value_loss | 7.64 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 302 |\n", "| ep_rew_mean | 262 |\n", "| time/ | |\n", "| fps | 704 |\n", "| iterations | 53 |\n", "| time_elapsed | 616 |\n", "| total_timesteps | 434176 |\n", "| train/ | |\n", "| approx_kl | 0.009668088 |\n", "| clip_fraction | 0.109 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.71 |\n", "| explained_variance | 0.977 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.68 |\n", "| n_updates | 520 |\n", "| policy_gradient_loss | -0.00557 |\n", "| value_loss | 7.61 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 287 |\n", "| ep_rew_mean | 259 |\n", "| time/ | |\n", "| fps | 706 |\n", "| 
iterations | 54 |\n", "| time_elapsed | 626 |\n", "| total_timesteps | 442368 |\n", "| train/ | |\n", "| approx_kl | 0.0102976635 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.697 |\n", "| explained_variance | 0.989 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.61 |\n", "| n_updates | 530 |\n", "| policy_gradient_loss | -0.00204 |\n", "| value_loss | 6.36 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 300 |\n", "| ep_rew_mean | 258 |\n", "| time/ | |\n", "| fps | 708 |\n", "| iterations | 55 |\n", "| time_elapsed | 635 |\n", "| total_timesteps | 450560 |\n", "| train/ | |\n", "| approx_kl | 0.01195965 |\n", "| clip_fraction | 0.109 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.714 |\n", "| explained_variance | 0.938 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.05 |\n", "| n_updates | 540 |\n", "| policy_gradient_loss | -0.0026 |\n", "| value_loss | 32.6 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 275 |\n", "| ep_rew_mean | 254 |\n", "| time/ | |\n", "| fps | 711 |\n", "| iterations | 56 |\n", "| time_elapsed | 644 |\n", "| total_timesteps | 458752 |\n", "| train/ | |\n", "| approx_kl | 0.009921346 |\n", "| clip_fraction | 0.119 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.701 |\n", "| explained_variance | 0.989 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.38 |\n", "| n_updates | 550 |\n", "| policy_gradient_loss | -0.00289 |\n", "| value_loss | 7.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 267 |\n", "| ep_rew_mean | 264 |\n", "| time/ | |\n", "| fps | 713 |\n", "| iterations | 57 |\n", "| time_elapsed | 654 |\n", "| total_timesteps | 466944 |\n", "| train/ | |\n", "| approx_kl | 0.008440893 |\n", "| clip_fraction | 0.0839 
|\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.698 |\n", "| explained_variance | 0.85 |\n", "| learning_rate | 0.00112 |\n", "| loss | 69.9 |\n", "| n_updates | 560 |\n", "| policy_gradient_loss | -0.00203 |\n", "| value_loss | 37.8 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 236 |\n", "| ep_rew_mean | 268 |\n", "| time/ | |\n", "| fps | 717 |\n", "| iterations | 58 |\n", "| time_elapsed | 662 |\n", "| total_timesteps | 475136 |\n", "| train/ | |\n", "| approx_kl | 0.0125043485 |\n", "| clip_fraction | 0.118 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.686 |\n", "| explained_variance | 0.943 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.9 |\n", "| n_updates | 570 |\n", "| policy_gradient_loss | -0.00381 |\n", "| value_loss | 14.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 230 |\n", "| ep_rew_mean | 273 |\n", "| time/ | |\n", "| fps | 719 |\n", "| iterations | 59 |\n", "| time_elapsed | 671 |\n", "| total_timesteps | 483328 |\n", "| train/ | |\n", "| approx_kl | 0.009416393 |\n", "| clip_fraction | 0.121 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.664 |\n", "| explained_variance | 0.965 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.15 |\n", "| n_updates | 580 |\n", "| policy_gradient_loss | -0.00327 |\n", "| value_loss | 8.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 222 |\n", "| ep_rew_mean | 275 |\n", "| time/ | |\n", "| fps | 722 |\n", "| iterations | 60 |\n", "| time_elapsed | 680 |\n", "| total_timesteps | 491520 |\n", "| train/ | |\n", "| approx_kl | 0.008947594 |\n", "| clip_fraction | 0.11 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.644 |\n", "| explained_variance | 0.955 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.75 |\n", "| 
n_updates | 590 |\n", "| policy_gradient_loss | -0.00155 |\n", "| value_loss | 7.21 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 221 |\n", "| ep_rew_mean | 281 |\n", "| time/ | |\n", "| fps | 725 |\n", "| iterations | 61 |\n", "| time_elapsed | 688 |\n", "| total_timesteps | 499712 |\n", "| train/ | |\n", "| approx_kl | 0.01299087 |\n", "| clip_fraction | 0.127 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.667 |\n", "| explained_variance | 0.982 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.03 |\n", "| n_updates | 600 |\n", "| policy_gradient_loss | -0.00352 |\n", "| value_loss | 6.24 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 219 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 727 |\n", "| iterations | 62 |\n", "| time_elapsed | 697 |\n", "| total_timesteps | 507904 |\n", "| train/ | |\n", "| approx_kl | 0.013910411 |\n", "| clip_fraction | 0.141 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.634 |\n", "| explained_variance | 0.983 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.45 |\n", "| n_updates | 610 |\n", "| policy_gradient_loss | -0.00628 |\n", "| value_loss | 6.01 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 213 |\n", "| ep_rew_mean | 280 |\n", "| time/ | |\n", "| fps | 730 |\n", "| iterations | 63 |\n", "| time_elapsed | 706 |\n", "| total_timesteps | 516096 |\n", "| train/ | |\n", "| approx_kl | 0.010656804 |\n", "| clip_fraction | 0.122 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.625 |\n", "| explained_variance | 0.985 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.69 |\n", "| n_updates | 620 |\n", "| policy_gradient_loss | -0.00395 |\n", "| value_loss | 6.06 |\n", "-----------------------------------------\n", 
"------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 207 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 733 |\n", "| iterations | 64 |\n", "| time_elapsed | 714 |\n", "| total_timesteps | 524288 |\n", "| train/ | |\n", "| approx_kl | 0.0110030435 |\n", "| clip_fraction | 0.105 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.594 |\n", "| explained_variance | 0.966 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.91 |\n", "| n_updates | 630 |\n", "| policy_gradient_loss | -0.00278 |\n", "| value_loss | 6.85 |\n", "------------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 201 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 735 |\n", "| iterations | 65 |\n", "| time_elapsed | 723 |\n", "| total_timesteps | 532480 |\n", "| train/ | |\n", "| approx_kl | 0.00963041 |\n", "| clip_fraction | 0.11 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.6 |\n", "| explained_variance | 0.959 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.58 |\n", "| n_updates | 640 |\n", "| policy_gradient_loss | -0.00199 |\n", "| value_loss | 10.1 |\n", "----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 208 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 737 |\n", "| iterations | 66 |\n", "| time_elapsed | 732 |\n", "| total_timesteps | 540672 |\n", "| train/ | |\n", "| approx_kl | 0.00821097 |\n", "| clip_fraction | 0.105 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.577 |\n", "| explained_variance | 0.945 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.53 |\n", "| n_updates | 650 |\n", "| policy_gradient_loss | -0.00212 |\n", "| value_loss | 33.3 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 207 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 741 |\n", "| 
iterations | 67 |\n", "| time_elapsed | 740 |\n", "| total_timesteps | 548864 |\n", "| train/ | |\n", "| approx_kl | 0.010355219 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.576 |\n", "| explained_variance | 0.942 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.5 |\n", "| n_updates | 660 |\n", "| policy_gradient_loss | -0.00109 |\n", "| value_loss | 40.3 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 205 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 742 |\n", "| iterations | 68 |\n", "| time_elapsed | 749 |\n", "| total_timesteps | 557056 |\n", "| train/ | |\n", "| approx_kl | 0.009888106 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.567 |\n", "| explained_variance | 0.983 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.34 |\n", "| n_updates | 670 |\n", "| policy_gradient_loss | -0.0023 |\n", "| value_loss | 6.59 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 214 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 744 |\n", "| iterations | 69 |\n", "| time_elapsed | 759 |\n", "| total_timesteps | 565248 |\n", "| train/ | |\n", "| approx_kl | 0.008005397 |\n", "| clip_fraction | 0.1 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.569 |\n", "| explained_variance | 0.912 |\n", "| learning_rate | 0.00112 |\n", "| loss | 13.9 |\n", "| n_updates | 680 |\n", "| policy_gradient_loss | -0.00165 |\n", "| value_loss | 52.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 235 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 746 |\n", "| iterations | 70 |\n", "| time_elapsed | 768 |\n", "| total_timesteps | 573440 |\n", "| train/ | |\n", "| approx_kl | 0.011526575 |\n", "| clip_fraction | 0.113 
|\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.597 |\n", "| explained_variance | 0.981 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.71 |\n", "| n_updates | 690 |\n", "| policy_gradient_loss | -5.38e-05 |\n", "| value_loss | 6.75 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 242 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 747 |\n", "| iterations | 71 |\n", "| time_elapsed | 777 |\n", "| total_timesteps | 581632 |\n", "| train/ | |\n", "| approx_kl | 0.012810791 |\n", "| clip_fraction | 0.0998 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.578 |\n", "| explained_variance | 0.983 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.65 |\n", "| n_updates | 700 |\n", "| policy_gradient_loss | -0.00346 |\n", "| value_loss | 7.52 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 243 |\n", "| ep_rew_mean | 274 |\n", "| time/ | |\n", "| fps | 749 |\n", "| iterations | 72 |\n", "| time_elapsed | 786 |\n", "| total_timesteps | 589824 |\n", "| train/ | |\n", "| approx_kl | 0.008115089 |\n", "| clip_fraction | 0.0916 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.577 |\n", "| explained_variance | 0.983 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.02 |\n", "| n_updates | 710 |\n", "| policy_gradient_loss | -0.000584 |\n", "| value_loss | 8.79 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 230 |\n", "| ep_rew_mean | 269 |\n", "| time/ | |\n", "| fps | 750 |\n", "| iterations | 73 |\n", "| time_elapsed | 796 |\n", "| total_timesteps | 598016 |\n", "| train/ | |\n", "| approx_kl | 0.009654837 |\n", "| clip_fraction | 0.107 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.568 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 6.56 |\n", "| 
n_updates | 720 |\n", "| policy_gradient_loss | -0.00457 |\n", "| value_loss | 7.54 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 217 |\n", "| ep_rew_mean | 266 |\n", "| time/ | |\n", "| fps | 753 |\n", "| iterations | 74 |\n", "| time_elapsed | 805 |\n", "| total_timesteps | 606208 |\n", "| train/ | |\n", "| approx_kl | 0.010406023 |\n", "| clip_fraction | 0.0979 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.555 |\n", "| explained_variance | 0.93 |\n", "| learning_rate | 0.00112 |\n", "| loss | 10.5 |\n", "| n_updates | 730 |\n", "| policy_gradient_loss | -0.00149 |\n", "| value_loss | 45.9 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 219 |\n", "| ep_rew_mean | 272 |\n", "| time/ | |\n", "| fps | 754 |\n", "| iterations | 75 |\n", "| time_elapsed | 814 |\n", "| total_timesteps | 614400 |\n", "| train/ | |\n", "| approx_kl | 0.01060657 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.544 |\n", "| explained_variance | 0.957 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.6 |\n", "| n_updates | 740 |\n", "| policy_gradient_loss | -0.00187 |\n", "| value_loss | 25 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 231 |\n", "| ep_rew_mean | 274 |\n", "| time/ | |\n", "| fps | 755 |\n", "| iterations | 76 |\n", "| time_elapsed | 824 |\n", "| total_timesteps | 622592 |\n", "| train/ | |\n", "| approx_kl | 0.010488458 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.546 |\n", "| explained_variance | 0.98 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.72 |\n", "| n_updates | 750 |\n", "| policy_gradient_loss | 0.00114 |\n", "| value_loss | 8.52 |\n", "-----------------------------------------\n", 
"------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 242 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 756 |\n", "| iterations | 77 |\n", "| time_elapsed | 833 |\n", "| total_timesteps | 630784 |\n", "| train/ | |\n", "| approx_kl | 0.0069501195 |\n", "| clip_fraction | 0.0803 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.525 |\n", "| explained_variance | 0.984 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.74 |\n", "| n_updates | 760 |\n", "| policy_gradient_loss | -0.0001 |\n", "| value_loss | 10 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 256 |\n", "| ep_rew_mean | 274 |\n", "| time/ | |\n", "| fps | 758 |\n", "| iterations | 78 |\n", "| time_elapsed | 842 |\n", "| total_timesteps | 638976 |\n", "| train/ | |\n", "| approx_kl | 0.011625826 |\n", "| clip_fraction | 0.107 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.539 |\n", "| explained_variance | 0.966 |\n", "| learning_rate | 0.00112 |\n", "| loss | 8.11 |\n", "| n_updates | 770 |\n", "| policy_gradient_loss | -0.00158 |\n", "| value_loss | 20.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 249 |\n", "| ep_rew_mean | 274 |\n", "| time/ | |\n", "| fps | 759 |\n", "| iterations | 79 |\n", "| time_elapsed | 851 |\n", "| total_timesteps | 647168 |\n", "| train/ | |\n", "| approx_kl | 0.009716832 |\n", "| clip_fraction | 0.107 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.546 |\n", "| explained_variance | 0.988 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.56 |\n", "| n_updates | 780 |\n", "| policy_gradient_loss | 0.000131 |\n", "| value_loss | 6.82 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 241 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 760 |\n", "| 
iterations | 80 |\n", "| time_elapsed | 861 |\n", "| total_timesteps | 655360 |\n", "| train/ | |\n", "| approx_kl | 0.012574303 |\n", "| clip_fraction | 0.11 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.493 |\n", "| explained_variance | 0.99 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.92 |\n", "| n_updates | 790 |\n", "| policy_gradient_loss | -0.00356 |\n", "| value_loss | 8.06 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 222 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 763 |\n", "| iterations | 81 |\n", "| time_elapsed | 869 |\n", "| total_timesteps | 663552 |\n", "| train/ | |\n", "| approx_kl | 0.008756278 |\n", "| clip_fraction | 0.0861 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.499 |\n", "| explained_variance | 0.987 |\n", "| learning_rate | 0.00112 |\n", "| loss | 6.86 |\n", "| n_updates | 800 |\n", "| policy_gradient_loss | -0.00194 |\n", "| value_loss | 8.96 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 209 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 765 |\n", "| iterations | 82 |\n", "| time_elapsed | 877 |\n", "| total_timesteps | 671744 |\n", "| train/ | |\n", "| approx_kl | 0.011609758 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.494 |\n", "| explained_variance | 0.964 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.44 |\n", "| n_updates | 810 |\n", "| policy_gradient_loss | -0.00104 |\n", "| value_loss | 10.9 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 202 |\n", "| ep_rew_mean | 283 |\n", "| time/ | |\n", "| fps | 766 |\n", "| iterations | 83 |\n", "| time_elapsed | 886 |\n", "| total_timesteps | 679936 |\n", "| train/ | |\n", "| approx_kl | 0.00815385 |\n", "| clip_fraction | 0.0868 
|\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.435 |\n", "| explained_variance | 0.99 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.55 |\n", "| n_updates | 820 |\n", "| policy_gradient_loss | 0.00203 |\n", "| value_loss | 6.94 |\n", "----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 202 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 768 |\n", "| iterations | 84 |\n", "| time_elapsed | 894 |\n", "| total_timesteps | 688128 |\n", "| train/ | |\n", "| approx_kl | 0.00922096 |\n", "| clip_fraction | 0.0924 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.463 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.08 |\n", "| n_updates | 830 |\n", "| policy_gradient_loss | -0.00248 |\n", "| value_loss | 6.89 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 200 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 770 |\n", "| iterations | 85 |\n", "| time_elapsed | 903 |\n", "| total_timesteps | 696320 |\n", "| train/ | |\n", "| approx_kl | 0.008536881 |\n", "| clip_fraction | 0.0941 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.466 |\n", "| explained_variance | 0.985 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.67 |\n", "| n_updates | 840 |\n", "| policy_gradient_loss | -0.000408 |\n", "| value_loss | 6.32 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 200 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 771 |\n", "| iterations | 86 |\n", "| time_elapsed | 912 |\n", "| total_timesteps | 704512 |\n", "| train/ | |\n", "| approx_kl | 0.014402258 |\n", "| clip_fraction | 0.0805 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.428 |\n", "| explained_variance | 0.919 |\n", "| learning_rate | 0.00112 |\n", "| loss | 6.99 |\n", "| 
n_updates | 850 |\n", "| policy_gradient_loss | -0.00194 |\n", "| value_loss | 66.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 210 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 774 |\n", "| iterations | 87 |\n", "| time_elapsed | 920 |\n", "| total_timesteps | 712704 |\n", "| train/ | |\n", "| approx_kl | 0.008684108 |\n", "| clip_fraction | 0.0929 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.445 |\n", "| explained_variance | 0.986 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.65 |\n", "| n_updates | 860 |\n", "| policy_gradient_loss | 5.92e-05 |\n", "| value_loss | 6.19 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 203 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 775 |\n", "| iterations | 88 |\n", "| time_elapsed | 929 |\n", "| total_timesteps | 720896 |\n", "| train/ | |\n", "| approx_kl | 0.012579354 |\n", "| clip_fraction | 0.11 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.453 |\n", "| explained_variance | 0.952 |\n", "| learning_rate | 0.00112 |\n", "| loss | 7.62 |\n", "| n_updates | 870 |\n", "| policy_gradient_loss | -0.00121 |\n", "| value_loss | 8.73 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 209 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 775 |\n", "| iterations | 89 |\n", "| time_elapsed | 939 |\n", "| total_timesteps | 729088 |\n", "| train/ | |\n", "| approx_kl | 0.0132595375 |\n", "| clip_fraction | 0.0988 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.5 |\n", "| explained_variance | 0.983 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.94 |\n", "| n_updates | 880 |\n", "| policy_gradient_loss | -0.000703 |\n", "| value_loss | 6.86 |\n", "------------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 223 |\n", "| ep_rew_mean | 275 |\n", "| time/ | |\n", "| fps | 778 |\n", "| iterations | 90 |\n", "| time_elapsed | 947 |\n", "| total_timesteps | 737280 |\n", "| train/ | |\n", "| approx_kl | 0.013090314 |\n", "| clip_fraction | 0.0992 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.523 |\n", "| explained_variance | 0.953 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.96 |\n", "| n_updates | 890 |\n", "| policy_gradient_loss | -0.0011 |\n", "| value_loss | 44.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 204 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 779 |\n", "| iterations | 91 |\n", "| time_elapsed | 956 |\n", "| total_timesteps | 745472 |\n", "| train/ | |\n", "| approx_kl | 0.012395062 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.443 |\n", "| explained_variance | 0.958 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.54 |\n", "| n_updates | 900 |\n", "| policy_gradient_loss | -0.00132 |\n", "| value_loss | 43.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 187 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 781 |\n", "| iterations | 92 |\n", "| time_elapsed | 964 |\n", "| total_timesteps | 753664 |\n", "| train/ | |\n", "| approx_kl | 0.010506206 |\n", "| clip_fraction | 0.106 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.41 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.77 |\n", "| n_updates | 910 |\n", "| policy_gradient_loss | -0.000594 |\n", "| value_loss | 5.95 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 184 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 783 |\n", "| 
iterations | 93 |\n", "| time_elapsed | 972 |\n", "| total_timesteps | 761856 |\n", "| train/ | |\n", "| approx_kl | 0.010762966 |\n", "| clip_fraction | 0.0979 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.422 |\n", "| explained_variance | 0.959 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.94 |\n", "| n_updates | 920 |\n", "| policy_gradient_loss | -0.00225 |\n", "| value_loss | 22 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 192 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 784 |\n", "| iterations | 94 |\n", "| time_elapsed | 981 |\n", "| total_timesteps | 770048 |\n", "| train/ | |\n", "| approx_kl | 0.009342314 |\n", "| clip_fraction | 0.0885 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.461 |\n", "| explained_variance | 0.96 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.11 |\n", "| n_updates | 930 |\n", "| policy_gradient_loss | -0.00123 |\n", "| value_loss | 27.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 194 |\n", "| ep_rew_mean | 283 |\n", "| time/ | |\n", "| fps | 785 |\n", "| iterations | 95 |\n", "| time_elapsed | 990 |\n", "| total_timesteps | 778240 |\n", "| train/ | |\n", "| approx_kl | 0.013833285 |\n", "| clip_fraction | 0.11 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.446 |\n", "| explained_variance | 0.989 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.8 |\n", "| n_updates | 940 |\n", "| policy_gradient_loss | -0.000528 |\n", "| value_loss | 7.06 |\n", "-----------------------------------------\n", "---------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 196 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 787 |\n", "| iterations | 96 |\n", "| time_elapsed | 998 |\n", "| total_timesteps | 786432 |\n", "| train/ | |\n", "| approx_kl | 0.0163402 |\n", "| clip_fraction | 0.11 |\n", 
"| clip_range | 0.2 |\n", "| entropy_loss | -0.433 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.58 |\n", "| n_updates | 950 |\n", "| policy_gradient_loss | 0.000378 |\n", "| value_loss | 5.84 |\n", "---------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 202 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 788 |\n", "| iterations | 97 |\n", "| time_elapsed | 1007 |\n", "| total_timesteps | 794624 |\n", "| train/ | |\n", "| approx_kl | 0.008656707 |\n", "| clip_fraction | 0.0845 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.434 |\n", "| explained_variance | 0.99 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.4 |\n", "| n_updates | 960 |\n", "| policy_gradient_loss | -0.000407 |\n", "| value_loss | 6.17 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 199 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 789 |\n", "| iterations | 98 |\n", "| time_elapsed | 1016 |\n", "| total_timesteps | 802816 |\n", "| train/ | |\n", "| approx_kl | 0.01054048 |\n", "| clip_fraction | 0.0949 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.424 |\n", "| explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.25 |\n", "| n_updates | 970 |\n", "| policy_gradient_loss | -0.00161 |\n", "| value_loss | 5.78 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 202 |\n", "| ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 791 |\n", "| iterations | 99 |\n", "| time_elapsed | 1024 |\n", "| total_timesteps | 811008 |\n", "| train/ | |\n", "| approx_kl | 0.014685566 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.45 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.96 |\n", "| n_updates | 
980 |\n", "| policy_gradient_loss | -0.00712 |\n", "| value_loss | 6.13 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 200 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 792 |\n", "| iterations | 100 |\n", "| time_elapsed | 1033 |\n", "| total_timesteps | 819200 |\n", "| train/ | |\n", "| approx_kl | 0.0129933255 |\n", "| clip_fraction | 0.0934 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.432 |\n", "| explained_variance | 0.995 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.34 |\n", "| n_updates | 990 |\n", "| policy_gradient_loss | -0.000785 |\n", "| value_loss | 5.58 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 202 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 793 |\n", "| iterations | 101 |\n", "| time_elapsed | 1042 |\n", "| total_timesteps | 827392 |\n", "| train/ | |\n", "| approx_kl | 0.010801972 |\n", "| clip_fraction | 0.0976 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.404 |\n", "| explained_variance | 0.968 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.23 |\n", "| n_updates | 1000 |\n", "| policy_gradient_loss | -0.000655 |\n", "| value_loss | 9.81 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 199 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 795 |\n", "| iterations | 102 |\n", "| time_elapsed | 1050 |\n", "| total_timesteps | 835584 |\n", "| train/ | |\n", "| approx_kl | 0.010502162 |\n", "| clip_fraction | 0.0931 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.463 |\n", "| explained_variance | 0.988 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.74 |\n", "| n_updates | 1010 |\n", "| policy_gradient_loss | 0.000105 |\n", "| value_loss | 4.68 |\n", "-----------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 208 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 796 |\n", "| iterations | 103 |\n", "| time_elapsed | 1059 |\n", "| total_timesteps | 843776 |\n", "| train/ | |\n", "| approx_kl | 0.009636023 |\n", "| clip_fraction | 0.0985 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.442 |\n", "| explained_variance | 0.976 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.33 |\n", "| n_updates | 1020 |\n", "| policy_gradient_loss | -0.00202 |\n", "| value_loss | 7.95 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 206 |\n", "| ep_rew_mean | 283 |\n", "| time/ | |\n", "| fps | 797 |\n", "| iterations | 104 |\n", "| time_elapsed | 1068 |\n", "| total_timesteps | 851968 |\n", "| train/ | |\n", "| approx_kl | 0.008595383 |\n", "| clip_fraction | 0.0899 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.425 |\n", "| explained_variance | 0.951 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.46 |\n", "| n_updates | 1030 |\n", "| policy_gradient_loss | -0.00298 |\n", "| value_loss | 26.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 201 |\n", "| ep_rew_mean | 290 |\n", "| time/ | |\n", "| fps | 799 |\n", "| iterations | 105 |\n", "| time_elapsed | 1076 |\n", "| total_timesteps | 860160 |\n", "| train/ | |\n", "| approx_kl | 0.011436255 |\n", "| clip_fraction | 0.102 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.46 |\n", "| explained_variance | 0.981 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.47 |\n", "| n_updates | 1040 |\n", "| policy_gradient_loss | -0.00071 |\n", "| value_loss | 9.55 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 206 |\n", "| ep_rew_mean | 290 |\n", "| time/ | |\n", "| fps | 800 
|\n", "| iterations | 106 |\n", "| time_elapsed | 1084 |\n", "| total_timesteps | 868352 |\n", "| train/ | |\n", "| approx_kl | 0.009246144 |\n", "| clip_fraction | 0.0943 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.409 |\n", "| explained_variance | 0.985 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.37 |\n", "| n_updates | 1050 |\n", "| policy_gradient_loss | -0.00171 |\n", "| value_loss | 12.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 192 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 801 |\n", "| iterations | 107 |\n", "| time_elapsed | 1093 |\n", "| total_timesteps | 876544 |\n", "| train/ | |\n", "| approx_kl | 0.010345182 |\n", "| clip_fraction | 0.103 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.385 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.12 |\n", "| n_updates | 1060 |\n", "| policy_gradient_loss | -0.000463 |\n", "| value_loss | 6.57 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 195 |\n", "| ep_rew_mean | 288 |\n", "| time/ | |\n", "| fps | 802 |\n", "| iterations | 108 |\n", "| time_elapsed | 1102 |\n", "| total_timesteps | 884736 |\n", "| train/ | |\n", "| approx_kl | 0.012043006 |\n", "| clip_fraction | 0.102 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.439 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.01 |\n", "| n_updates | 1070 |\n", "| policy_gradient_loss | 0.00203 |\n", "| value_loss | 6.54 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 209 |\n", "| ep_rew_mean | 281 |\n", "| time/ | |\n", "| fps | 803 |\n", "| iterations | 109 |\n", "| time_elapsed | 1111 |\n", "| total_timesteps | 892928 |\n", "| train/ | |\n", "| approx_kl | 0.024911318 |\n", "| 
clip_fraction | 0.116 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.464 |\n", "| explained_variance | 0.989 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.13 |\n", "| n_updates | 1080 |\n", "| policy_gradient_loss | -0.00789 |\n", "| value_loss | 7.93 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 203 |\n", "| ep_rew_mean | 272 |\n", "| time/ | |\n", "| fps | 804 |\n", "| iterations | 110 |\n", "| time_elapsed | 1120 |\n", "| total_timesteps | 901120 |\n", "| train/ | |\n", "| approx_kl | 0.012062022 |\n", "| clip_fraction | 0.0876 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.416 |\n", "| explained_variance | 0.94 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.31 |\n", "| n_updates | 1090 |\n", "| policy_gradient_loss | -0.00182 |\n", "| value_loss | 48.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 198 |\n", "| ep_rew_mean | 267 |\n", "| time/ | |\n", "| fps | 805 |\n", "| iterations | 111 |\n", "| time_elapsed | 1129 |\n", "| total_timesteps | 909312 |\n", "| train/ | |\n", "| approx_kl | 0.012001529 |\n", "| clip_fraction | 0.112 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.429 |\n", "| explained_variance | 0.927 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.72 |\n", "| n_updates | 1100 |\n", "| policy_gradient_loss | -0.00103 |\n", "| value_loss | 57.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 195 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 806 |\n", "| iterations | 112 |\n", "| time_elapsed | 1137 |\n", "| total_timesteps | 917504 |\n", "| train/ | |\n", "| approx_kl | 0.009573877 |\n", "| clip_fraction | 0.0871 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.429 |\n", "| explained_variance | 0.97 |\n", "| learning_rate | 0.00112 
|\n", "| loss | 3.87 |\n", "| n_updates | 1110 |\n", "| policy_gradient_loss | -0.00138 |\n", "| value_loss | 26.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 194 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 807 |\n", "| iterations | 113 |\n", "| time_elapsed | 1146 |\n", "| total_timesteps | 925696 |\n", "| train/ | |\n", "| approx_kl | 0.013059037 |\n", "| clip_fraction | 0.119 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.401 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.2 |\n", "| n_updates | 1120 |\n", "| policy_gradient_loss | -0.00174 |\n", "| value_loss | 5.42 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 188 |\n", "| ep_rew_mean | 280 |\n", "| time/ | |\n", "| fps | 808 |\n", "| iterations | 114 |\n", "| time_elapsed | 1154 |\n", "| total_timesteps | 933888 |\n", "| train/ | |\n", "| approx_kl | 0.010725938 |\n", "| clip_fraction | 0.0849 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.394 |\n", "| explained_variance | 0.907 |\n", "| learning_rate | 0.00112 |\n", "| loss | 12.6 |\n", "| n_updates | 1130 |\n", "| policy_gradient_loss | -0.00281 |\n", "| value_loss | 59.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 196 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 809 |\n", "| iterations | 115 |\n", "| time_elapsed | 1163 |\n", "| total_timesteps | 942080 |\n", "| train/ | |\n", "| approx_kl | 0.018430535 |\n", "| clip_fraction | 0.142 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.422 |\n", "| explained_variance | 0.979 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.52 |\n", "| n_updates | 1140 |\n", "| policy_gradient_loss | -0.00327 |\n", "| value_loss | 8.15 |\n", 
"-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 196 |\n", "| ep_rew_mean | 280 |\n", "| time/ | |\n", "| fps | 811 |\n", "| iterations | 116 |\n", "| time_elapsed | 1171 |\n", "| total_timesteps | 950272 |\n", "| train/ | |\n", "| approx_kl | 0.0103278905 |\n", "| clip_fraction | 0.096 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.409 |\n", "| explained_variance | 0.967 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.66 |\n", "| n_updates | 1150 |\n", "| policy_gradient_loss | 1.32e-05 |\n", "| value_loss | 11.2 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 189 |\n", "| ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 812 |\n", "| iterations | 117 |\n", "| time_elapsed | 1179 |\n", "| total_timesteps | 958464 |\n", "| train/ | |\n", "| approx_kl | 0.013442278 |\n", "| clip_fraction | 0.106 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.402 |\n", "| explained_variance | 0.989 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.43 |\n", "| n_updates | 1160 |\n", "| policy_gradient_loss | -0.000461 |\n", "| value_loss | 6.56 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 187 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 813 |\n", "| iterations | 118 |\n", "| time_elapsed | 1188 |\n", "| total_timesteps | 966656 |\n", "| train/ | |\n", "| approx_kl | 0.008221455 |\n", "| clip_fraction | 0.0786 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.38 |\n", "| explained_variance | 0.965 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.85 |\n", "| n_updates | 1170 |\n", "| policy_gradient_loss | -0.00117 |\n", "| value_loss | 22 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 184 |\n", "| 
ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 814 |\n", "| iterations | 119 |\n", "| time_elapsed | 1196 |\n", "| total_timesteps | 974848 |\n", "| train/ | |\n", "| approx_kl | 0.016034331 |\n", "| clip_fraction | 0.123 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.404 |\n", "| explained_variance | 0.945 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.6 |\n", "| n_updates | 1180 |\n", "| policy_gradient_loss | -0.00304 |\n", "| value_loss | 39 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 191 |\n", "| ep_rew_mean | 281 |\n", "| time/ | |\n", "| fps | 816 |\n", "| iterations | 120 |\n", "| time_elapsed | 1204 |\n", "| total_timesteps | 983040 |\n", "| train/ | |\n", "| approx_kl | 0.0139214955 |\n", "| clip_fraction | 0.108 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.382 |\n", "| explained_variance | 0.959 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.65 |\n", "| n_updates | 1190 |\n", "| policy_gradient_loss | -0.00231 |\n", "| value_loss | 12.1 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 193 |\n", "| ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 816 |\n", "| iterations | 121 |\n", "| time_elapsed | 1213 |\n", "| total_timesteps | 991232 |\n", "| train/ | |\n", "| approx_kl | 0.016680727 |\n", "| clip_fraction | 0.0803 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.392 |\n", "| explained_variance | 0.986 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.89 |\n", "| n_updates | 1200 |\n", "| policy_gradient_loss | -0.000347 |\n", "| value_loss | 10.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 196 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 817 |\n", "| iterations | 122 |\n", "| time_elapsed | 1222 |\n", "| total_timesteps | 999424 |\n", 
"| train/ | |\n", "| approx_kl | 0.010871852 |\n", "| clip_fraction | 0.103 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.391 |\n", "| explained_variance | 0.994 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.94 |\n", "| n_updates | 1210 |\n", "| policy_gradient_loss | 6.12e-05 |\n", "| value_loss | 4.93 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 201 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 818 |\n", "| iterations | 123 |\n", "| time_elapsed | 1231 |\n", "| total_timesteps | 1007616 |\n", "| train/ | |\n", "| approx_kl | 0.009346518 |\n", "| clip_fraction | 0.0919 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.422 |\n", "| explained_variance | 0.972 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.51 |\n", "| n_updates | 1220 |\n", "| policy_gradient_loss | -0.00128 |\n", "| value_loss | 36.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 200 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 818 |\n", "| iterations | 124 |\n", "| time_elapsed | 1240 |\n", "| total_timesteps | 1015808 |\n", "| train/ | |\n", "| approx_kl | 0.009216516 |\n", "| clip_fraction | 0.0975 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.424 |\n", "| explained_variance | 0.981 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2 |\n", "| n_updates | 1230 |\n", "| policy_gradient_loss | -0.00122 |\n", "| value_loss | 8.58 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 193 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 819 |\n", "| iterations | 125 |\n", "| time_elapsed | 1249 |\n", "| total_timesteps | 1024000 |\n", "| train/ | |\n", "| approx_kl | 0.011388582 |\n", "| clip_fraction | 0.0948 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.426 |\n", "| 
explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.6 |\n", "| n_updates | 1240 |\n", "| policy_gradient_loss | -0.000579 |\n", "| value_loss | 7.02 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 193 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 820 |\n", "| iterations | 126 |\n", "| time_elapsed | 1257 |\n", "| total_timesteps | 1032192 |\n", "| train/ | |\n", "| approx_kl | 0.014873717 |\n", "| clip_fraction | 0.115 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.406 |\n", "| explained_variance | 0.988 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.82 |\n", "| n_updates | 1250 |\n", "| policy_gradient_loss | -0.000279 |\n", "| value_loss | 6.25 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 194 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 821 |\n", "| iterations | 127 |\n", "| time_elapsed | 1265 |\n", "| total_timesteps | 1040384 |\n", "| train/ | |\n", "| approx_kl | 0.014328106 |\n", "| clip_fraction | 0.112 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.405 |\n", "| explained_variance | 0.986 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.57 |\n", "| n_updates | 1260 |\n", "| policy_gradient_loss | -0.00349 |\n", "| value_loss | 6.86 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 186 |\n", "| ep_rew_mean | 289 |\n", "| time/ | |\n", "| fps | 822 |\n", "| iterations | 128 |\n", "| time_elapsed | 1274 |\n", "| total_timesteps | 1048576 |\n", "| train/ | |\n", "| approx_kl | 0.012116497 |\n", "| clip_fraction | 0.104 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.424 |\n", "| explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.03 |\n", "| n_updates | 1270 |\n", "| policy_gradient_loss | -0.00219 
|\n", "| value_loss | 5.55 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 194 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 823 |\n", "| iterations | 129 |\n", "| time_elapsed | 1282 |\n", "| total_timesteps | 1056768 |\n", "| train/ | |\n", "| approx_kl | 0.013293551 |\n", "| clip_fraction | 0.0965 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.426 |\n", "| explained_variance | 0.989 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.87 |\n", "| n_updates | 1280 |\n", "| policy_gradient_loss | -0.00648 |\n", "| value_loss | 6.52 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 196 |\n", "| ep_rew_mean | 280 |\n", "| time/ | |\n", "| fps | 824 |\n", "| iterations | 130 |\n", "| time_elapsed | 1291 |\n", "| total_timesteps | 1064960 |\n", "| train/ | |\n", "| approx_kl | 0.008993432 |\n", "| clip_fraction | 0.0885 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.427 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.92 |\n", "| n_updates | 1290 |\n", "| policy_gradient_loss | 6.59e-05 |\n", "| value_loss | 5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 195 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 825 |\n", "| iterations | 131 |\n", "| time_elapsed | 1300 |\n", "| total_timesteps | 1073152 |\n", "| train/ | |\n", "| approx_kl | 0.008734301 |\n", "| clip_fraction | 0.0776 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.392 |\n", "| explained_variance | 0.933 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.16 |\n", "| n_updates | 1300 |\n", "| policy_gradient_loss | -0.0015 |\n", "| value_loss | 30 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| 
ep_len_mean | 181 |\n", "| ep_rew_mean | 283 |\n", "| time/ | |\n", "| fps | 826 |\n", "| iterations | 132 |\n", "| time_elapsed | 1308 |\n", "| total_timesteps | 1081344 |\n", "| train/ | |\n", "| approx_kl | 0.012236496 |\n", "| clip_fraction | 0.0996 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.375 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.44 |\n", "| n_updates | 1310 |\n", "| policy_gradient_loss | -0.00274 |\n", "| value_loss | 7.18 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 827 |\n", "| iterations | 133 |\n", "| time_elapsed | 1316 |\n", "| total_timesteps | 1089536 |\n", "| train/ | |\n", "| approx_kl | 0.01221641 |\n", "| clip_fraction | 0.119 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.398 |\n", "| explained_variance | 0.987 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.46 |\n", "| n_updates | 1320 |\n", "| policy_gradient_loss | -0.00191 |\n", "| value_loss | 6.09 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 185 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 828 |\n", "| iterations | 134 |\n", "| time_elapsed | 1325 |\n", "| total_timesteps | 1097728 |\n", "| train/ | |\n", "| approx_kl | 0.008897923 |\n", "| clip_fraction | 0.0887 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.432 |\n", "| explained_variance | 0.986 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.31 |\n", "| n_updates | 1330 |\n", "| policy_gradient_loss | -0.000364 |\n", "| value_loss | 6.38 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 187 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 829 |\n", "| iterations | 135 |\n", "| time_elapsed | 1333 |\n", "| 
total_timesteps | 1105920 |\n", "| train/ | |\n", "| approx_kl | 0.012878727 |\n", "| clip_fraction | 0.0996 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.407 |\n", "| explained_variance | 0.958 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.21 |\n", "| n_updates | 1340 |\n", "| policy_gradient_loss | 0.000802 |\n", "| value_loss | 24.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 182 |\n", "| ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 829 |\n", "| iterations | 136 |\n", "| time_elapsed | 1342 |\n", "| total_timesteps | 1114112 |\n", "| train/ | |\n", "| approx_kl | 0.011373941 |\n", "| clip_fraction | 0.109 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.4 |\n", "| explained_variance | 0.973 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.67 |\n", "| n_updates | 1350 |\n", "| policy_gradient_loss | 0.000548 |\n", "| value_loss | 6.84 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 192 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 830 |\n", "| iterations | 137 |\n", "| time_elapsed | 1350 |\n", "| total_timesteps | 1122304 |\n", "| train/ | |\n", "| approx_kl | 0.013770735 |\n", "| clip_fraction | 0.109 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.429 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.56 |\n", "| n_updates | 1360 |\n", "| policy_gradient_loss | 0.000384 |\n", "| value_loss | 5.67 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 190 |\n", "| ep_rew_mean | 283 |\n", "| time/ | |\n", "| fps | 832 |\n", "| iterations | 138 |\n", "| time_elapsed | 1358 |\n", "| total_timesteps | 1130496 |\n", "| train/ | |\n", "| approx_kl | 0.012790033 |\n", "| clip_fraction | 0.0958 |\n", "| clip_range | 0.2 |\n", "| 
entropy_loss | -0.391 |\n", "| explained_variance | 0.974 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.79 |\n", "| n_updates | 1370 |\n", "| policy_gradient_loss | -0.002 |\n", "| value_loss | 13.5 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 832 |\n", "| iterations | 139 |\n", "| time_elapsed | 1367 |\n", "| total_timesteps | 1138688 |\n", "| train/ | |\n", "| approx_kl | 0.0149136875 |\n", "| clip_fraction | 0.102 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.38 |\n", "| explained_variance | 0.995 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.73 |\n", "| n_updates | 1380 |\n", "| policy_gradient_loss | 0.000362 |\n", "| value_loss | 5.7 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 175 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 833 |\n", "| iterations | 140 |\n", "| time_elapsed | 1375 |\n", "| total_timesteps | 1146880 |\n", "| train/ | |\n", "| approx_kl | 0.011700343 |\n", "| clip_fraction | 0.106 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.405 |\n", "| explained_variance | 0.957 |\n", "| learning_rate | 0.00112 |\n", "| loss | 40 |\n", "| n_updates | 1390 |\n", "| policy_gradient_loss | -0.0024 |\n", "| value_loss | 43.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 169 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 834 |\n", "| iterations | 141 |\n", "| time_elapsed | 1383 |\n", "| total_timesteps | 1155072 |\n", "| train/ | |\n", "| approx_kl | 0.011100046 |\n", "| clip_fraction | 0.112 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.383 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.55 |\n", "| n_updates | 1400 |\n", "| 
policy_gradient_loss | 0.00113 |\n", "| value_loss | 5.74 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 272 |\n", "| time/ | |\n", "| fps | 835 |\n", "| iterations | 142 |\n", "| time_elapsed | 1392 |\n", "| total_timesteps | 1163264 |\n", "| train/ | |\n", "| approx_kl | 0.007867305 |\n", "| clip_fraction | 0.0784 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.41 |\n", "| explained_variance | 0.948 |\n", "| learning_rate | 0.00112 |\n", "| loss | 6.6 |\n", "| n_updates | 1410 |\n", "| policy_gradient_loss | -0.00298 |\n", "| value_loss | 44.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 185 |\n", "| ep_rew_mean | 277 |\n", "| time/ | |\n", "| fps | 836 |\n", "| iterations | 143 |\n", "| time_elapsed | 1400 |\n", "| total_timesteps | 1171456 |\n", "| train/ | |\n", "| approx_kl | 0.013205802 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.388 |\n", "| explained_variance | 0.968 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.98 |\n", "| n_updates | 1420 |\n", "| policy_gradient_loss | -0.000663 |\n", "| value_loss | 24.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 176 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 837 |\n", "| iterations | 144 |\n", "| time_elapsed | 1408 |\n", "| total_timesteps | 1179648 |\n", "| train/ | |\n", "| approx_kl | 0.012715327 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.399 |\n", "| explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.07 |\n", "| n_updates | 1430 |\n", "| policy_gradient_loss | -0.000222 |\n", "| value_loss | 5.64 |\n", "-----------------------------------------\n", 
"-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 183 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 838 |\n", "| iterations | 145 |\n", "| time_elapsed | 1417 |\n", "| total_timesteps | 1187840 |\n", "| train/ | |\n", "| approx_kl | 0.010593424 |\n", "| clip_fraction | 0.0911 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.393 |\n", "| explained_variance | 0.994 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.57 |\n", "| n_updates | 1440 |\n", "| policy_gradient_loss | -0.000939 |\n", "| value_loss | 5.61 |\n", "-----------------------------------------\n", "------------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 187 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 839 |\n", "| iterations | 146 |\n", "| time_elapsed | 1425 |\n", "| total_timesteps | 1196032 |\n", "| train/ | |\n", "| approx_kl | 0.0091911405 |\n", "| clip_fraction | 0.079 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.372 |\n", "| explained_variance | 0.953 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.77 |\n", "| n_updates | 1450 |\n", "| policy_gradient_loss | -0.00158 |\n", "| value_loss | 52.8 |\n", "------------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 187 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 839 |\n", "| iterations | 147 |\n", "| time_elapsed | 1433 |\n", "| total_timesteps | 1204224 |\n", "| train/ | |\n", "| approx_kl | 0.020347305 |\n", "| clip_fraction | 0.116 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.385 |\n", "| explained_variance | 0.987 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.12 |\n", "| n_updates | 1460 |\n", "| policy_gradient_loss | -0.00284 |\n", "| value_loss | 8.19 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 182 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps 
| 840 |\n", "| iterations | 148 |\n", "| time_elapsed | 1442 |\n", "| total_timesteps | 1212416 |\n", "| train/ | |\n", "| approx_kl | 0.01209308 |\n", "| clip_fraction | 0.0854 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.347 |\n", "| explained_variance | 0.954 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.2 |\n", "| n_updates | 1470 |\n", "| policy_gradient_loss | 0.00135 |\n", "| value_loss | 55.8 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 280 |\n", "| time/ | |\n", "| fps | 841 |\n", "| iterations | 149 |\n", "| time_elapsed | 1450 |\n", "| total_timesteps | 1220608 |\n", "| train/ | |\n", "| approx_kl | 0.011741241 |\n", "| clip_fraction | 0.103 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.392 |\n", "| explained_variance | 0.984 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.14 |\n", "| n_updates | 1480 |\n", "| policy_gradient_loss | 0.00122 |\n", "| value_loss | 8.76 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 182 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 842 |\n", "| iterations | 150 |\n", "| time_elapsed | 1458 |\n", "| total_timesteps | 1228800 |\n", "| train/ | |\n", "| approx_kl | 0.009493947 |\n", "| clip_fraction | 0.0961 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.35 |\n", "| explained_variance | 0.947 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.06 |\n", "| n_updates | 1490 |\n", "| policy_gradient_loss | -0.00245 |\n", "| value_loss | 28.7 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 176 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 843 |\n", "| iterations | 151 |\n", "| time_elapsed | 1467 |\n", "| total_timesteps | 1236992 |\n", "| train/ | |\n", "| approx_kl | 0.012886401 |\n", 
"| clip_fraction | 0.104 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.352 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 7.95 |\n", "| n_updates | 1500 |\n", "| policy_gradient_loss | 0.000245 |\n", "| value_loss | 7.71 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 170 |\n", "| ep_rew_mean | 272 |\n", "| time/ | |\n", "| fps | 844 |\n", "| iterations | 152 |\n", "| time_elapsed | 1475 |\n", "| total_timesteps | 1245184 |\n", "| train/ | |\n", "| approx_kl | 0.006650541 |\n", "| clip_fraction | 0.0771 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.37 |\n", "| explained_variance | 0.941 |\n", "| learning_rate | 0.00112 |\n", "| loss | 17.2 |\n", "| n_updates | 1510 |\n", "| policy_gradient_loss | -0.000548 |\n", "| value_loss | 73.4 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 173 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 844 |\n", "| iterations | 153 |\n", "| time_elapsed | 1483 |\n", "| total_timesteps | 1253376 |\n", "| train/ | |\n", "| approx_kl | 0.018804958 |\n", "| clip_fraction | 0.118 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.354 |\n", "| explained_variance | 0.944 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.28 |\n", "| n_updates | 1520 |\n", "| policy_gradient_loss | 0.000552 |\n", "| value_loss | 29.5 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 179 |\n", "| ep_rew_mean | 288 |\n", "| time/ | |\n", "| fps | 845 |\n", "| iterations | 154 |\n", "| time_elapsed | 1492 |\n", "| total_timesteps | 1261568 |\n", "| train/ | |\n", "| approx_kl | 0.016751654 |\n", "| clip_fraction | 0.118 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.375 |\n", "| explained_variance | 0.979 |\n", "| learning_rate | 
0.00112 |\n", "| loss | 12.1 |\n", "| n_updates | 1530 |\n", "| policy_gradient_loss | -0.00202 |\n", "| value_loss | 10.9 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 845 |\n", "| iterations | 155 |\n", "| time_elapsed | 1500 |\n", "| total_timesteps | 1269760 |\n", "| train/ | |\n", "| approx_kl | 0.013677096 |\n", "| clip_fraction | 0.105 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.376 |\n", "| explained_variance | 0.984 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.75 |\n", "| n_updates | 1540 |\n", "| policy_gradient_loss | -0.000799 |\n", "| value_loss | 8.22 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 184 |\n", "| ep_rew_mean | 279 |\n", "| time/ | |\n", "| fps | 846 |\n", "| iterations | 156 |\n", "| time_elapsed | 1509 |\n", "| total_timesteps | 1277952 |\n", "| train/ | |\n", "| approx_kl | 0.017476227 |\n", "| clip_fraction | 0.125 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.42 |\n", "| explained_variance | 0.974 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.66 |\n", "| n_updates | 1550 |\n", "| policy_gradient_loss | -0.0019 |\n", "| value_loss | 10.8 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 846 |\n", "| iterations | 157 |\n", "| time_elapsed | 1518 |\n", "| total_timesteps | 1286144 |\n", "| train/ | |\n", "| approx_kl | 0.01002178 |\n", "| clip_fraction | 0.0956 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.389 |\n", "| explained_variance | 0.981 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.94 |\n", "| n_updates | 1560 |\n", "| policy_gradient_loss | -0.0037 |\n", "| value_loss | 10.7 |\n", 
"----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 847 |\n", "| iterations | 158 |\n", "| time_elapsed | 1526 |\n", "| total_timesteps | 1294336 |\n", "| train/ | |\n", "| approx_kl | 0.012355883 |\n", "| clip_fraction | 0.0981 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.406 |\n", "| explained_variance | 0.957 |\n", "| learning_rate | 0.00112 |\n", "| loss | 8.52 |\n", "| n_updates | 1570 |\n", "| policy_gradient_loss | -0.0023 |\n", "| value_loss | 27.2 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 188 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 848 |\n", "| iterations | 159 |\n", "| time_elapsed | 1535 |\n", "| total_timesteps | 1302528 |\n", "| train/ | |\n", "| approx_kl | 0.018652067 |\n", "| clip_fraction | 0.115 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.4 |\n", "| explained_variance | 0.942 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.56 |\n", "| n_updates | 1580 |\n", "| policy_gradient_loss | -6.91e-05 |\n", "| value_loss | 66.6 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 179 |\n", "| ep_rew_mean | 289 |\n", "| time/ | |\n", "| fps | 848 |\n", "| iterations | 160 |\n", "| time_elapsed | 1544 |\n", "| total_timesteps | 1310720 |\n", "| train/ | |\n", "| approx_kl | 0.010998421 |\n", "| clip_fraction | 0.11 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.375 |\n", "| explained_variance | 0.988 |\n", "| learning_rate | 0.00112 |\n", "| loss | 6.19 |\n", "| n_updates | 1590 |\n", "| policy_gradient_loss | 0.00455 |\n", "| value_loss | 9.05 |\n", "-----------------------------------------\n", "---------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 181 |\n", "| 
ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 849 |\n", "| iterations | 161 |\n", "| time_elapsed | 1552 |\n", "| total_timesteps | 1318912 |\n", "| train/ | |\n", "| approx_kl | 0.0128868 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.391 |\n", "| explained_variance | 0.988 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.23 |\n", "| n_updates | 1600 |\n", "| policy_gradient_loss | -0.00169 |\n", "| value_loss | 8.54 |\n", "---------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 191 |\n", "| ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 850 |\n", "| iterations | 162 |\n", "| time_elapsed | 1560 |\n", "| total_timesteps | 1327104 |\n", "| train/ | |\n", "| approx_kl | 0.010086833 |\n", "| clip_fraction | 0.102 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.372 |\n", "| explained_variance | 0.991 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.51 |\n", "| n_updates | 1610 |\n", "| policy_gradient_loss | 0.000697 |\n", "| value_loss | 7.88 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 186 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 850 |\n", "| iterations | 163 |\n", "| time_elapsed | 1569 |\n", "| total_timesteps | 1335296 |\n", "| train/ | |\n", "| approx_kl | 0.025277074 |\n", "| clip_fraction | 0.116 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.361 |\n", "| explained_variance | 0.995 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.22 |\n", "| n_updates | 1620 |\n", "| policy_gradient_loss | -0.000232 |\n", "| value_loss | 5.94 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 176 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 851 |\n", "| iterations | 164 |\n", "| time_elapsed | 1577 |\n", "| total_timesteps | 1343488 |\n", 
"| train/ | |\n", "| approx_kl | 0.013876938 |\n", "| clip_fraction | 0.0939 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.36 |\n", "| explained_variance | 0.942 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.63 |\n", "| n_updates | 1630 |\n", "| policy_gradient_loss | -0.0028 |\n", "| value_loss | 34.1 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 173 |\n", "| ep_rew_mean | 285 |\n", "| time/ | |\n", "| fps | 852 |\n", "| iterations | 165 |\n", "| time_elapsed | 1585 |\n", "| total_timesteps | 1351680 |\n", "| train/ | |\n", "| approx_kl | 0.012701863 |\n", "| clip_fraction | 0.103 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.343 |\n", "| explained_variance | 0.987 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.92 |\n", "| n_updates | 1640 |\n", "| policy_gradient_loss | -0.000969 |\n", "| value_loss | 7.23 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 171 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 853 |\n", "| iterations | 166 |\n", "| time_elapsed | 1593 |\n", "| total_timesteps | 1359872 |\n", "| train/ | |\n", "| approx_kl | 0.01579144 |\n", "| clip_fraction | 0.0956 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.351 |\n", "| explained_variance | 0.966 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.12 |\n", "| n_updates | 1650 |\n", "| policy_gradient_loss | 0.000275 |\n", "| value_loss | 37.2 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 173 |\n", "| ep_rew_mean | 278 |\n", "| time/ | |\n", "| fps | 854 |\n", "| iterations | 167 |\n", "| time_elapsed | 1601 |\n", "| total_timesteps | 1368064 |\n", "| train/ | |\n", "| approx_kl | 0.009606147 |\n", "| clip_fraction | 0.0874 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.37 |\n", "| 
explained_variance | 0.943 |\n", "| learning_rate | 0.00112 |\n", "| loss | 4.6 |\n", "| n_updates | 1660 |\n", "| policy_gradient_loss | -0.00185 |\n", "| value_loss | 65.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 174 |\n", "| ep_rew_mean | 276 |\n", "| time/ | |\n", "| fps | 854 |\n", "| iterations | 168 |\n", "| time_elapsed | 1610 |\n", "| total_timesteps | 1376256 |\n", "| train/ | |\n", "| approx_kl | 0.011658724 |\n", "| clip_fraction | 0.093 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.354 |\n", "| explained_variance | 0.936 |\n", "| learning_rate | 0.00112 |\n", "| loss | 15.7 |\n", "| n_updates | 1670 |\n", "| policy_gradient_loss | -0.00121 |\n", "| value_loss | 74.3 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 174 |\n", "| ep_rew_mean | 280 |\n", "| time/ | |\n", "| fps | 855 |\n", "| iterations | 169 |\n", "| time_elapsed | 1618 |\n", "| total_timesteps | 1384448 |\n", "| train/ | |\n", "| approx_kl | 0.02554712 |\n", "| clip_fraction | 0.13 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.345 |\n", "| explained_variance | 0.933 |\n", "| learning_rate | 0.00112 |\n", "| loss | 5.86 |\n", "| n_updates | 1680 |\n", "| policy_gradient_loss | -0.000988 |\n", "| value_loss | 51.4 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 172 |\n", "| ep_rew_mean | 282 |\n", "| time/ | |\n", "| fps | 856 |\n", "| iterations | 170 |\n", "| time_elapsed | 1626 |\n", "| total_timesteps | 1392640 |\n", "| train/ | |\n", "| approx_kl | 0.015233109 |\n", "| clip_fraction | 0.114 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.362 |\n", "| explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.93 |\n", "| n_updates | 1690 |\n", "| policy_gradient_loss | -0.00193 
|\n", "| value_loss | 5.97 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 171 |\n", "| ep_rew_mean | 284 |\n", "| time/ | |\n", "| fps | 856 |\n", "| iterations | 171 |\n", "| time_elapsed | 1634 |\n", "| total_timesteps | 1400832 |\n", "| train/ | |\n", "| approx_kl | 0.011294821 |\n", "| clip_fraction | 0.107 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.374 |\n", "| explained_variance | 0.981 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.55 |\n", "| n_updates | 1700 |\n", "| policy_gradient_loss | -0.00272 |\n", "| value_loss | 6.99 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 174 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 857 |\n", "| iterations | 172 |\n", "| time_elapsed | 1642 |\n", "| total_timesteps | 1409024 |\n", "| train/ | |\n", "| approx_kl | 0.014799068 |\n", "| clip_fraction | 0.119 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.393 |\n", "| explained_variance | 0.996 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.06 |\n", "| n_updates | 1710 |\n", "| policy_gradient_loss | -0.000959 |\n", "| value_loss | 4.52 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 172 |\n", "| ep_rew_mean | 281 |\n", "| time/ | |\n", "| fps | 858 |\n", "| iterations | 173 |\n", "| time_elapsed | 1651 |\n", "| total_timesteps | 1417216 |\n", "| train/ | |\n", "| approx_kl | 0.014618198 |\n", "| clip_fraction | 0.0993 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.351 |\n", "| explained_variance | 0.994 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.64 |\n", "| n_updates | 1720 |\n", "| policy_gradient_loss | -0.00254 |\n", "| value_loss | 5.52 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", 
"| ep_len_mean | 172 |\n", "| ep_rew_mean | 283 |\n", "| time/ | |\n", "| fps | 858 |\n", "| iterations | 174 |\n", "| time_elapsed | 1659 |\n", "| total_timesteps | 1425408 |\n", "| train/ | |\n", "| approx_kl | 0.008878873 |\n", "| clip_fraction | 0.0791 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.338 |\n", "| explained_variance | 0.925 |\n", "| learning_rate | 0.00112 |\n", "| loss | 58.5 |\n", "| n_updates | 1730 |\n", "| policy_gradient_loss | -0.00121 |\n", "| value_loss | 49.8 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 174 |\n", "| ep_rew_mean | 289 |\n", "| time/ | |\n", "| fps | 859 |\n", "| iterations | 175 |\n", "| time_elapsed | 1667 |\n", "| total_timesteps | 1433600 |\n", "| train/ | |\n", "| approx_kl | 0.01575758 |\n", "| clip_fraction | 0.119 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.353 |\n", "| explained_variance | 0.971 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.66 |\n", "| n_updates | 1740 |\n", "| policy_gradient_loss | -0.00177 |\n", "| value_loss | 26.2 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 174 |\n", "| ep_rew_mean | 286 |\n", "| time/ | |\n", "| fps | 860 |\n", "| iterations | 176 |\n", "| time_elapsed | 1676 |\n", "| total_timesteps | 1441792 |\n", "| train/ | |\n", "| approx_kl | 0.014210571 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.336 |\n", "| explained_variance | 0.978 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.87 |\n", "| n_updates | 1750 |\n", "| policy_gradient_loss | -0.00141 |\n", "| value_loss | 20.3 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 172 |\n", "| ep_rew_mean | 287 |\n", "| time/ | |\n", "| fps | 860 |\n", "| iterations | 177 |\n", "| time_elapsed | 1684 |\n", "| 
total_timesteps | 1449984 |\n", "| train/ | |\n", "| approx_kl | 0.017352598 |\n", "| clip_fraction | 0.111 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.339 |\n", "| explained_variance | 0.965 |\n", "| learning_rate | 0.00112 |\n", "| loss | 58.6 |\n", "| n_updates | 1760 |\n", "| policy_gradient_loss | -6.48e-05 |\n", "| value_loss | 28.8 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 175 |\n", "| ep_rew_mean | 289 |\n", "| time/ | |\n", "| fps | 861 |\n", "| iterations | 178 |\n", "| time_elapsed | 1692 |\n", "| total_timesteps | 1458176 |\n", "| train/ | |\n", "| approx_kl | 0.016775792 |\n", "| clip_fraction | 0.115 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.339 |\n", "| explained_variance | 0.99 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.5 |\n", "| n_updates | 1770 |\n", "| policy_gradient_loss | -0.00131 |\n", "| value_loss | 5.79 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 177 |\n", "| ep_rew_mean | 291 |\n", "| time/ | |\n", "| fps | 862 |\n", "| iterations | 179 |\n", "| time_elapsed | 1701 |\n", "| total_timesteps | 1466368 |\n", "| train/ | |\n", "| approx_kl | 0.019144004 |\n", "| clip_fraction | 0.122 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.384 |\n", "| explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 3.26 |\n", "| n_updates | 1780 |\n", "| policy_gradient_loss | -0.00627 |\n", "| value_loss | 4.75 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 173 |\n", "| ep_rew_mean | 291 |\n", "| time/ | |\n", "| fps | 862 |\n", "| iterations | 180 |\n", "| time_elapsed | 1709 |\n", "| total_timesteps | 1474560 |\n", "| train/ | |\n", "| approx_kl | 0.011667462 |\n", "| clip_fraction | 0.101 |\n", "| clip_range | 0.2 |\n", "| 
entropy_loss | -0.376 |\n", "| explained_variance | 0.993 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.83 |\n", "| n_updates | 1790 |\n", "| policy_gradient_loss | 0.000436 |\n", "| value_loss | 5.13 |\n", "-----------------------------------------\n", "----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 181 |\n", "| ep_rew_mean | 288 |\n", "| time/ | |\n", "| fps | 863 |\n", "| iterations | 181 |\n", "| time_elapsed | 1718 |\n", "| total_timesteps | 1482752 |\n", "| train/ | |\n", "| approx_kl | 0.01359967 |\n", "| clip_fraction | 0.115 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.408 |\n", "| explained_variance | 0.992 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.24 |\n", "| n_updates | 1800 |\n", "| policy_gradient_loss | -0.000924 |\n", "| value_loss | 4.55 |\n", "----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 180 |\n", "| ep_rew_mean | 290 |\n", "| time/ | |\n", "| fps | 863 |\n", "| iterations | 182 |\n", "| time_elapsed | 1726 |\n", "| total_timesteps | 1490944 |\n", "| train/ | |\n", "| approx_kl | 0.010642027 |\n", "| clip_fraction | 0.0949 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.371 |\n", "| explained_variance | 0.996 |\n", "| learning_rate | 0.00112 |\n", "| loss | 1.52 |\n", "| n_updates | 1810 |\n", "| policy_gradient_loss | -0.00211 |\n", "| value_loss | 4.22 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 170 |\n", "| ep_rew_mean | 291 |\n", "| time/ | |\n", "| fps | 864 |\n", "| iterations | 183 |\n", "| time_elapsed | 1734 |\n", "| total_timesteps | 1499136 |\n", "| train/ | |\n", "| approx_kl | 0.012676118 |\n", "| clip_fraction | 0.098 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.376 |\n", "| explained_variance | 0.996 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.78 |\n", "| n_updates | 1820 |\n", "| 
policy_gradient_loss | -0.00362 |\n", "| value_loss | 4.87 |\n", "-----------------------------------------\n", "-----------------------------------------\n", "| rollout/ | |\n", "| ep_len_mean | 170 |\n", "| ep_rew_mean | 289 |\n", "| time/ | |\n", "| fps | 864 |\n", "| iterations | 184 |\n", "| time_elapsed | 1743 |\n", "| total_timesteps | 1507328 |\n", "| train/ | |\n", "| approx_kl | 0.016170755 |\n", "| clip_fraction | 0.099 |\n", "| clip_range | 0.2 |\n", "| entropy_loss | -0.352 |\n", "| explained_variance | 0.996 |\n", "| learning_rate | 0.00112 |\n", "| loss | 2.39 |\n", "| n_updates | 1830 |\n", "| policy_gradient_loss | -0.0021 |\n", "| value_loss | 4.96 |\n", "-----------------------------------------\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": {}, "execution_count": 4 } ] }, { "cell_type": "code", "source": [ "# Save the tuned model\n", "model_name = \"ppo-LunarLander-v2\"\n", "model_tuned.save(model_name)" ], "metadata": { "id": "tfq7Ss7onBOJ" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "import gymnasium as gym\n", "# TODO: Evaluate the agent\n", "# Create a new environment for evaluation\n", "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n", "\n", "# Evaluate the model with 10 evaluation episodes and deterministic=True\n", "mean_reward, std_reward = evaluate_policy(model_tuned, eval_env, n_eval_episodes=10, deterministic=True)\n", "\n", "\n", "# Print the results\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")\n" ], "metadata": { "id": "Zgc1GX5TlBkT", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "cbed7222-d8c0-406f-b2bf-b4bf16131be3" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "mean_reward=285.21 +/- 15.543558761129864\n" ] } ] }, { "cell_type": "markdown", "metadata": { "id": "QAN7B0_HCVZC" }, "source": [ "#### Solution" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": { "id": "543OHYDfcjK4" }, "outputs": [], "source": [ "# SOLUTION\n", "# We added some parameters to accelerate the training\n", "model = PPO(\n", " policy = 'MlpPolicy',\n", " env = env,\n", " n_steps = 1024,\n", " batch_size = 64,\n", " n_epochs = 4,\n", " gamma = 0.999,\n", " gae_lambda = 0.98,\n", " ent_coef = 0.01,\n", " verbose=1)" ] }, { "cell_type": "markdown", "metadata": { "id": "ClJJk88yoBUi" }, "source": [ "## Train the PPO agent ๐Ÿƒ\n", "- Let's train our agent for 1,000,000 timesteps, don't forget to use GPU on Colab. It will take approximately ~20min, but you can use fewer timesteps if you just want to try it out.\n", "- During the training, take a โ˜• break you deserved it ๐Ÿค—" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qKnYkNiVp89p" }, "outputs": [], "source": [ "# TODO: Train it for 1,000,000 timesteps\n", "\n", "# TODO: Specify file name for model and save the model to file\n", "model_name = \"ppo-LunarLander-v2-optuna-tuned\"\n" ] }, { "cell_type": "markdown", "metadata": { "id": "1bQzQ-QcE3zo" }, "source": [ "#### Solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "poBCy9u_csyR" }, "outputs": [], "source": [ "# SOLUTION\n", "# Train it for 1,000,000 timesteps\n", "model.learn(total_timesteps=1000000)\n", "# Save the model\n", "model_name = \"ppo-LunarLander-v2\"\n", "model.save(model_name)#saved where???" 
] }, { "cell_type": "code", "source": [ "del model\n", "# The agent was trained and saved with PPO, so reload it with PPO.load (not DQN.load)\n", "model = PPO.load(\"ppo-LunarLander-v2\", env=env)" ], "metadata": { "id": "ghbFJTIfKl8m" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "BY_HuedOoISR" }, "source": [ "## Evaluate the agent 📈\n", "- Remember to wrap the environment in a [Monitor](https://stable-baselines3.readthedocs.io/en/master/common/monitor.html).\n", "- Now that our Lunar Lander agent is trained 🚀, we need to **check its performance**.\n", "- Stable-Baselines3 provides a function to do that: `evaluate_policy`.\n", "- To fill in that part, you need to [check the documentation](https://stable-baselines3.readthedocs.io/en/master/guide/examples.html#basic-usage-training-saving-loading).\n", "- In the next step, we'll see **how to automatically evaluate and share your agent to compete in a leaderboard, but for now let's do it ourselves**.\n", "\n", "\n", "💡 When you evaluate your agent, you should not use your training environment but create a separate evaluation environment."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yRpno0glsADy", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "a0f29fad-921e-4986-87a8-a56f8e296b59" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "mean_reward=173.26 +/- 104.41973128975222\n" ] } ], "source": [ "# TODO: Evaluate the agent\n", "# Create a new environment for evaluation\n", "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n", "\n", "# Evaluate the model with 10 evaluation episodes and deterministic=True\n", "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n", "\n", "\n", "# Print the results\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")\n" ] }, { "cell_type": "code", "source": [ "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n", "\n", "\n", "# Print the results\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")" ], "metadata": { "id": "ZiOVUQ07OouW" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "BqPKw3jt_pG5" }, "source": [ "#### Solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zpz8kHlt_a_m" }, "outputs": [], "source": [ "#@title\n", "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n", "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "reBhoODwcXfr" }, "source": [ "- In my case, I got a mean reward of `200.20 +/- 20.80` after training for 1 million steps, which means that our lunar lander agent is ready to land on the moon 🌛🥳."
] }, { "cell_type": "markdown", "source": [ "# Optuna :)\n", "\n" ], "metadata": { "id": "AaSEQ6LuBzCc" } }, { "cell_type": "code", "source": [ "import optuna\n", "from optuna.pruners import MedianPruner\n", "from optuna.samplers import TPESampler\n", "from optuna.visualization import plot_optimization_history, plot_param_importances" ], "metadata": { "id": "tVtnWMGTc8jX" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "N_TRIALS = 100 # Maximum number of trials\n", "N_JOBS = 1 # Number of jobs to run in parallel\n", "N_STARTUP_TRIALS = 5 # Stop random sampling after N_STARTUP_TRIALS\n", "N_EVALUATIONS = 2 # Number of evaluations during the training\n", "N_TIMESTEPS = int(2e4) # Training budget\n", "EVAL_FREQ = int(N_TIMESTEPS / N_EVALUATIONS)\n", "N_EVAL_ENVS = 5\n", "N_EVAL_EPISODES = 10\n", "TIMEOUT = int(60 * 30) # 30 minutes, expressed in seconds\n", "\n", "ENV_ID = \"LunarLander-v2\"\n", "\n", "DEFAULT_HYPERPARAMS = {\n", " \"policy\": \"MlpPolicy\",\n", " \"env\": ENV_ID,\n", "}\n", "\n" ], "metadata": { "id": "un7FPcCgB6Jj", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "055e8043-9eb6-469a-c269-81689df35009" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "/usr/local/lib/python3.10/dist-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. 
Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n", " and should_run_async(code)\n" ] } ] }, { "cell_type": "code", "source": [ "from typing import Any, Dict\n", "import torch\n", "import torch.nn as nn\n", "\n", "def sample_ppo_params(trial: optuna.Trial) -> Dict[str, Any]:\n", " \"\"\"\n", " Sampler for PPO hyperparameters.\n", "\n", " :param trial: Optuna trial object\n", " :return: The sampled hyperparameters for the given trial.\n", " \"\"\"\n", " # Discount factor between 0.9 and 0.9999\n", " gamma = 1.0 - trial.suggest_float(\"gamma\", 0.0001, 0.1, log=True)\n", " max_grad_norm = trial.suggest_float(\"max_grad_norm\", 0.3, 5.0, log=True)\n", " # 256, 512, ... 4096 (changed from the original 8, 16, 32, ... 1024)\n", " n_steps = 2 ** trial.suggest_int(\"exponent_n_steps\", 8, 12)\n", "\n", " ### YOUR CODE HERE\n", " # TODO:\n", " # - define the learning rate search space [1e-5, 1] (log) -> `suggest_float`\n", " # - define the network architecture search space [\"tiny\", \"small\"] -> `suggest_categorical`\n", " # - define the activation function search space [\"tanh\", \"relu\"]\n", " learning_rate = trial.suggest_float(\"learning_rate\", 1e-5, 1, log=True)\n", " #net_arch = ...\n", " #activation_fn = trial.suggest_categorical(\"activation_fn\", [\"tanh\", \"relu\"])\n", "\n", " ### END OF YOUR CODE\n", "\n", " # Display true values\n", " trial.set_user_attr(\"gamma_\", gamma)\n", " trial.set_user_attr(\"n_steps\", n_steps)\n", "\n", " \"\"\"\n", " net_arch = [\n", " {\"pi\": [64], \"vf\": [64]}\n", " if net_arch == \"tiny\"\n", " else {\"pi\": [64, 64], \"vf\": [64, 64]}\n", " ]\"\"\"\n", "\n", " #activation_fn = {\"tanh\": nn.Tanh, \"relu\": nn.ReLU}[activation_fn]\n", "\n", " return {\n", " \"n_steps\": n_steps,\n", " \"gamma\": gamma,\n", " \"learning_rate\": learning_rate,\n", " \"max_grad_norm\": max_grad_norm,\n", " #\"policy_kwargs\": {\n", " #\"net_arch\": net_arch,\n", " 
#\"activation_fn\": activation_fn,\n", " #},\n", " }" ], "metadata": { "id": "i7XKpH42UNSZ" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "from stable_baselines3.common.callbacks import EvalCallback\n", "\n", "class TrialEvalCallback(EvalCallback):\n", " \"\"\"\n", " Callback used for evaluating and reporting a trial.\n", "\n", " :param eval_env: Evaluation environment\n", " :param trial: Optuna trial object\n", " :param n_eval_episodes: Number of evaluation episodes\n", " :param eval_freq: Evaluate the agent every ``eval_freq`` call of the callback.\n", " :param deterministic: Whether the evaluation should\n", " use a stochastic or deterministic policy.\n", " :param verbose:\n", " \"\"\"\n", "\n", " def __init__(\n", " self,\n", " eval_env: gym.Env,\n", " trial: optuna.Trial,\n", " n_eval_episodes: int = 5,\n", " eval_freq: int = 10000,\n", " deterministic: bool = True,\n", " verbose: int = 0,\n", " ):\n", "\n", " super().__init__(\n", " eval_env=eval_env,\n", " n_eval_episodes=n_eval_episodes,\n", " eval_freq=eval_freq,\n", " deterministic=deterministic,\n", " verbose=verbose,\n", " )\n", " self.trial = trial\n", " self.eval_idx = 0\n", " self.is_pruned = False\n", "\n", " def _on_step(self) -> bool:\n", " if self.eval_freq > 0 and self.n_calls % self.eval_freq == 0:\n", " # Evaluate policy (done in the parent class)\n", " super()._on_step()\n", " self.eval_idx += 1\n", " # Send report to Optuna\n", " self.trial.report(self.last_mean_reward, self.eval_idx)\n", " # Prune trial if needed\n", " if self.trial.should_prune():\n", " self.is_pruned = True\n", " return False\n", " return True" ], "metadata": { "id": "X-RGhCw2P6RI" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "def objective(trial: optuna.Trial) -> float:\n", " \"\"\"\n", " Objective function used by Optuna to evaluate\n", " one configuration (i.e., one set of hyperparameters).\n", "\n", " Given a trial object, it will sample 
hyperparameters,\n", " evaluate them and report the result (mean episodic reward after training)\n", "\n", " :param trial: Optuna trial object\n", " :return: Mean episodic reward after training\n", " \"\"\"\n", "\n", " kwargs = DEFAULT_HYPERPARAMS.copy()\n", " ### YOUR CODE HERE\n", " # TODO:\n", " # 1. Sample hyperparameters and update the default keyword arguments: `kwargs.update(other_params)`\n", " # 2. Create the evaluation envs\n", " # 3. Create the `TrialEvalCallback`\n", "\n", " # 1. Sample hyperparameters and update the keyword arguments\n", " kwargs.update(sample_ppo_params(trial))\n", " # Create the RL model\n", " model = PPO(**kwargs)\n", "\n", " # 2. Create envs used for evaluation using `make_vec_env`, `ENV_ID` and `N_EVAL_ENVS`\n", " eval_envs = make_vec_env(ENV_ID, n_envs=N_EVAL_ENVS)\n", " # 3. Create the `TrialEvalCallback` callback defined above that will periodically evaluate\n", " # and report the performance using `N_EVAL_EPISODES` every `EVAL_FREQ`\n", " # TrialEvalCallback signature:\n", " # TrialEvalCallback(eval_env, trial, n_eval_episodes, eval_freq, deterministic, verbose)\n", " eval_callback = TrialEvalCallback(\n", " eval_envs, trial,\n", " n_eval_episodes=N_EVAL_EPISODES, eval_freq=EVAL_FREQ, deterministic=False, verbose=1\n", " )\n", "\n", " ### END OF YOUR CODE\n", "\n", " nan_encountered = False\n", " try:\n", " # Train the model\n", " model.learn(N_TIMESTEPS, callback=eval_callback)\n", " except AssertionError as e:\n", " # Sometimes, random hyperparams can generate NaN\n", " print(e)\n", " nan_encountered = True\n", " finally:\n", " # Free memory\n", " model.env.close()\n", " eval_envs.close()\n", "\n", " # Tell the optimizer that the trial failed\n", " if nan_encountered:\n", " return float(\"nan\")\n", "\n", " if eval_callback.is_pruned:\n", " raise optuna.exceptions.TrialPruned()\n", "\n", " return eval_callback.last_mean_reward" ], "metadata": { "id": "eircqIQiQ_vC" }, "execution_count": null, "outputs": [] }, { "cell_type": 
"code", "source": [ "import torch as th\n", "\n", "# Set pytorch num threads to 1 for faster training\n", "th.set_num_threads(1)\n", "# Select the sampler, can be random, TPESampler, CMAES, ...\n", "sampler = TPESampler(n_startup_trials=N_STARTUP_TRIALS)\n", "# Do not prune before 1/3 of the max budget is used\n", "pruner = MedianPruner(\n", " n_startup_trials=N_STARTUP_TRIALS, n_warmup_steps=11#N_EVALUATIONS // 3\n", ")\n", "# Create the study and start the hyperparameter optimization\n", "study = optuna.create_study(sampler=sampler, pruner=pruner, direction=\"maximize\")\n", "\n", "#include standard params :)\n", "study.enqueue_trial({\n", " \"gamma\": 1-0.99,\n", " \"max_grad_norm\": 0.5,\n", " \"learning_rate\": 0.0003\n", "}\n", ")\n", "\n", "try:\n", " study.optimize(objective, n_trials=N_TRIALS, n_jobs=N_JOBS, timeout=TIMEOUT)\n", "except KeyboardInterrupt:\n", " pass\n", "\n", "print(\"Number of finished trials: \", len(study.trials))\n", "\n", "print(\"Best trial:\")\n", "trial = study.best_trial\n", "\n", "print(f\" Value: {trial.value}\")\n", "\n", "print(\" Params: \")\n", "for key, value in trial.params.items():\n", " print(f\" {key}: {value}\")\n", "\n", "print(\" User attrs:\")\n", "for key, value in trial.user_attrs.items():\n", " print(f\" {key}: {value}\")\n", "\n", "# Write report\n", "study.trials_dataframe().to_csv(\"study_results_a2c_cartpole.csv\")\n", "\n", "fig1 = plot_optimization_history(study)\n", "fig2 = plot_param_importances(study)\n", "\n", "fig1.show()\n", "fig2.show()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "tFNDjvlbUEBo", "outputId": "268439a7-ff04-44e0-c582-d913a1171dea" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:06:50,989] A new study created in memory with name: no-name-02eb9dfc-ab5f-4d2e-a90e-031a2b6b9561\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, 
episode_reward=-166.78 +/- 100.36\n", "Episode length: 128.30 +/- 22.34\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-140.89 +/- 108.97\n", "Episode length: 190.40 +/- 36.48\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:08:00,396] Trial 0 finished with value: -140.89088580000004 and parameters: {'gamma': 0.010000000000000009, 'max_grad_norm': 0.5, 'exponent_n_steps': 10, 'learning_rate': 0.0003}. Best is trial 0 with value: -140.89088580000004.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-532.54 +/- 170.96\n", "Episode length: 64.60 +/- 10.23\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-680.22 +/- 147.90\n", "Episode length: 74.90 +/- 13.49\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:08:53,630] Trial 1 finished with value: -680.218778 and parameters: {'gamma': 0.006386863321771126, 'max_grad_norm': 4.394368448565919, 'exponent_n_steps': 11, 'learning_rate': 0.09441575033584668}. Best is trial 0 with value: -140.89088580000004.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-514.25 +/- 120.89\n", "Episode length: 66.40 +/- 14.97\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-607.98 +/- 138.34\n", "Episode length: 65.60 +/- 11.44\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:09:44,920] Trial 2 finished with value: -607.9839039000001 and parameters: {'gamma': 0.00014223117433557477, 'max_grad_norm': 0.7062802622404362, 'exponent_n_steps': 10, 'learning_rate': 0.2893891436067333}. 
Best is trial 0 with value: -140.89088580000004.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-159.53 +/- 103.37\n", "Episode length: 93.70 +/- 20.21\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-178.25 +/- 101.20\n", "Episode length: 84.20 +/- 17.08\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:10:36,176] Trial 3 finished with value: -178.2513414 and parameters: {'gamma': 0.0037798286192951606, 'max_grad_norm': 1.5819785343399344, 'exponent_n_steps': 12, 'learning_rate': 1.3714737408556629e-05}. Best is trial 0 with value: -140.89088580000004.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-75.71 +/- 58.71\n", "Episode length: 112.00 +/- 23.03\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-41.04 +/- 91.10\n", "Episode length: 547.50 +/- 318.13\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:11:39,591] Trial 4 finished with value: -41.0395348 and parameters: {'gamma': 0.0005973482521263276, 'max_grad_norm': 2.41482259354506, 'exponent_n_steps': 9, 'learning_rate': 0.010500019576301626}. Best is trial 4 with value: -41.0395348.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-159.05 +/- 110.78\n", "Episode length: 144.60 +/- 20.88\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-34.56 +/- 102.35\n", "Episode length: 158.40 +/- 38.41\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:12:37,243] Trial 5 finished with value: -34.5613051 and parameters: {'gamma': 0.00021558042820451996, 'max_grad_norm': 2.538096202465164, 'exponent_n_steps': 8, 'learning_rate': 0.005102338467254626}. 
Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-193.13 +/- 103.77\n", "Episode length: 133.20 +/- 29.38\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-56.27 +/- 77.01\n", "Episode length: 289.30 +/- 240.77\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:13:34,229] Trial 6 finished with value: -56.2732172 and parameters: {'gamma': 0.00013407555665877972, 'max_grad_norm': 4.6928202088713284, 'exponent_n_steps': 8, 'learning_rate': 0.0004626623132401827}. Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-615.73 +/- 111.52\n", "Episode length: 99.30 +/- 11.43\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-135.39 +/- 81.29\n", "Episode length: 227.40 +/- 72.61\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:14:29,291] Trial 7 finished with value: -135.39342919999999 and parameters: {'gamma': 0.05266178258746849, 'max_grad_norm': 1.0059911531773587, 'exponent_n_steps': 8, 'learning_rate': 0.008637685530174312}. Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-90.72 +/- 60.32\n", "Episode length: 146.40 +/- 28.98\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-42.87 +/- 67.07\n", "Episode length: 247.60 +/- 251.81\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:15:29,939] Trial 8 finished with value: -42.8676138 and parameters: {'gamma': 0.0005759972786657972, 'max_grad_norm': 2.178198098691272, 'exponent_n_steps': 9, 'learning_rate': 0.0006154119197118329}. 
Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-124.80 +/- 47.94\n", "Episode length: 68.60 +/- 11.77\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-124.41 +/- 48.36\n", "Episode length: 69.60 +/- 13.21\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:16:18,849] Trial 9 finished with value: -124.4060378 and parameters: {'gamma': 0.0008050504720345619, 'max_grad_norm': 0.3286023130633277, 'exponent_n_steps': 8, 'learning_rate': 0.9131802052506558}. Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-183.85 +/- 137.64\n", "Episode length: 93.80 +/- 21.31\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-140.82 +/- 72.97\n", "Episode length: 89.00 +/- 23.83\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:17:09,828] Trial 10 finished with value: -140.8161907 and parameters: {'gamma': 0.05662749164475368, 'max_grad_norm': 2.702817366937843, 'exponent_n_steps': 9, 'learning_rate': 1.7536377840513823e-05}. Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-158.63 +/- 70.36\n", "Episode length: 240.90 +/- 83.03\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-161.07 +/- 96.52\n", "Episode length: 90.90 +/- 18.31\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:18:06,150] Trial 11 finished with value: -161.0745164 and parameters: {'gamma': 0.0007356403275759354, 'max_grad_norm': 2.6120038390626377, 'exponent_n_steps': 9, 'learning_rate': 0.013966444638642802}. 
Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-124.63 +/- 55.50\n", "Episode length: 74.40 +/- 8.31\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-110.18 +/- 54.03\n", "Episode length: 78.40 +/- 12.22\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:18:54,547] Trial 12 finished with value: -110.17941379999999 and parameters: {'gamma': 0.00034999108842769417, 'max_grad_norm': 1.5533282494775675, 'exponent_n_steps': 9, 'learning_rate': 0.03016602553405144}. Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-51.82 +/- 83.86\n", "Episode length: 233.60 +/- 102.43\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-192.24 +/- 106.57\n", "Episode length: 656.90 +/- 238.23\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:20:12,106] Trial 13 finished with value: -192.24408069999998 and parameters: {'gamma': 0.0016642767572459288, 'max_grad_norm': 3.24974427726822, 'exponent_n_steps': 8, 'learning_rate': 0.002396669124994751}. Best is trial 5 with value: -34.5613051.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-89.81 +/- 66.10\n", "Episode length: 142.20 +/- 45.14\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-9.85 +/- 27.21\n", "Episode length: 168.90 +/- 31.16\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:21:08,346] Trial 14 finished with value: -9.8521702 and parameters: {'gamma': 0.0002600392648406912, 'max_grad_norm': 1.7464162922335436, 'exponent_n_steps': 9, 'learning_rate': 0.0023634795221302986}. 
Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-97.67 +/- 66.60\n", "Episode length: 117.60 +/- 26.17\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-35.49 +/- 47.34\n", "Episode length: 175.30 +/- 28.97\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:22:01,536] Trial 15 finished with value: -35.4859521 and parameters: {'gamma': 0.00010749029702825214, 'max_grad_norm': 1.1083175833707903, 'exponent_n_steps': 11, 'learning_rate': 0.0017748932882514477}. Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-167.46 +/- 67.75\n", "Episode length: 92.10 +/- 19.37\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-146.81 +/- 70.97\n", "Episode length: 94.80 +/- 18.19\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:22:52,575] Trial 16 finished with value: -146.813144 and parameters: {'gamma': 0.00026374676804712425, 'max_grad_norm': 1.8480529154769332, 'exponent_n_steps': 8, 'learning_rate': 6.758512464046624e-05}. Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-523.47 +/- 158.51\n", "Episode length: 70.50 +/- 16.88\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-674.66 +/- 194.18\n", "Episode length: 72.90 +/- 10.90\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:23:43,664] Trial 17 finished with value: -674.6649119000001 and parameters: {'gamma': 0.001832494345692276, 'max_grad_norm': 0.8826727066389795, 'exponent_n_steps': 10, 'learning_rate': 0.05242890944978285}. 
Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-42.52 +/- 44.02\n", "Episode length: 200.70 +/- 62.05\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-12.12 +/- 153.03\n", "Episode length: 364.90 +/- 111.33\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:24:43,476] Trial 18 finished with value: -12.118862699999994 and parameters: {'gamma': 0.014173260160020129, 'max_grad_norm': 3.528745243651732, 'exponent_n_steps': 9, 'learning_rate': 0.004081439484472964}. Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-131.55 +/- 57.20\n", "Episode length: 108.90 +/- 33.87\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-172.65 +/- 157.90\n", "Episode length: 155.10 +/- 45.01\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:25:36,445] Trial 19 finished with value: -172.6480927 and parameters: {'gamma': 0.020302854729718288, 'max_grad_norm': 3.7930171430988597, 'exponent_n_steps': 11, 'learning_rate': 0.00012307662464193376}. Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-209.95 +/- 111.37\n", "Episode length: 130.60 +/- 29.65\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-158.03 +/- 87.81\n", "Episode length: 179.90 +/- 28.69\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:26:32,717] Trial 20 finished with value: -158.0252963 and parameters: {'gamma': 0.017993812382632625, 'max_grad_norm': 1.3334094770141058, 'exponent_n_steps': 10, 'learning_rate': 0.0011035387033504359}. 
Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-317.59 +/- 130.98\n", "Episode length: 119.60 +/- 21.54\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-20.61 +/- 58.16\n", "Episode length: 194.20 +/- 269.06\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:27:28,542] Trial 21 finished with value: -20.6068542 and parameters: {'gamma': 0.00024860358827230204, 'max_grad_norm': 3.024296290928593, 'exponent_n_steps': 9, 'learning_rate': 0.004635932150831146}. Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-30.96 +/- 28.96\n", "Episode length: 119.40 +/- 26.68\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-176.14 +/- 148.41\n", "Episode length: 173.70 +/- 107.76\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:28:27,483] Trial 22 finished with value: -176.13734570000003 and parameters: {'gamma': 0.0018149489683248383, 'max_grad_norm': 3.3965331474503784, 'exponent_n_steps': 9, 'learning_rate': 0.00341496029158182}. Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-307.03 +/- 246.62\n", "Episode length: 246.10 +/- 117.10\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-155.10 +/- 42.32\n", "Episode length: 161.70 +/- 60.52\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:29:22,427] Trial 23 finished with value: -155.10452500000002 and parameters: {'gamma': 0.028763568890075652, 'max_grad_norm': 3.2296972935582584, 'exponent_n_steps': 9, 'learning_rate': 0.023315121877238897}. 
Best is trial 14 with value: -9.8521702.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-93.23 +/- 68.06\n", "Episode length: 204.60 +/- 83.62\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=6.59 +/- 48.70\n", "Episode length: 806.20 +/- 299.75\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:30:35,119] Trial 24 finished with value: 6.585008299999998 and parameters: {'gamma': 0.006075594024321983, 'max_grad_norm': 1.8559426752164974, 'exponent_n_steps': 9, 'learning_rate': 0.0011176199638550707}. Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-139.74 +/- 51.47\n", "Episode length: 129.30 +/- 23.86\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-107.00 +/- 114.21\n", "Episode length: 146.00 +/- 28.50\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:31:30,243] Trial 25 finished with value: -107.00469200000002 and parameters: {'gamma': 0.007367043589902397, 'max_grad_norm': 1.958856909878627, 'exponent_n_steps': 10, 'learning_rate': 0.000128280155751115}. Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-113.75 +/- 66.01\n", "Episode length: 128.50 +/- 25.03\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-6.15 +/- 102.98\n", "Episode length: 656.60 +/- 344.29\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:32:37,039] Trial 26 finished with value: -6.149618299999998 and parameters: {'gamma': 0.00350647970429975, 'max_grad_norm': 1.5621852014555846, 'exponent_n_steps': 9, 'learning_rate': 0.001315131525193004}. 
Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-94.57 +/- 40.26\n", "Episode length: 115.50 +/- 28.28\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-10.22 +/- 29.56\n", "Episode length: 158.20 +/- 33.16\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:33:28,571] Trial 27 finished with value: -10.2173887 and parameters: {'gamma': 0.0037296843294675826, 'max_grad_norm': 1.4116003054187924, 'exponent_n_steps': 10, 'learning_rate': 0.0011471197011206687}. Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-133.97 +/- 107.67\n", "Episode length: 111.20 +/- 12.19\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-134.42 +/- 87.66\n", "Episode length: 278.90 +/- 244.38\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:34:24,583] Trial 28 finished with value: -134.42215259999998 and parameters: {'gamma': 0.0012275078561518743, 'max_grad_norm': 0.8140388935920868, 'exponent_n_steps': 9, 'learning_rate': 0.0002223428920915772}. Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-103.85 +/- 39.75\n", "Episode length: 90.60 +/- 16.29\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-134.55 +/- 104.62\n", "Episode length: 190.60 +/- 60.08\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:35:15,573] Trial 29 finished with value: -134.5531878 and parameters: {'gamma': 0.005381618601754265, 'max_grad_norm': 0.5552403372064452, 'exponent_n_steps': 10, 'learning_rate': 0.000803613295379163}. 
Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-168.45 +/- 82.45\n", "Episode length: 82.10 +/- 10.53\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-121.91 +/- 44.54\n", "Episode length: 94.30 +/- 25.84\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:36:04,766] Trial 30 finished with value: -121.91198100000001 and parameters: {'gamma': 0.009562389344392503, 'max_grad_norm': 1.7861270786108663, 'exponent_n_steps': 12, 'learning_rate': 4.031555371311173e-05}. Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Eval num_timesteps=10000, episode_reward=-185.67 +/- 116.15\n", "Episode length: 118.50 +/- 25.87\n", "New best mean reward!\n", "Eval num_timesteps=20000, episode_reward=-102.80 +/- 79.61\n", "Episode length: 352.10 +/- 267.15\n", "New best mean reward!\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "[I 2024-06-21 07:37:03,134] Trial 31 finished with value: -102.8034991 and parameters: {'gamma': 0.0036604005201865044, 'max_grad_norm': 1.1902658569581193, 'exponent_n_steps': 10, 'learning_rate': 0.0013976778734531338}. Best is trial 24 with value: 6.585008299999998.\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Number of finished trials: 32\n", "Best trial:\n", " Value: 6.585008299999998\n", " Params: \n", " gamma: 0.006075594024321983\n", " max_grad_norm: 1.8559426752164974\n", " exponent_n_steps: 9\n", " learning_rate: 0.0011176199638550707\n", " User attrs:\n", " gamma_: 0.993924405975678\n", " n_steps: 512\n" ] }, { "output_type": "display_data", "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", "\n", "" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/html": [ "\n", "\n", "\n", "
\n", "
\n", "\n", "" ] }, "metadata": {} } ] }, { "cell_type": "markdown", "source": [ "Above: 15 minutes of runtime" ], "metadata": { "id": "oYI5vpA4iN3M" } }, { "cell_type": "markdown", "metadata": { "id": "IK_kR78NoNb2" }, "source": [ "## Publish our trained model on the Hub 🔥\n", "Now that we have seen good results after training, we can publish our trained model on the Hub 🤗 with one line of code.\n", "\n", "📚 The library's documentation 👉 https://github.com/huggingface/huggingface_sb3/tree/main#hugging-face--x-stable-baselines3-v20\n", "\n", "Here's an example of a Model Card (with Space Invaders):" ] }, { "cell_type": "markdown", "metadata": { "id": "Gs-Ew7e1gXN3" }, "source": [ "By using `package_to_hub` **you evaluate, record a replay, generate a model card of your agent and push it to the Hub**.\n", "\n", "This way:\n", "- You can **showcase your work** 🔥\n", "- You can **visualize your agent playing** 👀\n", "- You can **share with the community an agent that others can use** 💾\n", "- You can **access a leaderboard 🏆 to see how well your agent is performing compared to your classmates** 👉 https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard\n" ] }, { "cell_type": "markdown", "metadata": { "id": "JquRrWytA6eo" }, "source": [ "To be able to share your model with the community there are three more steps to follow:\n", "\n", "1️⃣ (If it's not already done) create an account on Hugging Face ➡ https://huggingface.co/join\n", "\n", "2️⃣ Sign in, then store your authentication token from the Hugging Face website.\n", "- Create a new token (https://huggingface.co/settings/tokens) **with write role**\n", "\n", "\"Create\n", "\n", "- Copy the token\n", "- Run the cell below and paste the token" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "GZiFBBlzxzxY", "colab": { "base_uri": "https://localhost:8080/", "height": 145, "referenced_widgets": [ 
"5d7cf727641348a49180df0c7b4a3df3", "6fea421c4aad4e5fa6bfe25808de3c43", "7ca3d10bde654836b8c16db772cc9f7b", "aeef9cc2d8164d3ebcf3d6cc2c3e6c3a", "5bfac72d402b40ab9dd9aa17552c9c96", "9c70772b542b4a70ad938d689c57d0bf", "cb8ce74af1ea4b8597d70e11418656ba", "494ad5341905419697037e0609f5c53f", "f0513bc5f5d34bc596839c8e35521481", "79eeebfb3a494837b49e365aa39dd587", "41219e3c3b45475da0d56d338f34d410", "14b46365804242d6a5155f9340d3c5dd", "3dc6acf5609b45d3a447c858fd5476e9", "3ca92a65c7934ce291062e8a26946438", "82c0edc858f4455596e4b91a210aa00d", "20386e53916943ceb97c5bae8bc254bb", "361af4b484af4f0f81c2070a08f2bc7b", "e876c9f2bae9417d8dab87d61cd3afbf", "36a7abdb338e4f9599c1ab042ab70cb7", "c607441c4843412f8bb3ed7600ad054c", "967f1eb8077240d498683c1c87f4357b", "c6096e8a86be40aa9552f7726d97e598", "d692ac35c89a42cebfd610eea6a1c217", "79b5f079c9154817850b62cda3b9ac7e", "2ecccae3f2254a13a1537174af782daf", "1671f050fdf34e01a158438a42f9f316", "f3197044dd6543629ea3e3920969e9f2", "806b74de78a44571a838ff03253c4757", "d569a17b57e241eeace1bcf4c0d9ce7a", "62087bf64ba44cc19fca467abd45f339", "30f1e0c976d44dffa10eb36490ef09e8", "b9c1c3daf6b749ccbbf017cf49107183" ] }, "outputId": "6429fb4b-7b2a-415e-b742-bf3163d25d47" }, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "VBox(children=(HTML(value='
" ] }, { "cell_type": "markdown", "metadata": { "id": "hNPLJF2bfiUw" }, "source": [ "2. Then we just need to use load_from_hub with:\n", "- The repo_id\n", "- The filename: the saved model inside the repo and its extension (*.zip)" ] }, { "cell_type": "markdown", "source": [ "Because the model I download from the Hub was trained with Gym (the former version of Gymnasium) we need to install shimmy a API conversion tool that will help us to run the environment correctly.\n", "\n", "Shimmy Documentation: https://github.com/Farama-Foundation/Shimmy" ], "metadata": { "id": "bhb9-NtsinKB" } }, { "cell_type": "code", "source": [ "!pip install shimmy" ], "metadata": { "id": "03WI-bkci1kH" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "oj8PSGHJfwz3" }, "outputs": [], "source": [ "from huggingface_sb3 import load_from_hub\n", "repo_id = \"humnrdble/DeepRL-unit1\" # The repo_id\n", "filename = \"ppo-LunarLander-v2.zip\" # The model filename.zip\n", "\n", "# When the model was trained on Python 3.8 the pickle protocol is 5\n", "# But Python 3.6, 3.7 use protocol 4\n", "# In order to get compatibility we need to:\n", "# 1. Install pickle5 (we done it at the beginning of the colab)\n", "# 2. 
Create a custom objects dict that we pass as a parameter to PPO.load()\n", "custom_objects = {\n", " \"learning_rate\": 0.0,\n", " \"lr_schedule\": lambda _: 0.0,\n", " \"clip_range\": lambda _: 0.0,\n", "}\n", "\n", "checkpoint = load_from_hub(repo_id, filename)\n", "model2 = PPO.load(checkpoint, custom_objects=custom_objects, print_system_info=True)" ] }, { "cell_type": "markdown", "metadata": { "id": "Fs0Y-qgPgLUf" }, "source": [ "Let's evaluate this agent:" ] }, { "cell_type": "code", "source": [ "#@title\n", "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n", "mean_reward, std_reward = evaluate_policy(model2, eval_env, n_eval_episodes=30, deterministic=True)\n", "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")" ], "metadata": { "id": "PAEVwK-aahfx" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "BQAwLnYFPk-s" }, "source": [ "## Some additional challenges 🏆\n", "The best way to learn **is to try things on your own**! As you saw, the current agent is not doing great. As a first suggestion, you can train for more steps. With 1,000,000 steps, we saw some great results!\n", "\n", "In the [Leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) you will find your agents. Can you get to the top?\n", "\n", "Here are some ideas to get there:\n", "* Train for more steps\n", "* Try different hyperparameters for `PPO`. You can see them at https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html#parameters.\n", "* Check the [Stable-Baselines3 documentation](https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html) and try another model such as DQN.\n", "* **Push your newly trained model** to the Hub 🔥\n", "\n", "**Compare the results of your LunarLander-v2 with your classmates** using the [leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) 🏆\n", "\n", "Is moon landing too boring for you? 
Try to **change the environment**, why not use MountainCar-v0, CartPole-v1 or CarRacing-v0? Check how they work [using the gym documentation](https://www.gymlibrary.dev/) and have fun 🎉." ] }, { "cell_type": "markdown", "metadata": { "id": "9lM95-dvmif8" }, "source": [ "________________________________________________________________________\n", "Congrats on finishing this chapter! That was the biggest one, **and there was a lot of information.**\n", "\n", "If you still feel confused by all these elements, it's totally normal! **This was the same for me and for all people who studied RL.**\n", "\n", "Take time to really **grasp the material before continuing and try the additional challenges**. It’s important to master these elements and have a solid foundation.\n", "\n", "Naturally, during the course, we’re going to dive deeper into these concepts but **it’s better to have a good understanding of them now before diving into the next chapters.**\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "BjLhT70TEZIn" }, "source": [ "Next time, in the bonus unit 1, you'll train Huggy the Dog to fetch the stick.\n", "\n", "\"Huggy\"/\n", "\n", "## Keep learning, stay awesome 🤗" ] } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [], "collapsed_sections": [ "QAN7B0_HCVZC", "1bQzQ-QcE3zo" ], "gpuType": "T4" }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.7" }, "vscode": { "interpreter": { "hash": "ed7f8024e43d3b8f5ca3c5e1a8151ab4d136b3ecee1e3fd59e0766ccc55e1b10" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "5d7cf727641348a49180df0c7b4a3df3": { "model_module": "@jupyter-widgets/controls", "model_name": "VBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "VBoxModel", "_view_count": null, "_view_module": 
"@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "VBoxView", "box_style": "", "children": [ "IPY_MODEL_967f1eb8077240d498683c1c87f4357b", "IPY_MODEL_c6096e8a86be40aa9552f7726d97e598", "IPY_MODEL_d692ac35c89a42cebfd610eea6a1c217", "IPY_MODEL_79b5f079c9154817850b62cda3b9ac7e" ], "layout": "IPY_MODEL_cb8ce74af1ea4b8597d70e11418656ba" } }, "6fea421c4aad4e5fa6bfe25808de3c43": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_494ad5341905419697037e0609f5c53f", "placeholder": "โ€‹", "style": "IPY_MODEL_f0513bc5f5d34bc596839c8e35521481", "value": "

Copy a token from your Hugging Face\ntokens page and paste it below.
Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file.
" } }, "7ca3d10bde654836b8c16db772cc9f7b": { "model_module": "@jupyter-widgets/controls", "model_name": "PasswordModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "PasswordModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "PasswordView", "continuous_update": true, "description": "Token:", "description_tooltip": null, "disabled": false, "layout": "IPY_MODEL_79eeebfb3a494837b49e365aa39dd587", "placeholder": "โ€‹", "style": "IPY_MODEL_41219e3c3b45475da0d56d338f34d410", "value": "" } }, "aeef9cc2d8164d3ebcf3d6cc2c3e6c3a": { "model_module": "@jupyter-widgets/controls", "model_name": "CheckboxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "CheckboxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "CheckboxView", "description": "Add token as git credential?", "description_tooltip": null, "disabled": false, "indent": true, "layout": "IPY_MODEL_14b46365804242d6a5155f9340d3c5dd", "style": "IPY_MODEL_3dc6acf5609b45d3a447c858fd5476e9", "value": true } }, "5bfac72d402b40ab9dd9aa17552c9c96": { "model_module": "@jupyter-widgets/controls", "model_name": "ButtonModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ButtonModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ButtonView", "button_style": "", "description": "Login", "disabled": false, "icon": "", "layout": "IPY_MODEL_3ca92a65c7934ce291062e8a26946438", "style": "IPY_MODEL_82c0edc858f4455596e4b91a210aa00d", "tooltip": "" } }, "9c70772b542b4a70ad938d689c57d0bf": { 
"model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_20386e53916943ceb97c5bae8bc254bb", "placeholder": "โ€‹", "style": "IPY_MODEL_361af4b484af4f0f81c2070a08f2bc7b", "value": "\nPro Tip: If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks.
" } }, "cb8ce74af1ea4b8597d70e11418656ba": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": "center", "align_self": null, "border": null, "bottom": null, "display": "flex", "flex": null, "flex_flow": "column", "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": "50%" } }, "494ad5341905419697037e0609f5c53f": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": 
null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f0513bc5f5d34bc596839c8e35521481": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "79eeebfb3a494837b49e365aa39dd587": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "41219e3c3b45475da0d56d338f34d410": { "model_module": "@jupyter-widgets/controls", "model_name": 
"DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "14b46365804242d6a5155f9340d3c5dd": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "3dc6acf5609b45d3a447c858fd5476e9": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "3ca92a65c7934ce291062e8a26946438": { "model_module": "@jupyter-widgets/base", 
"model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "82c0edc858f4455596e4b91a210aa00d": { "model_module": "@jupyter-widgets/controls", "model_name": "ButtonStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ButtonStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "button_color": null, "font_weight": "" } }, "20386e53916943ceb97c5bae8bc254bb": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, 
"flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "361af4b484af4f0f81c2070a08f2bc7b": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "e876c9f2bae9417d8dab87d61cd3afbf": { "model_module": "@jupyter-widgets/controls", "model_name": "LabelModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "LabelModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "LabelView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_36a7abdb338e4f9599c1ab042ab70cb7", "placeholder": "โ€‹", "style": "IPY_MODEL_c607441c4843412f8bb3ed7600ad054c", "value": "Connecting..." 
} }, "36a7abdb338e4f9599c1ab042ab70cb7": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "c607441c4843412f8bb3ed7600ad054c": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "967f1eb8077240d498683c1c87f4357b": { "model_module": "@jupyter-widgets/controls", "model_name": "LabelModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "LabelModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "LabelView", 
"description": "", "description_tooltip": null, "layout": "IPY_MODEL_2ecccae3f2254a13a1537174af782daf", "placeholder": "โ€‹", "style": "IPY_MODEL_1671f050fdf34e01a158438a42f9f316", "value": "Token is valid (permission: write)." } }, "c6096e8a86be40aa9552f7726d97e598": { "model_module": "@jupyter-widgets/controls", "model_name": "LabelModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "LabelModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "LabelView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_f3197044dd6543629ea3e3920969e9f2", "placeholder": "โ€‹", "style": "IPY_MODEL_806b74de78a44571a838ff03253c4757", "value": "Your token has been saved in your configured git credential helpers (store)." } }, "d692ac35c89a42cebfd610eea6a1c217": { "model_module": "@jupyter-widgets/controls", "model_name": "LabelModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "LabelModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "LabelView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d569a17b57e241eeace1bcf4c0d9ce7a", "placeholder": "โ€‹", "style": "IPY_MODEL_62087bf64ba44cc19fca467abd45f339", "value": "Your token has been saved to /root/.cache/huggingface/token" } }, "79b5f079c9154817850b62cda3b9ac7e": { "model_module": "@jupyter-widgets/controls", "model_name": "LabelModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "LabelModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "LabelView", 
"description": "", "description_tooltip": null, "layout": "IPY_MODEL_30f1e0c976d44dffa10eb36490ef09e8", "placeholder": "โ€‹", "style": "IPY_MODEL_b9c1c3daf6b749ccbbf017cf49107183", "value": "Login successful" } }, "2ecccae3f2254a13a1537174af782daf": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "1671f050fdf34e01a158438a42f9f316": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "f3197044dd6543629ea3e3920969e9f2": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", 
"_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "806b74de78a44571a838ff03253c4757": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "d569a17b57e241eeace1bcf4c0d9ce7a": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": 
null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "62087bf64ba44cc19fca467abd45f339": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "30f1e0c976d44dffa10eb36490ef09e8": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, 
"overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "b9c1c3daf6b749ccbbf017cf49107183": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "5cb1276bb36c401e979bb26d15814ddf": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_44e6cf699b634061a45936fcc6ff4e38", "IPY_MODEL_f8ec39e47ccd463bad5f7aacd0980c68", "IPY_MODEL_fb3c4b28b6424ada85dbbc14c0a5137c" ], "layout": "IPY_MODEL_ed5a8728e5bc4092a1f75f982129c425" } }, "44e6cf699b634061a45936fcc6ff4e38": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d289a8381cbc47d98894920476c54cd4", "placeholder": "โ€‹", "style": "IPY_MODEL_8e96e8739e5c468abdea79b1850271fd", "value": "pytorch_variables.pth:โ€‡100%" } }, "f8ec39e47ccd463bad5f7aacd0980c68": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], 
"_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_a045bc4127eb4b79ac44edad1358be9b", "max": 864, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_37b382b9493040b18c4415ac6bd2a85e", "value": 864 } }, "fb3c4b28b6424ada85dbbc14c0a5137c": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d761017d1cbb4bc0b7a3ec72f812e3c4", "placeholder": "โ€‹", "style": "IPY_MODEL_a2988e61ac6649e4a7d96b05d42a6ed8", "value": "โ€‡864/864โ€‡[00:00<00:00,โ€‡2.19kB/s]" } }, "ed5a8728e5bc4092a1f75f982129c425": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": 
null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d289a8381cbc47d98894920476c54cd4": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "8e96e8739e5c468abdea79b1850271fd": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "a045bc4127eb4b79ac44edad1358be9b": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": 
"1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "37b382b9493040b18c4415ac6bd2a85e": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "d761017d1cbb4bc0b7a3ec72f812e3c4": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": 
null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "a2988e61ac6649e4a7d96b05d42a6ed8": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "14ad88be062140a5a23839562f2b2ba8": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_d49e2da566844e7dadf5e75e54b88180", "IPY_MODEL_2402dfe7e91c44c38ef04def73434708", "IPY_MODEL_942e2f18181b44fba1fdfc40e95cf6cf" ], "layout": "IPY_MODEL_0927e41bcd7a4e47b5a59ea80414653a" } }, "d49e2da566844e7dadf5e75e54b88180": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": 
"@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_1a35f269b60b4f2cbfa97a2297b293f7", "placeholder": "โ€‹", "style": "IPY_MODEL_9ac81254382b4b3394d1a7be772d9c99", "value": "policy.pth:โ€‡100%" } }, "2402dfe7e91c44c38ef04def73434708": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_dc6db6eac27f4410b10ad5fc284b2801", "max": 43762, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_e0f9512f3ca24b409f9dc5b535281bd8", "value": 43762 } }, "942e2f18181b44fba1fdfc40e95cf6cf": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_37dba554e9864f7896f2ccb4ef3917a7", "placeholder": "โ€‹", "style": "IPY_MODEL_aa0444e825824ca1b74253c374bb85ad", "value": "โ€‡43.8k/43.8kโ€‡[00:00<00:00,โ€‡42.1kB/s]" } }, "0927e41bcd7a4e47b5a59ea80414653a": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", 
"_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "1a35f269b60b4f2cbfa97a2297b293f7": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9ac81254382b4b3394d1a7be772d9c99": { "model_module": 
"@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "dc6db6eac27f4410b10ad5fc284b2801": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "e0f9512f3ca24b409f9dc5b535281bd8": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, 
"37dba554e9864f7896f2ccb4ef3917a7": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "aa0444e825824ca1b74253c374bb85ad": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "b8e32242bd2343b6aef96192880ec7f5": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": 
"", "children": [ "IPY_MODEL_b487ddf19329426e9d8801a6d4dabe6c", "IPY_MODEL_993539b4598a42f3b089978c70ea9946", "IPY_MODEL_e9697cbabc9c4b7294e263985f84ded7" ], "layout": "IPY_MODEL_910d074a63b446469f08eb4a2f7f6393" } }, "b487ddf19329426e9d8801a6d4dabe6c": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_bf9a4a7a734844b4b15c2beb3f35825e", "placeholder": "\u200b", "style": "IPY_MODEL_e2e7f6fe9d4b40168d96bc113765d921", "value": "policy.optimizer.pth: 100%" } }, "993539b4598a42f3b089978c70ea9946": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d28289787a694cdebef4cb896faa0291", "max": 88362, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_d028ad52c45540e19809bc7d0d169f30", "value": 88362 } }, "e9697cbabc9c4b7294e263985f84ded7": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null,
"layout": "IPY_MODEL_9639989b41c3496c80b21166314f45ba", "placeholder": "\u200b", "style": "IPY_MODEL_4959a0425083436bb885611ca32a2f76", "value": " 88.4k/88.4k [00:00<00:00, 42.8kB/s]" } }, "910d074a63b446469f08eb4a2f7f6393": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "bf9a4a7a734844b4b15c2beb3f35825e": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null,
"grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "e2e7f6fe9d4b40168d96bc113765d921": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "d28289787a694cdebef4cb896faa0291": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": 
null, "top": null, "visibility": null, "width": null } }, "d028ad52c45540e19809bc7d0d169f30": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "9639989b41c3496c80b21166314f45ba": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "4959a0425083436bb885611ca32a2f76": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", 
"_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "fc0e76a2fac94894918981e32cfad957": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_42c511789ce0410f87c8971ea50702a5", "IPY_MODEL_a022094227554f3ea2224142b359d25d", "IPY_MODEL_5e6b1af0e62c464cb20fc2ed97f2d464" ], "layout": "IPY_MODEL_9e206cbe55454919ad433f068562294d" } }, "42c511789ce0410f87c8971ea50702a5": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_73f6d7a8aaaa4e2488472b520b02d760", "placeholder": "โ€‹", "style": "IPY_MODEL_e649ae1162f44402950a55b91f546e1e", "value": "Uploadโ€‡4โ€‡LFSโ€‡files:โ€‡100%" } }, "a022094227554f3ea2224142b359d25d": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_debdbfa6591241e0907999776e901d9b", "max": 4, "min": 0, "orientation": "horizontal", "style": 
"IPY_MODEL_38064e76065a4d1e969db20fbd2eab88", "value": 4 } }, "5e6b1af0e62c464cb20fc2ed97f2d464": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_e28de247ab924044928572463cd7cda3", "placeholder": "โ€‹", "style": "IPY_MODEL_27c507463f5c4df5a07b0a7304078698", "value": "โ€‡4/4โ€‡[00:01<00:00,โ€‡โ€‡3.16it/s]" } }, "9e206cbe55454919ad433f068562294d": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "73f6d7a8aaaa4e2488472b520b02d760": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { 
"_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "e649ae1162f44402950a55b91f546e1e": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "debdbfa6591241e0907999776e901d9b": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, 
"grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "38064e76065a4d1e969db20fbd2eab88": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "e28de247ab924044928572463cd7cda3": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, 
"order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "27c507463f5c4df5a07b0a7304078698": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "bd694a96de79468eb9c484ed107e5405": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_d304f8fe0d37411993e8f58eb23c3f0d", "IPY_MODEL_bb9ac82a83dd478da3b3fd26eee1566a", "IPY_MODEL_11e393ee1f2a464e8162305ca654b529" ], "layout": "IPY_MODEL_132169fb97424cfd8998ee0ab83a943c" } }, "d304f8fe0d37411993e8f58eb23c3f0d": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_23382b7976784d9f83aa062f299eba70", "placeholder": "โ€‹", "style": "IPY_MODEL_831c6b9438bd459bb069894b9a664106", "value": "ppo-LunarLander-v2-optuna-tuned.zip:โ€‡100%" } }, "bb9ac82a83dd478da3b3fd26eee1566a": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", 
"model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_edb0a0563739485f877945beb109a6d7", "max": 147994, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_983b42acc5a34331b6b9f499f8126ec5", "value": 147994 } }, "11e393ee1f2a464e8162305ca654b529": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_df4f34e2c64b458b85b8daec78afbbdf", "placeholder": "โ€‹", "style": "IPY_MODEL_dba7cba9f8bc45b1beafdcef7cb1d082", "value": "โ€‡148k/148kโ€‡[00:01<00:00,โ€‡51.4kB/s]" } }, "132169fb97424cfd8998ee0ab83a943c": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, 
"justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "23382b7976784d9f83aa062f299eba70": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "831c6b9438bd459bb069894b9a664106": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "edb0a0563739485f877945beb109a6d7": { "model_module": 
"@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "983b42acc5a34331b6b9f499f8126ec5": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "df4f34e2c64b458b85b8daec78afbbdf": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, 
"bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "dba7cba9f8bc45b1beafdcef7cb1d082": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } } } } }, "nbformat": 4, "nbformat_minor": 0 }