{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "machine_shape": "hm",
      "gpuType": "T4",
      "provenance": []
    },
    "accelerator": "GPU",
    "kaggle": {
      "accelerator": "gpu"
    },
    "language_info": {
      "name": "python"
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "## Custom Notebook by VB on HF\n",
        "\n",
        "VB made this notebook and Hugging Face gladly served it! yayy!\n",
        "\n",
        "You can really just do things"
      ],
      "metadata": {
        "id": "LJZNGWgrcYeq"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install -U transformers"
      ],
      "metadata": {
        "id": "Qghvxbi8cOVr"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Model page: https://huggingface.co/reach-vb/Qwen3-0.6B\n",
        "\n",
        "⚠️ If the generated code snippets do not work, please open an issue on the [model repo](https://huggingface.co/reach-vb/Qwen3-0.6B) and/or on [huggingface.js](https://github.com/huggingface/huggingface.js/blob/main/packages/tasks/src/model-libraries-snippets.ts) 🙏"
      ],
      "metadata": {
        "id": "a6SL-cvKcOVr"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Use a pipeline as a high-level helper\n",
        "from transformers import pipeline\n",
        "import torch\n",
        "\n",
        "pipe = pipeline(\"text-generation\", model=\"reach-vb/Qwen3-0.6B\", torch_dtype=torch.bfloat16)\n",
        "messages = [\n",
        "    {\"role\": \"user\", \"content\": \"Where are you, dude?\"},\n",
        "]\n",
        "pipe(messages, max_new_tokens=512)"
      ],
      "metadata": {
        "id": "muWc9vyhcOVr"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Load model directly\n",
        "from transformers import AutoTokenizer, AutoModelForCausalLM\n",
        "\n",
        "tokenizer = AutoTokenizer.from_pretrained(\"reach-vb/Qwen3-0.6B\")\n",
        "model = AutoModelForCausalLM.from_pretrained(\"reach-vb/Qwen3-0.6B\")"
      ],
      "metadata": {
        "id": "1dy_oF6OcOVs"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Too big to run in Colab?\n",
        "\n",
        "Try Inference Providers for serverless use of these models. The next cell reads your access token from the `HF_TOKEN` environment variable."
      ],
      "metadata": {
        "id": "-r9Z_OnzRSFn"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import os\n",
        "from huggingface_hub import InferenceClient\n",
        "\n",
        "client = InferenceClient(\n",
        "    provider=\"auto\",\n",
        "    api_key=os.environ[\"HF_TOKEN\"],\n",
        ")\n",
        "\n",
        "completion = client.chat.completions.create(\n",
        "    model=\"Qwen/Qwen3-4B\",\n",
        "    messages=[\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": \"What is the capital of France?\"\n",
        "        }\n",
        "    ],\n",
        ")\n",
        "\n",
        "print(completion.choices[0].message)"
      ],
      "metadata": {
        "id": "ZtpuYegeRYPg"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}