Spaces: zen21/Flower-Generator

zen21 committed on Jul 25, 2023
Commit 6a52406 • 1 Parent(s): 5e5eb4d

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.

Files changed (50)

.gitattributes +2 -0
.gitignore +3 -0
.ipynb_checkpoints/FLOWER_GAN_64-checkpoint.ipynb +0 -0
.ipynb_checkpoints/Untitled-checkpoint.ipynb +6 -0
.ipynb_checkpoints/VAE_64-checkpoint.ipynb +0 -0
Conditional_Diffusion_64.ipynb +3 -0
FLOWER_Conditional_GAN_64.ipynb +0 -0
FLOWER_GAN_128.ipynb +3 -0
FLOWER_GAN_64.ipynb +0 -0
Model_Saved_States/CGAN_64_discriminator.pth +3 -0
Model_Saved_States/CGAN_64_generator.pth +3 -0
Model_Saved_States/GAN_128_discriminator.pth +3 -0
Model_Saved_States/GAN_128_generator.pth +3 -0
Model_Saved_States/conditional_diffusion_64.pth +3 -0
Model_Saved_States/diffusion_64.pth +3 -0
Model_Saved_States/sentence_embedding.pth +3 -0
README.md +2 -8
Sentence_Embeddings.ipynb +454 -0
Unconditional_Diffusion_64.ipynb +0 -0
Untitled.ipynb +263 -0
VAE_64.ipynb +0 -0
app.py +182 -0
flagged/log.csv +2 -0
image_desc.csv +0 -0
jpg/flowers/image_00001.jpg +0 -0
jpg/flowers/image_00002.jpg +0 -0
jpg/flowers/image_00003.jpg +0 -0
jpg/flowers/image_00004.jpg +0 -0
jpg/flowers/image_00005.jpg +0 -0
jpg/flowers/image_00006.jpg +0 -0
jpg/flowers/image_00007.jpg +0 -0
jpg/flowers/image_00008.jpg +0 -0
jpg/flowers/image_00009.jpg +0 -0
jpg/flowers/image_00010.jpg +0 -0
jpg/flowers/image_00011.jpg +0 -0
jpg/flowers/image_00012.jpg +0 -0
jpg/flowers/image_00013.jpg +0 -0
jpg/flowers/image_00014.jpg +0 -0
jpg/flowers/image_00015.jpg +0 -0
jpg/flowers/image_00016.jpg +0 -0
jpg/flowers/image_00017.jpg +0 -0
jpg/flowers/image_00018.jpg +0 -0
jpg/flowers/image_00019.jpg +0 -0
jpg/flowers/image_00020.jpg +0 -0
jpg/flowers/image_00021.jpg +0 -0
jpg/flowers/image_00022.jpg +0 -0
jpg/flowers/image_00023.jpg +0 -0
jpg/flowers/image_00024.jpg +0 -0
jpg/flowers/image_00025.jpg +0 -0
jpg/flowers/image_00026.jpg +0 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Conditional_Diffusion_64.ipynb filter=lfs diff=lfs merge=lfs -text
+FLOWER_GAN_128.ipynb filter=lfs diff=lfs merge=lfs -text

.gitignore ADDED

@@ -0,0 +1,3 @@

+jpg/
+Model_Saved_State/
+image*

.ipynb_checkpoints/FLOWER_GAN_64-checkpoint.ipynb ADDED

The diff for this file is too large to render.

.ipynb_checkpoints/Untitled-checkpoint.ipynb ADDED

@@ -0,0 +1,6 @@

+{
+ "cells": [],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 5
+}

.ipynb_checkpoints/VAE_64-checkpoint.ipynb ADDED

The diff for this file is too large to render.

Conditional_Diffusion_64.ipynb ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5e03d955344baf2568ba3de198643ce2b3dc16402fc0b3ee309e869c4ced195a
+size 17604999

FLOWER_Conditional_GAN_64.ipynb ADDED

The diff for this file is too large to render.

FLOWER_GAN_128.ipynb ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8449a476099b632280ffb66c13414f45a98e196040895f870188a97b16ade374
+size 133325097

FLOWER_GAN_64.ipynb ADDED

The diff for this file is too large to render.

Model_Saved_States/CGAN_64_discriminator.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:43acdc673ec2922267e3e2b8b398629c94f2bdd8004faf4e47308b66c12fd8eb
+size 16814806

Model_Saved_States/CGAN_64_generator.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:27dd926206fd0acdf0071d71cd5263f634ba7f9e778f5bd8ce20ab1a9ed51e5c
+size 25469566

Model_Saved_States/GAN_128_discriminator.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:69607ce6d1e68b66a80907c32b6e0adaebae049da7a51d8447a85514cfdc9f58
+size 17113810

Model_Saved_States/GAN_128_generator.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b9a804f21b9b4a63ef72ca7e11bfeb767d636403c238d36ee06d7b0127d2eccf
+size 34145406

Model_Saved_States/conditional_diffusion_64.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4acb997e447b573366c76f441abafa6f56afd1dfedd0de288bc384a5d329b256
+size 181400873

Model_Saved_States/diffusion_64.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:21a862c67c65539ec93da68a5a278d1205c89d5d0c858bc9d3614aa2427ecf7d
+size 89200917

Model_Saved_States/sentence_embedding.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:de5f9fa514895fa02d90200de434906018832a2725bb30e98f81c67ed57c2059
+size 91405603

README.md CHANGED

@@ -1,12 +1,6 @@
 ---
-title: First Space
-emoji: 💻
-colorFrom: red
-colorTo: purple
+title: first_space
+app_file: app.py
 sdk: gradio
 sdk_version: 3.38.0
-app_file: app.py
-pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Sentence_Embeddings.ipynb ADDED

@@ -0,0 +1,454 @@

+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "62c37427",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from transformers import AutoModel, AutoTokenizer\n",
+    "from sentence_transformers import SentenceTransformer\n",
+    "import torch\n",
+    "import torch.nn as nn\n",
+    "from torch.utils.data import Dataset, DataLoader\n",
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "import torch.nn.functional as F\n",
+    "from tqdm import tqdm"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "ca8d35e3",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "cuda\n"
+     ]
+    }
+   ],
+   "source": [
+    "sentence_model = SentenceTransformer('all-MiniLM-L6-v2')\n",
+    "\n",
+    "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
+    "print(device)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "27935f40",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class ImageDataset(Dataset):\n",
+    "    def __init__(self, csv_file, transform=None):\n",
+    "        self.annotations = csv_file\n",
+    "        self.transform=transform\n",
+    "    \n",
+    "    def __len__(self):\n",
+    "        return len(self.annotations)\n",
+    "    \n",
+    "    def __getitem__(self,index):\n",
+    "        img_desc = self.annotations.iloc[index, 2]\n",
+    "\n",
+    "        label=torch.tensor(int(self.annotations.iloc[index, 3]))\n",
+    "        \n",
+    "        if self.transform:\n",
+    "            img_desc = self.transform(img_desc)\n",
+    "            \n",
+    "        return (img_desc, label)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "d96cfab6",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "81890\n"
+     ]
+    }
+   ],
+   "source": [
+    "df = pd.read_csv('image_desc.csv')\n",
+    "dataset = ImageDataset(df)\n",
+    "train_size = int(0.85 * len(dataset))\n",
+    "test_size = len(dataset) - train_size\n",
+    "train_set, test_set = torch.utils.data.random_split(dataset, [train_size, test_size])\n",
+    "print(len(dataset))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "d12e4992",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "batch_size=16\n",
+    "train_loader=DataLoader(train_set, batch_size=batch_size, shuffle=True)\n",
+    "test_loader=DataLoader(test_set, batch_size=batch_size, shuffle=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 54,
+   "id": "4e1f90e0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class MyModel(nn.Module):\n",
+    "    def __init__(self, sentence_model, hidden_dim, output_dim):\n",
+    "        super(MyModel, self).__init__()\n",
+    "        self.sentence_model = sentence_model\n",
+    "        self.fc1 = nn.Linear(384, hidden_dim)\n",
+    "        self.fc2 = nn.Linear(hidden_dim, output_dim)\n",
+    "        self.sig = nn.Sigmoid()\n",
+    "\n",
+    "    def forward(self, x):\n",
+    "        sentence_embeddings = self.sentence_model.encode(x, convert_to_tensor=True)\n",
+    "        sentence_embeddings = sentence_embeddings.to(device)\n",
+    "        hidden = self.fc1(sentence_embeddings)\n",
+    "        hidden = F.relu(hidden)\n",
+    "        logits = self.fc2(hidden)\n",
+    "#         logits = torch.clamp(logits, min=1e-5)\n",
+    "        logits = self.sig(logits)  # NOTE: nn.CrossEntropyLoss expects raw logits, not sigmoid outputs\n",
+    "        return logits\n",
+    "\n",
+    "output_dim = 102\n",
+    "hidden_dim = 256\n",
+    "\n",
+    "model = MyModel(sentence_model, hidden_dim, output_dim).to(device)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 37,
+   "id": "85c41c63",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:42<00:00, 42.36it/s]"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "tensor(1.0000, device='cuda:0', grad_fn=<SumBackward1>)\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "min = torch.tensor(1).to(device)\n",
+    "similarity = nn.CosineSimilarity(dim = 0)\n",
+    "for sample_batch, sample_label in tqdm(train_loader):\n",
+    "    i = sample_batch[0]\n",
+    "    j = sample_batch[1]\n",
+    "    output_i = model(i)\n",
+    "    output_j = model(j)\n",
+    "    sim_i_j = similarity(output_i, output_j)\n",
+    "    if sim_i_j < min:\n",
+    "        min = sim_i_j\n",
+    "        \n",
+    "print(min)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "id": "e99a5150",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "criterion = nn.CrossEntropyLoss()\n",
+    "# criterion = nn.MSELoss()\n",
+    "optimizer = torch.optim.Adam(model.parameters(), lr=0.005)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 56,
+   "id": "7957341a",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:06<00:00, 65.25it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch: 1/4, Loss: 1116.4719812870026\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:06<00:00, 65.90it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch: 2/4, Loss: 1087.523635149002\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:06<00:00, 65.20it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch: 3/4, Loss: 1079.509438186884\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:07<00:00, 64.31it/s]"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch: 4/4, Loss: 1074.7653084248304\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "num_epochs = 4\n",
+    "for epoch in range(num_epochs):\n",
+    "    model.train()\n",
+    "    losses = []\n",
+    "\n",
+    "    for i, (sentences_batch, labels_batch) in enumerate(tqdm(train_loader)):\n",
+    "        labels_batch = labels_batch.to(device)\n",
+    "        labels_batch = F.one_hot(labels_batch, num_classes = 102).float()\n",
+    "        optimizer.zero_grad()\n",
+    "        # Forward pass\n",
+    "        logits = model(sentences_batch).float()\n",
+    "        loss = criterion(logits, labels_batch)\n",
+    "        \n",
+    "        # Backward pass and optimization\n",
+    "        loss.backward()\n",
+    "        optimizer.step()\n",
+    "        curr_loss = loss.item()\n",
+    "        losses.append(curr_loss)\n",
+    "        \n",
+    "    running_loss = sum(losses)\n",
+    "    \n",
+    "    # Print the epoch loss (summed batch losses divided by batch_size, not a per-batch mean)\n",
+    "    epoch_loss = running_loss / batch_size\n",
+    "    print(f\"Epoch: {epoch+1}/{num_epochs}, Loss: {epoch_loss}\")\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 47,
+   "id": "4ecaab5d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:33<00:00, 46.66it/s]"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "tensor(0., device='cuda:0', grad_fn=<SumBackward1>)\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "for sample_batch, sample_label in tqdm(train_loader):\n",
+    "    i = sample_batch[0]\n",
+    "    j = sample_batch[1]\n",
+    "    output_i = model(i)\n",
+    "    output_j = model(j)\n",
+    "    sim_i_j = similarity(output_i, output_j)\n",
+    "    if sim_i_j < min:\n",
+    "        min = sim_i_j\n",
+    "        \n",
+    "print(min)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 57,
+   "id": "a0d95f76",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████████████████████████████████████████████████████████████████████████| 4351/4351 [01:04<00:00, 67.76it/s]"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Accuracy: 0.2659109846852283\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "model.eval()\n",
+    "total_correct = 0\n",
+    "total_samples = 0\n",
+    "\n",
+    "with torch.no_grad():\n",
+    "    for i, (sentences_batch, labels_batch) in enumerate(tqdm(train_loader)):  # NOTE: train_loader, so this is training accuracy\n",
+    "        labels_batch = labels_batch.to(device)\n",
+    "#         labels_batch = F.one_hot(labels_batch, num_classes = 102).float()\n",
+    "        logits = model(sentences_batch).float()\n",
+    "        predicted = torch.argmax(logits, dim = 1)\n",
+    "        total_samples += labels_batch.size(0)\n",
+    "        total_correct += (predicted == labels_batch).sum().item()\n",
+    "\n",
+    "accuracy = total_correct / total_samples\n",
+    "print(\"Accuracy:\", accuracy)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 58,
+   "id": "da1763b7",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "ENTER DESCRIPTION pink\n",
+      "tensor([9.6708e-15, 1.0179e-04, 4.2242e-08, 1.3063e-15, 8.8056e-03, 0.0000e+00,\n",
+      "        1.6553e-14, 8.9271e-33, 9.0644e-27, 5.9910e-19, 2.2721e-24, 7.7432e-03,\n",
+      "        3.9587e-36, 7.1618e-07, 2.7430e-08, 0.0000e+00, 0.0000e+00, 1.4562e-03,\n",
+      "        9.8114e-06, 9.2844e-24, 7.8520e-33, 2.9296e-22, 3.5067e-13, 1.3316e-05,\n",
+      "        7.7768e-11, 9.2201e-39, 5.0639e-22, 1.6904e-19, 3.2689e-35, 1.0034e-14,\n",
+      "        9.8686e-01, 4.1330e-05, 6.3048e-01, 9.5960e-23, 1.2662e-14, 2.4540e-22,\n",
+      "        1.4413e-08, 9.9928e-01, 2.8299e-02, 4.9763e-10, 2.7364e-04, 9.9878e-01,\n",
+      "        0.0000e+00, 9.9998e-01, 6.7328e-02, 2.9939e-13, 1.9145e-17, 0.0000e+00,\n",
+      "        0.0000e+00, 0.0000e+00, 0.0000e+00, 9.9998e-01, 1.1818e-30, 2.2513e-22,\n",
+      "        0.0000e+00, 1.0346e-32, 8.8656e-21, 9.9353e-01, 4.3037e-03, 8.6023e-39,\n",
+      "        3.6964e-10, 3.3164e-21, 1.9611e-15, 0.0000e+00, 3.7135e-38, 1.3163e-34,\n",
+      "        1.8906e-07, 7.0084e-30, 1.0882e-20, 2.6501e-33, 8.9597e-39, 5.0791e-37,\n",
+      "        1.0000e+00, 5.7929e-03, 1.3252e-03, 1.4498e-23, 1.3656e-02, 2.0226e-07,\n",
+      "        8.3005e-01, 8.4326e-14, 2.1941e-03, 3.8749e-28, 9.8803e-01, 9.9992e-01,\n",
+      "        4.3195e-11, 7.0360e-01, 1.0000e+00, 1.5408e-02, 9.9689e-01, 8.0569e-15,\n",
+      "        1.4282e-22, 9.6706e-03, 4.9712e-03, 4.8348e-05, 1.2486e-05, 9.9923e-01,\n",
+      "        6.3526e-06, 1.7522e-01, 8.8239e-01, 2.0713e-11, 2.2530e-20, 2.1032e-05],\n",
+      "       device='cuda:0', grad_fn=<SigmoidBackward0>)\n",
+      "tensor(72, device='cuda:0')\n"
+     ]
+    }
+   ],
+   "source": [
+    "sentence = input(\"ENTER DESCRIPTION \")\n",
+    "output = model(sentence)\n",
+    "predicted = torch.argmax(output)\n",
+    "print(output)\n",
+    "print(predicted)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 59,
+   "id": "bcf4856c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "torch.save(model.state_dict(), \"sentence_embedding.pth\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5f6f22f4",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
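
Review note on the training cell in Sentence_Embeddings.ipynb, offered as a rough sketch rather than the author's method: `nn.CrossEntropyLoss` applies log-softmax internally, so passing it sigmoid outputs bounds how low the loss can go. With 102 classes and every input confined to (0, 1), the per-sample loss cannot fall below ln(1 + 101/e) ≈ 3.64, which is in line with the logged values (1074.77 × 16 / 4351 ≈ 3.95 per batch). The pure-Python example below uses a hypothetical 4-class case; `softmax_cross_entropy`, `sigmoid`, and `mean_epoch_loss` are illustrative names and do not appear in the notebook.

```python
import math

def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of softmax(logits) against a single true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_index]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mean_epoch_loss(batch_losses):
    """Per-batch mean: divide by the number of batches, not by the batch size."""
    return sum(batch_losses) / len(batch_losses)

# Hypothetical 4-class example (the notebook uses 102 classes).
raw_logits = [8.0, -3.0, -3.0, -3.0]
squashed = [sigmoid(v) for v in raw_logits]

loss_raw = softmax_cross_entropy(raw_logits, 0)      # confident and correct: near zero
loss_squashed = softmax_cross_entropy(squashed, 0)   # same prediction, but stuck above the floor

print(f"raw logits:       {loss_raw:.5f}")
print(f"sigmoid-squashed: {loss_squashed:.5f}")
print(f"4-class floor:    {math.log(1 + 3 / math.e):.5f}")
```

If this reading is right, removing the sigmoid before the loss (and applying softmax only when probabilities are needed) would let the loss approach zero.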

Unconditional_Diffusion_64.ipynb ADDED

The diff for this file is too large to render.

Untitled.ipynb ADDED

@@ -0,0 +1,263 @@

+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "68fece49",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import gradio as gr\n",
+    "import torch\n",
+    "import torch.nn as nn\n",
+    "import torch.nn.functional as F\n",
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "class DoubleConv(nn.Module):\n",
+    "    def __init__(self, in_channels, out_channels, mid_channels=None, residual=False):\n",
+    "        super().__init__()\n",
+    "        self.residual = residual\n",
+    "        if not mid_channels:\n",
+    "            mid_channels = out_channels\n",
+    "        self.double_conv = nn.Sequential(\n",
+    "            nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),\n",
+    "            nn.GroupNorm(1, mid_channels),\n",
+    "            nn.GELU(),\n",
+    "            nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),\n",
+    "            nn.GroupNorm(1, out_channels),\n",
+    "        )\n",
+    "\n",
+    "    def forward(self, x):\n",
+    "        if self.residual:\n",
+    "            return F.gelu(x + self.double_conv(x))\n",
+    "        else:\n",
+    "            return self.double_conv(x)\n",
+    "\n",
+    "class Down(nn.Module):\n",
+    "    def __init__(self, in_channels, out_channels, emb_dim=256):\n",
+    "        super().__init__()\n",
+    "        self.maxpool_conv = nn.Sequential(\n",
+    "            nn.MaxPool2d(2),\n",
+    "            DoubleConv(in_channels, in_channels, residual=True),\n",
+    "            DoubleConv(in_channels, out_channels),\n",
+    "        )\n",
+    "\n",
+    "        self.emb_layer = nn.Sequential(\n",
+    "            nn.SiLU(),\n",
+    "            nn.Linear(\n",
+    "                emb_dim,\n",
+    "                out_channels\n",
+    "            ),\n",
+    "        )\n",
+    "\n",
+    "    def forward(self, x, t):\n",
+    "        x = self.maxpool_conv(x)\n",
+    "        emb = self.emb_layer(t)[:, :, None, None].repeat(1, 1, x.shape[-2], x.shape[-1])\n",
+    "        return x + emb\n",
+    "\n",
+    "class Up(nn.Module):\n",
+    "    def __init__(self, in_channels, out_channels, emb_dim=256):\n",
+    "        super().__init__()\n",
+    "\n",
+    "        self.up = nn.Upsample(scale_factor=2, mode=\"bilinear\", align_corners=True)\n",
+    "        self.conv = nn.Sequential(\n",
+    "            DoubleConv(in_channels, in_channels, residual=True),\n",
+    "            DoubleConv(in_channels, out_channels, in_channels // 2),\n",
+    "        )\n",
+    "\n",
+    "        self.emb_layer = nn.Sequential(\n",
+    "            nn.SiLU(),\n",
+    "            nn.Linear(\n",
+    "                emb_dim,\n",
+    "                out_channels\n",
+    "            ),\n",
+    "        )\n",
+    "\n",
+    "    def forward(self, x, skip_x, t):\n",
+    "        x = self.up(x)\n",
+    "        x = torch.cat([skip_x, x], dim=1)\n",
+    "        x = self.conv(x)\n",
+    "        emb = self.emb_layer(t)[:, :, None, None].repeat(1, 1, x.shape[-2], x.shape[-1])\n",
+    "        return x + emb\n",
+    "\n",
+    "class UNet(nn.Module):\n",
+    "    def __init__(self, c_in=3, c_out=3, time_dim=256, device=\"cuda\"):\n",
+    "        super().__init__()\n",
+    "        self.device = device\n",
+    "        self.time_dim = time_dim\n",
+    "\n",
+    "        self.inc = DoubleConv(c_in, 64)\n",
+    "        self.down1 = Down(64, 128)\n",
+    "        self.down2 = Down(128, 256)\n",
+    "        self.down3 = Down(256, 256)\n",
+    "\n",
+    "        self.bot1 = DoubleConv(256, 512)\n",
+    "        self.bot2 = DoubleConv(512, 512)\n",
+    "        self.bot3 = DoubleConv(512, 256)\n",
+    "\n",
+    "        self.up1 = Up(512, 128)\n",
+    "        self.up2 = Up(256, 64)\n",
+    "        self.up3 = Up(128, 64)\n",
+    "        self.outc = nn.Conv2d(64, c_out, kernel_size=1)\n",
+    "\n",
+    "    def positional_encoding(self, t, channels):\n",
+    "        inv_freq = 1.0 / (\n",
+    "            10000\n",
+    "            ** (torch.arange(0, channels, 2, device=self.device).float() / channels)\n",
+    "        )\n",
+    "        pos_enc_a = torch.sin(t.repeat(1, channels // 2) * inv_freq)\n",
+    "        pos_enc_b = torch.cos(t.repeat(1, channels // 2) * inv_freq)\n",
+    "        pos_enc = torch.cat([pos_enc_a, pos_enc_b], dim=-1)\n",
+    "        return pos_enc\n",
+    "\n",
+    "    def forward(self, image, t):\n",
+    "        t = t.unsqueeze(-1).type(torch.float)\n",
+    "        t = self.positional_encoding(t, self.time_dim)\n",
+    "\n",
+    "        x1 = self.inc(image)\n",
+    "        x2 = self.down1(x1, t)\n",
+    "        x3 = self.down2(x2, t)\n",
+    "        x4 = self.down3(x3, t)\n",
+    "\n",
+    "        x4 = self.bot1(x4)\n",
+    "        # x4 = self.bot2(x4)\n",
+    "        x4 = self.bot3(x4)\n",
+    "\n",
+    "        x = self.up1(x4, x3, t)\n",
+    "        x = self.up2(x, x2, t)\n",
+    "        x = self.up3(x, x1, t)\n",
+    "        output = self.outc(x)\n",
+    "        return output\n",
+    "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
+    "model = UNet(device = device).to(device)\n",
+    "model.load_state_dict(torch.load('Model_Saved_States/diffusion_64.pth', map_location=device))\n",
+    "img_size = 64\n",
+    "class Diffusion():\n",
+    "  def __init__(self, time_steps = 500, beta_start = 0.0001, beta_stop = 0.02, image_size = 64, device = device):\n",
+    "    self.time_steps = time_steps\n",
+    "    self.beta_start = beta_start\n",
+    "    self.beta_stop = beta_stop\n",
+    "    self.img_size = image_size\n",
+    "    self.device = device\n",
+    "\n",
+    "    self.beta = self.beta_schedule()\n",
+    "    self.beta = self.beta.to(device)\n",
+    "    self.alpha = 1 - self.beta\n",
+    "    self.alpha = self.alpha.to(device)\n",
+    "    self.alpha_hat = torch.cumprod(self.alpha, dim = 0).to(device)\n",
+    "\n",
+    "\n",
+    "  def beta_schedule(self):\n",
+    "    return torch.linspace(self.beta_start, self.beta_stop, self.time_steps)\n",
+    "\n",
+    "  def noise_images(self, images, t):\n",
+    "    sqrt_alpha_hat = torch.sqrt(self.alpha_hat[t])[:, None, None, None,]\n",
+    "    sqrt_one_minus_alpha_hat = torch.sqrt(1 - self.alpha_hat[t])[:, None, None, None,]\n",
+    "    noises = torch.randn_like(images)\n",
+    "    noised_images = sqrt_alpha_hat * images + sqrt_one_minus_alpha_hat * noises\n",
+    "    return noised_images, noises\n",
+    "\n",
+    "  def random_timesteps(self, n):\n",
+    "    return torch.randint(low=1, high=self.time_steps, size=(n,))\n",
+    "\n",
+    "  def generate_samples(self, model, n):\n",
+    "    with torch.no_grad():\n",
+    "            x = torch.randn((n, 3, self.img_size, self.img_size)).to(self.device)\n",
+    "            for i in range(self.time_steps - 1, 0, -1):\n",
+    "                t = (torch.ones(n) * i).long().to(self.device)\n",
+    "                predicted_noise = model(x, t)\n",
+    "                alpha = self.alpha[t][:, None, None, None]\n",
+    "                alpha_hat = self.alpha_hat[t][:, None, None, None]\n",
+    "                beta = self.beta[t][:, None, None, None]\n",
+    "                if i > 1:\n",
+    "                    noise = torch.randn_like(x)\n",
+    "                else:\n",
+    "                    noise = torch.zeros_like(x)\n",
+    "                x = 1 / torch.sqrt(alpha) * (x - ((1 - alpha) / (torch.sqrt(1 - alpha_hat))) * predicted_noise) + torch.sqrt(beta) * noise\n",
+    "\n",
+    "    return (x[0].cpu().numpy().transpose(1, 2, 0) / 255)\n",
+    "      #show_images\n",
+    "\n",
+    "diffusion = Diffusion()\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "id": "a80516cd",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Running on local URL:  http://127.0.0.1:7867\n",
+      "Running on public URL: https://080248f8c7c14eec1e.gradio.live\n",
+      "\n",
+      "This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)\n"
+     ]
+    },
+    {
+     "data": {
+      "text/html": [
+       "<div><iframe src=\"https://080248f8c7c14eec1e.gradio.live\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "data": {
+      "text/plain": []
+     },
+     "execution_count": 26,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import numpy as np\n",
+    "def greet(n):\n",
+    "    image = diffusion.generate_samples(model, n = 1)\n",
+    "    image = (np.clip(image * 255, -1, 1) + 1) / 2\n",
+    "    plt.imshow(image)\n",
+    "    return image\n",
+    "\n",
+    "iface = gr.Interface(fn=greet, inputs=\"number\", outputs=\"image\")\n",
+    "iface.launch(share = True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cc6f5064",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
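
Review note on `UNet.positional_encoding` in Untitled.ipynb: the timestep is turned into a sinusoidal embedding before it reaches the `Down` and `Up` blocks. The sketch below reproduces that computation for a single scalar timestep in pure Python, assuming the same layout as the diff (all sines first, then all cosines). The list-based function is illustrative only; the notebook operates on batched tensors.

```python
import math

def positional_encoding(t, channels):
    """Sinusoidal embedding of a scalar timestep t into `channels` values.

    Frequencies follow 1 / 10000^(k / channels) for k = 0, 2, 4, ...
    Layout: the channels // 2 sine terms, then the channels // 2 cosine terms.
    """
    inv_freq = [1.0 / (10000 ** (k / channels)) for k in range(0, channels, 2)]
    sines = [math.sin(t * f) for f in inv_freq]
    cosines = [math.cos(t * f) for f in inv_freq]
    return sines + cosines

emb = positional_encoding(t=10, channels=8)
print(len(emb))                        # one value per channel
print([round(v, 4) for v in emb])
```

Low-index channels oscillate quickly with `t` and high-index channels slowly, which gives each timestep a distinct vector of fixed length.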

VAE_64.ipynb ADDED

The diff for this file is too large to render.

app.py ADDED

@@ -0,0 +1,182 @@

+import gradio as gr
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+import matplotlib.pyplot as plt
+class DoubleConv(nn.Module):
+    def __init__(self, in_channels, out_channels, mid_channels=None, residual=False):
+        super().__init__()
+        self.residual = residual
+        if not mid_channels:
+            mid_channels = out_channels
+        self.double_conv = nn.Sequential(
+            nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),
+            nn.GroupNorm(1, mid_channels),
+            nn.GELU(),
+            nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),
+            nn.GroupNorm(1, out_channels),
+        )
+    def forward(self, x):
+        if self.residual:
+            return F.gelu(x + self.double_conv(x))
+        else:
+            return self.double_conv(x)
+class Down(nn.Module):
+    def __init__(self, in_channels, out_channels, emb_dim=256):
+        super().__init__()
+        self.maxpool_conv = nn.Sequential(
+            nn.MaxPool2d(2),
+            DoubleConv(in_channels, in_channels, residual=True),
+            DoubleConv(in_channels, out_channels),
+        )
+        self.emb_layer = nn.Sequential(
+            nn.SiLU(),
+            nn.Linear(
+                emb_dim,
+                out_channels
+            ),
+        )
+    def forward(self, x, t):
+        x = self.maxpool_conv(x)
+        emb = self.emb_layer(t)[:, :, None, None].repeat(1, 1, x.shape[-2], x.shape[-1])
+        return x + emb
+class Up(nn.Module):
+    def __init__(self, in_channels, out_channels, emb_dim=256):
+        super().__init__()
+        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=True)
+        self.conv = nn.Sequential(
+            DoubleConv(in_channels, in_channels, residual=True),
+            DoubleConv(in_channels, out_channels, in_channels // 2),
+        )
+        self.emb_layer = nn.Sequential(
+            nn.SiLU(),
+            nn.Linear(
+                emb_dim,
+                out_channels
+            ),
+        )
+    def forward(self, x, skip_x, t):
+        x = self.up(x)
+        x = torch.cat([skip_x, x], dim=1)
+        x = self.conv(x)
+        emb = self.emb_layer(t)[:, :, None, None].repeat(1, 1, x.shape[-2], x.shape[-1])
+        return x + emb
+class UNet(nn.Module):
+    def __init__(self, c_in=3, c_out=3, time_dim=256, device="cuda"):
+        super().__init__()
+        self.device = device
+        self.time_dim = time_dim
+        self.inc = DoubleConv(c_in, 64)
+        self.down1 = Down(64, 128)
+        self.down2 = Down(128, 256)
+        self.down3 = Down(256, 256)
+        self.bot1 = DoubleConv(256, 512)
+        self.bot2 = DoubleConv(512, 512)
+        self.bot3 = DoubleConv(512, 256)
+        self.up1 = Up(512, 128)
+        self.up2 = Up(256, 64)
+        self.up3 = Up(128, 64)
+        self.outc = nn.Conv2d(64, c_out, kernel_size=1)
+    def positional_encoding(self, t, channels):
+        inv_freq = 1.0 / (
+            10000
+            ** (torch.arange(0, channels, 2, device=self.device).float() / channels)
+        )
+        pos_enc_a = torch.sin(t.repeat(1, channels // 2) * inv_freq)
+        pos_enc_b = torch.cos(t.repeat(1, channels // 2) * inv_freq)
+        pos_enc = torch.cat([pos_enc_a, pos_enc_b], dim=-1)
+        return pos_enc
+    def forward(self, image, t):
+        t = t.unsqueeze(-1).type(torch.float)
+        t = self.positional_encoding(t, self.time_dim)
+        x1 = self.inc(image)
+        x2 = self.down1(x1, t)
+        x3 = self.down2(x2, t)
+        x4 = self.down3(x3, t)
+        x4 = self.bot1(x4)
+        # x4 = self.bot2(x4)
+        x4 = self.bot3(x4)
+        x = self.up1(x4, x3, t)
+        x = self.up2(x, x2, t)
+        x = self.up3(x, x1, t)
+        output = self.outc(x)
+        return output
+device = 'cuda' if torch.cuda.is_available() else 'cpu'
+model = UNet(device = device).to(device)
+model.load_state_dict(torch.load('Model_Saved_States/diffusion_64.pth', map_location=device))
+img_size = 64
+class Diffusion():
+  def __init__(self, time_steps = 500, beta_start = 0.0001, beta_stop = 0.02, image_size = 64, device = device):
+    self.time_steps = time_steps
+    self.beta_start = beta_start
+    self.beta_stop = beta_stop
+    self.img_size = image_size
+    self.device = device
+    self.beta = self.beta_schedule()
+    self.beta = self.beta.to(device)
+    self.alpha = 1 - self.beta
+    self.alpha = self.alpha.to(device)
+    self.alpha_hat = torch.cumprod(self.alpha, dim = 0).to(device)
+  def beta_schedule(self):
+    return torch.linspace(self.beta_start, self.beta_stop, self.time_steps)
+  def noise_images(self, images, t):
+    sqrt_alpha_hat = torch.sqrt(self.alpha_hat[t])[:, None, None, None,]
+    sqrt_one_minus_alpha_hat = torch.sqrt(1 - self.alpha_hat[t])[:, None, None, None,]
+    noises = torch.randn_like(images)
+    noised_images = sqrt_alpha_hat * images + sqrt_one_minus_alpha_hat * noises
+    return noised_images, noises
+  def random_timesteps(self, n):
+    return torch.randint(low=1, high=self.time_steps, size=(n,))
+  def generate_samples(self, model, n):
+    with torch.no_grad():
+            x = torch.randn((n, 3, self.img_size, self.img_size)).to(self.device)
+            for i in range(self.time_steps - 1, 0, -1):
+                t = (torch.ones(n) * i).long().to(self.device)
+                predicted_noise = model(x, t)
+                alpha = self.alpha[t][:, None, None, None]
+                alpha_hat = self.alpha_hat[t][:, None, None, None]
+                beta = self.beta[t][:, None, None, None]
+                if i > 1:
+                    noise = torch.randn_like(x)
+                else:
+                    noise = torch.zeros_like(x)
+                x = 1 / torch.sqrt(alpha) * (x - ((1 - alpha) / (torch.sqrt(1 - alpha_hat))) * predicted_noise) + torch.sqrt(beta) * noise
+    # First sample as an HWC array; values are still roughly in [-1, 1].
+    return x[0].cpu().numpy().transpose(1, 2, 0)
+diffusion = Diffusion()
+import numpy as np
+def greet(n):
+    # The numeric input n is unused; one image is generated per call.
+    image = diffusion.generate_samples(model, n = 1)
+    image = (np.clip(image, -1, 1) + 1) / 2  # map [-1, 1] to [0, 1]
+    return image
+iface = gr.Interface(fn=greet, inputs="number", outputs="image")
+iface.launch()
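
Review note on the `Diffusion` class in app.py: the sampler depends on a linear beta schedule, its cumulative product `alpha_hat`, and a per-step denoising update. The pure-Python sketch below walks through those three pieces for a scalar value, assuming the usual DDPM convention of visiting every timestep from T-1 down to 1. The helper names (`linear_beta_schedule`, `cumulative_alpha`, `reverse_step`) are illustrative and do not appear in app.py.

```python
import math

def linear_beta_schedule(beta_start, beta_stop, steps):
    """Evenly spaced betas including both endpoints (like torch.linspace)."""
    step = (beta_stop - beta_start) / (steps - 1)
    return [beta_start + i * step for i in range(steps)]

def cumulative_alpha(betas):
    """alpha_hat[t] = product of (1 - beta[s]) for all s <= t."""
    out, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        out.append(running)
    return out

def reverse_step(x, predicted_noise, t, betas, alpha_hat, noise):
    """One denoising update for a scalar x, using the same formula as generate_samples."""
    alpha = 1.0 - betas[t]
    mean = (x - (1.0 - alpha) / math.sqrt(1.0 - alpha_hat[t]) * predicted_noise) / math.sqrt(alpha)
    return mean + math.sqrt(betas[t]) * noise

# Same hyperparameters as the Diffusion defaults in the diff.
betas = linear_beta_schedule(0.0001, 0.02, 500)
alpha_hat = cumulative_alpha(betas)

# Timesteps visited when sampling from T-1 down to 1 inclusive.
timesteps = list(range(len(betas) - 1, 0, -1))

print(f"alpha_hat at t=0:   {alpha_hat[0]:.6f}")
print(f"alpha_hat at t=499: {alpha_hat[-1]:.6f}")
print(f"first/last sampled timestep: {timesteps[0]}, {timesteps[-1]}")
```

`alpha_hat` shrinks towards zero as `t` grows, which is why a sample at the final timestep is treated as nearly pure Gaussian noise and used as the starting point for generation.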