Update README.md

README.md (changed)
We implemented this assignment using mainly Keras and Sklearn.

An example for the ‘Adults’ dataset:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.001.png)

An example for the ‘Bank-full’ dataset:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.002.png)

**Code Design:**
For the Adults dataset, the results of the model were:

- MMDNF (mean minimum Euclidean distance for the not-fooled samples) was 0.422.
- Several samples that fooled the detector:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.003.png)

- Several samples that did not fool the detector:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.004.png)

- Plotting the PCA shows that the fooled samples are very similar to the real data, while the not-fooled samples are less similar.

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.005.png)

- Out of 100 generated samples, 74 fooled the discriminator and 26 did not.
- A graph describing the generator and discriminator losses: the generator loss decreased sharply while the discriminator loss stayed roughly constant, until the two eventually converged near a loss of 0.6.

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.006.png)
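The MMDNF metric reported above lends itself to a short reference implementation. A minimal stdlib-only sketch (not the assignment's actual code; `real` and `not_fooled` stand in for preprocessed feature rows):

```python
from math import dist  # Euclidean distance (Python 3.8+)

def mmdnf(not_fooled, real):
    """Mean minimum Euclidean distance: for each not-fooled sample,
    take the distance to its nearest real sample, then average."""
    mins = [min(dist(x, r) for r in real) for x in not_fooled]
    return sum(mins) / len(mins)

# Toy usage with made-up 2-D rows:
real = [(0.0, 0.0), (1.0, 1.0)]
fake = [(0.0, 1.0), (3.0, 1.0)]
print(mmdnf(fake, real))  # 1.5
```

A lower MMDNF would mean the rejected samples still sit close to the real data manifold.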
For the Bank-full dataset, the results of the model were:

- MMDNF (mean minimum Euclidean distance for the not-fooled samples) was 0.305854238.
- Several samples that fooled the detector:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.007.png)

- Several samples that did not fool the detector:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.008.png)

- Plotting the PCA shows that the fooled samples are very similar to the real data, while the not-fooled samples are less similar.

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.009.png)

- Out of 100 generated samples, 32 fooled the discriminator and 68 did not.
- A graph describing the generator and discriminator losses: the generator loss decreased sharply while the discriminator loss stayed roughly constant, until the two eventually converged near a loss of 0.5.

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.010.png)
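The fooled / not-fooled counts above (74/26 for Adults, 32/68 for Bank-full) follow from thresholding the discriminator's output. A sketch of that split, assuming the discriminator emits a probability that a sample is real and using a 0.5 cut-off (both assumptions; the report does not state the exact rule):

```python
def split_by_discriminator(probs, threshold=0.5):
    """Split generated samples into fooled / not-fooled index lists.
    probs: the discriminator's probability-of-real for each generated sample."""
    fooled = [i for i, p in enumerate(probs) if p > threshold]
    not_fooled = [i for i, p in enumerate(probs) if p <= threshold]
    return fooled, not_fooled

# Hypothetical discriminator outputs for five generated samples:
probs = [0.9, 0.2, 0.7, 0.4, 0.55]
fooled, not_fooled = split_by_discriminator(probs)
print(len(fooled), len(not_fooled))  # 3 2
```

The two lists then drive both the MMDNF computation and the PCA scatter plots.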
## General Generator (Part 2)

- Class distribution:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.011.png)

- Note that there is some class imbalance here, which is nearly identical to the ratio between the mean confidence scores of the two classes.
- Probability distribution for class 0 and class 1, for the **test set**:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.012.png)

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.013.png)

- Note that the two distributions mirror each other.
Class distribution:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.014.png)

- The data here is even more imbalanced, and the confidence scores reflect this.
- Confidence score distribution for the test set:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.015.png)

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.016.png)

**Generator Results:**
Here we first uniformly sampled 1000 confidence rates from [0, 1].

- Training loss:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.017.png)

- Confidence score distribution for each class (note that they mirror each other):

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.018.png)

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.019.png)

- The results are far from uniform; they are clearly skewed towards the original confidence scores.
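Per the report, the generator evaluation starts by uniformly sampling 1000 target confidence rates from [0, 1]. A stdlib sketch of that step; the generator call that would consume each target is hypothetical and omitted:

```python
import random

random.seed(0)  # reproducibility for this sketch only
# Draw 1000 target confidence rates uniformly from [0, 1].
targets = [random.random() for _ in range(1000)]

# Each target would then be fed to the trained conditional generator,
# which should produce a sample whose classifier confidence matches it:
# sample = generator.predict([noise, target])   # hypothetical API

def hist(values, bins=10):
    """Counts per equal-width bin over [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts

# Uniform targets put roughly 100 samples in each of 10 bins; the achieved
# confidences reported above deviate strongly from this flat baseline.
print(hist(targets))
```

Comparing the achieved-confidence histogram against this flat baseline is what makes the skew visible.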
- Training loss:

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.021.png)

- Confidence score distribution for each class. As before, the two classes mirror each other. The distribution isn’t uniform, and is slightly skewed in the opposite direction of the test-set distribution.

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.022.png)

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.023.png)

- Error rates for class 1: **the lowest error rates were achieved for probabilities of around 0.4**, and the highest for a probability of 0.

![](figures/Aspose.Words.36be2542-1776-4b1c-8010-360ae82480ae.024.png)
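One way an error-rate-per-probability curve like this can be produced (an illustrative assumption; the report does not spell out its exact error definition) is to bin the requested probabilities and take the mean absolute gap to the achieved confidences in each bin:

```python
def error_by_bin(requested, achieved, bins=10):
    """Mean |requested - achieved| confidence, grouped by the
    requested probability's equal-width bin; None for empty bins."""
    sums = [0.0] * bins
    counts = [0] * bins
    for r, a in zip(requested, achieved):
        b = min(int(r * bins), bins - 1)
        sums[b] += abs(r - a)
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Toy data: requested confidences and what the generator actually achieved.
requested = [0.05, 0.15, 0.42, 0.45, 0.95]
achieved  = [0.30, 0.25, 0.40, 0.44, 0.70]
print(error_by_bin(requested, achieved))  # lowest error lands in the 0.4-0.5 bin
```

With this definition, a dip near 0.4 means the generator hits mid-range confidence targets most accurately, matching the observation above.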
## Discussion