Keras
Committed by harpomaxx
Commit 5138de6
1 Parent(s): b2448f2

add Readme information

  1. README.md +18 -15
README.md CHANGED
@@ -4,30 +4,33 @@ library_name: keras

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed
-
  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:

- | Hyperparameters | Value |
- | :-- | :-- |
- | name | Adam |
- | learning_rate | 0.0010000000474974513 |
- | decay | 0.0 |
- | beta_1 | 0.8999999761581421 |
- | beta_2 | 0.9990000128746033 |
- | epsilon | 1e-07 |
- | amsgrad | False |
- | training_precision | float32 |
-

  ## Model description

+ A Domain Generation Algorithm (DGA) lexicographical detector using a 1D-CNN, as described in the article [*Deep Convolutional Neural Networks for DGA Detection* (Catania et al., 2018)](https://link.springer.com/chapter/10.1007/978-3-030-20787-8_23).\
+ \
+ The rest of the source code is available on [GitHub](https://github.com/harpomaxx/deepseq).

  ## Intended uses & limitations

  ## Training and evaluation data

+ The DGA detection method was trained and evaluated on a dataset containing both DGA and normal domain names.
+
+ The normal domain names were taken from the Alexa top one million domains. An additional 3,161 normal domains, provided by the Bambenek Consulting feed, were included in the dataset. This latter group is particularly interesting since it consists of suspicious domain names that were not generated by a DGA. Therefore, the total number of normal domains in the dataset is 1,003,161. DGA domains were obtained from the DGA domain repositories of [Andrey Abakumov](https://github.com/andrewaeva/DGA) and [John Bambenek](http://osint.bambenekconsulting.com/feeds/). The total number of DGA domains is 1,915,335, and they correspond to 51 different malware families.
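The counts quoted above imply a dataset that leans toward the DGA class. A minimal arithmetic check, assuming the Alexa list contributes exactly 1,000,000 entries (which is what the stated total of 1,003,161 implies); the variable names are illustrative only:

```python
# Counts taken from the README text; names are illustrative, not from the repo.
ALEXA_DOMAINS = 1_000_000     # assumption: "top one million" means exactly 1,000,000
BAMBENEK_NORMAL = 3_161       # suspicious but non-DGA domains
DGA_DOMAINS = 1_915_335       # spread over 51 malware families

normal_total = ALEXA_DOMAINS + BAMBENEK_NORMAL
dataset_total = normal_total + DGA_DOMAINS
dga_share = DGA_DOMAINS / dataset_total

print(f"normal domains: {normal_total:,}")
print(f"all domains:    {dataset_total:,}")
print(f"DGA share:      {dga_share:.1%}")
```

Roughly two thirds of the samples are DGA domains, which is worth keeping in mind when reading accuracy-style metrics.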

  ## Training procedure

+ A traditional grid search was conducted over a specified subset of the hyperparameter space, using the training set. For a robust estimate, each parameter combination was evaluated using k-fold cross-validation with *k=10* folds. The 1D-CNN was trained with backpropagation using the Adaptive Moment Estimation (Adam) optimizer. Training ran for 10 epochs; the number of epochs was chosen to avoid overfitting.
+
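The grid search with 10-fold cross-validation described above can be sketched in plain Python. This is a sketch under stated assumptions, not the code used for the model: the grid values, the `toy_evaluate` scoring stub, and the helper names are hypothetical, and the folds are contiguous and unshuffled for brevity (a real run would typically shuffle or stratify, and `evaluate` would train the 1D-CNN on the training indices and score it on the validation indices).

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def grid_search(param_grid, n_samples, evaluate, k=10):
    """Return (best_params, best_score) using the mean score over k folds."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = [
            evaluate(params, train_idx, val_idx)
            for train_idx, val_idx in k_fold_indices(n_samples, k)
        ]
        score = mean(fold_scores)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params, train_idx, val_idx):
    """Hypothetical stand-in for 'train on train_idx, score on val_idx'."""
    return -abs(params["filters"] - 128) - abs(params["kernel_size"] - 4)


# Hypothetical grid; the README does not list the values that were searched.
grid = {"filters": [64, 128, 256], "kernel_size": [2, 4, 8]}
best_params, best_score = grid_search(grid, n_samples=1000, evaluate=toy_evaluate)
print(best_params, best_score)
```

Averaging over the folds, rather than picking the best single fold, is what makes the comparison between parameter combinations robust.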
  ### Training hyperparameters

  The following hyperparameters were used during training:

+ | Hyperparameters | Value |
+ |:-------------------|:----------------------|
+ | name | Adam |
+ | learning_rate | 0.0010000000474974513 |
+ | decay | 0.0 |
+ | beta_1 | 0.8999999761581421 |
+ | beta_2 | 0.9990000128746033 |
+ | epsilon | 1e-07 |
+ | amsgrad | False |
+ | training_precision | float32 |
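The long decimals in the table are the float32 roundings of the usual Adam defaults (0.001, 0.9, 0.999), which fits the `float32` training precision. The sketch below reproduces those values with the standard library and applies one textbook Adam update with them. It is an illustration only: `as_float32` and `adam_step` are hypothetical helpers, the update follows the standard published formulation rather than Keras's exact internals, and `decay` and `amsgrad` are left out because the table sets them to 0.0 and False.

```python
import math
import struct


def as_float32(x):
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Values from the table, recovered from their intended defaults.
learning_rate = as_float32(0.001)
beta_1 = as_float32(0.9)
beta_2 = as_float32(0.999)
epsilon = 1e-07


def adam_step(param, grad, m, v, t):
    """One textbook Adam update for a scalar parameter at step t (t >= 1)."""
    m = beta_1 * m + (1 - beta_1) * grad          # first-moment estimate
    v = beta_2 * v + (1 - beta_2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta_1 ** t)                 # bias correction
    v_hat = v / (1 - beta_2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


print(learning_rate, beta_1, beta_2)
param, m, v = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(param)
```

On the first step the bias-corrected ratio is close to 1 in magnitude, so the parameter moves by about one learning rate regardless of the gradient's scale.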