File size: 10,113 Bytes
05e2b8a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
---
license: apache-2.0
library_name: sklearn
tags:
- tabular-classification
- baseline-trainer
---

## Baseline Model trained on arinfo_sample_dataset_finaltffwjv58 to apply classification on model

**Metrics of the best model:**

accuracy           0.930688

recall_macro       0.655991

precision_macro    0.640972

f1_macro           0.638021

Name: DecisionTreeClassifier(class_weight='balanced', max_depth=2249), dtype: float64



**See model plot below:**

<style>#sk-container-id-4 {color: black;background-color: white;}#sk-container-id-4 pre{padding: 0;}#sk-container-id-4 div.sk-toggleable {background-color: white;}#sk-container-id-4 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-4 label.sk-toggleable__label-arrow:before {content: "▸";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-4 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-4 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-4 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-4 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-4 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-4 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: "▾";}#sk-container-id-4 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-4 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-4 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-4 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-4 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-4 div.sk-parallel-item::after {content: "";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-4 div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-4 div.sk-serial::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-4 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-4 div.sk-item {position: relative;z-index: 1;}#sk-container-id-4 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-4 div.sk-item::before, #sk-container-id-4 div.sk-parallel-item::before {content: "";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-4 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-4 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-4 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-4 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-4 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-4 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-4 div.sk-label-container {text-align: center;}#sk-container-id-4 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-4 div.sk-text-repr-fallback {display: none;}</style><div id="sk-container-id-4" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>Pipeline(steps=[(&#x27;easypreprocessor&#x27;,EasyPreprocessor(types=             continuous  dirty_float  low_card_int  ...   date  free_string  useless
rto               False        False         False  ...  False         True    False
ownerNum          False        False         False  ...  False        False    False
cc                False        False         False  ...  False        False    False
insurance         False        False         False  ...  False        False    False
weight             True        False         False  ...  False        False    False
financer          False        False         False  ...  False         True    False
fu...
class             False        False         False  ...  False        False    False
state             False        False         False  ...  False        False    False
year              False        False         False  ...  False        False    False
categoryId        False        False         False  ...  False        False    False
onroadPrice        True        False         False  ...  False        False    False
price_FAIR         True        False         False  ...  False        False    False[13 rows x 7 columns])),(&#x27;decisiontreeclassifier&#x27;,DecisionTreeClassifier(class_weight=&#x27;balanced&#x27;,max_depth=2249))])</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-10" type="checkbox" ><label for="sk-estimator-id-10" class="sk-toggleable__label sk-toggleable__label-arrow">Pipeline</label><div class="sk-toggleable__content"><pre>Pipeline(steps=[(&#x27;easypreprocessor&#x27;,EasyPreprocessor(types=             continuous  dirty_float  low_card_int  ...   date  free_string  useless
rto               False        False         False  ...  False         True    False
ownerNum          False        False         False  ...  False        False    False
cc                False        False         False  ...  False        False    False
insurance         False        False         False  ...  False        False    False
weight             True        False         False  ...  False        False    False
financer          False        False         False  ...  False         True    False
fu...
class             False        False         False  ...  False        False    False
state             False        False         False  ...  False        False    False
year              False        False         False  ...  False        False    False
categoryId        False        False         False  ...  False        False    False
onroadPrice        True        False         False  ...  False        False    False
price_FAIR         True        False         False  ...  False        False    False[13 rows x 7 columns])),(&#x27;decisiontreeclassifier&#x27;,DecisionTreeClassifier(class_weight=&#x27;balanced&#x27;,max_depth=2249))])</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-11" type="checkbox" ><label for="sk-estimator-id-11" class="sk-toggleable__label sk-toggleable__label-arrow">EasyPreprocessor</label><div class="sk-toggleable__content"><pre>EasyPreprocessor(types=             continuous  dirty_float  low_card_int  ...   date  free_string  useless
rto               False        False         False  ...  False         True    False
ownerNum          False        False         False  ...  False        False    False
cc                False        False         False  ...  False        False    False
insurance         False        False         False  ...  False        False    False
weight             True        False         False  ...  False        False    False
financer          False        False         False  ...  False         True    False
fuelType          False        False         False  ...  False        False    False
class             False        False         False  ...  False        False    False
state             False        False         False  ...  False        False    False
year              False        False         False  ...  False        False    False
categoryId        False        False         False  ...  False        False    False
onroadPrice        True        False         False  ...  False        False    False
price_FAIR         True        False         False  ...  False        False    False[13 rows x 7 columns])</pre></div></div></div><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-12" type="checkbox" ><label for="sk-estimator-id-12" class="sk-toggleable__label sk-toggleable__label-arrow">DecisionTreeClassifier</label><div class="sk-toggleable__content"><pre>DecisionTreeClassifier(class_weight=&#x27;balanced&#x27;, max_depth=2249)</pre></div></div></div></div></div></div></div>

**Disclaimer:** This model is trained with dabl library as a baseline, for better results, use [AutoTrain](https://huggingface.co/autotrain).

**Logs of training** including the models tried in the process can be found in logs.txt