<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>PyScript Test</title>
    <link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
    <script defer src="https://pyscript.net/alpha/pyscript.js"></script>
    <py-env>
        - scikit-learn
        - tabulate
    </py-env>

    <!-- from https://stackoverflow.com/a/62032824 -->
    <link rel="stylesheet"
          href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.6.0/styles/default.min.css">
    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.6.0/highlight.min.js"
            integrity="sha512-gU7kztaQEl7SHJyraPfZLQCNnrKdaQi5ndOyt4L4UPL/FHDd/uB9Je6KDARIqwnNNE27hnqoWLBq+Kpe4iHfeQ=="
            crossorigin="anonymous"
            referrerpolicy="no-referrer"></script>
    <script>hljs.highlightAll();</script>

  </head>
  <body>
      <p>Define your own sklearn classifier and evaluate it on the toy dataset. An example is shown below:</p>
      <pre><code class="python">from sklearn.linear_model import LogisticRegression
clf = LogisticRegression(random_state=0)
evaluate(clf)</code></pre>
      <p>Try to achieve a mean test accuracy of 0.85 or better. For inspiration, browse the <a href="https://scikit-learn.org/stable/supervised_learning.html" title="List of sklearn estimators">scikit-learn guide to supervised learning estimators</a>.</p>
      <p>Enter your code below, then press Shift+Enter:</p>
      <py-script>
          from statistics import mean
          from sklearn.datasets import make_classification
          from sklearn.model_selection import cross_validate
          import tabulate

          X, y = make_classification(n_samples=1000, n_informative=10, random_state=0)

          def evaluate(clf):
              cv_result = cross_validate(clf, X, y, scoring='accuracy', cv=5)
              time_fit = sum(cv_result['fit_time'])
              time_score = sum(cv_result['score_time'])

              print(f"Mean test accuracy: {mean(cv_result['test_score']):.3f}")
              print(f"Total training time: {time_fit:.1f} seconds")
              print(f"Total time for scoring: {time_score:.1f} seconds")

              show_result = {'split': [1, 2, 3, 4, 5], 'accuracy': cv_result['test_score']}
              print("Accuracy for each cross-validation split:")
              return tabulate.tabulate(show_result, tablefmt='html', headers='keys', floatfmt='.3f')
      </py-script>

      <py-repl auto-generate="true"></py-repl>
  </body>
</html>