jpmartinezc committed on
Commit
e0f3bbf
·
verified ·
1 Parent(s): c9bc6e6

Create README.md

  1. README.md +96 -0
README.md ADDED
@@ -0,0 +1,96 @@
+ # Model Name
+
+ This model predicts whether a chat message should earn participation points. It was developed for the FEV Participation Points project, which studied an intervention in which elementary and middle school tutors received guidance on awarding participation points during chat-based math tutoring sessions.
+
+ ---
+
+ ## Training Details
+
+ ### Base model
+
+ BERT base model
+
+ ### Datasets
+
+ The dataset is a subset of 1,000 messages, each of which includes the word "point".
+
+ | Dataset | Split | Size | Source | Notes |
+ |---------|-------|------|--------|-------|
+ | Tutor math chats | train | 1,000 | Shared by tutoring provider | Contains only utterances with the word "point" |
+
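As a rough illustration of the filter described above, the sketch below keeps only messages that contain "point". The source does not say whether matching was case-sensitive or limited to whole words, so the case-insensitive substring match used here (which also catches "points") is an assumption, as are the function name and the sample messages.

```python
import re

# Assumption: case-insensitive substring match; the original filter is not documented.
POINT_PATTERN = re.compile(r"point", re.IGNORECASE)


def filter_point_messages(messages):
    """Return only the messages that mention "point" (hypothetical helper)."""
    return [m for m in messages if POINT_PATTERN.search(m)]


# Made-up sample messages, for illustration only.
sample = [
    "Great job! You earned 2 points.",
    "What is 3 times 4?",
    "Here is a Point for explaining your work.",
]
kept = filter_point_messages(sample)
print(kept)
```

A whole-word match (`r"\bpoint\b"`) would exclude "points"; which variant matches the original preprocessing is not stated in the source.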
+
+ ### Hyperparameters
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Learning rate | 1e-5 |
+ | Batch size | 8 |
+ | Optimizer | AdamW (beta1=0.9, beta2=0.999, epsilon=1e-8) |
+ | Epochs / Steps | 20 epochs, with early stopping on minority-class F1 |
+ | Warmup | 0 |
+ | Weight decay | 0.01 |
+
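The "20 epochs with early stopping" row can be read as the loop sketched below: training runs for at most 20 epochs and stops once minority-class F1 on a validation set stops improving. The patience value, the function name, and the F1 values are all assumptions made for illustration; the source does not report them.

```python
def best_epoch_with_early_stopping(f1_per_epoch, max_epochs=20, patience=3):
    """Return (best_epoch, best_f1, epochs_run) under a simple patience rule.

    f1_per_epoch: validation F1 on the minority class after each epoch.
    patience: assumed value; the source does not state one.
    """
    best_epoch, best_f1, epochs_run = 0, float("-inf"), 0
    for epoch, f1 in enumerate(f1_per_epoch[:max_epochs], start=1):
        epochs_run = epoch
        if f1 > best_f1:
            best_epoch, best_f1 = epoch, f1
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_f1, epochs_run


# Made-up validation scores, not results from this model.
scores = [0.70, 0.82, 0.90, 0.89, 0.90, 0.88, 0.87]
result = best_epoch_with_early_stopping(scores)
print(result)
```

The checkpoint kept is the one from the best epoch, not the last epoch run.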
+ ---
+
+ ## Evaluation
+
+ ### Results
+
+ | Model | Dataset | Split | Metric | Score |
+ |-------|---------|-------|--------|-------|
+ | This model | Subset of math messages with points awarded | test | F1 (Yes class) | 0.9943 |
+ | This model | Subset of math messages with points awarded | test | F1 (No class) | 0.9583 |
+
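The two rows above report F1 separately for each class. As a reminder of how a per-class F1 is computed, the sketch below treats one label at a time as the positive class. The function name and the toy labels are illustrative assumptions, not the project's evaluation code or data.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # avoids division by zero when the class is never matched
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only.
y_true = ["yes", "yes", "yes", "no", "no", "yes"]
y_pred = ["yes", "yes", "no", "no", "yes", "yes"]
f1_yes = f1_for_class(y_true, y_pred, "yes")
f1_no = f1_for_class(y_true, y_pred, "no")
print(f"F1 (yes) = {f1_yes:.4f}, F1 (no) = {f1_no:.4f}")
```

Reporting both classes matters when one class is rare, because a high score on the majority class can hide weak performance on the minority class.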
+ ### Limitations and Caveats
+
+ - The model is highly specific to tasks related to FEV Participation Points.
+ - The model was trained on a subset of messages that include the word "point" in the utterance.
+
+ ---
+
+ ## How to Use
+
+ ### Message Structure
+
+ The classifier predicts on a single message at a time, without the preceding or following utterances as context.
+
+ ### Running instructions
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ # Path to the fine-tuned model (or a specific checkpoint folder)
+ model_dir = "model_outputs"
+ tokenizer = AutoTokenizer.from_pretrained(model_dir)
+ model = AutoModelForSequenceClassification.from_pretrained(model_dir)
+ model.eval()  # inference mode: disables dropout
+
+ text = "Your message here"
+ inputs = tokenizer(text, return_tensors="pt", truncation=True)
+ with torch.no_grad():  # no gradients needed for prediction
+     logits = model(**inputs).logits
+ pred_id = logits.argmax(dim=-1).item()
+ label = {0: "no", 1: "yes"}[pred_id]
+ print(label)
+ ```
+
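The snippet above returns only the predicted label. If a confidence score is also useful, the two logits can be converted to probabilities with a softmax. The sketch below does this in plain Python on made-up logit values; the function name is hypothetical, and the label mapping follows the snippet above.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


id2label = {0: "no", 1: "yes"}  # same mapping as in the snippet above

# Made-up logits for a single message, for illustration only.
example_logits = [-1.2, 2.3]
probs = softmax(example_logits)
pred_id = max(range(len(probs)), key=probs.__getitem__)
print(id2label[pred_id], round(probs[pred_id], 3))
```

With PyTorch tensors, `torch.softmax(logits, dim=-1)` computes the same quantity directly on the model output.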
+ ---
+
+ ## Code and Maintainers
+
+ **Repository:** https://github.com/scale-nssa/fev_partpoints_nlp
+
+ **Maintainers / Contributors:** FEV Participation Points team (lead: JP Martinez)
+
+ ---
+
+ ## Bias and Fairness
+
+ The dataset contains no demographic information about tutors or students.
+
+ ---
+
+ ## License
+
+ This model is released under [License Name](https://example.com/license).
+
+ ---