simonschoe committed on
Commit
82aa653
1 Parent(s): af24f2b

Update README.md

Files changed (1)
  1. README.md +63 -6
README.md CHANGED
@@ -4,13 +4,70 @@ language:
4
  pipeline_tag: text-classification
5
  tags:
6
  widget:
7
- - text: "transformation"
8
- example_title: "transformation"
9
- - text: "sustainability"
10
- example_title: "sustainability"
 
 
11
 
12
  ---
13
 
14
- # Model Card
15
 
16
- Description here
 
4
  pipeline_tag: text-classification
5
  tags:
6
  widget:
7
+ - text: "And it was great to see how our Chinese team very much aware of that and of shifting all the resourcing to really tap into these opportunities."
8
+ example_title: "Exemplary Transformation Sentence"
9
+ - text: "But we will continue to recruit even after that because we expect that the volumes are going to continue to grow."
10
+ example_title: "Exemplary Non-Transformation Sentence"
11
+ - text: "So and again, we'll be disclosing the current taxes that are there in Guyana, along with that revenue adjustment."
12
+ example_title: "Exemplary Non-Transformation Sentence"
13
 
14
  ---
15
 
16
+ # TransformationTransformer
17
 
18
+ **TransformationTransformer** is a fine-tuned [distilroberta](https://huggingface.co/distilroberta-base) model. It is trained and evaluated on 10,000 manually annotated sentences drawn from the Q&A section of quarterly earnings conference calls. Specifically, it was trained on sentences spoken by firm executives to distinguish sentences that allude to **business transformation** from those that discuss other topics. More details about the training procedure can be found [below](#model-training).
19
+
20
+
21
+ ## Background
22
+
23
+ Context on the project.
24
+
25
+
26
+ ## Usage
27
+
28
+ The model is intended for sentence classification: it creates a contextual text representation of the input sentence and outputs a probability value. `LABEL_1` indicates a sentence predicted to contain transformation-related content; `LABEL_0` indicates a sentence predicted not to. The query should consist of a single sentence.
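To make the label convention concrete, the sketch below turns a raw prediction into a readable verdict. It assumes the prediction is a dict with `label` and `score` keys, the shape a `transformers` text-classification pipeline typically returns. The helper name `interpret`, the readable label names, and the 0.5 threshold are illustrative assumptions, not part of the model.

```python
# Hypothetical helper: the names and the threshold are illustrative assumptions.
LABEL_NAMES = {"LABEL_0": "non-transformation", "LABEL_1": "transformation"}

def interpret(prediction, threshold=0.5):
    """Convert {'label': ..., 'score': ...} into (readable_label, P(transformation))."""
    label, score = prediction["label"], prediction["score"]
    # The score belongs to the predicted label; with two classes, the
    # probability of the other class is its complement.
    p_transformation = score if label == "LABEL_1" else 1.0 - score
    key = "LABEL_1" if p_transformation >= threshold else "LABEL_0"
    return LABEL_NAMES[key], p_transformation

# Made-up prediction for illustration (not real model output):
print(interpret({"label": "LABEL_0", "score": 0.9}))
```

Raising the threshold above 0.5 would trade recall for precision on the transformation class; whether that is desirable depends on the downstream use.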
29
+
30
+
31
+ ## Usage (API)
32
+
+ ```python
+ import json
+ import requests
+
+ API_TOKEN = "<TOKEN>"  # replace with your Hugging Face API token
+
+ headers = {"Authorization": f"Bearer {API_TOKEN}"}
+ API_URL = "https://api-inference.huggingface.co/models/simonschoe/TransformationTransformer"
+
+ def query(payload):
+     # Serialize the payload, POST it to the Inference API, and decode the JSON reply
+     data = json.dumps(payload)
+     response = requests.post(API_URL, headers=headers, data=data)
+     return json.loads(response.content.decode("utf-8"))
+
+ query({"inputs": "<insert-sentence-here>"})
+ ```
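The decoded API reply still needs to be unpacked before use. The sketch below is one way to do that, under two assumptions about the response shape that are not confirmed by this README: that errors (for example, while the model is still loading) arrive as a dict with an `error` key, and that a successful reply for one sentence is a (possibly nested) list of `{"label", "score"}` dicts. The function name `top_label` is hypothetical.

```python
def top_label(api_result):
    """Return (label, score) of the highest-scoring class, or raise on an API error."""
    # Assumption: errors arrive as a dict carrying an "error" key.
    if isinstance(api_result, dict) and "error" in api_result:
        raise RuntimeError(api_result["error"])
    # Assumption: one input sentence yields either a flat list of
    # {"label", "score"} dicts or that list wrapped in an outer list.
    candidates = api_result[0] if isinstance(api_result[0], list) else api_result
    best = max(candidates, key=lambda c: c["score"])
    return best["label"], best["score"]

# Made-up response for illustration (not real model output):
example = [[{"label": "LABEL_1", "score": 0.93}, {"label": "LABEL_0", "score": 0.07}]]
print(top_label(example))
```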
49
+
50
+ ## Usage (transformers)
51
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
+
+ tokenizer = AutoTokenizer.from_pretrained("simonschoe/TransformationTransformer")
+ model = AutoModelForSequenceClassification.from_pretrained("simonschoe/TransformationTransformer")
+
+ # Wrap model and tokenizer in a text-classification pipeline
+ classifier = pipeline('text-classification', model=model, tokenizer=tokenizer)
+ classifier('<insert-sentence-here>')
+ ```
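For a single-label model with two classes, the text-classification pipeline normally obtains its score by applying a softmax to the model's raw logits. The sketch below shows that computation in plain Python. The logit values are made up for illustration, and the `[LABEL_0, LABEL_1]` ordering is an assumption; in practice the logits would come from calling the model on tokenized input.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits, assumed to be ordered as [LABEL_0, LABEL_1]
probs = softmax([-1.2, 2.3])
print({"LABEL_0": probs[0], "LABEL_1": probs[1]})
```

The two probabilities sum to one, which is why a single score for the predicted label is enough to recover the probability of the other class.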
61
+
62
+
63
+ ## Model Training
64
+
65
+ The model has been trained on text data stemming from earnings call transcripts. The data is restricted to a call's question-and-answer (Q&A) section and the remarks by firm executives. The data has been segmented into individual sentences using [`spacy`](https://spacy.io/).
66
+
67
+ **Statistics of Training Data:**
68
+ - Labeled sentences: 10,000
69
+ - Data distribution: xxx
70
+ - Inter-coder agreement: xxx
71
+
72
+ The following script contains the training pipeline:
73
+ <link to script>