siebert commited on
Commit
1afccb3
·
1 Parent(s): 4d7a7e7

Fixed minor details in README.md

Browse files
Files changed (1) hide show
  1. README.md +4 -4
README.md CHANGED
@@ -12,13 +12,13 @@ This model is a fine-tuned checkpoint of [RoBERTa-large](https://huggingface.co/
12
 
13
 
14
  # Predictions on a data set
15
- If you want to predict sentiment for your own data, we provide an example script via [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). You can load your data to a Google Drive and run the script for free on a Colab GPU. Set-up takes only a few minutes. We suggest that you manually label a subset of your data to evaluate performance for your use case. For performance benchmark values across different sentiment analysis contexts, please refer to our paper ([Heitmann et al. 2020](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489963)).
16
 
17
  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/chrsiebert/sentiment-roberta-large-english/blob/main/sentiment_roberta_prediction_example.ipynb)
18
 
19
 
20
  # Use in a Hugging Face pipeline
21
- The easiest way to use the model for single predictions is Hugging Face's [sentiment analysis pipeline](https://huggingface.co/transformers/quicktour.html#getting-started-on-a-task-with-a-pipeline), which only needs a couple lines of code as in the following example:
22
  ```
23
  from transformers import pipeline
24
  sentiment_analysis = pipeline("sentiment-analysis",model="siebert/sentiment-roberta-large-english")
@@ -29,11 +29,11 @@ print(sentiment_analysis("I love this!"))
29
 
30
 
31
  # Use for further fine-tuning
32
- The model can also be used as a starting point for further fine-tuning on your specific data. Please refer to Hugging Face's [documentation](https://huggingface.co/transformers/custom_datasets.html#fine-tuning-with-trainer) for further details and example code.
33
 
34
 
35
  # Performance
36
- To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2, see table below). As a robustness check, we evaluate the model in a leave-on-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. Model performance is given as evaluation set accuracy in percent.
37
 
38
  |Dataset|DistilBERT SST-2|This model|
39
  |---|---|---|
 
12
 
13
 
14
  # Predictions on a data set
15
+ If you want to predict sentiment for your own data, we provide an example script via [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb). You can load your data to a Google Drive and run the script for free on a Colab GPU. Set-up takes only a few minutes. We suggest that you manually label a subset of your data to evaluate performance for your use case. For performance benchmark values across various sentiment analysis contexts, please refer to our paper ([Heitmann et al. 2020](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3489963)).
16
 
17
  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/chrsiebert/sentiment-roberta-large-english/blob/main/sentiment_roberta_prediction_example.ipynb)
18
 
19
 
20
  # Use in a Hugging Face pipeline
21
+ The easiest way to use the model for single predictions is Hugging Face's [sentiment analysis pipeline](https://huggingface.co/transformers/quicktour.html#getting-started-on-a-task-with-a-pipeline), which only needs a couple lines of code as shown in the following example:
22
  ```
23
  from transformers import pipeline
24
  sentiment_analysis = pipeline("sentiment-analysis",model="siebert/sentiment-roberta-large-english")
 
29
 
30
 
31
  # Use for further fine-tuning
32
+ The model can also be used as a starting point for further fine-tuning of RoBERTa on your specific data. Please refer to Hugging Face's [documentation](https://huggingface.co/transformers/custom_datasets.html#fine-tuning-with-trainer) for further details and example code.
33
 
34
 
35
  # Performance
36
+ To evaluate the performance of our general-purpose sentiment analysis model, we set aside an evaluation set from each data set, which was not used for training. On average, our model outperforms a [DistilBERT-based model](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) (which is solely fine-tuned on the popular SST-2 data set) by more than 15 percentage points (78.1 vs. 93.2 percent, see table below). As a robustness check, we evaluate the model in a leave-on-out manner (training on 14 data sets, evaluating on the one left out), which decreases model performance by only about 3 percentage points on average and underscores its generalizability. Model performance is given as evaluation set accuracy in percent.
37
 
38
  |Dataset|DistilBERT SST-2|This model|
39
  |---|---|---|