---
library_name: transformers
tags: []
---
## Fine-tuned roberta-base for detecting paragraphs with the eHRAF-assigned subject code '610'
## Description
This is a fine-tuned roberta-base model for detecting whether paragraphs drawn from ethnographic source material classified under the main subject 'Marriage, Family, Kinship and Social Organization' are more specifically about subject code '610'.
## Usage
The easiest way to use this model at inference time is with the HF pipelines API.
```python
from transformers import pipeline
classifier = pipeline("text-classification", model="gptmurdock/classifier-610")
classifier("Example text to classify")
```
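If you prefer to work with the tokenizer and model directly (for example, to batch inputs or inspect raw logits), a minimal sketch along these lines should also work; the only model-specific detail it relies on is the checkpoint name used above.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "gptmurdock/classifier-610"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize a batch of paragraphs and run them through the classifier.
texts = ["Example text to classify", "Another paragraph of ethnographic text"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map predicted class indices back to the model's label names.
predictions = logits.argmax(dim=-1)
print([model.config.id2label[int(p)] for p in predictions])
```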
## Training data
...
## Training procedure
...
We used a 60-20-20 train-val-test split and fine-tuned roberta-base for 5 epochs (learning rate = 2e-5, batch size = 40).
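For illustration only, a fine-tuning run with these hyperparameters would look roughly like the sketch below. The dataset objects, the `"text"` column name, and the label count are placeholders, not the actual training setup.
```python
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

def tokenize(batch):
    # "text" is a placeholder column name for the paragraph text.
    return tokenizer(batch["text"], truncation=True)

# train_dataset / val_dataset stand in for the 60-20-20 split described above.
args = TrainingArguments(
    output_dir="classifier-610",
    learning_rate=2e-5,
    per_device_train_batch_size=40,
    num_train_epochs=5,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset.map(tokenize, batched=True),
    eval_dataset=val_dataset.map(tokenize, batched=True),
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)
trainer.train()
```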
## Evaluation
Evaluation results on the held-out test set are reported below.
| Metric    | Value (%) |
|-----------|-----------|
| Precision | 91.2      |
| Recall    | 91.3      |
| F1        | 91.2      |
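For reference, precision, recall, and F1 on a labeled test split can be recomputed along these lines. This is a sketch only: `test_texts`, `test_labels`, and the positive label name `"LABEL_1"` are assumptions, not details taken from this model card.
```python
from sklearn.metrics import precision_recall_fscore_support
from transformers import pipeline

classifier = pipeline("text-classification", model="gptmurdock/classifier-610")

# test_texts and test_labels are placeholders for the held-out test split.
predictions = [pred["label"] for pred in classifier(test_texts, truncation=True)]
precision, recall, f1, _ = precision_recall_fscore_support(
    test_labels, predictions, average="binary", pos_label="LABEL_1"
)
print(f"Precision: {precision:.3f}, Recall: {recall:.3f}, F1: {f1:.3f}")
```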