---
pipeline_tag: text-classification
extra_gated_prompt: >-
  **AI2 ImpACT License – Low Risk Artifacts (LR Agreement)**
  [https://allenai.org/impact-license](https://allenai.org/impact-license)
extra_gated_fields:
  Name: text
  Organization/Entity: text
  Email: text
  State/Country: text
  Intended Use: text
  I AGREE to the terms and conditions of the LR Agreement above: checkbox
  I AGREE to AI2’s use of my information for legal notices and administrative matters: checkbox
  I CERTIFY that the information I have provided is true and accurate: checkbox
---
This field-of-study PyTorch model is a fine-tuned version of allenai/scibert_scivocab_uncased.
The fine-tuning data was harvested using OpenAI models with the following prompt:
def prompt_with_journal(title, abstract, journal_name):
    # Build the chat messages: system instructions, the user request with the
    # allowed labels, and a prefilled assistant turn for the model to complete.
    message = [
        {"role": "system",
         "content": ("You are a highly intelligent and accurate information extraction system. "
                     "You take title, abstract, journal name of a scientific article as input "
                     "and your task is to classify the scientific field of study of the passage.")},
        {"role": "user",
         "content": ("You need to classify it with key: 'field_of_study' "
                     "assign as many 'field_of_study' as you find it fit: "
                     "'Agricultural and Food sciences', "
                     "'Art', "
                     "'Biology', "
                     "'Business', "
                     "'Chemistry', "
                     "'Computer science', "
                     "'Economics', "
                     "'Education', "
                     "'Engineering', "
                     "'Environmental science', "
                     "'Geography', "
                     "'Geology', "
                     "'History', "
                     "'Law', "
                     "'Linguistics', "
                     "'Materials science', "
                     "'Mathematics', "
                     "'Medicine', "
                     "'Philosophy', "
                     "'Physics', "
                     "'Political science', "
                     "'Psychology', "
                     "'Sociology' "
                     "Only select from the above list, or 'Other'.")},
        # The assistant turn is prefilled so the completion continues the list.
        {"role": "assistant",
         "content": ("```python \n"
                     f"title = { title } \n"
                     f"abstract = { abstract }\n"
                     f"journal_name = { journal_name }\n"
                     "{'field_of_study': [")},
    ]
    return message
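
Because the assistant turn is prefilled up to `{'field_of_study': [`, the model's completion is expected to continue that list literal. The following is a minimal sketch of how such a completion could be turned back into labels. It is not the original harvesting code: `parse_fields`, `ALLOWED_FIELDS`, `PREFILL`, and the sample completion string are hypothetical names and data introduced here for illustration only.

```python
import ast

# Label set copied from the prompt above, plus the 'Other' fallback.
ALLOWED_FIELDS = {
    "Agricultural and Food sciences", "Art", "Biology", "Business",
    "Chemistry", "Computer science", "Economics", "Education",
    "Engineering", "Environmental science", "Geography", "Geology",
    "History", "Law", "Linguistics", "Materials science", "Mathematics",
    "Medicine", "Philosophy", "Physics", "Political science",
    "Psychology", "Sociology", "Other",
}

# The prefilled assistant turn in the prompt ends with this prefix.
PREFILL = "{'field_of_study': ["


def parse_fields(completion):
    """Hypothetical helper: rebuild the dict literal and keep known labels."""
    text = PREFILL + completion
    # Cut at the first closing brace so trailing text (for example a closing
    # code fence) does not break parsing.
    end = text.find("}")
    if end == -1:
        return []
    try:
        parsed = ast.literal_eval(text[: end + 1])
    except (ValueError, SyntaxError):
        return []
    labels = parsed.get("field_of_study", [])
    # Drop anything that is not a string from the allowed label set.
    return [
        label for label in labels
        if isinstance(label, str) and label in ALLOWED_FIELDS
    ]


# Made-up completion, for illustration only.
example = "'Biology', 'Medicine', 'Astrology']}\n```"
print(parse_fields(example))  # → ['Biology', 'Medicine']
```

Filtering against the allowed set is one possible design choice: it silently discards labels the prompt did not offer, which keeps the harvested training labels consistent with the classifier's label space.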