rudyvdbrink/Llama-3.2-1B-binary-citation-classifier

Generated from Trainer


rudyvdbrink committed on Jul 17

Commit 3553205 · verified · 1 Parent(s): 93713ad

Update README.md

Files changed (1)

README.md +23 -22


@@ -1,25 +1,23 @@
----
-library_name: peft
-license: llama3.2
-base_model: meta-llama/Llama-3.2-1B
-tags:
-- generated_from_trainer
-metrics:
-- accuracy
-- f1
-- precision
-- recall
-model-index:
-- name: Llama-3.2-1B-binary-citation-classifier
-  results: []
----
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # Llama-3.2-1B-binary-citation-classifier
-This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.5450
 - Accuracy: 0.746
@@ -29,18 +27,21 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:

+---
+library_name: peft
+license: llama3.2
+base_model: meta-llama/Llama-3.2-1B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+model-index:
+- name: Llama-3.2-1B-binary-citation-classifier
+  results: []
+---
 # Llama-3.2-1B-binary-citation-classifier
+This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on a dataset of scientific abstracts and citation counts.
+Its aim is to predict, from an article's abstract, whether the article will be cited within five years.
 It achieves the following results on the evaluation set:
 - Loss: 0.5450
 - Accuracy: 0.746
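
The card's metadata lists accuracy, F1, precision, and recall as metrics. As a reminder of how these relate for a binary task, here is a minimal pure-Python sketch. The function name `binary_metrics` and the toy labels are illustrative assumptions; this is not the evaluation code used for the model, and label 1 is assumed to mean "cited".

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels.

    Label 1 is treated as the positive class (assumed here to mean "cited").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example with made-up labels, only to show the call.
print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

On a balanced evaluation set, accuracy can be compared directly against the 0.5 chance level; on an imbalanced one, precision and recall are more informative.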
 ## Model description
+Llama-3.2-1B architecture, modified with a rank-8 LoRA adapter.
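
To give a sense of what "rank 8" means in practice, the sketch below counts the trainable parameters a LoRA adapter adds to a single weight matrix. LoRA freezes the original weight and learns two small matrices whose product is the update, so the added parameter count is `rank * (d_in + d_out)`. The 2048 x 2048 layer shape is an assumption chosen for illustration; the card does not say which modules the adapter targets.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter on a d_out x d_in weight.

    The adapter learns A (rank x d_in) and B (d_out x rank); the frozen
    weight itself contributes no trainable parameters.
    """
    return rank * d_in + d_out * rank


# Assumed example shape: a square 2048 x 2048 projection adapted at rank 8.
full = 2048 * 2048
added = lora_param_count(2048, 2048, rank=8)
print(f"adapter params: {added}, frozen params: {full}, ratio: {added / full:.4%}")
```

Under this assumed shape the adapter trains well under 1% of the parameters of the layer it modifies, which is the main reason LoRA is practical on modest hardware.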
 ## Intended uses & limitations
+The intended use is binary classification. The training set consists exclusively of PubMed-indexed, neuroscience-related articles.
 ## Training and evaluation data
+[Training and evaluation data](https://huggingface.co/datasets/rudyvdbrink/CitationDatabase)
 ## Training procedure
+Pre-training followed Meta's procedures for the base model.
+LoRA fine-tuning was done with PEFT on 16k abstracts (8k cited, 8k uncited).
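
The 8k/8k split describes a class-balanced training set. One way such a subset could be drawn from a larger labeled pool is sketched below. The record layout (`"abstract"` and `"label"` keys), the function name `balanced_sample`, and the seed are hypothetical and not taken from the linked dataset; how the author actually built the split is not stated.

```python
import random


def balanced_sample(records, per_class, seed=0):
    """Draw `per_class` cited and `per_class` uncited records, then shuffle.

    Each record is assumed to be a dict with a 0/1 "label" key
    (1 = cited, 0 = uncited); this layout is hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    cited = [r for r in records if r["label"] == 1]
    uncited = [r for r in records if r["label"] == 0]
    if len(cited) < per_class or len(uncited) < per_class:
        raise ValueError("not enough records in one of the classes")
    sample = rng.sample(cited, per_class) + rng.sample(uncited, per_class)
    rng.shuffle(sample)  # avoid all positives appearing before all negatives
    return sample


# Toy pool of 30 made-up records: 10 labeled cited, 20 labeled uncited.
pool = [{"abstract": f"abstract {i}", "label": int(i % 3 == 0)} for i in range(30)]
subset = balanced_sample(pool, per_class=5)
print(len(subset), sum(r["label"] for r in subset))
```

With the card's numbers, `per_class` would be 8000, giving the 16k abstracts mentioned above.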
 ### Training hyperparameters
 The following hyperparameters were used during training: