sknow-lab/Qwen2.5-14B-CIC-ACLARC

@@ -16,7 +16,7 @@ tags:
 pipeline_tag: zero-shot-classification
 ---
-# Qwen2.5-14-CIC-ACLARC
 A fine-tuned model for Citation Intent Classification, based on [Qwen 2.5 14B Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) and trained on the [ACL-ARC](https://huggingface.co/datasets/kejian/ACL-ARC) dataset.
@@ -28,14 +28,91 @@ A fine-tuned model for Citation Intent Classification, based on [Qwen 2.5 14B In
 | Background | The cited paper provides relevant Background information or is part of the body of literature.|
 | Motivation | The citing paper is directly motivated by the cited paper. |
 | Uses | The citing paper uses the methodology or tools created by the cited paper.|
-| Extension | The citing paper extends the methods, tools or data, etc. of the cited paper. |
 | Comparison or Contrast | The citing paper expresses similarities or differences to, or disagrees with, the cited paper. |
 | Future | The cited paper may be a potential avenue for future work. |
 ## Quickstart
 ```python
-# TODO
 ```
 Details about the system prompts and query templates can be found in the paper.

 pipeline_tag: zero-shot-classification
 ---
+# Qwen2.5-14B-CIC-ACLARC
 A fine-tuned model for Citation Intent Classification, based on [Qwen 2.5 14B Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) and trained on the [ACL-ARC](https://huggingface.co/datasets/kejian/ACL-ARC) dataset.
 | Background | The cited paper provides relevant Background information or is part of the body of literature.|
 | Motivation | The citing paper is directly motivated by the cited paper. |
 | Uses | The citing paper uses the methodology or tools created by the cited paper.|
+| Extends | The citing paper extends the methods, tools or data, etc. of the cited paper. |
 | Comparison or Contrast | The citing paper expresses similarities or differences to, or disagrees with, the cited paper. |
 | Future | The cited paper may be a potential avenue for future work. |
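The class names in the table correspond to the upper-case label strings listed in the Quickstart system prompt. A minimal sketch of that correspondence follows; `LABEL_TO_CLASS` and `describe_label` are hypothetical names introduced here for illustration and are not shipped with the model.

```python
# Hypothetical lookup table: label strings from the Quickstart system prompt
# mapped to the class names used in the table above.
LABEL_TO_CLASS = {
    "BACKGROUND": "Background",
    "MOTIVATION": "Motivation",
    "USES": "Uses",
    "EXTENDS": "Extends",
    "COMPARES_CONTRASTS": "Comparison or Contrast",
    "FUTURE": "Future",
}


def describe_label(label: str) -> str:
    """Return the table's class name for a label, ignoring case and surrounding whitespace."""
    key = label.strip().upper()
    if key not in LABEL_TO_CLASS:
        raise ValueError(f"Unknown citation intent label: {label!r}")
    return LABEL_TO_CLASS[key]
```

Raising on unknown labels is one possible design choice; it surfaces unexpected output early instead of silently mapping it to a default class.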
 ## Quickstart
 ```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "sknow-lab/Qwen2.5-14B-CIC-ACLARC"
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+system_prompt = """
+# CONTEXT #
+You are an expert researcher tasked with classifying the intent of a citation in a scientific publication.
+########
+# OBJECTIVE #
+You will be given a sentence containing a citation, you must output the appropriate class as an answer.
+########
+# CLASS DEFINITIONS #
+The six (6) possible classes are the following: "BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE".
+The definitions of the classes are:
+1 - BACKGROUND: The cited paper provides relevant Background information or is part of the body of literature.
+2 - MOTIVATION: The citing paper is directly motivated by the cited paper.
+3 - USES: The citing paper uses the methodology or tools created by the cited paper.
+4 - EXTENDS: The citing paper extends the methods, tools or data, etc. of the cited paper.
+5 - COMPARES_CONTRASTS: The citing paper expresses similarities or differences to, or disagrees with, the cited paper.
+6 - FUTURE: The cited paper may be a potential avenue for future work.
+########
+# RESPONSE RULES #
+- Analyze only the citation marked with the @@CITATION@@ tag.
+- Assign exactly one class to each citation.
+- Respond only with the exact name of one of the following classes: "BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE".
+- Do not provide any explanation or elaboration.
+"""
+test_citing_sentence = "However , the method we are currently using in the ATIS domain ( @@CITATION@@ ) represents our most promising approach to this problem."
+user_prompt = f"""
+{test_citing_sentence}
+### Question: Which is the most likely intent for this citation?
+a) BACKGROUND
+b) MOTIVATION
+c) USES
+d) EXTENDS
+e) COMPARES_CONTRASTS
+f) FUTURE
+### Answer:
+"""
+messages = [
+    {"role": "system", "content": system_prompt},
+    {"role": "user", "content": user_prompt}
+]
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+generated_ids = model.generate(
+    **model_inputs,
+    max_new_tokens=512
+)
+generated_ids = [
+    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)  # Response: USES
 ```
 Details about the system prompts and query templates can be found in the paper.
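
To classify more than one sentence, the query template from the Quickstart can be wrapped in a small helper and the generated text checked against the six labels. The sketch below is an assumption about how one might do this, not the authors' tooling: `build_user_prompt` and `parse_label` are hypothetical names, and the strict exact-match check in `parse_label` is an illustrative choice. The template text itself is copied from the Quickstart.

```python
from typing import Optional

# Label strings, in the order they appear in the Quickstart query template.
LABELS = ["BACKGROUND", "MOTIVATION", "USES", "EXTENDS", "COMPARES_CONTRASTS", "FUTURE"]


def build_user_prompt(citing_sentence: str) -> str:
    """Fill the Quickstart query template for one citing sentence (hypothetical helper)."""
    if "@@CITATION@@" not in citing_sentence:
        # The system prompt tells the model to analyze only the tagged citation.
        raise ValueError("citing_sentence must contain the @@CITATION@@ tag")
    # Produces "a) BACKGROUND", "b) MOTIVATION", ... one option per line.
    options = "\n".join(f"{chr(ord('a') + i)}) {label}" for i, label in enumerate(LABELS))
    return (
        f"\n{citing_sentence}\n"
        "### Question: Which is the most likely intent for this citation?\n"
        f"{options}\n"
        "### Answer:\n"
    )


def parse_label(response: str) -> Optional[str]:
    """Return the label if the response is exactly one of the six classes, else None."""
    text = response.strip().upper()
    return text if text in LABELS else None
```

The returned prompt can be passed as the `user` message in place of `user_prompt`, and `parse_label(response)` flags any output that is not one of the expected class names.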