nielsr (HF staff) committed on
Commit e876fa0 · verified · 1 Parent(s): b9f566f

Improve model card

This PR improves the model card by adding a more detailed model description and information about the SemViQA system from the GitHub README.

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -2,19 +2,27 @@
  language:
  - vi
  library_name: transformers
  tags:
  - SemViQA
  - three-class-classification
  - fact-checking
- pipeline_tag: text-classification
- license: mit
  ---

  # SemViQA-TC: Vietnamese Three-class Classification for Claim Verification

  ## Model Description

- **SemViQA-TC** is one of the key components of the **SemViQA** system, designed for **three-class classification** in Vietnamese fact-checking. This model classifies a given claim into one of three categories: **SUPPORTED**, **REFUTED**, or **NOT ENOUGH INFORMATION (NEI)** based on retrieved evidence.

  ### **Model Information**
  - **Developed by:** [SemViQA Research Team](https://huggingface.co/SemViQA)
@@ -25,6 +33,15 @@ license: mit

  SemViQA-TC serves as the **first step in the two-step classification process** of the SemViQA system. It initially categorizes claims into three classes: **SUPPORTED, REFUTED, or NEI**. For claims classified as **SUPPORTED** or **REFUTED**, a secondary **binary classification model (SemViQA-BC)** further refines the prediction. This hierarchical classification strategy enhances the accuracy of fact verification.

  ## Usage Example

  Direct Model Usage
 
@@ -2,19 +2,27 @@
  language:
  - vi
  library_name: transformers
+ license: mit
+ pipeline_tag: text-classification
  tags:
  - SemViQA
  - three-class-classification
  - fact-checking
  ---

  # SemViQA-TC: Vietnamese Three-class Classification for Claim Verification

  ## Model Description

+ The rise of misinformation, exacerbated by Large Language Models (LLMs) like GPT and Gemini, demands robust fact-checking solutions, especially for low-resource languages like Vietnamese. Existing methods struggle with semantic ambiguity, homonyms, and complex linguistic structures, often trading accuracy for efficiency. We introduce SemViQA, a novel Vietnamese fact-checking framework integrating Semantic-based Evidence Retrieval (SER) and Two-step Verdict Classification (TVC). Our approach balances precision and speed, achieving state-of-the-art results with 78.97\% strict accuracy on ISE-DSC01 and 80.82\% on ViWikiFC, securing 1st place in the UIT Data Science Challenge. Additionally, SemViQA Faster improves inference speed 7x while maintaining competitive accuracy. SemViQA sets a new benchmark for Vietnamese fact verification, advancing the fight against misinformation.
+
+ **SemViQA-TC** is one of the key components of the **SemViQA** system, designed for **three-class classification** in Vietnamese fact-checking. This model classifies a given claim into one of three categories: **SUPPORTED**, **REFUTED**, or **NOT ENOUGH INFORMATION (NEI)** based on retrieved evidence. To address these challenges, SemViQA integrates:
+
+ - **Semantic-based Evidence Retrieval (SER)**: Combines **TF-IDF** with a **Question Answering Token Classifier (QATC)** to enhance retrieval precision while reducing inference time.
+ - **Two-step Verdict Classification (TVC)**: Uses hierarchical classification optimized with **Cross-Entropy and Focal Loss**, improving claim verification across three categories:
+   - **Supported** ✅
+   - **Refuted** ❌
+   - **Not Enough Information (NEI)** 🤷‍♂️
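As a reading aid for the TVC bullet above (not part of the committed README), here is a minimal sketch of the two loss functions it names, written for a single example. It uses the standard focal loss definition, `-(1 - p_t)^gamma * log(p_t)`. The `gamma=2.0` default and the label order are assumptions for illustration; the card does not say how SemViQA weights or combines the two losses.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Cross-entropy for one example: -log of the probability of the true class."""
    return -math.log(probs[target])


def focal_loss(probs: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one example.

    Scales cross-entropy by (1 - p_t)^gamma, so confident, correct
    predictions contribute less. gamma=2.0 is an illustrative default,
    not a value taken from the SemViQA card.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical label order: 0 = SUPPORTED, 1 = REFUTED, 2 = NEI.
probs = [0.7, 0.2, 0.1]
ce = cross_entropy(probs, target=0)
fl = focal_loss(probs, target=0)
```

With `gamma=0` the focal loss reduces to plain cross-entropy, which is a quick way to sanity-check an implementation.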

  ### **Model Information**
  - **Developed by:** [SemViQA Research Team](https://huggingface.co/SemViQA)
@@ -25,6 +33,15 @@ license: mit

  SemViQA-TC serves as the **first step in the two-step classification process** of the SemViQA system. It initially categorizes claims into three classes: **SUPPORTED, REFUTED, or NEI**. For claims classified as **SUPPORTED** or **REFUTED**, a secondary **binary classification model (SemViQA-BC)** further refines the prediction. This hierarchical classification strategy enhances the accuracy of fact verification.
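
As a reading aid (not part of the committed README), the two-step flow described in the paragraph above can be sketched as a small cascade. The `three_class` and `binary` callables are hypothetical stand-ins for SemViQA-TC and SemViQA-BC. The card does not specify how the binary model "refines" the first prediction, so the sketch assumes the binary verdict simply replaces the first-step label.

```python
from typing import Callable

# Hypothetical interface: a classifier takes (claim, evidence) and returns a label.
Classifier = Callable[[str, str], str]


def two_step_verdict(
    claim: str,
    evidence: str,
    three_class: Classifier,
    binary: Classifier,
) -> str:
    """Return SUPPORTED, REFUTED, or NEI using a two-step cascade."""
    first = three_class(claim, evidence)
    if first == "NEI":
        # NEI is final after the first step; the binary model is not consulted.
        return first
    # Assumption: the binary SUPPORTED/REFUTED verdict overrides the first-step label.
    return binary(claim, evidence)
```

Passing the classifiers in as callables keeps the control flow separate from model loading, so the cascade can be exercised with stubs before wiring in real models.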

+ ### **🏆 Achievements**
+ - **1st place** in the **UIT Data Science Challenge** 🏅
+ - **State-of-the-art** performance on:
+   - **ISE-DSC01** → **78.97% strict accuracy**
+   - **ViWikiFC** → **80.82% strict accuracy**
+ - **SemViQA Faster**: **7x speed improvement** over the standard model 🚀
+
+ These results establish **SemViQA** as a **benchmark for Vietnamese fact verification**, advancing efforts to combat misinformation and ensure **information integrity**.
+
  ## Usage Example

  Direct Model Usage