SemViQA/tc-infoxlm-viwikifc

@@ -2,19 +2,27 @@
 language:
 - vi
 library_name: transformers
 tags:
 - SemViQA
 - three-class-classification
 - fact-checking
-pipeline_tag: text-classification
-license: mit
 ---
 # SemViQA-TC: Vietnamese Three-class Classification for Claim Verification
 ## Model Description
-**SemViQA-TC** is one of the key components of the **SemViQA** system, designed for **three-class classification** in Vietnamese fact-checking. This model classifies a given claim into one of three categories: **SUPPORTED**, **REFUTED**, or **NOT ENOUGH INFORMATION (NEI)** based on retrieved evidence.
 ### **Model Information**
 - **Developed by:** [SemViQA Research Team](https://huggingface.co/SemViQA)
@@ -25,6 +33,15 @@ license: mit
 SemViQA-TC serves as the **first step in the two-step classification process** of the SemViQA system. It initially categorizes claims into three classes: **SUPPORTED, REFUTED, or NEI**. For claims classified as **SUPPORTED** or **REFUTED**, a secondary **binary classification model (SemViQA-BC)** further refines the prediction. This hierarchical classification strategy enhances the accuracy of fact verification.
 ## Usage Example
 Direct Model Usage

 language:
 - vi
 library_name: transformers
+license: mit
+pipeline_tag: text-classification
 tags:
 - SemViQA
 - three-class-classification
 - fact-checking
 ---
 # SemViQA-TC: Vietnamese Three-class Classification for Claim Verification
 ## Model Description
+The rise of misinformation, exacerbated by Large Language Models (LLMs) like GPT and Gemini, demands robust fact-checking solutions, especially for low-resource languages like Vietnamese. Existing methods struggle with semantic ambiguity, homonyms, and complex linguistic structures, often trading accuracy for efficiency. We introduce SemViQA, a novel Vietnamese fact-checking framework integrating Semantic-based Evidence Retrieval (SER) and Two-step Verdict Classification (TVC). Our approach balances precision and speed, achieving state-of-the-art results with 78.97% strict accuracy on ISE-DSC01 and 80.82% on ViWikiFC, securing 1st place in the UIT Data Science Challenge. Additionally, SemViQA Faster improves inference speed 7x while maintaining competitive accuracy. SemViQA sets a new benchmark for Vietnamese fact verification, advancing the fight against misinformation.
+**SemViQA-TC** is one of the key components of the **SemViQA** system, designed for **three-class classification** in Vietnamese fact-checking. Given a claim and its retrieved evidence, the model assigns one of three labels: **SUPPORTED**, **REFUTED**, or **NOT ENOUGH INFORMATION (NEI)**. To address the challenges described above, the full SemViQA system integrates:
+- **Semantic-based Evidence Retrieval (SER)**: Combines **TF-IDF** with a **Question Answering Token Classifier (QATC)** to enhance retrieval precision while reducing inference time.
+- **Two-step Verdict Classification (TVC)**: Uses hierarchical classification optimized with **Cross-Entropy and Focal Loss**, improving claim verification across three categories:
+  - **Supported** ✅
+  - **Refuted** ❌
+  - **Not Enough Information (NEI)** 🤷‍♂️
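
The card names Cross-Entropy and Focal Loss but does not say how they are weighted or combined during training. As a rough illustration only, the sketch below computes both losses for a single three-class example in plain Python. The focusing parameter `gamma=2.0` and the helper names are assumptions for this example, not values taken from SemViQA.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Standard cross-entropy for one example: -log p_t."""
    return -math.log(softmax(logits)[target])

def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log p_t.

    gamma=2.0 is an assumed default; gamma=0 recovers cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# Hypothetical logits for (SUPPORTED, REFUTED, NEI); the true class is index 0.
easy = [4.0, 0.5, 0.2]   # confident, correct prediction
hard = [0.3, 0.9, 0.4]   # uncertain prediction
for name, logits in (("easy", easy), ("hard", hard)):
    print(name, cross_entropy(logits, 0), focal_loss(logits, 0))
```

The focal term `(1 - p_t)^gamma` shrinks the loss on examples the model already classifies confidently, so hard examples contribute relatively more to the total.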
 ### **Model Information**
 - **Developed by:** [SemViQA Research Team](https://huggingface.co/SemViQA)
 SemViQA-TC serves as the **first step in the two-step classification process** of the SemViQA system. It initially categorizes claims into three classes: **SUPPORTED, REFUTED, or NEI**. For claims classified as **SUPPORTED** or **REFUTED**, a secondary **binary classification model (SemViQA-BC)** further refines the prediction. This hierarchical classification strategy enhances the accuracy of fact verification.
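
The two-step flow described above can be sketched as a small decision function. This is a minimal illustration, not the SemViQA implementation: `tc_predict` and `bc_predict` are hypothetical callables standing in for SemViQA-TC and SemViQA-BC, the label orders are assumed, and the rule of simply taking the binary model's verdict for non-NEI claims is a simplification.

```python
from typing import Callable, Sequence

# Assumed label orders; check the model config for the real id-to-label mapping.
TC_LABELS = ("SUPPORTED", "REFUTED", "NEI")
BC_LABELS = ("SUPPORTED", "REFUTED")

Predictor = Callable[[str, str], Sequence[float]]

def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])

def two_step_verdict(claim: str, evidence: str,
                     tc_predict: Predictor, bc_predict: Predictor) -> str:
    """Step 1: three-class model. Step 2: binary model, only if not NEI."""
    tc_label = TC_LABELS[argmax(tc_predict(claim, evidence))]
    if tc_label == "NEI":
        return "NEI"  # the binary model is never consulted for NEI claims
    # Simplification: the binary model's verdict replaces the first-step label.
    return BC_LABELS[argmax(bc_predict(claim, evidence))]

# Stub predictors returning fixed probabilities, for illustration only.
verdict = two_step_verdict(
    "claim text", "evidence text",
    tc_predict=lambda c, e: [0.55, 0.35, 0.10],
    bc_predict=lambda c, e: [0.20, 0.80],
)
print(verdict)  # REFUTED
```

In practice the two callables would wrap tokenization and a forward pass of each model; they are kept abstract here so the control flow stays visible.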
+### **🏆 Achievements**
+- **1st place** in the **UIT Data Science Challenge** 🏅
+- **State-of-the-art** performance on:
+  - **ISE-DSC01** → **78.97% strict accuracy**
+  - **ViWikiFC** → **80.82% strict accuracy**
+- **SemViQA Faster**: **7x speed improvement** over the standard model 🚀
+These results establish **SemViQA** as a **benchmark for Vietnamese fact verification**, advancing efforts to combat misinformation and ensure **information integrity**.
 ## Usage Example
 Direct Model Usage