utc-large / README.md
LemonNoel's picture
Update README.md
c934042
metadata
library_name: paddlenlp
license: apache-2.0
language:
  - zh
tags:
  - zero-shot-classification

paddlenlp-banner

PaddlePaddle/utc-large

Text classification technology is widely used in various industries such as dialogue intention recognition, bill archiving, and event detection. However, there are many challenges in industrial-level text classification practices, including diverse tasks, limited data availability and label transfer difficulty. To address these issues, UTC models text classification as a matching task between labels and text, based on the idea of Unified Semantic Matching (USM). Thus, it can handle multiple classification tasks with a single model, reducing development and machine costs and achieving good zero/few-shot transfer performance.

USM Paper: https://arxiv.org/abs/2301.03282

PaddleNLP released UTC model for various text classification tasks which use ERNIE models as the pre-trained language models and were finetuned on a large amount of text classification data.

UTC-diagram

Available Models

Model Name Usage Scenarios Supporting Tasks
utc-large A text classification model supports Chinese Supports intention recognition, semantic matching, natural language inference, semantic analysis, etc.

Performance on Text Dataset

UTC won the 1st place on both ZeroCLUE and FewCLUE benchmarks.

UTC-benchmarks

Detailed Info: https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/zero_shot_text_classification