metadata

language: en
tags:
  - augmentation
license: apache-2.0
datasets:
  - C4
widget:
  - text: >-
      <mask> Conference on Empirical Methods <mask> submission of research
      papers <mask> Deep Learning <mask>
    example_title: Example 1
  - text: >-
      <mask> machine learning <mask> my research interest <mask> data science
      <mask>
    example_title: Example 2
  - text: >-
      <mask> play basketball <mask> a strong team <mask> Shanghai University of
      Finance and Economics <mask> last Sunday <mask>
    example_title: Example 3
  - text: >-
      Good news: <mask> the European Union <mask> month by EU <mask> Farm
      Commissioner Franz <mask>
    example_title: Example with a prompt 1
  - text: >-
      Bad news: <mask> the European Union <mask> month by EU <mask> Farm
      Commissioner Franz <mask>
    example_title: Example with a prompt 2
inference:
  parameters:
    max_length: 200
    num_beams: 3
    do_sample: true

SEGA-large model

SEGA: SkEtch-based Generative Augmentation

SEGA is a general-purpose text augmentation model for NLP tasks such as sentiment analysis, topic classification, NER, and QA. It uses an encoder-decoder structure based on the BART architecture and is pre-trained on the C4-realnewslike corpus.

Paper: this paper
GitHub: this repository
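
The widget examples in the metadata suggest that SEGA takes a "sketch" as input: key phrases separated by `<mask>` tokens, optionally preceded by a short prompt such as "Good news:". The snippet below is a minimal sketch of that workflow, not the authors' reference code. The helper names `build_sketch` and `generate_from_sketch` are hypothetical, the model ID `beyond/genius-large` is an assumption taken from this page's repository path, and the generation settings simply mirror the `inference` parameters listed in the metadata.

```python
from typing import List

MASK = "<mask>"


def build_sketch(key_phrases: List[str], prompt: str = "") -> str:
    """Join key phrases with <mask> tokens, mirroring the widget examples.

    Hypothetical helper: the sketch starts and ends with a mask so the model
    can generate text before, between, and after the key phrases.
    """
    parts = [MASK]
    for phrase in key_phrases:
        parts.append(phrase.strip())
        parts.append(MASK)
    sketch = " ".join(parts)
    # An optional prompt (e.g. "Good news:") is placed in front of the sketch.
    return f"{prompt.strip()} {sketch}" if prompt else sketch


def generate_from_sketch(sketch: str, model_id: str = "beyond/genius-large") -> str:
    """Fill in a sketch with a seq2seq model (downloads weights on first call).

    Assumption: the checkpoint loads through the generic Auto classes;
    the model ID is a guess and may need to be changed.
    """
    # Imported lazily so build_sketch can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(sketch, return_tensors="pt")
    # Same settings as the `inference.parameters` block in the metadata.
    output_ids = model.generate(**inputs, max_length=200, num_beams=3, do_sample=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    sketch = build_sketch(["machine learning", "my research interest", "data science"])
    print(sketch)
    # Uncomment to run generation (requires transformers, torch, and network access):
    # print(generate_from_sketch(sketch))
```

Because `do_sample` is enabled, repeated calls on the same sketch should yield different outputs, which is what makes the model usable for augmentation.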

Model description

Model variations

Model	#params	Language
`sega-large`	xM	English
`sega-base`	xM	English
`sega-small`	xM	English
`sega-large-chinese`	xM	Chinese
`sega-base-chinese`	xM	Chinese
`sega-small-chinese`	xM	Chinese
