
YuYan-Dialogue

YuYan is a series of Chinese language models of different sizes, developed by Fuxi AI Lab, NetEase, Inc. They are trained on a large, high-quality dataset of Chinese novels.

YuYan belongs to the same family of decoder-only models as GPT-2 and GPT-3. As such, it was pretrained with the self-supervised causal language modeling objective.
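
For reference, the causal language modeling objective in its standard form minimizes the negative log-likelihood of each token given the tokens before it. The notation (a token sequence $x_1, \dots, x_T$ and model parameters $\theta$) is introduced here for illustration and is not taken from the YuYan report:

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \dots, x_{t-1}\right)
```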

YuYan-Dialogue is a dialogue model obtained by fine-tuning YuYan-11b on a large, high-quality multi-turn dialogue dataset. It has very strong conversation generation capabilities.

Model Inference Acceleration

As the model size increases, the model inference time increases and more computational resources are required.

Therefore, we developed our own transformer inference acceleration framework, EET. More details are in the paper "Easy and Efficient Transformer: Scalable Inference Solution For Large NLP Model".

We combine our language model with the EET inference framework to provide industrial-grade inference performance.

How to use

Our model is trained with fairseq. As a result, inference and fine-tuning depend on it.

For inference, we modify some parts of the original fairseq code, mainly:

fairseq-0.12.2/fairseq/sequence_generator.py

We integrate EET with sequence_generator. We replace the eos token with a token that is unlikely to be sampled, which guarantees the generated text length. The repetition penalty is also modified; you can change its strength by adjusting the value of self.ban_weight.
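
The exact modification to the repetition penalty is not spelled out here. As a rough illustration only, the sketch below shows the widely used CTRL-style penalty on a plain list of logits, together with one simple way to keep a token such as eos from being sampled (masking its logit, which is a different technique from the token replacement described above). The function names and the toy logits are hypothetical; `ban_weight` is borrowed from the attribute name above, and its real semantics in the modified sequence_generator.py may differ.

```python
import math
from typing import List, Sequence


def suppress_token(logits: List[float], token_id: int) -> List[float]:
    """Return a copy of `logits` in which `token_id` can never be sampled."""
    masked = list(logits)
    masked[token_id] = -math.inf
    return masked


def apply_repetition_penalty(
    logits: List[float], generated_ids: Sequence[int], ban_weight: float
) -> List[float]:
    """Make already-generated tokens less likely (CTRL-style penalty).

    Positive logits are divided by `ban_weight` and negative ones are
    multiplied by it, so a weight above 1.0 lowers every repeated token.
    """
    penalized = list(logits)
    for token_id in set(generated_ids):
        score = penalized[token_id]
        penalized[token_id] = score / ban_weight if score > 0 else score * ban_weight
    return penalized


# Toy vocabulary of four tokens; index 3 plays the role of eos (an assumption).
logits = [2.0, -1.0, 0.5, 3.0]
logits = suppress_token(logits, token_id=3)
logits = apply_repetition_penalty(logits, generated_ids=[0, 1], ban_weight=2.0)
print(logits)  # [1.0, -2.0, 0.5, -inf]
```

A larger `ban_weight` in this sketch pushes repeated tokens further down; a value of 1.0 leaves the logits unchanged.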

Then, to keep the eos token in the final generated text, we change include_eos=False to include_eos=True on line 75 of

fairseq-0.12.2/fairseq/data/dictionary.py

Finally, so that parameters can be passed in from Python scripts, we remove lines 67 to 69 of

fairseq-0.12.2/fairseq/dataclass/utils.py

The installation steps are as follows.

# install pytorch
pip install torch==1.8.1

# install fairseq
unzip fairseq-0.12.2.zip
cd fairseq-0.12.2
pip install .
cd ..  # return to the working directory

# install EET
git clone https://github.com/NetEase-FuXi/EET.git
cd EET
pip install .
cd ..  # return to the working directory

# install transformers (EET requirements)
pip install transformers==4.23

# make a folder, move the dictionary file and model file into it.
mkdir transformer_lm_gpt2_xxl_dialogue
mv dict.txt transformer_lm_gpt2_xxl_dialogue/
mv checkpoint_best_part_*.pt transformer_lm_gpt2_xxl_dialogue/
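
Before running inference, it can help to confirm that the folder has the expected layout. The helper below is a small sketch, not part of the released code; the function name `missing_model_files` is hypothetical, and it only checks for the file names used in the commands above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str) -> List[str]:
    """List what is still missing from the model folder (empty if complete)."""
    folder = Path(model_dir)
    missing = []
    if not (folder / "dict.txt").is_file():
        missing.append("dict.txt")
    # At least one checkpoint shard is expected; the shard count is not assumed.
    if not list(folder.glob("checkpoint_best_part_*.pt")):
        missing.append("checkpoint_best_part_*.pt")
    return missing


if __name__ == "__main__":
    problems = missing_model_files("transformer_lm_gpt2_xxl_dialogue")
    print("Folder is complete." if not problems else f"Missing: {problems}")
```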

inference.py is a script that provides an interface for initializing the EET object and sequence_generator. It includes some pre-processing and post-processing functions for text input and output. You can modify the script according to your needs.

In addition, it provides a simple object to manage dialogue generation and dialogue history.

Once the environment is ready, a few lines of code are enough to run inference.


from inference import Inference, Dialogue
model_path = "transformer_lm_gpt2_xxl_dialogue/checkpoint_best.pt"
data_path = "transformer_lm_gpt2_xxl_dialogue"
eet_batch_size = 10  # max inference batch size; adjust to the available CUDA memory (40 GB is required)
inference = Inference(model_path, data_path, eet_batch_size)
dialogue_model = Dialogue(inference)
dialogue_model.get_repsonse("你好啊")
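
To illustrate what organizing dialogue history can look like, here is a minimal, self-contained sketch. It is not the Dialogue class from inference.py: the class `DialogueHistory`, its parameters (`max_turns`, `sep`), and the stub `echo_model` are all hypothetical, and the real prompt format used by YuYan-Dialogue may differ.

```python
from typing import Callable, List


class DialogueHistory:
    """Hypothetical stand-in for a dialogue wrapper; not the class in inference.py."""

    def __init__(self, generate: Callable[[str], str], max_turns: int = 6, sep: str = "\n"):
        self.generate = generate    # any function mapping a prompt to a reply
        self.max_turns = max_turns  # number of most recent utterances kept as context
        self.sep = sep              # separator between utterances (an assumption)
        self.turns: List[str] = []

    def respond(self, user_text: str) -> str:
        """Append the user utterance, build the prompt, and store the reply."""
        self.turns.append(user_text)
        prompt = self.sep.join(self.turns[-self.max_turns:])
        reply = self.generate(prompt)
        self.turns.append(reply)
        return reply

    def reset(self) -> None:
        """Forget the conversation so far."""
        self.turns.clear()


# Stub generator so the sketch runs without the model:
# it only reports how many utterances were in the prompt.
def echo_model(prompt: str) -> str:
    return f"seen {len(prompt.splitlines())} utterance(s)"


chat = DialogueHistory(echo_model, max_turns=3)
print(chat.respond("你好啊"))          # seen 1 utterance(s)
print(chat.respond("今天天气怎么样"))  # seen 3 utterance(s)
```

Truncating to the most recent utterances keeps the prompt within a fixed budget; with the real model, `generate` would be replaced by a call into the inference object.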

Citation

If you find the technical report or this resource useful, please cite the following technical report in your paper.

@inproceedings{li-etal-2022-easy,
    title = "Easy and Efficient Transformer: Scalable Inference Solution For Large {NLP} Model",
    author = "Li, Gongzheng  and
      Xi, Yadong  and
      Ding, Jingzhen  and
      Wang, Duan  and
      Luo, Ziyang  and
      Zhang, Rongsheng  and
      Liu, Bai  and
      Fan, Changjie  and
      Mao, Xiaoxi  and
      Zhao, Zeng",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track",
    month = jul,
    year = "2022",
    address = "Hybrid: Seattle, Washington + Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.naacl-industry.8",
    doi = "10.18653/v1/2022.naacl-industry.8",
    pages = "62--68"
}

Contact Us

You can also contact us by email:

xiyadong@corp.netease.com, dingjingzhen@corp.netease
