# WudaoSailing

WudaoSailing is a package for pretraining Chinese language models and finetuning them on downstream tasks. It currently supports the GLM, BERT, T5, CogView, and RoBERTa models.

## Get Started
### Docker Image
We provide two Docker images, based on CUDA 10.2 and CUDA 11.2. You can build an image from the Dockerfile [docs/docker/cuda102.dockerfile](docs/docker/cuda102.dockerfile), or pull a pre-built image from Docker Hub and run it with Docker v19.03+:
```shell
nvidia-docker run -id --hostname=V100 --network=host \
    --ipc=host --shm-size=16gb --name=deepspeed-cuda \
    -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 \
    -v /DATA/disk1/docker/containers/:/data deepspeed/cuda102:latest
```
For CUDA 11.2, replace `cuda102` with `cuda112`.

To build the image yourself instead:

```shell
docker build -f docs/docker/cuda102.dockerfile -t deepspeed/cuda102 .
```

### Clone this repo
```shell
git clone https://github.com/wangguojim/WudaoSailing.git
cd WudaoSailing
pip install -r requirements.txt
```

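After installing, you can sanity-check the environment before launching anything heavy. This is a generic PyTorch snippet, not part of the repo, and assumes PyTorch is pulled in by `requirements.txt`:

```shell
# Confirm that PyTorch imports and that CUDA devices are visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"
```
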
## GLM

We show some examples based on the GLM model.

### Finetune
We provide scripts for finetuning GLM on several downstream tasks.

#### SuperGLUE

- Download the [SuperGLUE](https://super.gluebenchmark.com/tasks) data and check the experiment setup in
[examples/glm/scripts/ds_finetune_superglue.sh](examples/glm/scripts/ds_finetune_superglue.sh). Note that `DATA_ROOT`, `CHECKPOINT_PATH`, and `SAVE_PATH`
need to be changed to your local paths. You may also change `batch-size` and `nproc_per_node` according to your
available hardware; see the sketch just below.

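For example, the variables at the top of the script might be edited as follows (a minimal sketch; every path, name, and size here is a placeholder, not the repo's default):

```shell
# Hypothetical local paths -- point these at your own data and checkpoints.
DATA_ROOT=/data/datasets/superglue
CHECKPOINT_PATH=/data/checkpoints/blocklm-large-chinese
SAVE_PATH=/data/finetune_checkpoints

# Match your hardware: one process per GPU, and a batch size that fits memory.
NPROC_PER_NODE=4   # illustrative value for nproc_per_node
BATCH_SIZE=16      # illustrative value for batch-size
```
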
- Run the following script for the text-similarity finetuning task (using the AFQMC dataset as an example):

```shell
cd examples/glm/
bash scripts/ds_finetune_superglue.sh \
    config/model_blocklm_large_chinese.sh \
    config_tasks/task_afqmc.sh
```

- Run the following script for the text classification finetuning task (using the TNews dataset as an example):

```shell
cd examples/glm/
bash scripts/ds_finetune_superglue.sh \
    config/model_blocklm_large_chinese.sh \
    config_tasks/task_tnews.sh
```

- Run the following script for the causal reasoning finetuning task (using the COPA dataset as an example):

```shell
cd examples/glm/
bash scripts/ds_finetune_superglue.sh \
    config/model_blocklm_large_chinese.sh \
    config_tasks/task_copa.sh
```

- To apply GLM to a new NLU dataset with cloze-filling finetuning, implement a `DataProcessor` in
[examples/glm/tasks/superglue/dataset.py](examples/glm/tasks/superglue/dataset.py) for data loading and add a `PVP` in
[examples/glm/tasks/superglue/pvp.py](examples/glm/tasks/superglue/pvp.py) for the cloze question. More details can be found
[here](examples/glm/tasks/superglue/README.md).

#### Blank Filling (Interactive)
- Change `CHECKPOINT_PATH` to your local path, then run the following script. GLM uses three kinds of mask tokens, shown in the examples below: `[MASK]` for short blanks, `[sMASK]` for sentence-level blanks, and `[gMASK]` for left-to-right long-text generation.
```shell
bash config/generate_block.sh \
    config/model_blocklm_large_chinese.sh
```
##### Example 1 (Entity Prediction)

Context: 凯旋门位于意大利米兰市古城堡旁。1807年为纪念[MASK]而建,门高25米,顶上矗立两武士青铜古兵车铸像。

(Translation: The triumphal arch stands beside the ancient castle in Milan, Italy. Built in 1807 to commemorate [MASK], it is 25 meters high and topped with bronze statues of two warriors and an ancient war chariot.)

GLM: 拿破仑军队攻克米兰城 (Napoleon's army conquering the city of Milan)

##### Example 2 (Sentence Prediction)
Context: 工业互联网(Industrial Internet)是新一代信息通信技术与工业经济深度融合的新型基础设施、应用模式和工业生态,通过对人、机、物、系统等的全面连接,构建起覆盖全产业链、全价值链的全新制造和服务体系,为工业乃至产业数字化、网络化、智能化发展提供了实现途径,是第四次工业革命的重要基石。[sMASK]它以网络为基础、平台为中枢、数据为要素、安全为保障,既是工业数字化、网络化、智能化转型的基础设施,也是互联网、大数据、人工智能与实体经济深度融合的应用模式,同时也是一种新业态、新产业,将重塑企业形态、供应链和产业链。当前,工业互联网融合应用向国民经济重点行业广泛拓展,形成平台化设计、智能化制造、网络化协同、个性化定制、服务化延伸、数字化管理六大新模式,赋能、赋智、赋值作用不断显现,有力的促进了实体经济提质、增效、降本、绿色、安全发展。

(Translation: The Industrial Internet is a new type of infrastructure, application model, and industrial ecosystem arising from the deep integration of new-generation information and communication technology with the industrial economy. By comprehensively connecting people, machines, things, and systems, it builds a new manufacturing and service system covering the full industrial and value chains, provides a path to the digital, networked, and intelligent development of industry, and is an important cornerstone of the fourth industrial revolution. [sMASK] With networks as its foundation, platforms as its hub, data as its key element, and security as its safeguard, it is both the infrastructure for industry's digital, networked, and intelligent transformation and an application model for deeply integrating the internet, big data, and artificial intelligence with the real economy; it is also a new business form and a new industry that will reshape enterprises, supply chains, and industrial chains. At present, industrial internet applications are expanding into key sectors of the national economy, forming six new models: platform-based design, intelligent manufacturing, networked collaboration, personalized customization, service-oriented extension, and digital management. Their empowering effects are increasingly evident, strongly promoting higher quality, greater efficiency, lower costs, and greener, safer development of the real economy.)

GLM: 工业互联网是制造业技术、管理、模式的重大变革,是推动互联网、大数据、人工智能和实体经济深度融合的重要载体,是建设制造强国和网络强国的重要基础。

(Translation: The Industrial Internet is a major transformation of manufacturing technology, management, and business models; it is an important vehicle for the deep integration of the internet, big data, and artificial intelligence with the real economy, and an important foundation for building a strong manufacturing nation and a strong cyber nation.)

##### Example 3 (Long Text Generation)
Context: 问题:高斯所在的国家有什么汽车品牌?答案:[gMASK]

(Translation: Question: What car brands does the country where Gauss lived have? Answer: [gMASK])

GLM: 答案:[gMASK]<|startofpiece|>德国奔驰、德国大众、别克、沃尔沃、斯柯达、本田、雪铁龙.

(Translation: Answer: Germany's Mercedes-Benz and Volkswagen, Buick, Volvo, Škoda, Honda, Citroën.)

### Ptuning
Run the following script to integrate p-tuning with GLM:
```shell
cd algutils/ptuning/
bash finetune_zy.sh
```

### Pretrain
Run the following script to pre-train the GLM-Large model:
```shell
cd examples/glm/
bash scripts/ds_pretrain_nvidia.sh config/ds_block_large.sh
```

The script [examples/glm/config/ds_pretrain_nvidia.sh](examples/glm/config/ds_pretrain_nvidia.sh) launches the training program with DeepSpeed. Change `NUM_WORKERS` and `NUM_GPUS_PER_WORKER` to the number of worker nodes and the number of GPUs per worker, and change `HOST_FILE_PATH` to the path of an OpenMPI-style hostfile, as in the sketch below. More details about the DeepSpeed launcher can be found [here](https://www.deepspeed.ai/getting-started/#resource-configuration-multi-node).

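As a sketch, a two-node cluster with four GPUs per node might be configured like this (hostnames and paths are placeholders, not the repo's defaults):

```shell
# In ds_pretrain_nvidia.sh (illustrative values):
NUM_WORKERS=2
NUM_GPUS_PER_WORKER=4
HOST_FILE_PATH=/path/to/hostfile

# The hostfile is OpenMPI-style: one line per node, where "slots"
# gives the number of GPUs available on that node, e.g.:
#   node1 slots=4
#   node2 slots=4
```
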
The file [examples/glm/config/ds_block_large.sh](examples/glm/config/ds_block_large.sh) defines the hyperparameters for pretraining. Most of the arguments are fairly self-explanatory. In particular, `--train-data` can be multiple keywords defined in `NAMED_CORPORA` in [data_utils/corpora.py](data_utils/corpora.py). The hyperparameters of the optimizer are defined in the corresponding JSON file under `config`, whose semantics are documented [here](https://www.deepspeed.ai/docs/config-json); a sketch follows below.

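For reference, a minimal optimizer block in such a DeepSpeed JSON config looks like the following sketch (the file name and all values are illustrative, not the repo's actual settings):

```shell
# Write a minimal DeepSpeed config; all values here are illustrative.
cat > config/example_ds_config.json <<'EOF'
{
  "train_micro_batch_size_per_gpu": 8,
  "gradient_accumulation_steps": 1,
  "optimizer": {
    "type": "Adam",
    "params": {
      "lr": 1e-4,
      "weight_decay": 0.01
    }
  }
}
EOF
```
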
## Bert

We show some examples based on the Bert model.

### Pretrain
Run the following script to pre-train the Bert model:
```shell
cd examples/bert/
python quick_start.py
```

## CogView
### Pretrain
Run the following script to pre-train the CogView model:
```shell
cd examples/cogview/
bash config/pretrain_multiple_nodes.sh
```

### Inference
Run the following script to test the model's text-to-image ability:
```shell
cd examples/cogview/
bash config/text2image_cogview.sh
```