malteos committed on
Commit
702fd4d
1 Parent(s): 36a30d4

Create README.md

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ license: bigscience-bloom-rail-1.0
+ datasets:
+ - OpenAssistant/oasst1
+ - LEL-A/translated_german_alpaca_validation
+ - deepset/germandpr
+ language:
+ - de
+ pipeline_tag: conversational
+ ---
+
+ # Instruction-fine-tuned German language model (6B parameters)
+
+ Base model: [malteos/bloom-6b4-clp-german](https://huggingface.co/malteos/bloom-6b4-clp-german) [(Ostendorff and Rehm, 2023)](https://arxiv.org/abs/2301.09626)
+
+ Trained on:
+ - 20B additional German tokens
+ - [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) (German subset)
+ - [LEL-A/translated_german_alpaca_validation](https://huggingface.co/datasets/LEL-A/translated_german_alpaca_validation)
+ - [LEL-A's version of deepset/germandpr](https://github.com/LEL-A/EuroInstructProject#instruct-germandpr-dataset-v1-german)
+
+ ## Chat demo
+
+ [https://opengptx.dfki.de/chat/](https://opengptx.dfki.de/chat/)
+
+ Please note that this is a research prototype and may not be suitable for extensive use.
+
+
+ ## How to cite
+
+ If you are using our code or models, please cite [our paper](https://arxiv.org/abs/2301.09626):
+
+ ```bibtex
+ @misc{Ostendorff2023clp,
+ doi = {10.48550/ARXIV.2301.09626},
+ author = {Ostendorff, Malte and Rehm, Georg},
+ title = {Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning},
+ publisher = {arXiv},
+ year = {2023}
+ }
+
+ ```
+
+ ## License
+
+ [BigScience BLOOM RAIL 1.0](https://bigscience.huggingface.co/blog/the-bigscience-rail-license)