rishigami committed on
Commit
d7cbfb7
1 Parent(s): 5610d2c

Create README.md

Files changed (1)
  1. README.md +84 -0
README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ license: cc-by-sa-4.0
+ datasets:
+ - wikipedia
+ - cc100
+ - mc4
+ language:
+ - ja
+ tags:
+ - japanese
+ - causal-lm
+ inference: false
+ ---
+ # OpenCALM-3B
+
+ ## Model Description
+
+ OpenCALM is a suite of decoder-only language models pre-trained on Japanese datasets, developed by CyberAgent, Inc.
+
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the weights in float16 and let device_map="auto" place them on the available device(s).
+ model = AutoModelForCausalLM.from_pretrained("cyberagent/open-calm-3b", device_map="auto", torch_dtype=torch.float16)
+ tokenizer = AutoTokenizer.from_pretrained("cyberagent/open-calm-3b")
+
+ # Tokenize a Japanese prompt and move the tensors to the model's device.
+ inputs = tokenizer("AIによって私達の暮らしは、", return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     # Sample up to 64 new tokens at temperature 0.7.
+     tokens = model.generate(
+         **inputs,
+         max_new_tokens=64,
+         do_sample=True,
+         temperature=0.7,
+         pad_token_id=tokenizer.pad_token_id,
+     )
+
+ output = tokenizer.decode(tokens[0], skip_special_tokens=True)
+ print(output)
+ ```
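
As a rough sizing aid for the `torch_dtype=torch.float16` setting in the usage snippet, the sketch below estimates the memory taken by the weights alone. It assumes 2 bytes per float16 parameter and uses the 2.7B figure from the Params column of the Model Details table. The function name `weight_memory_gib` is hypothetical, and activations, the KV cache, and framework overhead are not counted, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# 2.7B parameters (Params column for open-calm-3b).
NUM_PARAMS_3B = 2.7e9

fp16_gib = weight_memory_gib(NUM_PARAMS_3B, bytes_per_param=2)  # float16
fp32_gib = weight_memory_gib(NUM_PARAMS_3B, bytes_per_param=4)  # float32

print(f"float16 weights: {fp16_gib:.1f} GiB")
print(f"float32 weights: {fp32_gib:.1f} GiB")
```

Loading in float16 halves the weight footprint relative to float32, which is the main reason the usage snippet requests it.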
+
+ ## Model Details
+
+ |Model|Params|Layers|Dim|Heads|Dev perplexity|
+ |:---:|:---:|:---:|:---:|:---:|:---:|
+ |[cyberagent/open-calm-small](https://huggingface.co/cyberagent/open-calm-small)|160M|12|768|12|19.7|
+ |[cyberagent/open-calm-medium](https://huggingface.co/cyberagent/open-calm-medium)|400M|24|1024|16|13.8|
+ |[cyberagent/open-calm-large](https://huggingface.co/cyberagent/open-calm-large)|830M|24|1536|16|11.3|
+ |[cyberagent/open-calm-1b](https://huggingface.co/cyberagent/open-calm-1b)|1.4B|24|2048|16|10.3|
+ |[cyberagent/open-calm-3b](https://huggingface.co/cyberagent/open-calm-3b)|2.7B|32|2560|32|9.7|
+ |[cyberagent/open-calm-7b](https://huggingface.co/cyberagent/open-calm-7b)|6.8B|32|4096|32|8.2|
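
The columns of this table can be cross-checked with a little arithmetic. The sketch below derives the per-head dimension (Dim / Heads) and a rough parameter count for the transformer blocks using the common 12 x Layers x Dim^2 rule of thumb. That rule assumes a 4x feed-forward expansion and ignores embeddings, biases, and layer norms; it and the names `MODELS`, `head_dim`, and `block_params` are illustrative assumptions rather than figures from CyberAgent, so the estimates come out below the Params column.

```python
# (name, stated params, layers, dim, heads), copied from the table above.
MODELS = [
    ("open-calm-small", 160e6, 12, 768, 12),
    ("open-calm-medium", 400e6, 24, 1024, 16),
    ("open-calm-large", 830e6, 24, 1536, 16),
    ("open-calm-1b", 1.4e9, 24, 2048, 16),
    ("open-calm-3b", 2.7e9, 32, 2560, 32),
    ("open-calm-7b", 6.8e9, 32, 4096, 32),
]


def head_dim(dim: int, heads: int) -> int:
    """Width of a single attention head."""
    return dim // heads


def block_params(layers: int, dim: int) -> int:
    """Rule-of-thumb parameter count for the transformer blocks only.

    Per layer: about 4*dim^2 for attention plus 8*dim^2 for a 4x-wide MLP.
    """
    return 12 * layers * dim * dim


for name, stated, layers, dim, heads in MODELS:
    estimate = block_params(layers, dim)
    print(
        f"{name}: head_dim={head_dim(dim, heads)}, "
        f"blocks~{estimate / 1e9:.2f}B, stated={stated / 1e9:.2f}B"
    )
```

The gap between the estimate and the stated size is largest for the small models, where the embedding matrices make up a larger share of the total.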
+
+ * **Developed by**: [CyberAgent, Inc.](https://www.cyberagent.co.jp/)
+ * **Model type**: Transformer-based Language Model
+ * **Language**: Japanese
+ * **Library**: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
+ * **License**: OpenCALM is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)). When using this model, please provide appropriate credit to CyberAgent, Inc.
+   * Example (en): This model is a fine-tuned version of OpenCALM-XX developed by CyberAgent, Inc. The original model is released under the CC BY-SA 4.0 license, and this model is also released under the same CC BY-SA 4.0 license. For more information, please visit: https://creativecommons.org/licenses/by-sa/4.0/
+   * Example (ja): 本モデルは、株式会社サイバーエージェントによるOpenCALM-XXをファインチューニングしたものです。元のモデルはCC BY-SA 4.0ライセンスのもとで公開されており、本モデルも同じくCC BY-SA 4.0ライセンスで公開します。詳しくはこちらをご覧ください: https://creativecommons.org/licenses/by-sa/4.0/
+
+
+ ## Training Dataset
+
+ * Wikipedia (ja)
+ * Common Crawl (ja)
+
+ ## Author
+
+ [Ryosuke Ishigami](https://huggingface.co/rishigami)
+
+ ## Citations
+
+ ```bibtex
+ @software{gpt-neox-library,
+   title = {{GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch}},
+   author = {Andonian, Alex and Anthony, Quentin and Biderman, Stella and Black, Sid and Gali, Preetham and Gao, Leo and Hallahan, Eric and Levy-Kramer, Josh and Leahy, Connor and Nestler, Lucas and Parker, Kip and Pieler, Michael and Purohit, Shivanshu and Songz, Tri and Phil, Wang and Weinbach, Samuel},
+   url = {https://www.github.com/eleutherai/gpt-neox},
+   doi = {10.5281/zenodo.5879544},
+   month = {8},
+   year = {2021},
+   version = {0.0.1},
+ }
+ ```