---
license: llama2
language:
- en
tags:
- summary
---
# Bubo Bubo 13B

![img](./bubu-bubo.png)

# Prompting

## Prompt Template (Alpaca style)

```
### Instruction:

<prompt> (without the <>)

### Response:
```
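
For convenience, a small helper can assemble this template around an instruction. This is a minimal sketch; the `make_prompt` name is illustrative and not part of the model repository.

```python
def make_prompt(instruction: str) -> str:
    """Wrap an instruction in the Alpaca-style template shown above."""
    return f"### Instruction:\n\n{instruction}\n\n### Response:\n"

# Example usage with a placeholder email chain
print(make_prompt("Summarize this email chain: ..."))
```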

## Sample Code

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("cuda")

# Load the model and tokenizer; device_map="auto" lets Accelerate place the weights
model = AutoModelForCausalLM.from_pretrained("ibivibiv/bubo-bubo-13b", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ibivibiv/bubo-bubo-13b")

# Alpaca-style prompt for the summarization task
inputs = tokenizer("### Instruction: Summarize this email chain: <email chain stuff here>.\n### Response:\n", return_tensors="pt", return_attention_mask=False)

# max_new_tokens caps the generated summary without counting the prompt length
outputs = model.generate(**inputs, max_new_tokens=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
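
Since `generate` returns the prompt tokens along with the completion, the decoded string echoes the prompt. Continuing from `text` above, a short post-processing step can isolate just the generated summary; this is a minimal sketch that assumes the Llama 2 end-of-sequence token `</s>` may trail the output.

```python
# Keep only the text after the response marker and drop the EOS token if present
summary = text.split("### Response:")[-1].replace("</s>", "").strip()
print(summary)
```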

# Model Details
* **Trained by**: [ibivibiv](https://huggingface.co/ibivibiv)
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
* **Model type**: **bubo-bubo-13b** is an auto-regressive language model fine-tuned on the Llama 2 transformer architecture.
* **Language(s)**: English
* **Purpose**: Trained specifically for summarization tasks, targeting communication chains such as email threads.

# Benchmark Scores

Pending.

# Citations

```
@misc{open-llm-leaderboard,
  author = {Edward Beeching and Clémentine Fourrier and Nathan Habib and Sheon Han and Nathan Lambert and Nazneen Rajani and Omar Sanseviero and Lewis Tunstall and Thomas Wolf},
  title = {Open LLM Leaderboard},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = "\url{https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard}"
}
```
```
@software{eval-harness,
  author = {Gao, Leo and Tow, Jonathan and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and McDonell, Kyle and Muennighoff, Niklas and Phang, Jason and Reynolds, Laria and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},
  title = {A framework for few-shot language model evaluation},
  month = sep,
  year = 2021,
  publisher = {Zenodo},
  version = {v0.0.1},
  doi = {10.5281/zenodo.5371628},
  url = {https://doi.org/10.5281/zenodo.5371628}
}
```
```
@misc{clark2018think,
  title = {Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  author = {Peter Clark and Isaac Cowhey and Oren Etzioni and Tushar Khot and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  year = {2018},
  eprint = {1803.05457},
  archivePrefix = {arXiv},
  primaryClass = {cs.AI}
}
```
```
@misc{zellers2019hellaswag,
  title = {HellaSwag: Can a Machine Really Finish Your Sentence?},
  author = {Rowan Zellers and Ari Holtzman and Yonatan Bisk and Ali Farhadi and Yejin Choi},
  year = {2019},
  eprint = {1905.07830},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```
```
@misc{hendrycks2021measuring,
  title = {Measuring Massive Multitask Language Understanding},
  author = {Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt},
  year = {2021},
  eprint = {2009.03300},
  archivePrefix = {arXiv},
  primaryClass = {cs.CY}
}
```
```
@misc{lin2022truthfulqa,
  title = {TruthfulQA: Measuring How Models Mimic Human Falsehoods},
  author = {Stephanie Lin and Jacob Hilton and Owain Evans},
  year = {2022},
  eprint = {2109.07958},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```
```
@misc{DBLP:journals/corr/abs-1907-10641,
  title = {{WINOGRANDE:} An Adversarial Winograd Schema Challenge at Scale},
  author = {Keisuke Sakaguchi and Ronan Le Bras and Chandra Bhagavatula and Yejin Choi},
  year = {2019},
  eprint = {1907.10641},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```
```
@misc{DBLP:journals/corr/abs-2110-14168,
  title = {Training Verifiers to Solve Math Word Problems},
  author = {Karl Cobbe and Vineet Kosaraju and Mohammad Bavarian and Mark Chen and Heewoo Jun and Lukasz Kaiser and Matthias Plappert and Jerry Tworek and Jacob Hilton and Reiichiro Nakano and Christopher Hesse and John Schulman},
  year = {2021},
  eprint = {2110.14168},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
```