Text Generation
Transformers
PyTorch
Thai
English
mpt
custom_code
text-generation-inference
mrp committed on
Commit ef3a275
1 Parent(s): 85f2d34

Update README.md

Files changed (1)
  1. README.md +8 -1
README.md CHANGED
@@ -33,7 +33,9 @@ Users (both direct and downstream) should be made aware of the risks, biases, an
 
 # How to Get Started with the Model
 Use the code [here](https://colab.research.google.com/drive/1y_7oOU3ZJI0h4chUrXFL3K4kelW_OI2G?usp=sharing#scrollTo=4yN3Bo6iAH2L) below to get started with the model.
-Or
+
+Or
+
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained( "airesearch/WangchanLion7B", trust_remote_code=True)
@@ -68,3 +70,8 @@ output = model.generate(
 print(tokenizer.decode(output[0], skip_special_tokens=True))
 ```
+
+# Training Details
+## Training Data
+Finetuning datasets are sourced from [LAION OIG chip2 and infill_dbpedia (Apache-2.0)](https://huggingface.co/datasets/laion/OIG), [DataBricks Dolly v2 (Apache-2.0)](https://github.com/databrickslabs/dolly), [OpenAI TL;DR (MIT)](https://github.com/openai/summarize-from-feedback), [Hello-SimpleAI HC3 (CC-BY SA)](https://huggingface.co/datasets/Hello-SimpleAI/HC3), [dolphin](https://huggingface.co/datasets/ehartford/dolphin), [iapp_wiki_qa_squad](https://huggingface.co/datasets/iapp_wiki_qa_squad), [thaisum](https://huggingface.co/datasets/thaisum), [xlsum](https://huggingface.co/datasets/csebuetnlp/xlsum), [scb_mt_enth_2020](https://huggingface.co/datasets/scb_mt_enth_2020), han dataset, [xp3x](https://huggingface.co/datasets/Muennighoff/xP3x) and [Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus).
+## Training regime
+- QLoRA with 4 GPUs. (A100 40GB?)
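
The diff above only shows the edges of the README's getting-started snippet (the import, the tokenizer call, and the final decode); the model-loading line, prompt, and generation arguments are elided as unchanged context. As a minimal, self-contained sketch of that flow, the example below fills those gaps with assumed values; the prompt text and generation parameters are placeholders, not the README's exact code.

```python
# Hedged sketch: load airesearch/WangchanLion7B and generate a short completion.
# The prompt and the generation settings are placeholders; the README's own
# snippet (not shown in the diff context) may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("airesearch/WangchanLion7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "airesearch/WangchanLion7B",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # assumption: half-precision load to fit a single GPU
    device_map="auto",
)

prompt = "Answer briefly: what is the capital of Thailand?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=128,  # placeholder generation settings
    do_sample=False,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```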
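
The new "Training regime" bullet states only that QLoRA was used on 4 GPUs. As a rough illustration of what such a setup typically looks like with transformers, peft, and bitsandbytes, here is a hedged configuration sketch: every hyperparameter (LoRA rank, alpha, dropout, target modules, quantization options) is a placeholder rather than the values used to train WangchanLion7B, and the 4-GPU launch itself (e.g. a torchrun/accelerate data-parallel run) is omitted.

```python
# Hedged sketch of a QLoRA setup: 4-bit quantized frozen base model + LoRA adapters.
# All hyperparameters below are illustrative placeholders, not WangchanLion7B's
# actual training configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize the frozen base weights to 4 bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "airesearch/WangchanLion7B",
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",                    # placement only; the real 4-GPU strategy may differ
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=16,                                 # placeholder LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["Wqkv", "out_proj"],  # assumption: typical MPT-style attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()        # only the LoRA adapters are trainable
```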