---
license: apache-2.0
language:
- en
tags:
- sft
pipeline_tag: text-generation
widget:
- text: <prefix>You are a helpful assistant model trained by LAION called Aki</prefix><human>Hi, how are you?<bot>
- text: <human>What's the Earth total population<bot>
- text: <human>Write a story about future of AI development<bot>
---

# Pythia 3B SFT model revision 1

<!-- Provide a quick summary of what the model is/does. -->


# Model Details

## Model Description

The model was supervised fine-tuned only on data collected through the [Open Assistant](https://open-assistant.io/) crowdsourcing platform.

- **Developed by:** Open Assistant
- **Model type:** Pythia
- **Language(s) (NLP):** English
- **License:** Apache-2.0

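As shown in the widget examples in the card metadata, prompts use special dialogue markers: an optional `<prefix>...</prefix>` block for a system-style instruction, `<human>` before the user turn, and a trailing `<bot>` marker where the model's reply begins.
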
## Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [Open Assistant](https://github.com/LAION-AI/Open-Assistant)

# Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->


## Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
See the hosted inference widget examples on this model page, or the code snippet in "How to Get Started with the Model" below.

# Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

This model inherits the limitations of its Pythia base model; see the [Pythia out-of-scope use section](https://huggingface.co/EleutherAI/pythia-12b#out-of-scope-use).

## Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "theblackcat102/pythia-3b-deduped-sft-r1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).half().eval().cuda()

# Example sampling settings; adjust them to your needs.
max_new_tokens = 200
top_k = 50
temperature = 0.8

input_text = "<human>What's the Earth's population?<bot>"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=max_new_tokens,
    do_sample=True,
    top_k=top_k,
    temperature=temperature,
    pad_token_id=tokenizer.eos_token_id,  # see dialogue_collator.py in the training code
)
output = tokenizer.decode(outputs[0])
print(output)
```
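
The widget examples in the card metadata also show an optional `<prefix>...</prefix>` block for a system-style instruction before the first `<human>` turn. Below is a minimal sketch of prompting with such a prefix, reusing the `model` and `tokenizer` loaded above; the prefix wording and sampling values are illustrative only.

```python
# System-style instruction wrapped in <prefix>...</prefix>, followed by the
# <human> turn and a trailing <bot> marker where the model should answer.
system = "You are a helpful assistant model trained by LAION called Aki"
question = "Hi, how are you?"
prompt = f"<prefix>{system}</prefix><human>{question}<bot>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    top_k=50,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
# The model's reply is the text following the final <bot> marker.
print(tokenizer.decode(outputs[0]))
```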

# Training Details

## Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

## Training Procedure

```
deepspeed trainer_sft.py --configs defaults pythia-1-4b-ost --deepspeed
```
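
The `--configs` flag selects named sections from the training YAML shown below, with later sections overriding earlier ones; that is why `pythia-1-4b-ost` repeats (and overrides) several `defaults` keys. A minimal sketch of that merge behaviour, assuming a simple last-wins update (the helper below is illustrative, not the actual trainer code):

```python
# Illustrative only: how "--configs defaults pythia-1-4b-ost" is assumed to compose,
# with later sections overriding keys from earlier ones.
def merge_configs(*sections: dict) -> dict:
    merged: dict = {}
    for section in sections:
        merged.update(section)
    return merged

defaults = {"learning_rate": 1e-5, "max_length": 512, "warmup_steps": 600}
pythia_1_4b_ost = {"learning_rate": 1e-6, "max_length": 1024, "warmup_steps": 100}

print(merge_configs(defaults, pythia_1_4b_ost))
# {'learning_rate': 1e-06, 'max_length': 1024, 'warmup_steps': 100}
```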

This model was trained for 200 iterations; beyond that point, evaluation accuracy started to drop and the loss increased, which is a sign of overfitting.

### Training Hyperparameters

```
defaults:
  learning_rate: 1e-5
  gradient_checkpointing: false
  gradient_accumulation_steps: 32
  per_device_train_batch_size: 2
  per_device_eval_batch_size: 2
  weight_decay: 0.00
  warmup_steps: 600
  eval_steps: 250
  save_steps: 250
  max_length: 512
  num_train_epochs: 2
  logging_steps: 10
  max_grad_norm: 2.0
  save_total_limit: 4
  fp16: true
  eval_accumulation_steps:
  freeze_layer:
  datasets:
    - oa_private:
        data_path: .cache
        split: sft
        val_split: 0.01
        fraction: 1
        file: 2023-02-26_oasst_default.jsonl
  cache_dir: .cache
  loss_fn: CrossEntropyLoss
  eval_size:
  log_dir: "base"
  quantization: false
  seq2seqmodel: false
  poly_eps: 1.0
  fuse_gelu: false
  log_wandb: true
  samples_mixing: true # uses collator that mixes samples in the batch to create a single sample with possible multiple tasks within
  verbose: false


pythia-1-4b-ost:
  learning_rate: 1e-6
  model_name: EleutherAI/pythia-1.4b-deduped
  weight_decay: 0.01
  max_length: 1024
  warmup_steps: 100
  gradient_checkpointing: false
  gradient_accumulation_steps: 12
  per_device_train_batch_size: 5
  per_device_eval_batch_size: 6
  eval_steps: 100
  save_steps: 100
  num_train_epochs: 50
  save_total_limit: 4
```
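
For reference, with the `pythia-1-4b-ost` overrides each optimizer step processes `per_device_train_batch_size × gradient_accumulation_steps = 5 × 12 = 60` sequences per device; the total effective batch size also scales with the number of GPUs, which is not stated on this card.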


# Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

## Testing Data, Factors & Metrics

### Testing Data

<!-- This should link to a Data Card if possible. -->

[More Information Needed]

### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

## Results



### Summary



# Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

# Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

# Technical Specifications [optional]

## Model Architecture and Objective

[More Information Needed]

## Compute Infrastructure

[More Information Needed]

### Hardware

[More Information Needed]

### Software

[More Information Needed]

# Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

# Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

# Acknowledgements

- [LAION](https://laion.ai/) & EleutherAI
- [Stability.ai](https://stability.ai/): this project wouldn't be possible without their compute resources
- [Teams and contributors at Open Assistant](https://github.com/LAION-AI/Open-Assistant/graphs/contributors): who contribute their time to this project outside of their day jobs
- [Hugging Face](https://huggingface.co/): for model storage and the Spaces hosted here

# Model Card Authors [optional]

[More Information Needed]

# Model Card Contact

[More Information Needed]