PetroGPT/WestSeverus-7B-DPO-v2

@@ -5,191 +5,101 @@ language:
 ---
 # WestSeverus - 7B - DPO - v2
-## Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a53b0747a04f0512941b6f/-_CvSGuu-kQ1GDNzVMYjg.png)
+## ☘️ Model Description
+WestSeverus-7B-DPO-v2 is a WestLake-family model trained on top of [WestSeverus-7B](https://huggingface.co/FelixChao/WestSeverus-7B).
+The model was trained on several DPO datasets and performs well on basic math problems.
+WestSeverus-7B-DPO-v2 can be used in mathematics, chemistry, physics, and even coding for further research and reference.
+## 📖 Table of Contents
+1. [Nous Benchmark Results](#🪄-nous-benchmark-results)
+    - AGIEval
+    - GPT4All
+    - TruthfulQA Scores
+    - BigBench
+2. [Open LLM Leaderboard](#🏆-open-llm-leaderboard)
+    - ARC
+    - HellaSwag
+    - MMLU
+    - TruthfulQA
+    - Winogrande
+    - GSM8K
+3. [EvalPlus Leaderboard](#⚡-evalplus-leaderboard)
+    - HumanEval
+    - HumanEval_Plus
+    - MBPP
+    - MBPP_Plus
+4. [Prompt Format](#prompt-format)
+5. [Inference Example Code](#inference-example-code)
+6. [Quantized Models](#🛠️-quantized-models)
+7. [Gratitude](#gratitude)
+## 🪄 Nous Benchmark Results
+WestSeverus-7B-DPO-v2 currently tops the [YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/CultriX/Yet_Another_LLM_Leaderboard) created by CultriX, with the highest TruthfulQA and BigBench scores among the models listed below.
+| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
+|---|---:|---:|---:|---:|---:|
+| [**WestSeverus-7B-DPO-v2**](https://huggingface.co/FelixChao/WestSeverus-7B-DPO-v2) | **60.98** | 45.29 | 77.2 | **72.72** | **48.71** |
+| [CultriX/Wernicke-7B-v1](https://huggingface.co/CultriX/Wernicke-7B-v1) | 60.73 | 45.59 | 77.36 | 71.46 | 48.49 |
+| [mlabonne/NeuralBeagle14-7B](https://huggingface.co/mlabonne/NeuralBeagle14-7B) | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
+| [CultriX/MistralTrix-v1](https://huggingface.co/CultriX/MistralTrix-v1) | 60.05 | 44.98 | 76.62 | 71.44 | 47.17 |
+| [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) | 59.42 | 44.27 | 77.86 | 67.46 | 48.09 |
+| [mlabonne/Daredevil-7B](https://huggingface.co/mlabonne/Daredevil-7B) | 58.22 | 44.85 | 76.07 | 64.89 | 47.07 |
+| [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) | 44.61 | 27.96 | 70.84 | 44.46 | 35.17 |
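The Average column in the table above appears to be the unweighted mean of the four suite scores. The short check below recomputes it from the numbers copied out of the table; the helper names (`mean_of_suites`, `max_gap`) are illustrative and not part of any leaderboard tooling.

```python
# Each entry: (reported average, AGIEval, GPT4All, TruthfulQA, Bigbench),
# copied from the table above.
rows = {
    "WestSeverus-7B-DPO-v2": (60.98, 45.29, 77.2, 72.72, 48.71),
    "Wernicke-7B-v1": (60.73, 45.59, 77.36, 71.46, 48.49),
    "NeuralBeagle14-7B": (60.25, 46.06, 76.77, 70.32, 47.86),
    "MistralTrix-v1": (60.05, 44.98, 76.62, 71.44, 47.17),
    "WestLake-7B-v2": (59.42, 44.27, 77.86, 67.46, 48.09),
    "Daredevil-7B": (58.22, 44.85, 76.07, 64.89, 47.07),
    "phi-2": (44.61, 27.96, 70.84, 44.46, 35.17),
}


def mean_of_suites(scores):
    """Unweighted mean of the per-suite scores."""
    return sum(scores) / len(scores)


def max_gap(table):
    """Largest absolute difference between reported and recomputed averages."""
    return max(abs(mean_of_suites(r[1:]) - r[0]) for r in table.values())


# Print the reported and recomputed average side by side for each model.
for name, (reported, *suites) in rows.items():
    print(f"{name}: reported {reported}, recomputed {mean_of_suites(suites):.4f}")
```

Every recomputed value agrees with the reported average to within two-decimal rounding, which is consistent with a plain mean over the four suites.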
+## 🏆 Open LLM Leaderboard
+WestSeverus-7B-DPO-v2 is one of the top 7B models on the Open LLM Leaderboard, with particularly strong TruthfulQA and GSM8K scores.
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |75.29|
+|AI2 Reasoning Challenge (25-Shot)|71.42|
+|HellaSwag (10-Shot)              |88.27|
+|MMLU (5-Shot)                    |64.79|
+|TruthfulQA (0-shot)              |72.37|
+|Winogrande (5-shot)              |83.27|
+|GSM8k (5-shot)                   |71.65|
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_FelixChao__WestSeverus-7B-DPO-v2).
+## ⚡ EvalPlus Leaderboard
+| Model | HumanEval | HumanEval_Plus | MBPP | MBPP_Plus |
+|---|---:|---:|---:|---:|
+| phi-2-2.7B | 48.2 | 43.3 | 61.9 | 51.4 |
+| **WestSeverus-7B-DPO-v2** | 43.3 | 34.1 | TBD | TBD |
+| SOLAR-10.7B-Instruct-v1.0 | 42.1 | 34.3 | 42.9 | 34.6 |
+| CodeLlama-7B | 37.8 | 34.1 | 57.6 | 45.4 |
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a53b0747a04f0512941b6f/ckaLICCp_npj64mvxY1yj.png)
+## Prompt Format
+TBD.
+## Inference Example Code
+TBD.
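Until the authors publish official inference code, the following is a minimal sketch of how a causal language model hosted on the Hugging Face Hub is typically loaded with the `transformers` library. It rests on several assumptions: that the repository id `PetroGPT/WestSeverus-7B-DPO-v2` (taken from the page header) loads with `AutoModelForCausalLM`, that the `accelerate` package is installed so `device_map="auto"` works, and that raw text is an acceptable prompt, since the prompt format is still listed as TBD. The `generate` helper is hypothetical and not part of the model card.

```python
# Assumption: repository id taken from the page header; adjust if the model lives elsewhere.
MODEL_ID = "PetroGPT/WestSeverus-7B-DPO-v2"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return the text generated after `prompt`.

    Imports are kept inside the function so that defining it does not
    require `transformers` to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # use the dtype stored in the checkpoint
        device_map="auto",   # assumption: `accelerate` is installed
    )

    # The prompt is passed as raw text because the card does not yet
    # document a prompt template.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 12 * 7?")` downloads the weights on first use, so it needs enough disk space and memory for a 7B-parameter model. Once a prompt format is published, the raw prompt should be wrapped accordingly.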
+## 🛠️ Quantized Models
+### Another version of the WestSeverus model:
+* [**PetroGPT/WestSeverus-7B-DPO**](https://huggingface.co/PetroGPT/WestSeverus-7B-DPO)
+* **GGUF**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GGUF
+* **GGUF**: https://huggingface.co/s3nh/WestSeverus-7B-DPO-GGUF
+* **GPTQ**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GPTQ
+* **AWQ**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-AWQ
+## Gratitude
+TBD.