|
--- |
|
base_model: |
|
- beomi/Llama-3-KoEn-8B-Instruct-preview |
|
- Danielbrdz/Barcenas-Llama3-8b-ORPO |
|
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1 |
|
- rombodawg/Llama-3-8B-Instruct-Coder |
|
- NousResearch/Meta-Llama-3-8B-Instruct |
|
- rombodawg/Llama-3-8B-Base-Coder-v3.5-10k |
|
- cognitivecomputations/dolphin-2.9-llama3-8b |
|
- asiansoul/Llama-3-Open-Ko-Linear-8B |
|
- NousResearch/Meta-Llama-3-8B |
|
- aaditya/Llama3-OpenBioLLM-8B |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
|
|
--- |
|
# Joah-Llama-3-KoEn-8B-Coder-v1 |
|
|
|
![Joah](https://i.ibb.co/kMqZTqc/Joah.png)
|
|
|
A merge model for all of you, one that from today on will be a light for one another.
|
|
|
"μ’μ(Joah)" by AsianSoul |
|
|
|
|
|
## Merge Details |
|
|
|
|
|
The performance of this merged model doesn't seem bad, though that's just my opinion.
|
|
|
This may not be a model that satisfies you. But if we keep overcoming its shortcomings, won't we someday find the answer we want?
|
|
|
Don't worry even if you don't get the results you want. |
|
|
|
I'll find the answer for you. |
|
|
|
Coming soon: real PoSE to extend Llama's context length to 64k, combined with my merge method, [reborn](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2).
|
|
|
I have found that most merged models published so far do not actually set a 64k context length in their configs. I will improve this in the next merge with reborn. If that doesn't work, I'll have to find another way, right?
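You can verify this yourself by inspecting a repo's config with transformers. A quick check (the repo id below is a placeholder, not a real model):

```python
from transformers import AutoConfig

# Placeholder repo id: substitute any merged model you want to inspect.
cfg = AutoConfig.from_pretrained("some-user/some-llama3-merge")

# Vanilla Llama-3 ships with 8192; a true 64k model would report 65536.
print(cfg.max_position_embeddings)
```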
|
|
|
256k is not possible for now; my computer runs out of memory.
|
|
|
If you support me, I will try it on a machine with maximum specifications. I would also like to run extensive tests for you on a network with high-capacity traffic and high-speed 10G connectivity.
|
|
|
|
|
### Merge Method |
|
|
|
This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) (`dare_ties`) method, using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
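Roughly, DARE-TIES works per parameter tensor: each fine-tuned model's delta from the base is randomly dropped with probability `1 - density` and the survivors rescaled (DARE), then the weighted deltas are combined with TIES-style sign election. Below is a toy, single-tensor paraphrase of that idea, not mergekit's actual implementation (its exact weighting and normalization differ):

```python
import torch

def dare(delta: torch.Tensor, density: float) -> torch.Tensor:
    # DARE: drop each delta entry with probability (1 - density),
    # then rescale the survivors by 1/density to preserve expectation.
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density

def dare_ties(base, finetuned, densities, weights):
    # Task vectors: each model's delta from the shared base,
    # sparsified by DARE and scaled by its merge weight.
    deltas = torch.stack([
        dare(ft - base, d) * w
        for ft, d, w in zip(finetuned, densities, weights)
    ])
    # TIES-style sign election: pick the majority sign per parameter,
    # then keep only the contributions that agree with it.
    elected = deltas.sum(dim=0).sign()
    agree = (deltas.sign() == elected).float()
    return base + (deltas * agree).sum(dim=0)
```

In the config below, `density` controls how much of each model's delta survives the DARE drop, and `weight` scales its contribution to the elected sum.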
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview) |
|
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO) |
|
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1) |
|
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder) |
|
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) |
|
* [rombodawg/Llama-3-8B-Base-Coder-v3.5-10k](https://huggingface.co/rombodawg/Llama-3-8B-Base-Coder-v3.5-10k) |
|
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b) |
|
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B) |
|
* [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B) |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters

  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.25

  - model: beomi/Llama-3-KoEn-8B-Instruct-preview
    parameters:
      density: 0.55
      weight: 0.15

  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55
      weight: 0.2

  - model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
    parameters:
      density: 0.55
      weight: 0.1

  - model: rombodawg/Llama-3-8B-Instruct-Coder
    parameters:
      density: 0.55
      weight: 0.1

  - model: rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
    parameters:
      density: 0.55
      weight: 0.1

  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.05

  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      density: 0.55
      weight: 0.05

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.1

merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
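The merge can be reproduced by saving the YAML above as `config.yaml` and running mergekit's `mergekit-yaml config.yaml ./merged`. For inference, here is a minimal sketch with transformers; the repo id is my assumption based on this card's title, and it assumes the uploaded tokenizer carries the Llama-3 chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id from the card title; adjust if the upload path differs.
model_id = "asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Korean coding prompt, matching the model's KoEn + Coder focus.
messages = [{"role": "user", "content": "파이썬으로 퀵소트를 구현해 줘."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```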
|
|