asiansoul's picture
Update README.md
ce3f25f verified
|
raw
history blame
3.49 kB
---
base_model:
- beomi/Llama-3-KoEn-8B-Instruct-preview
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
- rombodawg/Llama-3-8B-Instruct-Coder
- NousResearch/Meta-Llama-3-8B-Instruct
- rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
- cognitivecomputations/dolphin-2.9-llama3-8b
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B
- aaditya/Llama3-OpenBioLLM-8B
library_name: transformers
tags:
- mergekit
- merge
---
# Joah-Llama-3-KoEn-8B-Coder-v1
<a href="https://ibb.co/8XPkwP8"><img src="https://i.ibb.co/kMqZTqc/Joah.png" alt="Joah" border="0"></a><br />
였늘 λΆ€ν„° μ„œλ‘œμ—κ²Œ 빛이 λ˜μ–΄ 쀄 μ—¬λŸ¬λΆ„μ˜ Merge Model
"μ’‹μ•„(Joah)" by AsianSoul
## Merge Details
The performance of this merge model doesn't seem to be bad though.-> Just opinion
This may not be a model that satisfies you. But if we continue to overcome our shortcomings,
Won't we someday find the answer we want?
### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as a base.
### Models Merged
The following models were included in the merge:
* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview)
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO)
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1)
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [rombodawg/Llama-3-8B-Base-Coder-v3.5-10k](https://huggingface.co/rombodawg/Llama-3-8B-Base-Coder-v3.5-10k)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
- model: NousResearch/Meta-Llama-3-8B
# Base model providing a general foundation without specific parameters
- model: NousResearch/Meta-Llama-3-8B-Instruct
parameters:
density: 0.60
weight: 0.25
- model: beomi/Llama-3-KoEn-8B-Instruct-preview
parameters:
density: 0.55
weight: 0.15
- model: asiansoul/Llama-3-Open-Ko-Linear-8B
parameters:
density: 0.55
weight: 0.2
- model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
parameters:
density: 0.55
weight: 0.1
- model: rombodawg/Llama-3-8B-Instruct-Coder
parameters:
density: 0.55
weight: 0.1
- model: rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
parameters:
density: 0.55
weight: 0.1
- model: cognitivecomputations/dolphin-2.9-llama3-8b
parameters:
density: 0.55
weight: 0.05
- model: Danielbrdz/Barcenas-Llama3-8b-ORPO
parameters:
density: 0.55
weight: 0.05
- model: aaditya/Llama3-OpenBioLLM-8B
parameters:
density: 0.55
weight: 0.1
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
int8_mask: true
dtype: bfloat16
```