Jan committed on
Commit
cb4686c
1 Parent(s): 91250a3

Create README.md

Files changed (1)
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ ---
+ <!-- header start -->
+ <!-- 200823 -->
+ <div style="width: auto; margin-left: auto; margin-right: auto">
+ <img src="https://github.com/janhq/jan/assets/89722390/35daac7d-b895-487c-a6ac-6663daaad78e" alt="Jan banner" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+ </div>
+
+ <p align="center">
+ <a href="https://jan.ai/">Jan</a>
+ - <a href="https://discord.gg/AsJ8krTT3N">Discord</a>
+ </p>
+ <!-- header end -->
+ # Model Description
+ This model uses the `DARE_TIES` merge method.
+ **NOTE:** Because of the architecture mismatch between Llama and Mistral, the Magicoder-S-CL-7B layers will be skipped.
+
+ ```yaml
+ base_model: mistralai/Mistral-7B-v0.1
+ dtype: bfloat16
+ merge_method: dare_ties
+ models:
+   - model: mistralai/Mistral-7B-v0.1
+   - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
+     parameters:
+       density: 0.8
+       weight: 0.3
+   - model: Q-bert/MetaMath-Cybertron-Starling
+     parameters:
+       density: 0.8
+       weight: 0.3
+   - model: ise-uiuc/Magicoder-S-CL-7B
+     parameters:
+       density: 0.6
+       weight: 0.2
+   - model: AIDC-ai-business/Marcoroni-7B-v3
+     parameters:
+       density: 0.6
+       weight: 0.2
+ parameters:
+   int8_mask: true
+ ```
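To make the roles of `density` and `weight` in the config concrete, here is a minimal pure-Python sketch of the idea behind a DARE-TIES merge on flat weight lists. It is an illustration under stated assumptions, not mergekit's implementation: the function names `dare_sparsify` and `dare_ties_merge` are hypothetical, the sign election is a simplified sum-based vote, and normalization and `int8_mask` are left out.

```python
import random


def dare_sparsify(delta, density, rng):
    """DARE step: keep each entry with probability `density`,
    then rescale the survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, tuned, rng):
    """Merge fine-tuned weight vectors into `base`.

    `tuned` is a list of (params, density, weight) tuples (hypothetical layout).
    """
    weighted = []
    for params, density, weight in tuned:
        # Task vector: what this fine-tuned model changed relative to the base.
        delta = [p - b for p, b in zip(params, base)]
        sparse = dare_sparsify(delta, density, rng)
        weighted.append([weight * d for d in sparse])

    merged = []
    for i, b in enumerate(base):
        column = [w[i] for w in weighted]
        # Simplified TIES-style sign election: the sign of the summed deltas wins,
        # and only deltas that agree with it are added to the base weight.
        elected_positive = sum(column) >= 0
        agreeing = [d for d in column if (d >= 0) == elected_positive]
        merged.append(b + sum(agreeing))
    return merged


rng = random.Random(0)
base = [0.0, 0.0]
tuned = [([1.0, -1.0], 1.0, 0.5), ([1.0, 2.0], 1.0, 0.5)]
print(dare_ties_merge(base, tuned, rng))  # [1.0, 1.0]
```

With `density` set to 1.0 nothing is dropped, so the example isolates the sign election: in the second position the negative delta loses the vote and is discarded. Lower densities, such as the 0.6 and 0.8 used in the config, drop entries at random and scale the rest up to compensate.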
+
+ # About Jan
+ Jan believes in the need for an open-source AI ecosystem and is building the infrastructure and tooling that let open-source AIs compete on a level playing field with proprietary ones.
+
+ Jan's long-term vision is to build a cognitive framework for future robots that serve as practical, useful assistants for humans and businesses in everyday life.
+
+ # Jan Model Merger
+ This is a test project for merging models.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+
+ Detailed results can be found here.
+
+ | Metric              | Value |
+ |---------------------|-------|
+ | Avg.                | ?     |
+ | ARC (25-shot)       | ?     |
+ | HellaSwag (10-shot) | ?     |
+ | MMLU (5-shot)       | ?     |
+ | TruthfulQA (0-shot) | ?     |
+ | Winogrande (5-shot) | ?     |
+ | GSM8K (5-shot)      | ?     |
+
+ # Acknowledgements
+ - [mergekit](https://github.com/cg123/mergekit)
+ - [DARE](https://github.com/yule-BUAA/MergeLM/blob/main/README.md)
+ - [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)