---
library_name: transformers
license: other
datasets:
- Locutusque/hercules-v4.0
language:
- en
---

<style>
  body {
    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
    line-height: 1.6;
    color: #f5f5f5;
    background-color: #1e2a36;
    margin: 0;
    padding: 0;
  }

  .container {
    max-width: 1200px;
    margin: 20px auto;
    padding: 20px;
    background-color: #2a3f54;
    border-radius: 8px;
    box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
    display: flex;
    flex-wrap: wrap;
    justify-content: space-between;
  }

  h1 {
    font-size: 2.5rem;
    color: #51a3d3;
    text-align: center;
    margin-bottom: 30px;
    width: 100%;
  }

  h2 {
    font-size: 1.75rem;
    margin: 20px 0;
    color: #63b8ea;
    padding-bottom: 10px;
  }

  h3 {
    font-size: 1.25rem;
    color: #80c8f4;
  }

  p, a {
    font-size: 1rem;
  }

  p {
    color: #b0c2ce;
    margin-bottom: 20px;
  }

  ul {
    list-style-type: none;
    padding: 0;
    display: flex;
    flex-wrap: wrap;
    justify-content: space-between;
    width: 100%;
  }

  li {
    background-color: #34495e;
    padding: 20px;
    margin-bottom: 10px;
    border-radius: 4px;
    cursor: pointer;
    transition: background-color 0.3s ease, color 0.3s ease;
    overflow: hidden;
    color: #b0c2ce;
    width: calc(50% - 10px);
    box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
  }

  li:hover {
    background-color: #4e6a81;
    color: #dfe8f1;
  }

  .section-content {
    margin-top: 15px;
    border-top: 1px solid #4e6a81;
    padding-top: 10px;
  }

  a {
    color: #a4c8e1;
    text-decoration: none;
  }

  a:hover {
    text-decoration: underline;
  }

  pre {
    background-color: #2c3e50;
    padding: 10px;
    border-radius: 5px;
    overflow-x: auto;
    color: #b0c2ce;
  }
</style>
<div class="container">
  <h1>Hercules-Qwen1.5-14B</h1>
</div>

<ul>
  <li>
    <h2>Model Details</h2>
    <div class="section-content">
      <h3>Model Description</h3>
      <p>This model has capabilities in math, coding, function calling, roleplay, and more. We fine-tuned it on 700,000 examples from Hercules-v4.</p>
      <p><strong>Developed by:</strong> M4-ai</p>
      <p><strong>Language(s) (NLP):</strong> English; the base model may also support Chinese</p>
      <p><strong>License:</strong> Tongyi Qianwen license</p>
      <p><strong>Finetuned from model:</strong> <a href="https://huggingface.co/Qwen/Qwen1.5-14B">Qwen1.5-14B</a></p>
    </div>
  </li>
  <li>
    <h2>Uses</h2>
    <div class="section-content">
      <p>General-purpose assistant, question answering, chain-of-thought reasoning, etc.</p>
      <h3>Recommendations</h3>
      <p>Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.</p>
    </div>
  </li>
  <li>
    <h2>Evaluation</h2>
    <div class="section-content">
      <p>Coming soon</p>
    </div>
  </li>
  <li>
    <h2>Training Details</h2>
    <div class="section-content">
      <h3>Training Data</h3>
      <p><a href="https://huggingface.co/datasets/Locutusque/hercules-v4.0">Locutusque/hercules-v4.0</a></p>
      <h4>Training Hyperparameters</h4>
      <p><strong>Training regime:</strong> bf16 non-mixed precision</p>
    </div>
  </li>
  <li>
    <h2>Technical Specifications</h2>
    <div class="section-content">
      <h4>Hardware</h4>
      <p>We trained on 8 Kaggle TPUs with a global batch size of 128 and a sequence length of 1024.</p>
    </div>
  </li>
  <li>
    <h2>Contributions</h2>
    <div class="section-content">
      <p>Thanks to @Tonic, @aloobun, @fhai50032, and @Locutusque for their contributions to this model.</p>
    </div>
  </li>
</ul>
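
To use the model as an assistant, prompts need to be rendered in the chat format the fine-tune expects. As a minimal sketch, this assumes the model keeps the ChatML-style template that Qwen1.5 chat models typically use (the exact template is an assumption; in practice, load the real one with `AutoTokenizer.from_pretrained(...)` and call `tokenizer.apply_chat_template(...)`):

```python
# Minimal sketch of building a ChatML-style prompt for a Qwen1.5-based
# chat model. ASSUMPTION: this fine-tune uses the standard ChatML format
# (<|im_start|>role ... <|im_end|>); verify against the tokenizer's
# actual chat template before relying on it.

def build_chatml_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts as a ChatML prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 7 * 8?"},
]
print(build_chatml_prompt(messages))
```

The resulting string can then be tokenized and passed to the model's `generate` method.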