---
tags:
- mergekit
- lazymergekit
- abacaj/phi-2-super
base_model:
- abacaj/phi-2-super
---

# phi-2-DLEC
The DLEC (Distributive Layer Expansion Curve) methodology offers a novel approach to improving neural network models by focusing on the strategic duplication of certain effective layers. Developed with the aim of enhancing model performance, DLEC carefully identifies and amplifies the impact of key layers within the model's architecture.
Below is an overview of the method and its implementation, particularly how it integrates with the Hugging Face Transformers library and utilizes PyTorch and BitsAndBytes for efficient operation.
## Overview

**Setting Up:** First, the script ensures all necessary components are in place, from libraries to the model and dataset.
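The setup step might look something like the following sketch. The model name comes from this card's `base_model`; the 4-bit quantization settings are illustrative assumptions for a configuration fragment, since the DLEC script itself is not published here.

```python
# Illustrative setup sketch (assumed, not the published DLEC script):
# load abacaj/phi-2-super in 4-bit with BitsAndBytes via Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4-bit
    bnb_4bit_compute_dtype=torch.float16,  # run compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained("abacaj/phi-2-super")
model = AutoModelForCausalLM.from_pretrained(
    "abacaj/phi-2-super",
    quantization_config=bnb_config,
    device_map="auto",  # place layers automatically across available devices
)
```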
**Database for Activations:** A SQLite database is established to track layer activations, providing a clear view into how individual neurons react and which layers are most influential: these are our 'beneficial layers.'
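A minimal sketch of such an activation store, using Python's built-in `sqlite3`. The table and column names, and the per-layer mean-absolute-activation statistic, are assumptions, as the actual schema is not published:

```python
import sqlite3

def init_activation_db(path=":memory:"):
    # Assumed schema: one row per recorded (layer, statistic) observation.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS layer_activations ("
        " layer_idx INTEGER, mean_abs_activation REAL)"
    )
    return conn

def record_layer(conn, layer_idx, activations):
    # activations: a flat list of floats captured by a forward hook on one layer.
    mean_abs = sum(abs(a) for a in activations) / len(activations)
    conn.execute(
        "INSERT INTO layer_activations VALUES (?, ?)", (layer_idx, mean_abs)
    )

conn = init_activation_db()
record_layer(conn, 0, [0.1, -0.3, 0.2])
record_layer(conn, 1, [1.5, -2.0, 0.5])
conn.commit()
```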
**Analyzing and Identifying:** By analyzing activation data, the script pinpoints which layers are most valuable to the model's performance.
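The selection step could be sketched as a simple top-k ranking over the recorded statistics; the ranking rule and the name `beneficial_layers` are assumptions, not the published criterion:

```python
def beneficial_layers(layer_scores, k=3):
    # layer_scores: {layer_idx: mean_abs_activation}, e.g. aggregated from the DB.
    ranked = sorted(layer_scores, key=layer_scores.get, reverse=True)
    return sorted(ranked[:k])  # return the top-k in architectural order

scores = {0: 0.42, 1: 1.31, 2: 0.18, 3: 0.97, 4: 1.05}
print(beneficial_layers(scores, k=3))  # → [1, 3, 4]
```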
**Configuring DLEC:** A configuration is then created, guiding how the model should incorporate duplicates of these beneficial layers to boost effectiveness without unnecessarily increasing complexity.
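One way such a configuration could be expressed is as a layer order in which each beneficial layer appears twice; the representation and the name `build_dlec_layer_order` are hypothetical:

```python
def build_dlec_layer_order(num_layers, beneficial):
    # Every layer appears once; beneficial layers are duplicated in place,
    # so architectural order is preserved.
    order = []
    for i in range(num_layers):
        order.append(i)
        if i in beneficial:
            order.append(i)  # duplicate the beneficial layer
    return order

print(build_dlec_layer_order(6, {1, 4}))  # → [0, 1, 1, 2, 3, 4, 4, 5]
```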
**Reconfiguring and Running the Model:** Finally, the model is adjusted according to DLEC's insights, focusing enhancement on the identified layers.
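The reconfiguration step can be sketched as rebuilding the layer stack from such an order; `Layer` here is a stand-in for a transformer block, not the actual model class:

```python
import copy

class Layer:
    # Stand-in for a transformer block, used only to illustrate the rebuild.
    def __init__(self, idx):
        self.idx = idx

def expand_stack(layers, layer_order):
    # Deep-copy duplicated entries so each copy can diverge under later training.
    return [copy.deepcopy(layers[i]) for i in layer_order]

stack = [Layer(i) for i in range(6)]
expanded = expand_stack(stack, [0, 1, 1, 2, 3, 4, 4, 5])
print([layer.idx for layer in expanded])  # → [0, 1, 1, 2, 3, 4, 4, 5]
```

Deep-copying matters here: duplicating a layer by reference would tie the two copies' weights together, whereas independent copies can specialize.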
## Key Features

**Selective Layer Duplication:** DLEC doesn't just add more layers; it doubles down on the ones that really matter. This methodical selection ensures we're making the most of the model's capabilities without wasteful expansion.
**Smart Resource Management:** By homing in on specific areas for improvement, DLEC aims to make better use of computational and memory resources, promoting more efficient learning without adding undue complexity to the model.
This approach is about making informed, strategic enhancements to model architecture, prioritizing efficiency and effectiveness in utilizing neural network capabilities.
**This method is still in development. I do not expect it to be "game changing", nor will I oversell it; it is purely done for fun. Please let me know how the model works for you.**
## 🧩 Configuration