Steelskull committed on
Commit
ea1aec4
1 Parent(s): f23a241

Update README.md

  1. README.md +24 -24
README.md CHANGED
@@ -4,35 +4,35 @@ tags:
4
  - mergekit
5
  - lazymergekit
6
  - abacaj/phi-2-super
7
- - abacaj/phi-2-super
8
- - abacaj/phi-2-super
9
- - abacaj/phi-2-super
10
- - abacaj/phi-2-super
11
- - abacaj/phi-2-super
12
- - abacaj/phi-2-super
13
- - abacaj/phi-2-super
14
  base_model:
15
  - abacaj/phi-2-super
16
- - abacaj/phi-2-super
17
- - abacaj/phi-2-super
18
- - abacaj/phi-2-super
19
- - abacaj/phi-2-super
20
- - abacaj/phi-2-super
21
- - abacaj/phi-2-super
22
- - abacaj/phi-2-super
23
- ---
24
 
25
  # phi-2-DLEC
26
 
27
- phi-2-DLEC is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
28
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
29
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
30
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
31
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
32
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
33
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
34
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
35
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
36
 
37
  ## 🧩 Configuration
38
 
 
4
  - mergekit
5
  - lazymergekit
6
  - abacaj/phi-2-super
 
 
 
 
 
 
 
7
  base_model:
8
  - abacaj/phi-2-super
 
 
 
 
 
 
 
 
9
 
10
  # phi-2-DLEC
11
 
12
+ The DLEC (Distributive Layer Expansion Curve) methodology offers a novel approach to improving neural network models by focusing on the strategic duplication of certain effective layers.
13
+ Developed with the aim of enhancing model performance, DLEC carefully identifies and amplifies the impact of key layers within the model's architecture.
14
+
15
+ Below is an overview of the method and its implementation, particularly in how it integrates with the Hugging Face Transformers library and utilizes PyTorch and BitsAndBytes for efficient operation.
16
+
17
+ Overview
18
+ Setting Up: First, the script ensures all necessary components are in place, from libraries to the model and dataset.
19
+
20
+ Database for Activations: A SQLite database is established to track layer activations, providing a clear view into how individual neurons react and which layers are most influential — these are our 'beneficial layers.'
21
+
22
+ Analyzing and Identifying: By analyzing activation data, the script pinpoints which layers are most valuable to the model's performance.
23
+
24
+ Configuring DLEC: A configuration is then created, guiding how the model should incorporate duplicates of these beneficial layers to boost effectiveness without unnecessarily increasing complexity.
25
+
26
+ Reconfiguring and Running the Model: Finally, the model is adjusted according to DLEC's insights, focusing enhancement on the identified layers.
27
+
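The activation-tracking and analysis steps above could be sketched roughly as follows. This is not the author's script: the table schema, the `record_activations` and `rank_layers` helpers, and the choice of mean absolute activation as the ranking score are all assumptions made for illustration, and the toy rows stand in for activations that would really come from model forward passes.

```python
import sqlite3


def record_activations(conn, activations):
    """Store (layer, neuron, value) rows in a hypothetical 'activations' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activations ("
        "layer INTEGER, neuron INTEGER, value REAL)"
    )
    conn.executemany(
        "INSERT INTO activations (layer, neuron, value) VALUES (?, ?, ?)",
        activations,
    )
    conn.commit()


def rank_layers(conn, top_k):
    """Return the top_k layer indices ordered by mean absolute activation.

    Mean absolute activation is an assumed scoring rule, not necessarily
    the one DLEC uses. Ties are broken by the lower layer index.
    """
    rows = conn.execute(
        "SELECT layer FROM activations "
        "GROUP BY layer "
        "ORDER BY AVG(ABS(value)) DESC, layer ASC "
        "LIMIT ?",
        (top_k,),
    ).fetchall()
    return [layer for (layer,) in rows]


# Toy data: three layers with two neurons each.
conn = sqlite3.connect(":memory:")
record_activations(
    conn,
    [
        (0, 0, 0.1), (0, 1, -0.2),
        (1, 0, 0.9), (1, 1, -1.1),
        (2, 0, 0.5), (2, 1, 0.4),
    ],
)
beneficial_layers = rank_layers(conn, top_k=2)
print(beneficial_layers)
```

An in-memory database keeps the sketch self-contained; a file path passed to `sqlite3.connect` would persist activations across runs.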
28
+ Key Features:
29
+ Selective Layer Duplication: DLEC doesn't just add more layers; it doubles down on the ones that really matter. This methodical selection ensures we're making the most of the model's capabilities without wasteful expansion.
30
+
31
+ Smart Resource Management: By homing in on specific areas for improvement, DLEC aims to make better use of computational and memory resources, promoting more efficient learning without adding undue complexity to the model.
32
+
33
+ This approach is about making informed, strategic enhancements to model architecture, prioritizing efficiency and effectiveness in utilizing neural network capabilities.
34
+
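To make selective layer duplication concrete, here is a minimal sketch that turns a list of beneficial layer indices into a layer-stacking plan. It assumes the plan is laid out like a mergekit `passthrough` configuration (a list of `slices`, each with a half-open `layer_range`); the `build_passthrough_config` helper and the toy layer count are hypothetical, and the configuration actually used for this model may differ.

```python
def _slice(model, start, end):
    """One slice entry covering layers [start, end) of a single model."""
    return {"sources": [{"model": model, "layer_range": [start, end]}]}


def build_passthrough_config(model, num_layers, beneficial_layers, dtype="bfloat16"):
    """Build a stacking plan in which each beneficial layer appears twice.

    Hypothetical helper: every selected layer is followed immediately by
    one duplicate of itself; all other layers are kept once, in order.
    """
    slices = []
    start = 0
    for layer in sorted(set(beneficial_layers)):
        if not 0 <= layer < num_layers:
            raise ValueError(f"layer {layer} is outside 0..{num_layers - 1}")
        # Original layers up to and including the beneficial one.
        slices.append(_slice(model, start, layer + 1))
        # The duplicate of the beneficial layer.
        slices.append(_slice(model, layer, layer + 1))
        start = layer + 1
    if start < num_layers:
        # Remaining layers after the last duplicated one.
        slices.append(_slice(model, start, num_layers))
    return {"slices": slices, "merge_method": "passthrough", "dtype": dtype}


# Toy example: a 6-layer model with layers 1 and 4 marked as beneficial.
config = build_passthrough_config("abacaj/phi-2-super", 6, [1, 4])
ranges = [s["sources"][0]["layer_range"] for s in config["slices"]]
total_layers = sum(end - start for start, end in ranges)
print(ranges, total_layers)
```

Only the selected layers are repeated, so the expanded model grows by exactly one layer per beneficial layer rather than by a whole duplicated block.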
35
+ # This method is still in development. I do not expect "game-changing" results, nor will I oversell it; it is done purely for fun. Please let me know how the model works for you.
36
 
37
  ## 🧩 Configuration
38