monsoon-nlp committed on
Commit
5726e08
1 Parent(s): 3e20386

Update README.md

Files changed (1)
  1. README.md +26 -7
README.md CHANGED
@@ -1,22 +1,41 @@
  ---
  language:
  - en
- license: apache-2.0
  tags:
  - text-generation-inference
- - transformers
  - unsloth
  - llama
  - trl
  base_model: gradientai/Llama-3-8B-Instruct-262k
  ---

- # Uploaded model

- - **Developed by:** monsoon-nlp
- - **License:** apache-2.0
- - **Finetuned from model :** gradientai/Llama-3-8B-Instruct-262k

  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
  ---
  language:
  - en
+ license: llama3
  tags:
  - text-generation-inference
  - unsloth
  - llama
  - trl
+ - dna
  base_model: gradientai/Llama-3-8B-Instruct-262k
  ---

+ # llama3-dnapretrain-kaniwa

+ This is a LoRA adapter.
+
+ The base model is the longer-context LLaMA-3-8b-Instruct developed by Gradient and Crusoe: `gradientai/Llama-3-8B-Instruct-262k`
+
+ The dataset was part of BYU's 2019 kaniwa (*Chenopodium pallidicaule*) genome, from https://genomevolution.org/coge/GenomeInfo.pl?gid=53872
+
+ The adapter was finetuned for 3 hours on an A100. The data was split into ~20k nucleotide snippets with an Alpaca-like message format.
+
+ Training Notebook: https://colab.research.google.com/drive/1XZcCYGFQGtz3_AKSR4F67WYXl6DIwP4R
+
+ Sample message:
+ ```
+ Write information about the nucleotide sequence.
+
+ ### Sequence:
+ GCCTATAGTGTGTAGCTAATGAGCCTAGGTTATCGACCCTAATCT...
+
+ ### Annotation:
+ Information about location in the kaniwa chromosome: >lcl|Cp5
+ ```

  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

+ **Genome Citation**
+
+ Mangelson H, et al. The genome of *Chenopodium pallidicaule*: an emerging Andean super grain. Appl. Plant Sci. 2019;7:e11300. doi: 10.1002/aps3.11300
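
The updated README says the genome data was split into ~20k nucleotide snippets and wrapped in an Alpaca-like message. The following is a minimal sketch of what that preprocessing could look like, not the code from the training notebook. The function names `split_sequence` and `format_example` are hypothetical, the chunk size of 20,000 is taken from the README's "~20k" figure, and non-overlapping chunks are an assumption. The prompt layout follows the sample message in the diff.

```python
from typing import Iterator

# Instruction line copied from the sample message in the README.
INSTRUCTION = "Write information about the nucleotide sequence."


def split_sequence(sequence: str, chunk_size: int = 20_000) -> Iterator[str]:
    """Yield consecutive, non-overlapping snippets of at most chunk_size nucleotides.

    chunk_size=20_000 mirrors the "~20k" figure in the README; the exact value
    and whether the original snippets overlapped are assumptions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(sequence), chunk_size):
        yield sequence[start:start + chunk_size]


def format_example(snippet: str, record_id: str) -> str:
    """Render one snippet in the Alpaca-like layout shown in the sample message."""
    return (
        f"{INSTRUCTION}\n\n"
        f"### Sequence:\n{snippet}\n\n"
        f"### Annotation:\n"
        f"Information about location in the kaniwa chromosome: {record_id}"
    )


if __name__ == "__main__":
    # Toy input: a short made-up sequence, not real kaniwa data.
    toy_sequence = "GCCTATAGTG" * 5  # 50 nucleotides
    for snippet in split_sequence(toy_sequence, chunk_size=20):
        print(format_example(snippet, record_id=">lcl|Cp5"))
        print("-" * 20)
```

In practice the `record_id` would come from the FASTA header of the record each snippet was cut from (such as `>lcl|Cp5` in the sample), and the last snippet of each record may be shorter than `chunk_size`.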