jeffra committed on
Commit a519f10
1 Parent(s): 89831d5

Update README.md

Files changed (1)
  1. README.md +85 -0
README.md CHANGED
---
license: apache-2.0
tags:
- snowflake
- arctic
---

## Model Details

Arctic is a Dense-MoE Hybrid transformer architecture pre-trained from scratch by the Snowflake AI
Research Team. We are releasing model checkpoints for both the base and instruct-tuned versions of
Arctic under an Apache-2.0 license. This means you can use them freely in your own research,
prototypes, and products.

* [Arctic-Base](link-here)
* [Arctic-Instruct](link-to-instruct)

**Model developers** Snowflake

**License** Apache-2.0

**Input** Models input text only.

**Output** Models generate text and code only.

**Model Release Date** April 24th, 2024.

## Model Architecture

Arctic combines a 10B dense transformer model with a residual 128x3.66B MoE MLP, resulting in 480B
total and 17B active parameters chosen using top-2 gating. For more details about Arctic's model
architecture, please see our cookbook.
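
To make the parameter counts above concrete, here is a rough back-of-the-envelope sketch. The sizes are the approximate figures quoted in this section, not exact values read from the released checkpoint:

```python
# Rough parameter arithmetic for the Dense-MoE Hybrid design described above.
dense_params = 10e9       # dense transformer backbone
num_experts = 128         # experts in the residual MoE MLP
expert_params = 3.66e9    # approximate parameters per expert
top_k = 2                 # experts selected per token by the top-2 gate

total_params = dense_params + num_experts * expert_params
active_params = dense_params + top_k * expert_params

print(f"total parameters:  ~{total_params / 1e9:.0f}B")   # ~478B (quoted as 480B)
print(f"active parameters: ~{active_params / 1e9:.1f}B")  # ~17.3B (quoted as 17B)
```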

## Usage

As of 4/24/2024 we are actively working with the maintainers of `transformers` to include the Arctic
model implementation. Until this support is released, please follow these instructions to get the
required dependencies for using Arctic:

```bash
pip install git+https://github.com/Snowflake-Labs/transformers.git
```

Arctic leverages several features from [DeepSpeed](https://github.com/microsoft/DeepSpeed), so you will need to
install the latest version of DeepSpeed to get all of the required features:

```bash
pip install "deepspeed>=0.15.0"
```
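
You can quickly confirm that the installed version meets this requirement (a generic check, nothing Arctic-specific):

```python
import deepspeed
print(deepspeed.__version__)  # should print 0.15.0 or newer
```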

### Inference

To get the best performance with Arctic we highly recommend using TRT-LLM or vLLM for inference. However, you
can also use `transformers` to load the model for text generation. Due to the model size we recommend using a
single 8xH100 instance from your favorite cloud provider, such as AWS [p5.48xlarge](https://aws.amazon.com/ec2/instance-types/p5/),
Azure [ND96isr_H100_v5](https://learn.microsoft.com/en-us/azure/virtual-machines/nd-h100-v5-series), etc.

In addition, if you would like to access Arctic via API, we have collaborated with several inference API
providers to host Arctic, such as AWS, Microsoft Azure, NVIDIA Foundry, Lamini, Perplexity, Replicate and Together.

The example below shows basic text generation with `transformers`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("snowflake/arctic")
# device_map="auto" spreads the sharded checkpoint across all available GPUs
model = AutoModelForCausalLM.from_pretrained("snowflake/arctic", device_map="auto", torch_dtype=torch.bfloat16)

input_text = "Hello my name is "
# The tokenizer returns a dict of tensors (input_ids, attention_mask); move it to GPU
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```
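
If you would rather use vLLM, as recommended above, the sketch below shows what that might look like. It assumes a vLLM build that already includes Arctic support; the exact arguments (for example `tensor_parallel_size`, set here for an 8-GPU node, or whether `trust_remote_code` is required) may differ in practice:

```python
from vllm import LLM, SamplingParams

# Hypothetical setup: assumes Arctic support is available in your vLLM build.
llm = LLM(
    model="snowflake/arctic",
    tensor_parallel_size=8,   # shard across the 8 GPUs of an 8xH100 node
    trust_remote_code=True,
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=20)
outputs = llm.generate(["Hello my name is "], sampling_params)
print(outputs[0].outputs[0].text)
```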

### Fine-Tuning

TODO: add link and extra details about fine-tuning scripts

## Metrics

TODO: add summary of metrics here, we don't necessarily need to compare to others but we can if we want

## Training Data

TODO: add short description and links to training data related cookbook(s)