---
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- pytorch
- mistral
- inferentia2
- neuron
---
# Neuronx model for [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)

This repository contains [**AWS Inferentia2**](https://aws.amazon.com/ec2/instance-types/inf2/) and [`neuronx`](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/) compatible checkpoints for [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
You can find detailed information about the base model on its [Model Card](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).

This model has been exported to the `neuron` format using the specific `input_shapes` and `compiler` parameters detailed in the sections below.

Please refer to the 🤗 `optimum-neuron` [documentation](https://huggingface.co/docs/optimum-neuron/main/en/guides/models#configuring-the-export-of-a-generative-model) for an explanation of these parameters.

## Usage on Amazon SageMaker

_coming soon_

## Usage with 🤗 `optimum-neuron`

```python
>>> from optimum.neuron import pipeline

>>> p = pipeline('text-generation', 'aws-neuron/Mistral-7B-Instruct-v0.1-neuron-4x2048-24-cores')
>>> p("My favorite place on earth is", max_new_tokens=64, do_sample=True, top_k=50)
[{'generated_text': 'My favorite place on earth is the ocean. It is where I feel most
at peace. I love to travel and see new places. I have a'}]
```

This repository contains tags specific to versions of `neuronx`. When using it with 🤗 `optimum-neuron`, load the repository revision that matches the version of `neuronx` you are running, so that the right serialized checkpoints are picked up.

## Arguments passed during export

**input_shapes**

```json
{
  "batch_size": 4,
  "sequence_length": 2048
}
```

**compiler_args**

```json
{
  "auto_cast_type": "bf16",
  "num_cores": 24
}
```