Instructions to use sayril007/finoma-secondtokenizer with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use sayril007/finoma-secondtokenizer with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="sayril007/finoma-secondtokenizer")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("sayril007/finoma-secondtokenizer") model = AutoModelForCausalLM.from_pretrained("sayril007/finoma-secondtokenizer") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use sayril007/finoma-secondtokenizer with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "sayril007/finoma-secondtokenizer" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "sayril007/finoma-secondtokenizer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/sayril007/finoma-secondtokenizer
- SGLang
How to use sayril007/finoma-secondtokenizer with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "sayril007/finoma-secondtokenizer" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "sayril007/finoma-secondtokenizer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "sayril007/finoma-secondtokenizer" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "sayril007/finoma-secondtokenizer", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use sayril007/finoma-secondtokenizer with Docker Model Runner:
docker model run hf.co/sayril007/finoma-secondtokenizer
| { | |
| "metadata": { | |
| "total_size": 13316947968 | |
| }, | |
| "weight_map": { | |
| "model/decoder/embed_positions/embedding": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/embed_tokens/embedding": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/0/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/1/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/10/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/11/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/12/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/13/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/14/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/15/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/16/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/17/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/18/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/19/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/2/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/20/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/21/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/22/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/23/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/24/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/25/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/26/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/27/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/28/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/out_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/q_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/q_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/v_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn/v_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/29/self_attn_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/fc1/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/fc1/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/fc2/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/fc2/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/final_layer_norm/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/final_layer_norm/scale": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/k_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/k_proj/kernel": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/out_proj/bias": "flax_model-00001-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/3/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/30/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/31/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/4/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/5/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/6/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/7/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/8/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/fc1/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/fc1/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/fc2/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/fc2/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/final_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/final_layer_norm/scale": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/k_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/k_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/out_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/out_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/q_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/q_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/v_proj/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn/v_proj/kernel": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn_layer_norm/bias": "flax_model-00002-of-00002.msgpack", | |
| "model/decoder/layers/9/self_attn_layer_norm/scale": "flax_model-00002-of-00002.msgpack" | |
| } | |
| } | |