docs: update model card
README.md

---
tags:
- llama
---

# IridiumLlama-72B-v0.1

## Model Description
IridiumLlama is a 72B parameter language model created through a merge of Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 using `model_stock`.

This is converted from [leafspark/Iridium-72B-v0.1](https://huggingface.co/leafspark/Iridium-72B-v0.1).

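In mergekit terms, a `model_stock` merge like the one described above corresponds to a recipe along these lines — a minimal sketch assuming mergekit's YAML schema; the repo paths are placeholders, not the recipe actually used:

```yaml
# Illustrative model_stock recipe (mergekit-style).
# Repo paths are placeholders; the actual recipe is not published here.
merge_method: model_stock
base_model: Qwen/Qwen2-72B-Instruct
models:
  - model: calme2.1-72b    # placeholder path
  - model: magnum-72b-v1   # placeholder path
dtype: float16
```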
## Features
- 72 billion parameters
- Sharded in 31 files (unlike Iridium, which has 963 shards due to the merging process)
- Combines Magnum prose with Calme smarts
- Llamaified for easy use
- Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
- Merged layers: 80
- Total tensors: 1,043
- Context length: 32k
### Tensor Distribution
- Attention layers: 560 files
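The shard and tensor counts above are the kind of thing a conversion script reads from the safetensors index; a minimal sketch, assuming the standard `model.safetensors.index.json` layout (the index below is a toy stand-in, not this model's real index):

```python
import json

def shard_and_tensor_counts(index_json: str):
    """Return (number of shard files, number of tensors) from a safetensors
    index, whose weight_map maps tensor name -> shard filename."""
    index = json.loads(index_json)
    weight_map = index["weight_map"]
    return len(set(weight_map.values())), len(weight_map)

# Toy index standing in for the real model.safetensors.index.json,
# which for this model should report 31 shards and 1,043 tensors.
toy_index = json.dumps({
    "metadata": {"total_size": 145 * 10**9},
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00031.safetensors",
        "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00031.safetensors",
        "model.layers.0.mlp.gate_proj.weight": "model-00002-of-00031.safetensors",
    },
})

print(shard_and_tensor_counts(toy_index))  # (2, 3) for the toy index
```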

Custom script utilizing safetensors library.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained("leafspark/IridiumLlama-72B-v0.1",
                                             device_map="auto",
                                             torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("leafspark/IridiumLlama-72B-v0.1")
```
### GGUFs
Find them here: [leafspark/IridiumLlama-72B-v0.1-GGUF](https://huggingface.co/leafspark/IridiumLlama-72B-v0.1-GGUF)
### Hardware Requirements
- At least ~150GB of free disk space
- ~150GB VRAM
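The ~150GB figures line up with straightforward fp16 arithmetic — a rough lower bound that ignores KV cache and activation overhead:

```python
# Back-of-the-envelope weight memory for a 72B model in float16.
params = 72e9          # 72 billion parameters
bytes_per_param = 2    # float16 uses 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB of raw weights")  # ~144 GB, hence the ~150GB figures
```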