JuncaiL/llama-8x265m-moe

Text Generation

1 contributor

History: 7 commits

Latest commit: Update README.md (8ebafba, verified), 11 months ago

.gitattributes (1.52 kB): initial commit, 11 months ago
README.md (5.6 kB): Update README.md, 11 months ago
config.json (1.56 kB): fix state_dict loading in MoE model, 11 months ago
configuration_llama_moe.py (4.41 kB): upload llama-8x265m-moe model checkpoint, 11 months ago
generation_config.json (132 Bytes): upload llama-8x265m-moe model checkpoint, 11 months ago
modeling_llama_moe_hf.py (66.7 kB): fix state_dict loading in MoE model, 11 months ago
pytorch_model.bin (3.88 GB, LFS): upload llama-8x265m-moe model checkpoint, 11 months ago
Detected Pickle imports (3)
- "torch.FloatStorage",
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict"
special_tokens_map.json (411 Bytes): upload llama-8x265m-moe model checkpoint, 11 months ago
tokenizer.model (500 kB, LFS): upload llama-8x265m-moe model checkpoint, 11 months ago
tokenizer_config.json (720 Bytes): upload llama-8x265m-moe model checkpoint, 11 months ago
trainer_state.json (71.7 kB): upload llama-8x265m-moe model checkpoint, 11 months ago
training_args.bin (3.95 kB, LFS): upload llama-8x265m-moe model checkpoint, 11 months ago
Detected Pickle imports (8)
- "torch.device",
- "transformers.trainer_utils.IntervalStrategy",
- "transformers.trainer_utils.HubStrategy",
- "accelerate.state.PartialState",
- "accelerate.utils.dataclasses.DistributedType",
- "transformers.training_args.OptimizerNames",
- "transformers.trainer_utils.SchedulerType",
- "transformers.training_args.TrainingArguments"
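
The "Detected Pickle imports" entries for pytorch_model.bin and training_args.bin name the globals that a pickle stream would import when loaded. As a rough illustration of how such a list can be produced without unpickling anything, the sketch below walks the opcodes with the standard-library `pickletools` module. It is not the scanner the hosting site uses; the function name `list_pickle_imports` is hypothetical, and the handling of `STACK_GLOBAL` is a simple heuristic that assumes the module and attribute names are the two most recent string constants.

```python
import collections
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set[str]:
    """Return 'module.name' globals referenced by a pickle stream, without loading it."""
    imports = set()
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols: arg is "module name", separated by one space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"):
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are pushed as strings first.
            # Heuristic: assume they are the two most recent string constants.
            imports.add(f"{strings[-2]}.{strings[-1]}")
    return imports


if __name__ == "__main__":
    # Demo on a small in-memory pickle rather than a real checkpoint.
    payload = pickle.dumps(collections.OrderedDict(a=1), protocol=4)
    print(sorted(list_pickle_imports(payload)))
```

If a checkpoint is stored in PyTorch's zip-based format, the pickle stream is a member inside the archive and would need to be extracted before being passed to a function like this one.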