JuncaiL/llama-8x265m-moe
Text Generation
Transformers
PyTorch
wikipedia
allenai/c4
English
llama_moe
MoE
custom_code
arxiv:2305.09781
Branch: main
1 contributor
History: 7 commits
Latest commit: Update README.md (8ebafba, verified) by JuncaiL, 10 months ago
| File | Scan status | Size | LFS | Last commit message | Updated |
|---|---|---|---|---|---|
| .gitattributes | Safe | 1.52 kB | | initial commit | 10 months ago |
| README.md | Safe | 5.6 kB | | Update README.md | 10 months ago |
| config.json | Safe | 1.56 kB | | fix state_dict loading in MoE model | 10 months ago |
| configuration_llama_moe.py | Safe | 4.41 kB | | upload llama-8x265m-moe model checkpoint | 10 months ago |
| generation_config.json | Safe | 132 Bytes | | upload llama-8x265m-moe model checkpoint | 10 months ago |
| modeling_llama_moe_hf.py | Safe | 66.7 kB | | fix state_dict loading in MoE model | 10 months ago |
| pytorch_model.bin | Safe, pickle | 3.88 GB | LFS | upload llama-8x265m-moe model checkpoint | 10 months ago |
| special_tokens_map.json | Safe | 411 Bytes | | upload llama-8x265m-moe model checkpoint | 10 months ago |
| tokenizer.model | Safe | 500 kB | LFS | upload llama-8x265m-moe model checkpoint | 10 months ago |
| tokenizer_config.json | Safe | 720 Bytes | | upload llama-8x265m-moe model checkpoint | 10 months ago |
| trainer_state.json | Safe | 71.7 kB | | upload llama-8x265m-moe model checkpoint | 10 months ago |
| training_args.bin | pickle | 3.95 kB | LFS | upload llama-8x265m-moe model checkpoint | 10 months ago |

Detected pickle imports in pytorch_model.bin (3): "torch.FloatStorage", "torch._utils._rebuild_tensor_v2", "collections.OrderedDict"

Detected pickle imports in training_args.bin (8): "torch.device", "transformers.trainer_utils.IntervalStrategy", "transformers.trainer_utils.HubStrategy", "accelerate.state.PartialState", "accelerate.utils.dataclasses.DistributedType", "transformers.training_args.OptimizerNames", "transformers.trainer_utils.SchedulerType", "transformers.training_args.TrainingArguments"
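The "detected pickle imports" reported for the two `.bin` files are the module-level names a pickle stream asks the unpickler to import. As a rough illustration of how such a list can be derived without executing the pickle, the sketch below walks the opcodes with the standard-library `pickletools` module. The helper name `list_pickle_imports` is hypothetical, this is not the Hub's scanner, and it assumes a raw pickle byte stream: checkpoint files such as `pytorch_model.bin` may wrap their pickle data in a container that this sketch does not unpack.

```python
import io
import pickle
import pickletools
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> set[str]:
    """Return 'module.name' for each global a pickle stream references.

    Hypothetical helper: it only reads opcodes and never unpickles anything.
    """
    imports = set()
    strings = []  # string arguments seen so far, needed for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name", space-separated.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are pushed as strings beforehand.
            # Heuristic: assume they are the two most recent string arguments.
            imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


# Example on a small in-memory pickle (not one of the repository files).
payload = pickle.dumps(OrderedDict(a=1), protocol=2, fix_imports=False)
found = list_pickle_imports(payload)
print(sorted(found))
```

Listing imports this way only shows which names a file would resolve on load. Whether those names are acceptable is a separate judgment, and a file with an unexpected import should not be unpickled.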