license: apache-2.0
language:
  - fr
  - it
  - de
  - es
  - en
tags:
  - moe

Mixtral-8x22B Original

This model checkpoint is provided as-is and might not be up to date. Please use the corresponding version from the mistralai organization: https://huggingface.co/mistralai

The original weights for Mixtral 8x22B.

These weights are unconverted. For converted weights, see mistral-community/Mixtral-8x22B-v0.1.

Joining Files

(Untested) script to join the files back together (run this where you downloaded the files):

import os
import shutil
from tqdm import tqdm

# Directory holding the downloaded shards (the current directory by default).
directory = '.'
output_file = 'consolidated.safetensors'

print('Opening files...')
# Keep only the numbered shards: consolidated.safetensors.0, .1, ...
files = os.listdir(directory)
files = [file for file in files if file.startswith(output_file + '.')]
# Sort numerically so that shard 10 comes after shard 9, not after shard 1.
files_sorted = sorted(files, key=lambda x: int(x.split('.')[-1]))

with open(os.path.join(directory, output_file), 'wb') as output:
    for file in tqdm(files_sorted, desc='Joining Files'):
        with open(os.path.join(directory, file), 'rb') as input_file:
            # Copy in chunks instead of loading a whole shard into memory.
            shutil.copyfileobj(input_file, output)
print('Done!')

It may take a while.
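
Since the join script is untested, a quick sanity check on the result may be worthwhile. The sketch below is not part of the original release. It relies only on the published safetensors layout: an 8-byte little-endian header length, then a JSON header whose `data_offsets` are relative to the start of the tensor data. It compares the size implied by the header with the size of the file on disk. The function name `check_joined_file` is hypothetical.

```python
import json
import os
import struct


def check_joined_file(path):
    """Return True if the file size matches what the safetensors header implies."""
    with open(path, 'rb') as f:
        # First 8 bytes: length of the JSON header (unsigned 64-bit, little-endian).
        (header_len,) = struct.unpack('<Q', f.read(8))
        header = json.loads(f.read(header_len))

    # Each tensor entry stores [begin, end) offsets relative to the data section.
    # The optional '__metadata__' entry carries no offsets, so it is skipped.
    ends = [info['data_offsets'][1]
            for name, info in header.items()
            if name != '__metadata__']
    expected_size = 8 + header_len + max(ends, default=0)
    return os.path.getsize(path) == expected_size
```

Calling `check_joined_file('consolidated.safetensors')` after the join should return `True`. A `False` result, or an exception while parsing the header, suggests a missing, truncated, or misordered part. The check looks only at sizes and does not validate the tensor contents.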

Conversion to HF

The script for converting these weights to the Hugging Face format is in the Transformers repo.

License

Confirmed as Apache 2.0: https://x.com/arthurmensch/status/1778308399144333411