---
{}
---
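Trained: reconstruction tokens.

The example below loads the trained embedding weights (`embed_tokens.safetensors`) into a `SemoLlama` model built on `meta-llama/Meta-Llama-3-8B-Instruct`.
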
```python
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

from semo_lm.model import SemoLlama
from semo_lm.semo_utils.prefix_vars import PAD_TOKEN_ID

# Build the model on top of the base Llama 3 checkpoint.
model = SemoLlama.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    pad_token_id=PAD_TOKEN_ID,
)
model.init_sentence_encoder_weights()

# Download the trained embedding weights from the Hub.
repo_id = "MLP-SEMO/Llama-Reconstruction-embedding"
filename = "embed_tokens.safetensors"
downloaded_file = hf_hub_download(repo_id=repo_id, filename=filename)

# Load the weights into the model's token embedding layer.
embedding_weights = load_file(downloaded_file)
model.model.embed_tokens.load_state_dict(embedding_weights)
```
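
`load_state_dict` raises an error when the keys or tensor shapes in the file do not match the embedding layer, for example if the vocabulary size differs from the checkpoint. The following is a minimal sketch of a pre-load check, shown on a toy `nn.Embedding` so it runs without downloading anything. The helper name `check_embedding_state_dict` is hypothetical and not part of `semo_lm`. It assumes the saved file uses the same keys as the embedding layer's own `state_dict()`.

```python
from typing import Dict, List

import torch
from torch import nn


def check_embedding_state_dict(
    embedding: nn.Embedding, state_dict: Dict[str, torch.Tensor]
) -> List[str]:
    """Return a list of problems; an empty list means keys and shapes match."""
    problems = []
    expected = embedding.state_dict()

    # Compare key sets first: nn.Embedding expects a single "weight" entry.
    missing = sorted(set(expected) - set(state_dict))
    unexpected = sorted(set(state_dict) - set(expected))
    if missing:
        problems.append(f"missing keys: {missing}")
    if unexpected:
        problems.append(f"unexpected keys: {unexpected}")

    # For keys present on both sides, compare tensor shapes.
    for name in sorted(set(expected) & set(state_dict)):
        want = tuple(expected[name].shape)
        got = tuple(state_dict[name].shape)
        if want != got:
            problems.append(f"shape mismatch for {name}: expected {want}, got {got}")
    return problems


# Toy stand-in for model.model.embed_tokens (10 tokens, 4 dimensions).
toy_embedding = nn.Embedding(10, 4)
matching = {"weight": torch.zeros(10, 4)}
wrong_shape = {"weight": torch.zeros(12, 4)}
wrong_key = {"embed_tokens.weight": torch.zeros(10, 4)}

for candidate in (matching, wrong_shape, wrong_key):
    print(check_embedding_state_dict(toy_embedding, candidate))
```

With the real model, the same call would take `model.model.embed_tokens` and `embedding_weights` from the snippet above; if the returned list is non-empty, the keys or vocabulary size of the file differ from what the layer expects.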