Hy-MT2-7B-Abliterated-bf16
This repository contains bf16 MLX weights for an abliterated derivative of tencent/Hy-MT2-7B, Tencent Hunyuan's multilingual translation model.
The abliteration method is based on jim-plus/llm-abliteration. This is a community derivative and is not an official Tencent release.
Model Details
- Base model: tencent/Hy-MT2-7B
- Target repository: mlx-community/hy-mt2-7b-abliterated-bf16
- Runtime: MLX / MLX-LM
- Precision: bf16 MLX weights
- Task: multilingual translation
- License: Apache-2.0, following the upstream model license
The bf16 version preserves higher precision than the 8-bit quantized variant and is intended for users who prefer quality over minimum memory footprint.
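As a rough illustration of the memory trade-off, the weight footprint can be estimated from the parameter count and the bits stored per weight. This is a back-of-the-envelope sketch: it takes "7B" literally as 7e9 parameters (an assumption, since the exact count is not stated here), and it ignores quantization scales, the KV cache, and activations.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


# Assumption: "7B" is treated as exactly 7e9 parameters for illustration.
NOMINAL_PARAMS = 7e9

bf16_gib = weight_memory_gib(NOMINAL_PARAMS, 16)  # 2 bytes per weight
int8_gib = weight_memory_gib(NOMINAL_PARAMS, 8)   # 1 byte per weight, no overhead

print(f"bf16: ~{bf16_gib:.1f} GiB, 8-bit: ~{int8_gib:.1f} GiB")
```

Under these assumptions the bf16 weights need roughly twice the memory of an 8-bit variant, before any runtime overhead.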
Usage
Install MLX-LM:
pip install -U mlx-lm
Generate with the command line:
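# The prompt is Chinese for: "Translate the following text into English; output only
# the translated result, with no extra explanation:" then "The weather is really nice today."
mlx_lm.generate \
  --model mlx-community/hy-mt2-7b-abliterated-bf16 \
  --prompt "将以下文本翻译成英语，注意只需要输出翻译后的结果，不要额外解释：\n\n今天天气真好。" \
  --max-tokens 4096 \
  --temp 0.7 \
  --top-p 0.6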
Or use Python:
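from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model_id = "mlx-community/hy-mt2-7b-abliterated-bf16"
model, tokenizer = load(model_id)

# Chinese prompt: "Translate the following text into English; output only the
# translated result, with no extra explanation:" then "The weather is really nice today."
prompt = "将以下文本翻译成英语，注意只需要输出翻译后的结果，不要额外解释：\n\n今天天气真好。"
messages = [{"role": "user", "content": prompt}]

# Wrap the prompt in the model's chat template before generation.
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Recent mlx-lm releases take sampling options through a sampler object
# rather than a `temp` keyword on generate().
sampler = make_sampler(temp=0.7, top_p=0.6)

response = generate(
    model,
    tokenizer,
    prompt=formatted_prompt,
    max_tokens=4096,
    sampler=sampler,
)
print(response)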
Prompting
Hy-MT2 is optimized for translation instructions. Use full language names in prompts when possible.
Example:
Translate the following text into Chinese. Only output the translated result without additional explanation:
The weather is nice today.
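The example prompt can be produced by a small helper so that the instruction wording stays consistent across requests. This is only a sketch: `build_translation_prompt` is a hypothetical name, and the blank line between the instruction and the source text is an assumption modeled on the Chinese prompt in the Usage section.

```python
def build_translation_prompt(text: str, target_language: str) -> str:
    """Return a translation instruction in the style of the example prompt.

    `target_language` should be a full language name such as "Chinese".
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return (
        f"Translate the following text into {target_language}. "
        "Only output the translated result without additional explanation:\n\n"
        f"{text}"
    )


prompt = build_translation_prompt("The weather is nice today.", "Chinese")
print(prompt)
```

The returned string can be used as the `content` of the user message before applying the chat template.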
Recommended generation settings from the upstream Hy-MT2 model card for 1.8B and 7B models:
{
  "temperature": 0.7,
  "top_p": 0.6,
  "top_k": 20,
  "repetition_penalty": 1.05,
  "max_tokens": 4096
}
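One possible way to apply these settings with MLX-LM is to split them into sampler options and logits-processor options. The sketch below assumes a recent mlx-lm release in which `mlx_lm.sample_utils` provides `make_sampler` (accepting `temp`, `top_p`, and `top_k`) and `make_logits_processors` (accepting `repetition_penalty`); older releases may not support every argument. The helper names `split_settings` and `build_generation_kwargs` are hypothetical.

```python
UPSTREAM_SETTINGS = {
    "temperature": 0.7,
    "top_p": 0.6,
    "top_k": 20,
    "repetition_penalty": 1.05,
    "max_tokens": 4096,
}


def split_settings(settings: dict) -> tuple[dict, dict, int]:
    """Split model-card settings into sampler kwargs, processor kwargs, max_tokens."""
    sampler_kwargs = {
        "temp": settings["temperature"],  # mlx-lm calls temperature `temp`
        "top_p": settings["top_p"],
        "top_k": settings["top_k"],
    }
    processor_kwargs = {"repetition_penalty": settings["repetition_penalty"]}
    return sampler_kwargs, processor_kwargs, settings["max_tokens"]


def build_generation_kwargs(settings: dict) -> dict:
    """Build keyword arguments for mlx_lm.generate (assumes a recent mlx-lm)."""
    # Imported lazily so split_settings stays usable without MLX installed.
    from mlx_lm.sample_utils import make_logits_processors, make_sampler

    sampler_kwargs, processor_kwargs, max_tokens = split_settings(settings)
    return {
        "sampler": make_sampler(**sampler_kwargs),
        "logits_processors": make_logits_processors(**processor_kwargs),
        "max_tokens": max_tokens,
    }
```

With a loaded model, this could be used as `generate(model, tokenizer, prompt=formatted_prompt, **build_generation_kwargs(UPSTREAM_SETTINGS))`.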
About Abliteration
Abliteration attempts to identify and remove refusal-related directions in model activations or weights. It can reduce explicit refusal behavior, but it does not guarantee that every refusal is removed, and it may affect translation quality or other model behavior.
This model should be evaluated for your own use case before production use.
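The idea of removing a direction can be illustrated with a plain vector projection. This is a generic sketch of directional ablation on a single vector, not the procedure implemented in jim-plus/llm-abliteration; the function name and the example "refusal direction" are hypothetical.

```python
import math


def remove_direction(vector, direction):
    """Subtract from `vector` its component along `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Scalar projection of the vector onto the unit direction.
    coeff = sum(v * u for v, u in zip(vector, unit))
    return [v - coeff * u for v, u in zip(vector, unit)]


activation = [3.0, 4.0, 0.0]
refusal_direction = [0.0, 1.0, 0.0]  # hypothetical direction, for illustration only
ablated = remove_direction(activation, refusal_direction)
print(ablated)
```

After the projection, the result has no component along the chosen direction, while the components orthogonal to it are unchanged. In a real model the direction has to be estimated from data, which is one reason the outcome is not guaranteed.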
Limitations
- This model inherits the capabilities and limitations of the upstream Hy-MT2-7B model.
- Abliteration may change behavior in ways that are not captured by standard translation benchmarks.
- Users are responsible for complying with applicable laws, platform policies, and the upstream model license.
References
- Base model: tencent/Hy-MT2-7B
- Abliteration method: jim-plus/llm-abliteration
- Hy-MT2 paper: Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild
Citation
@misc{zheng2026hymt2familyfastefficient,
  title={Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild},
  author={Mao Zheng and Zheng Li and Tao Chen and Bo Lv and Mingrui Sun and Mingyang Song and Jinlong Song and Hong Huang and Decheng Wu and Hai Wang and Yifan Song and Yanfeng Chen and Guanwei Zhang},
  year={2026},
  eprint={2605.22064},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2605.22064}
}