---
license: mit
library_name: transformers
pipeline_tag: text-generation
tags:
- tensormind
- causal-lm
- text-generation
- chinese
- custom-code
language:
- zh
- en
model-index:
- name: TensorMind
  results:
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: C-Eval
    metrics:
    - type: accuracy
      value: 27.27
      name: C-Eval (0-shot)
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: CMMLU
    metrics:
    - type: accuracy
      value: 25.26
      name: CMMLU (0-shot)
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: A-CLUE
    metrics:
    - type: accuracy
      value: 25.43
      name: A-CLUE (0-shot)
  - task:
      type: text-generation
      name: Chinese Multiple-Choice Evaluation
    dataset:
      type: custom
      name: TMMLU+
    metrics:
    - type: accuracy
      value: 24.96
      name: TMMLU+ (0-shot)
---
# TensorMind (0.5B)
TensorMind is a 536.9M-parameter causal language model for lightweight Chinese/English text generation.
## Model Details

- Architecture: decoder-only Transformer (`TensorMindForCausalLM`)
- Layers: 32
- Hidden size: 1024
- Heads / KV heads: 16 / 8 (grouped-query attention, GQA)
- Context length: 32,768
- Vocab size: 32,768
- Positional encoding: RoPE
- Activation: SiLU
- Parameters: 536,941,568 (~0.5B)
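The dimensions above roughly determine the parameter count. As a sanity check, the sketch below re-derives it; the MLP intermediate size (4096), a SwiGLU-style MLP, tied input/output embeddings, and per-layer RMSNorm weights are assumptions not stated in this card, chosen because they land within ~0.001% of the published total.

```python
# Hedged sketch: re-deriving the ~0.5B parameter count from the listed
# dimensions. intermediate size, weight tying, and norm layout are
# ASSUMPTIONS, not facts from this model card.
hidden = 1024
layers = 32
vocab = 32_768
n_heads, n_kv_heads = 16, 8
head_dim = hidden // n_heads              # 64
intermediate = 4096                       # assumption

embed = vocab * hidden                    # assumed tied with lm_head

# GQA attention: full-size Q and O projections, KV shrunk to 8 heads
attn = (hidden * hidden                               # q_proj
        + 2 * hidden * (n_kv_heads * head_dim)        # k_proj, v_proj
        + hidden * hidden)                            # o_proj

# SwiGLU-style MLP (gate, up, down) -- assumed from the SiLU activation
mlp = 3 * hidden * intermediate

norms = 2 * hidden                        # assumed per-layer RMSNorm weights

total = embed + layers * (attn + mlp + norms) + hidden  # + final norm
print(total)  # → 536937472, within ~0.001% of the stated 536,941,568
```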
## Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "TensorMind/TensorMind"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)

prompt = "请用三句话介绍一下你自己。"  # "Please introduce yourself in three sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Benchmark Snapshot

Evaluation time: 2026-03-07 00:40 (UTC+8). All scores are zero-shot (n-shot = 0) accuracy.
| Model | Params | C-Eval | CMMLU | A-CLUE | TMMLU+ | AGIEval |
|---|---|---|---|---|---|---|
| TensorMind | 0.5B | 27.27 | 25.26 | 25.43 | 24.96 | 33.56 |
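These benchmarks are typically scored by having the model assign a log-likelihood to each answer option and picking the highest-scoring one; accuracy is the fraction of questions where that pick matches the gold answer. A minimal, model-agnostic sketch with toy scores (the actual evaluation harness is not published in this card):

```python
# Hedged sketch of zero-shot multiple-choice scoring with TOY numbers,
# not the real harness or real model log-likelihoods.

def pick_answer(option_logliks: dict[str, float]) -> str:
    """Return the option key with the highest log-likelihood."""
    return max(option_logliks, key=option_logliks.get)

def accuracy(items: list[tuple[dict[str, float], str]]) -> float:
    """items: (per-option log-likelihoods, gold answer key) pairs."""
    correct = sum(pick_answer(scores) == gold for scores, gold in items)
    return correct / len(items)

# Toy example: with 4 options, random guessing scores ~25%, which is
# the baseline to keep in mind when reading the table above.
toy = [
    ({"A": -4.2, "B": -3.1, "C": -5.0, "D": -4.8}, "B"),  # picks B: correct
    ({"A": -2.9, "B": -3.3, "C": -3.0, "D": -4.1}, "C"),  # picks A: wrong
]
print(accuracy(toy))  # → 0.5
```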
## Intended Use
- Lightweight chat and text generation
- Local experimentation and teaching
- Baseline model for research and fine-tuning
## Limitations
- This is a small model and can produce factual errors (hallucinations).
- The benchmark numbers above come from multiple-choice style evaluations and do not fully reflect open-ended generation quality.
- Outputs may contain biased or unsafe content; apply content filtering before production use.
## License

MIT License.

