How to use chendren/qwen2.5-0.5b-cx-lam with MLX:
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm
# if on a CUDA device, also pip install mlx[cuda]

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("chendren/qwen2.5-0.5b-cx-lam")
prompt = "Once upon a time in"
text = generate(model, tokenizer, prompt=prompt, verbose=True)
How to use chendren/qwen2.5-0.5b-cx-lam with MLX LM:
Generate or start a chat session
# Install MLX LM
uv tool install mlx-lm
# Generate some text
mlx_lm.generate --model "chendren/qwen2.5-0.5b-cx-lam" --prompt "Once upon a time"
Qwen2.5-0.5B Iterative CX LAM
A Large Action Model fine-tuned for Customer Experience (CX) and CRM workflows with native iterative / closed-loop support. It runs entirely locally on Apple Silicon using MLX.
Unlike one-shot planners, this version is trained and prompted for step-by-step operation: the model proposes one action, receives execution results as feedback, and decides the next action until the interaction is complete.
This is a research / proof-of-concept release. It demonstrates how to build a local LAM with a perception-planning-execution-adaptation loop (inspired by AI21 LAM ideas) using a small model plus a deterministic pipeline.
Model Details
- Base: Qwen2.5-0.5B-Instruct (4-bit mlx-community version)
- Method: LoRA (rank 8)
- Training: 300 iterations on Apple M4 Max
- Adapter size: ~11.7 MB
- Training data: Iterative step-by-step expansions (~9.2k examples) derived from 2,346 full trajectories
- Total trajectories generated: 2,346 (from two public sources)
Datasets Used
Trajectories were synthesized from:
- bitext/Bitext-customer-support-llm-chatbot-training-dataset (1,996 trajectories)
- knkarthick/dialogsum (350 trajectories, used as a transcript-style proxy)
See the companion dataset: chendren/cx-lam-trajectories
Iterative training data is generated on-the-fly or via scripts/build_iterative_data.py.
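The expansion from full trajectories to step-by-step examples can be pictured as follows. This is a minimal sketch, not the contents of scripts/build_iterative_data.py: the function name expand_trajectory, the dict keys, and the example request are assumptions made for illustration. A trajectory with N actions yields N examples, which is consistent with roughly 9.2k examples from 2,346 trajectories (about four steps each).

```python
from typing import Dict, List


def expand_trajectory(request: str, actions: List[str]) -> List[Dict[str, object]]:
    """Turn one full trajectory into one training example per step.

    Hypothetical format: each example holds the request, the actions
    completed so far, and the single next action as the target.
    """
    examples = []
    for i, action in enumerate(actions):
        examples.append(
            {
                "request": request,
                "completed": list(actions[:i]),  # prefix already executed
                "next_action": action,           # target for this step
            }
        )
    return examples


examples = expand_trajectory(
    "Customer reports a billing discrepancy",
    ["screenpop", "create_case", "list_cases", "log_call"],
)
for ex in examples:
    print(ex["completed"], "->", ex["next_action"])
```

The first example has an empty "completed" list, which corresponds to the initial call; every later example corresponds to a continuation step.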
How the LAM Works (Iterative Mode)
We use a narrow contract with two prompt styles:
- Initial call: Uses the classic full-plan prefix ending at "### Execute\n[".
- Continuation / adaptation steps: Uses build_continuation_prompt, which surfaces:
  - Current observation + rich state summary ("contact loaded, case created, actions so far...")
  - "### Completed so far"
  - "### Next Action\n["
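To make the continuation style concrete, here is a rough sketch of how such a prompt might be assembled. It is not the real lam/format.py:build_continuation_prompt: the function name sketch_continuation_prompt, its parameters, and the "### Observation" and "### State" headers are assumptions. Only the "### Completed so far" and "### Next Action\n[" markers come from the description above.

```python
from typing import List


def sketch_continuation_prompt(
    observation: str, state_summary: str, completed: List[str]
) -> str:
    """Assemble a continuation-style prompt (illustrative layout only)."""
    completed_block = "\n".join(f"- {a}" for a in completed) or "(none)"
    return (
        # The two headers below are assumed, not taken from the repo.
        f"### Observation\n{observation}\n"
        f"### State\n{state_summary}\n"
        # These two markers are the ones named in the model card.
        f"### Completed so far\n{completed_block}\n"
        "### Next Action\n["
    )


prompt = sketch_continuation_prompt(
    observation="create_case succeeded",
    state_summary="contact loaded, case created",
    completed=["screenpop", "create_case"],
)
print(prompt)
```

Ending the prompt on an open bracket nudges the model to continue directly with an action list rather than free-form text.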
The runtime loop (run_lam_session / --iterative):
- Model proposes (typically) one action.
- CRMSimulator executes it and returns structured results.
- Results are turned into a new observation via build_adaptation_observation.
- Repeat (up to max_steps) until log_call succeeds or the model emits nothing.
A deterministic post-processing pipeline still handles perception derivation, arg filling, and validation.
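The propose-execute-observe loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code behind run_lam_session: ToySimulator, run_loop, and scripted_policy are hypothetical names, the simulator always succeeds, and a fixed plan stands in for the model.

```python
from typing import Callable, Dict, List, Optional


class ToySimulator:
    """Stand-in for CRMSimulator; only records which actions ran."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, action: str) -> Dict[str, object]:
        self.executed.append(action)
        return {"action": action, "ok": True}


def run_loop(
    propose: Callable[[str, List[str]], Optional[str]],
    request: str,
    max_steps: int = 6,
) -> Dict[str, object]:
    """Propose one action, execute it, feed the result back, repeat."""
    sim = ToySimulator()
    observation = request
    steps: List[Dict[str, object]] = []
    terminal = False
    for _ in range(max_steps):
        action = propose(observation, sim.executed)
        if action is None:  # the model emitted nothing
            break
        result = sim.execute(action)
        steps.append({"actions": [action], "result": result})
        if action == "log_call" and result["ok"]:
            terminal = True  # a successful log_call ends the session
            break
        # Placeholder for build_adaptation_observation (assumed format).
        observation = f"{action} returned ok={result['ok']}"
    return {"steps": steps, "num_steps": len(steps), "terminal": terminal}


def scripted_policy(observation: str, done: List[str]) -> Optional[str]:
    """Fixed plan standing in for the model's proposals."""
    plan = ["screenpop", "create_case", "list_cases", "log_call"]
    return plan[len(done)] if len(done) < len(plan) else None


sess = run_loop(scripted_policy, "Customer asks about renewal pricing")
print("Steps:", sess["num_steps"], "terminal:", sess["terminal"])
```

The returned dict mirrors the keys used in the usage example ("steps", "num_steps", "terminal"); capping the loop at max_steps guards against a model that keeps repeating early actions.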
Usage (Recommended Iterative)
from lam.inference import run_lam_session

sess = run_lam_session(
    "Customer Jordan Lee asks about renewal and add-on pricing for the contract",
    one_action_per_step=True,
    max_steps=6,
)
print("Steps:", sess["num_steps"], "terminal:", sess["terminal"])
for s in sess["steps"]:
    print(s["actions"])
CLI (recommended):
PYTHONPATH=. python3 scripts/lam_infer.py --adaptive --iterative \
  'Customer reports $420 billing discrepancy on April invoice'
Expected behavior (example from retrained model):
Step 0: screenpop
Step 1: create_case
Step 2: screenpop (or list_cases)
Step 3: log_call → terminal
Evaluation
Iterative behavior (10 diverse CX tests post-retrain)
- Terminal success (reached log_call): 10/10 (100%)
- Avg steps: 4.0 (true one-action-per-step)
- Final state: always 1 case + 1 log created
- Typical flow: screenpop → create_case → ... → log_call
One ELD example used list_cases mid-sequence (sensible variation).
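The two headline numbers (terminal success rate and average steps) can be computed from a list of session results as sketched below. The function name summarize_sessions is hypothetical, and the session records are made-up placeholders that reuse the "num_steps" and "terminal" keys from the usage example; they are not the reported evaluation runs.

```python
from typing import Dict, List


def summarize_sessions(sessions: List[Dict[str, object]]) -> Dict[str, float]:
    """Compute terminal success rate and average step count."""
    n = len(sessions)
    if n == 0:
        return {"terminal_rate": 0.0, "avg_steps": 0.0}
    reached = sum(1 for s in sessions if s["terminal"])
    total_steps = sum(int(s["num_steps"]) for s in sessions)
    return {"terminal_rate": reached / n, "avg_steps": total_steps / n}


# Made-up session records for illustration only.
sessions = [
    {"num_steps": 4, "terminal": True},
    {"num_steps": 3, "terminal": True},
    {"num_steps": 5, "terminal": False},
]
summary = summarize_sessions(sessions)
print(summary)
```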
Legacy full test set (previous single-shot checkpoint)
For reference only (on the 271-example hold-out):
- valid_rate (valid + ≥3 actions): 0.395
- Inference time for all 271 examples: ~161 s on M4 Max (about 0.6 s per example)
Limitations
- Small base model (0.5B): occasional redundant actions (e.g. re-screenpop) and limited long-horizon reasoning.
- Iterative behavior is strong on canonical CX flows but can still repeat early actions.
- Perception remains mostly derived (not model-generated).
- Research artifact. Not production-ready without more data, longer training, or larger base.
Citation
If you use this work, please cite the original datasets and reference the AI21 Large Action Model concept.
License
MIT (code and adapter). See source dataset cards for original data license notes.
Trained and evaluated entirely locally on Apple M4 Max with MLX.
Key new scripts: scripts/build_iterative_data.py, lam/format.py:build_continuation_prompt, lam/inference.py:run_lam_session(..., one_action_per_step=True)