Model Card for mpapucci/Qwen3-8B-Wikipedia-Controllable-Text-Simplification-25000-42
This is a model fine-tuned for Controllable Text Simplification in Italian. This version was trained on the Wikipedia split of IMPaCTS, using 25,000 different original texts, each paired with up to 10 different target simplifications.
Model Details
The base model was fine-tuned using LoRA in 16-bit precision.
20 new tokens were added to the base model vocabulary; they are used to control the target readability of the output.
During LoRA training, the embedding and unembedding layers of the base model were left unfrozen so that it could learn representations for these tokens.
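from peft import LoraConfig

lora_cfg = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # LoRA adapters on the attention and MLP projections
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ],
    # Embedding and unembedding layers are trained in full and saved with the adapter
    modules_to_save=["embed_tokens", "lm_head"]
)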
Model Description
The control tokens for targeting the output readability score are:
<|readability_0|>
<|readability_5|>
<|readability_10|>
<|readability_15|>
<|readability_20|>
<|readability_25|>
<|readability_30|>
<|readability_35|>
<|readability_40|>
<|readability_45|>
<|readability_50|>
<|readability_55|>
<|readability_60|>
<|readability_65|>
<|readability_70|>
<|readability_75|>
<|readability_80|>
<|readability_85|>
<|readability_90|>
<|readability_95|>
<|readability_100|>
These tokens represent the target readability that the model tries to achieve. The input should have the structure <|readability_target|>\noriginal_italian_sentence\n. The model will try to generate a simplification at the target readability, where a higher readability score means a more complex sentence, so aim for low readability values to obtain simpler output.
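The input structure described above can be assembled with a small helper. This is a minimal sketch rather than part of the released code: the names `READABILITY_LEVELS`, `control_token`, and `build_prompt` are hypothetical, and the accepted levels are simply read off the token list above (0 to 100 in steps of 5).

```python
# Hypothetical helper names; the levels mirror the control tokens listed above.
READABILITY_LEVELS = tuple(range(0, 101, 5))


def control_token(target: int) -> str:
    """Return the control token for a readability target, e.g. <|readability_20|>."""
    if target not in READABILITY_LEVELS:
        raise ValueError(
            f"target must be one of {READABILITY_LEVELS}, got {target!r}"
        )
    return f"<|readability_{target}|>"


def build_prompt(sentence: str, target: int) -> str:
    """Build the model input: control token, newline, sentence, newline."""
    return f"{control_token(target)}\n{sentence.strip()}\n"


if __name__ == "__main__":
    print(repr(build_prompt("Il film esce solo nel 1998.", 20)))
```

Rejecting values that have no matching token avoids sending the model a control string it never saw during training.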
- Developed by: Michele Papucci
- Language(s) (NLP): Italian
- Finetuned from model: Qwen3-8B-Base
Model Sources
- Data: IMPaCTS - Wikipedia Split
- Paper: Controllable Sentence Simplification in Italian: Fine-Tuning Large Language Models on Automatically Generated Resources
Uses
These models aim to provide a simplification system for Italian sentences, in which a user can generate simplifications at the target readability suited to the intended reader. This can be useful for generating simplifications for primary school students with different reading levels, for people learning Italian, etc.
How to Get Started with the Model
The model can be used as follows:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

# The tokenizer shipped with the adapter already contains the control tokens
tokenizer = AutoTokenizer.from_pretrained("mpapucci/Qwen3-8B-Wikipedia-Controllable-Text-Simplification-25000-42")
# If you need padding, ensure that these two lines are uncommented:
# tokenizer.pad_token = tokenizer.eos_token
# tokenizer.padding_side = "left"

# Load the base model and resize its embeddings to fit the extended vocabulary
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B-Base",
    device_map="auto",
)
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
model.config.vocab_size = len(tokenizer)

# Attach the LoRA adapter (including the trained embedding and unembedding layers)
model = PeftModel.from_pretrained(model, "mpapucci/Qwen3-8B-Wikipedia-Controllable-Text-Simplification-25000-42")

# Each input is: control token, newline, original sentence, newline
messages = []
text = "<|readability_20|>\nProdotto dalla BBC, il film esce solo nel 1998 ed ottiene numerosi riconoscimenti internazionali, tra cui la candidatura al Premio Oscar per il miglior cortometraggio animato.\n"
messages.append(text)

pipe = pipeline(
    model=model,
    tokenizer=tokenizer,
    task='text-generation',
    max_new_tokens=128,
)
sequences = pipe(messages)
print(sequences)
When providing the text, add the desired control token for readability as the first token of the sentence that needs to be simplified.
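By default the pipeline returns the prompt together with the continuation. The sketch below shows one way to keep only the simplification. It assumes the usual `text-generation` output shape for a list of string prompts (one list of `{"generated_text": ...}` dicts per prompt) and assumes that the simplification is the first non-empty line after the prompt. The helper name `extract_simplification` is hypothetical, and the sample output is made up for illustration.

```python
def extract_simplification(prompt: str, generated_text: str) -> str:
    """Strip the echoed prompt and keep the first non-empty line that follows."""
    # With return_full_text left at its default, the output starts with the prompt.
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    for line in generated_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


if __name__ == "__main__":
    # Made-up example output, shaped like a pipeline result for one prompt.
    prompt = "<|readability_20|>\nFrase originale.\n"
    fake_sequences = [[{"generated_text": prompt + "Frase semplice.\nAltro testo"}]]
    for outputs in fake_sequences:
        print(extract_simplification(prompt, outputs[0]["generated_text"]))
```

Passing `return_full_text=False` to the pipeline call is another way to drop the prompt; the first-line cut is still a heuristic and may need adjusting if the model keeps generating past the simplification.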
More Details
An extensive explanation of how the model was trained and how it performs can be found in the LREC 2026 paper.
Citation
If you use any of these models, please cite:
@inproceedings{papucci-etal-2026-controllable,
title = {Controllable Sentence Simplification in Italian: Fine-Tuning Large Language Models on Automatically Generated Resources},
author = {Papucci, Michele and Venturi, Giulia and Dell'Orletta, Felice},
booktitle = {Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)},
month = {May},
year = {2026},
pages = {7178--7191},
address = {Palma, Mallorca, Spain},
publisher = {European Language Resources Association (ELRA)},
doi = {10.63317/5fgm358dfxt5},
abstract = {This paper presents a study on readability-controlled Sentence Simplification for Italian, addressing the scarcity of annotated resources for low-resource languages. We introduce IMPaCTS (Italian Multilevel Parallel Corpus for Text Simplification), the first fully automatically created corpus of 1,444,160 original–simple sentence pairs automatically annotated with readability levels and linguistic features. It was generated using an Italian LLM prompted in zero-shot to produce multiple simplifications per input sentence. Increasing portions of the resource are used to fine-tune mono- and multilingual open-weight LLMs, conditioning them to generate simplifications at a target readability level. Results from automatic and human evaluations show that fine-tuning on IMPaCTS improves performance both in terms of task completion and adherence to the targeted readability levels compared to few-shot baselines.}
}