🇰🇭 Khmer Text Summarization Adapters (Gemma)
QLoRA adapters fine-tuned for Khmer text summarization.
Trained using the Unsloth framework for efficient 4-bit fine-tuning.
Variants
| Variant | Subfolder | Description |
|---|---|---|
| Title-based | `title_based/` | Trained on raw Khmer news dataset |
| Synthetic | `synthetic/` | Trained on synthetic dataset |
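As a minimal sketch of how the variant names above map to adapter subfolders, the helper below normalizes a variant name and returns the matching subfolder. The function name `adapter_subfolder` and the dictionary `VARIANT_SUBFOLDERS` are hypothetical and not part of this repository; only the two subfolder names come from the table.

```python
# Hypothetical helper, not part of the repository.
# The two subfolder names are taken from the Variants table.
VARIANT_SUBFOLDERS = {
    "title_based": "title_based",
    "synthetic": "synthetic",
}


def adapter_subfolder(variant: str) -> str:
    """Return the adapter subfolder for a variant name, or raise ValueError."""
    # Accept spellings such as "Title-based" or " synthetic ".
    key = variant.strip().lower().replace("-", "_")
    if key not in VARIANT_SUBFOLDERS:
        known = ", ".join(sorted(VARIANT_SUBFOLDERS))
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}")
    return VARIANT_SUBFOLDERS[key]
```

The returned string is the value one would pass as the `subfolder` when selecting an adapter, so a typo fails early instead of during the download.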
Usage (Unsloth)
from unsloth import FastLanguageModel
import torch
ALPACA_PROMPT = """ααΆαααααααααααΊααΆααα
ααααΈααααΆαα’αααΈαα·α
αα
ααΆααα½αα ααΌααααααα
ααααΎαα±ααααΆαααααΉαααααΌα αααααα αα·αααΆααααα
### Instruction:
α
αΌααααααα α’αααααααΆααααααααα
### Input:
{}
### Response:
"""
# Load base model + adapter in ONE call
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2b-bnb-4bit",  # base model
    max_seq_length=8192,
    load_in_4bit=True,
    adapter_name="CADT-IDRI/gemma-khmer-text-sum-adapters",  # adapter repo on the Hub
    adapter_kwargs={"subfolder": "synthetic"},  # or "title_based"
)
FastLanguageModel.for_inference(model)
# model.eval()
# Prepare input
text = "..."  # replace with the Khmer article text to summarize
prompt = ALPACA_PROMPT.format(text)
inputs = tokenizer(prompt, return_tensors="pt", truncation=True).to("cuda")
# Generate summary
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        use_cache=True,
        do_sample=True,
        temperature=0.3,
        top_p=0.85,
    )
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
summary = decoded.split("### Response:")[-1].strip()
print(summary)
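The prompt formatting and the `### Response:` split above can be wrapped in two small functions so they are easy to reuse and test without loading the model. This is a sketch, not part of the repository: `build_prompt` and `extract_summary` are hypothetical names, and cutting the output at a following `###` header is an added assumption about how the model may continue generating.

```python
RESPONSE_MARKER = "### Response:"


def build_prompt(template: str, article: str) -> str:
    """Fill the single {} slot of an Alpaca-style template with the article."""
    return template.format(article.strip())


def extract_summary(decoded: str, marker: str = RESPONSE_MARKER) -> str:
    """Return the text after the last response marker in the decoded output."""
    # Same idea as decoded.split("### Response:")[-1] in the snippet above.
    tail = decoded.split(marker)[-1]
    # Assumption: if the model starts another "###" section, drop it.
    tail = tail.split("###", 1)[0]
    return tail.strip()
```

With these, the last steps of the usage snippet would become `prompt = build_prompt(ALPACA_PROMPT, text)` and `summary = extract_summary(decoded)`.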
Model tree for CADT-IDRI/gemma-khmer-text-sum-adapters
Base model
unsloth/gemma-2b-bnb-4bit