FLUX Packaging LoRA: Indian Snacks Domain
A LoRA adaptation of FLUX.1-schnell fine-tuned for the Indian snack packaging visual domain.
Part of the MSc dissertation "Injecting Regional Cultural Aesthetics into Product Packaging via Reference-Conditioned Diffusion Models: A Comparative Study of SDXL and FLUX with LoRA and IP-Adapter Conditioning", University of Stirling, MSc Artificial Intelligence, 2026.
Model details
Two LoRA checkpoints are provided:
| File | Use | Rank | Steps | Resolution |
|---|---|---|---|---|
| flux_packaging_lora_r16_res1024_steps2000.safetensors | Primary: used for the SDXL-vs-FLUX comparison in the dissertation | 16 | 2000 | 1024 × 1024 |
| flux_packaging_lora_r16_res512_steps1000.safetensors | Supplementary: produced as a robustness check during infrastructure resolution | 16 | 1000 | 512 × 512 |
Shared training configuration:
| Property | Value |
|---|---|
| Base model | black-forest-labs/FLUX.1-schnell |
| Learning rate | 5e-5 |
| Trigger token | ipsnackpkg |
| Precision | bfloat16 |
| Training hardware | NVIDIA A100 (40 GB) on Google Colab Pro |
| Wall-clock training time (primary) | ≈ 3 h 40 min |
The FLUX learning rate (5e-5) is lower than the SDXL counterpart (1e-4) to account for FLUX's greater sensitivity to gradient magnitude.
Pinned dependency configuration
FLUX LoRA training in the diffusers ecosystem required pinning a specific dependency set due to incompatibilities on the diffusers main branch:
diffusers==0.32.0
transformers==4.45.2
peft==0.13.2
accelerate==1.1.1
Reproducing training requires this pinned set; see the dissertation methodology log for full context.
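Before starting a training run, it can help to confirm that the environment matches the pinned set. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` and the `PINNED` dictionary are illustrative, not part of the companion repository.

```python
from importlib import metadata

# Versions copied from the pinned set listed above.
PINNED = {
    "diffusers": "0.32.0",
    "transformers": "4.45.2",
    "peft": "0.13.2",
    "accelerate": "1.1.1",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for package, wanted in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in find_mismatches(PINNED).items():
        print(f"{package}: wanted {wanted}, found {installed}")
```

An empty result means the four packages match the pinned versions exactly; it says nothing about other packages in the environment.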
Training data
311 images of Indian snack packaging sourced from Open Food Facts (CC-BY-SA licence). Identical training corpus to the SDXL counterpart LoRA. Per-image provenance is preserved in the code repository as data/packaging_metadata.csv.
Intended use
Research use in studying base-model contribution to packaging-domain image generation. The dissertation's RQ1 asks whether fine-tuned FLUX produces superior packaging generation compared to fine-tuned SDXL under comparable LoRA configurations. This model is the FLUX side of that comparison.
How to use
from diffusers import FluxPipeline
import torch

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
).to("cuda")

pipe.load_lora_weights(
    "Vclord/flux-packaging-lora-indian-snacks",
    weight_name="flux_packaging_lora_r16_res1024_steps2000.safetensors",
)
pipe.set_adapters(["default_0"], adapter_weights=[0.5])

prompt = "ipsnackpkg, Front-facing product photograph of an Indian snack packet"
image = pipe(
    prompt,
    num_inference_steps=4,
    guidance_scale=0.0,
    width=1024,
    height=1024,
    max_sequence_length=256,
).images[0]
image.save("output.png")
Recommended LoRA scale: 0.5
Why 0.5 and not 1.0?
Unlike SDXL LoRAs, which are conventionally used at scale 1.0, this FLUX LoRA operates best at scale 0.5. A diagnostic comparison at scales 0.3, 0.5, and 1.0 confirmed that scale 1.0 over-asserts on FLUX outputs, producing hazy, ghosted packets, a known phenomenon in the FLUX LoRA community. Scale 0.5 preserves the trained LoRA contribution without inducing the over-assertion failure mode.
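A scale comparison of this kind could be reproduced with a small sweep along the following lines. This is a sketch rather than the dissertation's diagnostic script: the helper `sweep_lora_scales` and its parameters are hypothetical, and it assumes a pipeline that already has the LoRA loaded under the adapter name `default_0`, as in the usage code above.

```python
def sweep_lora_scales(pipe, prompt, scales=(0.3, 0.5, 1.0),
                      adapter_name="default_0", generator_factory=None,
                      **call_kwargs):
    """Render one image per LoRA scale and return {scale: image}.

    Hypothetical helper: `pipe` is assumed to be a diffusers pipeline with
    the LoRA already loaded under `adapter_name`. Extra keyword arguments
    are forwarded unchanged to the pipeline call.
    """
    images = {}
    for scale in scales:
        # Re-weight the already-loaded adapter instead of reloading it.
        pipe.set_adapters([adapter_name], adapter_weights=[scale])
        kwargs = dict(call_kwargs)
        if generator_factory is not None:
            # A fresh generator per scale keeps the starting noise identical,
            # so differences between images come from the scale alone.
            kwargs["generator"] = generator_factory()
        images[scale] = pipe(prompt, **kwargs).images[0]
    return images
```

With the pipeline from the usage section, a call might pass `generator_factory=lambda: torch.Generator("cpu").manual_seed(0)` together with the same `num_inference_steps`, `guidance_scale`, `width`, `height`, and `max_sequence_length` values shown there, then save each returned image for side-by-side inspection.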
Evaluation
The FLUX vs SDXL comparison was conducted as a LoRA-only experiment (no IP-Adapter, no ControlNet) because mature FLUX equivalents of those components were not available at the time of writing. The comparison therefore answers a narrower sub-question of RQ1: whether FLUX is a better base model for the packaging-domain LoRA task in isolation.
Quantitative metrics across 24 comparison images (3 prompts × 2 seeds × 4 conditions):
| Configuration | CLIP-img | CLIP-txt | LPIPS |
|---|---|---|---|
| SDXL baseline (no LoRA) | - | - | - |
| SDXL + LoRA + Plus + ControlNet (full pipeline) | 0.552 | 0.320 | 0.782 |
| FLUX baseline (no LoRA) | 0.475 | 0.255 | 0.795 |
| FLUX + LoRA at scale 0.5 | 0.528 | 0.306 | 0.665 |
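To make the LoRA's effect on FLUX explicit, the two FLUX rows of the table can be differenced. The values below are copied from the table; the variable names are illustrative. Higher CLIP similarity is better, while LPIPS is a distance, so a negative change is an improvement.

```python
# Metric values copied from the two FLUX rows of the table above.
flux_baseline = {"CLIP-img": 0.475, "CLIP-txt": 0.255, "LPIPS": 0.795}
flux_lora = {"CLIP-img": 0.528, "CLIP-txt": 0.306, "LPIPS": 0.665}

# Change introduced by the LoRA at scale 0.5, rounded to table precision.
deltas = {
    metric: round(flux_lora[metric] - flux_baseline[metric], 3)
    for metric in flux_baseline
}

for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.3f}")
# CLIP-img: +0.053
# CLIP-txt: +0.051
# LPIPS: -0.130
```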
Intra-rater reliability for the FLUX comparison spike (n = 24), Cohen's weighted kappa with linear weights:
| Axis | κ |
|---|---|
| Text legibility | 0.740 |
| Packaging plausibility | 0.559 |
| Visual quality | 0.554 |
(Regional appropriateness was not scored for this spike because the FLUX comparison prompts were not folk-art conditioned.)
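For readers unfamiliar with the statistic, linearly weighted Cohen's kappa can be computed from two rating passes over the same images as sketched below. This is a plain-Python illustration, not the dissertation's tooling: the function name is hypothetical, and the ordered rating scale must be supplied by the caller because the scale used in the study is not stated here. scikit-learn's `cohen_kappa_score` with `weights="linear"` computes the same quantity.

```python
from collections import Counter


def linear_weighted_kappa(first_pass, second_pass, categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    `categories` lists the ordered rating levels, lowest first.
    """
    if len(first_pass) != len(second_pass) or not first_pass:
        raise ValueError("need two equally long, non-empty rating lists")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    index = {category: i for i, category in enumerate(categories)}
    n = len(first_pass)

    # Observed weighted disagreement, averaged over the rated items.
    observed = sum(
        abs(index[a] - index[b]) for a, b in zip(first_pass, second_pass)
    ) / (n * (k - 1))

    # Expected weighted disagreement if the two passes were independent,
    # computed from each pass's marginal category counts.
    count_1, count_2 = Counter(first_pass), Counter(second_pass)
    expected = sum(
        count_1[a] * count_2[b] * abs(index[a] - index[b])
        for a in categories
        for b in categories
    ) / (n * n * (k - 1))

    if expected == 0:
        # Both passes used one identical category for every item; kappa is
        # formally undefined, and this sketch reports perfect agreement.
        return 1.0
    return 1.0 - observed / expected
```

Identical passes give 1.0, agreement at chance level gives 0, and systematic reversal gives a negative value; near-misses on the ordinal scale are penalised less than large disagreements.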
Headline finding: FLUX + LoRA at scale 0.5 achieves the lowest LPIPS distance to real packaging across all configurations tested, suggesting base-model choice contributes more to packaging-domain quality than the specific fine-tuning strategy. This finding is bounded by the LoRA-only comparison scope; the full-pipeline comparison is future work.
Limitations
- LoRA-only configuration; no IP-Adapter or ControlNet conditioning is applied during inference with this model. Folk-art style transfer is not part of the FLUX pipeline at the time of writing.
- Trained on a small dataset (311 images); generalisation beyond Indian snack packaging is not characterised.
- The 1024-resolution LoRA is the primary deliverable; the 512-resolution LoRA was produced during infrastructure resolution and behaves similarly at scale 0.5, but it is not the main artefact.
- Single-rater evaluation methodology with an intra-rater reliability protocol; see the dissertation for full discussion.
Citation
If you use this LoRA in research, please cite:
@mastersthesis{chandra2026folkart,
title = {Injecting Regional Cultural Aesthetics into Product Packaging via Reference-Conditioned Diffusion Models},
author = {Chandra, Vivek},
year = {2026},
school = {University of Stirling},
type = {MSc Dissertation, Artificial Intelligence}
}
Companion repository and SDXL counterpart
- Full code: https://github.com/Vclord/folk-art-packaging-generation
- SDXL counterpart LoRA: https://huggingface.co/Vclord/sdxl-packaging-lora-indian-snacks
Licence
apache-2.0