TexJEPA Model Weights

This repository hosts the trained model weights for TexJEPA from:

When Texture Becomes the World: Texture-aware JEPA for Chest X-ray Representation Learning

Authors: Yi Zhao*, Ruilang Wang*, Bowen Liu, Donglong Chen†
* Equal contribution. † Corresponding author.

Summary

TexJEPA is a texture-aware extension of the Image-based Joint-Embedding Predictive Architecture (I-JEPA) for chest X-ray representation learning. The models are trained on MIMIC-CXR-JPG with a ViT-H/14 backbone and an I-JEPA-style context-target predictive objective. The released weights are intended to reproduce the paper's pre-training lineage and downstream evaluation using the public TexJEPA codebase.

The repository includes four checkpoints:

Model	File	Initialization	Training stage	Main purpose
I-JEPA-300	`weights/I-JEPA-300/jepa-latest.pth.tar`	I-JEPA ViT-H/14 pre-training run	300-epoch final/latest checkpoint	Baseline chest X-ray I-JEPA representation
TexJEPA-N	`weights/TexJEPA-N/jepa-latest.pth.tar`	I-JEPA-300 lineage	+50 epoch texture-noise specialization	Learn invariance to nuisance texture corruption
TexJEPA-R	`weights/TexJEPA-R/jepa-latest.pth.tar`	TexJEPA-N	+50 epoch register-token branch	Route high-norm texture artifacts through register tokens
TexJEPA-C	`weights/TexJEPA-C/jepa-latest.pth.tar`	TexJEPA-N	+50 epoch covariance-regularized branch	Preserve local patch diversity and lesion-scale texture sensitivity

The historical internal experiment names are not used in this release. Please cite and report the public names above.

Files

weights/
  I-JEPA-300/jepa-latest.pth.tar
  TexJEPA-N/jepa-latest.pth.tar
  TexJEPA-R/jepa-latest.pth.tar
  TexJEPA-C/jepa-latest.pth.tar
configs/pretrain/
  ijepa_300.yaml
  texjepa_n.yaml
  texjepa_r.yaml
  texjepa_c.yaml
checksums.sha256
requirements.txt
LICENSE

checksums.sha256 contains SHA-256 hashes for the uploaded checkpoint files.
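
The hashes can be checked after download with a few lines of Python. The sketch below assumes checksums.sha256 follows the common `sha256sum` layout (one `<hex digest>  <relative path>` pair per line); that layout is an assumption about this release, and the helper names `sha256_of` and `verify_checksums` are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte checkpoints never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(manifest, root="."):
    """Return {relative_path: bool} for every entry in a sha256sum-style manifest."""
    results = {}
    for line in Path(manifest).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        target = Path(root) / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Calling `verify_checksums("checksums.sha256")` from the root of the downloaded repository returns one boolean per listed file; any `False` entry indicates a missing or corrupted checkpoint.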

Model Differences

I-JEPA-300

I-JEPA-300 is the chest X-ray baseline. It uses the original I-JEPA multi-block masking objective with a ViT-H/14 context encoder, EMA target encoder, and predictor. It is the reference model for clean representation quality and texture-sensitivity analysis.

TexJEPA-N

TexJEPA-N adds context-target asymmetric texture corruption. The target branch receives clean chest radiographs, while the context branch receives mild stochastic texture perturbations such as Gaussian noise, Poisson noise, and JPEG-like compression. This trains the model to predict clean latent targets from corrupted context views, directly targeting texture robustness.
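
The asymmetry can be illustrated with a small NumPy sketch: only the context view is perturbed, and the target view is returned untouched. This is not the repository's augmentation code. The function names, the 50/50 choice between corruptions, and the noise strengths (`sigma`, `peak`) are assumptions chosen for illustration, and JPEG-like compression is omitted for brevity.

```python
import numpy as np


def corrupt_context(image, rng, sigma=0.02, peak=255.0):
    """Apply one randomly chosen mild corruption to an image scaled to [0, 1].

    sigma and peak are illustrative values, not the paper's settings.
    """
    if rng.random() < 0.5:
        # Additive Gaussian noise.
        noisy = image + rng.normal(0.0, sigma, size=image.shape)
    else:
        # Poisson (shot) noise: sample counts, then rescale back to [0, 1].
        noisy = rng.poisson(image * peak) / peak
    return np.clip(noisy, 0.0, 1.0).astype(image.dtype)


def make_views(image, rng):
    """Return (context_view, target_view): corrupted context, clean target."""
    return corrupt_context(image, rng), image


rng = np.random.default_rng(0)
xray = rng.random((224, 224), dtype=np.float32)  # stand-in for a radiograph
context, target = make_views(xray, rng)
```

In a JEPA-style training loop, `context` would feed the context encoder and `target` the EMA target encoder, so the predictor is asked to recover clean latents from a corrupted input.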

TexJEPA-R

TexJEPA-R starts from TexJEPA-N and adds register tokens to the encoder. Register tokens participate in self-attention but are stripped before patch outputs are returned, preserving compatibility with the I-JEPA predictor. The branch also uses tighter local masking to encourage stronger local, lesion-scale structure modeling.
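
The "participate, then strip" behavior can be sketched in NumPy as follows. The function name, the placement of register tokens after the patch tokens, and the count of four registers are assumptions made for illustration; the toy `mix` block only stands in for real self-attention layers.

```python
import numpy as np


def encode_with_registers(patch_tokens, register_tokens, blocks):
    """Run blocks over [patches | registers] and return the patch tokens only.

    patch_tokens:    (batch, num_patches, dim)
    register_tokens: (num_registers, dim), shared across the batch
    blocks:          callables mapping (batch, tokens, dim) to the same shape
    """
    batch, num_patches, _ = patch_tokens.shape
    registers = np.broadcast_to(register_tokens, (batch,) + register_tokens.shape)
    tokens = np.concatenate([patch_tokens, registers], axis=1)
    for block in blocks:
        tokens = block(tokens)  # registers interact with patches here
    return tokens[:, :num_patches]  # drop registers before returning


def mix(tokens):
    """Toy stand-in for attention: add the mean over all tokens to each token."""
    return tokens + tokens.mean(axis=1, keepdims=True)


patches = np.zeros((2, 16, 8))
registers = np.ones((4, 8))
out = encode_with_registers(patches, registers, [mix])
print(out.shape)  # (2, 16, 8)
```

Because the output keeps the original `(batch, num_patches, dim)` shape, a downstream predictor written for a register-free encoder can consume it unchanged, while the register values still influence the patch outputs through the mixing step.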

TexJEPA-C

TexJEPA-C starts from TexJEPA-N and adds a patch-token variance/covariance auxiliary loss. The regularizer discourages patch-token collapse and helps preserve local texture diversity. The implementation computes the auxiliary loss per rank and avoids cross-rank all-gather, which is important because I-JEPA masks can produce variable visible-token counts across distributed workers.
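
A variance/covariance regularizer of this kind is often written in the VICReg style. The NumPy sketch below uses that formulation as an assumption; the exact loss, its weights, and the name `variance_covariance_loss` are not taken from the TexJEPA code. It operates only on the tokens passed in, which mirrors computing the loss per rank on whatever number of visible tokens that rank holds.

```python
import numpy as np


def variance_covariance_loss(tokens, gamma=1.0, eps=1e-4):
    """VICReg-style variance and covariance penalties on local patch tokens.

    tokens: (num_tokens, dim) patch embeddings available on this rank.
    Returns (variance_term, covariance_term).
    """
    n, d = tokens.shape
    centered = tokens - tokens.mean(axis=0, keepdims=True)
    # Variance term: hinge pushing each feature's std to be at least gamma.
    std = np.sqrt(centered.var(axis=0, ddof=1) + eps)
    variance_term = np.mean(np.maximum(0.0, gamma - std))
    # Covariance term: penalize squared off-diagonal covariance entries.
    cov = centered.T @ centered / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    covariance_term = np.sum(off_diag ** 2) / d
    return variance_term, covariance_term


rng = np.random.default_rng(0)
collapsed = np.ones((32, 8))          # every token identical
diverse = rng.normal(size=(256, 8))   # tokens with unit-scale spread
var_collapsed, cov_collapsed = variance_covariance_loss(collapsed)
var_diverse, cov_diverse = variance_covariance_loss(diverse)
```

Collapsed tokens incur a large variance penalty while diverse tokens incur a small one, and the token count (32 versus 256 here) can differ freely between calls because nothing is gathered across workers.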

Intended Use

These weights are intended for:

Reproducing TexJEPA pre-training and downstream experiments with the GitHub repository.
Chest X-ray representation learning research.
Linear probing and fine-tuning on VinBigData/VinDr-style multi-label chest X-ray classification.
Texture robustness, lesion sensitivity, and model comparison experiments.

They are not intended for clinical deployment or diagnostic use.

Reproduction

Clone the code:

git clone https://github.com/YiZhao-Jasper/TexJEPA.git
cd TexJEPA
python -m pip install -r requirements.txt

Download this model repository and place or symlink the checkpoints under the paths expected by the configs:

logs/ijepa_300/jepa-latest.pth.tar             <- weights/I-JEPA-300/jepa-latest.pth.tar
logs/texjepa_n/jepa-latest.pth.tar             <- weights/TexJEPA-N/jepa-latest.pth.tar
logs/texjepa_r/jepa-latest.pth.tar             <- weights/TexJEPA-R/jepa-latest.pth.tar
logs/texjepa_c/jepa-latest.pth.tar             <- weights/TexJEPA-C/jepa-latest.pth.tar
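
The mapping above can be applied with a short script. This is a sketch under the assumption that the model repository and the code repository are both checked out locally; `link_checkpoints` and `CHECKPOINT_LINKS` are illustrative names, and existing files under logs/ are left untouched.

```python
import os
from pathlib import Path

# Released file -> path expected by the configs (same mapping as listed above).
CHECKPOINT_LINKS = {
    "weights/I-JEPA-300/jepa-latest.pth.tar": "logs/ijepa_300/jepa-latest.pth.tar",
    "weights/TexJEPA-N/jepa-latest.pth.tar": "logs/texjepa_n/jepa-latest.pth.tar",
    "weights/TexJEPA-R/jepa-latest.pth.tar": "logs/texjepa_r/jepa-latest.pth.tar",
    "weights/TexJEPA-C/jepa-latest.pth.tar": "logs/texjepa_c/jepa-latest.pth.tar",
}


def link_checkpoints(weights_root, code_root):
    """Symlink each downloaded checkpoint into the code repository's logs/ tree."""
    created = []
    for source_rel, dest_rel in CHECKPOINT_LINKS.items():
        source = (Path(weights_root) / source_rel).resolve()
        dest = Path(code_root) / dest_rel
        if not source.is_file():
            continue  # checkpoint was not downloaded
        if dest.exists() or dest.is_symlink():
            continue  # never overwrite an existing file or link
        dest.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(source, dest)
        created.append(dest)
    return created
```

For example, `link_checkpoints("path/to/model-repo", ".")` run from the TexJEPA checkout returns the list of links it created; both path arguments here are placeholders.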

The pre-training configs in this model repository mirror the GitHub release. After placing the checkpoints, run the repository's sanity-check script:

python scripts/sanity_check.py

For downstream evaluation:

export CHECKPOINT=logs/texjepa_n/jepa-latest.pth.tar
export VINBIG_IMAGE_DIR=data/vinbig/images_1024/train
export VINBIG_CSV=data/vinbig/annotations/train.csv
bash downstream/run_downstream.sh all

Data

Pre-training uses MIMIC-CXR-JPG. This model repository does not redistribute MIMIC-CXR data. Researchers must obtain MIMIC-CXR-JPG through the official PhysioNet credentialing and data-use process.

Downstream evaluation code supports VinBigData/VinDr-style multi-label chest X-ray annotations; those datasets are not redistributed here.

Checkpoint Format

The checkpoints are PyTorch .pth.tar training checkpoints containing encoder, predictor, target encoder, optimizer metadata, and training metadata where available. Load checkpoints only from trusted sources: PyTorch checkpoint loading relies on Python pickle serialization, and unpickling an untrusted file can execute arbitrary code.

For representation extraction, the downstream code in the GitHub repository loads the target_encoder weights and supports ViT-H/14, ViT-L/14, and register-token variants by reading checkpoint metadata and tensor shapes.
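
For use outside the provided downstream code, the target-encoder weights can be pulled out of a loaded checkpoint dictionary roughly as follows. This is a sketch, not the repository's loader: the `"target_encoder"` key, the `"module."` prefix, and the `"pos_embed"` tensor name follow the original I-JEPA checkpoint layout and are assumptions about this release, and both function names are illustrative.

```python
def extract_target_encoder(checkpoint, prefix="module."):
    """Return the target-encoder state dict with any wrapper prefix removed.

    `checkpoint` is the dict obtained from the checkpoint file, for example
    via torch.load(path, map_location="cpu"). The key and prefix names are
    assumptions based on the original I-JEPA checkpoint layout.
    """
    state = checkpoint["target_encoder"]
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state.items()
    }


def describe_backbone(state_dict, embed_key="pos_embed"):
    """Guess the backbone from the embedding width of the tensor at `embed_key`."""
    width = state_dict[embed_key].shape[-1]
    names = {1024: "ViT-L", 1280: "ViT-H"}  # standard ViT embedding widths
    return names.get(width, f"unknown (width {width})")
```

Reading the width from a tensor shape is one simple way to tell ViT-L from ViT-H without relying on metadata. Note that recent PyTorch releases default `torch.load` to `weights_only=True`, which may reject training checkpoints that store non-tensor objects; pass `weights_only=False` only for files you trust.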

Citation

@misc{zhao2026texjepa,
  title  = {When Texture Becomes the World: Texture-aware JEPA for Chest X-ray Representation Learning},
  author = {Zhao, Yi and Wang, Ruilang and Liu, Bowen and Chen, Donglong},
  year   = {2026},
  note   = {TexJEPA model weights}
}

License

The released code and model weights are provided for non-commercial research use under the license included in this repository. MIMIC-CXR-JPG and downstream datasets remain governed by their own licenses and data-use agreements.
