ContextAugmNER medprocner-llama3b-r0-seed2

This repository contains one LoRA adapter from the ContextAugmNER paper release, a reproducibility study of inference-protocol sensitivity in Spanish clinical named entity recognition with compact decoder-only models.

The adapter is released as a PEFT/LoRA artifact. It does not contain merged base-model weights. Users must obtain the upstream base model and comply with the base-model licence and access terms.

Adapter Summary

Dataset: medprocner
Base-model alias: llama3b
Base model: meta-llama/Llama-3.2-3B-Instruct
Training regime: r0 (R0-normal)
Seed: 2
Paper run ID: train_medprocner_opt1_r0_llama3b_seed2
Local release ID: medprocner-llama3b-r0-seed2

Repository Contents

adapter_model.safetensors: LoRA adapter weights.
adapter_config.json: PEFT adapter configuration, including base model.
tokenizer files: copied from the training/inference artifact for exact reproducibility of the paper runs.
adapter_manifest.csv: one-row manifest mapping this HF repository to the paper run ID and release metadata.

training_args.bin is excluded from the default upload because it is not needed for inference and can contain environment-specific training state.
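
Before loading, it can be worth confirming that a downloaded adapter_config.json points at the expected base model. The following is a minimal sketch, not part of the paper code: the helper name check_adapter_config is hypothetical, and it assumes the file sits in a local directory and uses the standard PEFT key base_model_name_or_path.

```python
import json
from pathlib import Path

# Base model named in the Adapter Summary above.
EXPECTED_BASE = "meta-llama/Llama-3.2-3B-Instruct"


def check_adapter_config(config_path, expected_base=EXPECTED_BASE):
    """Parse a PEFT adapter_config.json and verify its base model.

    Hypothetical helper for illustration; returns the parsed config dict.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    found = config.get("base_model_name_or_path")
    if found != expected_base:
        raise ValueError(
            f"adapter expects base model {found!r}, not {expected_base!r}"
        )
    return config


# Example call, assuming the file was downloaded to the working directory:
# config = check_adapter_config("adapter_config.json")
# print(config.get("peft_type"), config.get("r"))
```

A check like this catches the case where an adapter is attached to a different base checkpoint than the one it was trained on.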

Loading

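from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "edugredu/contextaugmner-medprocner-llama3b-r0-seed2"

# Load the tokenizer files shipped with the adapter, as used in the paper runs.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the upstream base model; this requires access under the base-model terms.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the base model (no merge).
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()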

The paper's scored runs used the project inference code with a strict parser, deterministic span alignment, and the official shared-task scorers. Decoding was greedy, inputs were chunked at 1800 characters with a 200-character overlap, and the generation budget was 960 new tokens. To reproduce the paper results, use the linked code release and the manifest rather than ad hoc prompting.
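
To make the chunking and decoding settings concrete, here is a rough sketch under stated assumptions. It is not the project inference code: the function name chunk_text is hypothetical, and it assumes a plain sliding window whose stride is the chunk size minus the overlap (1600 characters). The actual chunker may break at different positions, so use the code release for scored runs.

```python
CHUNK_CHARS = 1800    # chunk size stated for the paper runs
OVERLAP_CHARS = 200   # overlap stated for the paper runs


def chunk_text(text, size=CHUNK_CHARS, overlap=OVERLAP_CHARS):
    """Split text into character windows that share `overlap` characters.

    Assumed sliding-window scheme; returns (start_offset, chunk) pairs so
    predicted spans can be mapped back to document offsets.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    stride = size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, text[start:start + size]))
        if start + size >= len(text):
            break  # this window already reaches the end of the text
        start += stride
    return chunks


# Greedy decoding with the stated budget, as keyword arguments for
# transformers' model.generate(**inputs, **GENERATION_KWARGS).
GENERATION_KWARGS = {"do_sample": False, "max_new_tokens": 960}
```

Keeping the start offset with each chunk matters because entity spans found in a chunk have to be shifted back into document coordinates before scoring.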

How This Adapter Is Used in the Paper

AuxStop, batch-one execution, matched stop/truncation, online-stop-only, offline-truncate, and left-padding conditions are inference-time protocols. They reuse trained adapters and are not separate trained checkpoints.

For the central corrected inference recommendation, trained R0 adapters are reused either with left-padding at batch size 64 or with effective batch-one execution, depending on the reported contrast. R1 adapters are retained for the historical primary grid and for transparency about the original rationale-training comparison.
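
As an illustration of what the padding side changes in a batch, the sketch below pads token-ID lists by hand. It is a toy example, not the paper's batching code: pad_batch is a hypothetical helper and the token IDs are made up. With Hugging Face tokenizers the equivalent switch is usually tokenizer.padding_side = "left", set before batched tokenization.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad token-ID lists to equal length; return (input_ids, attention_mask).

    Hypothetical helper for illustration only.
    """
    width = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        gap = width - len(seq)
        if side == "left":
            # Padding goes first, so every prompt ends at the last position.
            input_ids.append([pad_id] * gap + list(seq))
            attention_mask.append([0] * gap + [1] * len(seq))
        elif side == "right":
            # Padding goes last, between the prompt and any generated tokens.
            input_ids.append(list(seq) + [pad_id] * gap)
            attention_mask.append([1] * len(seq) + [0] * gap)
        else:
            raise ValueError("side must be 'left' or 'right'")
    return input_ids, attention_mask


ids, mask = pad_batch([[5, 6, 7], [8]], pad_id=0, side="left")
```

A decoder-only model continues generating from the final position of each row, so left-padding keeps the real prompt adjacent to the generated tokens for every item in the batch. Batch-one execution avoids padding altogether.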

Data and Ethics

The adapter was trained on public, de-identified Spanish clinical NER benchmark data distributed by the original shared-task organizers. Dataset access and use must follow the MedProcNER, SympTEMIST, and DisTEMIST task terms.

Intended Use

This artifact is intended for research reproducibility, audit of inference protocols, and controlled comparison with the ContextAugmNER paper results. It is not a clinical decision-support system and should not be used for patient care without independent validation, governance review, and task-specific error analysis.

Limitations

The adapter is specialised to one dataset/task configuration and one random seed.
Reported paper metrics depend on the full inference protocol, parser, aligner, scorer, and padding/batching settings.
The release does not establish universal clinical NER performance outside the evaluated Spanish benchmark setting.
The adapter inherits limitations and licence obligations from the upstream base model.