TryDotAtwo/RuModernBERT-ruLaw

Russian legal ModernBERT package: one shared encoder and four inference heads.

Model

Base encoder: deepvk/RuModernBERT-base.
Continued pretraining: masked language modeling (MLM) on irlspbru/RusLawOD.
Inference heads:
- doc_type: document type classification, with labels from the doc_typeIPS field.
- classifier: multi-label legal classification, with labels from the classifierByIPS field.
- keywords: multi-label keyword prediction, with labels from the keywordsByIPS field.
- ner: token classification (named entity recognition), trained on TryDotAtwo/russian-legal-ner.
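To illustrate how a single-label head and a multi-label head typically differ at inference time, here is a minimal sketch in plain Python. It assumes doc_type picks exactly one label and that the multi-label heads apply a sigmoid with a fixed threshold. The label names, logits, and the 0.5 threshold are hypothetical, and the package may decode its outputs differently.

```python
import math

def decode_single_label(logits, labels):
    """Pick the one label with the highest logit (equivalent to softmax argmax)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]

def decode_multi_label(logits, labels, threshold=0.5):
    """Keep every label whose sigmoid probability reaches the threshold."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [label for label, p in zip(labels, probs) if p >= threshold]

# Hypothetical label names and logits, for illustration only.
doc_type = decode_single_label([0.2, 2.1, -1.0], ["law", "decree", "order"])
keywords = decode_multi_label([1.5, -0.3, 0.1], ["taxes", "customs", "budget"])

print(doc_type)  # decree
print(keywords)  # ['taxes', 'budget']
```

In a multi-label head the labels are scored independently, so a document can receive zero, one, or several labels, whereas a single-label head always returns exactly one.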

For the document-level heads (doc_type, classifier, keywords), the headingIPS and textIPS fields are used together as the model input.
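One simple way to combine the two fields is to join the heading and the body into a single string before tokenization. The sketch below is an assumption, not the package's documented behavior: the helper name `build_document_input` and the blank-line separator are hypothetical, and the actual pipeline may join or truncate the fields differently.

```python
def build_document_input(record, separator="\n\n"):
    """Join heading and body text, skipping whichever field is missing or empty."""
    heading = (record.get("headingIPS") or "").strip()
    text = (record.get("textIPS") or "").strip()
    parts = [part for part in (heading, text) if part]
    return separator.join(parts)

# Hypothetical record shaped like a RusLawOD row (only the two input fields shown).
record = {"headingIPS": "О внесении изменений", "textIPS": "Статья 1. ..."}
document_input = build_document_input(record)
print(document_input)
```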

Usage

from legal_modernbert import LegalDocumentPipeline

pipe = LegalDocumentPipeline.from_pretrained("TryDotAtwo/RuModernBERT-ruLaw")
result = pipe("Текст правового документа...")

The pipeline requires a runtime that supports FlashAttention 2.
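A best-effort check for this requirement can be run before loading the pipeline. The sketch below assumes the `flash_attn` package on an NVIDIA GPU with compute capability 8.0 or higher (Ampere or newer). The helper name `flash_attention_2_status` is hypothetical, and the pipeline may perform its own, different check.

```python
import importlib.util

def flash_attention_2_status():
    """Return (ok, reason) for a best-effort FlashAttention 2 availability check."""
    if importlib.util.find_spec("flash_attn") is None:
        return False, "flash_attn package is not installed"
    if importlib.util.find_spec("torch") is None:
        return False, "torch is not installed"
    import torch  # imported lazily so the check also runs without torch
    if not torch.cuda.is_available():
        return False, "no CUDA device is visible"
    major, _minor = torch.cuda.get_device_capability()
    if major < 8:
        return False, "GPU compute capability is below 8.0"
    return True, "FlashAttention 2 should be usable"

ok, reason = flash_attention_2_status()
print(ok, reason)
```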

Data Sources

Source	Used for	Hugging Face repository
RuModernBERT-base	Initial encoder weights	`deepvk/RuModernBERT-base`
RusLawOD	MLM, document type, classifier, keywords	`irlspbru/RusLawOD`
Russian legal NER	NER head	`TryDotAtwo/russian-legal-ner`
Sud-resh benchmark	External MLM validation	`lawful-good-project/sud-resh-benchmark`

Benchmarks

Evaluation	Metric	Result
RusLawOD MLM validation	eval loss	0.1337
RusLawOD MLM validation	train loss	0.1537
Sud-resh MLM benchmark, this model	eval loss	0.4473
Sud-resh MLM benchmark, base `deepvk/RuModernBERT-base`	eval loss	0.5172
Sud-resh MLM benchmark	relative perplexity reduction vs. base	~6.8%
NER test	precision	0.9970
NER test	recall	0.9884
NER test	F1	0.9927
NER test	loss	0.00133
Multitask document heads	final train loss	0.0193
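The perplexity figure and the NER F1 score can be re-derived from the other rows of the table. The sketch below assumes the reported losses are mean cross-entropy values in nats, so perplexity is `exp(loss)`, and that F1 is the harmonic mean of the reported precision and recall. The inputs are already rounded, so the results only match the table to the reported precision.

```python
import math

# Sud-resh MLM eval losses taken from the table above.
loss_finetuned = 0.4473
loss_base = 0.5172

# Perplexity is exp(loss) when the loss is mean cross-entropy in nats (assumption).
ppl_finetuned = math.exp(loss_finetuned)
ppl_base = math.exp(loss_base)
ppl_reduction = 1.0 - ppl_finetuned / ppl_base

# NER test precision and recall taken from the table above.
precision, recall = 0.9970, 0.9884
f1 = 2 * precision * recall / (precision + recall)

print(f"relative perplexity reduction: {ppl_reduction:.2%}")  # 6.75%
print(f"F1: {f1:.4f}")  # 0.9927
```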

The dataset card of the NER dataset mirror credits the original authors and names the source folder.

Limitations

The benchmark table reports internal training and evaluation runs only. The quality of the document heads should be validated on held-out downstream legal tasks before production use.
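For a held-out check of the multi-label heads, micro-averaged F1 over predicted label sets is one common starting point. The sketch below is a minimal plain-Python version. The gold and predicted label sets are made up for illustration, and the choice of metric is an assumption rather than the evaluation protocol used for this model.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-document label sets."""
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        g, p = set(gold_labels), set(predicted_labels)
        tp += len(g & p)  # labels predicted and correct
        fp += len(p - g)  # labels predicted but wrong
        fn += len(g - p)  # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical held-out labels and predictions for three documents.
gold = [{"taxes", "budget"}, {"customs"}, {"labor"}]
pred = [{"taxes"}, {"customs", "budget"}, {"labor"}]
score = micro_f1(gold, pred)
print(score)  # 0.75
```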

Model size: 0.2B parameters. Tensor type: BF16 (Safetensors).
