Joseon Level 2 Huneum Selector
This is the promoted Level 2 huneum (훈음) selector for the Joseon-to-Day project. It selects Korean readings for Hanja spans using structured candidate sets.
This repository contains a project-specific PyTorch artifact rather than a standard Transformers model:
- huneum_selector.pt: model weights
- model_config.json: model/vocabulary configuration
- candidate_manifest.json: candidate metadata
- metrics.json: promoted evaluation metrics
- huneum_error_analysis.json: diagnostic errors
Evaluation
Promoted dev result:
| Metric | Value |
|---|---|
| exact row accuracy | 0.9953 |
| correct rows | 5054 / 5078 |
| character accuracy | 0.9981 |
| selector accuracy | 0.9949 |
| deterministic single-candidate accuracy | 1.0000 |
By label:
| Label | Exact |
|---|---|
| book/title/evidence | 1.0000 |
| person/name | 0.9953 |
| place | 0.9948 |
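As a quick consistency check, the exact row accuracy in the first table can be recomputed from the correct-row count reported next to it. The snippet uses only the two counts from that table; the variable names are illustrative.

```python
# Counts taken from the "correct rows" line of the first table.
correct_rows = 5054
total_rows = 5078

exact_row_accuracy = correct_rows / total_rows
missed_rows = total_rows - correct_rows

print(f"exact row accuracy: {exact_row_accuracy:.4f}")  # 0.9953
print(f"rows not matched exactly: {missed_rows}")       # 24
```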
Intended Use
Use this model after Level 1 span detection to choose Korean readings for historical personal names, place names, book titles, and related Hanja/Hanmun spans.
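The selection step can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's code: `select_reading`, `toy_score`, and the plain list-of-strings candidate format are hypothetical, and the real candidate format is whatever the Joseon-to-Day pipeline defines. It assumes that a span with a single candidate is resolved without the model (which is presumably what the "deterministic single-candidate" metric measures) and that the model only ranks spans with several candidates.

```python
from typing import Callable, Sequence


def select_reading(
    span: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Pick one Korean reading for a Hanja span from its candidate set.

    Hypothetical interface: `score` stands in for the trained selector.
    """
    if not candidates:
        raise ValueError(f"no candidate readings for span {span!r}")
    if len(candidates) == 1:
        # Only one possible reading: no model call is needed.
        return candidates[0]
    # Several readings: keep the one the scorer rates highest.
    return max(candidates, key=lambda reading: score(span, reading))


def toy_score(span: str, reading: str) -> float:
    """Toy stand-in for the model: prefers the surname reading of 金."""
    return 1.0 if (span, reading) == ("金", "김") else 0.0


print(select_reading("漢陽", ["한양"], toy_score))    # single candidate
print(select_reading("金", ["금", "김"], toy_score))  # ranked by the scorer
```

In this toy run the place name 漢陽 has one candidate and is returned directly, while 金 has two readings (금 and 김) and the scorer decides between them.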
Limitations
The artifact is tied to the Joseon-to-Day project code and candidate format. It is not a standalone general Hanja-to-Korean transliteration package.
Loading
Use the project loader/training code from the Joseon-to-Day repository. The artifact is intentionally published with its config and manifest so the project pipeline can reproduce the promoted Level 2 selector.
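Before handing the artifact to the project loader, it can be useful to confirm that a local copy is complete. The following is a small standard-library sketch, not part of the project code: `load_artifact_metadata` is a hypothetical helper, the file names are the ones listed in this repository, and nothing is assumed about the contents of the JSON files. It deliberately does not load `huneum_selector.pt`; that step belongs to the Joseon-to-Day loader.

```python
import json
from pathlib import Path

# File names as listed in this repository.
JSON_FILES = ("model_config.json", "candidate_manifest.json", "metrics.json")
WEIGHTS_FILE = "huneum_selector.pt"


def load_artifact_metadata(artifact_dir: str) -> dict:
    """Read the JSON side files and confirm the weights file is present.

    Hypothetical helper: it only parses the JSON files and checks that the
    weights file exists; it does not interpret or load the weights.
    """
    root = Path(artifact_dir)
    required = (*JSON_FILES, WEIGHTS_FILE)
    missing = [name for name in required if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing artifact files: {', '.join(missing)}")

    metadata = {}
    for name in JSON_FILES:
        with open(root / name, encoding="utf-8") as handle:
            metadata[name] = json.load(handle)
    return metadata
```

Calling `load_artifact_metadata` on the directory that holds the downloaded files returns the parsed JSON keyed by file name, or raises `FileNotFoundError` naming whatever is missing.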