ReSyn — Segmenter

This repository contains the pre-trained Segmenter model presented in the paper ReSyn: A Generalized Recursive Regular Expression Synthesis Framework.

ReSyn is a synthesizer-agnostic divide-and-conquer framework that decomposes complex regular expression synthesis problems into manageable sub-problems by adaptively predicting whether to split examples sequentially (Concatenation) or group them by structural similarity (Union).

Segmenter performs the Concatenation decomposition. Given a set of positive example strings, it labels each character with a segment id so that the strings are split into positionally-aligned sub-groups, each of which can be synthesized independently and concatenated back together.
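
To make the labeling concrete, here is a minimal sketch of how per-character segment ids could be turned into positionally-aligned sub-groups. It is not the repository's decoding code: the helper names split_by_segment_ids and align_segments are hypothetical, the labels are written by hand rather than predicted by the model, and the sketch assumes each segment is a contiguous run of equal ids and that every example is split into the same number of segments.

```python
from itertools import groupby
from typing import List


def split_by_segment_ids(text: str, segment_ids: List[int]) -> List[str]:
    """Cut `text` into contiguous pieces wherever the segment id changes."""
    if len(text) != len(segment_ids):
        raise ValueError("expected one segment id per character")
    pieces = []
    pos = 0
    # groupby yields one run per stretch of consecutive equal ids.
    for _, run in groupby(segment_ids):
        length = len(list(run))
        pieces.append(text[pos:pos + length])
        pos += length
    return pieces


def align_segments(examples: List[str], labels: List[List[int]]) -> List[List[str]]:
    """Return one sub-group of substrings per segment position."""
    split = [split_by_segment_ids(s, ids) for s, ids in zip(examples, labels)]
    if len({len(pieces) for pieces in split}) > 1:
        raise ValueError("examples were split into different numbers of segments")
    # Transpose: the i-th sub-group collects the i-th piece of every example.
    return [list(group) for group in zip(*split)]


# Hand-written labels standing in for model predictions (illustrative only).
examples = ["ab-12", "xyz-7"]
labels = [[0, 0, 1, 2, 2], [0, 0, 0, 1, 2]]
sub_groups = align_segments(examples, labels)
print(sub_groups)  # [['ab', 'xyz'], ['-', '-'], ['12', '7']]
```

Each inner list is then an independent synthesis problem; a regular expression found for each one can be concatenated in order to cover the original examples.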

Links

Usage

Segmenter is a custom PyTorch model that uses PyTorchModelHubMixin. The model class is defined in the GitHub repository; clone that repository first so that the ReSyn package is importable, then load the model:

from ReSyn.model import Segmenter

model = Segmenter.from_pretrained("mrseongminkim/ReSyn-Segmenter").eval()

See ReSyn/server.py for the full input encoding and output decoding used at inference time.

Citation

If you find this work useful, please cite:

@inproceedings{kim2026resyn,
  title={ReSyn: A Generalized Recursive Regular Expression Synthesis Framework},
  author={Kim, Seongmin and Cheon, Hyunjoon and Kim, Su-Hyeon and Han, Yo-Sub and Ko, Sang-Ki},
  booktitle={Proceedings of the Thirty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-26)},
  year={2026}
}

Model size: 7.75M params (Safetensors, tensor type F32)