# openenv/evaluation/__init__.py
# -*- coding: utf-8 -*-
"""Evaluation suite for the SENTINEL oversight architecture.

Modules:
- weak_to_strong: OpenAI-style weak-to-strong generalization testing
- transcript_export: METR MALT-style labeled transcript dataset generation
"""