cortexa-marketing-feedback (distilled student)
A ~4.4M-parameter conditional decoder distilled from
M725/cortexa-marketing-scorer outputs. Takes CLIP-ViT-B/32 vision
features (768-d) + the 4 Marketing pillar scores (or a "no-scores"
sentinel for fast mode) and emits a creator-vernacular phrase chain:
"scroll stopping | clear cta | thumb stopping"
"forgettable | looks clean | low contrast text"
"lazy design | model looks fake | low contrast"
The student is meant to be the feedback callout shown on the result screen for paid users: plain-language pros and cons that go alongside the scorer's numeric output.
Files
| file | purpose |
|---|---|
| `student_int8.onnx` | TinyTransformer decoder, 4 layers / 256-dim / 4 heads, INT8 dynamic-quantized. 6.9 MB. |
| `tokenizer.json` | Whole-phrase tokenizer (vocab ~115; specials `<pad>`, `<bos>`, `<eos>`, `<sep>`). |
| `config.json` | Encoder dim, pillar names, vocab size, special-token ids; read by the TS/JS runtime to shape inputs. |
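
As a rough illustration of how a whole-phrase vocabulary maps back to the phrase chains shown earlier, the sketch below turns a list of token ids into a `" | "`-joined string. The `id_to_token` mapping and its ids are hypothetical (the real ones come from `tokenizer.json` and `config.json`), and it assumes `<sep>` marks the boundary that is rendered as `|`.

```python
from typing import Dict, List, Sequence

SPECIALS = {"<pad>", "<bos>", "<eos>", "<sep>"}


def ids_to_chain(ids: Sequence[int], id_to_token: Dict[int, str]) -> str:
    """Join whole-phrase tokens into a chain, stopping at the first <eos>."""
    phrases: List[str] = []
    for token_id in ids:
        token = id_to_token[token_id]
        if token == "<eos>":
            break                # everything after <eos> is padding
        if token in SPECIALS:
            continue             # <bos>, <sep>, <pad> carry no text
        phrases.append(token)
    return " | ".join(phrases)


# Hypothetical mini-vocabulary, for illustration only.
demo_vocab = {
    0: "<pad>", 1: "<bos>", 2: "<eos>", 3: "<sep>",
    4: "scroll stopping", 5: "clear cta", 6: "thumb stopping",
}
print(ids_to_chain([1, 4, 3, 5, 3, 6, 2, 0, 0], demo_vocab))
```

Because every non-special token is a complete phrase, no sub-word merging is needed when decoding.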
Inference shape
inputs:
encoder_feats (1, 768) float32 # mean-pooled CLIP-ViT-B/32 vision output
scores (1, 4) float32 # [universal_appeal, demographic_appeal, audience_drive, engagement] in [0,1]
scores_present (1,) float32 # 1.0 anchored, 0.0 fast-mode
input_ids (1, T) int64 # decoder context
outputs:
logits (1, T, V) float32
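
The shapes above can be exercised with a small autoregressive loop. The following is a minimal sketch, not the shipped runtime: `run_model` is a hypothetical callable that takes the input dictionary and returns `logits` as a nested list shaped `(1, T, V)`, the `BOS_ID`/`EOS_ID` values are placeholders for the ids stored in `config.json`, and filling `scores` with zeros in fast mode is an assumption. With ONNX Runtime, `run_model` could wrap `InferenceSession.run` after converting the lists to arrays of the listed dtypes.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical ids for illustration; the real values are in config.json.
BOS_ID, EOS_ID = 1, 2


def build_feed(encoder_feats: Sequence[float],
               scores: Optional[Sequence[float]],
               input_ids: Sequence[int]) -> Dict[str, list]:
    """Assemble the four named inputs as nested lists with the documented shapes."""
    if len(encoder_feats) != 768:
        raise ValueError("encoder_feats must hold 768 values")
    anchored = scores is not None
    if anchored and (len(scores) != 4 or any(not 0.0 <= s <= 1.0 for s in scores)):
        raise ValueError("scores must be 4 values in [0, 1]")
    return {
        "encoder_feats": [[float(x) for x in encoder_feats]],   # (1, 768)
        # Assumption: zeros are an acceptable placeholder in fast mode.
        "scores": [[float(s) for s in scores] if anchored else [0.0] * 4],  # (1, 4)
        "scores_present": [1.0 if anchored else 0.0],            # (1,)
        "input_ids": [[int(i) for i in input_ids]],              # (1, T)
    }


def greedy_decode(run_model: Callable[[Dict[str, list]], list],
                  encoder_feats: Sequence[float],
                  scores: Optional[Sequence[float]] = None,
                  max_len: int = 8) -> List[int]:
    """Repeatedly feed the growing context and take the argmax of the last position."""
    ids = [BOS_ID]
    for _ in range(max_len):
        logits = run_model(build_feed(encoder_feats, scores, ids))
        last = logits[0][-1]                       # scores over the vocabulary
        next_id = max(range(len(last)), key=last.__getitem__)
        if next_id == EOS_ID:
            break
        ids.append(next_id)
    return ids[1:]                                 # drop the leading <bos>
```

Passing `scores=None` selects fast mode (`scores_present = 0.0`); passing four pillar scores selects the anchored mode.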
Greedy decode works; temperature 0.8 + top-k 20 + SEP-veto is the recommended sampling config when running on more than one input (prevents the greedy "forgettable | forgettable | forgettable" collapse the v0 model exhibited).
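
One way the recommended sampling step could look is sketched below. The temperature and top-k values come from the text above; the meaning of "SEP-veto" is not spelled out here, so the sketch assumes it means never emitting `<sep>` directly after `<bos>` or another `<sep>`. The function name and the token ids passed to it are hypothetical.

```python
import math
import random
from typing import Optional, Sequence


def sample_next(logits: Sequence[float],
                prev_id: int,
                sep_id: int,
                bos_id: int,
                temperature: float = 0.8,
                top_k: int = 20,
                rng: Optional[random.Random] = None) -> int:
    """Sample one token id with temperature scaling, top-k filtering and a SEP veto."""
    rng = rng if rng is not None else random.Random()
    scaled = [x / temperature for x in logits]

    # Assumed SEP-veto rule: no <sep> right after <bos> or another <sep>.
    if prev_id in (sep_id, bos_id):
        scaled[sep_id] = -math.inf

    # Keep the top_k highest-scoring ids, dropping anything vetoed.
    ranked = sorted(range(len(scaled)), key=scaled.__getitem__, reverse=True)
    keep = [i for i in ranked[:top_k] if scaled[i] > -math.inf]

    # Softmax over the survivors, shifted by the max for numerical stability.
    peak = max(scaled[i] for i in keep)
    weights = [math.exp(scaled[i] - peak) for i in keep]
    return rng.choices(keep, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` makes runs reproducible, which helps when comparing sampled chains against the greedy output.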
Training
15k phrase triples from 5k COCO photos. Each photo scored locally
against the cortexa_v10 head; phrase chains generated by
research.distill_adjectives.phrase_rules.scores_to_phrase. 12 epochs,
AdamW, cosine schedule. Val loss 2.31 → 1.87. See
research/distill_students/train_marketing.py in the app repo.
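
For readers unfamiliar with the term, a cosine schedule decays the learning rate along half a cosine wave. The sketch below shows the standard formula only; the peak rate, floor, and step counts are hypothetical placeholders, and whether the actual training script uses warmup is not stated here.

```python
import math


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = 3e-4, min_lr: float = 0.0) -> float:
    """Learning rate at `step` under plain cosine annealing (no warmup)."""
    progress = min(max(step / total_steps, 0.0), 1.0)   # clamp to [0, 1]
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical run: print the rate at the start of each of 12 epochs.
steps_per_epoch = 100   # placeholder value
for epoch in range(12):
    print(epoch, cosine_lr(epoch * steps_per_epoch, 12 * steps_per_epoch))
```

The rate starts at `peak_lr`, passes the midpoint value halfway through training, and reaches `min_lr` at the final step.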
License
Pleius internal; see https://pleius.com. Not for redistribution.