sentiment-gpt2-medium-l8
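Sentiment steering direction for gpt2-medium: the mean activation on positive movie-review prompts minus the mean activation on negative ones. Positive coefficients steer generation toward positive sentiment; negative coefficients steer toward negative sentiment.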
Vector card
| Field | Value |
|---|---|
| Model | gpt2-medium |
| Hook site | blocks.8.hook_resid_pre |
| Direction norm (raw) | 15.3588 |
| Source paper | Zou et al. 2023 |
| Source run | — |
How to use
mech apply-steering --vector sentiment-gpt2-medium-l8 --coefficient 3.0 --prompt "Your prompt here"
from mech_interp.steering.registry import load_steering_vector
direction, metadata = load_steering_vector("sentiment-gpt2-medium-l8")
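What the coefficient does can be illustrated with a small sketch. It assumes the common activation-addition scheme, in which the scaled direction is added to the residual stream at the hook site; the internals of `mech apply-steering` are not documented here, so treat this as an illustration and not as the tool's implementation. The helper names `apply_steering` and `projection` are hypothetical, and plain Python lists stand in for tensors.

```python
def apply_steering(resid, direction, coefficient):
    """Add coefficient * direction to every position's activation vector.

    resid: list of activation vectors, one per token position (a toy
           stand-in for the residual stream at the hook site).
    direction: unit-norm steering direction of the same width.
    """
    if any(len(pos) != len(direction) for pos in resid):
        raise ValueError("activation width must match direction width")
    return [[h + coefficient * d for h, d in zip(pos, direction)] for pos in resid]


def projection(vec, direction):
    """Scalar projection of vec onto a unit-norm direction."""
    return sum(v * d for v, d in zip(vec, direction))


# Toy 4-dimensional example (gpt2-medium's residual width is 1024).
direction = [0.5, 0.5, 0.5, 0.5]  # unit norm: sqrt(4 * 0.25) = 1
resid = [[1.0, 0.0, -1.0, 2.0], [0.0, 0.0, 0.0, 0.0]]
steered = apply_steering(resid, direction, coefficient=3.0)

# How far each position moved along the direction.
shifts = [projection(s, direction) - projection(r, direction)
          for s, r in zip(steered, resid)]
```

Because the stored direction has unit norm, each position moves by exactly the coefficient along the direction, so the coefficient is expressed in activation units at the hook site.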
Provenance
- Extraction method: mean-difference (Arditi/RepE)
- Source: Zou et al. 2023
- Platform repo: masonwyatt23/sv-sentiment-gpt2-medium
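Mean-difference extraction can be sketched as follows. This is a minimal illustration under stated assumptions, not the extraction script used for this vector: it assumes one activation vector per prompt has already been collected at the hook site, and the function name `mean_difference_direction` and the toy data are hypothetical. It also shows how a raw norm and a unit-norm direction can both come out of the same computation.

```python
import math


def mean_difference_direction(pos_acts, neg_acts):
    """Return (unit_direction, raw_norm) for mean(pos_acts) - mean(neg_acts).

    pos_acts, neg_acts: lists of activation vectors, one per prompt.
    """
    width = len(pos_acts[0])

    def column_means(rows):
        return [sum(row[i] for row in rows) / len(rows) for i in range(width)]

    mean_pos = column_means(pos_acts)
    mean_neg = column_means(neg_acts)
    raw = [p - n for p, n in zip(mean_pos, mean_neg)]

    # Norm of the unnormalized difference (the "raw" norm).
    raw_norm = math.sqrt(sum(x * x for x in raw))
    if raw_norm == 0.0:
        raise ValueError("class means are identical; no direction to extract")
    return [x / raw_norm for x in raw], raw_norm


# Toy activations: the two classes differ only along the first axis.
pos_acts = [[2.0, 1.0], [4.0, 1.0]]
neg_acts = [[0.0, 1.0], [0.0, 1.0]]
unit_direction, raw_norm = mean_difference_direction(pos_acts, neg_acts)
```

In this toy case the raw difference is `[3.0, 0.0]`, so the raw norm is 3 and the normalized direction points along the first axis.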
Files
| File | Description |
|---|---|
| direction.safetensors | Unit-norm direction tensor under key "direction" |
| direction.safetensors.json | Extraction metadata sidecar |
| bundle_metadata.json | Machine-readable bundle manifest |