sentiment-gpt2-medium-l8

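Sentiment steering direction for gpt2-medium, computed as the mean activation on positive movie-review prompts minus the mean activation on negative ones. Positive coefficients steer generations toward positive sentiment.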

Vector card

| Field | Value |
|---|---|
| Model | gpt2-medium |
| Hook site | blocks.8.hook_resid_pre |
| Direction norm (raw) | 15.3588 |
| Source paper | Zou et al. 2023 |
| Source run | - |

How to use

From the command line:

    mech apply-steering --vector sentiment-gpt2-medium-l8 --coefficient 3.0 --prompt "Your prompt here"

From Python:

    from mech_interp.steering.registry import load_steering_vector

    direction, metadata = load_steering_vector("sentiment-gpt2-medium-l8")
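
What a steering coefficient does at the hook site can be illustrated without loading the model. The sketch below assumes the unit-norm direction is scaled by the coefficient and added to the residual stream at every token position; this card does not show how the `mech` tool implements steering, so the real behavior may differ. The helper names (`unit`, `dot`, `apply_steering`) and the toy 4-dimensional vectors are hypothetical; the residual stream of gpt2-medium has 1024 dimensions.

```python
import math


def unit(v):
    """Return v scaled to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def apply_steering(resid, direction, coefficient):
    """Add coefficient * direction to each position's residual vector.

    resid: list of per-token vectors (seq_len x d_model).
    direction: unit-norm vector of length d_model (assumed, as in this bundle).
    """
    return [
        [r + coefficient * d for r, d in zip(row, direction)]
        for row in resid
    ]


# Toy data (hypothetical values): 2 token positions, d_model = 4.
direction = unit([1.0, 0.0, -1.0, 0.0])
resid = [[0.5, 0.2, 0.1, -0.3], [1.0, -1.0, 0.0, 0.4]]

steered = apply_steering(resid, direction, coefficient=3.0)

# With a unit-norm direction, each position's projection onto the
# direction shifts by exactly the coefficient.
shifts = [dot(s, direction) - dot(r, direction) for s, r in zip(steered, resid)]
```

Under this assumption the coefficient has a direct reading: it is the distance each activation moves along the sentiment direction, in residual-stream units.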

Provenance

  • Extraction method: mean-difference (Arditi/RepE)
  • Source: Zou et al. 2023
  • Platform repo: masonwyatt23/sv-sentiment-gpt2-medium
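
The mean-difference method named above can be sketched as: average the activations collected at the hook site for positive prompts, subtract the average for negative prompts, then normalize the result. This is a toy illustration under stated assumptions, not the pipeline that produced this vector. The activations are made-up 3-dimensional lists, and `mean_vector` and `mean_difference_direction` are hypothetical helpers. The length of the difference before normalization corresponds to what the vector card reports as the raw direction norm.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mean_difference_direction(pos_acts, neg_acts):
    """Return (unit_direction, raw_norm) for mean(pos_acts) - mean(neg_acts)."""
    pos_mean = mean_vector(pos_acts)
    neg_mean = mean_vector(neg_acts)
    diff = [p - q for p, q in zip(pos_mean, neg_mean)]
    raw_norm = math.sqrt(sum(x * x for x in diff))
    if raw_norm == 0.0:
        raise ValueError("positive and negative means are identical")
    return [x / raw_norm for x in diff], raw_norm


# Toy activations (hypothetical): one vector per prompt.
pos_acts = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]  # mean is [3.0, 1.0, 0.0]
neg_acts = [[0.0, 1.0, 4.0], [0.0, 1.0, 4.0]]  # mean is [0.0, 1.0, 4.0]

direction, raw_norm = mean_difference_direction(pos_acts, neg_acts)
# The difference of means is [3.0, 0.0, -4.0], so raw_norm is 5.0.
```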

Files

| File | Description |
|---|---|
| direction.safetensors | Unit-norm direction tensor under key "direction" |
| direction.safetensors.json | Extraction metadata sidecar |
| bundle_metadata.json | Machine-readable bundle manifest |