I’m building a tool called psyctl at Modulabs Persona Lab.

In short, it’s a project about changing an LLM’s personality without fine-tuning.

How It Works

We extract vectors like “extroverted direction” or “introverted direction” from the model’s internal activations, then add those directions to the activations during inference to shift the personality. The technique is called Contrastive Activation Addition (CAA). It’s fascinating that behavior changes with nothing more than vector addition, no training required.
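
To make the idea concrete, here is a minimal sketch of the mean-difference flavor of CAA on toy data. It is not psyctl’s implementation: the function names (`mean_diff_vector`, `steer`), the hidden size, and the random “activations” are all assumptions for illustration. In a real model the activations would come from a chosen layer, and the addition would happen inside the forward pass.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def mean_diff_vector(pos_acts, neg_acts):
    """Steering vector: mean of positive activations minus mean of negative ones."""
    pos_mean = mean_vector(pos_acts)
    neg_mean = mean_vector(neg_acts)
    return [p - q for p, q in zip(pos_mean, neg_mean)]

def steer(hidden_states, vector, strength=1.0):
    """Add the scaled steering vector to the hidden state at every token position."""
    return [[h + strength * v for h, v in zip(row, vector)] for row in hidden_states]

random.seed(0)
HIDDEN_SIZE = 8  # hypothetical; real models use thousands of dimensions

def toy_activations(n, shift):
    """Gaussian noise, shifted along dimension 0 to mimic a personality direction."""
    rows = []
    for _ in range(n):
        row = [random.gauss(0.0, 1.0) for _ in range(HIDDEN_SIZE)]
        row[0] += shift
        rows.append(row)
    return rows

pos = toy_activations(32, +2.0)  # stand-in for "extroverted" completions
neg = toy_activations(32, -2.0)  # stand-in for "introverted" completions

vec = mean_diff_vector(pos, neg)
hidden = toy_activations(5, 0.0)  # 5 token positions of an unrelated prompt
steered = steer(hidden, vec, strength=1.5)
```

The `strength` multiplier is the usual knob in this kind of setup: a larger value pushes harder along the direction, and a negative value pushes the opposite way.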


graph LR
    A[Generate Contrastive Dataset] --> B[Extract Steering Vector]
    B --> C[Inject Vector into Model]
    C --> D[Validate with Psych Tests]

What psyctl Does

It’s a tool that lets you run the entire pipeline above from a single CLI.

# Dataset generation → Vector extraction → Application → Evaluation
psyctl dataset.build.steer --personality Extroversion --output ./data
psyctl extract.steering --dataset ./data --method mean_diff --output ./vec.safetensors
psyctl steering --steering-vector ./vec.safetensors --input "Tell me about yourself"
psyctl benchmark inventory --steering-vector ./vec.safetensors

It supports two vector extraction methods — Mean Difference (statistics-based) and BiPO (optimization-based) — and evaluates using standard psychological instruments like IPIP-NEO (Big Five) and NPI-40 (Narcissism).

It works with any HuggingFace-compatible model including Llama and Gemma.

Interested?

The code is fully open on GitHub. Check it out:

👉 github.com/modulabs-personalab/psyctl