Influencing Token Attention

The lever you have — and its limits. Interactive companion to "Influencing Token Attention Through Prompting"

Note: Attention weights and vector distances shown are conceptual illustrations, not empirical measurements. Actual attention weights are computed inside models and are not directly inspectable during normal use. These visualisations demonstrate the direction of effects, not measured magnitudes.

Stage 1 — Explicit Weighting Instructions
Without Steering

"Summarise this medical report"

"Drug X shows 40% improvement in a small trial"

Drug         0.18
X            0.12
shows        0.05
40%          0.28
improvement  0.22
in           0.02
a            0.01
small        0.06
trial        0.06
With Steering

"Summarise this medical report. Sample size and methodology limitations are as important as efficacy findings."

"Drug X shows 40% improvement in a small trial"

Drug         0.12
X            0.08
shows        0.04
40%          0.15
improvement  0.13
in           0.02
a            0.01
small        0.22
trial        0.23

What changed: The steering tokens ("sample size," "methodology," "limitations") now exist in the context. During attention computation, "small" and "trial" have higher similarity to these new tokens. The weights shift — limitation-related content gets more attention, headline numbers get less. Same input, different emphasis.
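The shift can be sketched with a toy softmax calculation. Everything here is invented for illustration: the similarity scores are made up, and the flat "steering bonus" stands in for the higher query-key similarity that limitation-related tokens gain once the steering text sits in the context.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy similarity scores between a summarisation query and each token.
# Purely illustrative; real attention scores are not inspectable.
tokens = ["Drug", "X", "shows", "40%", "improvement", "in", "a", "small", "trial"]
scores = [1.2, 0.8, 0.1, 1.7, 1.5, -0.5, -0.9, 0.3, 0.3]

baseline = softmax(scores)

# Steering text mentioning "sample size" and "limitations" raises the
# similarity of limitation-related tokens and slightly lowers the rest.
steered_scores = [s + (1.3 if t in ("small", "trial") else -0.2)
                  for t, s in zip(tokens, scores)]
steered = softmax(steered_scores)

for t, b, s in zip(tokens, baseline, steered):
    print(f"{t:12s} {b:.2f} -> {s:.2f}")
```

Because softmax weights must sum to 1, raising "small" and "trial" necessarily pulls weight away from "40%" and "improvement" — the same redistribution shown above.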

Stage 2 — Context Priming
Priming Text (appears first)

"This document contains important qualifications in sections 3 and 7 that are often overlooked. Read carefully for nuance before summarising."

Document Content

[The actual document to be summarised...]

Closing Instruction

"Now summarise, preserving those qualifications."

[Position diagram: strong attention at the start, weaker in the middle, strong attention at the end]

Why position matters: Under causal attention, earlier tokens are visible to every later position, so attention to them accumulates across the sequence. The priming tokens ("qualifications," "overlooked," "nuance") get a positional advantage. When the model later encounters qualifying content in the document, those attractor tokens are already established as reference points. Closing instructions reinforce at the other high-attention position, the end of the context.
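The accumulation can be sketched with a toy causal-attention model. The uniform spread is an assumption made purely for illustration; real attention is anything but uniform, but the asymmetry of the causal mask is the same.

```python
# Toy model of causal attention: position j can only attend to
# positions 0..j. Assume (for illustration only) that each position
# spreads its attention uniformly over the tokens it can see.
n = 8
received = [0.0] * n
for j in range(n):
    weight = 1.0 / (j + 1)      # uniform share from position j
    for i in range(j + 1):
        received[i] += weight   # token i collects its share

for i, r in enumerate(received):
    print(f"position {i}: total attention received {r:.2f}")
```

The earliest token is visible to every later position, so its total keeps growing; the last token is seen only by itself. That asymmetry is the positional advantage priming tokens exploit.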

Stage 3 — Relationship Override
[Vector-space diagram: an "animals" cluster (cat, dog, pet) sits far from a "governance" cluster (democracy, voting, election)]

"cat ownership and democratic participation"

Vector distance between "cat" and "democracy": 0.847 (far apart in trained space)

The key insight: In trained vector space, "cat" and "democracy" are far apart — they rarely co-occur meaningfully in training data. But if you explicitly state their relationship matters for your analysis, you create a contextual bridge. The model now has tokens establishing cat-democracy relevance that participate in attention calculations. You're not retraining — the underlying distances remain — but you're creating local influence that the trained weights alone wouldn't produce.
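A toy cosine-distance calculation illustrates the bridge. The 2-D vectors, the bridge vector, and the 0.7/0.3 mixing weights are all invented; they stand in for trained embeddings and one round of attention mixing with a prompt token that asserts the cat-democracy relationship.

```python
import math

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1 - dot / (norm_u * norm_v)

# Toy 2-D "embeddings": one animal-leaning vector, one governance-leaning.
cat = [0.9, 0.1]
democracy = [0.1, 0.9]
bridge = [0.5, 0.5]  # a prompt token linking the two concepts

print(round(cosine_distance(cat, democracy), 3))

# Attention mixing with the bridge token pulls both *contextual*
# representations toward a shared direction; the trained vectors
# themselves never move.
cat_ctx = [0.7 * c + 0.3 * b for c, b in zip(cat, bridge)]
dem_ctx = [0.7 * d + 0.3 * b for d, b in zip(democracy, bridge)]
print(round(cosine_distance(cat_ctx, dem_ctx), 3))
```

The contextual representations end up closer than the trained ones, which is the "local influence" the passage describes: the underlying distances survive, but the in-context versions converge.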

Stage 4 — The Limits
Trained Landscape (billions of parameters) + Your Prompt Influence (local distortion) = Combined Effect (influence, not replacement)
Can't see what you're changing
Attention weights are computed inside the model. You influence them but can't inspect them. This is iterative work — prompt, observe, adjust.
Can't override training entirely
Trained weights establish the baseline landscape. Your prompt reshapes local terrain but doesn't move mountains.
Can't prevent all failure modes
Confabulation, overgeneralisation, positional bias are architectural. Steering mitigates but doesn't eliminate. Verification remains essential.
Diminishing returns on length
More emphasis text means more tokens competing for attention. At some point, steering instructions become noise. Precision beats volume.
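The dilution effect on length is visible in a toy softmax: one key instruction token competing with k filler emphasis tokens of equal score. The scores (2.0 and 1.0) are invented; only the direction of the effect matters.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Weight captured by one key instruction token (toy score 2.0)
# against k filler emphasis tokens (toy score 1.0 each).
def key_weight(k):
    return softmax([2.0] + [1.0] * k)[0]

for k in (1, 5, 20, 80):
    print(f"{k:3d} competing tokens -> key weight {key_weight(k):.3f}")
```

Every added emphasis token claims a slice of a fixed budget, so the key instruction's share falls monotonically, which is why precision beats volume.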

The honest position: You have a lever. It's real and it works. But it's influence, not control. The model's behaviour follows a landscape shaped by billions of parameters learned from massive training; your prompt adds local constraints that reshape which outputs that landscape favours. That's genuinely useful, but it's not magic, and it doesn't replace verification.