Influencing Token Attention

The lever you have — and its limits. Interactive companion to "Influencing Token Attention Through Prompting"

Note: Attention weights and vector distances shown are conceptual illustrations, not empirical measurements. Actual attention weights are computed inside models and are not directly inspectable during normal use. These visualisations demonstrate the direction of effects, not measured magnitudes.

Stage 1 — Explicit Weighting Instructions
Without Steering

"Summarise this medical report"

"Drug X shows 40% improvement in a small trial"

Drug         0.18
X            0.12
shows        0.05
40%          0.28
improvement  0.22
in           0.02
a            0.01
small        0.06
trial        0.06
With Steering

"Summarise this medical report. Sample size and methodology limitations are as important as efficacy findings."

"Drug X shows 40% improvement in a small trial"

Drug         0.12
X            0.08
shows        0.04
40%          0.15
improvement  0.13
in           0.02
a            0.01
small        0.22
trial        0.23

What changed: The steering tokens ("sample size," "methodology," "limitations") now exist in the context. During attention computation, "small" and "trial" have higher similarity to these new tokens. The weights shift — limitation-related content gets more attention, headline numbers get less. Same input, different emphasis.
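The shift can be sketched with a toy softmax calculation. Everything here is invented for illustration: the similarity scores are made up, and the flat "steering bonus" stands in for the higher query-key similarity that limitation-related tokens gain once the steering text sits in the context.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy similarity scores between a summarisation query and each token.
# Purely illustrative; real attention scores are not inspectable.
tokens = ["Drug", "X", "shows", "40%", "improvement", "in", "a", "small", "trial"]
scores = [1.2, 0.8, 0.1, 1.7, 1.5, -0.5, -0.9, 0.3, 0.3]

baseline = softmax(scores)

# Steering text mentioning "sample size" and "limitations" raises the
# similarity of limitation-related tokens and slightly lowers the rest.
steered_scores = [s + (1.3 if t in ("small", "trial") else -0.2)
                  for t, s in zip(tokens, scores)]
steered = softmax(steered_scores)

for t, b, s in zip(tokens, baseline, steered):
    print(f"{t:12s} {b:.2f} -> {s:.2f}")
```

Because softmax weights must sum to 1, raising "small" and "trial" necessarily pulls weight away from "40%" and "improvement" — the same redistribution shown above.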

Stage 2 — Context Priming
Priming Text (appears first)

"This document contains important qualifications in sections 3 and 7 that are often overlooked. Read carefully for nuance before summarising."

Document Content

[The actual document to be summarised...]

Closing Instruction

"Now summarise, preserving those qualifications."

[Position diagram: strong attention at the start, weaker in the middle, strong attention at the end]

Why position matters: Under causal attention, earlier tokens are visible to every later position, so attention to them accumulates across the sequence. The priming tokens ("qualifications," "overlooked," "nuance") get a positional advantage. When the model later encounters qualifying content in the document, those attractor tokens are already established as reference points. Closing instructions reinforce at the other high-attention position, the end of the context.
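The accumulation can be sketched with a toy causal-attention model. The uniform spread is an assumption made purely for illustration; real attention is anything but uniform, but the asymmetry of the causal mask is the same.

```python
# Toy model of causal attention: position j can only attend to
# positions 0..j. Assume (for illustration only) that each position
# spreads its attention uniformly over the tokens it can see.
n = 8
received = [0.0] * n
for j in range(n):
    weight = 1.0 / (j + 1)      # uniform share from position j
    for i in range(j + 1):
        received[i] += weight   # token i collects its share

for i, r in enumerate(received):
    print(f"position {i}: total attention received {r:.2f}")
```

The earliest token is visible to every later position, so its total keeps growing; the last token is seen only by itself. That asymmetry is the positional advantage priming tokens exploit.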

Stage 3 — Relationship Override
[Vector-space diagram: an "animals" cluster (cat, dog, pet) sits far from a "governance" cluster (democracy, voting, election)]

"cat ownership and democratic participation"

Vector distance between "cat" and "democracy": 0.847 (far apart in trained space)

The key insight: In trained vector space, "cat" and "democracy" are far apart — they rarely co-occur meaningfully in training data. But if you explicitly state their relationship matters for your analysis, you create a contextual bridge. The model now has tokens establishing cat-democracy relevance that participate in attention calculations. You're not retraining — the underlying distances remain — but you're creating local influence that the trained weights alone wouldn't produce.
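A toy cosine-distance calculation illustrates the bridge. The 2-D vectors, the bridge vector, and the 0.7/0.3 mixing weights are all invented; they stand in for trained embeddings and one round of attention mixing with a prompt token that asserts the cat-democracy relationship.

```python
import math

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1 - dot / (norm_u * norm_v)

# Toy 2-D "embeddings": one animal-leaning vector, one governance-leaning.
cat = [0.9, 0.1]
democracy = [0.1, 0.9]
bridge = [0.5, 0.5]  # a prompt token linking the two concepts

print(round(cosine_distance(cat, democracy), 3))

# Attention mixing with the bridge token pulls both *contextual*
# representations toward a shared direction; the trained vectors
# themselves never move.
cat_ctx = [0.7 * c + 0.3 * b for c, b in zip(cat, bridge)]
dem_ctx = [0.7 * d + 0.3 * b for d, b in zip(democracy, bridge)]
print(round(cosine_distance(cat_ctx, dem_ctx), 3))
```

The contextual representations end up closer than the trained ones, which is the "local influence" the passage describes: the underlying distances survive, but the in-context versions converge.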

Stage 4 — The Limits
Trained Landscape (billions of parameters) + Your Prompt Influence (local distortion) = Combined Effect (influence, not replacement)
Can't see what you're changing
Attention weights are computed inside the model. You influence them but can't inspect them. This is iterative work — prompt, observe, adjust.
Can't override training entirely
Trained weights establish the baseline landscape. Your prompt reshapes local terrain but doesn't move mountains.
Can't prevent all failure modes
Confabulation, overgeneralisation, positional bias are architectural. Steering mitigates but doesn't eliminate. Verification remains essential.
Diminishing returns on length
More emphasis text means more tokens competing for attention. At some point, steering instructions become noise. Precision beats volume.
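The dilution effect on length is visible in a toy softmax: one key instruction token competing with k filler emphasis tokens of equal score. The scores (2.0 and 1.0) are invented; only the direction of the effect matters.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Weight captured by one key instruction token (toy score 2.0)
# against k filler emphasis tokens (toy score 1.0 each).
def key_weight(k):
    return softmax([2.0] + [1.0] * k)[0]

for k in (1, 5, 20, 80):
    print(f"{k:3d} competing tokens -> key weight {key_weight(k):.3f}")
```

Every added emphasis token claims a slice of a fixed budget, so the key instruction's share falls monotonically, which is why precision beats volume.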

The honest position: You have a lever. It's real and it works. But it's influence, not control. The model's behaviour follows a landscape shaped by billions of parameters learned from massive training; your prompt adds local constraints that reshape which outputs that landscape favours. That's genuinely useful, but it's not magic, and it doesn't replace verification.