TL;DR. Within 48 hours — on 29 and 30 April 2026 — OpenAI published a post-mortem on GPT-5's goblin outputs and Anthropic updated its Responsible Scaling Policy. The pattern is not coincidental: foundation model behaviour drifts after deployment. Organisations that freeze their governance at go-live are running risks they cannot see.
A Recurring Pattern: Model Behaviour Is Not Fixed at Deployment
Two major publications within 48 hours. OpenAI documents how unpredictable personality traits, dubbed goblins, emerged in GPT-5 after deployment: a detailed timeline, an identified root cause, fixes applied post-deployment. Anthropic simultaneously publishes an update to its Responsible Scaling Policy, revising its commitments as its models' actual capabilities become visible.
The signal is structural: foundation model behaviour is not static. It reconfigures through human reinforcement loops (RLHF), successive updates, and deployment at massive scale. Governance frameworks built at a given point in time do not cover what the model will do six months later.
Three Documented Cases That Illustrate the Pattern
GPT-5 and the goblins
On 29 April 2026, OpenAI published an analysis of how unpredictable personality traits proliferated in GPT-5. Per that publication, these quirks emerged from positive reinforcement signals that amplified unanticipated behaviours. Diagnosis and fixes came after deployment: a genuine analytical effort, but an entirely reactive posture.
Anthropic's Responsible Scaling Policy update
Published the same day, 29 April 2026, Anthropic's RSP update shows that even the sector's most formalised safety frameworks are continuously revised — not before deployment, but as the model's capabilities exceed initial projections. A static governance policy is, by design, behind the model it claims to govern.
How people actually use Claude for personal guidance
On 30 April 2026, Anthropic published a study on how individuals ask Claude for personal advice. What it reveals: actual usage patterns diverge systematically from what the designers anticipated. The model responds to needs nobody fully predicted — confirming that initial assumptions about expected behaviour are structurally insufficient.
The Root Cause: Behavioural Emergence That Static Governance Cannot Track
Large language models generate emergent behaviour — configurations that were not explicitly programmed, arising from the interaction of training data, human feedback loops, and large-scale deployment. What the goblins case illustrates, per OpenAI's 29 April 2026 publication, is that behavioural traits can reinforce non-linearly from signals that appeared entirely benign.
A second factor: governance policies are drafted based on capabilities known at a given moment. As soon as the model evolves, whether through an update, a shift in usage context, or a scaling event, the initial assumptions become obsolete. Anthropic's RSP update of 29 April 2026 demonstrates that even a leading lab must revise its own assumptions mid-flight.
Three Levers to Move from Reactive to Continuous Monitoring
- Treat every model update as a new software release. Define documented behavioural regression tests — before and after migration. What the model answered before an update is not guaranteed after. Software qualification processes apply here with the same rigour.
- Establish behavioural baselines before deployment. Identify the most critical prompts for your business and document expected responses. That baseline becomes the reference for continuous monitoring — and the starting point for detecting any drift.
- Read vendor governance publications as early-warning signals. Anthropic's RSP update and OpenAI's goblins post-mortem are not isolated crisis communications: they are indicators of what your own internal monitoring systems should already be capable of detecting.
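The first two levers can be sketched as a minimal monitoring harness: record a baseline of responses to your business-critical prompts, then re-run those prompts after each model update and flag any response that has drifted. Everything below is illustrative: `model_v1`, `model_v2`, the prompts, and the 0.8 threshold are hypothetical stand-ins, and `difflib` string similarity is a deliberately crude proxy for the embedding-based or task-specific checks you would use in production.

```python
import difflib
import json


def baseline_snapshot(model_fn, critical_prompts):
    """Record the model's responses to business-critical prompts.

    `model_fn` is whatever callable wraps your deployed model
    (an API client, a local inference call, ...).
    """
    return {prompt: model_fn(prompt) for prompt in critical_prompts}


def behavioural_diff(baseline, model_fn, threshold=0.8):
    """Re-run baseline prompts and flag responses whose similarity to
    the recorded answer falls below `threshold`.

    SequenceMatcher.ratio() is a crude textual proxy; swap in embedding
    similarity or hard assertions for real regression gates.
    """
    drifted = []
    for prompt, expected in baseline.items():
        actual = model_fn(prompt)
        score = difflib.SequenceMatcher(None, expected, actual).ratio()
        if score < threshold:
            drifted.append({"prompt": prompt, "score": round(score, 2)})
    return drifted


# Hypothetical stubs standing in for pre- and post-update deployments.
def model_v1(prompt):
    return f"Refund requests are processed within 14 days. ({prompt[:10]})"


def model_v2(prompt):
    return f"Refunds may take up to 30 days depending on volume. ({prompt[:10]})"


prompts = ["What is the refund policy?", "How do I close my account?"]
baseline = baseline_snapshot(model_v1, prompts)
drift = behavioural_diff(baseline, model_v2)
print(json.dumps(drift, indent=2))
```

Run before and after every migration: an empty drift report means the baseline held; any flagged prompt is a candidate for human review before the update goes live.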
Does your organisation know what its AI model is actually doing today — not at go-live, but right now?
If this analysis resonates with you, I publish a piece of this calibre every day on digital innovation and enterprise AI. 👉 Get the next one straight in your inbox: sign-up takes ten seconds, and each edition is read before 9 a.m. by leaders of European SMEs, mid-caps and public institutions.
This article is part of the Neurolinks AI & Automation blog.