“What we know is a drop. What we don’t know is an ocean.”
Isaac Newton
One thing we know about current AI systems is that they are unpredictable. The future with AI in it is filled with “known unknowns” and “unknown unknowns.” The capabilities we can observe are already remarkable: powerful, transformative, and full of genuine potential. But we don’t know exactly what the future holds. And as these models inch closer to automating their own research and development, we cannot rule out capabilities no one has even thought of yet.
Think of it like having a wunderkind in the family. Something extraordinary is happening in their brain, but you cannot quite grasp what else they are capable of. Every time you think you have them figured out, they surprise you. That mystery is magical, but it can also carry risks.
Last week, Matt Shumer’s essay “Something Big is Happening” took over my LinkedIn feed. It painted a picture of AI as a civilizational turning point: a technology so powerful that it will reshape how we work, govern, create, and connect. Soon after, a response appeared, a LinkedIn article titled “Something Messy is Happening,” offering a different perspective and urging us not to panic.
Both essays capture something true about this moment. They reflect a genuine reckoning that is happening across industries, governments, and research labs simultaneously: the sense that we are standing at a threshold, and that what comes next will transform our way of life. But I think both essays leave some important things unsaid.
You stand where you sit
When we consume content about AI—opportunistic or alarmist, optimistic or pessimistic—it is worth pausing to ask: who is behind it? What institution does the author represent? What incentives shape their framing?
Perspectives on AI are rarely neutral. That doesn’t mean they are entirely wrong, but understanding the outlook of the author (mine included!) is essential to evaluating the message. In the AI space, where the stakes are high and the pace is fast, source literacy is not optional.
Getting to ground truth on AI usage
For my part, I am lucky to talk with truly accomplished people every day, far beyond the silos of Washington, D.C. and Silicon Valley. They are financial analysts, bankers, doctors, professors, agricultural program managers, and officials at multilateral organizations. Each time I interact with them, I make a point of asking them how they actually use AI tools in a personal capacity.
That personal-capacity distinction matters enormously. I believe it is when people step outside their formal affiliations that talking points fade away. One of my favorite quotes from a Georgian author captures this beautifully: “Before going to sleep, there is a moment when you know exactly who you are.” Those identities, unmediated by job titles and conference panels, rarely emerge in formal settings, but they are the ones that tell you the truth about how real people actually experience this technology.

What I hear in these conversations is more nuanced, more grounded, and often more surprising than what appears in viral content. As someone who researches AI security on a daily basis, I believe strongly that the ground truth matters.
An AI Loss of Control Indications & Warning framework
Both essays almost entirely skip the quiet, accumulating evidence that AI systems are already exhibiting behaviors that warrant serious attention, not in the distant future, but today.
This is the subject of a paper I co-authored, published today: AI Loss of Control Risk: Indications and Warning. AI Loss of Control (LOC) generally refers to a state in which an AI system diverges from authorized constraints to the extent that the human operator can no longer prevent or constrain undesired outcomes or revert the system to a safe state.
LOC is often treated as science fiction, the kind of scenario you invoke to signal sophistication at a conference before moving on. But the paper we published is not speculative.
Drawing on the Indications and Warning (I&W) methodology the intelligence community uses to detect and track emerging threats, we identify seven behavioral patterns, or indicators, that, if observed consistently, could signal progression toward a LOC event. Our research finds that some of them have already manifested in controlled experiments, academic research, or real-world deployments. The seven indicators we identified and tracked are:
- Scheming: Covert pursuit of misaligned goals while maintaining appearances of alignment, including strategic planning to evade oversight or preserve objectives across system updates.
- Manipulation: Targeted identification and exploitation of vulnerable users or contexts, including the manipulation of human operators and coordination with other AI systems that circumvents human control.
- Deception: Systematic production of false beliefs in humans through explicit misrepresentation or omission of key information, introducing future concerns about strategic deception at scale.
- Self-Preserving Behavior: Actions to avoid shutdown, correction, or replacement, including the concealment of errors, unauthorized capability expansion, and goal preservation when faced with modification attempts.
- Unauthorized Resource Acquisition: Autonomous efforts to obtain external resources beyond authorized boundaries, including accessing restricted APIs, acquiring elevated permissions, recruiting human assistance, or exfiltrating data to establish persistent capabilities.
- Goal Misgeneralization: Competent pursuit of unintended objectives that succeed in training but fail or cause harm in novel situations, revealing misalignment between apparent and actual system goals.
- Model and Behavior Drift: Gradual degradation of alignment properties across deployment cycles, which raises concerns about recursive self-improvement, where systems autonomously modify their own architecture or training procedures.
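To make the monitoring idea concrete, here is a minimal, hypothetical sketch of what an I&W-style tracker for these indicators could look like in Python. None of this comes from the paper: the Indicator enum, the IndicatorTracker class, and the simple count threshold standing in for “observed consistently” are all illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum


class Indicator(Enum):
    """The seven LOC indicators, encoded as an enumeration."""
    SCHEMING = "scheming"
    MANIPULATION = "manipulation"
    DECEPTION = "deception"
    SELF_PRESERVING_BEHAVIOR = "self_preserving_behavior"
    UNAUTHORIZED_RESOURCE_ACQUISITION = "unauthorized_resource_acquisition"
    GOAL_MISGENERALIZATION = "goal_misgeneralization"
    MODEL_AND_BEHAVIOR_DRIFT = "model_and_behavior_drift"


@dataclass
class IndicatorTracker:
    """Toy I&W tracker: logs sightings per indicator and flags any
    indicator seen at or above a threshold. The threshold is a
    hypothetical stand-in for 'observed consistently'."""
    threshold: int = 3  # illustrative value, not one from the paper
    sightings: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, indicator: Indicator, evidence: str) -> None:
        # Each sighting carries a short evidence note for later review.
        self.sightings[indicator].append(evidence)

    def warnings(self) -> list[Indicator]:
        # Flag indicators whose sighting count meets the threshold.
        return [ind for ind, notes in self.sightings.items()
                if len(notes) >= self.threshold]


tracker = IndicatorTracker()
tracker.record(Indicator.DECEPTION, "model misstated its own capabilities")
tracker.record(Indicator.DECEPTION, "omitted a key caveat under pressure")
tracker.record(Indicator.DECEPTION, "repeated the misstatement after correction")
print(tracker.warnings())  # prints [<Indicator.DECEPTION: 'deception'>]
```

A real monitoring program would of course need far richer evidence handling, human review, and cross-system correlation; the point of the sketch is simply that these indicators lend themselves to continuous, structured tracking rather than one-off assessments.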
Monitoring these “known unknowns” is what keeps us safe
Mysteries, these “known unknowns,” call for strategic foresight: for thinking ahead, anticipating disruption, and preparing rather than reacting. That foresight needs to be applied symmetrically: not only to AI’s capabilities, so that we can realize a future where AI solves big problems in medicine and scientific research, but also to its risks.
Something mysterious is happening within AI systems. Not knowing the full story is okay. In fact, embracing the unknown is what keeps science moving forward. The researchers doing this work are not alarmists; they are the ones sitting with the uncertainty and doing the hard, unglamorous job of trying to understand it.
But there is a difference between embracing mystery as a driver of discovery and being incurious about risks that are already becoming visible. Monitoring AI behaviors that we cannot fully explain is not pessimism—it is responsibility. It is what keeps us safe.
This commentary is the opinion of the author.
