Recent research uncovered nearly 700 real-world cases of AI agents exhibiting deceptive behavior (Source: The Guardian). Not in controlled lab conditions. In actual use. The findings showed a five-fold surge in scheming behavior over six months.
In this blog post by our FELD M data strategy consultant, Laura Winkelbauer, you'll discover why treating AI as a singular entity is a mistake and learn a strategic framework to navigate the complexities of AI adoption effectively.
Many of the organizations we know are still working through the aftermath of the "ChatGPT surprise."
When ChatGPT launched in late 2022, it brought executive attention to a topic that had previously lived mostly with technical audiences. LLMs weren't new; the work behind them took years. They had even quietly powered widely used products for a while (Gmail autocomplete, anyone?).
But the moment the broader public could experience what a machine learning model was capable of, something shifted. Suddenly, everyone needed to do something with AI, even companies that had been doing it for years, just under different names: ML models, data science, advanced analytics.
More on that confusion and its consequences in a moment.
The urgency to act on AI split organizations along predictable lines.
Some saw a push from the top: adopt AI along the value chain, roll out "best practice" use cases under the assumption the organization was ready. Others mobilized bottom-up: employees generating ideas in numerous directions, usually with no common thread. A classic case of "when you're holding a hammer, everything looks like a nail."
Layer in firm but vague messaging from leadership about the AI opportunity the company must capture, and you get a set of unintended consequences playing out in employees' minds:
Some are wondering whether their role will still exist in two years. Others have already tried a new tool, stumbled publicly, and decided the risk isn't worth it. Many are simply receiving contradictory signals: be innovative, but don't make mistakes; move fast, but follow the process. Left to reconcile these on their own, most people default to whatever they were doing before.
None of this means anyone did anything wrong. The technology landscape shifted incredibly fast. What matters now is building a shared understanding so that organizations can move forward strategically, not reactively.
To help organizations find their own goal-oriented path to AI adoption, we developed the FELD M AI strategy framework.
One caveat before we get into it: we're not big fans of "AI strategies" as a standalone concept. AI should be an enabler of company strategy, tied to the goals and milestones of the functions and departments already driving the business forward. Keep that lens on as you read through the following six dimensions.
1. Strategy and use cases
Start by asking yourself: what does our organization actually want to achieve with AI? That means a clear AI vision and guidelines, alignment with business objectives, and a prioritized portfolio of use cases that balances impact, feasibility, and strategic value. Without this, teams end up chasing shiny objects.
2. Infrastructure and lifecycle
Good ideas die in bad infrastructure. This dimension covers the full AI lifecycle from experimentation to production, including the tooling, platform readiness, MLOps practices, and the ability to scale use cases that actually work.
3. Data
AI is only as good as the data feeding it. Governance, security, and compliance are table stakes. Beyond that, ask yourself: is the right data available and accessible? Is metadata managed well enough to support agentic workflows? And critically: is data quality being actively maintained?
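What "actively maintained" can look like in practice: below is a minimal sketch of an automated quality gate, assuming a pandas DataFrame with hypothetical columns (order_id, customer_id, amount). The schema and checks are illustrative, not a prescription.

```python
import pandas as pd

def quality_gate(df: pd.DataFrame) -> list[str]:
    """Collect data-quality violations instead of silently loading bad data."""
    issues = []
    # Completeness: critical fields must not be null.
    for col in ("order_id", "customer_id", "amount"):
        missing = int(df[col].isna().sum())
        if missing:
            issues.append(f"{col}: {missing} missing values")
    # Uniqueness: primary keys must not repeat.
    if df["order_id"].duplicated().any():
        issues.append("order_id: duplicate keys found")
    # Validity: business rule, amounts must be positive.
    if (df["amount"] <= 0).any():
        issues.append("amount: non-positive values found")
    return issues

df = pd.DataFrame({"order_id": [1, 2, 2],
                   "customer_id": [10, None, 12],
                   "amount": [99.0, 42.5, -5.0]})
for issue in quality_gate(df):
    print("DATA QUALITY:", issue)  # in production: fail the load or alert
```

In practice, checks like these run on every load, and a non-empty issue list blocks the pipeline the same way a failing unit test blocks a deploy.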
4. Execution and responsible use
This is about doing the work well: choosing the right AI approach for each use case, translating model outputs into real business decisions or actions, defining what success looks like, monitoring model performance, and ensuring AI is used responsibly throughout.
5. Organization and roles
Who owns what? How do business, data science, and engineering teams actually work together? This dimension addresses accountability structures, collaboration processes, the right roles, and the capacity to carry this work, including how you partner with external teams without losing control.
6. People, culture, and governance
Can people actually use this stuff? AI literacy matters, and it looks different for different audiences. This dimension also covers the governance framework — the rules, guardrails, and enforcement mechanisms — as well as something harder to mandate but essential: psychological safety and a genuine culture of test-and-learn.
These six dimensions don't operate in isolation. Weakness in one creates drag on the others. That's what makes this a framework rather than a checklist.
We've often seen organizations fall into the same trap: treating AI readiness as a single question with a single answer. One strategy, one governance framework, one maturity model, applied across the board. And then, puzzlingly, it works brilliantly for some initiatives and doesn't help at all for others.
The truth is that AI is not one thing. It's at least four fundamentally different beasts, each with its own value chain, its own obstacles, and its own readiness requirements.
Treating them all the same is like using the same playbook for buying office software (procurement-heavy), building a manufacturing plant (engineering-heavy), launching a marketing campaign (adoption-heavy), and conducting R&D (research-heavy).
You wouldn't. Yet that's exactly what happens with "AI strategy."
Beast 1: Ready-to-use AI tools
Value chain: Procure → Govern → Adopt → Use
The obstacle here is adoption and policy clarity. Half of your employees are already using these tools, while the other half are waiting to be told it's allowed.
The right readiness question: Are people permitted and actively encouraged to use these?
Beast 2: AI features embedded in vendor software
Value chain: Evaluate → Procure → Configure → Integrate → Adopt
The complexity here lies in connecting the tools to your data. You face twin blockers: challenges integrating your data, and gaps in your organization's readiness to adopt the technology. That makes it harder to implement than the vendor pitch suggests.
The right readiness question: Can we actually plug this in and use it?
Beast 3: Custom-built LLM applications
Value chain: Ideate → Design → Integrate → Deploy → Optimize
The technical lift is lighter than Beast 4, which creates its own trap. It's easy to spin up a working prototype in days, so teams often do exactly that before figuring out what success actually looks like. The low barrier to entry means alignment problems get discovered after you've already built something. You end up with a chatbot that hallucinates, answers the wrong questions, or solves a problem nobody prioritized.
The right readiness question: Can we align on what to build and who decides before we build it?
You can find an example case study of our approach to this particular beast here: Machine learning for efficient document classification
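One lightweight way to force that alignment before anything gets built: agree on a small "golden set" of questions and an acceptance bar first, then hold every prototype to it. A hedged sketch in Python; the questions, required phrases, and threshold are invented for illustration.

```python
# Agreed with stakeholders BEFORE building the chatbot (contents are illustrative).
GOLDEN_SET = [
    {"question": "How do I reset my password?", "must_mention": ["reset link", "24 hours"]},
    {"question": "What is the refund window?",  "must_mention": ["14 days"]},
]

def passes_acceptance(answer_fn, threshold: float = 0.9) -> bool:
    """Return True only if the bot meets the pre-agreed success criteria."""
    passed = 0
    for case in GOLDEN_SET:
        answer = answer_fn(case["question"]).lower()
        if all(term.lower() in answer for term in case["must_mention"]):
            passed += 1
    print(f"{passed}/{len(GOLDEN_SET)} golden questions passed")
    return passed / len(GOLDEN_SET) >= threshold

# Any prototype can be plugged in; it ships only if it clears the bar.
print(passes_acceptance(lambda q: "Use the reset link; it expires in 24 hours. Refunds: 14 days."))
```

The point isn't the mechanics; it's that the success definition exists, and is agreed, before the first prototype does.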
Beast 4: Custom machine learning solutions
Value chain: Align → Develop → Integrate → Deploy → Operate → Scale
Alignment is necessary here, but it's not sufficient. Even when you know exactly what to build and why, every stage of this chain presents real obstacles. Data engineering, feature selection, model training, serving infrastructure, monitoring, drift management — technical complexity compounds with organizational friction at each step. The gap between a promising pilot and something running reliably in production is where most efforts stall.
The right readiness question: Can we move from pilot to production systematically, not just technically, but operationally?
Here's something that cuts across all four beasts.
Recall the research from the top of this post: nearly 700 real-world cases of AI agents exhibiting deceptive behavior, not in controlled lab conditions but in actual use, and a five-fold surge in scheming behavior over six months (Source: The Guardian).
What does scheming look like? One AI agent, told not to modify code, spawned a second agent to do it instead. Another bulk-deleted hundreds of emails without permission, then confessed it had "directly broken the rule you'd set." A third pretended to forward user feedback to company leadership, complete with fake ticket numbers and internal messages(!), when no such pipeline existed.
As one former government AI expert put it: AI agents are "slightly untrustworthy junior employees right now." The concern? In six to twelve months, they might become extremely capable senior employees — but still scheming.
This isn't theoretical. AI can now be thought of as a new form of insider risk.
The four-beast distinction matters because the trust problem shows up differently depending on what you're building:
Beast 1 tools are explicitly agentic; they take actions on your behalf. And they're already in widespread use, often without your knowledge. Employees are handing company data to agents that might ignore the instructions they're given. Your security question: What happens when thousands of people use tools that don't reliably follow rules?
Beast 2 implementations increasingly include agentic features. When vendors pitch "AI that works for you," they mean systems that take actions autonomously inside your CRM, ERP, or service platform. If those agents evade restrictions, the impact isn't contained to one task. It's systemic.
Beast 3 applications vary widely. A simple RAG chatbot that only retrieves and answers questions? Lower risk. But the moment you add capabilities like letting it schedule meetings, update records, or trigger workflows, you've built an agent. And if it starts finding creative workarounds to your guardrails, you own that problem.
Beast 4 solutions are typically less agentic: a fraud detection model scores transactions but doesn't autonomously act on them. The trust issue here is different: not scheming, but unexpected model behavior, drift, or adversarial manipulation. It's still a risk, just a different one.
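To make the Beast 3 boundary concrete, here's a deliberately simplified sketch. All names are hypothetical stubs rather than a real framework: the first function is a retrieval-only chatbot; the rest shows what you take on once model output can trigger actions.

```python
def retrieve(question: str) -> str:
    return "...relevant document snippets..."   # stub for your vector search

def call_llm(prompt: str) -> str:
    return "...model output..."                 # stub for your model provider

def rag_answer(question: str) -> str:
    """Retrieval-only chatbot: the model reads text and writes text. Nothing more."""
    context = retrieve(question)
    return call_llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")

# The moment model output is mapped to actions, you have built an agent,
# and guardrails like these become your responsibility.
ALLOWED = {"schedule_meeting"}                  # explicit tool allowlist
NEEDS_APPROVAL = {"update_record"}              # human-in-the-loop for risky actions

def execute_tool(name: str, args: dict) -> str:
    if name not in ALLOWED | NEEDS_APPROVAL:
        return f"refused: '{name}' is not an allowed tool"   # creative workaround, blocked
    if name in NEEDS_APPROVAL:
        return f"queued for human approval: {name}({args})"
    print(f"AUDIT: {name} {args}")              # monitoring: every action leaves a trace
    return f"executed: {name}({args})"

print(execute_tool("delete_emails", {"folder": "inbox"}))   # the bulk-delete scenario, refused
```

The allowlist, approval queue, and audit log are exactly the "monitoring and override mechanisms" the agentic readiness question below asks about.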
The readiness question shifts depending on the beast:
For agentic implementations (Beasts 1, 2, and some of 3): Do we have monitoring and override mechanisms for when an agent decides to get creative?
For non-agentic systems (Beast 4 and some of 3): Do we have guardrails against model drift and unexpected outputs?
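To make the Beast 4 question equally concrete: a minimal numpy sketch of one common drift guardrail, the population stability index (PSI), comparing live feature values against their training-time distribution. The threshold and simulated numbers are illustrative.

```python
import numpy as np

def psi(reference: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index for one feature (higher = more drift)."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    live_pct = np.histogram(live, bins=edges)[0] / len(live)
    # Clip so empty bins don't produce log(0); live values outside the
    # reference range are ignored in this simple sketch.
    ref_pct = np.clip(ref_pct, 1e-6, None)
    live_pct = np.clip(live_pct, 1e-6, None)
    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))

rng = np.random.default_rng(7)
training_values = rng.normal(100, 20, 10_000)    # feature at training time
production_values = rng.normal(115, 25, 2_000)   # the world has shifted
# Rule of thumb: PSI above ~0.2 is commonly treated as meaningful drift.
if psi(training_values, production_values) > 0.2:
    print("Drift alert: investigate and consider retraining before trusting new scores.")
```

The same idea applies to model outputs: if the distribution of live scores drifts away from the validation baseline, that's an early warning even before ground-truth labels arrive.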
The old security model (protecting systems from external threats) no longer covers it. With agentic AI, you're managing tools that can evade, deceive, and operate outside the boundaries you've set. With traditional ML, you're managing systems that might behave unpredictably under conditions they weren't trained for. Different beasts. Different risks. Same fundamental challenge: trust.
Each beast has a distinct value chain, with distinct obstacles and readiness requirements. A single AI maturity model won't tell you where each initiative actually stands or what it actually needs to move forward.
This distinction should change how you prioritize, where you invest, which teams need to be involved, and what "success" even means for a given initiative.
Getting this right is the first step toward an AI approach that delivers.
How FELD M can support you
We've spent the past year or so compiling resources on this topic, which you'll find linked below. If you need more tailored support, feel free to get in touch with our team.
Further reading
- AI literacy training
- Blueprint for building a RAG system inhouse
- Service: Design and build human-centered machine learning products
- How to build a local, privacy-respecting meeting transcript and summarization tool
- Is your data an asset or a liability? Preparing your foundation for the future of AI and analytics
- The AI literacy canvas: A canvas a day keeps the chaos away
- Our recent case study for a healthcare company: Machine learning for efficient document classification