Token Efficiency & Air-Gapped MLOps #36 (Stockholm, Sweden | June 3, 2026)
Insights from the Stockholm MLOps community
"Being an adult is knowing you only have bad choices." — Mikael Vesavuori, evroc
That line got a laugh from the audience. But it also captured the reality of the evening. AI in production is not limited by models. It is limited by how systems are built, controlled, and operated under real-world constraints. At Stockholm MLOps #36, engineers, operators, and infrastructure leaders came together to discuss what actually happens when AI leaves the sandbox and enters enterprise environments.
SUMMARY — KEY INSIGHTS FROM THE EVENT
The industry is shifting from model-centric AI to infrastructure-centric AI
Retrieval quality increasingly matters more than model selection
Control and sovereignty are becoming first-class design requirements
Data gravity is returning as data movement is questioned across industries
Air-gapped AI is rapidly expanding far beyond defense applications
AI serving is fundamentally becoming a workload routing problem
WHAT THIS EVENT WAS REALLY ABOUT
Most tech conversations focus on model capabilities and benchmarks. Enterprise production is different: it operates within legacy environments, strict regulations, and tight security perimeters.
The implication: AI in production is not just a data science problem. It is:
a systems problem
an infrastructure problem
a control problem
and a governance problem
KEY INSIGHTS FROM THE EVENT
1. The Industry Is Moving to Infrastructure-Centric AI
"Organizations are no longer asking: Which model should we use? They are asking: Where should it run?" — Mikael Vesavuori "Who controls it? How is it governed? What happens when it fails?" — Mikael Vesavuori
The strongest signal from the evening was clear. The center of gravity has moved down the stack and into the infrastructure that surrounds the model.
2. Retrieval Is Becoming Core Infrastructure
"Retrieval quality is what determines whether agents succeed." — Ewa Szyszka, Qdrant "Agents waste tokens in four ways: context accumulation, excessive loading, fluff generation, and failure recovery." — Ewa Szyszka
Organizations improve quality, reduce cost, and increase reliability faster by optimizing how information reaches the model than by changing the model itself.
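One way to picture the "excessive loading" and "context accumulation" problems is a retrieval step that packs only the highest-scoring chunks into a fixed token budget. The sketch below is illustrative only: `Chunk`, `pack_context`, the relevance scores, and the word-count token estimate are all assumptions made for this example. None of it was shown at the event, and it is not part of any Qdrant API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    score: float  # hypothetical relevance score from a retriever, higher is better


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def pack_context(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Greedily keep the highest-scoring chunks that fit in the token budget."""
    selected: list[Chunk] = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(chunk.text)
        if used + cost <= budget:
            selected.append(chunk)
            used += cost
    return selected


# Made-up candidates: two short relevant chunks and one long, weakly relevant one.
candidates = [
    Chunk("refund policy applies within thirty days", 0.91),
    Chunk("company history and founding story " * 5, 0.20),
    Chunk("refunds require the original receipt", 0.85),
]
packed = pack_context(candidates, budget=12)
print([c.score for c in packed])
```

With a budget of 12 estimated tokens, the long low-scoring chunk is left out and only the two relevant chunks reach the model. A real system would swap in the model's own tokenizer and the retriever's actual scores.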
3. Control Is Becoming a First-Class Design Requirement
"Sovereign is here to stay." — Mikael Vesavuori "People were told they were stupid for running their own hardware. The industry tricked you." — Mikael Vesavuori
Control means different things for different workloads. Driven by sensitive data and shifting regulations, teams are prioritizing operational ownership over public cloud lock-in.
4. AI Serving Is Becoming a Routing Problem
"Application to Model is becoming Application to Routing Layer to Location to Model." — Mikael Vesavuori "Hybrid is going to mean something else in the future." — Mikael Vesavuori
The emerging question is where a workload should execute. Organizations will increasingly route queries based on latency, governance, trust, cost, and ownership.
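A routing layer of this kind can be sketched as a function that filters execution locations by governance and latency constraints and then picks the cheapest one that remains. Everything here is a hypothetical illustration: the `Location` and `Workload` types, the `route` function, the location names, and the latency and cost figures are placeholders invented for the example, not anything presented by the speakers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    name: str
    sovereign: bool            # data stays under the organization's own control
    latency_ms: int            # assumed typical round-trip latency (placeholder)
    cost_per_1k_tokens: float  # assumed relative cost (placeholder)


@dataclass(frozen=True)
class Workload:
    sensitive: bool            # touches regulated or proprietary data
    max_latency_ms: int


def route(workload: Workload, locations: list[Location]) -> Location:
    """Pick the cheapest location that satisfies governance and latency limits."""
    eligible = [
        loc for loc in locations
        if loc.latency_ms <= workload.max_latency_ms
        and (loc.sovereign or not workload.sensitive)
    ]
    if not eligible:
        raise LookupError("no location satisfies the workload's constraints")
    return min(eligible, key=lambda loc: loc.cost_per_1k_tokens)


# Placeholder fleet: the numbers are illustrative, not measurements.
LOCATIONS = [
    Location("public-cloud", sovereign=False, latency_ms=120, cost_per_1k_tokens=0.5),
    Location("on-prem", sovereign=True, latency_ms=40, cost_per_1k_tokens=2.0),
    Location("air-gapped", sovereign=True, latency_ms=15, cost_per_1k_tokens=4.0),
]

print(route(Workload(sensitive=False, max_latency_ms=200), LOCATIONS).name)
print(route(Workload(sensitive=True, max_latency_ms=200), LOCATIONS).name)
print(route(Workload(sensitive=True, max_latency_ms=20), LOCATIONS).name)
```

Treating governance as a hard filter and cost as the tiebreaker is one design choice among many. Trust and ownership could be modeled as additional filters or as weighted scores, depending on how strict the organization's rules are.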
5. Data Gravity Is Returning
"You must innovate with GenAI to survive, but you cannot move your proprietary data to the public cloud." — Markus Kjellner, Cloudera "You can either bring data to the model or bring the model to the data." — Markus Kjellner
For years the industry assumed data should move to compute. Now latency, governance, and trust are pushing teams toward keeping data where it lives and moving the compute closer to it.
6. Foundation Models Are Escaping Language
"We want to build a foundation model for the electromagnetic spectrum." — Peter Sundström, SAAB "The destination of this journey is not classification. It is anticipation." — Peter Sundström
Language is simply the first domain where foundation models succeeded. The same architectural patterns are appearing in sensor systems, electronic warfare, and environments that look nothing like human text.
7. Real-World Data Demands Synthetic Solutions
"Real-world data is a total mess." — Peter Sundström "Synthetic data is very important." — Peter Sundström
In operational environments, real data is often rare, messy, or classified. Synthetic data is graduating from a nice-to-have tool to the primary mechanism that makes advanced learning possible.
8. Data Formats Reveal Capabilities
"The data format itself reveals the capabilities of our sensors." — Peter Sundström
In high-security environments, even the structure of the data carries sensitive information. Infrastructure therefore has to follow security rules that protect both content and context.
PATTERNS ACROSS TALKS
Across all speakers, a few patterns stood out:
The model itself is no longer the bottleneck
Trust and sovereignty dictate architectural choices
Retrieval quality dictates actual system performance
Workload placement is becoming an operational discipline
Data gravity is stopping indiscriminate cloud migration
WHAT THIS MEANS FOR AI IN PRODUCTION (SWEDEN)
AI in production is not limited by intelligence. It is limited by:
infrastructure
governance
workload placement
operational control
Sweden is addressing these challenges under distinct pressures:
strict data sovereignty
evolving European regulation
the need for industrial infrastructure ownership
This is forcing a shift: Less focus on model benchmarks. More focus on systems that run reliably under real-world constraints.
What This Means For AI In Production
Eighteen months ago, many AI discussions focused almost entirely on model capability.
At Stockholm MLOps #36, the conversation focused almost entirely on infrastructure.
Retrieval systems, governance controls, routing layers, synthetic data, sovereign environments, deployment architectures and operational resilience dominated the evening.
That shift matters because it signals a maturing industry.
When technology is new, conversations revolve around capability.
As technology matures, conversations shift toward operations.
The strongest signal from Stockholm MLOps #36 was not that models are becoming less important.
It was that infrastructure is becoming more important.
The future of AI will not be defined solely by who builds the best model.
It will increasingly be defined by who can operate AI systems reliably, securely and efficiently under real-world constraints.
Related Insights in this Theme
AI Infrastructure, Orchestration & On-Prem AI | Stockholm MLOps #28
How infrastructure ownership, open-source AI stacks and operational control are reshaping production AI environments.
Optimizing Inference in Production | Stockholm MLOps #29
Exploring inference economics, runtime optimization, orchestration and why production AI increasingly resembles distributed systems engineering.
Sovereign AI in Production | Stockholm MLOps #31
How trust, infrastructure control, governance and sovereignty are influencing AI deployment decisions across regulated industries.
Join the Community
👉 Full event details: https://www.meetup.com/stockholm-mlops-community/
Event Details
Location: AI Sweden, Stockholm
Date: June 3, 2026
Speakers: Qdrant, evroc, SAAB, Cloudera
Topics: Token Efficiency, Retrieval Architecture, Air-Gapped AI, Sovereign AI, AI Infrastructure, Synthetic Data, Governance, AI Operations, AI Serving, MLOps
Event: Stockholm MLOps #36