The Context Bottleneck
AI isn’t limited by prompts. It’s limited by context sharing.
What follows started as a brainstorming prompt to an LLM. Seemed fitting to share it here.
For most of the twentieth century, American auto manufacturers managed their suppliers the same way — write exhaustive specifications and put the contract out for bid. The assumption was simple: the more precise your instructions, the better the output.
Toyota did the opposite. Their engineers handed suppliers specifications using words like gotsu gotsu (a low-frequency, high-impact motion felt in the lower back) or buru buru (a high-frequency, low-impact vibration felt in the belly) — a specialized but deliberately imprecise vocabulary that described how a component should feel rather than exactly how it should measure. This could have been a disaster. But Toyota developed the fastest and most efficient vehicle development cycle in the industry, using fewer engineers than its competitors.
The difference wasn’t that Toyota had better suppliers — they often used the very same ones the Big Three were fighting with. It was that Toyota shared deep context and motivations instead of just instructions. Supplier engineers lived full-time in Toyota’s design offices on two- to three-year rotations, absorbing not just what Toyota wanted built, but why — the design philosophy, the customer intent, the trade-offs that mattered. When a Toyota engineer used a term like gotsu gotsu, a supplier who’d spent two years immersed in that context knew exactly what it meant. A supplier working from a spec sheet at arm’s length could never match that. This is a key part of Keiretsu.
Today, practitioners agree that better context is key to producing the best results with AI. But there’s a dimension of the context problem I haven’t seen widely discussed: not just how to give AI context, but how to share the right information across the boundaries where it gets stuck today — between departments, organizations, and the vendors they work with. We need a modern Keiretsu — one that works at AI scale.
Let me walk through how I got here.
A list of questions that looks fine but isn’t
I’ve been building an AI interview framework, and a potential customer recently sent me a list of questions they wanted my AI interviewer to use for a demo:
Overall usefulness — In your experience, how useful was Product X during your workday?
Relevance — Did the insights or alerts generated by the system feel relevant and actionable? Can you provide specific examples?
Impact on decision-making — Did the platform influence any operational decisions you made (e.g., X, Y, Z)? If so, how?
Workflow integration — How well did Product X fit into your existing workflow? Did it feel like a natural augmentation or an additional task?
Signal vs. noise — Did you feel the system helped you identify important events as they were happening (e.g., X, Y, Z)? If not, would you prefer some form of alert or notification?
Awareness of key issues — Did the platform improve your awareness of X, Y, or Z issues? If yes, in what ways?
User interface and usability — How intuitive was the interface? Were there any parts of the interface that were confusing or slowed you down?
Trust in the system — How much did you trust the insights or recommendations generated by the system? What would increase your trust?
Highest value features — Which specific features or outputs from Product X felt most valuable to you?
Missing capabilities — What important information, functionality, or workflow support did you feel was missing from the system?
These look like perfectly reasonable research questions. But here’s the problem: if you gave this same list to five different AI interview tools in a bake-off, there would be almost no difference in the quality of what each one produces. The differences would mostly be at the edges. Some AI interviewers will be more conversational (providers using OpenAI’s realtime API will excel here; providers will differ based on voice activity detection (VAD) strategies or pipeline latency). Some will transcribe better (vendors who give STT providers custom domain-specific vocabulary will do well; some might even use multiple STT providers in creative ways).
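To make one of those edge differences concrete, here’s a minimal sketch of domain-vocabulary biasing, assuming Google’s Cloud Speech-to-Text Python client; the phrase list is illustrative, not tuned:

```python
from google.cloud import speech  # pip install google-cloud-speech

# Bias a stock STT model toward domain-specific terms it would
# otherwise mistranscribe. The phrases below are placeholders.
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    speech_contexts=[
        speech.SpeechContext(
            phrases=["Product X", "keiretsu", "gotsu gotsu"],
        )
    ],
)
# This config would then be passed to SpeechClient().recognize()
# or to a streaming recognition request.
```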
But the quality of the questions themselves? That’s where the field really flattens out. There’s no background knowledge of the company, its product, what the founders care about, or prior interviews to draw on. Without that, you can’t ask meaningful, hard-hitting questions. You just get polite, generic follow-ups to polite, generic answers.
This reminded me of a quote from Amp in the context of a product pivot: “With GPT-5.3-Codex, the agent is no longer the bottleneck. Our ability to tell it what to do is.”
What kind of context actually matters?
At first I thought the answer could be context on how to conduct an interview — scan a hundred books on the art of interviewing, distill that knowledge into great prompts, maybe even fine-tune a model. Make sure the AI doesn’t ask leading questions. Make it good at connecting surface-level responses to deeper needs. This is genuinely useful, but it wasn’t the big unlock I’d hoped for.
It’s only part of the picture. While at Google and YouTube, I spent over 100 days speaking with users or listening to moderators do so. Hot take: the most useful interviews are not those conducted by a rigorously trained professional researcher or moderator. I want to be careful here — professional UX researchers are incredibly good at what they do, and there are domains where that rigor is exactly what you need: usability studies, academic research, longitudinal studies, and more. But when the goal is to inform a product or business decision, a purely rigorous academic approach tends to produce balanced, “correct” reports — thorough and polished, but not always the sharpest tool for someone who needs to make a bold bet.
Product, marketing, and business visionaries don’t make their best decisions when everything is balanced. The decisions that actually move businesses forward are opinionated, pointed — and right. In my experience, the most interesting and valuable research comes from someone who has both interviewing skill and — more importantly — deep, intimate knowledge of and opinions about a problem space or product.
And that’s what’s missing from creating an AI interviewer from just this list of questions. To build one that conducts truly impactful interviews, we need not only the interviewing skills, but also deep, intimate knowledge and context about the problem space and the company.
The deck gap
I could ask the prospective customer for more context — how they think about the business, where the product is headed, what hypotheses they’re operating under — and translate that into prompts for the AI interviewer. But in practice, what I’d likely get from the customer is a distilled version of that thinking (a PPT), shaped for sales calls, roadmap presentations, or investor conversations. This flow of information isn’t very efficient. A deck is often meant to accompany a voiceover. Even when it’s not, it’s a format built for humans to tell a visual story. Can a multimodal LLM consume it? Sure. But it’s far from the most efficient way. What works best is a well-constructed body of text that LLMs can access and use to generate more impactful prompts.
When we step back and think about where the world is headed, it becomes increasingly likely that the deck itself was created from a similar well-organized body of text, given to the founder’s AI tool of choice. So why not just share the prompts used to create the deck rather than the deck itself? It’s more useful, avoids a very lossy translation, and saves everyone time going back and forth with LLMs.
There’s an even deeper issue with sharing a deck or any polished deliverable: it’s a snapshot of the conclusion, not the journey. A friend in marketing put it well: “How often do you send a brief to an agency, only for them to come back with an idea you’ve already discussed and killed?” This happens constantly. Given similar constraints, smart people (and smart AI tools) will often converge on the same answer. But sometimes that answer was already tried and abandoned for reasons that never made it into the final artifact. The discussions about why the obvious path didn’t work are often the most enlightening, and they’re precisely what gets lost. Ask someone who’s been on a project from day one to describe it and you’ll get a far richer picture than from someone who joined last week. I’d call this upstream context — and for vendors especially, it’s transformative. A sales prospect you’re pitching cares about what your product offers today and where it’s going tomorrow. But a vendor you work with benefits immensely from understanding how you got here.
Getting from here to a world where context flows freely to the AI tools that need it requires solving three problems — each interesting in its own right, but building toward the one I think represents the biggest opportunity.
Problem 1: Well-formatted context
The first problem is format. Engineering organizations had a head start here because so much of their source material was already LLM-friendly. Code is text. Config files are text. Documentation is markdown.
Other parts of an organization aren’t so lucky. Marketing lives in slide decks and brand guides full of images and layout. Sales lives in CRM notes, call recordings, and PDFs of varying quality. Strategy lives in spreadsheets and board decks. The information exists, but it’s trapped in formats that weren’t designed for LLM consumption.
It’s fascinating to watch how AI tools are handling this gap today. Most AI services seem to process PDFs by parsing out the text and sending it alongside a rendered image to a multimodal model. PowerPoint files get even more interesting treatment: some tools work directly with the underlying XML structure, which means they can modify slides programmatically but also means the “understanding” of a deck is really an understanding of its markup rather than its narrative. And in both cases, the approach is token-inefficient — what could be clean markdown ends up as a blob of junk text plus an image, polluting the context window unless someone does careful preprocessing with a subagent.
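For a sense of what that careful preprocessing can look like, here’s a minimal sketch that flattens a deck into clean markdown, assuming the python-pptx library; it keeps only the text, which is exactly the trade-off this approach makes:

```python
from pptx import Presentation  # pip install python-pptx

def deck_to_markdown(path: str) -> str:
    """Flatten a .pptx into token-efficient markdown: one heading per
    slide, one paragraph per text shape. Layout, images, and speaker
    narrative are dropped."""
    prs = Presentation(path)
    lines = []
    for i, slide in enumerate(prs.slides, start=1):
        lines.append(f"## Slide {i}")
        for shape in slide.shapes:
            if shape.has_text_frame and shape.text_frame.text.strip():
                lines.append(shape.text_frame.text.strip())
    return "\n\n".join(lines)
```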
Frontier labs seem to do this best today — better document parsing, richer multimodal understanding, longer context windows. But these are all solutions that try to make LLMs better at consuming messy formats. The more interesting question might be: what if we made the formats less messy in the first place? What if, alongside your deck, you maintained a structured context document — a machine-readable source of truth about your business that any AI tool could consume efficiently?
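What might that source of truth look like? A minimal sketch; every field name here is an assumption, not an established standard:

```python
from dataclasses import dataclass, field

@dataclass
class ContextDocument:
    """Hypothetical machine-readable source of truth for a business.
    The deck gets generated from this; AI tools consume it directly."""
    company: str
    product: str                  # what it is and who it's for
    strategy: str                 # where it's headed and why
    users: list[str] = field(default_factory=list)           # key segments
    open_questions: list[str] = field(default_factory=list)  # live internal debates
    killed_ideas: list[str] = field(default_factory=list)    # paths tried and abandoned, with reasons
```

Note the last field: it captures the upstream context from earlier, the discussions that never make it into a polished deliverable.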
Getting context into the right format is only the first step. Once it’s machine-readable, it still needs to be organized and sharable.
Problem 2: Well-organized context
We might be giving our prospective customer too much credit for having a “well-organized set of text” that describes their business. More likely, the deck emerged from a messy back-and-forth conversation as their AI helped them iterate toward something beautiful. Maybe they’d talked to Claude so much that their prompts were enriched by an abundance of MEMORY.md files floating around on their machine or in the cloud. But this is a far cry from a canonical, structured source of truth.
We’re already seeing early versions of this in engineering. Look at how Stripe describes the design of their “context gathering: rule files” in their minions blog post: they build per-directory, well-structured text to give LLMs the right context for coding decisions. They don’t rely on individual developers to type out large prompts. They’ve institutionalized it, canonicalized it, and made it accessible to the tools and frameworks that call the LLMs their developers use. They standardized on the rule format and sync those rules into a format Cursor or Claude Code can read as well — so their three most popular coding agents can all benefit from the same guidance.
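Stripe’s exact format isn’t public beyond that post, but the mechanics generalize. A hedged sketch, assuming hypothetical per-directory RULES.md files as the canonical source, fanned out to the locations Cursor and Claude Code conventionally read:

```python
from pathlib import Path

def sync_rules(repo: Path) -> None:
    """Fan canonical per-directory rule files out to the formats
    different coding agents read. RULES.md is a made-up stand-in for
    whatever rule format an organization standardizes on."""
    for rules in repo.rglob("RULES.md"):
        text = rules.read_text()
        # Claude Code reads a CLAUDE.md next to the code it governs.
        (rules.parent / "CLAUDE.md").write_text(text)
        # Cursor reads project rules from .cursor/rules/.
        cursor_dir = repo / ".cursor" / "rules"
        cursor_dir.mkdir(parents=True, exist_ok=True)
        slug = "-".join(rules.parent.relative_to(repo).parts) or "root"
        (cursor_dir / f"{slug}.mdc").write_text(text)

if __name__ == "__main__":
    sync_rules(Path("."))
```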
So this is already happening on the bleeding edge — within forward-thinking organizations, for engineering use cases — but primarily within a single organization. Well-formatted, well-organized context is a necessary foundation — but when it lives entirely inside one organization, it only solves half the problem.
Problem 3: Context sharing across boundaries
This is where I think the big opportunity lives.
The boundaries that matter aren’t just between companies — they’re everywhere information gets siloed. Within a single organization, departments often operate with limited visibility into each other’s context, and frequently for good reason. You might not want the compliance team sharing everything they work on with the entire company. But you probably do want them to share enough context that everyone can make sure they’re following the rules. Sales has context about what specific customers say they need. Product has context about what’s being built and why. Marketing has context about how the product is positioned. Each of these would make the others’ AI tools dramatically more useful — but today that context rarely flows between them in a structured, organized, machine-readable way.
The same problem exists across organizational boundaries, and gets even harder. Even if an organization builds beautifully structured, LLM-ready context about their business — their strategy, their product, their market position, their users — they will be hesitant to share the full version with external partners and vendors. The concern isn’t unfounded: that context is often tangled up with personally identifiable information, proprietary competitive intelligence, or internal strategy. But a vendor almost never needs the PII or the raw proprietary data to do a good job. They need the insights, the strategic direction, the synthesized understanding. The challenge is that today there’s no good mechanism to separate the signal from the sensitive and share just the parts that matter.
Think about my original scenario. If the customer shared rich, structured context about their product, their users, and the key debates happening at their company, my AI interviewer could produce dramatically better results (just as a human moderator with this additional context would). Multiply this across every vendor relationship an organization has — their design agency, their market research firm, their consulting partners, their SaaS tools — and the compounding value of solving this problem becomes clear.
The design space is wide open, and the hypotheticals are fun to think about:
Scoped context views — curated slices of your internal context that expose only what’s relevant to a particular vendor relationship, with everything else redacted or omitted. Think of it like database views, but for organizational knowledge, and likely created by agentic systems (see the sketch after this list).
Sandboxed external agents — allow a vendor’s AI agent to operate within your enterprise environment, gather what it needs, and surface the outputs for human approval before anything crosses the boundary.
Context escrow — a neutral intermediary that holds the full picture but only passes through the specific pieces a vendor’s system requests, with audit trails and access controls.
NDA-gated full access — give vendors access to everything and trust the legal framework. This has the advantage of working today, but doesn’t scale and doesn’t address the legitimate instinct to limit exposure.
A context exchange protocol — something like an API contract, but for business knowledge. Organizations publish structured metadata about what context they can share, and vendor tools negotiate access as it’s needed.
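To ground the first of these, here’s a minimal sketch of a scoped context view as a pure filter over a structured context document; the scope names and sections are invented:

```python
# Hypothetical scope definitions: which sections of the context
# document a given vendor relationship is allowed to see.
SCOPES: dict[str, set[str]] = {
    "interview-vendor": {"product", "users", "open_questions"},
    "design-agency": {"product", "positioning"},
}

def scoped_view(context_doc: dict[str, str], scope: str) -> dict[str, str]:
    """Return only the sections the scope allows; everything else is omitted."""
    allowed = SCOPES.get(scope, set())
    return {k: v for k, v in context_doc.items() if k in allowed}

doc = {
    "product": "AI interview framework for customer research.",
    "users": "Ops leads at mid-market logistics companies.",
    "open_questions": "Do alerts belong in email or in the product?",
    "strategy": "Internal only.",
    "pii_notes": "Internal only.",
}
# The interview vendor never sees "strategy" or "pii_notes".
print(scoped_view(doc, "interview-vendor"))
```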
None of these ideas are fully baked, and that’s the point. But some of the building blocks are starting to emerge. Protocols like MCP are creating standards for how AI tools connect to external systems. Features like Claude’s Skills are formalizing how reusable context gets packaged and shared with models. The plumbing for cross-boundary context exchange is being laid — but nobody has assembled it into a complete solution for the problem described here.
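As a taste of how those building blocks could compose, here’s a hedged sketch that serves scoped views over MCP, using the Python SDK’s FastMCP server; the context:// URI scheme and scope names are my own invention:

```python
from mcp.server.fastmcp import FastMCP  # pip install "mcp[cli]"

mcp = FastMCP("context-exchange")

# Inlined version of the scoped-view idea from the previous sketch.
CONTEXT = {
    "product": "AI interview framework for customer research.",
    "open_questions": "Do alerts belong in email or in the product?",
    "strategy": "Internal only.",
}
SCOPES = {"interview-vendor": {"product", "open_questions"}}

@mcp.resource("context://{scope}")
def get_context(scope: str) -> str:
    """Serve only the slice of the context document this scope allows."""
    allowed = SCOPES.get(scope, set())
    return "\n".join(f"{k}: {v}" for k, v in CONTEXT.items() if k in allowed)

if __name__ == "__main__":
    mcp.run()
```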
The problem is real and growing — as AI tools become more capable, the gap between what they could do with the right context and what they actually do with the context they’re given will only widen. The organizations that figure out how to make context flow safely and efficiently across boundaries will unlock enormous value.
Wrapping up
I recently read Alap Shah’s piece on the “Global Intelligence Crisis”, and while I don’t share his dystopian (or is it optimistic?) view of where we’re headed, one quote in particular caught my attention: “AI agents, however, share nearly perfect, continuous context.”
I don’t believe this is true today. Context management is one of the most important — and most underappreciated — skills in building AI systems. Even within a single session, managing an ever-growing context window is difficult, let alone across sessions or across organizations. But getting this right unlocks huge value.
We can format context better (Problem 1). We can organize it better (Problem 2). But the real unlock — the one that changes the game — is making the right context available across boundaries (Problem 3) — whether that’s between departments, between companies, or between the people who hold the knowledge and the AI tools that need it. Solve that, and you don’t just improve one AI tool. You make every AI tool in the ecosystem better at understanding the businesses they serve.
