Three Weeks and a Crater: Some Notes on the Specific Loneliness of Giving Up on a Tool You Wanted to Love

The moment I'm thinking about happened on a Tuesday, in the middle of what passes for a team meeting when your team is scattered across three time zones and nobody's camera is on. Someone on my team, a person I'll call Marta because that's not her name,¹ typed into the chat: "Quick question: are we allowed to paste customer data into AI for a summary?" And I watched the chat go completely silent for eleven seconds, which in a Zoom channel during a live meeting is the equivalent of a full minute of eye contact with a stranger on public transit. Then somebody responded with a thumbs-up emoji on the original message, which answered nothing. Then someone else wrote "good question." Then the meeting moved on to ticket metrics.

What haunts me about this, if "haunts" isn't too strong a word for a Tuesday afternoon in a home office with one dog snoring under the desk and the other dog pawing at my leg, is not the question itself. The question was reasonable. It was, in fact, the most reasonable question anyone on my team had asked in weeks. What haunts me is the silence that followed it. Because Marta is one of my best people. She is methodical, and she cares about accuracy the way some people care about being early to the airport (which is to say: a little more than is strictly necessary, but in a way you learn to appreciate when things go wrong). And her question wasn't really a question about data policy. It was a question about permission.² About whether the act of trying this thing, this tool everyone had been handed six months ago with a PowerPoint deck and a "go be productive," was safe. Not safe for the data. Safe for her.


This was late 2025, around the time Microsoft reportedly tracked hundreds of thousands of employees using its AI Copilot tools and discovered something the technology press described with barely concealed alarm: excitement peaked around week three, and then it cratered.³ Not a gentle decline. A crater. The kind of usage graph you could show to a geologist and they'd ask what hit it.

The shape of this graph is not new. It's basically the Gartner Hype Cycle, compressed from years into days, and playing out not across an industry but inside individual human beings. Peak of inflated expectations (week one: "This will change everything"). Trough of disillusionment (week three: "This keeps giving me confidently wrong answers and it took me longer to fix its output than to just do the work myself"). And then, for most people, not a slope of enlightenment but a quiet retreat to the old way of doing things, accompanied by a vague sense of personal failure they will not mention to anyone.⁴

Here's what I want to understand, and I mean genuinely understand, not in the way where someone sets up a question just to knock it down with a pre-loaded answer: Why do the most conscientious people quit first?

Because they do. This isn't just my anecdotal observation from managing a global support team, though it is that. The research keeps converging on this counterintuitive finding: the employees who care the most about doing good work are the ones most inclined to opt out of the tool meant to help them do it. The ones who double-check their emails before sending. The ones who actually read the knowledge base articles instead of guessing. The ones you'd hire again in a heartbeat. They try the AI, get a mediocre or incorrect result, and, because they have standards, they conclude the tool is the problem.

Which, okay, is not an unreasonable conclusion. But it also isn't the right one.⁵


Ethan Mollick, the Wharton professor who has probably thought more publicly and carefully about AI-and-work than anyone currently publishing, puts it this way: the best users of AI are good managers. Not good technologists. Not good prompt engineers (a job title that now feels dated, the way "webmaster" felt dated by 2005). Good managers. People who know how to delegate, how to check someone else's work, how to break a complex task into smaller pieces and assign each piece to the entity best suited to handle it.

This is a wild reframe if you think about it for more than a few seconds. We've been treating AI adoption as a technology problem. Give people access. Show them the interface. Teach them the syntax. But the skill it actually requires is one we already have a word for, and it's not "prompting." It's judgment.⁶

Mollick and his colleagues at Harvard Business School and BCG ran what's become one of the most cited studies in this space: they gave 758 management consultants access to GPT-4 and watched what happened.⁷ The consultants using AI finished tasks 25% faster and produced results rated 40% higher in quality. But (and this is the part that doesn't make it into the LinkedIn summaries) when the tasks fell outside what Mollick calls the "jagged frontier" of AI capability, the consultants using AI were 19 percentage points less likely to produce correct solutions than those working without it.

The jagged frontier. I love this term because it captures something no one else had quite named: the boundary of what AI can and can't do isn't a clean line. It's jagged. Serrated. The tool will nail a task you expected it to fail at and then botch something you thought was trivially easy. Writing a polished summary of a complex document? Often superb. Counting the number of words in a sentence? Frequently wrong.⁸ The frontier zigs and zags in ways that don't map to any human intuition about difficulty. And if you don't spend enough time with the tool to develop a feel for where the zags are, you'll either trust it in the wrong places or distrust it everywhere. Both are bad. Both lead to the same three-week crater.


The study also surfaced two work patterns among the people who survived the crater and actually got productive. Mollick called them Centaurs and Cyborgs.

Centaurs (half human, half horse, clear division between the two halves) maintain a strategic separation between what they do and what the AI does. The human frames the problem, makes the judgment calls, decides what "good" looks like. The AI generates options, drafts, summaries, data transformations. There's a handoff point, and both sides know which side of it they're on.

Cyborgs are different. They weave in and out. They start a paragraph and let the AI finish it. They use the AI's output as raw material and reshape it sentence by sentence. The line between human and machine contribution becomes, by design, impossible to trace. It's more improvisational, more fluid, and (if you'll permit one non-grandiose metaphor) it's the difference between cooking with a sous chef who handles prep and cooking with a partner where you're both at the stove trading the spatula back and forth.⁹

Both patterns work. Both require something the initial six-hour training session didn't teach: a mental model of when to lean on the tool and when to lean away from it. Neither pattern is about writing better prompts. They're about something harder. They're about knowing the shape of the work, the shape of the tool, and where those shapes overlap. Which is not a skill you can pick up from a slide deck. It's a skill you develop by doing the work, getting it wrong, adjusting, and doing it again. And here is where I start to worry about something bigger than quarterly AI adoption metrics.


There's a concept in learning theory called legitimate peripheral participation, developed by Jean Lave and Etienne Wenger in the early nineties. The core idea is elegant: newcomers learn by doing the unglamorous work at the edges of a community of practice. The junior lawyer doing document review. The first-year teacher grading homework. The new support agent handling password resets. This work looks menial from the outside, and it partly is, but it's also the mechanism by which human beings develop professional judgment. You learn what "good" looks like by handling a thousand instances of "not quite right." You build pattern recognition. You develop, through sheer repetition and exposure, the ability to sense when something is off before you can articulate why.¹⁰

Now consider what happens when AI takes over the peripheral work. The research tasks. The first-draft writing. The data gathering. The summarization. All the work that used to be the on-ramp for developing judgment in a domain.¹¹ We are, without quite meaning to, removing the scaffolding junior employees used to climb. And the people making the removal decisions are the senior people who already have the judgment, which means they can see the AI's mistakes when they appear, which means the tool works great for them, which means they assume it must be working great for everyone. This is a genuine structural problem, not a training gap.¹²

I think about this every time I build a new workflow for my team. Because I can see where the AI output needs correction. I've spent 25 years in support, and I know which customer email tone signals an escalation and which signals a bluff. I know when a summary is leaving out the one detail that matters. But I know this because I spent years (decades, actually, an almost embarrassing number of decades) doing the peripheral work myself. If I hand a new L1 agent a pre-summarized ticket and a suggested response, they'll handle it faster. Of course they will. But will they learn to read the original ticket? Will they develop the instinct for which detail matters? I don't know.¹³ And the honest answer is nobody knows yet, because we're running the experiment in real time on real people's careers.


There's a training gap in most organizations that maps almost perfectly onto the problem Marta's question revealed. Almost everyone gets the Intro to... (here's the tool, here's how to prompt it, go be productive) and some subset of technically inclined people gets the Advanced Techniques in... (fine-tuning, API integration, building custom agents). The middle is empty. The Practical Application of... The part where the question shifts from "How do I use this tool?" to "Where does this tool fit in my work, and how do I know when its output is trustworthy?"

This is not a sexy question. It does not make for good conference keynotes. But it is the question on which the entire adoption curve hinges, and we have, collectively, sort of skipped it.¹⁴

The Practical Application skill set looks, on paper, unremarkable. Can you break a complex task into pieces and decide which pieces to hand off? Can you look at an AI output and assess its reliability given the stakes involved? Can you take a first draft that's 70% right and systematically work it to 95%? Can you tell when you've crossed the jagged frontier into territory where the AI will confidently produce garbage? These are not technical skills. They are management skills. They are the skills of someone who knows how to delegate to a competent but unreliable team member who never gets tired and never asks for clarification and never tells you when they don't know the answer.¹⁵

Which, when I put it that way, sounds a lot less revolutionary and a lot more familiar. We've been managing unreliable-but-energetic contributors for as long as organizations have existed. The new part is that this particular contributor can process language at inhuman speed, has read most of the publicly available internet, and will hallucinate a citation to a paper that doesn't exist with the same confidence it uses to correctly summarize one that does.


The permission gap is real, and it's structural, and I keep coming back to it. Marta didn't ask "How do I use this tool?" She asked "Am I allowed to?" And the silence that followed wasn't confusion. It was the organization's answer: we haven't decided, so please don't make us decide, so please just figure it out on your own or, better yet, quietly stop asking.¹⁶

The most conscientious employees, the ones who read the compliance documents and follow the escalation procedures, interpret organizational silence as organizational caution. Which means they wait. And while they wait, the less careful people barrel ahead, sometimes brilliantly and sometimes recklessly, and the gap between the two groups widens, and the organization develops a kind of split personality where AI is simultaneously everywhere and nowhere, celebrated in the all-hands meeting and forbidden in the whispered Slack DM.

I don't have a tidy resolution for this. If I did, I'd be selling it, not writing about it. But I can tell you what I've been thinking about, which is this: the problem isn't the technology. The problem isn't even the training. The problem is that we've treated a judgment skill as a tool skill, and then been surprised when the people with the best judgment decided the tool wasn't worth the risk.

The two successful patterns from the BCG study, Centaurs and Cyborgs, shared one characteristic: the people using them had enough experience with the AI to know its shape. To feel the jagged frontier in their fingers, the way a carpenter feels the grain of the wood.¹⁷ They hadn't attended more training sessions. They'd just logged more hours, made more mistakes, and developed (through practice, not instruction) an internal model of what the tool could and couldn't do.

And for that to happen, people need time. And permission. And a tolerance for being wrong in the short term that most performance-review systems actively punish.

Which brings me back to Tuesday, and the eleven seconds of silence, and Marta's cursor blinking in the chat window. She was asking for something we hadn't given her. Not access (she had access), not training (she'd done the training), but the organizational equivalent of someone saying: yes, try it, it'll go badly sometimes, that's the point, the going-badly part is how you learn the shape of the thing.¹⁸

The dog is still snoring under the desk. The meeting moved on to ticket metrics twenty minutes ago. And somewhere in a usage dashboard, another seat just went dormant.


¹ Not for any particularly dramatic reason. I change names in these pieces as a general policy, partly out of respect and partly because the specific person matters less than the pattern, and I want you thinking about the pattern. ↩︎

² The distinction between a question about policy and a question about permission is one of those things that seems pedantic until you manage people for a living, at which point it becomes the most important distinction in the world. A policy question has an answer. A permission question has a culture. ↩︎

³ The specific numbers get cited differently depending on the source, and I want to be careful here: what Microsoft actually published versus what got telephone-gamed through the tech commentariat are not identical. The shape of the curve, though, is consistent across multiple reports. Excitement, then disillusionment, then quiet abandonment. The 80/20 rule, in the direction you don't want. ↩︎

⁴ The vague sense of personal failure is, I think, the part we're not talking about enough. If your organization gives you a tool and tells you it'll make you more productive, and then you can't make it work, the conclusion you draw is not "the training was insufficient." The conclusion you draw is "I'm insufficient." Which is both wrong and corrosive. ↩︎

⁵ Or rather: it's the right conclusion about the wrong thing. The tool is often the problem, in the sense that its outputs are unreliable in unpredictable ways. But the response to an unreliable tool isn't to stop using it. It's to develop a reliable method for checking its work. Which requires, yes, more effort than doing the work yourself, at least initially. The payoff comes later. Assuming you stick around long enough for "later" to arrive. ↩︎

⁶ I realize "judgment" is one of those words that means everything and nothing, a word you can nod along to in a meeting without anyone having to get specific. What I mean here is: the ability to distinguish between situations where AI output is trustworthy and situations where it isn't, and to make that distinction quickly, repeatedly, and without a flowchart. ↩︎

⁷ The full study, for those inclined: Dell'Acqua et al., "Navigating the Jagged Technological Frontier," Harvard Business School Working Paper, 2023. It's dense but readable, which is a combination academic papers rarely achieve. ↩︎

⁸ This specific example is real and baffling. You'd think counting words would be easier than writing a sonnet, and for a human it is. For a language model, it's the other way around. The reasons are architectural and have to do with tokenization, which is one of those details that's boring until you're relying on the tool and it's confidently telling you your 47-word paragraph contains 52 words. ↩︎

⁹ I recognize this metaphor might be doing more work than it can structurally support, but I'm going to leave it because the image of two people trading a spatula captures something about the Cyborg pattern that the word "integration" simply doesn't. ↩︎

¹⁰ This is related to what the philosopher Michael Polanyi called "tacit knowledge," the knowledge you can use but can't fully articulate. When a senior support agent says "something feels off about this ticket," they're drawing on tacit knowledge built through thousands of prior tickets. No one taught them the rule. The rule assembled itself. ↩︎

¹¹ There is, I should note, a counterargument: maybe AI creates new forms of peripheral participation, new kinds of apprentice work we haven't invented yet. Maybe the new on-ramp is learning to evaluate AI outputs, the way a previous generation's on-ramp was learning to evaluate search engine results. This is possible. It is also, at the moment, entirely speculative. ↩︎

¹² The structural irony here is worth pausing on: the people designing the AI workflows have the judgment to use AI well because they came up through the old system of peripheral participation. They are, in a sense, pulling up the ladder behind them, not out of malice but out of the genuinely innocent failure to notice the ladder was there. ↩︎

¹³ This is the question that keeps me up. Not the 3am kind. The Tuesday-afternoon-staring-at-a-Slack-thread kind, which is arguably worse because you can't even romanticize it. ↩︎

¹⁴ "Collectively sort of skipped it" is, admittedly, a generous way to describe what happened. What actually happened is that the training market bifurcated into the two poles that are easiest to sell: the beginner workshop (low barrier, high volume, feels actionable) and the advanced technical course (high price tag, niche audience, feels prestigious). The middle, where most of the actual value lives, is harder to package and harder to sell, and so it mostly doesn't exist. ↩︎

¹⁵ This last characteristic is the one that trips people up the most. A human colleague who doesn't know the answer will usually say so, or at least hesitate. An AI will produce an answer with the same syntactic confidence whether it's drawing on solid information or fabricating. Learning to read the difference is a skill nobody's teaching because we haven't even agreed it's a skill yet. ↩︎

¹⁶ Organizations communicate through what they don't say at least as much as through what they do. The absence of clear AI guidelines isn't neutral. It's a message: proceed at your own risk. And conscientious people hear "risk" louder than they hear "proceed." ↩︎

¹⁷ I'm aware that comparing AI proficiency to woodworking is the kind of analogy that could collapse under scrutiny, but the tactile dimension matters: the best AI users I've watched don't consult checklists. They feel when the output is wrong. They've internalized the frontier. And they got there the same way the carpenter got there, which is by running their hands over a lot of wood. ↩︎

¹⁸ Which, if you think about it, is what good management has always been: creating the conditions under which people can be productively wrong. The medium changes. The skill doesn't. ↩︎