Machine intelligence and human intelligence differ in kind, not just in degree.

AI systems still lack working long-term memory and struggle to learn a topic from a handful of examples. Agents reason well, but they remain unreliable at detecting patterns between ideas that look unrelated on the surface.

Humans learn language, reading, social behaviour, and motor skills from comparatively few examples, and retrieve memories and form connections far faster, despite a much smaller working memory and retrieval capacity.

Agents do passable retrieval over a far larger working set. They read through huge archives, summarize long documents, and scale horizontally by splitting work across many copies of themselves. What they lack is precision and robustness when a connection has to be drawn across domains.

Machine intelligence is broad. Agents are knowledgeable, fast, scalable, and composable in ways humans are not. Their thinking is still shallow by comparison.

A human mind is not just one thread of reasoning. Humans constantly switch between different states of mind, and more importantly, they combine them. A structural engineer designing a bridge is not only making it stand. They are also accounting for wind, traffic load, the cost of materials, how it will be built in stages, and how it ages over thirty years. A parent is not only answering a question. They are also thinking about safety, emotion, timing, discipline, and the child's personality. A manager in a tense meeting is not only tracking the agenda. They are also reading social cues, remembering past disagreements, predicting reactions, and choosing words carefully.

In a human mind none of this is a separate task. It all runs together as one process.

Agents do not work this way. An agent performs best confined to one clear mode: draft this document, summarize this report, plan this trip, answer this question, find this information. Two modes at once is already harder. Ask it to be creative, precise, emotionally aware, legally careful, factually correct, and strategically thoughtful at the same time, and the output degrades.

This is why the analogy of "AI worker" is often misleading. To replace a person you have to replace every state of mind that person moves through in their work. A pilot is not just "flying the plane." They are watching the weather ahead, tracking fuel, listening to air traffic control, and keeping the passengers calm. A chef in a dinner rush is not just "cooking." They are timing every table at once, watching which cook is falling behind, tasting for seasoning, and protecting the cost of each plate. An electrician is not just "wiring a house." They are reading code requirements, planning the load, thinking about whoever repairs this in ten years, and making sure they don't get themselves killed.

Agents handle each of these well on its own, but each is a separate mode, and most human work needs several running at once. That asymmetry drives everything below.

Procedural memory versus associative memory

Agents retrieve memory procedurally. Something has to decide a fact is relevant, go fetch it from a file, a query, or a past message, and feed it back into the task. Nothing surfaces on its own. If the system doesn't go looking, the fact may as well not exist. This is why an agent can read a hundred documents and still miss the one thing a teammate would remember from a meeting six months ago. The fact is there. The agent just never had a reason to look for it.
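A minimal sketch of that procedural loop, assuming a hypothetical in-memory store and an explicit list of lookups chosen by the agent. The names MemoryStore, build_context, and the sample facts are illustrative only, not any particular framework's API. The point it tries to show is that a stored fact never reaches the task unless something names it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MemoryStore:
    """Hypothetical store: facts live here but never surface on their own."""
    facts: Dict[str, str] = field(default_factory=dict)

    def fetch(self, key: str) -> Optional[str]:
        # Retrieval only happens when a caller asks for a specific key.
        return self.facts.get(key)


def build_context(task: str, store: MemoryStore, lookups: List[str]) -> List[str]:
    """Assemble context from explicit lookups only.

    Anything not named in `lookups` stays invisible to the task,
    even though it is sitting in the store.
    """
    context = [task]
    for key in lookups:
        fact = store.fetch(key)
        if fact is not None:
            context.append(fact)
    return context


store = MemoryStore({
    "q3_report": "Q3 revenue grew 4%.",
    "march_meeting": "Vendor X missed two deadlines in March.",
})

# The agent judged only the report relevant, so the meeting note is never seen.
context = build_context("Should we renew Vendor X?", store, lookups=["q3_report"])
```

The meeting note is in the store the whole time. It is missing from the context only because no step decided to look for it.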

Human memory is associative. It surfaces on its own, usually because the current moment feels like an old one. A smell brings back a childhood kitchen. A tone of voice drops you back into an argument. A face feels familiar before you know why. This is unreliable: human memory is lossy, biased, often plain wrong. But the mechanism is doing something agents don't. We retrieve context in the background without asking for it. Agents select context. Humans get interrupted by it.

Association without representation

Humans connect ideas that may have no surface relationship at all. A number, a colour, a weekday, and a place from childhood can fuse in your head with no logical path between them. Agents relate ideas through represented similarity: similar words, similar documents, similar concepts. Humans also relate them through felt similarity, which is much weirder and harder to fake.
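To make "represented similarity" concrete, here is a rough sketch that scores two sentences by cosine similarity over word counts. This is a deliberately crude stand-in: real systems typically use learned embeddings that capture far more than word overlap, and the function name and sample sentences are assumptions made for illustration. The limitation it points at is the same, though: a link can only be scored if it is present in the representation.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (a toy proxy for embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only words that appear in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


query = "the kitchen smelled of fresh bread"
related_on_the_surface = "fresh bread in the kitchen"
related_only_by_feeling = "tuesday afternoons were always yellow"

surface_score = cosine_similarity(query, related_on_the_surface)
felt_score = cosine_similarity(query, related_only_by_feeling)
```

The second pair shares no words, so this measure scores it as zero, even though a person might link the two instantly through a private memory.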

Dreams are the extreme case of this associative pull, and the oldest evidence we have that the mind retrieves by feeling rather than by category. People have recorded and tried to interpret dreams for thousands of years, from Babylonian omen tablets to Freud to modern sleep labs, and the mechanism is still not settled. We do not fully know why we dream, how the brain constructs a dream, or what the experience is for. But the structure is revealing. A dream stitches together people, places, and feelings that have no business being together, and it stays coherent while it runs. It is not random like noise. It is random like memory: it moves along emotional continuity rather than narrative logic, surfacing associations no one deliberately filed. That is the human retrieval system with the supervision turned off, and it still produces something that feels like a single, lived experience.

Agents based on LLMs struggle to replicate this kind of associative pattern, especially when it is poorly represented in their training data or not reinforced during post-training.

What agent scaffolding compensates for

The design of agent scaffolding — including coding harnesses, memory systems, skills, and post-training practices — reflects the inherent weaknesses of LLMs. Over the last few years we've discovered that agents perform better when they can:

  • delegate their tasks to more specialized agents (sub-agents)
  • write and retrieve their own memory
  • compact their own context

Modern agentic engineering has introduced these features and many others. All of them exploit the strengths of LLMs while compensating for their weaknesses. LLMs have become very good at long-context retrieval and benefit greatly from massive parallelism, but they still struggle to make optimal decisions consistently on demanding tasks.
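The three capabilities listed above (delegation, self-managed memory, context compaction) can be sketched in a few lines. This is a toy under stated assumptions, not how any real harness is built: the Scaffold class, its method names, and the lambda sub-agents are hypothetical, and the "summary" is a placeholder string where a real system would likely ask a model to summarise.

```python
from typing import Callable, Dict, List

# Hypothetical sub-agents: plain functions standing in for specialised model calls.
SubAgent = Callable[[str], str]


class Scaffold:
    def __init__(self, sub_agents: Dict[str, SubAgent], max_context: int = 4):
        self.sub_agents = sub_agents
        self.max_context = max_context
        self.context: List[str] = []
        self.memory: Dict[str, str] = {}

    def delegate(self, role: str, task: str) -> str:
        """Hand a task to one single-mode sub-agent and record the result."""
        result = self.sub_agents[role](task)
        self._append(f"{role}: {result}")
        return result

    def remember(self, key: str, value: str) -> None:
        """Write a fact to memory so a later step can fetch it explicitly."""
        self.memory[key] = value

    def recall(self, key: str) -> str:
        """Read a fact back; returns an empty string if nothing was stored."""
        return self.memory.get(key, "")

    def _append(self, entry: str) -> None:
        self.context.append(entry)
        if len(self.context) > self.max_context:
            self._compact()

    def _compact(self) -> None:
        """Collapse all but the newest entry into one placeholder line."""
        older, newest = self.context[:-1], self.context[-1]
        self.context = [f"[summary of {len(older)} earlier entries]", newest]


scaffold = Scaffold(
    {"researcher": lambda t: f"notes on {t}", "writer": lambda t: f"draft of {t}"},
    max_context=2,
)
scaffold.delegate("researcher", "bridge loads")
scaffold.remember("deadline", "Friday")
scaffold.delegate("writer", "bridge report")
scaffold.delegate("writer", "cover letter")  # third entry triggers compaction
```

Each piece is a workaround for a single-mode worker: delegation keeps every call in one mode, memory has to be written and fetched by name, and compaction trades detail for room in the context.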

Where each kind of intelligence has the advantage

Agents are broad, fast, scalable, and composable. They read more than us, search more than us, and hold a larger working set than us. Inside a single mode they are often better than a competent person. Their weakness is integration: many domains at once, memory that should have surfaced but didn't, connections nobody wrote down.

Humans are the inverse. Small working memory, slow, biased, error-prone, forever overfitting to patterns. But the cognition is integrated. A person can be cautious and creative in the same breath, sense something is off before they can say why, and get reminded of the right thing without searching for it. For now, agents behave much less like whole humans and much more like individual states of mind.