Fodor After ChatGPT: The Machine Works, But Do We Understand It?

This article is more than three years in the making. Since ChatGPT came into my consciousness, I have been thinking about the work of Fodor.

Jerry Fodor (1935–2017) was an American philosopher known for his work in the philosophy of mind, cognitive science and the philosophy of language. He is especially associated with three important ideas: the “language of thought”, the view that thinking takes place in an internal mental code; the modularity of mind, the idea that some mental systems, such as vision, are relatively specialised, fast, automatic and informationally encapsulated; and the argument that central thought (reasoning, judgement and belief revision) is much harder to model computationally.

His scepticism about computational modelling of central thought led him to propose “Fodor’s First Law of the Non-Existence of Cognitive Science”. It states, roughly, that the more global or isotropic a cognitive process is, the less we understand it. Encapsulated and modular systems, such as aspects of vision, are comparatively better understood. Global and unencapsulated systems, like analogical reasoning, are not understood in the same way.

Even before the technological leaps and bounds that produced LLMs, Fodor’s law was controversial. Why should analogical reasoning, thought, decision-making and problem-solving in human beings lie beyond scientific understanding?

But Fodor’s point was not that thinking does not happen. His point was that the parts of the mind we understand best are not the most central parts. We have made progress on perception, face recognition, language processing, memory impairment, and other relatively local functions. These systems appear, at least to some extent, to have boundaries. Damage one part of the system and a specific ability may be affected. A person may lose the ability to recognise faces. Another may struggle with spoken words. Another may have problems with object recognition.

But central thought is different.

When I decide whether to believe something, almost anything I know may become relevant. If I am deciding whether to trust a politician, I may think about history, personality, incentives, tribe, religion, money, previous promises, the press, my own bias, the bias of the person telling me the story, and even the timing of the statement. Nothing says in advance where the boundary of relevance lies.

This is what Fodor meant when he described central systems as isotropic or Quinean. A central belief-forming process is not neatly sealed off from the rest of the mind. In principle, anything can matter.

That is where the frame problem comes in. The frame problem began as a problem in artificial intelligence. Fodor used the thought experiment of a robot told to call Mary. The robot knows Mary’s number. It picks up the phone and begins to dial.

So far, so good. But the moment it acts, the world changes. The phone line is no longer free. The dial tone disappears. The robot’s fingers have moved. The telephone keypad is now in a different state. The call may connect or fail. Mary may answer or not answer. The robot may need to speak. Or listen. Or redial.

But millions of other things have not changed: the colour of the wall, the location of the moon, Mary’s childhood, the price of garri in Lagos, the price of fish and chips in London or Norway. So which facts should the robot update? Which should it ignore? Which should it keep available in case they become relevant later? What capacity would we require to store all this information, and how long would it take us to search through it?
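One classical engineering response to these questions can be sketched in a few lines. The sketch below assumes a STRIPS-style representation, in which each action lists the facts it adds and the facts it deletes, and every other fact is simply assumed to persist. It is an illustration, not Fodor's own formulation; the `Action` class, the `apply_action` function and all the fact names are hypothetical, chosen only to mirror the robot and Mary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hand-written action description (hypothetical, for illustration)."""
    name: str
    preconditions: frozenset  # facts that must hold before acting
    add: frozenset            # facts that become true
    delete: frozenset         # facts that stop being true


def apply_action(state, action):
    """Return the new state; anything not listed in add/delete persists."""
    if not action.preconditions <= state:
        raise ValueError(f"cannot perform {action.name} in this state")
    return (state - action.delete) | action.add


# A tiny world: three facts about the phone, two that should not matter.
state = frozenset({
    "robot_knows_marys_number",
    "line_free",
    "dial_tone",
    "wall_is_white",
    "moon_in_orbit",
})

dial_mary = Action(
    name="dial_mary",
    preconditions=frozenset({"robot_knows_marys_number", "line_free"}),
    add=frozenset({"line_busy", "call_in_progress"}),
    delete=frozenset({"line_free", "dial_tone"}),
)

after = apply_action(state, dial_mary)
unchanged = state & after  # facts carried over without any reasoning

print(sorted(after))
print(sorted(unchanged))
```

The sketch works only because somebody decided in advance, by hand, what dialling can affect. The add and delete lists are where the judgement of relevance was made, and the program inherits it rather than producing it. That is the part of the problem Fodor thought we did not understand.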

We do this sort of thing effortlessly every day. That is why it is so easy to underestimate the problem. We walk into a room, pick up a cup, greet somebody, hear a half-sentence, notice a facial expression, remember something from last week, ignore the colour of the ceiling, and continue with the conversation. We do not explicitly calculate all the consequences of every action. We just know what matters. Or, more accurately, we mostly know what matters. Sometimes we misjudge relevance, overreact and miss the obvious. Politics alone is enough to remind anyone that human beings do not always update their beliefs rationally (apologies to Herbert Simon, 1916–2001).

Fodor’s point was that a lot of cognition depends on this ability to determine relevance. And this ability does not look modular. It does not look like a small, encapsulated input system. It looks global. That is why he was sceptical about the prospects of a proper computational psychology of central thought. If reasoning requires access to everything one knows, and if there is no principled way to limit what counts as relevant, then the dream of a neat computational theory of thought begins to look suspect.

Then came large language models. One has to be careful here. There is a lazy way to respond to Fodor by saying, “Well, ChatGPT exists, therefore Fodor was wrong.” I do not think that is good enough. But there is also a lazy way to defend Fodor by saying, “LLMs only predict the next token, therefore they tell us nothing about thought.” That is also too quick.

Large language models have changed the discussion. They have shown that machines can do far more with relevance, analogy and context than many people expected. You can give an LLM a question about philosophy, a legal problem, a mathematical explanation, a poem, a piece of code, or a family argument, and it will often select the relevant background with astonishing fluency. It continues to astonish me how a system built around predicting the next token, sometimes less than a word, can be so effective in conversing with human beings.

This is important. Old-fashioned symbolic AI often looked brittle because someone had to write the rules, and it is very hard to work out all the rules explicitly up front. Human conversations and interactions are full of improvisation. A single response brings in judgement, background knowledge, examples, tone, context and much else. LLMs show that statistical learning over vast language data can approximate some of these capacities surprisingly well.

So yes, LLMs challenge Fodor’s law. They challenge the claim that global-looking cognition is beyond machine modelling. They show that a practical version of common-sense relevance can be engineered in language, at least to a surprising degree. They also show that instead of focusing only on explicit and exhaustive rules for flexible reasoning, statistical learning over vast amounts of data can take us much further than many expected.

They also challenge the neat boundary between language and thought. A language model is not merely handling words in the old narrow sense. It appears to carry within language a great deal of world structure, social expectation, causal pattern, and human habit.

But this is where caution is needed. What LLMs have challenged is Fodor’s pessimism about performance. What they have not fully challenged is his worry about explanation. An LLM can give a good answer without giving us a full theory of human thought. It may perform well on many tasks without possessing beliefs in the human sense. It may select relevant details without understanding relevance as an embodied agent living in the world. It may sound wise without being wise.

If you have had extensive discussions over hours in the same chat or context window with an LLM, as I have done, you will have seen this. The moment you move to a brand-new window for one reason or another, and the LLM no longer has the full context of the previous discussion, its performance degrades. By contrast, a conversation I had with a friend five years ago can be resumed today, even if we meet in a different setting.

This distinction matters. A calculator can outperform most human beings in arithmetic. That does not mean it has a human understanding of number. A chess engine can defeat grandmasters. That does not mean it understands those things that influence human performance in competition: ambition, fear of humiliation, thirst for success, tiredness, ego, nerves, and so on. Likewise, an LLM can produce excellent reasoning-like text. But we should not immediately conclude that it has solved the nature of reasoning.

There are also real failures. LLMs can hallucinate and sound confident when they are actually wrong. Change the wording and an LLM can produce an entirely different response. Ask it to keep track of a complicated state over a long conversation and you may discover the difference between sounding coherent and actually maintaining a stable model of the world.

This is why the frame problem has not disappeared. It has returned in a new form. The old robot had to decide what to update after dialling Mary’s number. The modern AI agent has to decide what to remember, what to forget, which tool to call, which source to trust, which instruction overrides which, whether a fact is stale, whether the user has changed their mind, and whether the world outside the chat has moved on. That is still the frame problem. Only now it wears better clothing.

LLMs are impressive in language space. But the full frame problem is not merely a problem of producing relevant sentences. It is a problem of being an agent in a changing world. It is about action, consequence, memory, truth, embodiment and responsibility. The world does not stop at the edge of the prompt.

This is why I think Fodor is both wrong and right.

He is wrong if his law means that cognitive science cannot make progress on central processes. It clearly can. AI, psychology, neuropsychology and neuroscience continue to make progress. The fact that we can now build systems that write essays, solve coding problems, summarise books, analyse images and hold extended conversations means that the bleakest form of Fodor’s pessimism cannot stand unchanged.

But he is right if his law is read as a warning. The more global the cognitive process, the harder it is to understand. We still do not really understand judgement. We do not fully understand common sense and analogy. We do not fully understand how human beings decide what matters. And perhaps most importantly, we do not fully understand how language, body, memory, culture, history and action combine to form what we call thought.

And this is one of the strange things about AI: we can build systems whose outputs often make sense, while still not fully understanding how those outputs were produced.

We use AI, but we do not fully understand the deep neural networks on which many of these systems are built. Sometimes, to make sense of what is going on inside them, we have to take their high-dimensional internal representations and reduce them to two or three dimensions using techniques such as Principal Component Analysis. Even then, what we get is not a full explanation of the model’s reasoning. At best, we get a glimpse of the patterns the model may have learnt.
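To make the idea concrete, here is a minimal sketch of Principal Component Analysis in plain Python. It assumes a small synthetic dataset standing in for a model's internal activations; these are not real LLM internals. It uses power iteration with deflation, one simple way to find the leading directions of variance. In practice one would normally reach for a numerical library. The function names and the five-dimensional toy data are hypothetical.

```python
import random


def centre(rows):
    """Subtract each column's mean so the data is centred on the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centred):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenvector(matrix, rng, iterations=500):
    """Power iteration: repeatedly multiply by the matrix and renormalise."""
    d = len(matrix)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


def pca(rows, k=2, seed=0):
    """Project rows onto their k directions of greatest variance."""
    rng = random.Random(seed)
    centred = centre(rows)
    cov = covariance(centred)
    d = len(cov)
    components = []
    for _ in range(k):
        v, eigenvalue = leading_eigenvector(cov, rng)
        components.append(v)
        # Deflation: remove the direction just found before looking again.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centred]
    return projected, components


# Synthetic "activations": 50 points in 5 dimensions whose variation is
# mostly driven by two hidden factors, a and b, plus a little noise.
data_rng = random.Random(42)
activations = []
for _ in range(50):
    a, b = data_rng.gauss(0, 3), data_rng.gauss(0, 1)
    activations.append([
        a + data_rng.gauss(0, 0.05),
        a + data_rng.gauss(0, 0.05),
        b + data_rng.gauss(0, 0.05),
        -b + data_rng.gauss(0, 0.05),
        data_rng.gauss(0, 0.05),
    ])

projected, components = pca(activations, k=2)
print(projected[0])   # the first point, now described by two numbers
print(components[0])  # the direction of greatest variance
```

Each five-number point is reduced to two numbers that can be plotted. The plot shows where the data varies most. It does not say why the data varies that way, which is the gap between a glimpse and an explanation.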

That is why I think some humility is needed. We are building systems that can astonish us, while still not fully understanding how they produce many of their answers.

This is where the African angle interests me.

In Silicon Valley, the temptation is often to treat intelligence as information processing that can be scaled. Add data. Add compute. Add parameters. Add tools. Then one day, perhaps, intelligence emerges.

But in many African contexts, the question is not only whether a mind can be uploaded, simulated, automated or scaled. The question is also: where is this mind located? Who owns the infrastructure? Who pays for the generator? Which language carries the meaning? Which ancestors are being remembered? Which bodies are being excluded? Which histories are being flattened into data?

Fodor worried that central cognition was not encapsulated. African life almost proves the point every day. Nothing is ever merely one thing. A power cut is not just a technical failure. It is economics, politics, class, infrastructure, prayer, memory, corruption, improvisation and comedy. A family decision is not just individual preference. It is kinship, duty, money, faith, migration, reputation and survival. Relevance is everywhere. That may be why the frame problem feels less like an abstract puzzle and more like ordinary life. The machine asks: what has changed? The human asks: what does it mean? And in real life, meaning is rarely local.

So perhaps Fodor’s First Law should not be discarded. It should be rewritten.

The more global a cognitive process is, the less likely it is that one discipline, one model, one architecture or one culture will fully explain it.

LLMs have shown that machines can go much further than Fodor expected. They can approximate relevance. They can imitate reasoning. They can extend memory. They can become collaborators in thought. But they have not abolished the mystery of central cognition. They have made the mystery more practical, more visible, and more urgent.

Before LLMs, Fodor’s law sounded like a warning against artificial intelligence. After LLMs, it sounds like a warning against intellectual arrogance. The danger now is not that machines can do nothing. They can do a great deal. The danger is that because they can do a great deal, we may assume we understand more than we do.

But how much of LLMs do we, as the human race, understand in depth? Some will argue that they remain black boxes. Our confidence in them comes less from a full understanding of their internal workings and more from how they are trained, tested, benchmarked and observed in use. We trust them partly because they work, not because we can fully explain how they arrive at every answer.

And that, in the end, may be the most Fodorian lesson of all.
