
Hitting the Books: Why a Dartmouth professor coined the term 'artificial intelligence'

John McCarthy's 'summer study' saw the first computers learning from data.

(Photo credit: Roger Ressmeyer via Getty Images)

If the Wu-Tang produced it in '23 instead of '93, they'd have called it D.R.E.A.M. — because data rules everything around me. Where once our society brokered power based on the strength of our arms and purse strings, the modern world is driven by data empowering algorithms to sort, silo and sell us out. These black box oracles of imperious and imperceptible decision-making determine who gets home loans, who gets bail, who finds love and who gets their kids taken from them by the state.

In their new book, How Data Happened: A History from the Age of Reason to the Age of Algorithms, which builds on their existing curriculum, Columbia University professors Chris Wiggins and Matthew L Jones examine how data is curated into actionable information and used to shape everything from our political views and social mores to our military responses and economic activities. In the excerpt below, Wiggins and Jones look at the work of mathematician John McCarthy, the junior Dartmouth professor who single-handedly coined the term "artificial intelligence"... as part of his ploy to secure summer research funding.

(Book cover image: WW Norton)

Excerpted from How Data Happened: A History from the Age of Reason to the Age of Algorithms by Chris Wiggins and Matthew L Jones. Published by WW Norton. Copyright © 2023 by Chris Wiggins and Matthew L Jones. All rights reserved.


Confecting “Artificial Intelligence”

A passionate advocate of symbolic approaches, the mathematician John McCarthy is often credited with inventing the term “artificial intelligence,” including by himself: “I invented the term artificial intelligence,” he explained, “when we were trying to get money for a summer study” to aim at “the long term goal of achieving human level intelligence.” The “summer study” in question was titled “The Dartmouth Summer Research Project on Artificial Intelligence,” and the funding requested was from the Rockefeller Foundation. At the time a junior professor of mathematics at Dartmouth, McCarthy was aided in his pitch to Rockefeller by his former mentor Claude Shannon.

As McCarthy describes the term’s positioning, “Shannon thought that artificial intelligence was too flashy a term and might attract unfavorable notice.” However, McCarthy wanted to avoid overlap with the existing field of “automata studies” (including “nerve nets” and Turing machines) and took a stand to declare a new field. “So I decided not to fly any false flags anymore.” The ambition was enormous; the 1955 proposal claimed “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

McCarthy ended up with more brain modelers than axiomatic mathematicians of the sort he wanted at the 1956 meeting, which came to be known as the Dartmouth Workshop. The event saw the coming together of diverse, often contradictory efforts to make digital computers perform tasks considered intelligent, yet as historian of artificial intelligence Jonnie Penn argues, the absence of psychological expertise at the workshop meant that the account of intelligence was “informed primarily by a set of specialists working outside the human sciences.” Each participant saw the roots of their enterprise differently. McCarthy reminisced, “anybody who was there was pretty stubborn about pursuing the ideas that he had before he came, nor was there, as far as I could see, any real exchange of ideas.”

Like Turing’s 1950 paper, the 1955 proposal for a summer workshop in artificial intelligence seems in retrospect incredibly prescient. The seven problems that McCarthy, Shannon, and their collaborators proposed to study became major pillars of computer science and the field of artificial intelligence:

  1. “Automatic Computers” (programming languages)

  2. “How Can a Computer be Programmed to Use a Language” (natural language processing)

  3. “Neuron Nets” (neural nets and deep learning)

  4. “Theory of the Size of a Calculation” (computational complexity)

  5. “Self-improvement” (machine learning)

  6. “Abstractions” (feature engineering)

  7. “Randomness and Creativity” (Monte Carlo methods including stochastic learning)

The term “artificial intelligence,” in 1955, was an aspiration rather than a commitment to one method. AI, in this broad sense, involved both discovering what comprises human intelligence by attempting to create machine intelligence and a less philosophically fraught effort simply to get computers to perform difficult activities a human might attempt.

Only a few of these aspirations fueled the efforts that, in current usage, became synonymous with artificial intelligence: the idea that machines can learn from data. Among computer scientists, learning from data would be de-emphasized for generations.

Most of the first half century of artificial intelligence focused on combining logic with knowledge hard-coded into machines. Data collected from everyday activities was hardly the focus; it paled in prestige next to logic. In the last five years or so, artificial intelligence and machine learning have begun to be used synonymously; it’s a powerful thought-exercise to remember that it didn’t have to be this way. For the first several decades in the life of artificial intelligence, learning from data seemed to be the wrong approach, a nonscientific approach, used by those who weren’t willing “to just program” the knowledge into the computer. Before data reigned, rules did.

For all their enthusiasm, most participants at the Dartmouth workshop brought few concrete results with them. One group was different. A team from the RAND Corporation, led by Herbert Simon and Allen Newell, had brought the goods, in the form of an automated theorem prover. This algorithm could produce proofs of basic arithmetical and logical theorems. But math was just a test case for them. As historian Hunter Heyck has stressed, that group started less from computing or mathematics than from the study of how to understand large bureaucratic organizations and the psychology of the people solving problems within them. For Simon and Newell, human brains and computers were problem solvers of the same genus.

Our position is that the appropriate way to describe a piece of problem-solving behavior is in terms of a program: a specification of what the organism will do under varying environmental circumstances in terms of certain elementary information processes it is capable of performing... Digital computers come into the picture only because they can, by appropriate programming, be induced to execute the same sequences of information processes that humans execute when they are solving problems. Hence, as we shall see, these programs describe both human and machine problem solving at the level of information processes.

Though they provided many of the first major successes in early artificial intelligence, Simon and Newell focused on a practical investigation of the organization of humans. They were interested in human problem-solving that mixed what Jonnie Penn calls a “composite of early twentieth century British symbolic logic and the American administrative logic of a hyper-rationalized organization.” Before adopting the moniker of AI, they positioned their work as the study of “information processing systems” comprising humans and machines alike, drawing on the best understanding of human reasoning of the time.

Simon and his collaborators were deeply involved in debates about the nature of human beings as reasoning animals. Simon later received the Nobel Prize in Economics for his work on the limitations of human rationality. He was concerned, alongside a bevy of postwar intellectuals, with rebutting the notion that human psychology should be understood as animal-like reaction to positive and negative stimuli. Like others, he rejected a behaviorist vision of the human as driven by reflexes, almost automatically, one in which learning primarily concerned the accumulation of facts acquired through such experience. Great human capacities, like speaking a natural language or doing advanced mathematics, never could emerge only from experience — they required far more. To focus only on data was to misunderstand human spontaneity and intelligence. This generation of intellectuals, central to the development of cognitive science, stressed abstraction and creativity over the analysis of data, sensory or otherwise. Historian Jamie Cohen-Cole explains, “Learning was not so much a process of acquiring facts about the world as of developing a skill or acquiring proficiency with a conceptual tool that could then be deployed creatively.”

This emphasis on the conceptual was central to Simon and Newell’s Logic Theorist program, which didn’t just grind through logical processes, but deployed human-like “heuristics” to accelerate the search for the means to achieve ends. Scholars such as George Pólya, investigating how mathematicians solved problems, had stressed the creativity involved in using heuristics to solve math problems. So mathematics wasn’t drudgery — it wasn’t like doing lots and lots of long division or of reducing large amounts of data. It was creative activity — and, in the eyes of its makers, a bulwark against totalitarian visions of human beings, whether from the left or the right. (And so, too, was life in a bureaucratic organization — it need not be drudgery in this picture; it could be a place for creativity. Just don’t tell that to its employees.)
