Introduction

Author

Clayton Cafiero

Published

2026-06-01

About this course

What it is:

  • A bit of history, GOFAI (good old-fashioned AI), introduction to neural networks
  • Logic and logic programming (with Prolog); knowledge representation
  • Heuristic / informed search
    • A*, IDA*
    • Admissibility and consistency
    • Monte Carlo search (maybe)
  • Adversarial games
    • Minimax
    • Alpha-beta pruning
    • Expectiminimax
  • Markov models and Markov decision processes
  • Decision trees, ensemble learning, and random forests
  • Neural model and simple multi-layer perceptron networks
  • Other topics TBD

What it is not:

  • A substitute for CS 3540 / 6540 Machine Learning and Deep Learning
  • A course in AI policy or ethics
  • A course on logic or proof systems
  • A course in applied AI techniques

What is artificial intelligence?

  • “Reasoning” and problem-solving
  • Knowledge representation and ontologies
  • Decision-making and classification
  • Learning (many different ways)
  • “Perception” (gaining information about the environment)

Apart from poetic flights of fancy, Turing was among the first to address the question of artificial intelligence (1950). In fact, he poses the question “Can machines think?” in the first sentence of his paper. However, computing machinery wasn’t ready at the time (who knows? it may not be ready now!). John McCarthy, Marvin Minsky, Nathaniel Rochester, and others founded the field at the Dartmouth Summer Research Project on Artificial Intelligence in 1956. The term “artificial intelligence” was coined for this project, and it stuck. The project included discussions of deductive reasoning, symbolic logic, and early expert systems. Unfortunately, little of this project was preserved apart from the term “artificial intelligence” and some personal notes and photos.

What is AI?

Is there an adequate, widely-accepted definition of artificial intelligence? No, there is no consensus definition. Some would say AI is the study of heuristic approaches to intractable problems. But things get messy pretty quickly, and there’s a lot that’s called “AI” that doesn’t fit this description.

The goal of AI is to develop machines that behave as though they were intelligent.

–John McCarthy (paraphrased)


Artificial intelligence is the study of how to make computers do things at which, at the moment, people are better.

–Elaine Rich


AI is the ability of digital computers or computer-controlled robots to solve problems that are normally associated with the higher intellectual processing capabilities of humans.

–Encyclopedia Britannica


Weak AI

John Searle

Daniel Dennett
  • Human intelligence can be simulated, but the machine doing the simulation need not have conscious thought.
  • No claim is made about the ability to think or understand.

Strong AI

Pamela McCorduck

Ray Kurzweil
  • “The appropriately programmed computer with the right inputs and outputs would have a mind in exactly the same sense human beings have minds.”
  • Machines can be made to think and have genuine understanding.

Weak AI vs strong AI

Most AI researchers “don’t care about the strong AI hypothesis–as long as the program works, they don’t care whether you call it a simulation of intelligence or real intelligence.”

–Stuart Russell and Peter Norvig

A little history

  • “Pre-history”: many calculating machines; Leibniz (symbolic thought)

  • 1940s: Game theory (von Neumann, others)

  • 1943: McCulloch and Pitts introduce a mathematical model of simple neural networks, which laid the groundwork for research into how neurons can process information.

  • 1950: Turing’s “Computing Machinery and Intelligence”—the imitation game.

  • 1950: Nash’s dissertation on non-cooperative games

  • 1951: First neural network (Minsky, 40 neurons)

  • 1956: Dartmouth Conference, where the term “artificial intelligence” was first used. McCarthy, Minsky, Shannon, Rochester, and others participated.

  • 1956: Logic Theorist (Newell and Simon)

  • 1957: Perceptron model introduced by Rosenblatt

  • 1958: LISP (McCarthy)

  • 1959: GPS (general problem solver, Newell and Simon)

  • 1959: MIT AI lab founded

  • 1967: ELIZA (Weizenbaum / MIT)

  • 1968: A* algorithm (Hart, Nilsson, Raphael)

  • 1969: Minsky and Papert elaborate on perceptron models in their book Perceptrons; XOR problem

  • 1972: Introduction of Prolog (Colmerauer and Kowalski)

  • 1976: MYCIN expert system (Shortliffe / Stanford School of Medicine)

  • 1986: Backpropagation algorithm (Hinton, Rumelhart, Williams)

  • 1989: Backpropagation and CNN for handwritten digits (LeCun)

  • 1997: Deep Blue vs Kasparov

  • 2006: Resurgence of research into neural networks (Hinton)

  • 2007: Checkers solved

  • 2008: GPUs emerge as a platform for general-purpose computation

  • 2012: ImageNet and AlexNet (Krizhevsky, Sutskever, Hinton)

  • 2014: GANs (generative adversarial networks; Goodfellow)

  • 2016: AlphaGo (DeepMind / Google) beats Lee Sedol

  • 2018: BERT (bidirectional encoder representations from transformers; Google) and GPT (generative pre-trained transformer)

  • 2022: OpenAI’s ChatGPT open beta released to the public

  • 2023: Anthropic’s Claude released to public

AI winters

Approximately 1974–1980 and 1987–2000

  • Brought about by failure to achieve success in certain fields (machine translation, others)
  • Criticism of perceptron model, Lighthill report, DARPA cutbacks

We are currently in a boom phase.

A busy picture

Symbolic vs connectionist AI

Symbolic approach

  • Turing, Newell and Simon, McCarthy, Colmerauer, etc.
  • Logic programming and many “traditional” GOFAI techniques
  • Proof systems and expert systems, decision trees, etc.
    • Generally require knowledge engineering (bottleneck)
    • Based on models of the world
    • Deductive reasoning and logical inference
    • Useful when verification and complete explanation are needed
    • Can be very accurate for specific tasks
    • Trouble handling noisy data
    • High human effort; high maintenance

Connectionist approach

  • McCulloch and Pitts, Minsky, Hinton, LeCun, Bengio, Goodfellow, Krizhevsky, Sutskever, etc.
  • Neural networks, transformers, SVMs, etc.
    • Generally require a lot of data (which may require labeling)
    • Do not need a model of the world (statistical inference)
    • Typically trained off-line (at least initially)
    • Not as much explanatory power as symbolic approaches
    • Can be very accurate for specific tasks
    • More robust when presented with noisy data
    • Computationally very expensive in training and inference
    • Currently booming
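
As a rough illustration of the contrast between the two approaches, here is a minimal Python sketch. It is not taken from the course materials, and every name in it (rule_based_and, train_perceptron, predict, AND_EXAMPLES) is hypothetical. The symbolic version encodes logical AND as a rule a person wrote down; the connectionist version is a single perceptron that is given only labeled examples and adjusts its weights until its outputs match. AND is linearly separable, so the perceptron learning rule should settle on a correct set of weights; the epoch count and learning rate are arbitrary assumptions.

```python
# Symbolic approach: a person encodes the knowledge directly as a rule.
def rule_based_and(x1: int, x2: int) -> int:
    """Hand-written rule for logical AND (hypothetical example)."""
    return 1 if (x1 == 1 and x2 == 1) else 0


# Connectionist approach: the same behavior is learned from labeled data.
AND_EXAMPLES = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]


def predict(w, b, x1, x2):
    """Threshold unit: fire (1) if the weighted sum exceeds zero."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(examples, epochs=20, lr=1.0):
    """Perceptron learning rule; epochs and lr are assumed values."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in examples:
            err = target - predict(w, b, x1, x2)
            # Nudge weights and bias only when the prediction is wrong.
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b


if __name__ == "__main__":
    w, b = train_perceptron(AND_EXAMPLES)
    for (x1, x2), _ in AND_EXAMPLES:
        print((x1, x2), rule_based_and(x1, x2), predict(w, b, x1, x2))
```

The rule can be read and verified at a glance, but someone had to write it. The perceptron needed no rule, only data, but its "knowledge" is a handful of numbers that explain little on their own.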

Copyright © 2023–2026 Clayton Cafiero

No generative AI was used in producing drafts of this material. This was written the old-fashioned way. AI was used to rewrite existing pseudocode in LaTeX to produce standalone *.tex files for rendering, and for revisions toward satisfying WCAG 2.1 AA-level accessibility standards as required by UVM policy. It may also have been used to proofread selected human-written prose. Claude 2.1.150 with model Sonnet 4.6. Revisions, if any, were performed by the author. AI was not used in generating any code whatsoever. All code samples and starter code are by the author only.