"Neural Nets and the Brain Model Problem," Ph.D. dissertation, Princeton University, 1954. The first publication of theories and theorems about learning in neural networks, secondary reinforcement, circulating dynamic storage and synaptic modifications.
Computation: Finite and Infinite Machines, Prentice-Hall, 1967. A standard text in Computer Science. Out of print now, but soon to reappear.
Semantic Information Processing, MIT Press, 1968. This collection had a strong influence on modern computational linguistics.
Perceptrons (with Seymour A. Papert), MIT Press, 1969 (Enlarged edition, 1988). Developed the modern theory of computational geometry and established fundamental limitations of loop-free connectionist learning machines. Contrary to popular belief, the results in this book are not restricted to networks with 1, 2, or 3 layers; in fact, virtually every theorem is easily seen to apply to feedforward networks of any depth, with appropriate reductions in the still-exponential growth rates. Neural net theorists should read it again, this time attending to the central scaling issues.
Artificial Intelligence, with Seymour Papert, Univ. of Oregon Press, 1972. Out of print.
Robotics, Doubleday, 1986. Edited collection of essays about robotics, with Introduction and Postscript by Minsky.
The Society of Mind, Simon and Schuster, 1987. The first comprehensive description of the Society of Mind theory of intellectual structure and development. See also The Society of Mind (CD-ROM version), Voyager, 1996.
The Turing Option, with Harry Harrison, Warner Books, New York, 1992. Science fiction thriller about the construction of a superintelligent robot in the year 2023.