Information Theory, Pattern Recognition and Neural Networks (1997)

This directory contains old lecture notes from my 1997 course. You should only take stuff from here if you really can't find what you want on the up-to-date site.

If you are an undergraduate in Cambridge, please go away. No peeking at last year's notes!


Lecture notes

Downloading options: choose http or ftp, and choose the one-page or two-page format.

Chapter 1 - Lecture 1 | two-page | html | (ftp from cambridge) 20 pages.
Introduction to information theory

Part I: data compression

Chapter 2 - Lectures 2 and 3 | two-page | (ftp from cambridge) 12 pages.
Data Compression: Source Coding Theorem
Chapter 3 - Lecture 4 | two-page | (ftp from cambridge) 7 pages.
Data Compression: Symbol Codes
Chapter 4 - Lecture 5 | two-page | (ftp from cambridge) 12 pages.
Data Compression: Stream Codes

Part II: communication over noisy channels

Chapter 5 - Lecture 6 | two-page | (ftp from cambridge) 10 pages.
Noisy Channels
Chapter 6 - Lecture 7 | two-page | (ftp from cambridge) 8 pages.
Noisy Channel Coding Theorem
Chapter 7 - Lecture 8 | two-page | (ftp from cambridge)
Real Channels and Practical Error Correcting Codes

Part III: probabilities

Chapter 8 - Lecture 9 | two-page | (ftp from cambridge)
Bayesian Inference
Chapter 9 - Extra reading | two-page | (ftp from cambridge)
Ising models
Chapter 10 - Lecture 10 | two-page | (ftp from cambridge)
Variational methods
Chapter 11 - Lecture 10 | two-page | (ftp from cambridge)
Monte Carlo methods
Extra figures for chapters 2, 3, 6 and 7 (ftp from cambridge)
You only need these figures if you got an *old* (pre-97.02) version of the above chapters.
Extra figures for chapter 7 (ftp from cambridge)
You only need these figures if you got an *old* (pre-97.02) version of chapter 7.

Part IV: neural networks

Chapter 12 - Lecture 11 | two-page | (ftp from cambridge) 3 pages.
Introduction to neural networks
Chapters 13/14 - Lectures 11 & 13 | two-page | (ftp from cambridge) 26 pages.
The single neuron as a classifier; Learning as Inference
Chapter 15 - Lecture 12 | two-page | (ftp from cambridge) 10 pages.
The capacity of a single neuron
Chapter 16 - Lectures 14/15 | two-page | (ftp from cambridge) 6 pages.
The Hopfield network
Chapters 17,18 - Lectures 15/16 | two-page | (ftp from cambridge) 18 pages.
The Capacity of the Hopfield network; From Hopfield networks to Boltzmann Machines; Learning in multilayer networks
Appendix 1 | two-page | (ftp from cambridge) 24 pages.
Solutions to exercises in chapters 1-4
Appendix 2 | two-page | html | (ftp from cambridge) 11 pages.
Solutions to exercises in chapters 5 and 6, and a few others.
Poster (ftp from cambridge) 10 pages.
A poster presenting a summary of information theory and low-density parity-check codes.

Copyright issues: you are welcome to download and print the above documents, and to give them to your friends. Please do not make many copies (more than ten, say) without consulting me (mackay@mrao.cam.ac.uk). Thanks.

History:
Draft 1.1.1 - March 14 1997.
Draft 1.2.1 - April 4 1997.
Draft 1.2.3 - April 9 1997.
Draft 1.2.4 - April 10 1997. Margins altered to print better on North American paper.


Administrative stuff

Admin 1 (ftp from cambridge) 1 page.
Supervision arrangements
Synopsis (ftp from cambridge) 2 pages.
PostScript version of the synopsis, including an updated bibliography.

A message to Part III students, April 1997

Welcome back! Hope you had a good holiday.

There is no substantial new material on this web site compared with the lecture notes I handed out; for example, I have not been able to write up any more worked solutions.

I will be away in Canada from April 12th to May 1st, so I will only be reachable by email.

Errors - make an income for yourself!

Something I intended to say last term: I would be grateful to hear about errors in my lecture notes. If you find an error that has not already been reported, I'll pay you 50p or one pound, depending on whether it's a small or a big error. [This offer will be withdrawn if I appear to be going bankrupt.]

Synopsis

(Also see the up-to-date synopsis in the admin section.)

	Information Theory, Pattern Recognition and Neural Networks

                                                    Minor Option  
                                                    Lecturer: David MacKay 

Introduction to information theory                                    {1}
	The possibility of perfect communication over noisy channels.

Entropy and data compression                                    {3}
	Entropy, conditional entropy, mutual information.  Shannon's
	source coding theorem: entropy as a measure of information
	content. Codes for data compression. Uniquely decodeable codes
	and the Kraft-McMillan inequality.  Huffman codes.  Arithmetic
	coding.  Lempel-Ziv coding.
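
	As a quick illustration (my own Python sketch, not part of the notes
	or the synopsis): the entropy of an ensemble and a Huffman code for
	it. For the dyadic distribution below the expected length L equals
	the entropy exactly; in general Huffman gives H(X) <= L < H(X) + 1.

	    # Sketch: entropy and Huffman coding (symbols and probabilities
	    # are invented for illustration).
	    import heapq
	    from math import log2

	    def entropy(p):
	        """Shannon entropy in bits of a probability vector."""
	        return -sum(pi * log2(pi) for pi in p if pi > 0)

	    def huffman(probs):
	        """Return a prefix code {symbol: codeword} for {symbol: probability}."""
	        heap = [(p, i, (s,)) for i, (s, p) in enumerate(probs.items())]
	        heapq.heapify(heap)
	        code = {s: "" for s in probs}
	        count = len(heap)
	        while len(heap) > 1:
	            p0, _, group0 = heapq.heappop(heap)   # two least probable groups
	            p1, _, group1 = heapq.heappop(heap)
	            for s in group0:
	                code[s] = "0" + code[s]           # prepend a bit to each merged symbol
	            for s in group1:
	                code[s] = "1" + code[s]
	            heapq.heappush(heap, (p0 + p1, count, group0 + group1))
	            count += 1
	        return code

	    probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
	    code = huffman(probs)
	    H = entropy(probs.values())
	    L = sum(p * len(code[s]) for s, p in probs.items())
	    print(code, H, L)   # expected length 1.75 bits = entropy (dyadic case)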

Communication over noisy channels                                   {3}
	Definition of channel capacity.  Capacity of binary symmetric
	channel; of binary erasure channel; of Z channel.  Shannon's
	noisy channel coding theorem.  Practical error-correcting
	codes.
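
	Two standard results worth having at your fingertips (a worked
	example, not a substitute for the chapters above). The capacity is
	C = max over p(x) of I(X;Y); for the binary symmetric channel with
	flip probability f,

	    C = 1 - H_2(f),   where H_2(f) = f log2(1/f) + (1-f) log2(1/(1-f)),

	so at f = 0.1, H_2(0.1) is about 0.469 and C is about 0.531 bits per
	channel use. The binary erasure channel with erasure probability f
	has C = 1 - f.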

Statistical inference, data modelling and pattern recognition     {3}
	The likelihood function and Bayes' theorem.
	Inference of discrete and continuous parameters.
	Curve fitting. Classification. Density estimation.
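
	A minimal sketch (mine, not from the notes) of the simplest Bayesian
	inference: the posterior over a coin's bias p, computed on a grid by
	P(p|D) proportional to P(D|p) P(p).

	    # Sketch: grid-based posterior for a coin's bias (data invented).
	    import numpy as np

	    grid = np.linspace(0.01, 0.99, 99)        # candidate values of the bias p
	    prior = np.ones_like(grid) / grid.size    # uniform prior
	    heads, tails = 7, 3                       # observed data D
	    likelihood = grid**heads * (1 - grid)**tails
	    posterior = prior * likelihood
	    posterior /= posterior.sum()              # normalise by the evidence P(D)
	    print((grid * posterior).sum())           # posterior mean ~ (7+1)/(10+2) = 0.67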

Neural networks as information storage devices                         {2}
	Capacity of a single neuron.  Hopfield 
	network and its relationship to spin glasses. 
	Boltzmann machine and maximum entropy. 
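
	A toy version of the Hopfield network (my own sketch; sizes and seed
	are arbitrary): Hebbian storage of random +/-1 patterns, then
	asynchronous updates that recover a corrupted pattern, since each
	update cannot increase the energy E = -(1/2) x^T W x.

	    # Sketch: Hopfield network with Hebbian learning.
	    import numpy as np

	    rng = np.random.default_rng(0)
	    N, K = 100, 5                             # 5 patterns, 100 neurons (low load)
	    patterns = rng.choice([-1, 1], size=(K, N))
	    W = patterns.T @ patterns / N             # Hebb rule
	    np.fill_diagonal(W, 0)                    # no self-connections

	    x = patterns[0].copy()
	    x[:20] *= -1                              # corrupt 20 of the 100 bits
	    for _ in range(5):                        # a few asynchronous sweeps
	        for i in rng.permutation(N):
	            x[i] = 1 if W[i] @ x >= 0 else -1
	    print((x == patterns[0]).sum(), "/", N)   # typically all 100 bits recovered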

Data modelling with neural networks                                      {2}
	Interpolation and classification using multilayer perceptrons. 
	Backpropagation algorithm.
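
	A sketch of backpropagation (mine, not from the notes): a
	one-hidden-layer network learning XOR by gradient descent on the
	squared error; the deltas are just the chain rule, layer by layer.

	    # Sketch: backpropagation on XOR (architecture and learning rate
	    # are arbitrary illustrative choices).
	    import numpy as np

	    rng = np.random.default_rng(1)
	    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
	    t = np.array([[0], [1], [1], [0]], dtype=float)

	    W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # 2 inputs -> 4 hidden
	    W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # 4 hidden -> 1 output
	    sigmoid = lambda a: 1 / (1 + np.exp(-a))

	    for step in range(20000):
	        h = sigmoid(X @ W1 + b1)              # forward pass
	        y = sigmoid(h @ W2 + b2)
	        d2 = (y - t) * y * (1 - y)            # output-layer deltas
	        d1 = (d2 @ W2.T) * h * (1 - h)        # hidden-layer deltas (chain rule)
	        W2 -= 0.5 * h.T @ d2; b2 -= 0.5 * d2.sum(axis=0)
	        W1 -= 0.5 * X.T @ d1; b1 -= 0.5 * d1.sum(axis=0)
	    print(y.round(2).ravel())                 # typically close to [0 1 1 0]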

Unsupervised neural networks                                             {2}
	Principal component analysis. Vector quantization. 
	Density modelling with neural networks. 
	Kohonen network. Helmholtz machine.
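
	A sketch of principal component analysis (mine, not from the notes):
	the principal components are the eigenvectors of the data covariance
	matrix, ordered by eigenvalue.

	    # Sketch: PCA by eigendecomposition (the data are synthetic).
	    import numpy as np

	    rng = np.random.default_rng(2)
	    X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
	    Xc = X - X.mean(axis=0)                   # centre the data
	    cov = Xc.T @ Xc / (len(Xc) - 1)           # sample covariance
	    evals, evecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
	    order = np.argsort(evals)[::-1]           # sort components by variance
	    print(evals[order])                       # variance along each component
	    print(evecs[:, order[0]])                 # first principal direction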

-----------------

Required courses: 1B Mathematics
----------------

References
----------

Berger, J. (1985)
 Statistical Decision Theory and Bayesian Analysis. Springer.

Bishop, C.M. (1995)
 Neural Networks for Pattern Recognition. Oxford University Press.

Blahut, R.E. (1987)
 Principles and Practice of Information Theory. Addison-Wesley.

Box, G.E.P. & Tiao, G.C. (1973)
 Bayesian Inference in Statistical Analysis. Addison-Wesley.

Bretthorst, G. (1988)
 Bayesian Spectrum Analysis and Parameter Estimation. Springer.

Cover, T.M. & Thomas, J.A. (1991)
 Elements of Information Theory. Wiley.

Duda, R. & Hart, P. (1973)
 Pattern Classification and Scene Analysis. Wiley.

Hertz, J., Krogh, A. & Palmer, R.G. (1991)
 Introduction to the Theory of Neural Computation. Addison-Wesley.

Jeffreys, H. (1939)
 Theory of Probability. Oxford University Press.

McEliece, R.J. (1977)
 The Theory of Information and Coding: A Mathematical Framework
 for Communication. Addison-Wesley.

Rosenkrantz, R.D., ed. (1983)
 E.T. Jaynes: Papers on Probability, Statistics and Statistical
 Physics. Kluwer.

Witten, I.H., Neal, R.M. & Cleary, J.G. (1987)
 Arithmetic coding for data compression.
 Communications of the ACM 30 (6):520-540.

My old 8-lecture course on Information Theory.
David MacKay <mackay@mrao.cam.ac.uk>
Last modified: Tue Dec 23 17:04:47 1997