(1997 synopsis)

Information Theory, Pattern Recognition and Neural Networks
Minor Option

Lecturer: David MacKay

Introduction to information theory {1}
  The possibility of perfect communication over noisy channels.

Entropy and data compression {3}
  Entropy, conditional entropy, mutual information. Shannon's source coding theorem: entropy as a measure of information content. Codes for data compression. Uniquely decodeable codes and the Kraft-McMillan inequality. Huffman codes. Arithmetic coding. Lempel-Ziv coding.

Communication over noisy channels {3}
  Definition of channel capacity. Capacity of the binary symmetric channel, the binary erasure channel and the Z channel. Shannon's noisy-channel coding theorem. Practical error-correcting codes.

Statistical inference, data modelling and pattern recognition {3}
  The likelihood function and Bayes' theorem. Inference of discrete and continuous parameters. Curve fitting. Classification. Density estimation.

Neural networks as information storage devices {2}
  Capacity of a single neuron. Hopfield network and its relationship to spin glasses. Boltzmann machine and maximum entropy.

Data modelling with neural networks {2}
  Interpolation and classification using multilayer perceptrons. Backpropagation algorithm.

Unsupervised neural networks {2}
  Principal component analysis. Vector quantization. Density modelling with neural networks. Kohonen network. Helmholtz machine.

Required courses: 1B Mathematics

References
----------
Berger, J. (1985) Statistical Decision Theory and Bayesian Analysis. Springer.
Bishop, C.M. (1995) Neural Networks for Pattern Recognition. Oxford University Press.
Blahut, R.E. (1987) Principles and Practice of Information Theory. New York: Addison-Wesley.
Box, G.E.P. & Tiao, G.C. (1973) Bayesian Inference in Statistical Analysis. Addison-Wesley.
Bretthorst, G. (1988) Bayesian Spectrum Analysis and Parameter Estimation. Springer.
Cover, T.M. & Thomas, J.A. (1991) Elements of Information Theory. New York: Wiley.
Duda, R. & Hart, P. (1973) Pattern Classification and Scene Analysis. Wiley.
Hertz, J., Krogh, A. & Palmer, R.G. (1991) Introduction to the Theory of Neural Computation. Addison-Wesley.
Jeffreys, H. (1939) Theory of Probability. Oxford University Press.
McEliece, R.J. (1977) The Theory of Information and Coding: A Mathematical Framework for Communication. Reading, Mass.: Addison-Wesley.
Rosenkrantz, R.D., ed. (1983) E.T. Jaynes: Papers on Probability, Statistics and Statistical Physics. Kluwer.
Witten, I.H., Neal, R.M. & Cleary, J.G. (1987) Arithmetic coding for data compression. Communications of the ACM 30(6): 520-540.
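
Worked example
--------------
As a small illustration of the definitions met in the "Communication over noisy channels" lectures, the Python sketch below evaluates the closed-form capacities of the binary symmetric and binary erasure channels and finds the Z channel capacity by brute-force maximisation of the mutual information over input distributions. The function names, the choice f = 0.1 and the size of the search grid are illustrative choices only, not part of the course material.

    import numpy as np

    def xlogx(p):
        """Elementwise p * log2(p), with the convention 0 log 0 = 0."""
        p = np.asarray(p, dtype=float)
        out = np.zeros_like(p)
        nz = p > 0
        out[nz] = p[nz] * np.log2(p[nz])
        return out

    def H2(f):
        """Binary entropy function H_2(f) in bits, with H_2(0) = H_2(1) = 0."""
        if f <= 0.0 or f >= 1.0:
            return 0.0
        return -f * np.log2(f) - (1.0 - f) * np.log2(1.0 - f)

    def capacity_bsc(f):
        """Capacity of the binary symmetric channel with flip probability f."""
        return 1.0 - H2(f)

    def capacity_bec(f):
        """Capacity of the binary erasure channel with erasure probability f."""
        return 1.0 - f

    def capacity_two_input(Q, grid=10001):
        """Capacity of a two-input discrete memoryless channel, Q[j, i] = P(y=j | x=i),
        by brute-force maximisation of I(X;Y) over the input distribution."""
        best = 0.0
        for p1 in np.linspace(0.0, 1.0, grid):
            px = np.array([1.0 - p1, p1])                     # candidate input distribution
            py = Q @ px                                       # induced output distribution
            h_y = -xlogx(py).sum()                            # H(Y)
            h_y_given_x = -(px * xlogx(Q).sum(axis=0)).sum()  # H(Y|X)
            best = max(best, h_y - h_y_given_x)               # I(X;Y) = H(Y) - H(Y|X)
        return best

    if __name__ == "__main__":
        f = 0.1
        print("BSC capacity (f=0.1):", capacity_bsc(f))   # about 0.53 bits
        print("BEC capacity (f=0.1):", capacity_bec(f))   # 0.9 bits
        # Z channel: a 0 is always received correctly; a 1 is flipped to 0 with probability f.
        Qz = np.array([[1.0, f],
                       [0.0, 1.0 - f]])
        print("Z channel capacity (f=0.1):", capacity_two_input(Qz))  # about 0.76 bits

For the Z channel the search returns roughly 0.76 bits, noticeably higher than the binary symmetric channel at the same flip probability, because the 0 input is always transmitted without error.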