David MacKay

### Supervision information

Supervisions are Thursdays at 2pm and 5.30pm starting Feb 2nd 2006.

#### What work to do

The book contains numerous exercises, complete with worked solutions. A subset of these exercises will be designated as the exercises you should work on each week. [Generally, these are the exercises marked 'highly recommended' by a marginal rat.] I encourage you not to look at the worked solutions for these exercises before you have attempted them yourself.

In the supervisions, we will focus in detail on just a few of the exercises. (You should work on all the exercises yourselves before supervisions.)

#### About the two weeks of the course so far

The first week's recommended exercises were: 1.3 (p.8), 1.5-7 (p.13), 1.9, & 1.11 (p.14).

The first supervision focused on exercises 1.9 and 1.11 (p.14). You all did a great job on this!

Recommended reading for the first four lectures was: Chapters 1, 2, 4, and 5.

Make sure you are happy with the facts that:
a) entropy is maximized by the uniform distribution (exercise 2.25, page 37; solution on page 43);
b) the Gibbs inequality (section 2.6) is true (exercise 2.26; solution on p.44).
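Both facts are easy to check numerically. Here is a quick sketch (my illustration, not from the book) that samples random distributions and verifies that none beats the uniform distribution on entropy, and that the relative entropy in the Gibbs inequality is never negative; `n = 4` outcomes is an arbitrary choice.

```python
import math
import random

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) in bits; the Gibbs inequality says this is >= 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def random_distribution(n):
    """A random probability distribution over n outcomes."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

n = 4
uniform = [1.0 / n] * n
for _ in range(1000):
    p = random_distribution(n)
    # (a) entropy is maximized by the uniform distribution: H(p) <= log2(n)
    assert entropy(p) <= math.log2(n) + 1e-9
    # (b) Gibbs inequality: D_KL(p || uniform) >= 0, with equality iff p is uniform
    assert kl_divergence(p, uniform) >= -1e-9

print(entropy(uniform))  # log2(4) = 2.0 bits
```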

#### About the next two weeks of supervisions

Recommended exercises before the 2nd supervision are: 5.22, 5.26, 5.27 (pages 102-3).

The second and third supervisions will focus on this question:

Invent a compressor and uncompressor for a source file of N=10,000 bits, each having probability f=0.01 of being a 1. Implement your inventions and/or estimate how well your method works (ex 5.29, page 103).
If it's helpful to you, here's a Huffman algorithm in perl, and here's a Huffman algorithm in python.
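The linked scripts are not reproduced on this page, so as a stand-in, here is a minimal sketch of the standard Huffman construction in python (an illustration, not the course's own script):

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for a dict {symbol: probability}.

    Returns a dict {symbol: codeword} with codewords as '0'/'1' strings.
    """
    # Each heap entry: (probability, tiebreak counter, {symbol: codeword-so-far})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # the two least probable nodes...
        p1, _, c1 = heapq.heappop(heap)
        # ...are merged, prefixing '0' to one subtree and '1' to the other
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
# 'a' gets a 1-bit codeword; 'c' and 'd' get 3-bit codewords
```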

Recommended exercises before the third supervision are: 6.7, 6.17.

Recommended reading: Chapter 6 (sec 6.1-6.3)

Advance reading: I highly recommend that you read ahead into chapters 8, 9, 10, in order to prepare for the lectures on noisy-channel coding.

If any of the recommended exercises cause insurmountable difficulties, please (1) ask a friend; (2) ask me in supervisions; or (3) ask me some other time.

##### (arrangements for 2005)

Supervisions will be at 3pm and 5pm.

```Dates:

Week 3	 3 Feb
Week 4	 10 Feb
Week 5	 17 Feb       **
Week 7	 3 Mar
Week 8	 10 Mar
Week 9	 17 Mar

** Note the exclusion of
Week 6	 24 Feb
which is intended to reduce your 6th week blues.
```
All supervisions will be in the Ryle seminar room (Rutherford building, upper floor), except that the 10th February supervision at 3pm will be in the Pippard Lecture Theatre.

### Suggested exercises during week 5

Ex 22.5, mixture of two Gaussians.
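Exercise 22.5 asks about the likelihood of data under a mixture of two Gaussians. Here is a sketch of evaluating that likelihood over a crude grid of candidate means; the data and parameter values below are made up for illustration:

```python
import math

def gaussian(x, mu, sigma):
    """Density of a Gaussian N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def mixture_log_likelihood(data, mu1, mu2, sigma, pi1=0.5):
    """Log-likelihood of data under pi1*N(mu1,sigma^2) + (1-pi1)*N(mu2,sigma^2)."""
    return sum(math.log(pi1 * gaussian(x, mu1, sigma)
                        + (1 - pi1) * gaussian(x, mu2, sigma))
               for x in data)

data = [-1.1, -0.9, 1.0, 1.2]  # illustrative data, not from the book
# Scan candidate means; the likelihood is highest when the two
# component means sit near the two clusters in the data.
best = max(((mu1, mu2) for mu1 in [-2, -1, 0] for mu2 in [0, 1, 2]),
           key=lambda m: mixture_log_likelihood(data, m[0], m[1], sigma=0.5))
```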

### Suggestions for supervision number 4

- Ex 3.10 (p.57), children
- 8.10, black and white cards, I(;)
- 9.19, TWOS
- 9.20, birthday problem
- 15.5, 15.6 (p.233), magic trick
- 22.8, photon counter

- 8.3 (p.140), 8.7, 22.4, (mu/sigma)

### Suggestions for supervision number 5

Main activities of the supervision may be: exercise 27.1 (Laplace approximation) and Handout 2 (two pages; available as postscript and pdf).

Also recommended: Exercises 29.14, 33.5, 33.7.
22.11 sailor
22.14 maximize density
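For exercise 27.1, recall the idea of the Laplace method: approximate Z = ∫ exp(-g(x)) dx by fitting a Gaussian at the minimum x0 of g, giving Z ≈ exp(-g(x0)) √(2π/g''(x0)). A sketch with a made-up energy function, checked against brute-force integration:

```python
import math

def laplace_approx(g, gpp, x0):
    """Laplace approximation to Z = integral of exp(-g(x)) dx:
    fit a Gaussian at the minimum x0, giving exp(-g(x0)) * sqrt(2*pi / g''(x0))."""
    return math.exp(-g(x0)) * math.sqrt(2 * math.pi / gpp(x0))

def numeric_integral(g, lo=-10.0, hi=10.0, n=100000):
    """Brute-force trapezoidal check of the same integral
    (the integrand is negligible at the endpoints)."""
    h = (hi - lo) / n
    return sum(math.exp(-g(lo + i * h)) for i in range(n + 1)) * h

# Illustrative (made-up) energy: g(x) = x^2/2 + 0.1 x^4, minimized at x0 = 0
g = lambda x: x * x / 2 + 0.1 * x ** 4
gpp = lambda x: 1 + 1.2 * x * x              # second derivative of g
Z_laplace = laplace_approx(g, gpp, x0=0.0)   # sqrt(2*pi), about 2.507
Z_true = numeric_integral(g)                 # smaller: the quartic term thins the tails
```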

### Compression challenge - compress me!

One of the main assignments for this course is to write compression and uncompression algorithms for a sparse file containing N=10000 bits, each of which has probability roughly 0.01 of being a 1; your algorithm should work well on all such files. Please email to djcm1 the compressed size that your algorithm achieves, and perhaps write a brief webpage summarising your method. If you'd like to use the Huffman algorithm in constructing your solution, here's one in perl. [Worked solutions: C programs for compression]
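As a benchmark, the information content of such a file is about N·H2(f) ≈ 808 bits, so that is the target to aim for. The sketch below computes that bound and tries a naive gap-coding scheme (my illustration, not one of the worked solutions):

```python
import math
import random

def h2(f):
    """Binary entropy function H_2(f) in bits."""
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)

# Any compressor for this source needs, on average, about N*H_2(f) bits
N, f = 10000, 0.01
print(round(N * h2(f)))  # prints 808, the target to aim for

# A naive sketch compressor: encode each gap between successive 1s
# as a fixed-width 14-bit integer (2^14 > 10000, so every gap fits).
def compress_gaps(bits):
    gaps, last = [], -1
    for i, b in enumerate(bits):
        if b == 1:
            gaps.append(i - last - 1)
            last = i
    return "".join(format(gap, "014b") for gap in gaps)

bits = [1 if random.random() < f else 0 for _ in range(N)]
compressed = compress_gaps(bits)
# ~100 ones -> ~1400 bits: already a 7x saving, though well short of the 808-bit ideal
```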

Another of the three main assignments is this:

• implement or describe a method to distinguish, based on a string of successive characters from a document, whether the document is an English-language document or a totally-random document;
• OR whether the document is an English-language document or German;
• OR whether the document is an English-language document or backwards-English (siht ekil);
• estimate how many successive characters your method needs in order to work well.
• English and German letter frequencies are provided, in case you find them useful.
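One natural approach is a per-character log-likelihood-ratio test. The sketch below uses a few illustrative (approximate) English letter probabilities — substitute the provided frequency tables for real use — and accumulates evidence for 'English' against 'random':

```python
import math

# Illustrative, approximate letter probabilities; the course's frequency
# tables should be used instead where available.
english = {"e": 0.127, "t": 0.091, "a": 0.082, "o": 0.075, "z": 0.001}

def p_english(c):
    """Probability of letter c under the English model; letters not listed
    above share the remaining probability mass equally (a crude placeholder)."""
    return english.get(c, (1 - sum(english.values())) / (26 - len(english)))

def p_random(c):
    """Probability of letter c under the random model: all letters equal."""
    return 1 / 26

def log_odds(text):
    """Accumulated log-evidence (in bits) for 'English' over 'random'.
    Positive and growing -> English; hovering near zero -> random."""
    return sum(math.log2(p_english(c) / p_random(c)) for c in text if c.isalpha())
```

The expected gain per character is the relative entropy D(P_english || P_random), so the number of characters needed to decide reliably is roughly your decision threshold divided by that relative entropy.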

An outline of the main topics to be covered in each of the six supervisions is as follows:

1. Design your own code; typicality; Gibbs's inequality
2. Symbol codes, Arithmetic codes [Worked solutions: C programs for compression]
3. Capacity of channel; noisy channel coding theorem
4. Bayesian inference
5. Monte Carlo methods, variational methods
6. Single neuron, Hopfield network

#### First supervision

``` Some topics for discussion:
1.9      [p14]  make your own code
2.28     [p38]  efficient computation of entropy
2.36     [p39]  Bayes's theorem
```
A list of all the recommended exercises for the course is being refined below.

Before each supervision, please hand in a sheet of paper, with a summary of your REACTIONS to the chapters of the book. Please write one paragraph on each chapter, giving the parts that were difficult to follow, the examples that were helpful, the examples that were not included that should have been, anything that's still unclear or confusing. These reactions will help guide the supervision onto worthwhile topics of discussion. Reactions may be sent by email.
Particularly recommended exercises for 2004 are as follows (with bold highlighting exercises to be discussed in supervisions):

| Exercise | Purpose |
| --- | --- |
| 1.3 (p.7) | familiarity with repetition code, and approximation |
| 1.5, 1.6, 1.7 (p.13) | familiarity with Hamming code |
| 1.9 | make your own code - very likely to be an exam question |
| 2.4, 2.5 (p.27) | familiarity with binomial distribution, how to compute variances |
| 2.21-2.26 (p.37) | working with probabilities, proving inequalities |
| 2.28 | efficient computation of entropy |
| 2.16 (p.36) | law of large numbers |
| 2.37 | Bayes's theorem |
| 2.17 | manipulating probabilities |
| 2.20 | life in high dimensions |
| 3.3 (p.47) | Bayes's theorem |
| no specific exercise | typicality |
| 5.21, 5.22 (p.102) | make optimal symbol code |
| 5.29 | creative use of Huffman algorithm - likely to be part of an exam question |
| 6.2, 6.3, 6.9 | arithmetic coding - likely to be part of an exam question |

[From here onwards are the old exercises for 2003; watch this space for updates to 2004.]

| Exercise | Purpose |
| --- | --- |
| 8.3 (p.157), 8.4, 8.6, 8.7, 8.10 (p.159) | two-variable entropy expressions |
| 9.17 (p.175), 10.12 (p.196), 15.10, 15.11 (p.262), 15.13 | computing capacity of a channel |
| 9.20 (p.176) | "birthday problem" - collisions between random strings |
| 9.19 | modelling real information sources and channels |
| 15.3 (p.260), 15.4, 15.5 | counting arguments, computing mutual information |
| 24.4, 24.5 (p.332), 24.8 (p.337) (cut from list: 24.12) | finding maximum-likelihood parameters |
| 24.14 | life in high dimensions; atypicality of the most probable state |
| 24.15 | occasional silliness of maximum-likelihood answers |
| 29.1 (p.377) | Laplace approximation |
| 31.13, 31.14 (p.415) | Metropolis method |
| 31.4 (p.404), 31.5 | Gibbs sampling |
| 33.3 (p.447) | think creatively about making a content-addressable memory |
| 35.5, 35.6, 35.7 | variational methods |
| 37.5 (p.484) | inference problems |
| 42.1 (p.510) | perceiving connections between topics |
| 42.2 (p.513) | deriving simple learning algorithms |
| 42.5 (p.519) | inferences as "neurons" |
| 43.1 (p.529) | counting (this topic is probably not examinable) |
| 43.3 | information in the real world |
| 45.3, 45.4 (p.551), 45.9 (p.560) | Hopfield net |

For further suggested reading and examples before the supervision, see the course summary page.