First hour: why should we talk about estimation?
Basic structure: the Markov chain f(\theta) --- \theta --- X^n --- T(X^n). The problem: estimate the parameter (or some function f of it) given (possibly dependent) data X^n.
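A minimal numerical sketch of this chain (not from the lecture; the Gaussian toy model and the choice of T as the sample mean are illustrative assumptions): \theta is drawn from a prior, X^n is drawn conditionally on \theta, and T(X^n) summarizes the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance of the chain f(theta) --- theta --- X^n --- T(X^n):
# theta ~ N(0, tau^2), X_i | theta ~ N(theta, sigma^2) i.i.d.,
# and T(X^n) = sample mean, a sufficient statistic for theta here.
tau, sigma, n = 1.0, 2.0, 100

theta = rng.normal(0.0, tau)            # draw the parameter
X = rng.normal(theta, sigma, size=n)    # data given theta
T = X.mean()                            # statistic T(X^n)

# f(theta): e.g. a binary feature of the parameter ("object present")
f = float(theta > 0)

# T concentrates on theta at rate sigma / sqrt(n)
print(abs(T - theta), sigma / np.sqrt(n))
```

Sufficiency of T here means that, given T(X^n), the rest of the data carries no further information about \theta.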
Example: \theta is a visual scene, f(\theta) indicates the presence of a given object, and X^n is the activity of some collection of cortical cells.
Basic questions:
1) What do we mean by "encode"? The collection of conditional distributions P(X^n ; \theta).
2) How much information does the population encode? I(X^n ; \theta), or some Bayes error.
3) How much information can the population encode? Constrained maximization problem again.
4) Data analysis: how do we measure how much info is encoded?
5) How is info "read out"?
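To make question 2 concrete: a hedged sketch (an illustrative linear Gaussian model, not the lecture's example) where I(\theta ; X^n) has a closed form, compared against a Monte Carlo estimate via the Gaussian identity I = -(1/2) log(1 - \rho^2).

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear Gaussian model: theta ~ N(0, tau^2), X_i = theta + N(0, sigma^2).
# The sample mean Xbar is sufficient, and
#   I(theta; X^n) = (1/2) log(1 + n tau^2 / sigma^2).
tau, sigma, n = 1.0, 2.0, 25
trials = 200_000

theta = rng.normal(0.0, tau, size=trials)
Xbar = theta + rng.normal(0.0, sigma / np.sqrt(n), size=trials)

# For jointly Gaussian scalars, I = -(1/2) log(1 - rho^2);
# estimate rho from the simulated (theta, Xbar) pairs.
rho = np.corrcoef(theta, Xbar)[0, 1]
I_mc = -0.5 * np.log(1 - rho ** 2)
I_exact = 0.5 * np.log(1 + n * tau ** 2 / sigma ** 2)
print(I_mc, I_exact)
```

Outside the Gaussian case no such closed form exists, which is exactly why question 4 (measuring encoded information from data) is hard.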
Special cases:
1) Detection/discrimination
2) Continuous (perhaps high-dimensional) estimation
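For the detection/discrimination special case, a small simulation (an assumed two-hypothesis Gaussian setup, for illustration only): the likelihood-ratio test between \theta = 0 and \theta = \mu thresholds the sample mean, and its error probability matches the Gaussian tail formula.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)

# Decide between theta = 0 and theta = mu from i.i.d. X^n ~ N(theta, sigma^2).
# With equal priors, the likelihood-ratio test thresholds the sample mean
# at mu/2, and its error probability is Phi(-sqrt(n) * mu / (2 * sigma)).
mu, sigma, n, trials = 1.0, 2.0, 16, 200_000

theta = mu * rng.integers(2, size=trials)               # equiprobable hypotheses
X = theta[:, None] + rng.normal(0.0, sigma, (trials, n))
decisions = np.where(X.mean(axis=1) > mu / 2, mu, 0.0)
p_emp = np.mean(decisions != theta)

def Phi(z):  # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

p_theory = Phi(-sqrt(n) * mu / (2 * sigma))
print(p_emp, p_theory)
```

The exponential decay of this error probability in n is what the Chernoff-information machinery in the second hour quantifies.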
Second hour: outline of mathematical approaches to the above questions.
Asymptotic approaches are usually required; we'll talk about four.
1) Asymptotics of linear Gaussian estimation
2) Maximum likelihood and, more generally, M-estimation and the associated empirical process theory: where K-L divergence and Fisher information come from
3) Asymptotics of posterior distributions and I(\theta ; X^n)
4) Hypothesis testing and Chernoff information
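Approach 2 above can be previewed numerically. A hedged sketch (a Bernoulli toy model of my choosing, not the lecture's): the MLE's rescaled fluctuations have variance close to the inverse Fisher information, sqrt(n)(\hat\theta - \theta) \to N(0, 1/I(\theta)) with I(\theta) = 1/(\theta(1-\theta)).

```python
import numpy as np

rng = np.random.default_rng(3)

# X_i i.i.d. Bernoulli(theta); the MLE is the sample mean, and
#   sqrt(n) (theta_hat - theta) -> N(0, 1 / I(theta)),
# with Fisher information I(theta) = 1 / (theta (1 - theta)).
theta, n, trials = 0.3, 400, 50_000

X = rng.random((trials, n)) < theta        # Bernoulli(theta) samples
theta_hat = X.mean(axis=1)                 # MLE in each replicate

var_scaled = n * np.var(theta_hat)         # empirical Var[sqrt(n)(theta_hat - theta)]
inv_fisher = theta * (1 - theta)           # asymptotic variance 1 / I(theta)
print(var_scaled, inv_fisher)
```

The same 1/I(\theta) scaling reappears in approach 3: the posterior over \theta contracts around the truth at the same Fisher-information rate.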