% lab 8: Bayes Rule and Bayesian updates to probability
clear
close all



%% 1 - conditional and marginal probability (prosecutor's fallacy)
% see slides.
sensitivity = 0.9;              % probability of positive test given disease
specificity = 0.8;              % probability of negative test given no disease
prevalence = 0.02;              % probability of disease in the population (base rate)
n = 100000;                     % number of test subjects

%% Compute the full table of expected numbers of subjects: both the
% marginals (e.g., proportion and number who test negative) and the joint
% entries (e.g., proportion and number of subjects who test negative and
% don't have Covid)
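% A minimal sketch of one way to lay out the table (not the only one).
% The names pJoint, nJoint, pTestMarg and pDisMarg are illustrative
% assumptions, not part of the original lab. Rows are disease / no disease,
% columns are test positive / test negative.
pJoint = [prevalence*sensitivity,         prevalence*(1-sensitivity); ...
          (1-prevalence)*(1-specificity), (1-prevalence)*specificity];
nJoint    = n*pJoint;            % expected number of subjects in each cell
pTestMarg = sum(pJoint,1);       % marginals: [P(test +), P(test -)]
pDisMarg  = sum(pJoint,2);       % marginals: [P(disease); P(no disease)]
disp(array2table(nJoint,'VariableNames',{'TestPos','TestNeg'}, ...
    'RowNames',{'Disease','NoDisease'}));
fprintf(1,'P(test+) = %.3f (n = %.0f), P(test-) = %.3f (n = %.0f)\n', ...
    pTestMarg(1),n*pTestMarg(1),pTestMarg(2),n*pTestMarg(2));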

%% Compute P(infected | test positive) and P(not infected | test negative)
% in two different ways (printed as percentages):

% First method: Use the numbers in the table you computed above, directly

% Second method: Use Bayes Rule (you'll need to find the denominator in
% your table above)
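% A hedged sketch of both methods, written to stand on its own. The names
% nTP, nFN, nFP, nTN, pPos, ppvCount, npvCount, ppvBayes and npvBayes are
% illustrative assumptions.
% Method 1: ratios of expected counts
nTP = n*prevalence*sensitivity;             % infected, test positive
nFN = n*prevalence*(1-sensitivity);         % infected, test negative
nFP = n*(1-prevalence)*(1-specificity);     % not infected, test positive
nTN = n*(1-prevalence)*specificity;         % not infected, test negative
ppvCount = nTP/(nTP+nFP);                   % P(infected | test positive)
npvCount = nTN/(nTN+nFN);                   % P(not infected | test negative)
% Method 2: Bayes Rule, P(D|+) = P(+|D)P(D)/P(+), where the denominator
% P(+) is the marginal probability of a positive test
pPos = sensitivity*prevalence + (1-specificity)*(1-prevalence);
ppvBayes = sensitivity*prevalence/pPos;
npvBayes = specificity*(1-prevalence)/(1-pPos);
fprintf(1,'P(infected | +)     = %.2f%% (counts), %.2f%% (Bayes)\n', ...
    100*ppvCount,100*ppvBayes);
fprintf(1,'P(not infected | -) = %.2f%% (counts), %.2f%% (Bayes)\n', ...
    100*npvCount,100*npvBayes);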

%% 2 - binomial
% Suppose we collected N coin flips, of which a proportion p came up heads
% (or had the letter 'e', or whatever)
N = 10;
p = 4/10;
% If that proportion is ground truth, what is the mean, SD and variance of
% the count of heads in N coin flips?

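% Method 1: Use makedist(), mean(), std(), var()
% (store the results in mn, sd and vr: variables named mean, std or var
% would shadow the functions of the same name)
Bino = makedist('Binomial','n',N,'p',p);
mn = XXX   % etc. for sd and vr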
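% One possible completion, as a sketch: mean(), std() and var() accept the
% distribution object returned by makedist(). This assumes mean, std and
% var have not been overwritten by variables of the same name. The names
% mnDist, sdDist and vrDist are illustrative.
mnDist = mean(Bino);
sdDist = std(Bino);
vrDist = var(Bino);
fprintf(1,'mean=%.2f, SD = %.2f var = %.2f\n',mnDist,sdDist,vrDist);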

% Method 2: compute the full binomial distribution and derive the 
% statistics from that. You may use nchoosek()
values = 0:N;           % the possible numbers of heads
% Loop because nchoosek() won't take a vector argument
ETC ETC   % store the results in mn, sd and vr (not var, which would shadow var())
fprintf(1,'mean=%.2f, SD = %.2f var = %.2f\n',mn,sd,vr);
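% A sketch of Method 2 under the definitions above: build the binomial
% probabilities one count at a time, then take probability-weighted sums.
% The names pmf, k, h, mnPmf, vrPmf and sdPmf are illustrative.
pmf = zeros(size(values));
for k = 1:numel(values)
    h = values(k);                                  % number of heads
    pmf(k) = nchoosek(N,h) * p^h * (1-p)^(N-h);     % P(h heads in N flips)
end
mnPmf = sum(values .* pmf);                 % E[count]
vrPmf = sum((values - mnPmf).^2 .* pmf);    % E[(count - mean)^2]
sdPmf = sqrt(vrPmf);
fprintf(1,'mean=%.2f, SD = %.2f var = %.2f\n',mnPmf,sdPmf,vrPmf);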

% Method 3: Compute the values for a single coin flip, then generalize
% since this is the sum of N independent coin flips
bernmean = p;
bernvar = XXX
mn = XXX
vr = XXX   % named vr so that the built-in var() is not shadowed
sd = XXX
fprintf(1,'mean=%.2f, SD = %.2f var = %.2f\n',mn,sd,vr);
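% A sketch of Method 3: a single flip has mean p and variance p*(1-p), and
% both the mean and the variance of a sum of N independent flips are N
% times the single-flip values. The names bernvarDemo, mnSum, vrSum and
% sdSum are illustrative.
bernvarDemo = p*(1-p);          % variance of one Bernoulli(p) flip
mnSum = N*p;                    % means add
vrSum = N*bernvarDemo;          % variances add for independent flips
sdSum = sqrt(vrSum);            % standard deviations do not add
fprintf(1,'mean=%.2f, SD = %.2f var = %.2f\n',mnSum,sdSum,vrSum);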


%% 3 - Bayesian updating: Beta and Bernoulli distribution
% We start with a prior expectation of the probability of our Bernoulli
% random variable (P(heads) or P("e" in a name)).
% We need a probability distribution over values of P(heads), i.e., a
% distribution that ranges over the interval [0,1]. The "beta" distribution
% is the standard distribution used for this. It has two parameters, which
% behave as if you had already collected a series of observations of heads
% and tails: beta(1,1) is a flat, uniform distribution over [0,1], and
% beta(a+1,b+1) acts as if you have already tossed the coin a+b times and
% observed 'a' heads and 'b' tails.
x = linspace(0,1,101);          % calculate the distribution over x = 0,.01,.02,...,1
a = random(Bino);               % pick a random initial number of heads
b = N-a;                        % the corresponding number of tails
beta1 = betapdf(x,a+1,b+1);     % compute the beta "prior" distribution
% Now, plot the prior distribution and compute its maximum (i.e., the
% most likely estimate of the probability of heads before collecting the
% new sample data)
figure;
plot(x,beta1)
hold on
% find the mode
[~,idx] = max(beta1);
fprintf(1,'most likely rate is %.3f\n',x(idx));
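% A cross-check, as a sketch: for alpha, beta > 1 the mode of
% beta(alpha,beta) is (alpha-1)/(alpha+beta-2), which for beta(a+1,b+1)
% reduces to a/(a+b), the proportion of heads in the assumed prior sample.
% The same expression also gives the boundary maximum when a or b is 0.
% modeExact is an illustrative name.
modeExact = a/(a+b);
fprintf(1,'closed-form mode is %.3f\n',modeExact);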

% Next, collect new sample data
a2 = random(Bino);
b2 = N-a2;                      % tails in the new sample

% Next, compute the posterior distribution, combining the new sample
% with the assumed sample that gave rise to the prior distribution

% Method 1: Just update the parameters of the beta distribution to 
% add in the new sample values
% Compute that posterior, plot it in the same figure as the prior and
% compute the x-value that yields its maximum (the "Maximum a Posteriori"
% or MAP estimate)
atotal = a2 + a;
btotal = b2 + b;
beta2 = betapdf(x,XXX,XXX);
ETC
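% A sketch of the conjugate update: a beta(a+1,b+1) prior combined with a2
% heads and N-a2 tails gives a beta(a+a2+1, b+(N-a2)+1) posterior. The new
% tail count is recomputed here from a2 so the sketch stands alone. The
% names headsAll, tailsAll, postConj and idxConj are illustrative.
headsAll = a + a2;                  % heads: prior pseudo-sample + new sample
tailsAll = b + (N - a2);            % tails: prior pseudo-sample + new sample
postConj = betapdf(x,headsAll+1,tailsAll+1);
plot(x,postConj,'--');              % drawn on the prior's axes (hold is on)
[~,idxConj] = max(postConj);
fprintf(1,'MAP estimate (parameter update) is %.3f\n',x(idxConj));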

% Method 2 - use Bayes Rule
% compute P(x|data), which is proportional to P(data|x)P(x), rescale it to
% have the same peak as the prior (for easy visual comparison), and compute
% the value of x for which posterior probability is maximal
prior = beta1;
likelihood = XXX
posterior = XXX
ETC
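% A sketch of the Bayes Rule route on the grid x. The binomial coefficient
% is dropped because it does not depend on x and disappears when the curve
% is rescaled. The names tailsNew, likeGrid, postGrid and idxGrid are
% illustrative.
tailsNew = N - a2;                              % tails in the new sample
likeGrid = x.^a2 .* (1-x).^tailsNew;            % P(data | x) up to a constant
postGrid = likeGrid .* beta1;                   % unnormalized posterior
postGrid = postGrid * (max(beta1)/max(postGrid));   % match the prior's peak
plot(x,postGrid,':');
[~,idxGrid] = max(postGrid);
fprintf(1,'MAP estimate (Bayes Rule) is %.3f\n',x(idxGrid));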

%% 4 - Bayesian updating: Normal distribution
%
% The previous example showed that the "conjugate" prior distribution for
% the Binomial is the Beta. That is, if you have a Beta prior distribution
% over the coin's probability 'p' and combine it with binomial data (a set
% of coin tosses governed by P(heads)=p), the posterior distribution
% remains a Beta
%
% Here, we'll qualitatively check that the conjugate prior for the mean of
% a normal distribution (unknown mean, but known variance sigma^2) is
% itself a normal distribution

% Pick a mean and variance for your prior distribution
priormean = 0;
priorvar = 1;

% Pick a set of parameters from which you will draw your samples (number
% of samples and mean). You might as well use the prior's variance for the
% data too, since we are assuming a common, known variance
N = 20;
datamean = 3;

% Draw a sample
priorsd = sqrt(priorvar);
sample = XXX

% Compute the prior over a fine, discrete set of values that is wide
% enough to encompass all samples and most of the probability mass
x = -5:.01:7;
nsample = length(x);            % number of grid points in x
prior = XXX

% Compute the posterior using Bayes Rule, normalizing both prior and 
% posterior to sum to one over the discrete points
XXX
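% A sketch of the grid computation, assuming the data share the prior's
% variance (sd = priorsd) and a MATLAB version with implicit expansion
% (R2016b or later). The names demoSample, priorGrid, logLike and postNorm
% are illustrative and kept separate from prior/posterior.
demoSample = datamean + priorsd*randn(N,1);     % N draws from Normal(datamean, priorvar)
priorGrid = normpdf(x,priormean,priorsd);
priorGrid = priorGrid/sum(priorGrid);           % normalize to sum to one
% log-likelihood of the whole sample at each candidate mean in x
% (N-by-1 sample minus 1-by-M grid expands to an N-by-M matrix)
logLike = -sum((demoSample - x).^2,1)/(2*priorvar);
% subtracting the maximum before exponentiating avoids underflow
postNorm = priorGrid .* exp(logLike - max(logLike));
postNorm = postNorm/sum(postNorm);              % normalize to sum to one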

% Plot prior and posterior

figure;
plot(x,prior,'-');
hold on;
plot(x,posterior,'--');
legend('prior','posterior');

% Play with this, examining what happens as the sample deviates more and
% more from the mean of the prior, and as the sample size increases
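% A sketch of one such exploration, using the closed-form conjugate result
% instead of the grid: with known data variance s2 and nn observations,
%   posterior variance = 1/(1/priorvar + nn/s2)
%   posterior mean     = posterior variance * (priormean/priorvar + sum(sample)/s2)
% The loop values and the names s2, trialMeans, trialNs, smp, postVar and
% postMean are illustrative assumptions.
s2 = priorvar;                      % assumed common, known variance
trialMeans = [1 3 6];               % data means increasingly far from the prior
trialNs = [5 20 100];               % increasing sample sizes
for m = trialMeans
    for nn = trialNs
        smp = m + sqrt(s2)*randn(nn,1);
        postVar = 1/(1/priorvar + nn/s2);
        postMean = postVar*(priormean/priorvar + sum(smp)/s2);
        fprintf(1,'data mean %g, n = %3d: posterior mean %.3f, SD %.3f\n', ...
            m,nn,postMean,sqrt(postVar));
    end
end
% The posterior SD depends only on the sample size, while the posterior
% mean is pulled from the prior mean toward the sample mean as nn grows.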
