EECE5644 Assignment 0
Question 1 (30%)
Design a classifier that achieves minimum probability of error for a three-class problem where the class priors are respectively p(L = 1) = 0.15, p(L = 2) = 0.35, p(L = 3) = 0.5, and the class-conditional data distributions are all Gaussians for two-dimensional data vectors:
$$
p(x \mid L=1) = \mathcal{N}\!\left( \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & -0.4 \\ -0.4 & 0.5 \end{bmatrix} \right), \quad
p(x \mid L=2) = \mathcal{N}\!\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0.5 & 0 \\ 0 & 0.2 \end{bmatrix} \right), \quad
p(x \mid L=3) = \mathcal{N}\!\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0.1 & 0 \\ 0 & 0.1 \end{bmatrix} \right)
$$
Generate N = 10000 samples according to this data distribution, keeping track of the true class label for each sample. Design your optimal classifier according to the above probabilities and parameterizations. Then apply this classifier to the dataset and obtain decision labels for each sample. Report the following (a minimal implementation sketch follows the list):
• the actual number of samples generated from each class;
• the confusion matrix for your classifier, consisting of the number of samples decided as class r ∈ {1, 2, 3} when their true label was class c ∈ {1, 2, 3}, using r, c as row/column indices;
• the total number of samples misclassified by your classifier;
• an estimate of the probability of error your classifier achieves based on these samples;
• a visualization of the data as a 2-dimensional scatter plot, with true labels and decision labels denoted using two separate visualization cues, such as marker shape and marker color;
• a clear but brief description of the results presented above.
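A minimal Python sketch of this pipeline, assuming NumPy, SciPy, and Matplotlib are available; the seed, variable names, and the particular pairing of visual cues (shape for true class, color for correct/incorrect decision) are illustrative choices, not requirements:

import numpy as np
from scipy.stats import multivariate_normal
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
N = 10000
priors = np.array([0.15, 0.35, 0.50])
means = [np.array([-1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
covs = [np.array([[1.0, -0.4], [-0.4, 0.5]]),
        np.array([[0.5, 0.0], [0.0, 0.2]]),
        np.array([[0.1, 0.0], [0.0, 0.1]])]

# Draw each sample's true label from the prior, then draw the sample
# from the corresponding class-conditional Gaussian.
labels = rng.choice(3, size=N, p=priors)      # 0, 1, 2 stand for classes 1, 2, 3
X = np.vstack([rng.multivariate_normal(means[l], covs[l]) for l in labels])

# Minimum-P(error) (MAP) decisions: maximize prior times class-conditional pdf.
scores = np.column_stack([priors[i] * multivariate_normal.pdf(X, means[i], covs[i])
                          for i in range(3)])
decisions = scores.argmax(axis=1)

# Confusion matrix: entry (r, c) counts samples decided as r with true label c.
conf = np.array([[np.sum((decisions == r) & (labels == c)) for c in range(3)]
                 for r in range(3)])
print("samples per class:", np.bincount(labels, minlength=3))
print("confusion matrix (rows = decision, cols = truth):\n", conf)
n_err = int(np.sum(decisions != labels))
print("misclassified:", n_err, "  estimated P(error):", n_err / N)

# Scatter plot: marker shape = true class, color = correct (green) / wrong (red).
for c, m in zip(range(3), ["o", "s", "^"]):
    ok = (labels == c) & (decisions == c)
    bad = (labels == c) & (decisions != c)
    plt.scatter(X[ok, 0], X[ok, 1], marker=m, c="g", s=8, label=f"class {c+1} correct")
    plt.scatter(X[bad, 0], X[bad, 1], marker=m, c="r", s=8, label=f"class {c+1} wrong")
plt.xlabel("x1"); plt.ylabel("x2"); plt.legend(); plt.show()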
Question 2 (30%)
Consider a scalar real-valued feature x ∈ R that has the following probability distributions under two class labels:
$$
p(x \mid L=1) = \begin{cases} x & \text{if } 0 \le x \le 1 \\ 2 - x & \text{if } 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}
\qquad
p(x \mid L=2) = \begin{cases} 2x - 3 & \text{if } \tfrac{3}{2} \le x \le \tfrac{5}{2} \\ 0 & \text{otherwise} \end{cases}
$$
Minimum Expected Loss Classification
Let the loss values be Loss(Decide i when truth is j) = λij ≥ 0 for i, j ∈ {1, 2}, with the loss of each erroneous decision assigned to be greater than that of the corresponding correct decision. Let the class priors be q1 = p(L = 1) and q2 = p(L = 2), respectively. Express the minimum expected loss decision rule with a discriminant function that is simplified as much as possible. Show your steps.
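For reference, a standard first step (the remaining simplification is left as part of the exercise) compares the conditional risks of the two decisions,

$$
R(\text{decide } 1 \mid x) = \lambda_{11}\, p(L=1 \mid x) + \lambda_{12}\, p(L=2 \mid x), \qquad
R(\text{decide } 2 \mid x) = \lambda_{21}\, p(L=1 \mid x) + \lambda_{22}\, p(L=2 \mid x),
$$

and deciding 1 whenever R(decide 1 | x) < R(decide 2 | x) is equivalent, after applying Bayes' rule, to the likelihood-ratio test

$$
\frac{p(x \mid L=1)}{p(x \mid L=2)} \;\underset{\text{Decide } 2}{\overset{\text{Decide } 1}{\gtrless}}\; \frac{(\lambda_{12} - \lambda_{22})\, q_2}{(\lambda_{21} - \lambda_{11})\, q_1}.
$$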
Maximum a Posteriori Classification
For the case when 0-1 loss (zero-one loss) assignments are used, the minimum expected risk classifier reduces to the maximum a posteriori classification rule. For this case, express the maximum a posteriori classification rule.
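Concretely (a standard fact, stated here as a reference point), with λ11 = λ22 = 0 and λ12 = λ21 = 1 the threshold above becomes q2/q1, so the rule selects the class with the larger posterior:

$$
q_1\, p(x \mid L=1) \;\underset{\text{Decide } 2}{\overset{\text{Decide } 1}{\gtrless}}\; q_2\, p(x \mid L=2).
$$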
Maximum Likelihood Classification
In addition to 0-1 loss assignments, assume that the class priors are equal (uniformly distributed). In this case, minimum expected risk further reduces to maximum likelihood classification. Express the maximum likelihood classification rule for this case.
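With equal priors q1 = q2 = 1/2, the priors cancel and only the class-conditional likelihoods remain:

$$
p(x \mid L=1) \;\underset{\text{Decide } 2}{\overset{\text{Decide } 1}{\gtrless}}\; p(x \mid L=2).
$$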
Question 3 (40%)
Generate N random iid (independent and identically distributed) 2-dimensional samples from two Gaussian pdfs N(μi, Σi) with specified prior class probabilities qi for i ∈ {0, 1}. Set
$$
\mu_0 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \quad \mu_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad
\Sigma_0 = \begin{bmatrix} 16 & 0 \\ 0 & 1 \end{bmatrix}, \quad \Sigma_1 = \begin{bmatrix} 1 & 0 \\ 0 & 16 \end{bmatrix}
$$
Make sure to keep track of the true label for each sample. To produce numerical results, please use N = 10000, q0 = 0.35, and q1 = 0.65. For the following two classification methods, generate classification decisions for different values of their respective thresholds (symbolized as γ). For each classifier, plot the estimated false-positive and true-positive probabilities for each value of the threshold, i.e. plot their ROC curves. Overlay the two curves in the same plot for easy visual comparison.
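A minimal sample-generation sketch in Python, assuming NumPy; the seed and names are illustrative, and the classifier sketches below continue from these variables:

import numpy as np

rng = np.random.default_rng(1)
N = 10000
q0, q1 = 0.35, 0.65
mu = [np.array([-1.0, 0.0]), np.array([1.0, 0.0])]
Sigma = [np.array([[16.0, 0.0], [0.0, 1.0]]),
         np.array([[1.0, 0.0], [0.0, 16.0]])]

# Draw each sample's true label from the prior, then the sample itself
# from the matching Gaussian.
labels = rng.choice(2, size=N, p=[q0, q1])
X = np.vstack([rng.multivariate_normal(mu[l], Sigma[l]) for l in labels])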
Minimum Expected Loss Classifier
Determine and implement the minimum expected loss classifier parameterized by a threshold γ in the following form:
$$
\ln p(x; \mu_1, \Sigma_1) - \ln p(x; \mu_0, \Sigma_0) \;\underset{\text{Decide } 0}{\overset{\text{Decide } 1}{\gtrless}}\; \ln \gamma
$$
where γ > 0 is a scalar that depends on the loss values and prior probabilities. Make this threshold take many values along the positive real axis, and for each threshold value, classify every sample. Empirically estimate the true-positive and false-positive probabilities (by counting samples that fall into each category), and then plot the true-positive versus false-positive performance at each threshold using an ROC curve. Sweep the positive real axis by sampling densely to see the true-positive vs false-positive curves for the classifier in detail. Hint: sweep through the sorted values for every possible classification split, as determined by the difference in class-conditional log-likelihoods per sample. Here class 1 is positive and class 0 is negative.
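One way to realize this sweep in Python, continuing from the generation sketch above (scipy's multivariate_normal supplies the Gaussian log-pdfs):

from scipy.stats import multivariate_normal
import matplotlib.pyplot as plt

# Per-sample log-likelihood-ratio score; thresholding it at ln(gamma) gives the decision.
llr = (multivariate_normal.logpdf(X, mu[1], Sigma[1])
       - multivariate_normal.logpdf(X, mu[0], Sigma[0]))

# Sweep over every achievable split: the sorted scores, padded with +/- infinity
# so the ROC curve runs from (1, 1) down to (0, 0).
taus = np.concatenate(([-np.inf], np.sort(llr), [np.inf]))
tpr = np.array([np.mean(llr[labels == 1] > t) for t in taus])   # P(decide 1 | L = 1)
fpr = np.array([np.mean(llr[labels == 0] > t) for t in taus])   # P(decide 1 | L = 0)

plt.plot(fpr, tpr, label="min expected loss (LLR)")
plt.xlabel("P(false positive)"); plt.ylabel("P(true positive)")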
Fisher’s Linear Discriminant Analysis (LDA)
Implement Fisher's LDA classifier using the true mean vectors and covariance matrices provided above to obtain a classifier, parameterized by a threshold γ, of the following form:
$$
w^{\mathsf{T}} x \;\underset{\text{Decide } 0}{\overset{\text{Decide } 1}{\gtrless}}\; \gamma
$$
where γ ∈ (−∞, ∞). Repeat the exercise in the previous section. In other words, first assign many values to this threshold parameter and estimate the true-positive and false-positive probabilities from data classification label matches and mismatches. Then plot the true-positive versus false-positive performance curve as the threshold changes. Overlay this curve on top of the previous one.
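A sketch of the LDA branch, continuing from the same variables. Note that taking the within-class scatter as Sw = Σ0 + Σ1 built from the true covariances is one common convention, an assumption here rather than the only choice:

# Fisher LDA direction from the true parameters: w maximizes between-class
# separation relative to within-class scatter.
Sw = Sigma[0] + Sigma[1]
w = np.linalg.solve(Sw, mu[1] - mu[0])

proj = X @ w                                  # scalar discriminant per sample
taus = np.concatenate(([-np.inf], np.sort(proj), [np.inf]))
tpr_lda = np.array([np.mean(proj[labels == 1] > t) for t in taus])
fpr_lda = np.array([np.mean(proj[labels == 0] > t) for t in taus])

plt.plot(fpr_lda, tpr_lda, label="Fisher LDA")
plt.legend(); plt.show()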