3.14 Suppose we wish to find a prediction function g(x) that minimizes MSE = E[(y − g(x))^2], where x and y are jointly distributed random variables with density function f(x, y).
(a) Show that MSE is minimized by the choice g(x) = E(y | x). Hint: MSE = E{E[(y − g(x))^2 | x]}.
(b) Apply the above result to the model y = x^2 + z, where x and z are independent zero-mean normal variables with variance one. Show that MSE = 1.
(c) Suppose we restrict our choices for the function g(x) to linear functions of the form
g(x) = a + bx
and determine a and b to minimize MSE. Show that a = 1 and
b = E(xy)/E(x^2) = 0 and MSE = 3. What do you interpret this to mean?
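The values stated in parts (b) and (c) can be checked numerically before attempting the derivations. The following is a minimal Monte Carlo sketch, not part of the original problem: the seed, the sample size n, and all variable names are assumptions chosen for illustration. It simulates y = x^2 + z, evaluates the sample MSE of the predictor g(x) = x^2, and fits the linear predictor by ordinary least squares, which for zero-mean x estimates b = E(xy)/E(x^2).

```python
import random

random.seed(0)   # arbitrary seed, fixed only for reproducibility (assumption)
n = 200_000      # sample size chosen for illustration (assumption)

# Simulate the model of part (b): x and z independent N(0, 1), y = x^2 + z.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
zs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x + z for x, z in zip(xs, zs)]


def mean(values):
    """Sample mean of a list of numbers."""
    return sum(values) / len(values)


# Part (b): predictor g(x) = x^2, so the prediction error is z.
mse_cond = mean([(y - x * x) ** 2 for x, y in zip(xs, ys)])

# Part (c): least-squares fit of g(x) = a + b x.
x_bar, y_bar = mean(xs), mean(ys)
s_xy = mean([(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)])
s_xx = mean([(x - x_bar) ** 2 for x in xs])
b = s_xy / s_xx
a = y_bar - b * x_bar
mse_lin = mean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])

print(f"g(x) = x^2:     MSE ~ {mse_cond:.3f}")
print(f"g(x) = a + b x: a ~ {a:.3f}, b ~ {b:.3f}, MSE ~ {mse_lin:.3f}")
```

Up to sampling error, the first MSE should be close to 1 and the linear fit should give a near 1, b near 0, and MSE near 3. The simulation only corroborates the stated values; it does not replace the arguments the problem asks for.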