Inference
We are often interested in understanding the way that Y is affected as
X1,...,Xp change. In this situation we wish to estimate f, but our goal is
not necessarily to make predictions for Y . We instead want to understand
the relationship between X and Y , or more specifically, to understand how
Y changes as a function of X1,...,Xp. Now ˆf cannot be treated as a black
box, because we need to know its exact form. In this setting, one may be
interested in answering the following questions:
• Which predictors are associated with the response? It is often the case
that only a small fraction of the available predictors are substantially
associated with Y . Identifying the few important predictors among a
large set of possible variables can be extremely useful, depending on
the application.
• What is the relationship between the response and each predictor?
Some predictors may have a positive relationship with Y , in the sense
that increasing the predictor is associated with increasing values of
Y . Other predictors may have the opposite relationship. Depending
on the complexity of f, the relationship between the response and a
given predictor may also depend on the values of the other predictors.
• Can the relationship between Y and each predictor be adequately summarized using a linear equation, or is the relationship more complicated? Historically, most methods for estimating f have taken a linear
form. In some situations, such an assumption is reasonable or even desirable. But often the true relationship is more complicated, in which
case a linear model may not provide an accurate representation of
the relationship between the input and output variables.