19-11-2012, 06:15 PM
Regression Analysis: Basic Concepts
The simple linear model
Suppose we reckon that some variable of interest, y, is ‘driven by’ some other variable x. We then call y the
dependent variable and x the independent variable. In addition, suppose that the relationship between y and x is
basically linear, but is inexact: besides its determination by x, y has a random component, u, which we call the
‘disturbance’ or ‘error’.
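In symbols, the relationship just described can be sketched as below. The subscript $i$ (indexing the observations in a sample of size $n$) and the names $\beta_0$ (intercept) and $\beta_1$ (slope) are assumed here to match the notation used in the rest of this extract.

```latex
y_i = \beta_0 + \beta_1 x_i + u_i, \qquad i = 1, \dots, n
```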
Goodness of fit
The OLS technique ensures that we find the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ which ‘fit the sample data best’, in the specific
sense of minimizing the sum of squared residuals. There is no guarantee, however, that $\hat{\beta}_0$ and $\hat{\beta}_1$ correspond
exactly with the unknown parameters $\beta_0$ and $\beta_1$. Neither, in fact, is there any guarantee that the ‘best fitting’ line
fits the data well: maybe the data do not even approximately lie along a straight line relationship. So how do we
assess the adequacy of the ‘fitted’ equation?
First step: find the residuals. For each x-value in the sample, compute the fitted value or predicted value of y, using
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.
Then subtract each fitted value from the corresponding actual, observed value of $y_i$. Squaring and summing these
differences (the residuals) gives the SSR, as shown in Table 1. In this example, based on a sample of 14 houses, $y_i$ is sale price in
thousands of dollars and $x_i$ is square footage of living area.
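The residual calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the handout's own computation: the four data points are invented and are NOT the 14-house sample from Table 1, and the function names (`ols_fit`, `residuals`, `ssr`) are hypothetical. The estimates use the standard closed-form OLS formulas for the simple linear model, $\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.

```python
# Hypothetical data for illustration only; NOT the 14-house sample in Table 1.
sqft = [1000.0, 1500.0, 2000.0, 2500.0]   # x_i: living area in square feet
price = [150.0, 200.0, 260.0, 300.0]      # y_i: sale price in thousands of dollars


def ols_fit(x, y):
    """Return (b0, b1), the OLS intercept and slope for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Observed y minus fitted value b0 + b1*x, for each observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def ssr(res):
    """Sum of squared residuals."""
    return sum(e * e for e in res)


b0, b1 = ols_fit(sqft, price)
res = residuals(sqft, price, b0, b1)

print(f"intercept = {b0:.3f}, slope = {b1:.4f}")
print("residuals =", [round(e, 3) for e in res])
print(f"SSR = {ssr(res):.3f}")
```

Trying any other intercept or slope in `residuals` should give an SSR at least as large as the one printed, which is the sense in which the OLS line ‘fits the sample data best’.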