11.8.3 Error Estimates for Fits
-------------------------------
With the Hyperbolic flag, ‘H a F’ [‘efit’] performs the same fitting
operation as ‘a F’, but reports the coefficients as error forms instead
of plain numbers. Fitting our two data matrices (first with 13, then
with 14) to a line with ‘H a F’ gives the results,
3. + 2. x
2.6 +/- 0.382970843103 + 2.2 +/- 0.115470053838 x
In the first case the estimated errors are zero because the linear
fit is perfect. In the second case, the errors are nonzero but
moderately small, because the data are still very close to linear.
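The two results above can be checked outside Calc with a short Python sketch. It assumes the two data matrices have x values 1 through 5 and y values 5, 7, 9, 11, 13 (then 14 in place of 13). It uses the usual textbook formulas for an unweighted straight-line fit, with the coefficient errors estimated from the residual scatter. The function name `fit_line` is hypothetical, and this is not Calc's own implementation.

```python
import math

def fit_line(xs, ys):
    """Unweighted least-squares fit of y = a + b*x.

    Returns (a, b, sigma_a, sigma_b). The coefficient errors are estimated
    from the scatter of the residuals, using N - 2 degrees of freedom.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    # Residual variance: sum of squared residuals over (N - M), with M = 2.
    ssr = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = ssr / (n - 2)
    sigma_b = math.sqrt(s2 / sxx)
    sigma_a = math.sqrt(s2 * (1.0 / n + x_mean ** 2 / sxx))
    return a, b, sigma_a, sigma_b

xs = [1, 2, 3, 4, 5]                       # assumed x values
perfect = fit_line(xs, [5, 7, 9, 11, 13])  # exactly linear data
noisy = fit_line(xs, [5, 7, 9, 11, 14])    # last point moved off the line

for a, b, sa, sb in (perfect, noisy):
    print("%.12g +/- %.12g + %.12g +/- %.12g x" % (a, sa, b, sb))
```

Under these assumptions the first fit has zero estimated errors, and the second agrees with the values quoted above (about 2.6 +/- 0.38297 and 2.2 +/- 0.11547).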
It is also possible for the _input_ to a fitting operation to contain
error forms. The data values must either all include errors or all be
plain numbers. Error forms can go anywhere but generally go on the
numbers in the last row of the data matrix. If the last row contains
error forms ‘y_i +/- sigma_i’, then the ‘chi^2’ statistic becomes
chi^2 = sum(((y_i - (a + b x_i)) / sigma_i)^2, i, 1, N)
so that data points with larger error estimates contribute less to the
fitting operation.
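As a rough illustration of the weighting, the following Python sketch evaluates the ‘chi^2’ sum above for a fixed line. The data values and the helper name `chi_squared` are illustrative assumptions, not part of Calc.

```python
def chi_squared(a, b, xs, ys, sigmas):
    """Weighted chi^2 of the data against the line y = a + b*x."""
    return sum(((y - (a + b * x)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))

xs = [1, 2, 3, 4, 5]
ys = [5, 7, 9, 11, 14]   # illustrative data; only the last point is off the line 3 + 2x

# Same candidate line, two different error estimates for the last point.
tight = chi_squared(3.0, 2.0, xs, ys, [1, 1, 1, 1, 1])
loose = chi_squared(3.0, 2.0, xs, ys, [1, 1, 1, 1, 10])
print(tight, loose)
```

Raising the last point's error estimate from 1 to 10 cuts its contribution to ‘chi^2’ by a factor of 100, so the fit has far less incentive to chase that point.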
If there are error forms on other rows of the data matrix, all the
errors for a given data point are combined; the square root of the sum
of the squares of the errors forms the ‘sigma_i’ used for the data
point.
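The combination rule just described amounts to adding the errors in quadrature. A minimal Python sketch, with a hypothetical helper name and made-up error values:

```python
import math

def combined_sigma(errors):
    """Combine the errors attached to one data point in quadrature."""
    return math.sqrt(sum(e * e for e in errors))

# One data point with an error of 0.3 on its x value and 0.4 on its y value.
sigma_i = combined_sigma([0.3, 0.4])
print(sigma_i)
```

For errors of 0.3 and 0.4 the combined ‘sigma_i’ is 0.5, up to floating-point rounding.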
Both ‘a F’ and ‘H a F’ can accept error forms in the input matrix,
although if you are concerned about error analysis you will probably use
‘H a F’ so that the output also contains error estimates.
If the input contains error forms but all the ‘sigma_i’ values are
the same, it is easy to see that the resulting fitted model will be the
same as if the input did not have error forms at all (‘chi^2’ is simply
scaled uniformly by ‘1 / sigma^2’, which doesn’t affect where it has a
minimum). But there _will_ be a difference in the estimated errors of
the coefficients reported by ‘H a F’.
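This effect can be seen in a small Python sketch of a weighted straight-line fit. It uses the standard textbook normal-equation sums rather than anything taken from Calc's source, and the function name and data values are assumptions made for illustration.

```python
import math

def fit_line_weighted(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x with per-point sigmas.

    Returns (a, b, sigma_a, sigma_b), where the coefficient errors come
    directly from the input sigmas (no rescaling by the residuals).
    """
    ws = [1.0 / (s * s) for s in sigmas]
    S = sum(ws)
    Sx = sum(w * x for w, x in zip(ws, xs))
    Sy = sum(w * y for w, y in zip(ws, ys))
    Sxx = sum(w * x * x for w, x in zip(ws, xs))
    Sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = S * Sxx - Sx * Sx
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    return a, b, math.sqrt(Sxx / delta), math.sqrt(S / delta)

xs = [1, 2, 3, 4, 5]
ys = [5, 7, 9, 11, 14]                        # illustrative data

unit = fit_line_weighted(xs, ys, [1.0] * 5)   # every sigma_i = 1
half = fit_line_weighted(xs, ys, [0.5] * 5)   # every sigma_i = 0.5
print(unit)
print(half)
```

Both calls return the same coefficients, but halving every ‘sigma_i’ also halves the reported coefficient errors.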
Consult any text on statistical modeling of data for a discussion of
where these error estimates come from and how they should be
interpreted.
With the Inverse flag, ‘I a F’ [‘xfit’] produces even more
information. The result is a vector of six items:
1. The model formula with error forms for its coefficients or
parameters. This is the result that ‘H a F’ would have produced.
2. A vector of “raw” parameter values for the model. These are the
polynomial coefficients or other parameters as plain numbers, in
the same order as the parameters appeared in the final prompt of
the ‘I a F’ command. For polynomials of degree ‘d’, this vector
will have length ‘M = d+1’ with the constant term first.
3. The covariance matrix ‘C’ computed from the fit. This is an MxM
symmetric matrix; the diagonal elements ‘C_j_j’ are the variances
‘sigma_j^2’ of the parameters. The other elements are covariances
‘sigma_i_j^2’ that describe the correlation between pairs of
parameters. (A related set of numbers, the “linear correlation
coefficients” ‘r_i_j’, is defined as ‘sigma_i_j^2 / (sigma_i
sigma_j)’.)
4. A vector of ‘M’ “parameter filter” functions whose meanings are
described below. If no filters are necessary this will instead be
an empty vector; this is always the case for the polynomial and
multilinear fits described so far.
5. The value of ‘chi^2’ for the fit, calculated by the formulas shown
above. This gives a measure of the quality of the fit;
statisticians consider ‘chi^2’ close to ‘N - M’ to indicate a moderately
good fit (where again ‘N’ is the number of data points and ‘M’ is
the number of parameters).
6. A measure of goodness of fit expressed as a probability ‘Q’. This
is computed from the ‘utpc’ probability distribution function using
‘chi^2’ with ‘N - M’ degrees of freedom. A value of 0.5 implies a
good fit; some texts suggest that values as low as ‘Q = 0.1’ or even
0.001 can still signify an acceptable fit. In particular, ‘chi^2’ statistics
assume the errors in your inputs follow a normal (Gaussian)
distribution; if they don’t, you may have to accept smaller values
of ‘Q’.
The ‘Q’ value is computed only if the input included error
estimates. Otherwise, Calc will report the symbol ‘nan’ for ‘Q’.
The reason is that in this case the ‘chi^2’ value has effectively
been used to estimate the original errors in the input, and thus
there is no redundant information left over to use for a confidence
test.
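Two of the items above lend themselves to a short Python sketch: the linear correlation coefficients derived from the covariance matrix (item 3) and the probability ‘Q’ (item 6). The sketch assumes that ‘utpc’ is the upper-tail probability of the chi-square distribution and that the number of degrees of freedom is a positive integer. The covariance matrix and ‘chi^2’ value are made-up inputs, and both function names are hypothetical.

```python
import math

def correlation_matrix(cov):
    """Linear correlation coefficients r_ij = C_ij / (sigma_i * sigma_j)."""
    m = len(cov)
    sig = [math.sqrt(cov[j][j]) for j in range(m)]
    return [[cov[i][j] / (sig[i] * sig[j]) for j in range(m)]
            for i in range(m)]

def chi2_upper_tail(chi2, dof):
    """Probability that a chi-square variate with `dof` degrees of freedom
    exceeds `chi2`, for integer dof >= 1.

    Starts from the closed form for dof = 1 or dof = 2 and applies the
    recurrence Q(a + 1, t) = Q(a, t) + t^a exp(-t) / Gamma(a + 1).
    """
    t = chi2 / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-t)               # dof = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))    # dof = 1
    while a < dof / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

cov = [[1.1, -0.3],
       [-0.3, 0.1]]          # hypothetical 2x2 covariance matrix (M = 2)
r = correlation_matrix(cov)

n_points, n_params = 5, 2    # N and M for a line fitted to five points
q = chi2_upper_tail(0.4, n_points - n_params)   # hypothetical chi^2 = 0.4
print(r[0][1], q)
```

The diagonal of the correlation matrix is always 1. An off-diagonal value near -1 or 1 indicates that the two parameters are strongly correlated.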