calc: Curve Fitting Details

 
 11.8.5 Curve Fitting Details
 ----------------------------
 
 Calc’s internal least-squares fitter can only handle multilinear models.
 More precisely, it can handle any model of the form ‘a f(x,y,z) + b
 g(x,y,z) + c h(x,y,z)’, where ‘a,b,c’ are the parameters and ‘x,y,z’ are
 the independent variables (of course there can be any number of each,
 not just three).
 
    In a simple multilinear or polynomial fit, it is easy to see how to
 convert the model into this form.  For example, if the model is ‘a + b x
 + c x^2’, then ‘f(x) = 1’, ‘g(x) = x’, and ‘h(x) = x^2’ are suitable
 functions.
 
    For most other models, Calc uses a variety of algebraic manipulations
 to try to put the problem into the form
 
      Y(x,y,z) = A(a,b,c) F(x,y,z) + B(a,b,c) G(x,y,z) + C(a,b,c) H(x,y,z)
 
 where ‘Y,A,B,C,F,G,H’ are arbitrary functions.  It computes ‘Y’, ‘F’,
 ‘G’, and ‘H’ for all the data points, does a standard linear fit to find
 the values of ‘A’, ‘B’, and ‘C’, then uses the equation solver to solve
 for ‘a,b,c’ in terms of ‘A,B,C’.
 
    A remarkable number of models can be cast into this general form.
 We’ll look at two examples here to see how it works.  The power-law
 model ‘y = a x^b’ with two independent variables and two parameters can
 be rewritten as follows:
 
      y = a x^b
      y = a exp(b ln(x))
      y = exp(ln(a) + b ln(x))
      ln(y) = ln(a) + b ln(x)
 
 which matches the desired form with ‘Y = ln(y)’, ‘A = ln(a)’, ‘F = 1’,
 ‘B = b’, and ‘G = ln(x)’.  Calc thus computes the logarithms of your ‘y’
 and ‘x’ values, does a linear fit for ‘A’ and ‘B’, then solves to get ‘a
 = exp(A)’ and ‘b = B’.
 
    Another interesting example is the “quadratic” model, which can be
 handled by expanding according to the distributive law.
 
      y = a + b*(x - c)^2
      y = a + b c^2 - 2 b c x + b x^2
 
 which matches with ‘Y = y’, ‘A = a + b c^2’, ‘F = 1’, ‘B = -2 b c’, ‘G =
 x’ (the -2 factor could just as easily have been put into ‘G’ instead of
 ‘B’), ‘C = b’, and ‘H = x^2’.
 
    The Gaussian model looks quite complicated, but a closer examination
 shows that it’s actually similar to the quadratic model but with an
 exponential that can be brought to the top and moved into ‘Y’.
 
    The logistic models cannot be put into general linear form.  For
 these models, and the Hubbert linearization, Calc computes a rough
 approximation for the parameters, then uses the Levenberg-Marquardt
 iterative method to refine the approximations.
 
    Another model that cannot be put into general linear form is a
 Gaussian with a constant background added on, i.e., ‘d’ + the regular
 Gaussian formula.  If you have a model like this, your best bet is to
 replace enough of your parameters with constants to make the model
 linearizable, then adjust the constants manually by doing a series of
 fits.  You can compare the fits by graphing them, by examining the
 goodness-of-fit measures returned by ‘I a F’, or by some other method
 suitable to your application.  Note that some models can be linearized
 in several ways.  The Gaussian-plus-D model can be linearized by setting
 ‘d’ (the background) to a constant, or by setting ‘b’ (the standard
 deviation) and ‘c’ (the mean) to constants.
 
    To fit a model with constants substituted for some parameters, just
 store suitable values in those parameter variables, then omit them from
 the list of parameters when you answer the variables prompt.
 
    A last desperate step would be to use the general-purpose ‘minimize’
 function rather than ‘fit’.  After all, both functions solve the problem
 of minimizing an expression (the ‘chi^2’ sum) by adjusting certain
 parameters in the expression.  The ‘a F’ command is able to use a vastly
 more efficient algorithm due to its special knowledge about linear
 chi-square sums, but the ‘a N’ command can do the same thing by brute
 force.
 
    A compromise would be to pick out a few parameters without which the
 fit is linearizable, and use ‘minimize’ on a call to ‘fit’ which
 efficiently takes care of the rest of the parameters.  The thing to be
 minimized would be the value of ‘chi^2’ returned as the fifth result of
 the ‘xfit’ function:
 
      minimize(xfit(gaus(a,b,c,d,x), x, [a,b,c], data)_5, d, guess)
 
 where ‘gaus’ represents the Gaussian model with background, ‘data’
 represents the data matrix, and ‘guess’ represents the initial guess for
 ‘d’ that ‘minimize’ requires.  This operation will only be, shall we
 say, extraordinarily slow rather than astronomically slow (as would be
 the case if ‘minimize’ were used by itself to solve the problem).
 
    The ‘I a F’ [‘xfit’] command is somewhat trickier when nonlinear
 models are used.  The second item in the result is the vector of “raw”
 parameters ‘A’, ‘B’, ‘C’.  The covariance matrix is written in terms of
 those raw parameters.  The fifth item is a vector of “filter”
 expressions.  This is the empty vector ‘[]’ if the raw parameters were
 the same as the requested parameters, i.e., if ‘A = a’, ‘B = b’, and so
 on (which is always true if the model is already linear in the
 parameters as written, e.g., for polynomial fits).  If the parameters
 had to be rearranged, the fifth item is instead a vector of one formula
 per parameter in the original model.  The raw parameters are expressed
 in these “filter” formulas as ‘fitdummy(1)’ for ‘A’, ‘fitdummy(2)’ for
 ‘B’, and so on.
 
    When Calc needs to modify the model to return the result, it replaces
 ‘fitdummy(1)’ in all the filters with the first item in the raw
 parameters list, and so on for the other raw parameters, then evaluates
 the resulting filter formulas to get the actual parameter values to be
 substituted into the original model.  In the case of ‘H a F’ and ‘I a F’
 where the parameters must be error forms, Calc uses the square roots of
 the diagonal entries of the covariance matrix as error values for the
 raw parameters, then lets Calc’s standard error-form arithmetic take it
 from there.
 
    If you use ‘I a F’ with a nonlinear model, be sure to remember that
 the covariance matrix is in terms of the raw parameters, _not_ the
 actual requested parameters.  It’s up to you to figure out how to
 interpret the covariances in the presence of nontrivial filter
 functions.
 
    Things are also complicated when the input contains error forms.
 Suppose there are three independent and dependent variables, ‘x’, ‘y’,
 and ‘z’, one or more of which are error forms in the data.  Calc
 combines all the error values by taking the square root of the sum of
 the squares of the errors.  It then changes ‘x’ and ‘y’ to be plain
 numbers, and makes ‘z’ into an error form with this combined error.  The
 ‘Y(x,y,z)’ part of the linearized model is evaluated, and the result
 should be an error form.  The error part of that result is used for
 ‘sigma_i’ for the data point.  If for some reason ‘Y(x,y,z)’ does not
 return an error form, the combined error from ‘z’ is used directly for
 ‘sigma_i’.  Finally, ‘z’ is also stripped of its error for use in
 computing ‘F(x,y,z)’, ‘G(x,y,z)’ and so on; the righthand side of the
 linearized model is computed in regular arithmetic with no error forms.
 
    (While these rules may seem complicated, they are designed to do the
 most reasonable thing in the typical case that ‘Y(x,y,z)’ depends only
 on the dependent variable ‘z’, and in fact is often simply equal to ‘z’.
 For common cases like polynomials and multilinear models, the combined
 error is simply used as the ‘sigma’ for the data point with no further
 ado.)
 
    It may be the case that the model you wish to use is linearizable,
 but Calc’s built-in rules are unable to figure it out.  Calc uses its
 algebraic rewrite mechanism to linearize a model.  The rewrite rules are
 kept in the variable ‘FitRules’.  You can edit this variable using the
 ‘s e FitRules’ command; in fact, there is a special ‘s F’ command just
 for editing ‘FitRules’.  SeeOperations on Variables.
 
    SeeRewrite Rules, for a discussion of rewrite rules.
 
    Calc uses ‘FitRules’ as follows.  First, it converts the model to an
 equation if necessary and encloses the model equation in a call to the
 function ‘fitmodel’ (which is not actually a defined function in Calc;
 it is only used as a placeholder by the rewrite rules).  Parameter
 variables are renamed to function calls ‘fitparam(1)’, ‘fitparam(2)’,
 and so on, and independent variables are renamed to ‘fitvar(1)’,
 ‘fitvar(2)’, etc.  The dependent variable is the highest-numbered
 ‘fitvar’.  For example, the power law model ‘a x^b’ is converted to ‘y =
 a x^b’, then to
 
      fitmodel(fitvar(2) = fitparam(1) fitvar(1)^fitparam(2))
 
    Calc then applies the rewrites as if by ‘C-u 0 a r FitRules’.  (The
 zero prefix means that rewriting should continue until no further
 changes are possible.)
 
    When rewriting is complete, the ‘fitmodel’ call should have been
 replaced by a ‘fitsystem’ call that looks like this:
 
      fitsystem(Y, FGH, ABC)
 
 where Y is a formula that describes the function ‘Y(x,y,z)’, FGH is the
 vector of formulas ‘[F(x,y,z), G(x,y,z), H(x,y,z)]’, and ABC is the
 vector of parameter filters which refer to the raw parameters as
 ‘fitdummy(1)’ for ‘A’, ‘fitdummy(2)’ for ‘B’, etc.  While the number of
 raw parameters (the length of the FGH vector) is usually the same as the
 number of original parameters (the length of the ABC vector), this is
 not required.
 
    The power law model eventually boils down to
 
      fitsystem(ln(fitvar(2)),
                [1, ln(fitvar(1))],
                [exp(fitdummy(1)), fitdummy(2)])
 
    The actual implementation of ‘FitRules’ is complicated; it proceeds
 in four phases.  First, common rearrangements are done to try to bring
 linear terms together and to isolate functions like ‘exp’ and ‘ln’
 either all the way “out” (so that they can be put into Y) or all the way
 “in” (so that they can be put into ABC or FGH).  In particular, all
 non-constant powers are converted to logs-and-exponentials form, and the
 distributive law is used to expand products of sums.  Quotients are
 rewritten to use the ‘fitinv’ function, where ‘fitinv(x)’ represents
 ‘1/x’ while the ‘FitRules’ are operating.  (The use of ‘fitinv’ makes
 recognition of linear-looking forms easier.)  If you modify ‘FitRules’,
 you will probably only need to modify the rules for this phase.
 
    Phase two, whose rules can actually also apply during phases one and
 three, first rewrites ‘fitmodel’ to a two-argument form ‘fitmodel(Y,
 MODEL)’, where Y is initially zero and MODEL has been changed from ‘a=b’
 to ‘a-b’ form.  It then tries to peel off invertible functions from the
 outside of MODEL and put them into Y instead, calling the equation
 solver to invert the functions.  Finally, when this is no longer
 possible, the ‘fitmodel’ is changed to a four-argument ‘fitsystem’,
 where the fourth argument is MODEL and the FGH and ABC vectors are
 initially empty.  (The last vector is really ABC, corresponding to raw
 parameters, for now.)
 
    Phase three converts a sum of items in the MODEL to a sum of
 ‘fitpart(A, B, C)’ terms which represent terms ‘A*B*C’ of the sum, where
 A is all factors that do not involve any variables, B is all factors
 that involve only parameters, and C is the factors that involve only
 independent variables.  (If this decomposition is not possible, the rule
 set will not complete and Calc will complain that the model is too
 complex.)  Then ‘fitpart’s with equal B or C components are merged back
 together using the distributive law in order to minimize the number of
 raw parameters needed.
 
    Phase four moves the ‘fitpart’ terms into the FGH and ABC vectors.
 Also, some of the algebraic expansions that were done in phase 1 are
 undone now to make the formulas more computationally efficient.
 Finally, it calls the solver one more time to convert the ABC vector to
 an ABC vector, and removes the fourth MODEL argument (which by now will
 be zero) to obtain the three-argument ‘fitsystem’ that the linear
 least-squares solver wants to see.
 
    Two functions which are useful in connection with ‘FitRules’ are
 ‘hasfitparams(x)’ and ‘hasfitvars(x)’, which check whether ‘x’ refers to
 any parameters or independent variables, respectively.  Specifically,
 these functions return “true” if the argument contains any ‘fitparam’
 (or ‘fitvar’) function calls, and “false” otherwise.  (Recall that
 “true” means a nonzero number, and “false” means zero.  The actual
 nonzero number returned is the largest N from all the ‘fitparam(N)’s or
 ‘fitvar(N)’s, respectively, that appear in the formula.)
 
    The ‘fit’ function in algebraic notation normally takes four
 arguments, ‘fit(MODEL, VARS, PARAMS, DATA)’, where MODEL is the model
 formula as it would be typed after ‘a F '’, VARS is the independent
 variable or a vector of independent variables, PARAMS likewise gives the
 parameter(s), and DATA is the data matrix.  Note that the length of VARS
 must be equal to the number of rows in DATA if MODEL is an equation, or
 one less than the number of rows if MODEL is a plain formula.
 (Actually, a name for the dependent variable is allowed but will be
 ignored in the plain-formula case.)
 
    If PARAMS is omitted, the parameters are all variables in MODEL
 except those that appear in VARS.  If VARS is also omitted, Calc sorts
 all the variables that appear in MODEL alphabetically and uses the
 higher ones for VARS and the lower ones for PARAMS.
 
    Alternatively, ‘fit(MODELVEC, DATA)’ is allowed where MODELVEC is a
 2- or 3-vector describing the model and variables, as discussed
 previously.
 
    If Calc is unable to do the fit, the ‘fit’ function is left in
 symbolic form, ordinarily with an explanatory message.  The message will
 be “Model expression is too complex” if the linearizer was unable to put
 the model into the required form.
 
    The ‘efit’ (corresponding to ‘H a F’) and ‘xfit’ (for ‘I a F’)
 functions are completely analogous.