Vernier Tech Info Library TIL #1675

Question

How do LabQuest curve fits work?

Answer

LabQuest (version 1.0.0) has five curve fit forms. Their coefficients are computed in one of two ways.

*** Linear form ***
Linear: y = mx + b
The m and b coefficients of the Linear form are computed using the linear fit algorithm taken from "An Introduction to Error Analysis" (pp. 156-157). It is an implementation of the least squares approach. The root mean square error (RMSE), the uncertainty of each coefficient, and the correlation are also computed. The correlation calculation is taken from the same book (p. 180).
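
As a rough illustration of the linear case, the following Python sketch computes a least squares slope and intercept together with the RMSE, coefficient uncertainties, and correlation. It is not the LabQuest source code. The function name linear_fit, the returned dictionary keys, and the choice of N - 2 degrees of freedom for the RMSE are assumptions made for this example.

```python
import math

def linear_fit(xs, ys):
    """Least squares fit of y = m*x + b (hypothetical helper, not LabQuest code)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sum_xx - sum_x * sum_x  # zero only if all x values are equal

    m = (n * sum_xy - sum_x * sum_y) / delta
    b = (sum_xx * sum_y - sum_x * sum_xy) / delta

    # Assumption: RMSE is normalized by N - 2 (two fitted coefficients).
    residual_ss = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    rmse = math.sqrt(residual_ss / (n - 2))
    sigma_m = rmse * math.sqrt(n / delta)
    sigma_b = rmse * math.sqrt(sum_xx / delta)

    # Linear correlation coefficient.
    mean_x = sum_x / n
    mean_y = sum_y / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / math.sqrt(var_x * var_y)

    return {"m": m, "b": b, "rmse": rmse,
            "sigma_m": sigma_m, "sigma_b": sigma_b, "r": r}

fit = linear_fit([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(fit["m"], fit["b"], fit["r"])
```

The sketch needs at least three points with distinct x values and y values that are not all equal. Otherwise one of the divisions is undefined.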

*** Non-linear forms ***
Proportional: y = Ax
Quadratic: y = Ax^2+Bx+C
Power: y = Ax^B
Natural Exponent: y = Aexp(-Cx)+B

The coefficients of the non-linear forms are computed using the iterative best-fit approach of the Levenberg-Marquardt method taken from "Numerical Recipes in C" (pp. 681-688). The root mean square error (RMSE) and the uncertainty of each coefficient are also computed.
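
To give a feel for the iterative approach, here is a minimal Levenberg-Marquardt sketch in Python. It is not the Numerical Recipes routine and not the LabQuest implementation. To keep the linear algebra to a 2x2 solve, it assumes a reduced model y = a * exp(-c * x) with the offset B fixed at 0. The function name fit_exponential_lm, the damping schedule (multiply or divide by 10), and the RMSE normalization by N are all assumptions made for this example.

```python
import math

def fit_exponential_lm(xs, ys, a, c, max_iterations=200):
    """Fit y = a * exp(-c * x) with a basic Levenberg-Marquardt loop."""

    def sum_squared_error(a, c):
        return sum((y - a * math.exp(-c * x)) ** 2 for x, y in zip(xs, ys))

    damping = 1e-3
    error = sum_squared_error(a, c)
    for _ in range(max_iterations):
        # Accumulate J^T J (h11, h12, h22) and J^T r (g1, g2).
        h11 = h12 = h22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(-c * x)
            residual = y - a * e
            d_a = e            # partial derivative of the model w.r.t. a
            d_c = -a * x * e   # partial derivative of the model w.r.t. c
            h11 += d_a * d_a
            h12 += d_a * d_c
            h22 += d_c * d_c
            g1 += d_a * residual
            g2 += d_c * residual

        # Damp the diagonal, then solve the 2x2 system for the step.
        d11 = h11 * (1.0 + damping)
        d22 = h22 * (1.0 + damping)
        det = d11 * d22 - h12 * h12
        if det == 0.0:
            break
        step_a = (g1 * d22 - h12 * g2) / det
        step_c = (d11 * g2 - h12 * g1) / det

        trial_error = sum_squared_error(a + step_a, c + step_c)
        if trial_error < error:
            # Accept the step and trust the local model more.
            a, c, error = a + step_a, c + step_c, trial_error
            damping = max(damping / 10.0, 1e-12)
        else:
            # Reject the step and move toward gradient descent.
            damping *= 10.0
            if damping > 1e12:
                break

    rmse = math.sqrt(error / len(xs))  # assumption: normalized by N
    return a, c, rmse

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.5 * x) for x in xs]
print(fit_exponential_lm(xs, ys, a=1.5, c=0.8))
```

The result depends on the starting values of a and c, which is why the starting coefficients matter for the Natural Exponent form.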

*** Natural Exponent starting coefficient adjustment ***
The starting coefficients for the Natural Exponent are adjusted prior to finding the best-fit coefficients. The adjustment makes a few assumptions and draws some coefficients at random from a range. Therefore, repeated fit attempts on the same data can produce different best-fit coefficients.

Form: y = A * exp (-Cx) + B
Assumption: The x column contains ascending values.

We divide the data sets into two cases. The first is when all the data lies in the upper right quadrant of the x-y plane. The second covers all remaining data sets.


*** Data sets that exist entirely in the upper right quadrant of the x-y plane ***
When the data exists only in the upper right quadrant of the x-y plane, we follow an adjustment algorithm that invokes the linear least squares fit solver on the logarithm of all the values in the original data set. Once we have run the linear fit on the log of the original data, we exponentiate the results of that fit to arrive at our initial guesses for use in the Levenberg-Marquardt nonlinear solver.

The justification for this approach is that in the logarithmic basis, the exponent coefficients that we hope to find become the linear coefficients in the slope intercept form (y = mx + b), as can be seen below.

(set b = 0, it can be found later)
y = a * exp (-cx) + b
y = a * exp (-cx)

(take the natural log of both sides of the equation)
ln (y) = ln (a * exp (-cx))

(algebraic simplification to arrive at a slope intercept form)
ln (y) = ln (a) + ln (exp (-cx))
ln (y) = ln (a) - cx

So in the above equation, we assume that the values in the slope-intercept form are as follows (here b is the y-intercept of the line, not the offset b of the exponential form):

b (y-intercept) = ln (a)
m (slope) = -c

The inputs to the linear solver are a data set consisting of ln (y) values for the y values, and ln (x) for the x values. Once the linear solver finds a least squares fit for this data set, we revert these values back to the exponentiated basis and have our initial guesses for use in the Levenberg-Marquardt nonlinear solver.

a = exp (ln (a)) = exp (b), where b is the y-intercept of the linear fit
c = -m
The offset b of the exponential form can be found at x = 0 (or at the data point whose x value is closest to 0), since exp (0) = 1.

y (x=0) = a + b
b = y(x = 0) - a
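
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the LabQuest code. The function name initial_guess_from_log_fit is hypothetical, and the sketch follows the derivation by regressing ln (y) on the untransformed x values. It does not reproduce any transformation of the x column.

```python
import math

def initial_guess_from_log_fit(xs, ys):
    """Starting values (a, b, c) for y = a*exp(-c*x) + b; requires all y > 0."""
    n = len(xs)
    log_ys = [math.log(y) for y in ys]

    # Ordinary least squares line through (x, ln y).
    sum_x = sum(xs)
    sum_ly = sum(log_ys)
    sum_xx = sum(x * x for x in xs)
    sum_xly = sum(x * ly for x, ly in zip(xs, log_ys))
    delta = n * sum_xx - sum_x * sum_x
    slope = (n * sum_xly - sum_x * sum_ly) / delta
    intercept = (sum_xx * sum_ly - sum_x * sum_xly) / delta

    # Revert to the exponentiated basis.
    a = math.exp(intercept)
    c = -slope

    # Offset from the data point whose x value is closest to 0.
    nearest = min(range(n), key=lambda i: abs(xs[i]))
    b = ys[nearest] - a
    return a, b, c

xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * math.exp(-0.7 * x) for x in xs]
print(initial_guess_from_log_fit(xs, ys))
```

These values are only starting points. The nonlinear solver refines them afterwards.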



*** Datasets that include data outside of the upper right quadrant of the x-y plane ***
For datasets that include data outside of the upper right quadrant of the x-y plane, we follow a heuristic approach:

Let:
x = the list of x-column values (where x[0] is the first value and x[n] the last)
y = the list of y-column values
x_range = | x[n] - x[0] | (absolute range)

If x_range > 10, then:
x_range = 1/x_range

The C coefficient is very roughly inversely proportional to the x_range. The goal is to start with a reasonable value in the exponent, based on the x_range value.
C = random value between 0 and +/- x_range. The sign is based on another assumption: if x[n] > x[0], C is assumed positive; otherwise it is assumed negative. This coefficient is the most critical for giving the curve fit algorithm a good starting point from which to reach a best fit.

If the C coefficient is assumed to be positive, the exponent -Cx is negative for positive x. So as x -> infinity, exp (-Cx) -> 0 (zero), and therefore y -> B.
B = y[n] (for positive C)
B = y[0] (for negative C)

An assumption about the sign of the A coefficient is difficult to make. However, we can give it a range. This makes a further assumption that there is indeed an x = 0 value, which provides a y-intercept. At x = 0: y = A+B
A = random value between (y[n]-y[0]) and (y[0]+y[n])
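
The heuristic above can be sketched in Python as follows. This is an illustration, not the LabQuest code. The function name heuristic_initial_guess and the rng parameter are hypothetical, and a uniform distribution is assumed for the random draws because the distribution is not stated.

```python
import random

def heuristic_initial_guess(xs, ys, rng=random):
    """Starting values (a, b, c) for y = a*exp(-c*x) + b via the ranged-random heuristic."""
    x_range = abs(xs[-1] - xs[0])
    if x_range > 10:
        x_range = 1 / x_range

    # C: random magnitude up to x_range, with the sign taken from the x ordering.
    positive_c = xs[-1] > xs[0]
    c_magnitude = rng.uniform(0, x_range)
    c = c_magnitude if positive_c else -c_magnitude

    # B: the value the curve settles to as the exponential term vanishes.
    b = ys[-1] if positive_c else ys[0]

    # A: random value between (y[n] - y[0]) and (y[0] + y[n]).
    low, high = sorted((ys[-1] - ys[0], ys[0] + ys[-1]))
    a = rng.uniform(low, high)
    return a, b, c

xs = [-2.0, 3.0, 8.0, 13.0, 18.0]
ys = [9.0, 4.0, 2.5, 2.1, 2.0]
print(heuristic_initial_guess(xs, ys, rng=random.Random(0)))
```

Passing a seeded random.Random instance makes the draws repeatable. With the default module-level generator, each call can return different starting values, which matches the note that repeated fit attempts can give different results.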

Created by: ihonohan on August 29 2007
Last updated by: ihonohan on September 15 2008