Monday, July 8, 2013

Accelerometer and Magnetometer Calibration: Part 2 - Least-Square Solving Methods

Linear versus Nonlinear

When trying to solve an over-determined system (more data points than unknowns), a popular approach is the Least-Squares method. It works by finding the model fit that minimizes the mean square error over all measurements. What is important to note is that this is for linear problems and usually performs badly when the problem is nonlinear; a problem is nonlinear when parameters are multiplied or divided by each other, or when the model is not a polynomial in the parameters. Looking directly at the model equation,
\begin{equation*}\left (\frac{x - b_x}{g_x} \right )^2 + \left (\frac{y - b_y}{g_y} \right )^2 + \left (\frac{z - b_z}{g_z} \right )^2 = 1, \end{equation*}
it is difficult to tell whether the system is linear or nonlinear. But if we expand the squares, it becomes much easier to see:
\begin{equation*} \frac{x^2}{g_x^2} + \frac{y^2}{g_y^2} + \frac{z^2}{g_z^2} - \frac{2b_xx}{g_x^2} - \frac{2b_yy}{g_y^2} - \frac{2b_zz}{g_z^2} + \frac{b_x^2}{g_x^2} + \frac{b_y^2}{g_y^2} + \frac{b_z^2}{g_z^2} = 1 \end{equation*}
Now we can see that in six different places we have a parameter divided by a parameter, hence this is a nonlinear case. If we did not want the gain estimates it would be a simple linear case, but since we want to know all the errors in the sensors we will need to treat it as nonlinear.
    There is a method for doing least-squares on ellipses called "Direct Least Square Fitting of Ellipses", but I will not go into how it works here.

Gauss-Newton Linearization Method

The method of choice for solving this nonlinear problem is the Gauss-Newton linearization method. It works like Least-Squares but uses the derivative of the function with respect to the parameters to create an iterative successive-approximation routine.
   The first thing you do when using this method is to create a column vector, usually denoted $\boldsymbol{\theta}$, that contains the parameters we want to know:
\begin{equation*}\boldsymbol{\theta} = \begin{bmatrix}b_x & b_y & b_z & g_x & g_y & g_z \end{bmatrix}^T \end{equation*}
On top of this we need the function we want to estimate, denoted $\hat{y}(\boldsymbol{\theta})$, where $\mathbf{m} = \begin{bmatrix} x & y & z \end{bmatrix}$ is the measurement vector. Note that the "= 1" from the original equation has disappeared; this is because 1 is the value we want the function to take, while $\hat{y}(\boldsymbol{\theta})$ is what we actually calculate from the estimated parameters.
\begin{equation*}\hat{y}(\boldsymbol{\theta}) = \left (\frac{x - b_x}{g_x} \right )^2 + \left (\frac{y - b_y}{g_y} \right )^2 + \left (\frac{z - b_z}{g_z} \right )^2 \end{equation*}
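To make this concrete, here is a minimal Python/NumPy sketch of $\hat{y}(\boldsymbol{\theta})$ evaluated over a whole batch of measurements. The function name and the $(m \times 3)$ array layout are my own choices for illustration, not something from the derivation:

```python
import numpy as np

def y_hat(theta, meas):
    """Evaluate the ellipsoid model for every measurement row [x, y, z].

    theta = [b_x, b_y, b_z, g_x, g_y, g_z]; meas is an (m, 3) array.
    Returns an (m,) vector that should approach 1 as theta improves.
    """
    b, g = theta[:3], theta[3:]
    return np.sum(((meas - b) / g) ** 2, axis=1)
```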
As stated before, this method uses the derivative of the function with respect to the parameters. In effect it works a little like the gradient descent method: you need the gradient of the function with respect to the parameters:
\begin{equation*}
\mathbf{H}(\boldsymbol{\theta}) = \frac{\partial \hat{y}(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = -2 \begin{bmatrix}
\frac{x - b_x}{g_x^2} \\
\frac{y - b_y}{g_y^2} \\
\frac{z - b_z}{g_z^2} \\
\frac{(x - b_x)^2}{g_x^3} \\
\frac{(y - b_y)^2}{g_y^3} \\
\frac{(z - b_z)^2}{g_z^3}
\end{bmatrix}^T
\end{equation*}
The gradient will be an $m \times 6$ matrix, where $m$ is the number of measurement points: the gradient is calculated for each measurement set, so you end up with as many derivative rows as measurements. This makes it possible to see whether we are getting closer to the real solution or diverging from it.
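Building on the sketch above, the gradient rows can be stacked into the full $\mathbf{H}(\boldsymbol{\theta})$ matrix like this (again just an illustrative sketch, with the columns ordered as in $\boldsymbol{\theta}$):

```python
def jacobian(theta, meas):
    """Stack the gradient of y_hat for every measurement into an (m, 6) matrix H.

    Columns follow theta = [b_x, b_y, b_z, g_x, g_y, g_z].
    """
    b, g = theta[:3], theta[3:]
    d = meas - b                         # (m, 3): x - b_x, y - b_y, z - b_z
    H = np.empty((meas.shape[0], 6))
    H[:, :3] = -2.0 * d / g**2           # partial derivatives w.r.t. the biases
    H[:, 3:] = -2.0 * d**2 / g**3        # partial derivatives w.r.t. the gains
    return H
```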
   Now we have the prerequisites for the Gauss-Newton linearization method! The iterative solution is calculated with the following equation (I will not go through how it is derived here):
\begin{equation} \boldsymbol{\theta}_{k+1} = \boldsymbol{\theta}_k + \left ( \mathbf{H}^T (\boldsymbol{\theta}_k) \mathbf{H}(\boldsymbol{\theta}_k) \right )^{-1} \mathbf{H}^T (\boldsymbol{\theta}_k) \left ( \mathbf{y} - \mathbf{\hat{y}}(\boldsymbol{\theta}_k) \right ) \end{equation} Here $\mathbf{y}$ is what we want our function to be, which in our case is $\mathbf{y} = \mathbf{1}$ (a vector of ones), and note that $\mathbf{\hat{y}}$ is also a vector, not a scalar. Now we just follow these steps (a code sketch of the complete loop follows the list):
  1. Set the iteration number $k = 0$.
  2. Make an initial guess, $\boldsymbol{\theta}_0$, of the parameter vector.
  3. Compute the gradient $\mathbf{H}(\boldsymbol{\theta}_k)$ and the modeled response $\hat{y}(\boldsymbol{\theta}_k)$.
  4. Update the parameter estimates according to Eq. (1).
  5. Check the stop criterion $\left \| \boldsymbol{\theta}_{k+1} - \boldsymbol{\theta}_k \right \| < \varepsilon$, where $\varepsilon$ is a small number; I chose $\varepsilon = 10^{-7}$.
  6. If the stop criterion is met, stop the algorithm; otherwise increase the iteration counter and start again from step 3.
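Putting the pieces together, a minimal sketch of the whole iteration could look like this, using the `y_hat` and `jacobian` helpers from above. Solving the normal equations instead of forming the matrix inverse explicitly is my own numerical choice, not something prescribed by Eq. (1):

```python
def gauss_newton(meas, theta0, eps=1e-7, max_iter=100):
    """Iterate Eq. (1) until the parameter update is smaller than eps."""
    theta = np.asarray(theta0, dtype=float)
    y = np.ones(meas.shape[0])                # we want every y_hat to equal 1
    for _ in range(max_iter):
        H = jacobian(theta, meas)
        r = y - y_hat(theta, meas)
        # Solve (H^T H) step = H^T r rather than inverting H^T H
        step = np.linalg.solve(H.T @ H, H.T @ r)
        theta = theta + step
        if np.linalg.norm(step) < eps:        # stop criterion from step 5
            break
    return theta
```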

How many measurement points are needed?

One of the big questions is how many data points are needed for this to work. The answer requires some testing, but I have found that the minimum is around 5 measurements per axis and direction (30 measurements in total). To get the best data set it is recommended to do the so-called "six-point measurement", which means that the system is put in 6 known positions corresponding to the faces of a cube. With that said, I use 50 measurements per face (300 in total), and with that I get within 0.2 % of the real value in simulations. Which value you choose is up to you.
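As a hypothetical smoke test of this, one can simulate a six-point measurement with assumed bias and gain errors and check how well the solver recovers them. All the numbers below are made up for illustration:

```python
rng = np.random.default_rng(1)
true_theta = np.array([0.1, -0.05, 0.2, 1.1, 0.9, 1.05])  # assumed bias/gain errors
faces = np.vstack([np.eye(3), -np.eye(3)])                 # the six cube faces
ideal = np.repeat(faces, 50, axis=0)                       # 50 per face = 300 points
meas = ideal * true_theta[3:] + true_theta[:3]             # apply gain and bias
meas += rng.normal(scale=0.005, size=meas.shape)           # add sensor noise
theta0 = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])          # ideal-sensor initial guess
print(gauss_newton(meas, theta0))                          # should be close to true_theta
```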
   How to do this with the magnetometer is a bit trickier. The same method can be used, and it works, but it is suboptimal. I have tried holding the system facing North and slowly rotating it around the East-West axis, and after that around the North-South axis. This seems to work as well, but some more testing is required. However, even with the six-point measurement the magnetometer calibration is within 0.5 %, which is more than good enough!
