Monday, July 8, 2013

Accelerometer and Magnetometer Calibration: Part 3 - Implementation in Matlab

Matlab Implementation

In this part I will be providing the source code for an implementation of the Gauss-Newton solver in Matlab. As you will see, it is a quite simple algorithm that easily can be turned into C(++)/C#/Java without much hassle. However I will not (for starters) provide this algorithm in any other language because I don't have the time, but in the future I plan on having a C, C++ and C# version that uses matrix simplifications to minimize memory consumption for simple integration into any project.

Structure
The structure is composed of three main scripts (the solver) and two scripts as an example. The code for all scripts are at the end of this post, I only explain their function here:
  • nonlin_solve.m - The main nonlinear solver that implements the Gauss-Newton solver.
  • yHAT.m - Creates $\hat{y}(\boldsymbol{\theta})$ for nonlin_solve.m.
  • gradH.m - Creates  $\mathbf{H}(\boldsymbol{\theta})$ for nonlin_solve.m.
  • test_solver.m - An example of an accelerometer calibration.
  • create_meas.m - Creates a 6-point measurement set for test_ solver.m with gain plus bias errors added.
The script is extremely easy to use, the solver takes nine inputs as: nonlin_solve(x, y, z, bx, by, bz, gx, gy, gz), where x, y and z are the measurement vectors and bx, by, bz, gx, gy and gz are start guesses. To find a good starting guess it is easiest to look at the datasheet, for example an accelerometer might have the value 1024 = 1g, then the starting guess for the gain is 1024. For the bias I always start at zero, you can see my starting guess in the test_solver.m script. If you are unsure, it won't be a problem, the solver will still converge even with a substantial error in the starting guess. For example, try changing the starting guess from 1024 to 1 in test_solver.m, that is three orders of magnitude (!!!) error and it will still converge however it will need more iterations to do so.

Example run of the test_solver.m:
Loop aborted at n = 6 of 100, requested precision reached!

-: Sensor Bias :-
X-axis: 125.6636
  Error: 0.66356 (0.53%) 

Y-axis: -249.7382
  Error: 0.26185 (0.1%) 

Z-axis: 99.9163
  Error: -0.083738 (0.084%) 

MSE: 0.17196 


-: Sensor Gain :-
X-axis: 1079.4757
  Error: -0.52428 (0.049%) 

Y-axis: 1149.8909
  Error: -0.10913 (0.0095%) 

Z-axis: 920.6441
  Error: 0.6441 (0.07%) 

MSE: 0.23388
From this it is quite evident that the solver works well. But to get a feel for it you should play with it yourself! Please give it a try and come with some comments, I am always looking for improvements to this script.
   This is it for this series! I hope you have learned something about estimating parameters and how to calibrate sensors. It has been an interesting problem to try and solve. And thank you for reading!

Matlab Code

test_solver.m
clear
clc

% Set bias and gains for the solver to find
theta = [125; % mx
         -250; % my
         100; % mz
         1080; % gx
         1150; % gy
         920]; % gz

% Create measurements based on the theta
% Std. dev. of 5 and 50 measurements of each axis
[x, y, z] = create_meas(theta(1), theta(2), theta(3), ...
                           theta(4), theta(5), theta(6), 5, 50);
                       
% For nice printing
format shortG

% Run the measurements through the solver to find theta_hat                      
theta_p = nonlin_solve(x, y, z, 0, 0, 0, 1024, 1024, 1024);

% Calculate the errors
err = (theta_p - theta);
MSE_m = sum(err(1:3).^2)/3;
MSE_g = sum(err(4:6).^2)/3;

% Some nice write-out
fprintf(['-: Sensor Bias :-\n'])
fprintf(['X-axis: ', num2str(theta_p(1)), '\n\t Error: ', ...
    num2str(err(1)),' (', num2str(abs(err(1)*100/theta(1)), 2),'%%) \n'])
fprintf(['\nY-axis: ', num2str(theta_p(2)), '\n\t Error: ', ...
    num2str(err(2)),' (', num2str(abs(err(2)*100/theta(2)), 2),'%%) \n'])
fprintf(['\nZ-axis: ', num2str(theta_p(3)), '\n\t Error: ', ...
    num2str(err(3)),' (', num2str(abs(err(3)*100/theta(3)), 2),'%%) \n'])
fprintf(['\nMSE: ', num2str((MSE_m)),' \n'])



fprintf(['\n\n-: Sensor Gain :-\n'])
fprintf(['X-axis: ', num2str(theta_p(4)), '\n\t Error: ', ...
    num2str(err(4)),' (', num2str(abs(err(4)*100/theta(4)), 2),'%%) \n'])
fprintf(['\nY-axis: ', num2str(theta_p(5)), '\n\t Error: ', ...
    num2str(err(5)),' (', num2str(abs(err(5)*100/theta(5)), 2),'%%) \n'])
fprintf(['\nZ-axis: ', num2str(theta_p(6)), '\n\t Error: ', ...
    num2str(err(6)),' (', num2str(abs(err(6)*100/theta(6)), 2),'%%) \n'])
fprintf(['\nMSE: ', num2str((MSE_g)),' \n'])

create_meas.m
function [ax, ay, az] = create_meas(mx, my, mz, gx, gy, gz, stddev, meas)
    % Create measurements on each side of the sensor:
    % x, y and z axises (six measurement positions).
    A = [ ones(meas,1)   zeros(meas,1)  zeros(meas,1);
         -ones(meas,1)   zeros(meas,1)  zeros(meas,1);
          zeros(meas,1)  ones(meas,1)   zeros(meas,1);
          zeros(meas,1) -ones(meas,1)   zeros(meas,1);
          zeros(meas,1)  zeros(meas,1)  ones(meas,1)
          zeros(meas,1)  zeros(meas,1) -ones(meas,1)];
    
    % Multiply with the gain to get "sensor accurate data"
    A(:,1) = A(:,1)*gx;
    A(:,2) = A(:,2)*gy;
    A(:,3) = A(:,3)*gz;
    
    % Add the biases of each axis
    A(:,1) = A(:,1) + mx;
    A(:,2) = A(:,2) + my;
    A(:,3) = A(:,3) + mz;
    
    s = size(A);
    
    R = normrnd(0, stddev, s(1), s(2));
    A = A + R;
    
    ax = round(A(:,1));
    ay = round(A(:,2));
    az = round(A(:,3));
end

nonlin_solve.m
function theta = nonlin_solve(x, y, z, bx, by, bz, gx, gy, gz)
    % Formula for Gauss-Newton solver:
    % theta(k+1) = theta(k) + inv(H'*H)*H'*[y - yHAT(theta_k)]
    
    theta = [bx;
             by;
             bz;
             gx;
             gy;
             gz];
    
         
    for i = 1:100
        H = gradH(x, y, z, theta(1), theta(2), theta(3), ...
                  theta(4), theta(5), theta(6));

        yH = yHAT(x, y, z, theta(1), theta(2), theta(3), ...
                  theta(4), theta(5), theta(6));
              
        hTh = (H'*H);
        step = hTh\H'*(1 - yH);
        theta = theta + step;
        
        if max(abs(step)) < 1e-7
            fprintf(['Loop aborted at n = ', num2str(i), ...
                     ' of 100, requested precision reached!\n\n'])
            break
        end
    end
end

yHAT.m
function a = yHAT(ax, ay, az, mx, my, mz, gx, gy, gz)
    % Calculates the acceleration
    a = (ax - mx).^2./gx.^2 + (ay - my).^2./gy.^2 + (az - mz).^2./gz.^2;
end

gradH.m
function H = gradH(ax, ay, az, mx, my, mz, gx, gy, gz)
    % Derivative of a = (ax - mx)^2/gx^2 + (ay - my)^2/gy^2 + (az - mz)^2/gz^2
    % = d/dtheta (ax - mx)^2/gx^2 + (ay - my)^2/gy^2 + (az - mz)^2/gz^2 = 
    % = [ -(2*ax - 2*mx)/gx^2, -(2*ay - 2*my)/gy^2, -(2*az - 2*mz)/gz^2,
    % -(2*(ax - mx)^2)/gx^3, -(2*(ay - my)^2)/gy^3, -(2*(az - mz)^2)/gz^3]
    
    % H = [da/dmx da/dmy da/dmz da/dgx da/dgy da/dgz]
    
    H = -2.*[(ax - mx)./gx.^2,     (ay - my)./gy.^2,    ...
             (az - mz)./gz.^2,     ((ax - mx).^2)./gx.^3,  ...
             ((ay - my).^2)./gy.^3,  ((az - mz).^2)./gz.^3];
end

Accelerometer and Magnetometer Calibration: Part 2 - Least-Square Solving Methods

Linear versus Nonlinear

When trying to solve an over-defined (more data-points than there is unknowns) it is a popular approach to use the Least-Squares method. This works by finding the best fit of a model that minimizes the mean square error of each measurement. What is important to note is that this is for linear problems and usually performs bad when the problem is nonlinear, what is nonlinear is when you have a product/division of parameters or when the variable is not a polynomial. Looking directly at the model equation,
\begin{equation*}\left (\frac{x - b_x}{g_x} \right )^2 + \left (\frac{y - b_y}{g_y} \right )^2 + \left (\frac{z - b_z}{g_z} \right )^2 = 1, \end{equation*}
it is difficult to determine if the system is linear or nonlinear. But if we expand the squares and see what we end up with it's much easier to determine:
\begin{equation*} \frac{x^2}{g_x^2} + \frac{y^2}{g_y^2} + \frac{z^2}{g_z^2} - \frac{2b_xx}{g_x^2} - \frac{2b_yy}{g_y^2} - \frac{2b_zz}{g_z^2} + \frac{b_x^2}{g_x^2} + \frac{b_y^2}{g_y^2} + \frac{b_z^2}{g_z^2} = 1 \end{equation*}
Now we can see that in six different places we have parameter divided by parameter, hence this is a nonlinear case. If you don't want the gain estimation then it would be a simple linear case, but now we want to know all errors in the sensors hence we will need to use a nonlinear case.
    There is a method to do least-squares on ellipses called "Direct Least Square Fitting of Ellipses", however I will not go into how this is done.

Gauss-Newton Linearization Method

The method of choice to solve this nonlinear problem is going to be the Gauss-Newton linearization method, this method works like Least-Squares but uses the derivative of the function in regards to the parameters to create an iterative successive approximation routine.
   The first thing you do when using this method is to create a column vector, usually denoted $\boldsymbol{\theta}$, that contains the parameters we want to know:
\begin{equation*}\boldsymbol{\theta} = \begin{bmatrix}b_x & b_y & b_z & g_x & g_y & g_z \end{bmatrix}^T \end{equation*}
On top of this we need the function we want to estimate, denoted $\hat{y}(\boldsymbol{\theta})$ where $\mathbf{m} = \begin{bmatrix} x & y & z \end{bmatrix}$ is the measurement vector. Note that the = 1 from the original equation has disappeared, this comes from the fact that we want it to be 1 and now we are calculating it based on the estimated parameters.
\begin{equation*}\hat{y}(\boldsymbol{\theta}) = \left (\frac{x - b_x}{g_x} \right )^2 + \left (\frac{y - b_y}{g_y} \right )^2 + \left (\frac{z - b_z}{g_z} \right )^2 \end{equation*}
As stated before, this method uses the derivative of the function with regard to the parameters. In effect this method works a little like the gradient decent method, you need the gradient of the function in regard to the parameters:
\begin{equation*}
\mathbf{H}(\boldsymbol{\theta}) = \frac{\partial \hat{y}(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = -2 \begin{bmatrix}
\frac{x - b_x}{g_x^2} \\
\frac{y - b_y}{g_y^2} \\
\frac{z - b_z}{g_z^2} \\
\frac{(x - b_x)^2}{g_x^3} \\
\frac{(y - b_y)^2}{g_y^3} \\
\frac{(z - b_z)^2}{g_z^3}
\end{bmatrix}^T
\end{equation*}
The gradient will be a $m \times 6$ matrix where m is the number of measurement points. For each measurement set the gradient is calculated and you end up with as many derivatives as measurements. This will make it possible to see if we are coming closer to the real solution or if we are diverging from it. 
   Now we have the prerequisites to the Gauss-Newton linearization method! The iterative solution is calculated with the following equation (how this equation is derived is not something I will go through here):
\begin{equation} \boldsymbol{\theta}_{k+1} = \boldsymbol{\theta}_k + \left ( \mathbf{H}^T (\boldsymbol{\theta}_k) \mathbf{H}(\boldsymbol{\theta}_k) \right )^{-1} \mathbf{H}^T (\boldsymbol{\theta}_k) \left ( \mathbf{y} - \mathbf{\hat{y}}(\boldsymbol{\theta}_k) \right ) \end{equation} Where $\mathbf{y}$ is what we want our function to be, which in our case is $\mathbf{y} = 1$ and note that $\mathbf{\hat{y}}$ is a vector - not a scalar. Now it's just to follow the following steps:
  1. Set the iteration number $k = 0$
  2. Make an initial guess, $ \boldsymbol{\theta}_0 $ of the parameter vector.
  3. Compute the gradient $ \mathbf{H}(\boldsymbol{\theta}_k) $ and the modeled response by $ \hat{y}(\boldsymbol{\theta}_k) $.
  4. Update the parameter estimates according to Eq. (1).
  5. Check your stop criteria $\left \| \boldsymbol{\theta}_k - \boldsymbol{\theta}_{k-1} \right \| < \varepsilon$, where $\varepsilon$ is a small number, I choose $\varepsilon = 10^{-7}$.
  6. If any of the stop criteria is met, stop the algorithm, otherwise increase the iteration counter and start from point 3.
How many measurement points are needed?
One of the big questions are how many data points are needed for this to work? The answer requires some testing, but I have found that as low as 5 measurements per axis and direction is about where the minimum is (total of 30 measurements). To get the best data set it is recommended to do the so-called "six-point measurement" which means that the system will be put in 6 known positions corresponding to the faces of a cube. With that said, I use 50 (total of 300) measurements and with that I get within 0.2 % of the real value in simulations. Which value you choose is up to you.
   How to do this with the magnetometer is a bit trickier. The same method can be used, which works, but is sub optimal. I have tried holding the system facing North and slowly rotating it along the E-W axis and after that rotating it around the N-S axis. This seems to work as well but some more testing is required. However even with the six-point measurement magnetometer calibration is within 0.5%, which is more than good enough!

Accelerometer and Magnetometer Calibration: Part 1 - Theory

Introduction

In this small series of posts I'll be going into the art of calibrating sensors and mainly the calibration of 3-axis accelerometers and 3-axis magnetometers. The question that arises might be why there even is a need of calibration and the best answer is that all sensors have manufacturing differences, differences in the PCB design, components close by creating distortions and so on, and to counter these effects calibration is mainly used. An example of how an ideal measurements of accelerometer and magnetometer looks versus measurements corrupted with gain errors and bias errors is shown bellow in Figure 1. The errors have been  some what over emphasized. This is the result you'd expect if, as an example, you put a magnetometer on a table and rotated it one turn while keeping it flat to the table.

Figure 1: Ideal sensor measurements versus corrupted measurements. Crosses denote measurement points and the red circle is the center of the ellipsoid.

This really rises three questions: 1) how to model the ideal case, 2) which errors can occur and 3) how to model the case with errors. All these problem have quite simple solutions however by just looking at the data it can be hard to identify them.
     I will warn the average person, there will be a lot of math and I will not explain the background theories. I will assume that you know about vector and matrix algebra, derivatives and how to calculate with ellipses.

Modeling

1) How to model the ideal case?
Before we start trying to make a model we need to figure out what we are measuring. Both the magnetic field and the gravitational field are, in each point on Earth, a vector that has a direction and a magnitude and what we are measuring is the x, y and z components of these two vectors respectively. In the ideal case we assume that the magnitude is constant for all directions, there is no shifting (bias) and with this we can create an ideal measurement model. A vector with constant magnitude is most simply modeled as a sphere with the base of the vector in its center and the tip being able to touch any point, this gives the following equation of a sphere with the radius R:
\begin{equation*} x^2 + y^2 + z^2 = R^2 \end{equation*}  This is how we would want our measurements to look, but unfortunately this is usually not the case as we will see in the next part.

2) Which errors can occur?
Before we start looking at different errors we have to make some assumptions. We assume that the errors are uncorrelated, meaning that for example errors in the x-axis does not effect the errors in the y- or z-axis and that the noise is Gaussian (White). With this we can start identifying the different errors from the right picture in Figure 1. In this figure we can see that the center of the ellipse has been shifted, we will call this bias errors, and that the circle is not round any more but elliptical, we will call this gain errors, plus on top of this there is some random noise.

3) How to model the errors?
To model this let start with the radius of the circle. If you remember the old trigonometry classes, the way to represent a sphere with gains on x, y and z is called an ellipsoid and the general equation for an ellipse with its center in the origin have the equation:
\begin{equation*} \left (\frac{x}{a} \right )^2 + \left (\frac{y}{b} \right )^2 + \left (\frac{z}{c} \right )^2 = 1 \end{equation*}
Here I have included the $R^2$ in the a, b and c constants (more information on ellipsoids) and with this equation we can represent gains in each of the axises in the sensor.
     Now, lets start looking at the offset of the center of the ellipsoid. If we look back to our old equation solving classes then we know that to shift a function $f(x)$ a distance $b$ to the right we can write that as $f(x-b)$. Remember that we assumed the error to be uncorrelated and knowing this we can rewrite the ellipsoid equation to incorporate bias. In this equation I have renamed the gains from $a, b, c$ to $g_x, g_y, g_z$ respectively and I call the biases $b_x, b_y, b_z$:
\begin{equation*}\left (\frac{x - b_x}{g_x} \right )^2 + \left (\frac{y - b_y}{g_y} \right )^2 + \left (\frac{z - b_z}{g_z} \right )^2 = 1 \end{equation*}
This is the complete measurement model of both the accelerometer and the magnetometer, where $x, y, z$ are the measurements corrupted by gain ($g_x, g_y, g_z$) and bias ($b_x, b_y, b_x$) errors respectively.
    Using this equation I will, in the next part, show how to use Least-Square solving methods for finding the gains ($g_x, g_y, g_z$) and biases ($b_x, b_y, b_x$) respectively plus how to handle the random noise on top of the model.