Parametric least squares estimation equations (the solution)

In our previous lesson we looked at the overall process of solving a geomatics networks problem. For the parametric case, we got there as follows:

  • By deriving the relevant functional model \mathbf{l}_{true} - \mathbf{F}(\mathbf{x})=\mathbf{0}
  • Then by linearizing it to get it into the form \mathbf{A}\boldsymbol{\delta} - \mathbf{e} + \mathbf{w}=\mathbf{0}
  • Then by solving that linear form under the constraint that \mathbf{e}^T\mathbf{C}_\mathbf{l}^{-1}\mathbf{e} = minimum

We also discussed how, fortunately, the general form of the latter solution has long been known under the name “estimation using a parametric least squares adjustment”.

That solution is as follows.

Estimates of what we’ve been after from the start

Ever since our very first example, we’ve been building up to the way in which we can solve for estimates of the parameters and a set of adjusted measurements. Well here it is.

1. Estimates of the parameters

The vector containing the estimates of the unknown parameters, \hat{\mathbf{x}}, is given by:

    \begin{equation*} \underset{u\times 1}{\hat{\mathbf{x}}}=\mathbf{x}_0+\hat{\boldsymbol{\delta}} \end{equation*}


where the vector of estimated corrections is given by:

    \begin{equation*} \underset{u\times 1}{\hat{\boldsymbol{\delta}}} = -(\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{A})^{-1}\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{w} \end{equation*}

and where \mathbf{A}, \mathbf{w}, and \mathbf{x}_0 are exactly the same design matrix, misclosure vector, and approximate coordinate vector that we studied in detail when we considered the linearized form of the functional model and when we practiced linearizing the fundamental observation equations.
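To make the two equations above concrete, here is a minimal NumPy sketch. The matrices \mathbf{A} and \mathbf{C}_\mathbf{l}, the misclosure vector \mathbf{w}, and the approximate values \mathbf{x}_0 are all made-up numbers for illustration, not from any real network:

```python
import numpy as np

# Hypothetical example: n = 4 observations, u = 2 unknown parameters
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])                  # design matrix (n x u)
C_l = np.diag([0.01, 0.01, 0.02, 0.02])      # observation covariance (n x n)
w = np.array([0.05, -0.03, 0.04, 0.01])      # misclosure vector (n x 1)
x0 = np.array([100.0, 200.0])                # approximate parameter values

P = np.linalg.inv(C_l)                       # weight matrix C_l^{-1}
# delta_hat = -(A^T C_l^{-1} A)^{-1} A^T C_l^{-1} w, via a linear solve
delta_hat = -np.linalg.solve(A.T @ P @ A, A.T @ P @ w)
x_hat = x0 + delta_hat                       # estimated parameters
```

Note that `np.linalg.solve` is used rather than forming the inverse explicitly, which is the usual numerical practice.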

2. The estimated residuals

The vector containing the estimated residuals, \hat{\mathbf{r}}, is given by:

    \begin{align*} \underset{n\times 1}{\hat{\mathbf{r}}} &= \mathbf{A}\hat{\boldsymbol{\delta}} + \mathbf{w} \\ &= \left[ -\mathbf{A}(\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{A})^{-1}\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1} + \mathbf{I} \right]\mathbf{w} \end{align*}

Recall our earlier treatments of what a residual is in the general parametric case. This is the least squares estimate of that residual. (And you might notice that it is a rearranged form of the original \mathbf{A}\boldsymbol{\delta} - \mathbf{e} + \mathbf{w}=\mathbf{0} where \boldsymbol{\delta} is approximated by \hat{\boldsymbol{\delta}}.)
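Both forms of the residual equation can be checked against each other numerically. This sketch reuses the same made-up matrices as before; the final assertion-style check exploits the fact that the estimated residuals satisfy \mathbf{A}^T\mathbf{C}_\mathbf{l}^{-1}\hat{\mathbf{r}} = \mathbf{0}:

```python
import numpy as np

# Same hypothetical network as in the earlier sketch (illustrative values only)
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
C_l = np.diag([0.01, 0.01, 0.02, 0.02])
w = np.array([0.05, -0.03, 0.04, 0.01])

P = np.linalg.inv(C_l)
N = A.T @ P @ A
delta_hat = -np.linalg.solve(N, A.T @ P @ w)

r_hat = A @ delta_hat + w                    # first form of the residual equation
# second form: [-A N^{-1} A^T C_l^{-1} + I] w
r_hat_direct = (-A @ np.linalg.inv(N) @ A.T @ P + np.eye(4)) @ w
```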

3. The adjusted measurements

From the above we can get the vector containing the adjusted measurements, \hat{\mathbf{l}}, as follows:

    \begin{equation*} \underset{n\times 1}{\hat{\mathbf{l}}}=\mathbf{l}_{measured}+\hat{\mathbf{r}} \end{equation*}

As you can see, this is just our best guess at what the measurements should be – given our estimated residuals.
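For a purely linear model \mathbf{l} = \mathbf{A}\mathbf{x}, the misclosure is \mathbf{w} = \mathbf{A}\mathbf{x}_0 - \mathbf{l}_{measured}, and the adjusted measurements then satisfy \hat{\mathbf{l}} = \mathbf{A}\hat{\mathbf{x}} exactly, which makes a handy consistency check. A sketch with made-up observations:

```python
import numpy as np

# Hypothetical linear model l = A x, so the misclosure is w = A @ x0 - l_measured
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
C_l = np.diag([0.01, 0.01, 0.02, 0.02])
x0 = np.array([100.0, 200.0])
l_measured = np.array([100.02, 199.97, 300.05, -99.96])  # made-up observations

P = np.linalg.inv(C_l)
w = A @ x0 - l_measured                      # misclosure vector
delta_hat = -np.linalg.solve(A.T @ P @ A, A.T @ P @ w)
r_hat = A @ delta_hat + w                    # estimated residuals
l_hat = l_measured + r_hat                   # adjusted measurements
```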

Estimates of how good our estimates are

As we’ve discussed in class, one of the powers of the least squares approach is that it doesn’t just give you the parameters you’re after – it also gives you an estimate of their precision.

The least squares solution yields the following variance-covariance matrices.

Once we get to it, you will recognize that these come from not much more than the propagation of errors.

4. Variance-covariance matrix of the parameters

The variance-covariance matrix of the estimated unknown parameters is as follows:

    \begin{equation*} \underset{u\times u}{\mathbf{C}_{\hat{\mathbf{x}}}} =\mathbf{C}_{\hat{\boldsymbol{\delta}}} =(\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{A})^{-1} \end{equation*}
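In code this is one line, and the square roots of its diagonal give the standard deviations of the estimated parameters. Again the matrices are the made-up ones from the earlier sketches:

```python
import numpy as np

# Same hypothetical design and covariance matrices as before
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
C_l = np.diag([0.01, 0.01, 0.02, 0.02])

C_x = np.linalg.inv(A.T @ np.linalg.inv(C_l) @ A)   # C_x_hat = (A^T C_l^{-1} A)^{-1}
param_std = np.sqrt(np.diag(C_x))                   # std. devs. of the parameters
```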

5. Variance-covariance matrix of the adjusted measurements

The variance-covariance matrix of the adjusted measurements is as follows:

    \begin{equation*} \underset{n\times n}{\mathbf{C}_{\hat{\mathbf{l}}}} =\mathbf{A}(\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{A})^{-1}\mathbf{A}^T \end{equation*}

6. Variance-covariance matrix of the estimated residuals

The variance-covariance matrix of the estimated residuals is as follows:

    \begin{equation*} \underset{n\times n}{\mathbf{C}_{\hat{\mathbf{r}}}} =\mathbf{C}_{\mathbf{l}} - \mathbf{C}_{\hat{\mathbf{l}}} \end{equation*}
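These two covariance matrices are also straightforward to compute, again with the made-up matrices from the earlier sketches. A useful sanity check, which follows from the trace identity \mathrm{tr}(\mathbf{C}_\mathbf{l}^{-1}\mathbf{C}_{\hat{\mathbf{l}}}) = \mathrm{tr}(\mathbf{N}^{-1}\mathbf{A}^T\mathbf{C}_\mathbf{l}^{-1}\mathbf{A}) = u, is that this trace equals the number of parameters:

```python
import numpy as np

# Same hypothetical network as before (n = 4, u = 2)
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
C_l = np.diag([0.01, 0.01, 0.02, 0.02])

P = np.linalg.inv(C_l)
N = A.T @ P @ A
C_lhat = A @ np.linalg.inv(N) @ A.T          # covariance of adjusted measurements
C_r = C_l - C_lhat                           # covariance of estimated residuals
```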

7. Variance-covariance matrix of the misclosure vector

The variance-covariance matrix of the misclosure vector is just that of the observations:

    \begin{equation*} \underset{n\times n}{\mathbf{C}_{\mathbf{w}}} =\mathbf{C}_{\mathbf{l}} \end{equation*}

On the assumptions about the observation errors

As you can see from the solutions provided above, the whole least squares adjustment depends on the estimate of the variance-covariance matrix of the observations, \mathbf{C}_{\mathbf{l}}. If this is not correct, then everything above is also not correct.

As we will also see again later in the course, this variance-covariance matrix is defined with an a-priori variance factor, {\sigma}_0^2, which allows for statistical testing before the adjustment takes place (if {\sigma}_0^2 is known).

Fortunately, if {\sigma}_0^2 is not known, it’s possible to estimate the precision of the observations by looking at the residuals – which we can think about as the amount the observed values are “adjusted” by the estimation process. Put another way, we can estimate {\sigma}_0^2 from the output of the adjustment itself:

    \begin{equation*} \hat{\sigma}_0^2=\dfrac{\hat{\mathbf{r}}^T\mathbf{C}_{\mathbf{l}}^{-1}\hat{\mathbf{r}}}{df}=\dfrac{\hat{\mathbf{r}}^T\mathbf{C}_{\mathbf{l}}^{-1}\hat{\mathbf{r}}}{n-u} \end{equation*}

where df = n - u is the number of degrees of freedom (the redundancy) of the adjustment.
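Given the residuals, the a-posteriori factor is a single quadratic form divided by the redundancy. A sketch, once more with the made-up network from the earlier examples:

```python
import numpy as np

# Same hypothetical network as before (illustrative numbers only)
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
C_l = np.diag([0.01, 0.01, 0.02, 0.02])
w = np.array([0.05, -0.03, 0.04, 0.01])

P = np.linalg.inv(C_l)
delta_hat = -np.linalg.solve(A.T @ P @ A, A.T @ P @ w)
r_hat = A @ delta_hat + w                    # estimated residuals

n, u = A.shape                               # n = 4 observations, u = 2 parameters
sigma0_sq_hat = (r_hat @ P @ r_hat) / (n - u)   # a-posteriori variance factor
```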

If the factor \hat{\sigma}_0^2 differs from the a-priori factor {\sigma}_0^2 by a statistically significant amount, then the covariance matrices should be scaled by it, e.g. for a-priori variance-covariance matrices:

    \begin{equation*} \hat{\mathbf{C}} = \mathbf{C}_{\text{corrected}} = \hat{\sigma}_0^2 \mathbf{C}_{\text{a-priori}} \end{equation*}

and for variance-covariance matrices we have estimated:

    \begin{equation*} \hat{\mathbf{C}} = \mathbf{C}_{\text{corrected}} = \hat{\sigma}_0^2 \mathbf{C}_{\text{estimated}} \end{equation*}

For example, for the observations:

    \begin{equation*} \hat{\mathbf{C}}_{\mathbf{l}} = \hat{\sigma}_0^2 \mathbf{C}_{\mathbf{l}} \end{equation*}

and for the estimated parameters:

    \begin{equation*} \hat{\mathbf{C}}_{\hat{\mathbf{x}}} =\hat{\mathbf{C}}_{\hat{\boldsymbol{\delta}}} = \hat{\sigma}_0^2 \mathbf{C}_{\hat{\mathbf{x}}}= \hat{\sigma}_0^2(\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{A})^{-1} \end{equation*}
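The scaling itself is a simple element-wise multiplication. In this sketch both the variance factor and the parameter covariance matrix are hypothetical numbers, chosen only to show the effect; note that the standard deviations scale by \hat{\sigma}_0 (here \sqrt{1.44} = 1.2), not by \hat{\sigma}_0^2:

```python
import numpy as np

# Hypothetical values: an a-posteriori factor and a parameter covariance matrix
sigma0_sq_hat = 1.44                         # estimated variance factor (made up)
C_x = np.array([[4.0e-4, 1.0e-4],
                [1.0e-4, 9.0e-4]])           # C_x_hat before scaling (made up)

C_x_scaled = sigma0_sq_hat * C_x             # rescaled covariance matrix
```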

The normal matrix

At this point we can also define the so-called normal matrix, \mathbf{N}, which depends on the datum characteristics and the geometry of the network. It's given by the following subset of what we saw above:

    \begin{equation*} \mathbf{N} = \mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{A} \end{equation*}

This in turn means that some of the above equations can also be written as follows:

    \begin{equation*} \hat{\boldsymbol{\delta}} = -\mathbf{N}^{-1}\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1}\mathbf{w} \end{equation*}

    \begin{equation*} \underset{n\times 1}{\hat{\mathbf{r}}} = \left[ -\mathbf{A}\mathbf{N}^{-1}\mathbf{A}^T\mathbf{C}_{\mathbf{l}}^{-1} + \mathbf{I} \right]\mathbf{w} \end{equation*}

    \begin{equation*} \mathbf{C}_{\hat{\mathbf{x}}} =\mathbf{C}_{\hat{\boldsymbol{\delta}}} =\mathbf{N}^{-1} \end{equation*}

    \begin{equation*} \mathbf{C}_{\hat{\mathbf{l}}} =\mathbf{A}\mathbf{N}^{-1}\mathbf{A}^T \end{equation*}

    \begin{equation*} \hat{\mathbf{C}}_{\hat{\mathbf{x}}} =\hat{\mathbf{C}}_{\hat{\boldsymbol{\delta}}} = \hat{\sigma}_0^2 \mathbf{C}_{\hat{\mathbf{x}}}= \hat{\sigma}_0^2\mathbf{N}^{-1} \end{equation*}

On the implementation of this solution

As a final note here, be aware that all of the above are just the algebraic forms of the expressions and do not necessarily indicate the best way to compute things in a practical situation.
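As one illustration of that point, when \mathbf{C}_\mathbf{l} is diagonal (uncorrelated observations) there is no need to build or invert the full covariance matrix at all: the rows of \mathbf{A} and the entries of \mathbf{w} can simply be divided by the variances, and \hat{\boldsymbol{\delta}} obtained from a linear solve rather than an explicit inverse. A sketch with the same made-up numbers as before:

```python
import numpy as np

# Made-up network; sigma2 holds the diagonal of C_l (observation variances)
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
sigma2 = np.array([0.01, 0.01, 0.02, 0.02])
w = np.array([0.05, -0.03, 0.04, 0.01])

PA = A / sigma2[:, None]                     # equals C_l^{-1} A, no inverse formed
N = A.T @ PA                                 # normal matrix A^T C_l^{-1} A
u_vec = A.T @ (w / sigma2)                   # equals A^T C_l^{-1} w
delta_hat = -np.linalg.solve(N, u_vec)       # solve N delta = -u_vec directly
```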
