## Least-squares fitting

Consider a second-order polynomial, used as the nonlinear output of a network node, fitted by least squares to map the input regressors $\mathbf{x}_{m}^{t}$ and $\mathbf{x}_{n}^{t}$ to the target $\mathbf{y}^{t}$:

$$\hat{y}_{i}^{t}=\mathbf{a}\begin{bmatrix}1\\ x_{mi}^{t}x_{ni}^{t}\\ x_{ni}^{t}\\ (x_{ni}^{t})^{2}\\ x_{mi}^{t}\\ (x_{mi}^{t})^{2}\end{bmatrix}=\begin{bmatrix}a_{0} & a_{1} & a_{2} & a_{3} & a_{4} & a_{5}\end{bmatrix}\begin{bmatrix}1\\ x_{mi}^{t}x_{ni}^{t}\\ x_{ni}^{t}\\ (x_{ni}^{t})^{2}\\ x_{mi}^{t}\\ (x_{mi}^{t})^{2}\end{bmatrix}$$
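The basis vector and node output above can be sketched in NumPy; the names `phi` and `predict` are illustrative, not part of the original formulation:

```python
import numpy as np

def phi(xm, xn):
    """Second-order basis [1, xm*xn, xn, xn^2, xm, xm^2] for scalar inputs."""
    return np.array([1.0, xm * xn, xn, xn**2, xm, xm**2])

def predict(a, xm, xn):
    """Node output y_hat = a . phi(xm, xn) for a 6-element coefficient vector a."""
    return a @ phi(xm, xn)
```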
The approximation error (sum of squared errors) over the $N$ training instances is

$$e_{SSE}^{t}=\sum_{i=1}^{N}\left(\hat{y}_{i}^{t}-y_{i}^{t}\right)^{2}$$
The minimal error is obtained for the coefficient vector $\mathbf{a}$ that solves the system of normal equations

$$\begin{bmatrix}
N & \sum x_{mi}x_{ni} & \sum x_{ni} & \sum x_{ni}^{2} & \sum x_{mi} & \sum x_{mi}^{2}\\
\sum x_{mi}x_{ni} & \sum x_{mi}^{2}x_{ni}^{2} & \sum x_{mi}x_{ni}^{2} & \sum x_{mi}x_{ni}^{3} & \sum x_{mi}^{2}x_{ni} & \sum x_{mi}^{3}x_{ni}\\
\sum x_{ni} & \sum x_{mi}x_{ni}^{2} & \sum x_{ni}^{2} & \sum x_{ni}^{3} & \sum x_{mi}x_{ni} & \sum x_{mi}^{2}x_{ni}\\
\sum x_{ni}^{2} & \sum x_{mi}x_{ni}^{3} & \sum x_{ni}^{3} & \sum x_{ni}^{4} & \sum x_{mi}x_{ni}^{2} & \sum x_{mi}^{2}x_{ni}^{2}\\
\sum x_{mi} & \sum x_{mi}^{2}x_{ni} & \sum x_{mi}x_{ni} & \sum x_{mi}x_{ni}^{2} & \sum x_{mi}^{2} & \sum x_{mi}^{3}\\
\sum x_{mi}^{2} & \sum x_{mi}^{3}x_{ni} & \sum x_{mi}^{2}x_{ni} & \sum x_{mi}^{2}x_{ni}^{2} & \sum x_{mi}^{3} & \sum x_{mi}^{4}
\end{bmatrix}
\begin{bmatrix}a_{0}\\ a_{1}\\ a_{2}\\ a_{3}\\ a_{4}\\ a_{5}\end{bmatrix}
=
\begin{bmatrix}\sum y_{i}\\ \sum y_{i}x_{mi}x_{ni}\\ \sum y_{i}x_{ni}\\ \sum y_{i}x_{ni}^{2}\\ \sum y_{i}x_{mi}\\ \sum y_{i}x_{mi}^{2}\end{bmatrix}$$
where the summations run over all $N$ training instances.
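A minimal sketch of assembling and solving this system, assuming NumPy and training arrays `xm`, `xn`, `y` (the function name `fit_quadratic` is illustrative). Building $\Phi^{\mathsf{T}}\Phi\,\mathbf{a}=\Phi^{\mathsf{T}}\mathbf{y}$ from the design matrix $\Phi$ produces exactly the element-wise sums written out above:

```python
import numpy as np

def fit_quadratic(xm, xn, y):
    """Least-squares coefficients a for y ~ a . [1, xm*xn, xn, xn^2, xm, xm^2].

    Assembles the normal equations (Phi^T Phi) a = Phi^T y and solves them;
    each entry of Phi^T Phi is one of the summed products in the 6x6 matrix.
    """
    Phi = np.column_stack([np.ones_like(xm), xm * xn, xn, xn**2, xm, xm**2])
    A = Phi.T @ Phi   # 6x6 symmetric matrix of summed products
    b = Phi.T @ y     # right-hand side of y-weighted sums
    return np.linalg.solve(A, b)
```

In practice `np.linalg.lstsq(Phi, y)` on the design matrix is numerically preferable to forming the normal equations explicitly, since $\Phi^{\mathsf{T}}\Phi$ squares the condition number.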

The least-squares error can be evaluated on the validation set according to

$$\hat{y}_{i}^{v}=\mathbf{a}\begin{bmatrix}1\\ x_{mi}^{v}x_{ni}^{v}\\ x_{ni}^{v}\\ (x_{ni}^{v})^{2}\\ x_{mi}^{v}\\ (x_{mi}^{v})^{2}\end{bmatrix}=\begin{bmatrix}a_{0} & a_{1} & a_{2} & a_{3} & a_{4} & a_{5}\end{bmatrix}\begin{bmatrix}1\\ x_{mi}^{v}x_{ni}^{v}\\ x_{ni}^{v}\\ (x_{ni}^{v})^{2}\\ x_{mi}^{v}\\ (x_{mi}^{v})^{2}\end{bmatrix}$$
$$e_{SSE}^{v}=\sum_{i=1}^{M}\left(\hat{y}_{i}^{v}-y_{i}^{v}\right)^{2}$$
where $M$ is the number of validation instances.
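The validation evaluation can be sketched the same way, reusing the fitted coefficients on held-out arrays (the name `validation_sse` is illustrative):

```python
import numpy as np

def validation_sse(a, xm_v, xn_v, y_v):
    """Sum of squared errors of the fitted node on held-out validation data."""
    Phi_v = np.column_stack(
        [np.ones_like(xm_v), xm_v * xn_v, xn_v, xn_v**2, xm_v, xm_v**2]
    )
    y_hat = Phi_v @ a          # validation-set predictions
    return np.sum((y_hat - y_v) ** 2)
```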