## Iteration

First, the pool of nodes is extended by adding the (i-1)-th layer to it:

$S_{i}=S_{i-1}\cup L_{i-1}$

The i-th layer gets formed as follows:

$L_{i}=\left\{ N_{i}^{j}(N_{k}^{m},N_{l}^{n}):k\neq l,\ \forall (N_{k}^{m},N_{l}^{n})\in S_{i} \right\}$

That is, every node from the (i-1)-th layer is combined with every node from the pool, and each such pair serves as the inputs of a new node in the i-th layer. The coefficients of the new node are obtained by least-squares fitting on the training set.
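The pairing and fitting step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the quadratic two-input node form, the helper names `design_matrix`, `fit_node` and `build_layer`, and the representation of nodes as dictionaries of output vectors are all assumptions made for the example.

```python
import itertools
import numpy as np

def design_matrix(u, v):
    """Features of a two-input node (quadratic polynomial, an assumed form)."""
    return np.column_stack([np.ones_like(u), u, v, u * v, u**2, v**2])

def fit_node(u, v, y_train):
    """Least-squares coefficients of one candidate node on the training set."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(u, v), y_train, rcond=None)
    return coeffs

def build_layer(prev_layer, pool, y_train):
    """Combine every node of the previous layer with every node of the pool.

    prev_layer, pool: dicts mapping a (hypothetical) node id to that node's
    output vector on the training set.
    Returns a list of (input_a, input_b, coeffs) candidate nodes.
    """
    candidates, seen = [], set()
    for (a, u), (b, v) in itertools.product(prev_layer.items(), pool.items()):
        pair = frozenset((a, b))
        # Skip a node paired with itself and pairs already seen in reverse order.
        if a == b or pair in seen:
            continue
        seen.add(pair)
        candidates.append((a, b, fit_node(u, v, y_train)))
    return candidates
```

Because the pool already contains the previous layer, the same pair can appear in both orders; the sketch keeps only one of them.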

The next step is to reduce $L_{i}$ to a smaller number of nodes in order to narrow the search space. This is done by introducing an appropriate error criterion, which is evaluated on the validation set for each node $N_{i}^{j}\in L_{i}$. Only the best n nodes according to this criterion remain in $L_{i}$; the others are discarded, which effectively limits the search to a beam of width n.
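The selection step might look like the sketch below. The text leaves the error criterion open, so mean squared error on the validation set is used here purely as an assumption; the function name `select_best` and the `(node, predictions)` pair layout are likewise hypothetical.

```python
import numpy as np

def select_best(candidates, y_val, n):
    """Keep the n candidates with the lowest validation error.

    candidates: list of (node, predictions) pairs, where predictions is the
    node's output on the validation set.
    """
    def mse(pred):
        # Assumed criterion: mean squared error against the validation targets.
        return float(np.mean((pred - y_val) ** 2))

    ranked = sorted(candidates, key=lambda c: mse(c[1]))
    return ranked[:n]  # beam of width n
```

For example, with three candidates and `n=2`, the candidate with the largest validation error is dropped and the other two form the layer passed on to the next iteration.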