First, the pool of nodes is extended by adding the (i-1)-th layer to it.

The i-th layer is then formed as follows: every node from the (i-1)-th layer is combined with every node from the pool as a pair of inputs to produce a new node of the i-th layer. The coefficients of each new node are obtained by least-squares fitting to the training set.
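The layer-forming step can be sketched in Python with NumPy. This is a minimal illustration rather than a definitive implementation: the text does not specify the node's transfer function, so a two-input polynomial `a0 + a1*u + a2*v + a3*u*v` is assumed here, and the names `fit_node` and `form_layer` are hypothetical. Each node is represented by its vector of outputs on the training set.

```python
import itertools
import numpy as np

def design_matrix(u, v):
    # Assumed transfer function: a0 + a1*u + a2*v + a3*u*v (not given in the text).
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.column_stack([np.ones_like(u), u, v, u * v])

def fit_node(u, v, y):
    # Least-squares fit of the node coefficients to the training targets y.
    coeffs, *_ = np.linalg.lstsq(design_matrix(u, v), y, rcond=None)
    return coeffs

def form_layer(prev_layer, pool, y_train):
    # prev_layer, pool: lists of 1-D arrays holding node outputs on the training set.
    # Every node of the previous layer is paired with every node of the pool.
    candidates = []
    pairs = itertools.product(enumerate(prev_layer), enumerate(pool))
    for (j, u), (k, v) in pairs:
        candidates.append({"inputs": (j, k), "coeffs": fit_node(u, v, y_train)})
    return candidates
```

Because the pool already contains the previous layer, some pairs combine a node with itself; `np.linalg.lstsq` still returns a (minimum-norm) solution for such rank-deficient cases, so no special handling is added in this sketch.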

The next step is to limit the i-th layer to a smaller number of nodes in order to narrow the search space. This is done by introducing an appropriate error criterion, which is evaluated on the validation set for each node of the i-th layer; only the best n nodes according to this criterion remain in the layer, and the others are discarded, effectively limiting the search to a beam of width n.
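The pruning step can be sketched as follows. The text leaves the error criterion open, so mean squared error on the validation set is assumed here purely for illustration; the function name `select_best` and its arguments are likewise hypothetical.

```python
def validation_error(outputs, targets):
    # Assumed criterion: mean squared error on the validation set.
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)

def select_best(candidates, val_outputs, y_val, n):
    # candidates: candidate nodes of the current layer (any objects).
    # val_outputs: for each candidate, its outputs on the validation set.
    # Returns the n candidates with the smallest error, best first,
    # together with their error values.
    errors = [validation_error(out, y_val) for out in val_outputs]
    order = sorted(range(len(candidates)), key=lambda i: errors[i])
    keep = order[:n]
    return [candidates[i] for i in keep], [errors[i] for i in keep]
```

Keeping only `n` nodes per layer bounds the number of pairs fitted in the next layer, which is what makes the procedure behave like a beam search of width `n`.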