## Network parameter selection

Although most scholars are concerned with techniques for defining artificial neural network architectures, practitioners simply want to apply an ANN architecture to a model and obtain quick results. The term neural network architecture refers to the arrangement of neurons into layers, the connection patterns between layers, the activation functions, and the learning methods. The model and architecture of a neural network determine how the network transforms its input into an output; this transformation is, in fact, a computation. Success often depends on a clear understanding of the problem, regardless of the network architecture. Nevertheless, to determine which neural network architecture provides the best prediction, a good model must be built, and it is essential to be able to identify the most important variables in a process and generate best-fit models. How to identify and define the best model remains very controversial.

Despite the differences between traditional approaches and neural networks, both methods require the preparation of a model. The classical approach is based on a precise definition of the problem domain and the identification of a mathematical function or functions to describe it. It is, however, very difficult to identify an accurate mathematical function when the system is nonlinear and its parameters vary with time due to several factors, and the control program often lacks the capability to adapt to such parameter changes. Neural networks, instead, are used to learn the behavior of the system and subsequently to simulate and predict it. In defining the neural network model, the process and the process control constraints first have to be understood and identified; then the model is defined and validated.

When using a neural network for prediction, the following steps are crucial. First, a neural network needs to be built to model the behavior of the process, and the values of the output are predicted based on the model. Second, based on the neural network model obtained in the first phase, the output of the model is simulated using different scenarios. Third, the control variables are modified to control and optimize the output.

When building the neural network model, the process has to be identified with respect to the input and output variables that characterize it. The inputs include measurements of physical dimensions, measurements of variables specific to the environment or equipment, and controlled variables modified by the operator. Variables that have no effect on the variation of the measured output are discarded; they are identified from the contribution factors of the various input parameters. These factors indicate how much each input parameter contributes to the learning of the neural network and are usually estimated by the network itself, depending on the software employed.
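As a rough illustration of screening inputs before training, one can rank candidate variables by a simple proxy for their contribution, such as the absolute correlation of each input with the measured output. This is only a sketch: real ANN software derives contribution factors from the trained network itself, and the data below are toy values.

```python
# Hypothetical sketch: rank candidate inputs by the absolute Pearson
# correlation with the output, as a crude stand-in for the contribution
# factors an ANN package would report after training.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy data: input1 drives the output, input2 is unrelated noise.
inputs = {
    "input1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "input2": [0.3, -0.1, 0.2, 0.0, 0.1],
}
output = [2.1, 4.0, 6.2, 7.9, 10.1]

ranking = sorted(inputs, key=lambda k: abs(pearson(inputs[k], output)),
                 reverse=True)
# Inputs at the bottom of the ranking are candidates for discarding.
```

An input whose proxy score is near zero would, as the text describes, be a candidate for removal from the model.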

The selection of training data plays a vital role in the performance and convergence of the neural network model. An analysis of historical data to identify the variables that are important to the process is essential. Plotting graphs to check whether the various variables reflect what is known about the process from operating experience, and to discover errors in the data, is very helpful.

All input and output values are usually scaled individually so that the overall variance in the data set is maximized; that is, the input and output values are normalized. This is necessary because it leads to faster learning. The scaling used is either the range -1 to 1 or the range 0 to 1, depending on the type of data and the activation function used.
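The scaling described above can be sketched as a simple min-max normalization applied to each variable's column of values, mapping it to either [0, 1] or [-1, 1]:

```python
# Minimal sketch of min-max normalization to [0, 1] or [-1, 1].
# `values` would be the column for one input or output variable.
def normalize(values, lo=0.0, hi=1.0):
    """Linearly rescale so min(values) -> lo and max(values) -> hi."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    return [lo + (hi - lo) * (v - vmin) / span for v in values]

temps = [15.0, 20.0, 25.0, 30.0, 35.0]     # toy measurements
scaled01 = normalize(temps)                # e.g., for a 0-1 sigmoid
scaled11 = normalize(temps, -1.0, 1.0)     # e.g., for a tanh activation
```

The choice between the two target ranges follows the activation function, as the text notes; the mapping itself is the same linear rescaling either way.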

The basic procedure for successfully handling a problem with ANNs is to select an appropriate architecture together with a suitable learning rate, momentum, number of neurons in each hidden layer, and activation function. Finding the best architecture and the other network parameters is laborious and time-consuming, but as experience is gathered, some parameters can be predicted easily, tremendously shortening the required time.
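The trial-and-error search over learning rate, momentum, and hidden-layer size can be organized as a simple grid search. In this sketch, `evaluate` is a stand-in for training a network with the given settings and returning its test-set error; here it is a toy function so the example runs on its own.

```python
# Illustrative grid search over network parameters. `evaluate` is a toy
# error surface with a minimum at (0.1, 0.5, 8); a real run would train
# the network and report its error on the test file.
from itertools import product

def evaluate(learning_rate, momentum, hidden_neurons):
    return ((learning_rate - 0.1) ** 2
            + (momentum - 0.5) ** 2
            + ((hidden_neurons - 8) / 10) ** 2)

grid = product([0.01, 0.1, 0.5],   # candidate learning rates
               [0.1, 0.5, 0.9],    # candidate momentum values
               [4, 8, 16])         # candidate hidden-layer sizes

best = min(grid, key=lambda params: evaluate(*params))
```

As the text suggests, experience quickly narrows these candidate lists, which is what makes the search tractable in practice.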

The first step is to collect the required data and prepare them in a spreadsheet format, with the columns representing the input and output parameters. If the input data file contains a large number of sequences or patterns, a long training time can be avoided by creating a smaller training file that contains as many representative samples of the whole problem domain as possible; this smaller file is used to select the required parameters, and the complete data set is used for the final training.
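One simple way to build the smaller parameter-selection file is to take every k-th pattern from the full data set, so the subset spans the whole problem domain. The stride-based sampling here is an assumption for illustration; any scheme that keeps the sample representative would do.

```python
# Sketch of extracting a smaller, representative training file from the
# full data set by taking every k-th (input, output) pattern.
full_data = [(x, 2 * x + 1) for x in range(1000)]  # toy patterns

def representative_subset(patterns, size):
    """Take every k-th pattern so the sample covers the full range."""
    step = max(1, len(patterns) // size)
    return patterns[::step][:size]

small_file = representative_subset(full_data, 100)
```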

Three types of data files are required: a training data file, a test data file, and a validation data file. The training and validation files should contain representative samples of all the cases the network is required to handle, whereas the test file may contain about 10% of the cases contained in the training file.
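The three-file split can be sketched as follows. The validation share of 20% is an assumption for illustration; the test-file size follows the text's guideline of roughly 10% of the number of cases in the training file.

```python
# Sketch of splitting the available patterns into training, test, and
# validation files. The 20% validation share is an assumed figure; the
# test file is sized at ~10% of the training file, per the text.
import random

def split_patterns(patterns, seed=0):
    shuffled = patterns[:]
    random.Random(seed).shuffle(shuffled)
    n_val = len(shuffled) // 5          # assumed validation share
    validation = shuffled[:n_val]
    rest = shuffled[n_val:]
    n_test = len(rest) // 11            # test ~= 10% of training size
    test, training = rest[:n_test], rest[n_test:]
    return training, test, validation

training, test, validation = split_patterns(list(range(500)))
```

Shuffling before splitting helps keep all three files representative of the whole problem domain, as the text requires for the training and validation files.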

During training, the network is tested against the test file to determine its accuracy, and training should be stopped when the mean error remains unchanged for a number of epochs. This is done to avoid overtraining, in which case the network learns the training patterns perfectly but is unable to make predictions when an unknown data set is presented to it.
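The stopping rule just described can be sketched as an early-stopping check: halt once the test-file error has not improved for a fixed number of epochs (the "patience"). The per-epoch errors below are toy values standing in for a real training run.

```python
# Illustrative early-stopping rule: stop when the test-file error has
# not improved for `patience` consecutive epochs.
def stop_epoch(errors_per_epoch, patience=3):
    """Return the epoch (1-based) at which training would be stopped."""
    best, best_epoch = float("inf"), 0
    for epoch, err in enumerate(errors_per_epoch, start=1):
        if err < best:
            best, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(errors_per_epoch)

errors = [0.90, 0.60, 0.40, 0.35, 0.35, 0.36, 0.35, 0.37]
epoch = stop_epoch(errors)  # error plateaus after epoch 4
```

Stopping at the plateau rather than training to convergence on the training file is exactly what prevents the memorization (overtraining) behavior described above.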

In back-propagation networks, the number of hidden neurons determines how well a problem can be learned. If too many are used, the network will tend to memorize the problem and not generalize well later. If too few are used, the network will generalize well but may not have enough "power" to learn the patterns well. Getting the right number of hidden neurons is a matter of trial and error, since there is no exact science to it. In general, the number of hidden neurons (N) may be estimated by applying the following empirical formula (Ward Systems Group, Inc., 1996):

N = (I + O)/2 + √Pi

where

I = number of input parameters.

O = number of output parameters.

Pi = number of training patterns available.
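Assuming the Ward Systems Group rule takes the common form of half the sum of inputs and outputs plus the square root of the number of training patterns, the estimate can be computed directly (the rounding to a whole neuron count is an added assumption):

```python
# Sketch of the empirical hidden-neuron estimate attributed to Ward
# Systems Group: N = (I + O)/2 + sqrt(Pi), rounded to a whole number
# of neurons (the rounding is an assumption for illustration).
from math import sqrt

def hidden_neurons(n_inputs, n_outputs, n_patterns):
    return round((n_inputs + n_outputs) / 2 + sqrt(n_patterns))

n = hidden_neurons(9, 1, 100)  # e.g., 9 inputs, 1 output, 100 patterns
```

The result is only a starting point for the trial-and-error search described above, not a final choice.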

The most important parameter to select in a neural network is the type of architecture. A number of architectures can be used in solar engineering problems; the most important ones, back-propagation (BP), general regression neural networks (GRNN), and the group method of data handling (GMDH), are described briefly in the following sections.