Artificial Neural Network


1.      Objectives of the analysis

An artificial neural network (ANN) is a computer program that implements sophisticated pattern detection and machine learning algorithms to build predictive models from large historical databases (Berson and Smith, 1997).  It can learn from examples in much the same way that human experts learn from experience (Berry and Linoff, 1997).  This ability makes it useful in data mining, where it can produce good results on a wide range of complicated business problems.

ANN is popular for its wide applicability in many data mining and decision-support applications.  It has been applied across a broad range of industries to tasks such as customer response prediction and fraud detection.  Hence, the objectives of the analysis might include the following data mining tasks (Berry and Linoff, 1997):

2.      Description of the data required

ANN is particularly sensitive to the format of incoming data.  Different data representations can produce different results, and therefore setting up the data is a significant part of the effort of using it (Berry and Linoff, 1997).  Kay (2001) argues that ANN has also proved useful in dealing with complex data.

The way that the data in the training database is fed into the ANN can have a huge impact on both the time it takes to train the network and the accuracy of the network (Berson and Smith, 1997; Berry and Linoff, 1997).  According to Berson and Smith (1997), the raw data is often preprocessed before it is fed to the ANN because ANN expects all inputs and outputs to be floating-point values between 0 and 1.  Therefore, each predictor must be scaled in some way so that its values map into this range.  There are several strategies for the mapping.  For continuous data such as dollar amounts or ratios, scaling, non-uniform scaling, binning, thermometer encoding, time-difference embedding and general non-linear transformations can be used (Berson and Smith, 1997).  For categorical data such as gender or marital status, numeric encoding, one-of-N encoding and binary encoding can be used (Berson and Smith, 1997).  Berry and Linoff (1997) argue that a good data set should cover the full range of values for all the features.
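As an illustration of three of these mappings, the short Python sketch below shows simple linear scaling for a continuous predictor, one-of-N encoding for a categorical predictor, and thermometer encoding.  The function names, the sample dollar amounts, the marital status categories and the thresholds are assumptions made for this example and are not taken from Berson and Smith (1997).

```python
def min_max_scale(values):
    """Linearly map a list of numbers onto the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to scale, so use 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_of_n(value, categories):
    """One-of-N encoding: one input per category, only the match is 1.0."""
    return [1.0 if value == c else 0.0 for c in categories]


def thermometer(value, thresholds):
    """Thermometer encoding: switch on one input for each threshold reached."""
    return [1.0 if value >= t else 0.0 for t in thresholds]


# Hypothetical sample data, for illustration only.
dollar_amounts = [20.0, 50.0, 80.0, 120.0]
marital_statuses = ["single", "married", "divorced"]

print(min_max_scale(dollar_amounts))
print(one_of_n("married", marital_statuses))
print(thermometer(65.0, [25.0, 50.0, 75.0]))
```

One-of-N encoding avoids implying an order among the categories, at the cost of one network input per category; thermometer encoding keeps the order of a value at the cost of one input per threshold.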

3.      Description of technique

ANN is an analytical model based on the physiological properties of animal nervous systems.  As with animal nervous systems, ANN is composed of interconnected nodes (neurons) that are capable of processing and transmitting information (Boone and Roehm, 2002).  Datta and Tassou (1998) argue that ANN learns the main characteristics of a system through an iterative training process and can automatically update the learned knowledge online over time.  Consequently, a self-organizing ANN that is exposed to large amounts of data tends to discover patterns and relationships in that data.  It also overcomes the limitations of traditional forecasting methods, including misspecification, biased outliers, assumption of linearity and re-estimation (Alon et al, 2001).

ANN can be applied to both directed and undirected data mining.  For undirected data mining, ANN will identify clusters of records that are similar to each other but it does not explain how they are similar (Berry and Linoff, 1997).  The discussion in the following section is focused on directed data mining only.  Berry and Linoff (1997) identify the following steps to use ANN for directed data mining: identify the input and output features, massage the inputs and outputs, set up a network, train the network, test the network, and finally apply the model.

Step 1: Identify the input and output features

The input and output must be well understood to reap the benefits of ANN.  For example, consider how much a customer is willing to pay for a shirt.  The inputs might be the shirt's quality, color or pattern, finishing, other special features, brand, country of origin, etc.  Based on these inputs, ANN then uses a feed-forward neural network to calculate the expected value that a customer is willing to pay, which is the output of the analysis.

In a feed-forward neural network, information flows one way through the network from the inputs to the outputs and there are no cycles in the network (Berry and Linoff, 1997).  It is organized into three layers: input layer, hidden layer (the weighting system), and output layer.  According to Kay (2001), each unit (neuron) takes many input signals and then, based on an internal weighting system, produces a single output signal that is typically sent as input to another unit.
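The one-way flow through the three layers can be sketched in a few lines of Python.  This is a minimal sketch only: the logistic (sigmoid) function used inside each unit, the choice of two hidden units, the three shirt inputs and all of the weight values are assumptions made for illustration, not details given by Berry and Linoff (1997) or Kay (2001).

```python
import math


def sigmoid(x):
    """Squash any real number into the range 0 to 1 (assumed activation)."""
    return 1.0 / (1.0 + math.exp(-x))


def feed_forward(inputs, hidden_weights, output_weights):
    """One-way pass: input layer -> hidden layer -> single output unit."""
    # Each hidden unit combines all inputs using its own list of weights.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    # The output unit combines the hidden-unit signals in the same way.
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))
    return hidden, output


# Hypothetical shirt described by three inputs already scaled to 0..1:
# quality, brand strength and finishing.
shirt = [0.9, 0.4, 0.7]
hidden_weights = [[0.5, -0.2, 0.1],   # weights into hidden unit 1
                  [-0.3, 0.8, 0.4]]   # weights into hidden unit 2
output_weights = [1.0, -0.5]          # weights into the output unit

hidden, willingness = feed_forward(shirt, hidden_weights, output_weights)
print(hidden, willingness)
```

Because no unit feeds its signal back to an earlier layer, a single pass from the inputs to the output is enough to produce the result.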

Step 2: Massage the inputs and outputs

ANN works best when all the input and output values are between 0 and 1.  This requires massaging all the values to get new values between 0 and 1.  The values then pass through the network in three steps: the input layer receives each input as a value between 0 and 1; the hidden layer multiplies each value by its weight and sums the results; finally, this sum is passed through a special conversion filter that turns it into a number between 0 and 1, which is fed into the output layer (Berry and Linoff, 1997).
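A small worked example may make the massaging concrete.  The Python sketch below scales two raw inputs into the range 0 to 1, passes them through one unit, and maps the result back to a dollar amount.  The logistic function is assumed here as the conversion filter, and the input ranges, weights and price range are hypothetical values chosen only for illustration.

```python
import math


def scale(value, lo, hi):
    """Map a raw value from the range lo..hi onto 0..1."""
    return (value - lo) / (hi - lo)


def unscale(value, lo, hi):
    """Map a network value in 0..1 back onto the raw range lo..hi."""
    return lo + value * (hi - lo)


def unit(inputs, weights):
    """Weighted sum followed by a logistic conversion filter (assumed)."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))


# Hypothetical raw inputs: a quality score on a 1-10 scale and a thread count.
quality = scale(8.0, 1.0, 10.0)
thread_count = scale(300.0, 100.0, 500.0)

signal = unit([quality, thread_count], [1.5, -0.5])  # always between 0 and 1
price = unscale(signal, 10.0, 60.0)                  # assumed range in dollars
print(round(signal, 3), round(price, 2))
```

The same pair of mappings is needed at both ends of the network: inputs are scaled before the network sees them, and the output is unscaled before it is reported.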

Steps 3 and 4: Set up and train a network

The network is set up with an appropriate topology and is then trained.  According to Berson and Smith (1997), the training of the ANN has the following steps:

(1)    Create an initial network with random weights assigned to the links.

(2)    Run each record from the training database through the network and use the backpropagation learning algorithm to correct the error.  At this stage, plenty of examples based on past experience, for which both the input and the output are known, are used to train the network repeatedly by backpropagation.  This is a very technical and complicated process.  There are three steps in backpropagation (Berry and Linoff, 1997).  Firstly, the network gets a training example and uses the existing weights in the network to calculate the output.  Secondly, backpropagation calculates the error by taking the difference between the calculated result and the actual (expected) result.  Finally, the error is fed back through the network and the weights are adjusted to minimize the error.  Each unit is assigned a specific share of responsibility for the error; the network apportions part of the error to the input layer, the hidden layer and the output layer.

(3)    Check the stopping criterion.

(4)    If the stopping criterion is not reached, return to step 2 and repeat with the entire training database.
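The four training steps above can be sketched as follows.  This is a minimal illustration under stated assumptions, not the procedure of any particular tool: the network has one hidden layer of logistic units with a constant bias input, the learning rate, error target and epoch limit are arbitrary, and the training database is a hypothetical four-record table of the logical OR function.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, inputs):
    """Return (hidden signals, output) for one record."""
    x = list(inputs) + [1.0]            # constant 1.0 acts as a bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x)))
              for ws in net["hidden"]]
    out = sigmoid(sum(w * v for w, v in zip(net["output"], hidden + [1.0])))
    return hidden, out


def mean_squared_error(net, records):
    return sum((t - forward(net, x)[1]) ** 2 for x, t in records) / len(records)


def train(records, n_hidden=2, rate=0.5, max_epochs=5000,
          target_error=0.01, seed=0):
    rng = random.Random(seed)
    n_in = len(records[0][0])
    # (1) Create an initial network with random weights on the links.
    net = {"hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                      for _ in range(n_hidden)],
           "output": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]}
    for _ in range(max_epochs):
        # (2) Run each record through the network and correct the error.
        for x, target in records:
            hidden, out = forward(net, x)
            # Error at the output, scaled by the slope of the logistic curve.
            delta_out = (target - out) * out * (1.0 - out)
            # Share of the error fed back to each hidden unit.
            delta_hidden = [delta_out * net["output"][j] * h * (1.0 - h)
                            for j, h in enumerate(hidden)]
            for j, v in enumerate(hidden + [1.0]):
                net["output"][j] += rate * delta_out * v
            for j in range(n_hidden):
                for i, v in enumerate(list(x) + [1.0]):
                    net["hidden"][j][i] += rate * delta_hidden[j] * v
        # (3) Check the stopping criterion; (4) otherwise repeat.
        if mean_squared_error(net, records) <= target_error:
            break
    return net


# Hypothetical training database: inputs and the known output (logical OR).
OR_RECORDS = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
              ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

trained = train(OR_RECORDS)
print(mean_squared_error(trained, OR_RECORDS))
```

The stopping criterion here combines an error target with a cap on the number of passes, so that training ends even if the target is never reached.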

Steps 5 and 6: Test the network and apply the model

Finally, the network is tested on examples that are strictly independent of the training examples, to ensure that it has learned to recognize the patterns in the data rather than only the training set itself.  The ANN model is complete when its performance on the test set is satisfactory.
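Keeping the test examples independent is usually done by setting part of the data aside before training begins.  The Python sketch below shows one simple way to do this; the 25% test fraction, the mean absolute error measure, the toy records and the stand-in model are all assumptions for illustration, and in practice the stand-in model would be replaced by the trained network.

```python
import random


def split_records(records, test_fraction=0.25, seed=0):
    """Shuffle the records and hold some back as an independent test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)


def mean_absolute_error(predict, records):
    """Average gap between the known output and the model's output."""
    return sum(abs(target - predict(inputs))
               for inputs, target in records) / len(records)


# Toy data in which the known output simply equals the single input.
records = [([i / 10.0], i / 10.0) for i in range(11)]
training_set, test_set = split_records(records)


def stand_in_model(inputs):
    """Placeholder for a trained network's prediction function."""
    return inputs[0]


print(len(training_set), len(test_set))
print(mean_absolute_error(stand_in_model, test_set))
```

Only the training set is shown to the network during training; the error on the test set is then the figure used to judge whether the model is satisfactory.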

Although there are benefits in using ANN, the technique has two major weaknesses according to Berry and Linoff (1997).  First, ANN cannot produce explicit rules or explain its results.  Second, ANN may come up with an inferior solution, as there is no guarantee that the solution provides the best model of the data.

4.      Expected outcomes of analysis

The expected outcomes of the analysis are well understood.  The outcomes can be used for the following purposes:


References

Alon, Ilan; Qi, Min and Sadowski, Robert J (2000), 'Forecasting Aggregate Retail Sales: A Comparison of Artificial Neural Networks and Traditional Methods', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Berry, Michael J and Linoff, Gordon (1997), Data Mining Techniques: For Marketing, Sales, and Customer Support, John Wiley & Sons, New York

Berson, Alex and Smith, Stephen J (1997), Data Warehousing, Data Mining, & OLAP, McGraw Hill, US

Boone, Derrick S and Roehm, Michelle (2002), 'Retail Segmentation using Artificial Neural Networks', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Datta, D and Tassou, S A (1998), 'Artificial Neural Network Based Electrical Load Prediction for Food Retail Stores', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Kay, Alexx (2001), 'Artificial Neural Networks', Computerworld, Framingham, 12 February, [Online, accessed 27 December 2002]
URL:http://proquest.umi.com

Bibliography

Alon, Ilan; Qi, Min and Sadowski, Robert J (2000), 'Forecasting Aggregate Retail Sales: A Comparison of Artificial Neural Networks and Traditional Methods', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Berry, Michael J and Linoff, Gordon (1997), Data Mining Techniques: For Marketing, Sales, and Customer Support, John Wiley & Sons, New York

Berson, Alex and Smith, Stephen J (1997), Data Warehousing, Data Mining, & OLAP, McGraw Hill, US

Boone, Derrick S and Roehm, Michelle (2002), 'Retail Segmentation using Artificial Neural Networks', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Datta, D and Tassou, S A (1998), 'Artificial Neural Network Based Electrical Load Prediction for Food Retail Stores', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Fish, Kelly E; Johnson, John D; Dorsey, Robert E and Blodgett, Jeffery G (2000), 'Using an Artificial Neural Networks Trained with a Genetic Algorithm to Model Brand Share', Elsevier Science, [Online, accessed 27 December 2002]
URL:http://www.sciencedirect.com

Harrold, Dave (2001), 'An Easier Way to Develop Artificial Neural Networks', Control Engineering, Barrington, August, [Online, accessed 27 December 2002]
URL:http://proquest.umi.com

Kay, Alexx (2001), 'Artificial Neural Networks', Computerworld, Framingham, 12 February, [Online, accessed 27 December 2002]
URL:http://proquest.umi.com

