Neural networks are very complicated programs accessible only to elite academics and geniuses, not something an average developer could ever work with, and definitely not anything I could hope to comprehend. Right?
Well, no. After an enlightening talk by Louis Monier and Greg Renard at Holberton, I realized that neural networks are simple enough for just about any developer to understand and implement. Of course, the most complicated networks are huge projects with elegant, intricate designs, but the core concepts underlying them are more or less straightforward. Writing any network from scratch would be challenging, but fortunately there are some excellent libraries that can handle the grunt work for you.
A neuron in this context is quite simple. It takes several inputs, multiplies each one by a weight, and fires if the sum of those weighted inputs passes a threshold. The learning process is simply the process of adjusting the weights to produce a desired output. The networks we’re interested in right now are called “feed forward” networks, which means the neurons are arranged in layers, with input coming from the previous layer and output going to the next.
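In code, a neuron like that is only a few lines. Here’s a minimal sketch in plain Python (the weights and threshold are made-up values, just for illustration):

def perceptron(inputs, weights, threshold):
    # multiply each input by its weight and sum the results
    total = sum(i * w for i, w in zip(inputs, weights))
    # fire only if the weighted sum passes the threshold
    return 1 if total > threshold else 0

print(perceptron([1, 0, 1], [0.5, 0.3, 0.2], 0.6))  # 0.7 > 0.6, so it fires: 1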
There are other kinds of networks, like recurrent neural networks, which are organized differently, but that’s a subject for another day.
The type of neuron described above, called a perceptron, was the original model for artificial neurons but is rarely used now. The problem with perceptrons is that a small change in the input can lead to a dramatic change in the output, because their step activation function is all or nothing. A negligible reduction in a neuron’s input can cause its sum to no longer exceed the threshold and prevent it from firing, leading to even bigger changes down the line. Fortunately this is a relatively easy problem to solve with a smooth activation function, which most modern networks use.
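To see the difference concretely, compare a step activation with a smooth one on two nearly identical inputs. This is plain Python again, just for illustration:

import math

def step(x):
    # perceptron-style activation: all or nothing
    return 1 if x > 0 else -1

print(step(0.001))        # 1
print(step(-0.001))       # -1: a tiny change in input flips the output completely
print(math.tanh(0.001))   # ~0.001
print(math.tanh(-0.001))  # ~-0.001: the same tiny change barely moves a smooth function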
However, our network will be simple enough that perceptrons will do. We’re going to create a network that can process an AND operation. That means it needs two input neurons and one output neuron, with relatively few neurons (we’ll use four) in the middle “hidden” layer. The image below shows its design, which should be familiar.
Monier and Renard used convnet.js to create browser demos for their talk. Convnet.js builds neural networks directly in your browser, allowing you to easily run and manipulate them on almost any platform. Of course there are drawbacks to using a JavaScript solution, not the least of which is speed. So for this article, we’ll use FANN (Fast Artificial Neural Networks). There is a Python module, pyfann, which contains bindings for FANN. Go ahead and install that now.
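Installation details vary by platform; the bindings have historically shipped with the FANN source itself, so you may need to build FANN first. On systems where a packaged version exists, something along these lines may work (the exact package name is an assumption, so check your package index):

$ pip install pyfann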
Import FANN like so:
>>> from pyfann import libfann
And we can get started! The first thing we need to do is create an empty network.
>>> neural_net = libfann.neural_net()
Now, neural_net has no neurons in it, so let’s go ahead and add some. The function we’re going to use is libfann.create_standard_array(), which creates a network where every neuron is connected to all the neurons in its neighboring layers, which we call a “fully connected” network. As a parameter, create_standard_array() takes an array of the number of neurons in each layer. In our case, this array will be (2, 4, 1).
>>> neural_net.create_standard_array((2, 4, 1))
Then, we set the learning rate, which reflects how much the network will change its weights in a single iteration. We’ll set a pretty high learning rate, 0.7, since we’re giving it a simple problem.
>>> neural_net.set_learning_rate(0.7)
Then we set the activation function, as discussed above. We’re using SIGMOID_SYMMETRIC_STEPWISE, which is a stepwise approximation of the tanh function. It’s faster but less precise than tanh itself, which is fine for this problem.
>>> neural_net.set_activation_function_output(libfann.SIGMOID_SYMMETRIC_STEPWISE)
Finally, we run the training algorithm and save the network to a file. The training command takes four arguments: the file containing the training data, the maximum number of times the training algorithm will run, the number of iterations between status reports, and the desired error.
>>> neural_net.train_on_file("and.data", 10000, 1000, 0.00001)
>>> neural_net.save("and.net")
The file “and.data” should look as follows:
4 2 1
-1 -1
-1
-1 1
-1
1 -1
-1
1 1
1
The first line contains three values: the number of examples in the file, the number of inputs, and the number of outputs. Each example that follows takes two lines: one with the inputs and one with the expected output.
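If you’d rather not write the file by hand, a few lines of Python can generate it. This is just a convenience sketch, not part of FANN:

# each example is (inputs, expected output) for AND, using the -1/1 encoding
examples = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]

with open("and.data", "w") as f:
    f.write("%d %d %d\n" % (len(examples), 2, 1))  # header: examples, inputs, outputs
    for inputs, output in examples:
        f.write("%d %d\n" % inputs)
        f.write("%d\n" % output)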
Once your network has trained successfully, you’re going to want to try it out, right? Well, first, let’s load it from the file we saved it to.
>>> neural_net = libfann.neural_net()
>>> neural_net.create_from_file("and.net")
Next, we can simply run it like so:
>>> print(neural_net.run([1, -1]))
which should output [-1.0] or a similar value, depending on what happened during training.
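To convince yourself the network learned the whole truth table, you can run all four input pairs. The exact numbers will vary a little from one training run to the next, but they should land close to -1, -1, -1, and 1:

>>> for pair in [[-1, -1], [-1, 1], [1, -1], [1, 1]]:
...     print(neural_net.run(pair))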
Congratulations! You just taught a computer to do basic logic!