In Neural Network Programming - Part 1, we looked at how networks are trained, some mathematical formulae, and the process I followed in building a beginner neural network. Now let's talk about training efficiency and the optimizations that made the network over 100x faster.
I spent many hours writing some concise PHP scripts which would run the neural network on test data N times and retrieve an average result over 100 trials for each iteration count N_i. The network would train with an iteration count of iterations = i * 10000. In other words, each set of 100 trials trained with 10000 more iterations than the set before it. I started at 10000 iterations and went up to 200000 iterations.
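A minimal sketch of that experiment loop might look like the following. It is not the original script: the function name averageOutputsByIterations and the $trainAndEvaluate callable are hypothetical, and the placeholder closure at the bottom only stands in for a real network so that the sketch runs on its own.

```php
<?php
/**
 * Run $trials training runs at each iteration count and average the outputs.
 *
 * $trainAndEvaluate is a hypothetical callable: it receives an iteration
 * count, trains a fresh network, and returns the output for the sample input.
 */
function averageOutputsByIterations(
    callable $trainAndEvaluate,
    int $steps = 20,
    int $trials = 100,
    int $stepSize = 10000
): array {
    $averages = [];
    for ($i = 1; $i <= $steps; $i++) {
        $iterations = $i * $stepSize; // iterations = i * 10000
        $sum = 0.0;
        for ($t = 0; $t < $trials; $t++) {
            $sum += $trainAndEvaluate($iterations);
        }
        $averages[$iterations] = $sum / $trials;
    }
    return $averages;
}

// Placeholder only: a real callable would train and query the network.
$placeholderNetwork = function (int $iterations): float {
    return $iterations / 200000.0;
};

// Small run (3 steps, 5 trials) so the sketch finishes instantly.
$results = averageOutputsByIterations($placeholderNetwork, 3, 5);
foreach ($results as $iterations => $average) {
    printf("%d iterations: %.4f\n", $iterations, $average);
}
```

With the defaults (20 steps, 100 trials, step size 10000) the loop covers 10000 to 200000 iterations, matching the setup described above.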
My hypothesis was that the efficiency of training should decrease as the iteration count increases. In other words, each additional iteration should contribute less than the one before it. After all, there's only so much the network can learn.
I put the results into Excel. The overall trend wasn't surprising, but it was intriguing nonetheless. The results in the table correspond to the average output when fed with sample data, not the error.
I then created a basic scatter plot and was surprised to see how smooth the curve was.
The rate of change of the curve appears to decrease exponentially as we linearly increase the number of iterations the network trains with. This was to be expected; however, I didn't expect the exponential curve to be so smooth and apparent. Even with 100 trials being averaged, I had not anticipated such a clean exponential decrease.
Echo Statements are Unbelievably Costly!
Let me be blunt here. PHP is a great scripting language, but some aspects of it are horrendously slow. Before creating this neural network, I had not realized just how turtle-paced echo statements are. I was only using them once every 1000 iterations to output the error at that point as a sort of debug aid, but after removing every echo statement except the one at the end of the program, I saw net speed improvements of over 100x. Running the neural network for 25000 iterations with echo statements could take 2 to 4 minutes. On the same computer, without echo statements and with 200000 iterations, the algorithm would finish in 15 to 20 seconds. Per iteration, that is roughly 5 to 10 ms with echo versus 0.1 ms or less without.
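One way to keep the debug information without echoing inside the training loop is to collect the sampled errors in an array and write them out once at the end. The sketch below assumes a hypothetical $trainStep callable that performs one training iteration and returns the current error. The function name trainWithDeferredLog is also made up for illustration, and the placeholder step exists only so the code runs.

```php
<?php
/**
 * Train for $iterations steps, storing the error every $sampleEvery steps
 * in an array instead of echoing it.
 *
 * $trainStep is a hypothetical callable: one training iteration in,
 * current error out.
 */
function trainWithDeferredLog(callable $trainStep, int $iterations, int $sampleEvery = 1000): array
{
    $errorLog = [];
    for ($i = 1; $i <= $iterations; $i++) {
        $error = $trainStep($i);
        if ($i % $sampleEvery === 0) {
            $errorLog[$i] = $error; // store the sample, do not echo
        }
    }
    return $errorLog;
}

// Placeholder step; a real one would update the weights and return the error.
$placeholderStep = function (int $i): float {
    return 1.0 / $i;
};

$start = microtime(true);
$errorLog = trainWithDeferredLog($placeholderStep, 25000);
$elapsed = microtime(true) - $start;

// A single write after training has finished.
$lines = [];
foreach ($errorLog as $iteration => $error) {
    $lines[] = sprintf("iteration %d: error %.6f", $iteration, $error);
}
echo implode("\n", $lines), "\n";
printf("elapsed: %.3f s\n", $elapsed);
```

Timing the loop with microtime(true) in both variants, with and without echo inside the loop, is a simple way to check how much of the slowdown comes from output on a given machine.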
Hidden Layers and Hidden Neurons
As with almost anything, there's a point at which the advantage gained by using more of a tool starts to diminish. Hidden layers and their hidden neurons are no exception. The running time grows quickly as more hidden layers and hidden neurons are added. As a rough model, it looks a bit like the function:
time = x * y where x is the number of hidden layers and y is the number of hidden neurons per layer. Notice that the function resembles the formula for area: when x and y are increased together, the running time grows quadratically.
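To make the area analogy concrete, here is a tiny sketch of the simplified model time = x * y. The function name relativeCost and the example sizes are made up for illustration. The result is a unitless relative cost, not a measured running time.

```php
<?php
// Relative cost under the simplified model: time = x * y.
function relativeCost(int $hiddenLayers, int $neuronsPerLayer): int
{
    return $hiddenLayers * $neuronsPerLayer;
}

$base = relativeCost(2, 10);    // 20 units
$doubled = relativeCost(4, 20); // both dimensions doubled: 80 units

// Doubling both dimensions quadruples the modelled cost.
printf("cost ratio after doubling both: %d\n", intdiv($doubled, $base)); // 4
```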
I learned a rule of thumb for choosing the number of hidden layers and neurons. It looks a little something like:
neurons = x / (a * (y + z)) where a is an arbitrary coefficient between 2 and 20, x is the number of training sets you are feeding the network, y is the number of input neurons the network has, and z is the number of output neurons the network has. The number of hidden layers in a neural network is often left at 1, as adding more layers is typically only useful in extremely complex problems, and it is much harder to train a neural network with many layers.
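The rule of thumb above translates directly into a small helper. This is only a sketch: the function name estimateHiddenNeurons and the example numbers (1000 training sets, 4 inputs, 1 output) are hypothetical, chosen to show how strongly the result depends on the coefficient a.

```php
<?php
/**
 * neurons = x / (a * (y + z))
 *   x = number of training sets, y = input neurons, z = output neurons,
 *   a = arbitrary coefficient (the article suggests 2 to 20).
 */
function estimateHiddenNeurons(int $trainingSets, int $inputNeurons, int $outputNeurons, float $a): float
{
    if ($a <= 0 || $inputNeurons + $outputNeurons <= 0) {
        throw new InvalidArgumentException('a and (inputs + outputs) must be positive');
    }
    return $trainingSets / ($a * ($inputNeurons + $outputNeurons));
}

// Hypothetical problem size: 1000 training sets, 4 inputs, 1 output.
foreach ([2, 5, 10, 20] as $a) {
    printf("a = %2d -> about %.1f hidden neurons\n", $a, estimateHiddenNeurons(1000, 4, 1, $a));
}
```

Since a can range over a full order of magnitude, the formula gives a range of candidate sizes to try, not a single answer.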
Thanks for reading!
Software Engineering Student - U of R
Current: Assistant to Manager of Instructional Tech - U of R