By Stephen Garside on 7/4/2017
Neural networks are nothing new in software development, but their recent rise in prominence in web and business software has been phenomenal. This rise has been responsible for the recent wave of 'machine learning' software and also the rapid development of Artificial Intelligence (AI) software including image, handwriting and voice recognition. The potential uses for neural networks in business and web-based software are huge, and I hope to discuss some of these in this article as my development progresses.
For now I am going to document progress as I develop my first C# neural network example - my code can be viewed / downloaded on my GitHub page - C# neural network example. I am not going to go into the intricacies of how neural networks work - I will save that for another article, so I am assuming as you read this that you have some basic knowledge.
The initial aim is to develop a re-usable, feed-forward neural network using C# that I will initially use to recognise the characters of the alphabet from an 8 x 8 pixel image - much like an ASCII character from the keyboard. Once I have achieved this I will then explore other uses for the same neural network code base in business and web-based software.
One of the aims of writing good software is re-use, and I am hoping to achieve this with my C# neural network example.
My vision is to ultimately be able to pull a pre-trained neural network from the database and have it perform a function for me in code. For example, I might have a customer object in memory and I want to know whether or not they are a satisfied customer. I should be able to load up my 'Is this customer satisfied' neural network, push my customer record through it and get a real-time score that I can base other business logic upon.
I could write specific C# code to spot if a customer is satisfied, but why bother? If I have trained my neural network sufficiently in advance then it should be able to spot a satisfied or dissatisfied customer a mile off.
Using this approach I can set up and train other neural networks (using the same code base) to help with other logic tasks, e.g. to identify customers who aren't interacting with my company as much as I would like - again, all real-time and without having to write a separate piece of code.
A key benefit of this type of approach is that the development and training of these 'logic neural networks' can be picked up by people who understand a company's data rather than by a software developer. Once you have a coded neural network it can be re-used for any logic task without the overhead of having to write new code - you just train it to do a different business task.
Initial check-in of my software to GitHub. I have started with a basic C# console app for speed. For now I am storing the initial inputs and weights in strings that are parsed at runtime and converted into two-dimensional arrays. This approach should also make it much easier to store and retrieve them from a database as the software develops. Initially I am going to create a basic 3-input, 2-layer network whilst I build up the coding concepts. The activation function I will be using is Sigmoid.
Today I have taken the decision to 'enterprise' my C# neural network rather than just creating a basic example. I have made this decision because at the end of this journey I would like to have a set of robust C# libraries that anybody can use to create, train and utilise neural networks in their business applications. Watch this space for a new check-in to GitHub in the coming days.
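To give a flavour of the string-based approach, here is a minimal sketch of parsing a weights string into a two-dimensional array and pushing some inputs through a single Sigmoid layer. The class name TinyNetwork, the method names and the ';' / ',' separator format are all assumptions made for illustration and are not taken from the repository.

```csharp
using System;
using System.Globalization;

public static class TinyNetwork
{
    // Parses text such as "0.9,0.3,0.4;0.2,0.8,0.2" into a [rows, columns] array.
    // The ';' (row) and ',' (column) separators are an assumed format.
    public static double[,] ParseMatrix(string text)
    {
        string[] rows = text.Split(';');
        int columns = rows[0].Split(',').Length;
        var matrix = new double[rows.Length, columns];

        for (int r = 0; r < rows.Length; r++)
        {
            string[] cells = rows[r].Split(',');
            if (cells.Length != columns)
                throw new FormatException("Every row must have the same number of values.");

            for (int c = 0; c < columns; c++)
                matrix[r, c] = double.Parse(cells[c].Trim(), CultureInfo.InvariantCulture);
        }
        return matrix;
    }

    // Sigmoid squashes any real number into the range (0, 1).
    public static double Sigmoid(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // weights[n, i] is the weight from input i into neuron n of the layer.
    public static double[] FeedForward(double[] inputs, double[,] weights)
    {
        if (weights.GetLength(1) != inputs.Length)
            throw new ArgumentException("Each neuron needs one weight per input.");

        int neurons = weights.GetLength(0);
        var outputs = new double[neurons];
        for (int n = 0; n < neurons; n++)
        {
            double sum = 0.0;
            for (int i = 0; i < inputs.Length; i++)
                sum += weights[n, i] * inputs[i];
            outputs[n] = Sigmoid(sum);
        }
        return outputs;
    }
}
```

Calling TinyNetwork.FeedForward once per layer, feeding each layer's outputs in as the next layer's inputs, gives a complete forward pass for a small network like the 3-input one described above.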
I have completed code refactoring to change my initial array based approach into an object-orientated design. I have also renamed my projects to be more reflective of their purpose and remove the use of 'Example' within the project names. I have also introduced the use of C# interfaces so I can play around with different implementations of a neural network going forward.
My next task is to work on the calculation and back-propagation of errors through the neural network. Back-propagation is used to 'train' a neural network to give more accurate answers. This involves the use of training data to tweak the weights in a network until it gives you an expected answer - more on that in a later update.
Phew! Working out how to back-propagate neural network errors in C# is a toughy! I have separated my back-propagation logic out into a 'neural network training service', and also introduced the ability to pass in a function that calculates the distribution of error across the incoming weights of a neuron. Currently I am using a function that distributes the error in proportion to the size of each incoming weight, but this approach allows me to change this in the future.
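The idea of passing the error distribution function into the training code can be sketched roughly as below. This is only an illustration: the names ErrorDistribution, ProportionalToWeight and PreviousLayerErrors are hypothetical, and I am assuming 'size' means the absolute value of each weight.

```csharp
using System;
using System.Linq;

public static class ErrorDistribution
{
    // Splits one neuron's error across its incoming weights, giving each
    // weight a share equal to |weight| / sum of |weights| (an assumption).
    public static double[] ProportionalToWeight(double neuronError, double[] incomingWeights)
    {
        double total = incomingWeights.Sum(w => Math.Abs(w));
        if (total == 0.0)
            return new double[incomingWeights.Length]; // nothing to distribute

        return incomingWeights
            .Select(w => neuronError * (Math.Abs(w) / total))
            .ToArray();
    }

    // Pushes the errors of one layer back to the previous layer.
    // incomingWeights[n][i] is the weight from previous-layer neuron i into neuron n.
    // The distribute function is passed in so it can be swapped out later.
    public static double[] PreviousLayerErrors(
        double[] layerErrors,
        double[][] incomingWeights,
        Func<double, double[], double[]> distribute)
    {
        int previousCount = incomingWeights[0].Length;
        var previousErrors = new double[previousCount];

        for (int n = 0; n < layerErrors.Length; n++)
        {
            double[] shares = distribute(layerErrors[n], incomingWeights[n]);
            for (int i = 0; i < previousCount; i++)
                previousErrors[i] += shares[i]; // sum the shares per source neuron
        }
        return previousErrors;
    }
}
```

Because the layer sizes are read from the arrays themselves, the same code copes with layers of unequal size, and a different distribution rule can be supplied simply by passing a different function.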
My next step is to do some testing of the back propagation logic to ensure it works with a neural network of unequal numbers of neurons per layer e.g. a network with 6 input, 3 hidden and 2 output neurons. If the logic stacks up then my next task will be to refine the weights within the network using the error.
If I have learnt anything at this stage it is to have a written-down working example of a small neural network so you can test your logic. Here is mine:
With the back-propagation logic complete, the next step has been to calculate the revised network weights using the back-propagated error values - this involves a bit of algebra! I have again chosen to pass a weight calculation function into my training service to allow me to test the logic separately and also switch out the logic if required. I had initially thought that I would need to use the individual back-propagated error values for each weight, but have since realised it is the sum of these values on a per-neuron basis that is important. The equation I have used to calculate a new weight is as follows:
Once again, my manual paper version of a neural network has proved invaluable when sense-checking the output of my code - if you are having a go at writing a C# neural network then one of these is a must.
It's amazing what a holiday can do for you! Whilst away I have spent a little time reading 'Artificial Intelligence for Humans, Volume 3: Deep Learning and Neural Networks' by Jeff Heaton - it's not for the faint-hearted and I would recommend reading other introductory neural network books first before diving into this one :) .
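If you want something concrete to check a paper example against, one widely used weight update for a Sigmoid neuron is the delta rule, sketched below. Treat this as an assumption about the general shape of the calculation rather than the exact equation in my repository: the names WeightUpdate and SigmoidDeltaRule are hypothetical, the learning rate is an illustrative parameter, and the error is assumed to be target minus actual output.

```csharp
using System;

public static class WeightUpdate
{
    // Delta rule for a Sigmoid neuron:
    //   newWeight = weight + learningRate * error * output * (1 - output) * input
    // where output * (1 - output) is the slope of Sigmoid at the neuron's output
    // and error is (target - actual), summed per neuron.
    public static double SigmoidDeltaRule(
        double weight,
        double neuronError,
        double neuronOutput,
        double input,
        double learningRate)
    {
        double slope = neuronOutput * (1.0 - neuronOutput);
        return weight + learningRate * neuronError * slope * input;
    }

    // Applies a weight calculation function to every incoming weight of one neuron.
    // Passing the function in keeps the rule testable and replaceable.
    public static double[] UpdateIncomingWeights(
        double[] weights,
        double[] inputs,
        double neuronError,
        double neuronOutput,
        double learningRate,
        Func<double, double, double, double, double, double> calculate)
    {
        var updated = new double[weights.Length];
        for (int i = 0; i < weights.Length; i++)
            updated[i] = calculate(weights[i], neuronError, neuronOutput, inputs[i], learningRate);
        return updated;
    }
}
```

With a weight of 0.5, an error of 0.2, an output of 0.5, an input of 1 and a learning rate of 0.1, the rule nudges the weight up to 0.505, which is easy to confirm by hand.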
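My latest code commit includes a couple of new activation function options - ReLU (Rectifier) and Softmax. ReLU is generally preferred over Sigmoid for the hidden layers of deeper networks: it is cheaper to compute and its gradient does not shrink towards zero for large positive inputs, which is why it is the default choice in most modern networks.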
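A small sketch helps show why ReLU is attractive. The class and method names here are hypothetical and chosen purely for illustration; the point is to compare the slope (gradient) of the two functions for a large input.

```csharp
using System;

public static class ActivationComparison
{
    // ReLU passes positive values through unchanged and clamps negatives to zero.
    public static double Relu(double x)
    {
        return Math.Max(0.0, x);
    }

    // The slope of ReLU is 1 for positive inputs and 0 otherwise.
    public static double ReluSlope(double x)
    {
        return x > 0.0 ? 1.0 : 0.0;
    }

    public static double Sigmoid(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // The slope of Sigmoid is s * (1 - s), which becomes tiny for large |x|.
    public static double SigmoidSlope(double x)
    {
        double s = Sigmoid(x);
        return s * (1.0 - s);
    }
}
```

For an input of 10 the Sigmoid slope is less than 0.0001 whereas the ReLU slope is still 1, so errors passed back through a ReLU neuron are not squashed in the same way.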
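The Softmax function is used to convert the values of your output neurons into probabilities that collectively sum to 1. It does this by raising e to the power of each value and then dividing each result by the sum of all of those results. For example, if you have three output values - [2.5, 2.5, 5] - once softmaxed you are left with values of roughly [0.07, 0.07, 0.86]. Note that this is not the same as simply dividing each value by the total, which would give [0.25, 0.25, 0.5] - the exponentials exaggerate the gap between the largest value and the rest :) .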
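A minimal sketch of Softmax in C# might look like the following. The class and method names are hypothetical and not taken from the repository; subtracting the largest value before exponentiating is a standard trick to avoid overflow and does not change the result.

```csharp
using System;
using System.Linq;

public static class Softmax
{
    // Returns e^v / sum(e^v) for every value v, so the results sum to 1.
    public static double[] Apply(double[] values)
    {
        double max = values.Max();

        // Shift by the maximum so the largest exponent is e^0 = 1.
        double[] exponentials = values.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exponentials.Sum();

        return exponentials.Select(e => e / sum).ToArray();
    }
}
```

Because the function exponentiates before normalising, the largest input takes a much bigger share of the total than it would under a plain proportional split.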
My next job will be to utilise these in my test harness program and also refactor the back-propagation code to accept a different type of activation function - at the moment it is hard-coded to use Sigmoid.
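One way this refactoring could go is to hide each activation function behind an interface that also exposes its slope, so the training code never needs to know which function is in use. The sketch below is a possible direction only; IActivationFunction, SigmoidActivation, ReluActivation and TrainingMaths are hypothetical names, not types from the repository.

```csharp
using System;

public interface IActivationFunction
{
    double Activate(double x);

    // Slope of the function, expressed in terms of the neuron's output.
    double Derivative(double output);
}

public sealed class SigmoidActivation : IActivationFunction
{
    public double Activate(double x) { return 1.0 / (1.0 + Math.Exp(-x)); }
    public double Derivative(double output) { return output * (1.0 - output); }
}

public sealed class ReluActivation : IActivationFunction
{
    public double Activate(double x) { return Math.Max(0.0, x); }

    // A ReLU output is positive exactly when its input was positive.
    public double Derivative(double output) { return output > 0.0 ? 1.0 : 0.0; }
}

public static class TrainingMaths
{
    // Gradient for one weight: error * slope of the activation * input.
    // Nothing here is tied to Sigmoid any more.
    public static double Gradient(
        IActivationFunction activation,
        double neuronError,
        double neuronOutput,
        double input)
    {
        return neuronError * activation.Derivative(neuronOutput) * input;
    }
}
```

Swapping Sigmoid for ReLU then becomes a matter of passing a different object into the training code rather than editing the back-propagation logic itself.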