Distributed approach for Pattern Recognition
Most of the past initiatives in distributed pattern recognition (DPR) have focused on providing a distributed architecture for pattern recognition. The problem with this approach is the high dependency on hardware...
by Tomas
There have been attempts to create a hardware-independent implementation of distributed pattern recognition, but none was fully realized. The obstacle is the complexity of the pattern recognition algorithms. This post describes the important characteristics and aspects of DPR and summarizes the second chapter of the book Internet-Scale Pattern Recognition.
Scalability of neural network approaches
Scalability can be achieved using a distributed approach. Two factors are particularly important for scalability: storage capacity and inter-neuron communication frequency.
• Storage capacity – a baseline evaluation of storage capacity looks at how a growing number of stored patterns affects a given network.
• Inter-neuron communication frequency – the number of messages a single neuron sends to other neurons in the network.
The problem is that existing neural network implementations have a high communication frequency, which limits how far the recognition scheme can scale. For the network to be scalable, the communication frequency must be kept as low as possible.
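To make the effect of communication frequency concrete, here is a minimal sketch that counts messages per recognition step for two hypothetical topologies. Both topologies and both function names are assumptions chosen for illustration; the chapter does not prescribe them.

```python
def messages_fully_connected(n_neurons: int) -> int:
    """Messages per step if every neuron sends one message to every other neuron."""
    return n_neurons * (n_neurons - 1)


def messages_adjacent_only(n_neurons: int) -> int:
    """Messages per step if neurons form a chain and message only direct neighbours."""
    return 2 * (n_neurons - 1) if n_neurons > 1 else 0


# Compare how the two (assumed) topologies grow with network size.
for n in (10, 100, 1000):
    print(n, messages_fully_connected(n), messages_adjacent_only(n))
```

In this toy model the fully connected count grows quadratically with the number of neurons, while the neighbour-only count grows linearly, which is why a low per-neuron communication frequency matters for large networks.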
Key components of DPR
There are three key components of a scalable distributed pattern recognition scheme: the learning mechanism, the processing approach, and the training procedure.
• Learning mechanism – three learning mechanisms are discussed in connection with DPR. The first is Hebbian learning; its potential for saturation and catastrophic forgetting makes it less scalable. The other two are incremental learning and one-shot learning. In incremental learning, the training data are divided into separate subsets, and each subset undergoes its own training phase. One-shot learning learns in a single cycle.
• Processing approach – distributing the input space within a pattern recognition algorithm improves processing speed. According to Amdahl's law, the achievable speedup is bounded by the fraction of the task that must run serially, so the larger the fraction that can be parallelized, the greater the speedup parallel processing can deliver.
• Training procedure – it allows the algorithm to learn from a sample data set before it is used for recognition. The graph neuron is mentioned as an example of a single-cycle training method.
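Amdahl's law can be written as speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the task and n is the number of processors. The short sketch below evaluates it; the function names and the sample values of p and n are illustrative choices, not figures from the book.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of processors grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


# Illustrative values: the same 10 processors with different parallel fractions.
for p in (0.5, 0.9, 0.99):
    print(p, round(amdahl_speedup(p, 10), 2), round(max_speedup(p), 2))
```

Even with unlimited processors, a task that is 90% parallelizable cannot run more than about 10 times faster, so enlarging the parallel fraction matters more than adding nodes.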
System approaches
Two types of distribution approaches are described. The first is process farming. In this method there are several copies of the pattern recognition algorithm, and each copy is trained on a different subset of the training set. The weight changes and errors are sent to a master node. This approach is shown in the next figure.
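A minimal sketch of process farming, under strong simplifying assumptions: the "pattern recognition algorithm" is replaced by a one-parameter linear model y ≈ w * x trained by gradient descent, the workers run sequentially rather than on separate machines, and the master averages the reported changes. The names worker_gradient and process_farming are hypothetical.

```python
def worker_gradient(w, subset):
    """One worker: gradient of the mean squared error for y ~ w * x on its subset."""
    return sum(2 * (w * x - y) * x for x, y in subset) / len(subset)


def process_farming(data, n_workers=3, lr=0.01, epochs=200):
    """Master loop: hand the current weight to every worker, average their gradients."""
    subsets = [data[i::n_workers] for i in range(n_workers)]  # one subset per worker
    w = 0.0
    for _ in range(epochs):
        # In a real deployment these calls would run in parallel on separate nodes.
        grads = [worker_gradient(w, subset) for subset in subsets]
        w -= lr * sum(grads) / len(grads)  # master applies the averaged change
    return w


# Toy training set generated from y = 3 * x (an assumption for the demo).
data = [(x, 3.0 * x) for x in range(1, 10)]
w = process_farming(data)
print(round(w, 4))
```

The key property is that every worker starts each round from the same weight and only the master updates it, so the workers never need to talk to each other.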
The second approach is called pipelining. In this method, the weight changes and errors are passed from one process to the next. This approach is shown in the next figure.
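Pipelining can be sketched in the same toy setting. Here each stage holds its own data subset, updates the weight it receives, and hands the result to the next stage. As before, the linear model, the sequential execution, and the names stage_update and pipeline_training are assumptions made for illustration.

```python
def stage_update(w, subset, lr):
    """One pipeline stage: refine the incoming weight on this stage's subset."""
    grad = sum(2 * (w * x - y) * x for x, y in subset) / len(subset)
    return w - lr * grad


def pipeline_training(data, n_stages=3, lr=0.01, passes=100):
    """Pass the weight through all stages in order, repeating for several passes."""
    subsets = [data[i::n_stages] for i in range(n_stages)]  # one subset per stage
    w = 0.0
    for _ in range(passes):
        for subset in subsets:
            w = stage_update(w, subset, lr)  # output of one stage feeds the next
    return w


# Toy training set generated from y = 3 * x (an assumption for the demo).
data = [(x, 3.0 * x) for x in range(1, 10)]
w = pipeline_training(data)
print(round(w, 4))
```

Unlike process farming, there is no master node: the state travels along the chain, so each stage communicates only with its successor.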
Pattern distribution techniques
There are two pattern distribution techniques: subpattern distribution and set distribution.
• Subpattern distribution – each pattern is partitioned into subpatterns, which are recognized across the entire network.
• Set distribution – a pattern set containing a number of patterns is distributed for recognition.
These two methods are shown in the next figure.
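The difference between the two techniques can be illustrated with a small sketch. The contiguous split and the round-robin assignment are assumptions picked for simplicity; the chapter does not specify how patterns or subpatterns are mapped to nodes.

```python
def subpattern_distribution(pattern, n_nodes):
    """Split ONE pattern into n_nodes contiguous subpatterns, one per node."""
    size, extra = divmod(len(pattern), n_nodes)
    parts, start = [], 0
    for node in range(n_nodes):
        end = start + size + (1 if node < extra else 0)  # spread any remainder
        parts.append(pattern[start:end])
        start = end
    return parts


def set_distribution(pattern_set, n_nodes):
    """Assign WHOLE patterns from a set to nodes in round-robin order."""
    return [pattern_set[i::n_nodes] for i in range(n_nodes)]


print(subpattern_distribution("ABCDEFGH", 4))
# ['AB', 'CD', 'EF', 'GH']
print(set_distribution(["AAAA", "BBBB", "CCCC", "DDDD", "EEEE"], 2))
# [['AAAA', 'CCCC', 'EEEE'], ['BBBB', 'DDDD']]
```

In the first case every node sees a fragment of each pattern; in the second, every node sees complete patterns but only part of the set.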
Conclusions
This post covered the main features of distributed pattern recognition: the two most important factors for the scalability of neural network approaches, the key components of DPR, the system approaches for DPR, and the pattern distribution techniques. It is based on the second chapter of the book Internet-Scale Pattern Recognition.
Tomáš Cádrik













