Deep Learning Application Areas – Pattern Recognition

Since its revival in 2006, deep learning has advanced so quickly that it is now used in many aspects of modern society, such as:

  • Deep Learning Analytics
    Companies such as Google and Microsoft have access to large volumes of data, and social media organisations such as Facebook, YouTube and Twitter have billions of users who constantly generate a very large flow of data. These companies need to analyse that data to make business decisions that affect existing technology and may define future technology. One way to achieve this is with a tool capable of analysing and learning from massive amounts of unsupervised data, ideally one that replaces feature engineering based on human input with automated feature extraction. That tool may be deep learning. Deep learning algorithms build a hierarchical architecture for learning and representing data, decomposing high-level features into smaller, more abstract ones. By extracting such features, deep learning enables the use of simpler models for tasks such as classification and prediction, which is important when developing models that must operate at Big Data scale.
    Deep learning can learn from large amounts of labelled data, but its most attractive characteristic is its capacity to learn from large amounts of unlabelled/unsupervised data, making it possible to extract meaningful representations and patterns from Big Data. Once these representations have been learned, more conventional discriminative models can be trained with relatively few supervised/labelled data points, typically obtained through human/expert input. Another advantage of deep learning in Big Data analytics is its capacity to exploit massive amounts of data where algorithms lacking deep learning's hierarchies of representations fail to recognise and understand the complexity of the existing data patterns. Its abstract data representation is better suited to performing analysis independently of the data source and, as mentioned previously, it can minimise the input required from human experts, i.e., there is no need to manually design new features for each different type of data we need to analyse.
    Deep learning algorithms are applicable to different kinds of input data, for example image, text and audio data, and can be combined with other machine-learning algorithms. A minimal sketch of the feature-extraction idea follows below.
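To make the idea of "automated feature extraction feeding a simpler model" concrete, here is a minimal sketch. It is not taken from any of the cited papers: it assumes a recent PyTorch/torchvision (for a pretrained ResNet-18 used as a feature extractor) and scikit-learn (for a simple classifier trained on the extracted features), and names such as `train_images` and `train_labels` are hypothetical.

```python
# Minimal sketch: a pretrained deep network replaces hand-engineered features;
# a simple discriminative model is then trained on the extracted features.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.linear_model import LogisticRegression

# Pretrained CNN: its convolutional layers act as the learned feature hierarchy.
cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
cnn.fc = torch.nn.Identity()   # drop the classification head, keep 512-d features
cnn.eval()

# Basic preprocessing (ImageNet normalisation omitted for brevity).
preprocess = T.Compose([T.Resize(224), T.CenterCrop(224), T.ToTensor()])

def extract_features(images):
    """images: list of PIL images -> (N, 512) NumPy feature matrix."""
    batch = torch.stack([preprocess(img) for img in images])
    with torch.no_grad():
        return cnn(batch).numpy()

# With features extracted automatically, a simple model suffices; the (few)
# labels are assumed to come from human experts:
# clf = LogisticRegression(max_iter=1000).fit(extract_features(train_images), train_labels)
```

The point of the sketch is the division of labour: the deep network supplies the hierarchical representation, while the final classifier stays deliberately simple.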

 

  • Named-entity Recognition
    Named-entity recognition (NER) is a sub-task of information extraction that recognises information units such as names, including person, organisation and location names, and numeric expressions, including time, date, money and percentage expressions. Most NER systems use machine learning models built on a large set of manually engineered features. Designing good features for NER requires a great deal of expertise and can be labour-intensive, and it also makes systems harder to adapt to new domains and languages, because the features may be language-specific. In 2011, Ronan Collobert and colleagues proposed a convolutional neural network architecture and learning algorithm in which, instead of relying on human-engineered features, the system learned internal representations in an unsupervised fashion from large amounts of unlabelled data.
    With this system the top performance reached an F1 score of 89.59%, while the benchmark performance was an F1 score of 89.31%. Using neural networks to automate feature extraction achieved better results than human-engineered features. A minimal sketch of such a tagger follows below.
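As an illustration only (not the original system), here is a minimal PyTorch sketch of a window-based neural tagger in the spirit of that work: word embeddings, which can be initialised from representations learned on unlabelled text, replace manually engineered features. The vocabulary size, tag set and layer sizes are made up for the example.

```python
# Minimal sketch of a window-based neural tagger: the only task-specific
# knowledge lives in learned embeddings and weights, not in feature templates.
import torch
import torch.nn as nn

class WindowTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=50, window=5, hidden=300):
        super().__init__()
        # Embeddings can be initialised from vectors pretrained on unlabelled text.
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(window * emb_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, num_tags),
        )

    def forward(self, word_ids):
        # word_ids: (batch, window) indices of the words around the target word.
        x = self.emb(word_ids).flatten(start_dim=1)   # (batch, window * emb_dim)
        return self.mlp(x)                            # unnormalised tag scores

# Usage sketch: 4 windows of 5 word indices -> (4, 9) tag scores.
scores = WindowTagger(vocab_size=30000, num_tags=9)(torch.randint(0, 30000, (4, 5)))
```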

 

  • Speech Recognition
    Automatic speech recognition has long been an important research topic in the machine learning community. Most traditional systems use hidden Markov models (HMMs) for decoding. With advances in both machine learning algorithms and computer hardware, more efficient methods have appeared for training deep neural networks (DNNs) that contain many layers of non-linear hidden units and a very large output layer. The use of convolutional DNNs gave better results on TIMIT (a corpus of read speech for the development and evaluation of automatic speech recognition systems), with an accuracy of 81.7%. A minimal sketch of a DNN acoustic model follows below.
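To show where the "many layers of non-linear hidden units and a very large output layer" fit in, here is a minimal sketch of a DNN acoustic model, assuming PyTorch and made-up layer sizes; in a hybrid system its per-frame state scores would be combined by an HMM decoder with transition and language models.

```python
# Minimal sketch: a DNN acoustic model mapping stacked acoustic frames to
# scores over HMM states; the HMM handles the actual decoding.
import torch
import torch.nn as nn

def dnn_acoustic_model(num_features=40, context=11, num_hmm_states=2000):
    # Input: `context` stacked frames of `num_features` filterbank coefficients.
    return nn.Sequential(
        nn.Linear(num_features * context, 1024), nn.ReLU(),
        nn.Linear(1024, 1024), nn.ReLU(),
        nn.Linear(1024, 1024), nn.ReLU(),
        nn.Linear(1024, num_hmm_states),   # very large output layer: one score per HMM state
    )

frames = torch.randn(8, 40 * 11)              # batch of 8 stacked-frame windows
state_scores = dnn_acoustic_model()(frames)   # (8, 2000) unnormalised state scores
```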

 

  • Image Recognition
    When Hinton's team won the 2012 ImageNet competition using CNNs with efficient GPU implementations, it was only the start. That system's error was 15.3%, and such errors have since been reduced further; the current best result on the MNIST set is an error rate of 0.23%. With these results the targets became more ambitious, and the task of automatic image captioning was created, in which deep learning is the essential underlying technology.
    Facebook, for example, has an image recognition project called DeepFace, which uses a deep neural network and achieves an accuracy of 97.35%, approaching human-level performance. A minimal CNN sketch follows the figure below.

 

Fig.1. Deep Face (from https://www.slideshare.net/zzwolf/details-of-lazy-deep-learning-for-images-recognition-in-zz-photo-app)
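For readers who have not seen one, here is a minimal sketch of a small convolutional network for MNIST-style 28x28 digit images, assuming PyTorch; the record results cited above come from much larger, carefully tuned networks and ensembles, so this only illustrates the basic convolution/pooling/fully-connected structure.

```python
# Minimal sketch: a small CNN classifier for 28x28 grayscale digit images.
import torch
import torch.nn as nn

small_cnn = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 28x28 -> 14x14
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
    nn.Linear(128, 10),                      # 10 digit classes
)

logits = small_cnn(torch.randn(16, 1, 28, 28))   # (16, 10) class scores
```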

 

  • Computer Vision + Natural Language Processing
    Describing a complex scene requires a deeper representation of what is going on in it, capturing how the various objects relate to one another and translating it all into natural-sounding language. Many efforts to construct computer-generated natural descriptions of images propose combining current state-of-the-art techniques in both computer vision and natural language processing into a complete image description approach. Models based on deep convolutional networks have dominated recent image interpretation tasks, but a new state of the art has emerged: merging the computer vision and language models into a single trained system that takes an image and directly produces a human-readable sequence of words describing it. This model is based on a convolutional neural network that encodes the image into a compact representation, followed by a recurrent neural network that generates the corresponding sentence, as in the sketch below.
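Here is a minimal sketch of that encoder-decoder idea, assuming PyTorch/torchvision and made-up embedding and hidden sizes (it is not the actual captioning system described above): a CNN encodes the image into a compact vector, and an LSTM then generates the caption word by word.

```python
# Minimal sketch: CNN encoder -> compact image vector -> RNN (LSTM) decoder
# that produces a sequence of word scores, i.e. a caption.
import torch
import torch.nn as nn
import torchvision.models as models

class CaptionModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden=512):
        super().__init__()
        cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        cnn.fc = nn.Linear(cnn.fc.in_features, emb_dim)   # compact image representation
        self.encoder = cnn
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.decoder = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, images, captions):
        # images: (B, 3, 224, 224); captions: (B, T) word indices.
        img_feat = self.encoder(images).unsqueeze(1)       # (B, 1, emb_dim)
        words = self.embed(captions)                       # (B, T, emb_dim)
        inputs = torch.cat([img_feat, words], dim=1)       # image vector starts the sequence
        hidden_states, _ = self.decoder(inputs)
        return self.out(hidden_states)                     # per-step scores over the vocabulary

# Usage sketch (hypothetical tensors):
# scores = CaptionModel(vocab_size=10000)(images, captions)
```

At inference time one would feed the image vector first, then feed back the previously generated word at each step, stopping at an end-of-sentence token.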

But deep learning is not only used for speech or image recognition; it goes even further, as I will show in my next post.
 
References

  1. Najafabadi, M., Villanustre, F., Khoshgoftaar, T., Seliya, N., Wald, R., Muharemagic, E.: Deep learning applications and challenges in big data analytics. Journal of Big Data (2015)
  2. Nadeau, D., Sekine, S.: A survey of named entity recognition and classification. Lingvisticae Investigationes 30(1) (2007)
  3. Taigman, Y., Yang, M., Ranzato, M.A., Wolf, L.: DeepFace: Closing the Gap to Human-Level Performance in Face Verification. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
