October 2018
Text by Emilia Marius

Image: © a-image/istockphoto.com

Emilia Marius is a senior business analyst with more than eight years of experience. She focuses on IT solutions for retail and e-commerce and has applied her skills to projects such as a sales analysis system for a retail company, a mobile payment solution for an e-shop, and more.

emilita.marius[at]gmail.com 

Are deep learning algorithms a viable solution?

We have entered the age of Artificial Intelligence (AI). But how do we implement it? To assess the pros and cons and choose the most suitable solution for each project, it’s important to understand the difference between machine learning and deep learning.

Until now, these terms have sometimes been used interchangeably, although there are notable differences. In fact, deep learning is a particular type of machine learning that aims to replicate the way the brain works and apply this insight to computers. It uses neural networks, which contain artificial neurons organized in layers.

A neural network has three categories of layers: the input layer, the hidden layers, and the output layer. The input layer receives the raw information to be analyzed, applies weights to it, and passes the result on to the hidden layers. There, the process is repeated, layer by layer, until the result reaches the output layer.
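As a rough illustration of this layered weighted-sum-and-activation process (a plain-Python sketch, not any particular library's API; the weights are invented for the example):

```python
import math

def forward(inputs, layers):
    """Propagate an input vector through a list of layers.

    Each layer is a list of neurons; each neuron is a (weights, bias) pair.
    The weighted sum of a neuron's inputs is squashed by a sigmoid
    activation before being passed to the next layer.
    """
    activations = inputs
    for layer in layers:
        activations = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(weights, activations)) + bias)))
            for weights, bias in layer
        ]
    return activations

# A toy 2-input network: one hidden layer of two neurons, one output neuron.
hidden = [([0.5, -0.6], 0.1), ([0.3, 0.8], -0.2)]
output = [([1.0, -1.0], 0.0)]
result = forward([1.0, 0.5], [hidden, output])
print(result)  # a single probability-like value between 0 and 1
```

Training a real network consists of adjusting those weights and biases until the output layer produces useful answers.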

Let’s take a closer look at the strengths and challenges of using deep learning.

 

The pros of using deep learning

Deep learning is recommended for the recognition of text, images, video, and audio. By adapting the number and arrangement of layers – the network's architecture – a neural network can be reconfigured to solve a range of related problems. The role of the hidden layers is to reduce the need for feature engineering. The difference from generic machine learning is that deep learning works on unstructured information and requires far less manual labeling.

Unstructured, unlabeled data

This appetite for unstructured and diverse data is the main benefit of deep learning. You can take varied data such as test scores or even complete social media profiles and feed them into a neural network to uncover hidden links – for example, a correlation between IQ and likes on Facebook. Such associations can be most significant when it comes to diagnosing diseases, but there are many other applications, such as making stock market predictions.

Not only does deep learning use unstructured data, it also reduces the need to label this data, as was previously necessary with generic machine learning. Most labeling tasks require little skill but a high volume of work, so labeling is not necessarily a barrier. Yet if the algorithm is used to identify delicate medical conditions, which would require the pre-labeling of enormous data sets by experts, the costs could no longer justify the project. A good example can be found in this study by InData Labs.

Reusable, generalizable and scalable

Once a network is trained, it can be reused as many times as necessary for the same problem. The good news is that the system gets better as it performs more analyses. Also, since neural networks can operate over a cloud architecture, their training is highly parallelizable: it can be split between data centers for fast and accurate results.
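To illustrate the parallelization idea, here is a minimal sketch of data-parallel training: a toy linear model whose gradient is computed independently per data shard and then averaged – the same pattern distributed training follows across machines. The model, data, and learning rate are invented for the example:

```python
def shard_gradient(w, shard):
    """Gradient of the mean squared error of y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def parallel_step(w, shards, lr=0.01):
    """One data-parallel update: each shard computes its gradient
    independently (in practice on a separate machine or data center),
    then the gradients are averaged into a single weight update."""
    grads = [shard_gradient(w, s) for s in shards]  # could run concurrently
    return w - lr * sum(grads) / len(grads)

# Synthetic data generated from the true relation y = 3x, split into shards.
data = [(x, 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):
    w = parallel_step(w, shards)
print(round(w, 2))  # converges toward 3.0
```

Because each shard's gradient depends only on its own slice of the data, the expensive part of every step can run on separate hardware in parallel.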

The same neural network setup can also be repurposed to serve a similar problem. You don't have to design features ahead of time; the network adapts.

Forget feature engineering

Traditional machine learning required humans to slice up the content and extract features from the raw data. Deep learning networks can do this on their own, with far less manual input from data scientists. This can save months of work and therefore resources.
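A minimal sketch of what such hand-crafted feature extraction looks like in classical ML – the specific features chosen here are invented for illustration:

```python
def handcrafted_features(text):
    """Classical ML: a human decides which properties of the raw data
    might matter and codes them up explicitly before any model is trained."""
    words = text.lower().split()
    return {
        "n_words": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "has_exclamation": int("!" in text),
    }

# Traditional pipeline: raw data -> hand-crafted features -> classifier.
print(handcrafted_features("Great product, works perfectly!"))

# Deep learning pipeline: the raw text (suitably encoded) goes straight
# into the network, and the hidden layers learn their own features.
```

Every feature in the dictionary above is a human guess about what matters; deep learning replaces those guesses with representations learned from the data itself.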

When the algorithm is allowed to run its own discovery process, the results can be surprising: some of the hidden links it finds are not obviously logical, yet they are discoverable and exciting.

The obstacles for deep learning algorithms

Although useful, deep learning algorithms have some particularities which often make them a second choice. To work accurately, they require vast amounts of training and calibration data. And since they behave as black boxes, they need particular attention during training and fine-tuning. Even with all necessary precautions, deep learning can be affected by overfitting – learning the training data in too much detail and getting lost in it instead of generalizing.

Beating the black box

One of the most severe criticisms that machine learning, and deep learning in particular, faces relates to its black-box behavior. Information goes in, the artificial neurons process it, and results come out. There is no traceability of how this happens, and no way to influence the process in small increments.

This puzzles engineers and data scientists who are used to making slight variations in the input, measuring the results, and drawing conclusions. In the case of deep learning, this can only be achieved by retraining the network – sometimes with surprising results.

Loads of data

Since every slight change requires retraining on a sizable data set, the total amount of information needed to build a capable deep learning algorithm can become prohibitive for the project. Furthermore, there is no mathematical formula that tells you in advance how much data will be enough to generate satisfactory results. It's a matter of iterating until consistent outcomes appear often enough.
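One common way to iterate is sketched below: keep enlarging the training set until the held-out accuracy stops improving meaningfully. The accuracy curve and the thresholds are simulated and invented for illustration; in practice each call would mean an actual training run:

```python
def validation_accuracy(n_samples):
    """Stand-in for 'train on n_samples, then measure held-out accuracy'.
    Real accuracy curves typically rise steeply at first, then flatten."""
    return 0.95 - 0.9 / (n_samples ** 0.5)

n, prev = 100, 0.0
while True:
    acc = validation_accuracy(n)
    if acc - prev < 0.005:  # improvement too small to justify more data
        break
    prev, n = acc, n * 2
print(n, round(prev, 3))  # data size at which gains became negligible
```

Doubling the data set at each step keeps the number of expensive training runs logarithmic in the final data size.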

Neural networks require even more data than generic ML, since they first need to explore the problem, build a pattern of the solution, and then test new data against that template, looking for a match.

For example, to learn the difference between a chihuahua and a muffin, the computer needs thousands of pictures of both, taken from many different angles, in different lighting, and at varying levels of detail. Even then, there is no guarantee that the next picture set presented for evaluation will be classified successfully.

The danger of overfitting

Just like a child who gets lost in the details of a book but doesn't remember the story, an algorithm can pay so much attention to details that it misses the big picture. Since there is no way of pre-determining what the neurons will focus on, you could end up with a model that is incapable of performing its intended task. To catch this problem, monitor the model's accuracy on held-out data after each training epoch, and stop training once that accuracy stops improving.
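The stopping rule described above is commonly known as early stopping. A minimal sketch, assuming you have a list of per-epoch validation accuracies (the numbers below are invented):

```python
def early_stop(accuracies, patience=2):
    """Return the epoch index to roll back to: once validation accuracy
    has failed to improve for `patience` consecutive epochs, the model
    is likely starting to overfit the training data."""
    best, best_epoch, waited = 0.0, 0, 0
    for epoch, acc in enumerate(accuracies):
        if acc > best:
            best, best_epoch, waited = acc, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch
    return best_epoch

# Validation accuracy per epoch: rises, peaks, then degrades (overfitting).
history = [0.61, 0.74, 0.82, 0.86, 0.85, 0.84, 0.83]
print(early_stop(history))  # stops at the best epoch: 3
```

The `patience` parameter guards against stopping on a single noisy epoch; most deep learning frameworks offer a built-in variant of this mechanism.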

 

Further reading