
The Predictions About AI Development in 2018 Were All Correct. Now It's Time to Set Goals for 2019.


By Super Neuro

“We make five predictions for data science, machine learning, and artificial intelligence in 2019. We also take a look back at last year’s predictions to see which ones actually came true.”

By William Vorhies  

Editorial Director, DataScienceCentral; President and Chief Data Scientist, Data-Magnum; Chairman, EB5C

A quick review of last year’s forecast 

Prediction 1: Model building and data preparation will become more automated. Larger data science operations will consolidate onto a single platform. Fewer data scientists will be needed to do the same amount of work.

That was right: with end-to-end integration on advanced analytics platforms, no-code data science is on the rise.

Prediction 2: Data science will move towards specialization, and the “full stack” data scientist will no longer exist. 

Also right. Data engineers now attract more attention than data scientists, because their skills map more directly onto day-to-day needs.

Prediction 3: Non-data scientists will perform a greater volume of increasingly complex analytics than data scientists.

Indeed. As evidenced by the popularity of data visualization and visual analytics tools, non-data scientists are extracting more and more value from sophisticated data science tooling.

Prediction 4: Deep learning is hard. Few data scientists are proficient in it, and that will limit the application of artificial intelligence unless deep learning platforms become simpler and more productized.

Both Microsoft and Google have launched automated deep learning platforms, starting with transfer learning and moving toward full AutoDL (automated deep learning). There are also integrated AutoDL platforms: OneClick.ai, for example, offers a complete AutoML and AutoDL platform, and Gartner recently named DimensionalMechanics, which has its own AutoDL platform, one of its "top five trend companies".

Prediction 5: Despite the hype, the penetration of AI and deep learning and the breadth and depth of their impact on the market will be disappointing.

Aside from the rapid rise of chatbots, AI applications remain quite limited. AI has been overhyped, and the share of companies actually using AI and machine learning in production is still small.

Prediction 6: The public (and governments) will begin to seriously examine the social and privacy implications of AI. 

Looking at some of the actions of the US government and the European Union, we can see that these issues are being taken seriously; examples include California's upcoming privacy regulations and Australia's anti-encryption legislation.

Unsurprisingly, all six of last year's predictions proved correct. While some of them still hold this year, we'll try to make some more specific predictions for 2019.

Predictions for 2019 

Prediction 1: Data will become more important than algorithms 

There have been no major breakthroughs in deep learning or classical machine learning algorithms for over a year. Temporal convolutional networks (TCNs) offer a modest improvement over RNNs for reducing latency in NLP, but nothing fundamentally new has appeared. Good algorithms are either already well known or can be discovered with automated machine learning.

At this stage, having a large amount of high-quality data is the key to a company's digital transformation. This has also led to competition and opportunities in data provision solutions, which can be roughly divided into the following directions.

The first direction is obtaining accurately labeled training data. Companies like Figure Eight, which specialize in data labeling, are promoting cost-effective, intelligent labeling strategies. For example, active learning selects the examples whose labels will improve the model the most, balancing labeling cost against model accuracy.
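
To make the active-learning idea concrete, here is a minimal sketch of uncertainty sampling, one common strategy, with scikit-learn and synthetic data standing in for a real labeling pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a real dataset with an expensive labeling step.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
seed_X, seed_y = X[:100], y[:100]   # small hand-labeled seed set
pool_X = X[100:]                    # large unlabeled pool

model = LogisticRegression(max_iter=1000).fit(seed_X, seed_y)

# Uncertainty sampling: send only the examples the model is least sure
# about to human annotators, rather than labeling everything.
probs = model.predict_proba(pool_X)
uncertainty = 1.0 - probs.max(axis=1)          # low top-class prob = unsure
budget = 50
to_label = np.argsort(uncertainty)[-budget:]   # the examples worth labeling next
print(f"queueing {len(to_label)} examples for annotation")
```

In a real loop, the newly labeled examples are added to the training set and the model is retrained, repeating until the labeling budget runs out.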

The second direction is access to third-party data. Service companies like DymstData have entered this field as clearing houses for hundreds of additional data sources. They also take on the role of protecting sensitive personally identifiable information (PII): role-based access controls determine who may see which sensitive fields, which is particularly important in financial and medical services.
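
DymstData's actual interface isn't documented here, but the role-based PII idea itself is simple to illustrate; the roles and field names below are hypothetical:

```python
# Toy illustration of role-based access to PII fields; the roles and
# field names are hypothetical, not any vendor's actual schema.
SENSITIVE_FIELDS = {"ssn", "diagnosis", "account_number"}

ROLE_PERMISSIONS = {
    "analyst": set(),                         # no PII access at all
    "compliance": {"ssn", "account_number"},
    "physician": {"diagnosis"},
}

def redact(record: dict, role: str) -> dict:
    """Return a copy of the record with fields this role may not see masked."""
    allowed = ROLE_PERMISSIONS.get(role, set())
    return {
        key: (value if key not in SENSITIVE_FIELDS or key in allowed else "***")
        for key, value in record.items()
    }

patient = {"name": "J. Doe", "ssn": "123-45-6789", "diagnosis": "flu"}
print(redact(patient, "analyst"))    # ssn and diagnosis both masked
print(redact(patient, "physician"))  # diagnosis visible, ssn masked
```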

The third direction is automatically tracking and recording the provenance of the data used in a model. Especially when streaming data from multiple sources is merged and changes in real time, it is important to know where each piece of data came from and how it was used. Tibco and other analytics platforms are integrating this capability.
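
What such provenance tracking involves can be sketched in a few lines; the metadata schema here is hypothetical, not Tibco's:

```python
import hashlib
import json
import time

def tag_with_lineage(record: dict, source: str) -> dict:
    """Attach provenance metadata to a streaming record so any downstream
    model score can be traced back to the data that produced it."""
    payload = json.dumps(record, sort_keys=True).encode()
    return {
        **record,
        "_lineage": {
            "source": source,
            "ingested_at": time.time(),
            "checksum": hashlib.sha256(payload).hexdigest()[:16],
        },
    }

tick = tag_with_lineage({"symbol": "ACME", "price": 101.5}, source="exchange_feed_a")
print(tick["_lineage"])
```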

Prediction 2: Everything will get easier as AI/ML moves from general analytics platforms to industry- or process-specific applications.

Looking at AI/ML startups, competition is shifting toward industry- or process-specific applications. These apps and mini-platforms focus on solving problems specific to businesses such as marketing, B2B sales, healthcare, and fintech, among other well-defined segments.

These new applications embed AI/ML, so that the enterprises adopting them do not need the support of a large internal group of data scientists; they rely on the app developers instead.

Some call this the commoditization of AI/ML, but it’s more accurately called the professionalization of AI/ML. 

Such a transformation resembles the shift from process reengineering to enterprise resource planning (ERP) in the late 1990s. Back then, reengineering called for companies to improve their processes with complex, custom-developed IT solutions, which opened the door to the big integrated ERP and CRM vendors such as Oracle, PeopleSoft, and SAP.

These new vendors will all strive to provide broad solutions within their specific markets, and will inevitably come to resemble smaller, niche ERP platforms.

Also keep an eye on mid-sized and smaller companies that lack large data science teams and fully custom-developed models: these industry-specific apps will accelerate their adoption of AI/ML.

Prediction 3: The rise of data engineers and data analysts 

It's not that the world has given up on data scientists. Rather, when certain skills are scarce, the market fills the gap in different ways.

One way to do this is through the industry- and process-specific intelligence applications discussed above, which do not require a large number of in-house data scientists.

The second approach is the rapid emergence of automated machine learning (AML) platforms, which make data science more efficient, meaning fewer data scientists can do the same amount of work.
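
To give a flavor of what these platforms automate, here is the core idea in miniature: a search over candidate models and hyperparameters, with scikit-learn standing in for a commercial AML product:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# An AML platform searches over models, features, and hyperparameters
# automatically; this small grid search captures the same idea.
candidates = [
    (LogisticRegression(max_iter=5000), {"C": [0.1, 1.0, 10.0]}),
    (RandomForestClassifier(random_state=0), {"n_estimators": [100, 300]}),
]
best = max(
    (GridSearchCV(est, grid, cv=5).fit(X, y) for est, grid in candidates),
    key=lambda search: search.best_score_,
)
print(best.best_estimator_, round(best.best_score_, 3))
```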

Since the number of models in production is increasing rather than decreasing, the burden of work will shift to data engineers, who bring two complementary skills.

First, they build the infrastructure that data science requires, such as data lakes and Spark instances.

Second, they take the finished models, deploy them into operational systems, and track their accuracy and refresh cycles.

Some data engineers are also responsible for data ops, ensuring that incoming data streams are cleaned and pre-processed.
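
As one example, a routine data-ops task might look like this PySpark sketch; the paths and column names are illustrative only:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataops_clean").getOrCreate()

# Hypothetical raw event feed; a data engineer owns this kind of
# cleaning step so the models downstream see tidy, typed data.
raw = spark.read.json("s3://example-bucket/events/")
clean = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("timestamp").isNotNull())
       .withColumn("amount", F.col("amount").cast("double"))
)
clean.write.mode("overwrite").parquet("s3://example-bucket/events_clean/")
```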

Another trend in analytics platforms is the development of visual analytics and data visualization tools. Most of these tools are now fully integrated with data science tool sets, allowing data analysts and line-of-business (LOB) managers to extract more value and even guide analytical work. They will not replace data scientists, but they strengthen the role of advanced analytics in the team.

Prediction 4: Neuromorphic chips will move AI and IoT to the forefront

Two different technologies are reaching semi-maturity at the same time, and both attack a long-standing problem: latency.

For example, when you translate a piece of text or an image on your mobile device, the device sends the signal to the translation cloud, the translation happens there, and the result travels back to your device. Every round trip adds latency.

Google and other instant-translation services have moved from RNNs to a specialized CNN structure, the temporal convolutional network (TCN), because RNNs do not map well onto massively parallel processing (MPP) hardware, while CNNs do. The switch reduces latency, but the round trip to the cloud remains.
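
The core building block of a TCN is a dilated causal convolution, which sees only past inputs but processes the whole sequence in parallel. Here is a minimal PyTorch sketch of the idea (an illustration, not Google's production model):

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Dilated causal convolution: each output depends only on past inputs,
    yet the whole sequence is computed in one parallel pass, unlike an RNN."""

    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation   # left-pad so no future leaks in
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = nn.functional.pad(x, (self.pad, 0))   # pad on the left only
        return self.conv(x)

# One parallel pass over a 100-step sequence; an RNN would need 100 sequential steps.
seq = torch.randn(8, 64, 100)                     # (batch, channels, time)
out = CausalConv1d(64, dilation=2)(seq)
print(out.shape)                                  # torch.Size([8, 64, 100])
```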

One technology that will solve this problem is 5G networks. Everyone knows that 5G will be fast, but its real benefit is that it can carry more traffic.

The second solution is new neuromorphic chips, which implement spiking neural networks. This architecture may be a key to achieving artificial general intelligence (AGI), but that will take some time.

Currently, major chip manufacturers and several startups are developing dedicated chips for spiking neural networks, optimized for CNNs and RNNs, and some of them also optimized for extremely low power consumption.

Together, these capabilities will move deep learning onto chips at the edge of the network, and starting this year we'll see how that ignites the Internet of Things and other streaming-data applications.

Prediction 5: Different AI frameworks will merge with each other

Text, speech, image, and video models have become mainstream, but they still face a major obstacle: a model built in one framework (such as Caffe2, PyTorch, Apache MXNet, Microsoft Cognitive Toolkit, or TensorFlow) cannot easily be ported to another.

Fortunately, this has spurred innovation, such as the Open Neural Network Exchange (ONNX) format, backed by AWS, Facebook, and Microsoft, which makes models interoperable across frameworks.
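
In practice, exporting a model to ONNX takes only a few lines; for example, from PyTorch (the small model here is just a placeholder):

```python
import torch
import torch.nn as nn

# A placeholder PyTorch model, exported to ONNX so other runtimes
# (ONNX Runtime, MXNet, CNTK, etc.) can load and run it.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

dummy_input = torch.randn(1, 10)   # an example input fixes the graph's shapes
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["features"], output_names=["logits"],
)
```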

As the number of models shared between developers, applications, and devices increases, ONNX will be a key technology this year. 

Those are our predictions for this year. Next year we will look back and see whether the field turned out differently than we thought.