
Big data, machine learning, and predictive analytics offer numerous advantages; however, data science remains a touchy subject among businesses of all sizes. Many businesses are still reluctant to adopt the related systems, and even those that do often lag when it comes to using the information they collect properly. This is a serious problem: poor-quality data costs businesses, organizations, and governments up to $3.1 trillion a year. Worse, almost 15 percent of marketers claim not to understand what big data is or how to use it, which points to a general lack of knowledge about big data and data science. With that said, here are some of the most common misconceptions about data science.


Data Science Equals Big Data

It is not uncommon to see the terms “data science” and “big data” used interchangeably. In fact, one could argue that the “big data revolution” provided the platform for what is now called data science. Despite the origins of the confusion, however, big data and data science are very different. Big data involves the collection, management, and processing of incredibly large amounts of data. The idea goes well beyond the 1s and 0s, which is why it is more often characterized by the “Three Vs”: volume, variety, and velocity. Data science, on the other hand, covers everything from mining, transforming, modeling, and storing data to exploring and analyzing it, as well as building models and algorithms around it and visualizing and interacting with the results. Big data is best seen as one aspect of data science: it describes the situation in which the data involved is characterized by one or more of the Three Vs.


Machines Learn

Many of us understand machine learning without even realizing it. A simple linear regression is a form of machine learning: a supervised learning algorithm, where each observation given to the algorithm includes both the independent and dependent variables. By providing the algorithm with the “correct answers” in advance, you can build a model that predicts the answers for new observations. The key point in this example, however, is that the machine learns the relationship by being taught. There is a misconception that you can simply feed data into a computer and it will magically pop out useful answers. For most people, correcting this misconception comes down to having realistic, grounded expectations. You can go a long way by understanding the basics of machine learning so that you can appreciate its strengths, weaknesses, and limitations.
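The supervised-learning idea above can be sketched in a few lines. This is a minimal illustration, not a production example: the toy data and the true relationship (y = 2x + 1) are made up, and ordinary least squares stands in for the "teaching" step.

```python
import numpy as np

# Labeled training observations: each x comes with its "correct answer" y.
# The relationship y = 2x + 1 is a made-up example.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0

# "Teach" the model: ordinary least squares learns the slope and intercept
# from the labeled examples (independent variable + dependent variable).
A = np.column_stack([X, np.ones_like(X)])
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

# The learned model can now predict the answer for a new, unseen observation.
prediction = slope * 10.0 + intercept
```

Nothing magical happens here: the model only recovers the pattern because it was shown input-output pairs; fed inputs alone, it would have nothing to learn from.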