So why is this a damnation, besides the facts that “big data” is not new and the term “data scientist” is bullsh!t? Let’s list a few reasons:
- it’s unnecessary confusion for the sole purpose of
  - fear marketing and
  - labour cost inflation.
Let’s take each of these one by one.
Big Data is NOT new
We’ve always had more data than we could fit in memory, or even on a hard disk. When the doctor was getting his degrees back in the early 90s, he was focussing on (multi-dimensional) data structures, algorithms, and computational geometry, which are the fundamental computer science and mathematics theories that underlie databases, analytics, and optimization. When he was designing structures and algorithms to process this data efficiently and effectively, and studying their ability to scale, he regularly ran into one of two problems: not enough memory to hold all of the data he wanted (to study large scale applications), or not enough disk space to store everything he wanted (and support swap files). Physicists, (GIS) engineers, operations researchers, etc. always had (access to) more data than they could work with at any one time, or even fit on a single machine. Nothing has changed. Yes, it is true that we can collect data faster than ever, with so many devices with microprocessors and onboard memory collecting data every second, but it’s also true that hard drives and memory have scaled. Back in the day, the doctor’s first PC had a 10 GB hard drive and 1 MB of RAM (and that was a lot). Now, the doctor’s three-year-old mid-range laptop has 8 GB of RAM and a 500 GB hard drive (and the average server has 256 GB, or more, of RAM and a few terabytes of hard drive space).
“Data” Scientist is a bullsh!t term
Pardon my language, but what the hell is a “data” scientist? Don’t you understand that every scientist is a “data” scientist? All scientists collect data, analyze data, interpret data, make hypotheses on data, and collect more data to test those hypotheses. All scientists do this, no exceptions. Some, such as statisticians or computer scientists, focus more on data analysis and interpretation than others, but they are not “data” scientists. They are statisticians or computer scientists.
Unnecessary Confusion
Pretty much everyone who uses “big data” or “data scientist” is using the term in a way that does its best to confuse you to the point where you feel stupid and ask them for help. Help which will cost a small fortune.
Fear Marketing
Most use of the terms is designed not only to confuse you, but to instill as much fear as possible. It is designed to make you feel like you don’t understand it but your competitors do and, moreover, that if you don’t figure it out fast, your competitors are going to use their understanding to derive insights that they will then use to steal your customers and your market share. So you better pay a big fistful of cash to take a ride on the “big data” bandwagon fast, or risk being stranded on the side of a desert road with no horse, no water, and no map.
Labour Cost Inflation
The providers who are driving the “big data” bandwagons are using their confusion-based fear marketing to drive unnecessary demand for their analysts (which they are calling “data scientists”). When they are successful, they rapidly reduce the availability of those resources (beyond normal utilization), and the perceived lack of supply lets them unrealistically inflate prices. Net effect: you pay more for resources you may not even need!
In short, not only is “big data” an eternal damnation (and one that children of the 80s would proclaim originated on Eternia), but it is a damnation that will be in your face day in and day out until the providers find a new fear-driven bandwagon to thrust upon you.