Big data – Yes, it is still a big deal
Big data is going through Gartner’s “trough of disillusionment”. Endless articles argue that big data is a fad, doesn’t matter or is failing.
Yes, many companies’ initial big data projects will not meet expectations. Yes, big data consultants will “overcharge” inexperienced customers. And yes, the bearded hipster with a PhD in applied mathematics, wearing skinny suit pants and carrying a steampunk laptop, will continue to annoy.
These are teething problems. Enterprise adoption of new technologies is always difficult – how many Fortune 1000 companies are still struggling with SAP upgrades years later? In fact, big data has had a colossal impact in a number of ways:
“Shared nothing architecture” has radically dropped the cost of storage. Big data file stores and databases use a “shared nothing architecture”, allowing a cluster of low cost “PC” boxes to replace extremely expensive NAS and SAN storage. In addition to saving many millions of dollars on hardware, many big data databases are open source, eliminating the need for expensive Oracle licenses.
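To make the idea concrete, here is a minimal sketch of how a shared nothing cluster might route data, assuming simple hash partitioning. The `Node` and `Cluster` classes and the `box-N` names are hypothetical, invented for illustration; real systems typically add replication and consistent hashing on top of this.

```python
import hashlib


class Node:
    """One low-cost box: it owns its local storage and shares nothing."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # stands in for the node's own local disk


class Cluster:
    """Routes each key to exactly one node (hypothetical, no replication)."""

    def __init__(self, node_names):
        self.nodes = [Node(n) for n in node_names]

    def _owner(self, key):
        # A stable hash means the same key always lands on the same node.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._owner(key).store[key] = value

    def get(self, key):
        return self._owner(key).store.get(key)


cluster = Cluster(["box-1", "box-2", "box-3"])
for i in range(1000):
    cluster.put(f"order:{i}", {"amount": i})

# Each node holds only its own slice of the data.
for node in cluster.nodes:
    print(node.name, len(node.store))
```

Because no node depends on shared disks or shared memory, capacity in this sketch grows by adding more boxes to the list rather than by buying a bigger storage array.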
The infrastructure Google, Yahoo, Facebook, Twitter and Amazon invented to handle web scale is available for free to startups. Google et al. had to re-invent database and storage technology to meet web scale requirements cost-effectively, investing thousands of man-years to create the big data stack (MapReduce, BigTable, DynamoDB, etc.). Startups can now tackle web scale problems using all of this investment for free.
A lot of new applications rely on “sparse” data, which cannot be stored efficiently in a traditional, relational database but is easily handled by big data databases. Complicated, but very important. In a traditional, relational database application, say order entry or general ledger, most if not all fields are completed for a given transaction. A field is rarely skipped, and when it is, the database records it as “Null”. Now let’s think of a new application, say fitness tracking or monitoring social networks. In these applications, the fields are not necessarily even defined, let alone complete. If these applications were implemented in a traditional, relational database, too many fields would be recorded as “Null”, swamping the database before any meaningful data is recorded. Big data technologies handle sparse data with ease (they don’t write the “Nulls”), which enables billions of events to be captured in near real-time.
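A small sketch may help here. It compares a fixed-schema row, where every declared column gets a slot, with a sparse record that keeps only the fields actually present. The column names and the fitness-tracking events are made up for illustration, and no particular database's storage format is implied.

```python
# Hypothetical fixed schema, declared up front as a relational table would be.
COLUMNS = ["user_id", "timestamp", "steps", "heart_rate",
           "sleep_minutes", "calories", "gps_lat", "gps_lon"]


def to_dense_row(event):
    """Relational-style row: one slot per column, None (Null) when missing."""
    return [event.get(col) for col in COLUMNS]


def to_sparse_record(event):
    """Sparse-style record: only fields that actually have a value are kept."""
    return {k: v for k, v in event.items() if v is not None}


# Made-up events: each device reports a different subset of fields.
events = [
    {"user_id": 1, "timestamp": 1000, "steps": 42},
    {"user_id": 2, "timestamp": 1001, "heart_rate": 71},
    {"user_id": 1, "timestamp": 1002, "gps_lat": 48.85, "gps_lon": 2.35},
]

dense = [to_dense_row(e) for e in events]
sparse = [to_sparse_record(e) for e in events]

null_count = sum(cell is None for row in dense for cell in row)
dense_cells = sum(len(row) for row in dense)
sparse_cells = sum(len(rec) for rec in sparse)

print(f"dense: {dense_cells} cells, {null_count} of them Null")
print(f"sparse: {sparse_cells} cells, none of them Null")
```

With only eight columns, more than half of the dense cells are already Null. In an application with hundreds of possible fields per event, that proportion only gets worse, which is the problem the sparse layout avoids.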
The above are just a few examples of the changes big data architecture has already brought about. But the biggest change is how much big data architecture has leveled the playing field for startups, enabling small development teams to tackle web scale problems. As startups as diverse as WhatsApp and Criteo have demonstrated, solving web scale problems can be lucrative and result in $1bn+ exits.