The Unsung Hero of Big Data
Earlier this week I was reading a blog post regarding the recent Gartner Hype Cycle for Advanced Analytics and Data Science, 2015. The Gartner chart reminded me of the epigram, “Plus ça change, plus c’est la même chose” (asserting that history repeats itself by stating the more things change, the more they stay the same.)
To some extent that is true, as you could consider today’s Big Data as derivative of yesterday’s VLDBs (very large databases) and Data Warehouses. One of the biggest changes IMO is the shift away from Star Schemas and practices implemented for performance reasons such as aggregation of data sets, use of derived and encoded values, the use of surrogate and foreign keys to establish linkage, etc. Going forward it may not be possible to have that much rigidity and be as responsive as needed from a competitive perspective.
There are many dimensions to big data: Huge sample of data (volume), which becomes your universal set and supports deep analysis as well as temporal and spatial analysis; A variety of data (structured and unstructured) that often does not lend itself to SQL based analytics; and often data streaming in (velocity) from multiple sources – an area that will become even more important in the era of the Internet of Things. These are the “Three V’s” that people have been talking about for the past five years.
Like many people, my interest in Object Database technology initially waned in the late 1990’s. That is, until about four years ago when a project at work led me back in this direction. As I dug into the various products I learned that they were alive and doing very well in several niche areas. That finding led to a better understanding of the real value of object databases.
Some products try to be, “All Vs to all people,” but generally what works best is a complementary, integrated set of tools working together as services within a single platform. It makes a lot of sense. So, back to object databases.
One of the things I like most about my job is the business development aspect. One of the product families I’m responsible for is Versant. With the Versant Object Database (VOD – high performance, high throughput, high concurrency) and Fast Objects (great for embedded applications). I’ve met and worked with some brilliant people who have created amazing products based on this technology. Creative people like these are fun to work with, and helping them grow their business is mutually beneficial. Everyone wins.
An area where VOD excels is with the near real-time processing of streaming data. The reason it is so adept to this task is the way that object map out in the database. They do so in a way that essentially mirrors reality. So, optionality is not a problem – no disjoint queries or missed data, no complex query gyrations to get the correct data set, etc. Things like sparse indexing are no problem with VOD. This means that pattern matching is quick and easy, as well as more traditional rule and look-up validation. Polymorphism allows objects, functions, and even data to have more than one form.
VOD does more by allowing data to be more, which is ideal for environments where change is the norm. Cyber Security, Fraud Detection, Threat Detection, Logistics, and Heuristic Load Optimization. In each case, the key to success is performance, accuracy, and adaptability.
The ubiquity of devices generating data today, combined with the desire for people and companies to leverage that data for commercial and non-commercial benefit, is very different than what we saw 10+ years ago. Products like VOD are working their way up that Slope of Enlightenment because there is a need to connect the dots better and faster – especially as the volume and variety of those dots increases. It is not a, “one size fits all” solution, but it is often the perfect tool for this type of work.
These are indeed exciting times!