Month: August 2015

The Unsung Hero of Big Data


Earlier this week, I read a blog post regarding the recent Gartner Hype Cycle for Advanced Analytics and Data Science, 2015. The Gartner chart reminded me of the epigram, “Plus ça change, plus c’est la même chose” (the more things change, the more they stay the same), which is another way of saying that history repeats itself.

To some extent, that is true, as you could consider today’s Big Data a derivative of yesterday’s VLDBs (very large databases) and Data Warehouses. One of the biggest changes, IMO, is the shift away from Star Schemas and from practices implemented for performance reasons, such as aggregating data sets, using derived and encoded values, and using surrogate and foreign keys to establish linkage. Going forward, it may not be possible to keep that much rigidity and still be as responsive as needed from a competitive perspective.

There are many dimensions to big data: a huge sample of data (volume), which becomes your universal set and supports deep analysis as well as temporal and spatial analysis; a variety of data (structured and unstructured) that often does not lend itself to SQL-based analytics; and data streaming in (velocity) from multiple sources, an area that will become even more important in the era of the Internet of Things. These are the “Three V’s” people have talked about for the past five years.

As it did for many people, my interest in Object Database technology waned in the late 1990s. That is, until about four years ago, when a project at work led me back in this direction. As I dug into the various products, I learned they were alive and doing well in several niche areas. That finding led to a better understanding of the real value of object databases.

Some products try to be “All Vs to all people,” but generally, what works best is a complementary, integrated set of tools working together as services within a single platform. It makes a lot of sense. So, back to object databases.

One of the things I like most about my job is the business development aspect. One of the product families I’m responsible for is Versant, which includes the Versant Object Database (VOD: high performance, high throughput, high concurrency) and Fast Objects (great for embedded applications). Through these products, I’ve met and worked with brilliant people who have created amazing products based on this technology. Creative people like these are fun to work with, and helping them grow their business is mutually beneficial. Everyone wins.

An area where VOD excels is the near real-time processing of streaming data. The reason it is so adept at this task is the way objects are mapped out in the database: they are stored in a way that essentially mirrors reality. So, optionality is not a problem. There are no disjoint queries or missed data, and no complex query gyrations to get the correct data set. Things like sparse indexing are no problem with VOD. This means that pattern matching is quick and easy, as is more traditional rule and look-up validation. Polymorphism allows objects, functions, and even data to have multiple forms.
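To make the polymorphism and optionality points a little more concrete, here is a minimal sketch in plain Python. It does not use the Versant API at all; the `Event`, `LoginEvent`, and `PaymentEvent` classes, the `flag` helper, and the thresholds are hypothetical names and values I made up for illustration. The idea is simply that each kind of object carries its own optional attributes and its own rule, so one loop can check a mixed stream without special-casing every type.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Event:
    """Base type for anything arriving on the stream (hypothetical model)."""
    source: str

    def is_suspicious(self) -> bool:
        # Default rule: nothing to flag.
        return False


@dataclass
class LoginEvent(Event):
    user: str
    failed_attempts: int = 0
    geo: Optional[str] = None  # optional attribute; may simply be absent

    def is_suspicious(self) -> bool:
        # Illustrative threshold, not taken from any real product.
        return self.failed_attempts >= 3


@dataclass
class PaymentEvent(Event):
    amount: float
    currency: str = "USD"

    def is_suspicious(self) -> bool:
        # Illustrative threshold, not taken from any real product.
        return self.amount > 10_000


def flag(stream: Iterable[Event]) -> Iterator[Event]:
    """Yield events whose own type-specific rule fires."""
    for event in stream:
        if event.is_suspicious():  # polymorphic dispatch per object type
            yield event
```

Feeding `flag` a mixed list of logins and payments returns only the events that trip their own rule, and adding a new event type means adding a class rather than reworking a schema or a query.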

Image of globe with network of connected dots in the space above it.

VOD does more by allowing data to be more, which is ideal for environments where change is the norm. Examples include Cyber Security, Fraud Detection, Threat Detection, Logistics, and Heuristic Load Optimization. In each case, performance, accuracy, and adaptability are the keys to success.

The ubiquity of devices generating data today, combined with the desire of people and companies to leverage that data for commercial and non-commercial benefit, is very different from what we saw 10+ years ago. Products like VOD are working their way up that Slope of Enlightenment because there is a need to connect the dots better and faster, especially as the volume and variety of those dots increase. It is not a “one size fits all” solution, but it is often the perfect tool for this type of work.

These are indeed exciting times!