NoSQL
The Unsung Hero of Big Data
Earlier this week I was reading a blog post regarding the recent Gartner Hype Cycle for Advanced Analytics and Data Science, 2015. The Gartner chart reminded me of the epigram, “Plus ça change, plus c’est la même chose” (the more things change, the more they stay the same).
To some extent that is true, as you could consider today’s Big Data a derivative of yesterday’s VLDBs (very large databases) and Data Warehouses. One of the biggest changes, IMO, is the shift away from Star Schemas and from practices implemented for performance reasons, such as aggregating data sets, using derived and encoded values, and using surrogate and foreign keys to establish linkage. Going forward, that much rigidity may make it hard to stay as responsive as the competitive landscape demands.
There are many dimensions to big data:
- A huge sample of data (volume), which becomes your universal set and supports deep analysis as well as temporal and spatial analysis;
- A variety of data (structured and unstructured) that often does not lend itself to SQL-based analytics;
- Data often streaming in (velocity) from multiple sources, an area that will become even more important in the era of the Internet of Things.
These are the “Three V’s” that people have been talking about for the past five years.
Like many people, my interest in Object Database technology waned in the late 1990s. That is, until about four years ago, when a project at work led me back in this direction. As I dug into the various products, I learned that they were alive and doing very well in several niche areas. That finding led to a better understanding of the real value of object databases.
Some products try to be “all Vs to all people,” but generally what works best is a complementary, integrated set of tools working together as services within a single platform. It makes a lot of sense. So, back to object databases.
One of the things I like most about my job is the business development aspect. One of the product families I’m responsible for is Versant, which includes the Versant Object Database (VOD: high performance, high throughput, high concurrency) and FastObjects (great for embedded applications). I’ve met and worked with some brilliant people who have created amazing products based on this technology. Creative people like these are fun to work with, and helping them grow their business is mutually beneficial. Everyone wins.
An area where VOD excels is near real-time processing of streaming data. The reason it is so well suited to this task is the way objects map out in the database: they are stored in a way that essentially mirrors reality. So optionality is not a problem (no disjoint queries or missed data, no complex query gyrations to get the correct data set), and things like sparse indexing are no problem with VOD. This means that pattern matching is quick and easy, as is more traditional rule and look-up validation. Polymorphism allows objects, functions, and even data to take more than one form.
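To make that a little more concrete, here is a rough sketch in plain Python (this is not the VOD API; the event classes, fields, and thresholds are my own, purely for illustration) of what an object model that “mirrors reality” looks like, and why optionality and polymorphism make pattern matching over a mixed stream so straightforward:

```python
# A plain-Python sketch of the object-model idea: classes mirror the domain,
# optional attributes are simply absent rather than NULL columns, and
# polymorphism lets one rule scan a mixed stream of event types.
# Illustrative only; this does not use the actual Versant/VOD API.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Event:
    source: str
    def is_suspicious(self) -> bool:        # each subclass supplies its own test
        return False

@dataclass
class LoginEvent(Event):
    user: str = ""
    failed_attempts: int = 0
    geo_location: Optional[str] = None      # optional: present only when known
    def is_suspicious(self) -> bool:
        return self.failed_attempts >= 3

@dataclass
class PaymentEvent(Event):
    amount: float = 0.0
    flags: List[str] = field(default_factory=list)
    def is_suspicious(self) -> bool:
        return self.amount > 10_000 or "velocity" in self.flags

stream = [
    LoginEvent(source="vpn", user="alice", failed_attempts=5),
    PaymentEvent(source="web", amount=12_500.0),
    LoginEvent(source="office", user="bob", failed_attempts=0, geo_location="Berlin"),
]
# One polymorphic pass over heterogeneous objects: no joins, no NULL handling.
print([e for e in stream if e.is_suspicious()])
```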
VOD does more by allowing data to be more, which is ideal for environments where change is the norm: cyber security, fraud detection, threat detection, logistics, and heuristic load optimization. In each case, the keys to success are performance, accuracy, and adaptability.
The ubiquity of devices generating data today, combined with the desire of people and companies to leverage that data for commercial and non-commercial benefit, is very different from what we saw 10+ years ago. Products like VOD are working their way up that Slope of Enlightenment because there is a need to connect the dots better and faster, especially as the volume and variety of those dots increase. It is not a “one size fits all” solution, but it is often the perfect tool for this type of work.
These are indeed exciting times!
The Future of Smart Interfaces
Recently I was helping one of my children research a topic for a school paper. She was doing well, but the results she was getting were overly broad. So, I taught her some “Google-Fu,” explaining how you can structure queries in ways that yield better results (for example, quoting exact phrases, excluding terms with a minus sign, or restricting results to a single site). She replied that search engines should be smarter than that. I explained that part of the problem is that search engines look at your past searches and customize results, either to appear smarter or to nudge you toward doing or believing something.
Unfortunately, those results can be skewed and potentially lead someone in the wrong direction. It was a good reminder that getting the best results from search engines often requires a bit of skill and query planning, as well as occasional third-party validation.
Then the other day I saw this commercial from Motel 6 (“Gas Station Trouble”) where a man has trouble getting good results from his smartphone. That reminded me of watching someone speak to his phone and grow increasingly frustrated with the responses. His question went something like this:
“Siri, I want to take my wife to dinner tonight, someplace that is not too far away, and not too late. And she likes to have a view while eating so please look for something with a nice view. Oh, and we don’t want Italian food because we just had that last night.”
Just as amazing as the question itself was watching him ask it over and over again in exactly the same way, becoming more frustrated each time. I asked myself, “Are smartphones making us dumber?” Instead of dwelling on that question, I began to think about what future smart interfaces could be like.
I grew up watching Sci-Fi computer interfaces like “Computer” from Star Trek (1966), “HAL” from 2001: A Space Odyssey (1968), “KITT” from Knight Rider (1982), and “Samantha” from Her (2013). These interfaces had a few things in common:
- They responded to verbal commands;
- They were interactive – not just providing answers, but also asking qualifying questions and allowing for interrupts to drill-down or enhance the search (e.g., with pictures or questions that resembled verbal Venn diagrams);
- They often provided suggestions for alternate queries based on intuition. That would have been helpful for the gentleman trying to find a restaurant.
Despite 50 years of science fiction examples, we are still a long way from realizing the goal of a truly intelligent interface. Like many new technologies, these interfaces were envisioned by science fiction writers long before they began to appear in the real world.
There seems to be a spectrum of common beliefs about modern interfaces. On one end are products that make visualization easy, facilitating understanding, refinement, and drill-down of data sets; Tableau is a great example of this type of easy-to-use interface. At the other end of the spectrum the emphasis is on back-end systems: robust platforms that digest huge volumes of data and return results for complex queries within seconds, and several vendors offer powerful analytics platforms of this kind. In reality, you need both a strong front-end and a strong back-end to achieve the full potential of either.
But, there is so much more potential…
I predict that within the next three to five years we will see business and consumer interfaces (powered by Natural Language Processing, or NLP) that are closer to the verbal interfaces from those familiar Sci-Fi shows (albeit with limited capabilities and no flashing lights).
Within the next 10 years I believe we will have computer interfaces that intuit our needs and facilitate the generation of correct answers quickly and easily. While this is unlikely to be at the level of “The world’s first intelligent Operating System” envisioned in the movie “Her,” and probably won’t even be able to read lips like “HAL,” it should be much more like HAL and KITT than like Siri (from Apple) or Cortana (from Microsoft).
Siri was groundbreaking consumer technology when it was introduced. Cortana seems to have taken a small leap ahead. While I have not mentioned Google Now, it is somewhat of a latecomer to this consumer smart interface party, and in my opinion is behind both Siri and Cortana.
So, what will this future smart interface do? It will need to be very powerful, pairing a natural language interface on the front-end with an extremely flexible and robust analytics engine on the back-end. The language interface will need to take an ordinary question (in multiple languages and dialects), just as if you were asking a person, deconstruct it using Natural Language Processing, and build the proper query against the available data. That is important, but it only gets you so far.
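To illustrate that “deconstruct the question” step, here is a toy Python sketch. The class, the defaults for “not too far” and “not too late,” and the keyword rules are all my own assumptions; a real system would use trained language models rather than keyword matching, but the goal is the same: turn a spoken request into a structured query.

```python
# Toy slot-extraction sketch: turn a spoken request into a structured query.
# Everything here (RestaurantQuery, the defaults, the keyword rules) is
# hypothetical; real NLP would rely on trained models, not keyword matching.
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class RestaurantQuery:
    max_distance_km: Optional[float] = None
    latest_time: Optional[str] = None
    required_features: List[str] = field(default_factory=list)
    excluded_cuisines: List[str] = field(default_factory=list)

def parse_request(text: str) -> RestaurantQuery:
    """Extract simple constraints ("slots") from a natural-language request."""
    q = RestaurantQuery()
    lowered = text.lower()
    if "not too far" in lowered:
        q.max_distance_km = 10.0        # assumed meaning of "not too far"
    if "not too late" in lowered:
        q.latest_time = "20:00"         # assumed meaning of "not too late"
    if "view" in lowered:
        q.required_features.append("view")
    for cuisine in ("italian", "mexican", "thai"):
        if re.search(rf"(don't|do not) want {cuisine}", lowered):
            q.excluded_cuisines.append(cuisine)
    return q

request = ("I want to take my wife to dinner tonight, someplace that is not too "
           "far away, and not too late. She likes to have a view while eating, "
           "and we don't want Italian food because we just had that last night.")
print(parse_request(request))
# RestaurantQuery(max_distance_km=10.0, latest_time='20:00',
#                 required_features=['view'], excluded_cuisines=['italian'])
```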
Data will come from many sources: the kinds of things we handle today with relational, object, graph, and NoSQL databases. There will be structured and unstructured data that must be joined and filtered quickly and accurately. In addition, context will be more important than ever. Pictures and videos could be scanned for faces and location (via geotagging), and videos could also have their speech analyzed. Relationships will be identified and inferred from a variety of sources, using both data and metadata. Sensors will collect data from almost everything we do and (someday) wear, which will provide both content and context.
Stylometry will identify outside content likely related to the people involved in the query and provide further context about interests, activities, and even biases. This is how future interfaces will truly understand (not just interpret), intuit (so they can determine what you really want to know), and then present results that may be far more accurate than we are used to today. Because the interface is interactive in nature, it will make it quick and easy to organize and analyze subsets of data.
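For anyone unfamiliar with stylometry, here is a minimal sketch of the underlying idea: writing style can be compared by the relative frequency of common function words. The word list, sample texts, and similarity measure below are my own illustrative choices, not a description of any particular product.

```python
# Minimal stylometry sketch: compare two texts by the relative frequency of
# common "function words", a classic authorship-attribution signal.
# The word list and the sample texts are illustrative assumptions only.
from collections import Counter
import math

FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "is", "was", "it"]

def style_vector(text: str) -> list:
    """Relative frequency of each function word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

known_writing = "The report was finished and it was sent to the client in a hurry."
candidate = "It was clear that the memo was written in a hurry and sent to the team."
print(round(cosine_similarity(style_vector(known_writing), style_vector(candidate)), 3))
```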
So, where do I think that this technology will originate? I believe that it will be adapted from video game technology. Video games have consistently pushed the envelope over the years, helping drive the need for higher bandwidth I/O capabilities in devices and networks, better and faster graphics capabilities, and larger and faster storage (which ultimately led to flash memory and even Hadoop). Animation has become very lifelike and games are becoming more responsive to audio commands. It is not a stretch of the imagination to believe that this is where the next generation of smart interfaces will be found (instead of from the evolution of current smart interfaces).
Someday it may no longer be possible to “tweak” results through the use or omission of keywords, quotation marks, and flags. It may also no longer be necessary to understand specialized query languages and syntax (SQL, SPARQL, the various NoSQL dialects, etc.). We won’t have to worry as much about incorrect joins, spurious correlations, and biased result sets. Instead, we will be given the answers we need, even if we don’t realize that this was what we needed in the first place. At that point computer systems may appear nearly omniscient.
When that happens, parents will no longer need to teach their children “Google-Fu.” Those are going to be interesting times indeed.