Recently I was helping one of my children research a topic for a school paper. She was doing well, but the results she was getting were overly broad. So, I taught her some “Google-Fu,” explaining how you can structure queries in ways that yield better results. She commented that the searches should be smarter than that, and I explained that sometimes the problem is that search engines look at your past searches and customize results as an attempt to appear smarter. Unfortunately, those results can be skewed and potentially lead someone in the wrong direction. It was a good reminder that getting the best results from search engines often requires a bit of skill and query planning.
Then the other day I saw a Motel 6 commercial (“Gas Station Trouble”) in which a man has trouble getting good results from his smartphone. That reminded me of watching someone speak to his phone and grow frustrated with the responses. His questions went something like this: “Siri, I want to take my wife to dinner tonight, someplace that is not too far away, and not too late. And she likes to have a view while eating so please look for something with a nice view. Oh, and we don’t want Italian food because we just had that last night.” Just as amazing as the question itself was watching him ask it over and over in exactly the same way, becoming more frustrated each time. I asked myself, “Are smartphones making us dumber?” Instead of contemplating that question, I began to think about what future smart interfaces could be like.
I grew up watching Sci-Fi computer interfaces like “Computer” on Star Trek (1966), “HAL” in 2001: A Space Odyssey (1968), “KITT” from Knight Rider (1982), and “Samantha” from Her (2013). These interfaces had a few things in common: they responded to verbal commands; they were interactive, not just providing answers but also asking qualifying questions and allowing interruptions to drill down or refine the search (e.g., with pictures or questions that resembled verbal Venn diagrams); and they often suggested alternate queries based on intuition. Like many new technologies, they were envisioned by science fiction writers long before anything like them existed in practice. Despite roughly 50 years of science fiction examples, we are still a long way from realizing that vision.
There seems to be a spectrum of common beliefs about modern interfaces. At one end are products that make visualization easy, facilitating understanding, refinement, and drill-down of data sets. Tableau is a great example of this type of easy-to-use interface. At the other end, the emphasis is on back-end systems: robust computer systems that digest huge volumes of data and return the results of complex queries within seconds. The Actian Analytics Platform is a great example of a powerful analytics platform. In practice, you need both to realize the full potential of either.
But, there is so much more to be done. I predict that within the next 3 – 5 years we will see business and consumer examples that are closer to the verbal interfaces from those familiar Sci-Fi shows (albeit with limited capabilities and no flashing lights). Within the next 10 years I believe we will have computer interfaces that intuit our needs and facilitate generating the correct answers quickly and easily. While this is unlikely to be at the level of “The world’s first intelligent Operating System” envisioned in the movie “Her,” and probably won’t even be able to read lips like “HAL,” it should be much more like HAL and KITT than like Siri (from Apple) or Cortana (from Microsoft). Siri was groundbreaking consumer technology when it was introduced. Cortana seems to have taken a small leap ahead. While I have not mentioned Google Now, it is somewhat of a latecomer to this consumer smart interface party, and in my opinion is behind both Siri and Cortana.
So, what will this future smart interface do? It will need to be very powerful, combining a natural language interface on the front end with an extremely flexible and robust analytics interface on the back end. The language interface will need to take a question phrased just as if you were asking a person (in multiple languages and dialects), deconstruct it using Natural Language Processing (NLP), and develop the proper query based on the available data. That is important, but it only gets you so far.
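To make the "deconstruct the question, then build a query" idea concrete, here is a minimal sketch in Python. It is not real NLP: it uses a few hand-written keyword rules, and the `DinnerQuery` schema, the `CUISINES` vocabulary, and the `parse_request` function are all hypothetical names invented for this illustration. A real system would replace the rules with a trained language model, but the shape of the output (structured filters a back end can act on) would be similar.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DinnerQuery:
    """Structured filters pulled out of a spoken request (hypothetical schema)."""
    meal: Optional[str] = None
    max_distance: Optional[str] = None
    features: List[str] = field(default_factory=list)
    excluded_cuisines: List[str] = field(default_factory=list)


# Toy vocabulary for illustration only; a real system would not hard-code this.
CUISINES = ["italian", "mexican", "thai", "indian", "chinese"]


def parse_request(text: str) -> DinnerQuery:
    """Turn a free-form request into structured filters using keyword rules."""
    # Normalize case and curly apostrophes so "don’t" and "don't" match alike.
    lowered = text.lower().replace("’", "'")
    query = DinnerQuery()

    if "dinner" in lowered:
        query.meal = "dinner"
    if re.search(r"not too far|nearby|close by", lowered):
        query.max_distance = "near"
    if "view" in lowered:
        query.features.append("view")

    # Treat a cuisine as excluded when a negation appears within two words of it.
    for cuisine in CUISINES:
        pattern = rf"\b(?:don't|do not|no|not)\s+(?:\w+\s+){{0,2}}{cuisine}\b"
        if re.search(pattern, lowered):
            query.excluded_cuisines.append(cuisine)

    return query


if __name__ == "__main__":
    request = (
        "I want to take my wife to dinner tonight, someplace that is not too "
        "far away. She likes a nice view. We don't want Italian food."
    )
    print(parse_request(request))
```

Even this toy version shows why the long spoken request in the commercial example is hard: every clause has to be mapped to a filter, and negations ("we don't want Italian") must be kept distinct from preferences.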
Data will come from many sources, including the relational, object, and graph databases we use today. There will be structured and unstructured data that must be joined and filtered quickly and accurately. In addition, context will be more important than ever. Pictures and videos could be scanned for faces (via facial recognition) and location (via geotagging), and videos could also be analyzed for speech. Relationships will be identified and inferred based on a variety of sources, using both data and metadata. Sensors will collect data from almost everything we do and (someday) wear, which will provide both content and context. The use of stylometry will identify outside content likely related to the people involved in the query and provide further context about interests, activities, and even biases. This is how future interfaces will truly understand (not just interpret) a request, intuit what you really want to know, and then present results that may be far more accurate than we are used to today. Because the interface is interactive, it will also make it quick and easy to organize and analyze subsets of data.
So, where do I think that this technology will originate? I believe that it will be adapted from video game technology. Video games have consistently pushed the envelope over the years, helping drive the need for higher bandwidth I/O capabilities in devices and networks, better and faster graphics capabilities, and larger and faster storage (which ultimately led to flash memory and even Hadoop). Animation has become very lifelike and games are becoming more responsive to audio commands. It is not a stretch of the imagination to believe that this is where the next generation of smart interfaces will be found (instead of from the evolution of current smart interfaces).
Someday it may no longer be possible to “tweak” results through the use or omission of keywords, quotation marks, and flags. Additionally, it may no longer be necessary to understand special query languages (SQL, NoSQL, SPARQL, etc.) and syntax. We won’t have to worry as much about incorrect joins, spurious correlations and biased result sets. Instead, we will be given the answers we need – even if we don’t realize that this was what we needed in the first place. At that point computer systems may appear nearly omniscient.
When this happens, parents will no longer need to teach their children “Google-Fu.” Those are going to be interesting times indeed.
Being in Sales I have the opportunity to speak to a lot of customers and prospects about many things. Most are interested in both Cloud Computing and Big Data, but often they don’t fully understand how they will leverage the technology to maximize the benefit. There is a simple three-step process that I use:
1. Explain that there is no single correct answer. There are still many definitions, so it is more important to focus on what you need than on what you call it.
2. Relate the technology to something people are likely already familiar with (extending those concepts). For example: Cloud computing is similar to virtualization, and has many of the same benefits; Big Data is similar to data warehousing.
3. Provide a high-level explanation of how “new and old” are different. For example: Cloud computing often occurs in an external data center, possibly one whose location you do not even know, so security is even more complex and important than with in-house systems and applications; Big Data often uses data that is not from your environment, possibly even data whose value you cannot yet judge, so robust integration tools are very important.
Big Data is a little bit like my first house. I was newly married and anticipated having children and moving into a larger house in the future. My wife and I started buying things that fit into our vision of the future and storing them in our basement. We were planning for a future that was not 100% known.
But, our vision changed over time and we did not know exactly what we needed until the very end. After 7 years our basement was very full and it was difficult to find things. When we moved to a bigger house we did have a lot of what we needed. We also had things in storage that we no longer wanted or needed. And, there were a few things we wished that we had purchased earlier. We did our best, and most of what we did was beneficial.
Five years ago, how many of you would have thought that Social Media Sentiment Analysis would become important? How many would have thought that hashtag usage would become so pervasive in all forms of media? How many understood the importance of location information (and even the time stamp for that location)? My guess is not many.
This ambiguity is both the good and the bad of Big Data. In the old data warehouse days you knew what was important because it was your data about your business, systems, and customers. While IT may have seemed tough before, it can be much more challenging now. But the payoff can also be much larger, so it is worth the effort.
Now we care about unstructured data (website information, blog posts, press releases, tweets, etc.), streaming data (stock ticker data is a common example), sensor data (temperature, altitude, humidity, location, lateral and horizontal forces – think logistics), etc. So, you are getting data from multiple sources having multiple time frame references (e.g., constant streaming versus hourly updates), often in an unknown or inconsistent format. Many times you don’t know what you don’t know – and you just need to accept that.
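As a rough illustration of what "multiple sources, multiple time frames, inconsistent formats" means in practice, the sketch below normalizes records from different feeds into one common shape. Everything here is an assumption made for the example: the field names (`ts`, `time`, `timestamp`), the helper names, and the choice to treat timestamps without an offset as UTC. Records with no usable timestamp are kept rather than discarded, in the spirit of accepting that you don't always know what you don't know.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Field names that different (hypothetical) sources might use for a timestamp.
TIMESTAMP_KEYS = ("timestamp", "ts", "time")


def normalize_timestamp(value: Any) -> Optional[datetime]:
    """Convert epoch seconds or an ISO 8601 string to UTC; None if unparseable."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        try:
            parsed = datetime.fromisoformat(value)
        except ValueError:
            return None
        # Assumption: a timestamp without an offset is treated as UTC.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    return None


def normalize_record(source: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a raw record in a common envelope: source, UTC timestamp, payload."""
    payload = dict(raw)  # copy so the caller's record is left untouched
    timestamp = None
    for key in TIMESTAMP_KEYS:
        if key in payload:
            timestamp = normalize_timestamp(payload.pop(key))
            break
    return {"source": source, "timestamp": timestamp, "payload": payload}


if __name__ == "__main__":
    feeds = [
        ("sensor", {"ts": 1400000000, "temperature": 21.5}),
        ("ticker", {"time": "2014-05-01T12:00:00+02:00", "price": 10.25}),
        ("blog", {"body": "Unstructured text with no timestamp at all"}),
    ]
    for source, raw in feeds:
        print(normalize_record(source, raw))
```

The design choice worth noting is that the payload is passed through untouched: the envelope imposes just enough structure to join and order records, without deciding up front which fields will turn out to matter.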
In a future post I will discuss scenarios that take advantage of Big Data, and why allowing some ambiguity and uncertainty in your model could be one of the best things that you have ever done. But for now take a look at the links below for more basic information:
This article discusses why Big Data matters, and how you can get value without needing complex analytics.
This Big Data article discusses the importance of taking action quickly to gain a competitive advantage. Note: Free registration to the site may be required to view this article.
This article (Big Data is the Tower of Babel) discusses the importance of data integration.
This short article discusses three important considerations for a Big Data project. While correct, the first point is really the key when getting started.
This is a good high-level article on Hadoop 2.0. Remember how I described the basement in my first house? That’s how Hadoop is utilized in many cases.