To Measure is to Know
William Thomson, Lord Kelvin, was a pretty smart guy who lived in the 1800s. He didn’t get everything right (for example, he supposedly stated, “X-rays will prove to be a hoax.”), but his success ratio was far better than most, so his insight is worth taking seriously. I’m a fan of his quote, “If you can not measure it, you can not improve it.”
Business Intelligence (BI) systems can be very powerful, but only when embraced as a catalyst for change. What you often find in practice is that the systems are not actively used, do not track the “right” metrics (i.e., ones that highlight something important – ideally something leading – that you can actually adjust to impact the results), or deliver the right information only after it is too late to make a difference.
The goal of any business is to develop a profitable business model and execute extremely well. So, you need to have something people want, deliver high-quality goods and/or services, and finally make sure you can do that profitably (it’s amazing how many businesses fail to understand this last part). Developing a systematic approach that allows for repeatable success is extremely important. Pricing at a competitive level with a healthy profit margin provides the means for sustainable growth.
Every business is systemic in nature. Outputs from one area (such as a steady flow of qualified leads from Marketing) become inputs to another (Sales). Closed deals feed project teams, development teams, support teams, etc. Great jobs by those teams will generate referrals, expansion, and other growth – and the cycle continues. This is an important concept because problems or deficiencies in one area can negatively affect others.
Next, the understanding of cause and effect is important. For example, if your website is not getting traffic, is it because of poor search engine optimization or bad messaging and/or presentation? If people visit your website but don’t stay long, do you know what they are doing? Some formatting is better for printing than reading on a screen (such as multi-column pages), so people tend to print and go. And external links that do not open in a new window can hurt the “stickiness” of a website. Cause and effect are not always as simple as they seem, but having data on as many areas as possible will help you identify which ones are important.
When I had my company, we gathered metrics on everything. We even had “efficiency factors” for every Consultant, which helped with estimating, pricing, and scheduling. We would break work down into repeatable components for estimating purposes. Over time we found that our estimates ranged between 4% under and 5% over the actual time required for nearly every work package within a project. This allowed us to bid fixed-price projects profitably, which in turn created confidence for new customers. Our pricing was lean (we usually came in about the middle of the pack on price, but a critical difference was that we could guarantee delivery at that price). More importantly, it allowed us to maintain a healthy profit margin to hire the best people, treat them well, invest in our business, and create sustainable profitability.
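To make that kind of tracking concrete, here is a minimal sketch of estimate-versus-actual accounting in Python. The work packages, hours, and the way the efficiency factor is derived are illustrative assumptions, not our actual model.

```python
# Illustrative work packages: (name, estimated hours, actual hours) -- numbers are made up.
work_packages = [
    ("Data model design", 40, 42),
    ("ETL development", 80, 77),
    ("Report buildout", 60, 63),
]

for name, estimated, actual in work_packages:
    variance_pct = (actual - estimated) / estimated * 100
    print(f"{name}: estimated {estimated}h, actual {actual}h, variance {variance_pct:+.1f}%")

# A per-consultant "efficiency factor" could simply be total actual / total estimated time
# on completed packages, then used to adjust future estimates, pricing, and schedules.
efficiency_factor = sum(a for _, _, a in work_packages) / sum(e for _, e, _ in work_packages)
print(f"Efficiency factor: {efficiency_factor:.2f}")
```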
There are many standard metrics for all aspects of a business. Getting started can be as simple as creating sample data based on estimates, “working the model” with that data, and seeing if this provides additional insight into business processes. Then ask, “When and where could I have made a change to positively impact the results?” Keep working until you have something that seems to work, then gather real data and validate (or fix) the model. You don’t need fancy dashboards (yet). When getting started, it is best to focus on the data, not the flash.
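As a sketch of what “working the model” with estimated data might look like before any tooling exists, the few lines below walk a hypothetical lead-to-deal funnel; every stage name and number is invented for illustration.

```python
# Toy funnel model built from estimated (not real) numbers.
funnel = [
    ("site_visits", 5000),
    ("qualified_leads", 150),
    ("proposals", 30),
    ("closed_deals", 6),
]

for (stage, count), (next_stage, next_count) in zip(funnel, funnel[1:]):
    print(f"{stage} -> {next_stage}: {next_count / count:.1%} conversion")

# "Working the model" means asking where a small change would have mattered most,
# e.g. what closed_deals becomes if the proposal-to-close rate improves from 20% to 25%.
print("closed deals at a 25% close rate:", int(30 * 0.25))
```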
Within a few days, it is often possible to identify and validate the Key Performance Indicators (KPIs) that are most relevant to your business. Then start consistently gathering data, systematically analyzing it, and presenting it in a way that is easy to understand and drill into in a timely manner. To measure the right things really is to know.
Spurious Correlations – What they are and Why they Matter
In an earlier post, I mentioned that one of the big benefits of geospatial technology is its ability to show connections between complex and often disparate data sets. As you work with Big Data, you tend to see the value of these multi-layered and often multi-dimensional perspectives of a trend or event. While that can lead to incredible results, it can also lead to spurious data correlations.
First, let me state that I am not a Data Scientist or Statistician, and there are definitely people far more expert on this topic than I am. But if you are like the majority of companies out there experimenting with geospatial and big data, it is likely that your company doesn’t have these experts on staff. So a little awareness, understanding, and caution can go a long way in this scenario.
Before we dig into that more, let’s think about what your goal is:
- Do you want to be able to identify and understand a particular trend – reinforcing actions and/or behavior? –OR–
- Do you want to understand what triggers a specific event – initiating a specific behavior?
Both are important, but they are different. My focus has been on identifying trends so that you can leverage or exploit them for commercial gain. While that may sound a bit ominous, it is really what business is all about.
A popular saying goes, “Correlation does not imply causation.” A common example is that you may see many fire trucks for a large fire. There is a correlation, but it does not imply that fire trucks cause fires. Now, extending this analogy, let’s assume that the probability of a fire starting in a multi-tenant building in a major city is relatively high. Since it is a big city, it is likely that most of those apartments or condos have WiFi hotspots. A spurious correlation would be to imply that WiFi hotspots cause fires.
As you can see, there is definitely the potential to misunderstand the results of correlated data. A more logical analysis would lead you to see the relationships between the type of building (multi-tenant residential housing) and technology (WiFi) or income (middle-class or higher). Taking the next step to understand the findings, rather than accepting them at face value, is very important.
Once you have what looks to be an interesting correlation, there are many fun and interesting things you can do to validate, refine, or refute your hypothesis. It is likely that even without high-caliber data experts and specialists, you will be able to identify correlations and trends that can provide you and your company with a competitive advantage. Don’t let the potential complexity become an excuse for not getting started. As you can see, gaining insight and creating value with a little effort and simple analysis is possible.
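To show how a confounder produces exactly this kind of spurious correlation, here is a small simulation of the WiFi/fires example; the numbers are invented and only meant to illustrate the pattern, not to model real buildings.

```python
# Small simulation of a confounder: building size drives BOTH WiFi hotspots and fire
# incidents, so hotspots and fires correlate even though neither causes the other.
# Uses statistics.correlation, which requires Python 3.10+.
import random
import statistics

random.seed(1)
units = [random.randint(5, 200) for _ in range(500)]        # apartments per building (confounder)
hotspots = [u + random.gauss(0, 5) for u in units]          # roughly one hotspot per unit
fires = [0.01 * u + random.gauss(0, 0.3) for u in units]    # incidents scale with building size

print("hotspots vs fires:", round(statistics.correlation(hotspots, fires), 2))
print("units vs fires:   ", round(statistics.correlation(units, fires), 2))
# Both correlations come out strong, but only the second reflects the actual driver.
```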
What’s the prize if I win?
In consulting and in business, there is a tendency to believe that if you show someone how to find that proverbial “pot of gold at the end of the rainbow,” they will be motivated to go after it. Seasoned professionals tend to ask, “What problem are you trying to solve?” to understand whether there is a real opportunity. If you cannot quickly, clearly, and concisely articulate the problem, and why your proposal helps solve it, it is often game over then and there (N.B. it pays to be prepared). But having the right answer is not a guarantee of moving forward.
Unfortunately, sometimes a mere pot of gold just isn’t enough to motivate. Sometimes it takes something different, and usually something personal. It’s more, “What’s in this for me?” No, I am not talking about bribes, kickbacks, or anything illegal or unethical. This is about determining what is really important to the decision maker and in what priority, and then demonstrating that the proposed solution will bring them closer to achieving their personal goals. What’s in it for them?
Case in point: several years ago, I was trying to sell a packaged Business Intelligence (BI) system developed on our database platform to customers most likely to have a need. Qualification performed – check. Interested – check. Proof of value – check. Quick ROI – check. Close the deal – not so fast…
This application was a set of dashboards with 150-200 predefined KPIs (key performance indicators). The premise was that you could quickly tailor and deploy the new BI system with little risk (the biggest risk was confirming that the data needed to support each KPI was actually available, but that could be identified up front) and at about half the cost of a typical comparable implementation. Who wouldn’t want one?
I spent several days onsite with the prospect, identified areas of concern and opportunity, and used their data to quantify the potential benefit. Before the end of the week, I was able to show the potential to get an 8x ROI in the first year. Remember, this was estimated using their data, not figures I just created. Being somewhat conservative, I suggested that even half that amount would be a big success. Look – we found the pot of gold!
Despite this, the deal never closed. This company had a lot of money, and this CIO had a huge budget. Saving $500K+ would be nice but was not essential. What I learned later was that this person was pushing forward an initiative of his own that was highly visible. This new system had the potential to become a distraction, and he did not need that. Had I made this determination sooner, I could have easily repositioned it to align with his agenda.
For example, the focus of the system could have shifted from financial savings to project and risk management for his higher-priority initiative. The KPIs could have covered earned value, scheduling, and deliverables. This probably would have sold, as it would have been far more appealing to this CIO and supported what was important to him (i.e., his prize if he wins). The additional financial savings initially identified would have been icing on the cake, to be applied later.
There were several lessons learned from this effort. In this instance, I focused on my personal pot of gold (based on logic and common sense) rather than on my customer’s priorities and prize for winning. That mistake cost me this deal, but it is one I have not made since – helping me win many other deals.
What’s so special about Spatial?
Two years ago, I was assigned some of the product management responsibilities and product marketing work for a new version of a database product we were releasing. To me, this was the trifecta of bad fortune. I didn’t mind product marketing, but I knew it took a lot of work to do well. I didn’t feel that product management was a real challenge (I was so wrong here), and even though we saw more demand for products supporting Esri’s ArcGIS, I wasn’t interested in working with maps.
I was so wrong in so many ways. I didn’t realize real product management was as much work as product marketing. And I learned that geospatial was far more than just maps. It was quite an eye-opening experience for me – one that also turned out to be very valuable.
First, let me start by saying that I now greatly appreciate Cartography. I never realized how complex mapmaking is and how there is just as much art as science (a lot like programming). Maps can be so much more than just simple drawings.
I had a great teacher when it came to geospatial – Tyler Mitchell (@spatialguru). He showed me the power of overlaying tabular business data with common spatial data (addresses, zip / postal codes, coordinates) and presenting the “conglomeration of data” in layers that made things easier to understand. I believe that “people buy easy,” which makes this a good thing in my book.
The more I thought about this technology – simple points, lines, and areas combined with powerful functions – the more I began to think about other uses. I realized that you could use it to correlate very different data sets and graphically show relationships that would otherwise be extremely difficult to see.
For example, think about having access to population data, demographic data, business and housing data, crime data, health/disease data, etc. Now, consider a simple, easy-to-use graphical dashboard that overlays as many data sets as needed. Within seconds, you see very specific clusters of geographically correlated data, which may bring attention to other correlations.
Some data may only be granular to a zip code or city, but other data will allow you to identify patterns on specific streets and in specific neighborhoods. Just think of how something so simple can help you make much better decisions. It’s interesting how few businesses take advantage of this cost-effective technology.
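As a rough sketch of what the overlay itself can look like in code, the example below joins tabular business data to zone polygons with GeoPandas; the file names and the zone_name/revenue columns are hypothetical, and the same idea applies to whatever layers you actually have.

```python
# Sketch only -- assumes GeoPandas is installed and uses hypothetical file/column names.
import geopandas as gpd
import pandas as pd

# Tabular business data with plain coordinates (e.g., geocoded customer addresses).
customers = pd.read_csv("customers.csv")          # hypothetical columns: id, revenue, lon, lat
customers = gpd.GeoDataFrame(
    customers,
    geometry=gpd.points_from_xy(customers.lon, customers.lat),
    crs="EPSG:4326",
)

# A second layer: zip-code or neighborhood polygons carrying demographic attributes.
zones = gpd.read_file("zones.geojson")            # hypothetical file with a zone_name column

# The spatial join is the "overlay": each customer picks up the attributes of the zone it falls in.
joined = gpd.sjoin(customers, zones, how="left", predicate="within")
print(joined.groupby("zone_name")["revenue"].sum().sort_values(ascending=False).head())
```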
If that wasn’t enough, think about location-aware applications and the proliferation of smart devices and IoT, which lend themselves to many helpful and lucrative mobile applications. Even more than that, they make those devices more helpful and user-friendly. Just think about how easy it is to find the nearest Indian restaurant when the thought of curry for lunch hits you. And these things are just the tip of the iceberg.
What a lucky day for me when I was assigned this work that I did not want. Little did I know that it would change my thoughts about many things. That’s just the way things work out sometimes.
My perspective on Big Data
Ever since I worked on redesigning a risk management system at an insurance company (1994-1995), I have been impressed by how much better decisions can be made with more data – assuming it is the right data. The question “What is the right data?” has intrigued me for years, as what seems like common sense today may have been unknown 5-10 years ago and could be completely passé 5-10 years from now. Context becomes very important because of the variability and relevance of data over time.
This is what makes Big Data interesting. There really is no right or wrong answer or definition. Having a framework to define, categorize, and use that data is important. And at some point, being able to refer to the data in context will also be very important. Just think about how challenging it could be to compare scenarios or events from 5 years ago with those of today. It’s likely not an apples-to-apples comparison, but it could certainly be done. The concept of maximizing the value of data is pretty cool stuff.
The way I think of Big Data is similar to a water tributary system. Water enters the system in many ways – rain from the clouds, sprinklers drawing on private and public supplies, runoff, overflow, etc. It also has many interesting dimensions, such as quality/purity (not necessarily the same thing, depending on the need), velocity, depth, capacity, and so forth. Not all water gets into the tributary system (e.g., some is absorbed into the groundwater tables, and some evaporates) – just as some data loss should be anticipated.
If you think of streams, ponds, rivers, lakes, reservoirs, deltas, etc., many relevant analogies can be made. And just like the course of a river may change over time, data in our “big data” water tributary system could also change over time.
Another part of my thinking is based on my experience working on a project for a nanotech company about a decade ago (2002-2003 timeframe). In their labs, they were testing various products. There were particles, embedded in shingles and paint, that changed reflectivity based on temperature. There were very small batteries that could be recharged quickly tens of thousands of times, were light, and had more capacity than a common 12-volt car battery.
And there was a section where they were doing “biometric testing” for the military. I have since read articles about things like smart fabrics that could monitor a soldier’s health, apply basic first aid, and notify others once a problem is detected. This company felt that by 2020, advanced nanotechnology would be widely used by the military, and by 2025, it would be in wide commercial use. Is that still a possibility? Who knows…
Much of what you read today is about the exponential growth of data. I agree with that, but as stated earlier – and this is important – I believe that the nature and sources of that data will change significantly. For example, nanoparticles in engine oil will provide information about temperature, engine speed, load, and even rapid changes in motion (fast take-offs or stops, quick turns). Nanoparticles in the paint will report weather conditions. Nanoparticles in the seat upholstery will provide information about occupants (number, size, weight). Sort of like the “sensor web” from the original Kevin Delin perspective. A lot of Internet of Things (IoT) data will be generated, but then what?
I believe that time will become an essential aspect of every piece of data and that location (X, Y, and Z coordinates) will be just as important. However, not every sensor collects location (spatial) data. I believe multiple data aggregators will be in everyday use at common points (your car, your house, your watch). Those aggregators will package the available data into something akin to an XML object, allowing flexibility. From my perspective, this is where things become very interesting relative to commercial use and data privacy.
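As a sketch of what one of those aggregated packages might look like, the example below bundles a few readings with a timestamp and coordinates; the field names and structure are assumptions rather than any real standard, and JSON stands in for the XML-like object.

```python
# Sketch of a record a local aggregator (car, house, watch) might emit.
# Field names and structure are assumptions, not a real standard; JSON stands in for XML.
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import List, Optional, Tuple

@dataclass
class SensorReading:
    sensor_id: str
    value: float
    unit: str

@dataclass
class AggregatedPacket:
    timestamp: str                                   # every reading carries time...
    location: Optional[Tuple[float, float, float]]   # ...and (x, y, z) when the source has it
    readings: List[SensorReading] = field(default_factory=list)

packet = AggregatedPacket(
    timestamp=datetime.now(timezone.utc).isoformat(),
    location=(-71.06, 42.36, 12.0),
    readings=[
        SensorReading("oil_temp", 96.5, "C"),
        SensorReading("paint_surface_temp", 18.2, "C"),
        SensorReading("seat_occupancy", 2.0, "count"),
    ],
)
print(json.dumps(asdict(packet), indent=2))
```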
Currently, companies like Google make a lot of money by aggregating data from multiple sources, correlating it with various attributes, and then selling knowledge derived from that data. I believe there will be opportunities for individuals to use “data exchanges” to manage, sell, and directly benefit from their own data. The more interesting their data, the more value it has and the more benefit it provides to the person selling it. This could have a significant economic impact, fostering both the use and expansion of the commercial ecosystems needed to manage this technology’s commercial and privacy aspects, especially as it relates to machine learning.
The next logical step in this vision is “smart everything.” For example, you could buy a shirt that is just a shirt. But you could turn on medical monitoring or refractive heating/cooling for an extra cost. And, if you felt there was a market for extra dimensions of data that could benefit you financially, you could also enable those sensors. Just think of the potential impact that technology would have on commerce in this scenario.
I believe this will happen within the next decade or so. It won’t be the only use of big data; instead, there will be many valid types and uses of data – some complementary and some completely discrete. It has the potential to become a confusing mess. But people will find ways to ingest, categorize, and correlate data to create value – today or in the future.
Knowing how to do something interesting and useful with data will become an increasingly important competitive advantage for people and companies. Who knows what will be viewed as valuable data 5-10 years from now, but it will likely be different from what we view as valuable today.
So, what are your thoughts? Can we predict the future based on the past? Or, is it simply enough to create platforms that are powerful enough, flexible enough, and extensible enough to change our understanding as our perspective of what is important changes? Either way, it will be fun!



