causation
To Measure is to Know
William Thomson, Lord Kelvin, was a pretty smart guy who lived in the 1800s. He didn’t get everything right (e.g., he supposedly stated, “X-rays will prove to be a hoax”), but his success ratio was far better than most, so his insights are worth paying attention to. I’m personally a fan of his quote, “If you cannot measure it, you cannot improve it.”
Business Intelligence (BI) systems can be very powerful, but only when they are embraced as a catalyst for change. What you often find in practice is that the systems are not actively used, or they do not track the “right” metrics (i.e., those that provide insight into something important that you have the ability to adjust in order to change the results), or they provide the right information, only too late to make a difference.
The goal of any business is to develop a profitable business model and then execute it extremely well. So, you need to have something that people want, you need to be able to deliver high-quality goods and/or services, and finally you need to make sure that you can do that profitably (it’s amazing how many businesses fail to understand this last part). Developing a systematic approach that allows for repeatable success is extremely important. Pricing at a level that is competitive and still leaves a healthy profit margin provides the means for growth and sustainability.
Every business is systemic in nature. Outputs from one area (such as a steady flow of qualified leads from Marketing) become inputs to another (Sales). Closed deals feed project teams, development teams, support teams, etc. Great work by those teams will generate referrals, expansion, and other growth, and the cycle continues. This is an important concept to understand because problems or deficiencies in one area can manifest themselves in other areas.
Next, understanding cause and effect is important. For example, if your website is not getting traffic, is it because of poor search engine optimization, or is it bad messaging and/or presentation? If people come to your website but don’t stay long, do you know what they are doing? Some formatting is better for printing than for reading on a screen (such as multi-column pages), so people tend to print and go. And external links that do not open in a new window can hurt the “stickiness” of a website. Cause and effect are not always as simple as they seem, but having data on as many areas as possible will help you understand which ones are really important.
When I had my company, we gathered metrics on everything. We even had “efficiency factors” for every Consultant. That helped with estimating, pricing, and scheduling. We would break work down into repeatable components for estimating purposes. Over time we found that our estimates ranged between 4% under and 5% over the actual time required for nearly every work package within a project. This allowed us to offer fixed-bid projects, which created confidence, and to price them at a level that was lean (we usually came in about the middle of the pack from a price perspective, but the difference was that we could guarantee delivery for that price). More importantly, it allowed us to maintain a healthy profit margin that let us hire the best people, treat them well, invest in our business, and remain profitable over the long term.
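To make the estimating idea concrete, here is a minimal sketch of how estimate accuracy per work package might be tracked. Everything in it is an assumption for illustration: the `WorkPackage` fields, the sample hours, and in particular the definition of an “efficiency factor” as total actual hours divided by total estimated hours. It is not a description of the system we actually used.

```python
from dataclasses import dataclass


@dataclass
class WorkPackage:
    name: str
    estimated_hours: float
    actual_hours: float


def estimate_error_pct(pkg: WorkPackage) -> float:
    """Signed estimate error as a percentage of actual hours.

    Negative means the estimate was under the actual time; positive means over.
    """
    return (pkg.estimated_hours - pkg.actual_hours) * 100 / pkg.actual_hours


def efficiency_factor(history: list[WorkPackage]) -> float:
    """Hypothetical definition: total actual hours / total estimated hours.

    A value above 1.0 suggests future estimates should be scaled up.
    """
    total_estimated = sum(p.estimated_hours for p in history)
    total_actual = sum(p.actual_hours for p in history)
    return total_actual / total_estimated


# Made-up history for one consultant (illustrative numbers only).
history = [
    WorkPackage("Data migration", estimated_hours=96, actual_hours=100),
    WorkPackage("Report build", estimated_hours=42, actual_hours=40),
    WorkPackage("Integration", estimated_hours=80, actual_hours=80),
]

errors = [estimate_error_pct(p) for p in history]
factor = efficiency_factor(history)

print(f"Estimate error range: {min(errors):+.1f}% to {max(errors):+.1f}%")
print(f"Efficiency factor: {factor:.3f}")
```

Multiplying a new baseline estimate by the factor gives an adjusted estimate for that person, and watching the error range over time tells you whether fixed-bid pricing is a safe bet.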
There are many standard metrics for all aspects of a business. Getting started can be as simple as creating some sample data based on estimates, “working the model” with that data, and seeing if this provides additional insight into business processes. Then ask, “When and where could I have made a change to positively impact the results?” Keep working, and when you have something that seems to work, gather some real data and re-work the model. You don’t need fancy dashboards (yet).
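As a rough illustration of “working the model” with sample data, the sketch below runs a toy Marketing-to-Sales funnel and then asks which change would have moved profit the most. The funnel structure, the `monthly_profit` function, the scenario names, and every number are made-up assumptions; a real model would use your own stages and estimates.

```python
def monthly_profit(visitors, lead_rate, close_rate, avg_deal, margin, fixed_costs):
    """Toy funnel: visitors -> leads -> closed deals -> revenue -> profit."""
    leads = visitors * lead_rate
    deals = leads * close_rate
    revenue = deals * avg_deal
    return revenue * margin - fixed_costs


# Sample data based on estimates (all values are illustrative).
baseline = {
    "visitors": 10_000,
    "lead_rate": 0.02,
    "close_rate": 0.10,
    "avg_deal": 25_000,
    "margin": 0.30,
    "fixed_costs": 100_000,
}

# "When and where could I have made a change?" Each scenario overrides one input.
scenarios = {
    "more_visitors": {"visitors": 12_000},      # e.g., better SEO
    "better_messaging": {"lead_rate": 0.025},   # more visitors become leads
    "better_closing": {"close_rate": 0.12},     # more leads become deals
    "lower_fixed_costs": {"fixed_costs": 90_000},
}

base_profit = monthly_profit(**baseline)


def what_if(changes):
    """Change in monthly profit versus the baseline for one scenario."""
    return monthly_profit(**{**baseline, **changes}) - base_profit


impact = {name: what_if(changes) for name, changes in scenarios.items()}

print(f"Baseline profit: {base_profit:,.0f}")
for name, delta in sorted(impact.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {delta:+,.0f}")
```

Once a model like this seems to behave sensibly, replace the estimates with real data and see whether the ranking of scenarios still holds.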
Within a few days, it is often possible to identify the Key Performance Indicators (KPIs) that are most relevant for your business. Then, start consistently gathering data, systematically analyzing it, and presenting it in a timely manner and in a way that is easy to understand and drill into. To measure the right things really is to know.
Spurious Correlations – What They Are and Why They Matter
In an earlier post I mentioned that one of the big benefits of geospatial technology is its ability to show connections between complex and often disparate data sets. As you work with Big Data you tend to see the value of these multi-layered and often multi-dimensional perspectives of a trend or event. While that can lead to incredible results, it can also lead to spurious correlations of data.
First, let me state that I am not a Data Scientist or Statistician, and there are definitely people far more expert on this topic than I am. But if you are like the majority of companies out there experimenting with geospatial and big data, it is likely that your company doesn’t have these experts on staff. So, a little awareness, understanding, and caution can go a long way in this type of scenario.
Before we dig into that more, let’s think about what your goal is:
- Do you want to be able to identify and understand a particular trend – reinforcing actions and/or behavior? –OR–
- Do you want to understand what triggers a specific event – initiating a specific behavior?
Both are important, but they are different. My personal focus has been on identification of trends so that you can leverage or exploit them for commercial gain. While that may sound a bit ominous, it is really what business is all about.
There is a popular saying that goes, “Correlation does not imply causation.” A common example is that for a large fire you may see a large number of fire trucks. There is a correlation, but it does not imply that fire trucks cause fires. Now, extending this analogy, let’s assume that in a major city the probability of multi-tenant buildings catching fire is relatively high. Since it is a big city, it is likely that most of those apartments or condos have WiFi hotspots. Concluding from that correlation that WiFi hotspots cause fires would be a spurious inference.
As you can see, there is definitely potential to misunderstand the results of correlated data. More logical analysis would lead you to see the relationships between the type of building (multi-tenant residential housing) and technology (WiFi) or income (middle-class or higher). Taking the next step to understand the findings, rather than accepting them at face value, is very important.
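The WiFi example can be simulated in a few lines. The sketch below is only an illustration under made-up assumptions: each city block gets a random number of multi-tenant housing units, and that number drives both the hotspot count and the fire count, with no direct link between hotspots and fires. The coefficients, noise levels, and variable names are all invented. It compares the raw correlation with the correlation left over after the housing-unit effect is removed from both variables using a simple least-squares fit.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hidden common cause: multi-tenant housing units per city block (made up).
units = [rng.randint(0, 200) for _ in range(5000)]

# Both outcomes depend on housing units plus noise, not on each other.
hotspots = [0.8 * u + rng.gauss(0, 10) for u in units]
fires = [0.05 * u + rng.gauss(0, 2) for u in units]

raw = pearson(hotspots, fires)
adjusted = pearson(residuals(hotspots, units), residuals(fires, units))

print(f"Raw correlation, hotspots vs. fires: {raw:.2f}")
print(f"After accounting for housing units:  {adjusted:.2f}")
```

The raw figure looks like a strong relationship, while the adjusted figure should sit close to zero. That drop is the signal that a third factor, here the type of building, is doing the real work.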
Once you have what looks to be an interesting correlation there are many fun and interesting things you can do to validate, refine, or refute your hypothesis. It is likely that even without high-caliber data experts and specialists you will be able to identify correlations and trends that can provide you and your company with a competitive advantage. Don’t let the potential complexity become an excuse for not getting started, because as you can see above it is possible to gain insight and create value with a little effort and simple analysis.