My perspective on Big Data

Ever since I worked on redesigning a risk management system at an insurance company (1994-1995), I have been impressed by how much better decisions can be when they are made with more data – assuming it is the right data. The concept of “What is the right data?” has intrigued me for years, as what may seem common sense today could have been unknown 5-10 years ago and could be completely passé 5-10 years from now. Context becomes very important because of the variability and relevance of data over time.

This is what makes Big Data interesting. There really is no right or wrong answer or definition. Having a framework to define, categorize, and use that data is important. And at some point, being able to refer to the data in context will also be very important. Just think about how challenging it could be to compare scenarios or events from 5 years ago with those of today. It’s likely not an apples-to-apples comparison, but it could certainly be done. The concept of maximizing the value of data is pretty cool stuff.

The way I think of Big Data is similar to a water tributary system. Water enters the system in many ways – rain from the clouds, sprinkles from private and public supplies, runoff, overflow, etc.  It also has many interesting dimensions, such as quality/purity (not necessarily the same due to different aspects of need), velocity, depth, capacity, and so forth. Not all water gets into the tributary system (e.g., some is absorbed into the groundwater tables, and some evaporates) – just as some data loss should be anticipated.

Image of the world with a water hose wrapped around it.

If you think of streams, ponds, rivers, lakes, reservoirs, deltas, etc., many relevant analogies can be made. And just like the course of a river may change over time, data in our “big data” water tributary system could also change over time.

Another part of my thinking is based on my experience of working on a project for a Nanotech company about a decade ago (2002 – 2003 timeframe). In their labs, they were testing various products. There were particles that changed reflectivity based on the temperature that were embedded in shingles and paint. There were very small batteries that could be recharged quickly tens of thousands of times, were light, and had more capacity than a common 12-volt car battery.

And there was a section where they were doing “biometric testing” for the military. I have since read articles about things like smart fabrics that could monitor a soldier’s health and apply basic first aid and notify others once a problem is detected.  This company felt that by 2020, advanced nanotechnology would be widely used by the military, and by 2025, it would be in wide commercial use.  Is that still a possibility? Who knows…

Much of what you read today is about the exponential growth of data. I agree with that, but as stated earlier, and this is important, I believe that the nature and sources of that data will change significantly. For example, nanoparticles in engine oil will provide information about temperature, engine speed, load, and even things like rapid changes in movement (fast take-offs or stops, quick turns). The nanoparticles in the paint will provide weather conditions. The nanoparticles in the seat upholstery will provide information about occupants (number, size, weight). Sort of like the “sensor web” from the original Kevin Delin perspective. A lot of “Internet of Things” (IoT) data will be generated, but then what?

I believe that time will become an important aspect of every piece of data and that location (X, Y, and Z coordinates) will be just as important. However, not every sensor will collect location (spatial data). I believe multiple data aggregators will be in common use at common points (your car, your house, your watch). Those aggregators will package the available data in something akin to an XML object, which allows flexibility.  From my perspective, this is where things become very interesting relative to commercial use and data privacy.
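As a rough illustration of that packaging idea, here is a minimal Python sketch of an aggregator that wraps sensor readings in an XML document. Every reading carries a timestamp, while location is optional and can be filled in by the aggregator when the sensor does not report one. The names Reading and package_readings, the element and attribute names, and the fallback-location behavior are all assumptions made for this example, not part of any real aggregator standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional, Tuple

Location = Tuple[float, float, float]  # (x, y, z) coordinates


@dataclass
class Reading:
    """One sensor reading; all field names are hypothetical."""
    sensor: str
    value: float
    timestamp: str                       # ISO 8601 string, always present
    location: Optional[Location] = None  # not every sensor knows where it is


def package_readings(aggregator_id: str,
                     readings: List[Reading],
                     fallback_location: Optional[Location] = None) -> str:
    """Wrap readings in a single XML package.

    If a reading has no location, the aggregator's own location
    (fallback_location) is used when one is available; otherwise the
    <location> element is simply left out.
    """
    root = ET.Element("package", {"aggregator": aggregator_id})
    for r in readings:
        node = ET.SubElement(root, "reading",
                             {"sensor": r.sensor, "time": r.timestamp})
        ET.SubElement(node, "value").text = str(r.value)
        loc = r.location if r.location is not None else fallback_location
        if loc is not None:
            x, y, z = loc
            ET.SubElement(node, "location",
                          {"x": str(x), "y": str(y), "z": str(z)})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    readings = [
        Reading("oil_temp", 96.5, "2024-01-01T12:00:00Z"),
        Reading("gps", 0.0, "2024-01-01T12:00:00Z", (1.0, 2.0, 3.0)),
    ]
    print(package_readings("car-1", readings))
```

Because optional elements can simply be omitted, a consumer of the package can tell which readings have spatial data and which do not, which is the kind of flexibility I have in mind.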

Currently, companies like Google make a lot of money from aggregating data from multiple sources, correlating it to various attributes, and then selling knowledge derived from that plethora of data. I believe there will be opportunities for individuals to use “data exchanges” to manage, sell, and directly benefit from their own data. The more interesting their data, the more value it has and the more benefit it provides to the person selling it. This could have a huge economic impact, and that would foster both the use and expansion of various commercial ecosystems required to manage this technology’s commercial and privacy aspects.

The next logical step in this vision is “smart everything.” For example, you could buy a shirt that is just a shirt. But, you could turn on medical monitoring or refractive heating/cooling for an extra cost. And, if you felt there was a market for extra dimensions of data that could benefit you financially, you could also enable those sensors. Just think of the potential impact that technology would make on commerce in this scenario.

I believe this will happen within the next decade or so. This won’t be the only type of use of big data. Rather, there will be many valid types and uses of data – some complementary and some completely discrete. It has the potential to become a confusing mess. But, people will find ways to ingest, categorize, and correlate data to create value – today or in the future.

Knowing how to do something interesting and useful with data will become an increasingly important competitive advantage for people and companies. Who knows what will be viewed as valuable data 5-10 years from now, but it will likely be different from what we view as valuable data today.

So, what are your thoughts? Can we predict the future based on the past? Or, is it simply enough to create platforms that are powerful enough, flexible enough, and extensible enough to change our understanding as our perspective of what is important changes? Either way, it will be fun!

Why I Love Technology

Technology was not native to me, at least relative to children and young adults today. Simple four-function calculators started becoming popular when I was in Elementary School. I only had a single computer course in High School (it was the only one offered). We had a Timex Sinclair and, later, a Commodore 64 computer at home. It was fun, but I wasn’t hooked yet.

I started a car and motorcycle parts business when I was 18. Initially, I was looking for a way to get cheaper parts for myself and thought if I could make money doing it, then all the better. Nearly everything I did was manual. Then I learned about a Radio Shack TRS-80 at college that had a word processing program. I used that to create mailings to parts companies, distributors, and potential customers. Before long, I had a catalog of products I could sell and a small but loyal customer base buying products and services from me. If QuickBooks had been available back then, I might have kept the business running. Doing everything manually just took too much time. Even so, this was my first technology win, and I liked it.

A few years later, I was programming at a local marketing company. The MIS Director (what IT used to be called) purchased a new relational database product with a 4GL application language. This was in 1987, and this technology was very new. The product was sold as saving “75% of your development time and effort.” Most seasoned people in the group did not want to risk their reputations on something that might not work.

I was new and had nothing to lose, so for the next month, I read every manual cover-to-cover. Before long, I worked on new applications and soon became the in-house RDBMS/4GL expert. This led to a fast track of promotions and being selected to develop the majority of new custom applications sold by our company. It was not easy, but it was fun and good for my career.

My first and arguably most influential mentor was my manager at this job (Jim). He taught me about designing parameter-driven systems that were flexible and extensible. He also taught me that “good enough usually isn’t good enough.” Most people are lucky to have one really good mentor during their career. I’ve been blessed with four of them at different stages of my career. It has motivated me to return the favor and help others whenever possible. This job helped me grow in so many ways.

A few years later, I worked at a software company creating a new standard product on this database platform. Nobody was trained on the product, and most wrote their embedded C / SQL programs like any other 3GL program (i.e., non-transactionally). I pointed out to the VP of Development that this would be a problem. He didn’t want to hear that. I pushed for a concurrency test, and everything locked up. Many people were suddenly upset with me, but the longer you wait to solve problems like these, the more expensive it becomes.

We spent the next two months creating functions to manage transactions, optimizing everything (even table structures to get the best byte alignment), and making this new packaged system work. The VP now liked and respected me, which changed our working dynamics. That shifted the focus from people and personalities to technologies and results.

We also worked on other aspects of the system to enhance performance. We created a system much like Memcached in Perl (back in 1990) that allowed us to handle the workflow of even the fastest warehouses in near real-time. We did many things that were leading-edge at the time (HA clusters with automatic failover, automated restart of remote devices to resume work in progress from the point of failure, outsourcing to India using an X.400 connection that I configured, distributed systems, client/server systems, etc.). I learned a lot from that experience and was proud of the results.

Later, I worked for that database company (Ingres). This was in the heyday of consulting, when projects were huge and rates were high. My first project (started on my second day on the job) was to redesign a Risk Management System at an insurance company that had started using our products. I soon found that the project had been in progress for two years and had binders full of specifications, but nothing was actionable. I did not make many friends those first two weeks, as I pointed these things out.

I offered to facilitate a JAD (joint application design) session with multiple lines of business. This pointed out issues that even they were unaware of and allowed us to begin designing a flexible system that would accommodate all lines of business. We used an agile approach to prototype the new system, ran demonstrations to get buy-in, and moved the project forward quickly. Six months later, the first part of that functionality went live. The system was fully functional within a year!

I had the opportunity to work on some of the largest databases at the time (roughly 300 GB total, which is small by today’s measures), work on leading-edge technology (Clustering, VLDB, and Enterprise Unix systems), and really become a true Consultant along the way (with the help of another mentor – Bill). I was sent to several Unix Internals courses and then worked with our Engineering team to improve our products and create configurations supporting other large companies with similar problems.

A few years later, I worked at a small start-up company that created the world’s first commercial JDBC driver. I had worked with many very smart people before, but now I worked with a couple of truly brilliant people. My main contribution this time was on the business side, but we learned a lot from each other as we grew the business to over $1M in sales within the first year.

One thing that sticks with me is that I became interested in VRML (virtual reality modeling language) during this time. I had an idea (1997) that we could create websites showing the insides of buildings, productize them, and sell them to real estate companies and larger apartment complex owners. My idea was not well received by the team, but a few years later, systems like this were being developed, and a few people were making a lot of money. That taught me to have more faith in ideas based on new technology, regardless of what others thought. It also brought me back to an important concept in Business and Consulting: being able to communicate ideas and benefits in ways that are easy enough for everyone to understand, as opposed to focusing on the technology itself.

Over the years, these lessons learned have helped with BI (business intelligence) – building dashboards using relevant KPIs tailored to the specific audience, mobile computing, cloud computing, IoT, and big data. Most people think these things are “not important until they become important,” often 6 – 12 months (or more) later. From my perspective, the real trick isn’t in trying to understand the next big thing but rather in considering better, easier, and more efficient ways of doing things you do today.

This is why I love technology. It has helped me accomplish many things that have had a tangible impact on the businesses I have worked for and consulted with. It has taught me to think about problems and ideas from various perspectives and to leverage lessons learned in one area to help solve problems in another (i.e., transfer knowledge and skills from one area to another). Technology has provided me opportunities to learn about and work on solving business and technical problems in several industries as I ponder, “Why not?” 

My interest in technology has allowed me to meet and work with many interesting and incredible people throughout my career in many industries and settings. That’s much more than I ever expected when I took my first programming course so long ago, and it has become a significant aspect of almost everything I do.

Welcome to this journey of discovery and sharing.