A white paper commissioned by EMC and written by IDC (available at http://www.emc.com/about/destination/dig...rse/index.jsp?hpid=1) is creating buzz in some interesting corners of the blogosphere. The document reports that a total of 161 exabytes (161,000,000,000,000,000,000 bytes) of information was created, captured, or replicated in 2006, and that nearly 1 zettabyte (one billion billion kilobytes – 1,000,000,000,000,000,000,000 bytes, for those of you who like to count the zeroes) will be generated worldwide in 2010. Noted pundits Roger Kay and Rob Enderle have weighed in on the report – Roger with a discussion about the nature of this data, and Rob with an observation that EMC is on course to position itself as an industry heavyweight on par with Microsoft, Intel, and Cisco.
This is all unfolding as one would expect – people are always impressed by large numbers (no matter how speculative), and both Roger and Rob presumably have reasons to call attention to EMC’s position as a supplier that can help manage the “data tsunami”.
When I see these numbers, though, I’m primarily reminded of why I believe that data governance is one of the five issues that will re-shape the IT industry – and the way that IT is perceived within society – over the next decade. The highlight from the report underscores the fact that we are storing far more information than ever before, and other aspects of the white paper mention that compliance legislation and privacy concerns drive scrutiny over how this data is used and protected. However, these observations don’t go nearly far enough.
When most of us are growing up, we experiment with behaviours that are not ideal. We associate with people who have different ideas, we go to parties and drink or do drugs, we engage in pranks or reckless activities. In the normal course of life, we grow past these dalliances, and become sensible, well-rounded adults. When we attain a level of maturity, we engage in all manner of interactions – with banks, with physicians and pharmacists, with Amazon.com, Wal-Mart and local retailers – that attach transactional data to our individual/consumer identity.
Most of us assume that we are free to evolve beyond our youthful selves as we establish adult identities, and that our individual/consumer transactions are individual events that fade over time. However, all that transactional data lives somewhere in EMC’s zettabyte – and in the camera-phone/YouTube universe, it may be increasingly difficult for today’s children to leave their pasts behind.
Over the next decade, it is inevitable that the need to safeguard our personal information and identities – as well as a desire to limit our exposure to our own pasts – will crash headfirst into our ability to store and distribute essentially limitless volumes of material.
At a practical level, the sheer volume of information that must be managed is a growing issue, and the ability of suppliers like EMC to give us the tools needed to manage it is important. Underneath the infrastructure diagrams and associated statistics, though, there is a real question as to how, at a human level, we will balance this ability to store and retrieve data against two needs: to ensure that our transactions do not define our identities, and to lead lives in which experimentation and mistakes help us learn rather than create permanent evidence of our inability to lead blameless lives.