When I first encountered data as an IT practitioner, we spoke in hushed whispers about kilobytes. Megabytes were where the future lay. My employer enthralled us all with the thrill of a bright future by branding our staff canteen ‘Megabytes’ – we all smiled and laughed conspiratorially – the future was exciting AND it smelled of bacon! The bullet train carrying us all forward into the dataplex left the Gigabyte station a long time ago; terabytes became the new norm, with PCs and TV set-top boxes routinely offering 1 TB or more of storage. Accelerating past tera and into peta, we started to understand the truth of data proliferation. And by the time we visited exabytes, the track had already extended into zettabyte territory.
‘Big data’ is a term freely used by analysts and vendors to describe advances in our ability to store, process and access vast volumes of data. The growth from mainframes and megabytes to the cloud and zettabytes has been remarkable, but how can these volumes of data be useful at home, at work or at school?
IDC recently published its annual DataSphere forecast, which measures the amount of data created, consumed, and stored in the world every year.
According to their report, in 2020 the amount of data created and replicated grew significantly faster than forecast. How did we manage to create 64 zettabytes of data? It turns out this was due to the dramatic increase in the number of people using data-hungry applications from home – for work, for education and for entertainment.
The really interesting aspect of this was that only 2% of this new data was saved – the rest was either created or copied for consumption, or temporarily stored and then replaced with newer data. With more than 64 zettabytes created (roughly equal to the storage of sixty-four billion 1-terabyte PC hard drives) but only 2 zettabytes added to the global cache, there is an accelerating growth in global data storage.
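For readers who want to sanity-check the drive comparison, here is a minimal sketch. It assumes decimal SI units (1 TB = 10^12 bytes, 1 ZB = 10^21 bytes), which is the convention drive manufacturers and IDC use:

```python
# Decimal SI units, as used by drive vendors (an assumption for this sketch):
TB = 10**12   # 1 terabyte in bytes
ZB = 10**21   # 1 zettabyte in bytes

created_zb = 64   # zettabytes created and replicated in 2020, per the IDC figure

# How many 1 TB PC hard drives would it take to hold everything created?
drives = created_zb * ZB // TB
print(f"{drives:,} one-terabyte drives")  # prints "64,000,000,000 one-terabyte drives"
```

Sixty-four billion drives, as claimed above: each step up a prefix (tera, peta, exa, zetta) is a factor of a thousand, so a zettabyte is a billion terabytes.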
IDC still stand by their prediction that global data storage will grow to 175 zettabytes by 2025. More interesting facts arose: the cloud was a net receiver of data rather than a net generator – people were moving material to the cloud in huge volumes. Businesses are twice as likely as consumers to store their data in the cloud. The data has value – keeping it allows a business to mine it for insight, to be fleet of foot when markets change, and to preserve undiscovered relationships and trends.
Why do I care? It directly impacts the challenges faced by our customers. Advances in the requirements of our global customer base mean that at Vamosa we now routinely migrate terabytes of data per customer engagement. When we first did content migrations, we built up from tens to hundreds of gigabytes – from thousands of webpages to millions of intranet stories. But now terabytes are the norm, and petabytes are where we see our stretch. Our first true petabyte project is in flight: billions of files, millions of users, and all of the associated superstructure of comments and metadata, of context and relationships. The move to exabytes and then zettabytes may be in our future. And what about yottabytes? Well, that's for another day to worry about.