Mar 22 2018

Big Data – Long Data

2,800,000,000,000,000,000,000 bytes

That’s 2.8 zettabytes, or 2.8 billion terabytes.

That is roughly how much data there is in the world today, with 2.5 exabytes (quintillion bytes) being added every day. This increase is not linear, but geometric. Ninety percent of all the data ever produced by humanity was produced in the last two years. A growing number of science projects coming online use massive amounts of data. The Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative to map the human brain will have to deal with yottabytes of data (a yottabyte is 1,000 zettabytes).
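To make the scale concrete, here is a minimal Python sketch of the unit arithmetic above. It uses decimal (SI) prefixes, and it assumes the 2.5 exabytes per day stays flat for a whole year, which understates the total if the growth really is geometric. The variable names are just for illustration.

```python
# Decimal (SI) byte prefixes
TB = 10**12   # terabyte
EB = 10**18   # exabyte
ZB = 10**21   # zettabyte
YB = 10**24   # yottabyte

total_bytes = 28 * 10**20   # 2.8 zettabytes, the figure quoted above
daily_bytes = 25 * 10**17   # 2.5 exabytes added per day

# 2.8 ZB expressed in terabytes
total_in_tb = total_bytes // TB

# Assumption: the daily rate stays flat for 365 days (a simplification)
yearly_in_zb = daily_bytes * 365 / ZB

# How many zettabytes fit in one yottabyte
zb_per_yb = YB // ZB

print(f"2.8 ZB = {total_in_tb:,} TB")
print(f"2.5 EB/day for a year = {yearly_in_zb} ZB")
print(f"1 YB = {zb_per_yb:,} ZB")
```

Even at a flat rate, a single year adds about 0.9 zettabytes, roughly a third of the 2.8 zettabyte total.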

Storing all this data is already a challenge, one which will get orders of magnitude more challenging. There are several hurdles. The first is that we need the physical space to store all this data – we need the hard drives, optical discs, tape storage, or whatever media we use.

Second, storing and migrating all this data uses energy. The current estimate is that 3% of global energy output is used to store data. And again, this will only increase. On a separate but related issue, we also have to think carefully about processing- and storage-intensive tasks carried out on a global scale, such as cryptocurrency. They may have a certain utility, but the blockchain process uses a lot of energy.

Third, we need to consider how long our data storage methods will last. We have to consider not just big data, but long data – the ability to practically store massive amounts of data for a long time.

Storing data on hard drives consumes a lot of energy, and this data needs to be migrated over to new drives every two years so it is not lost through drive crashes.

Optical discs have some beneficial features. They are small, have high density of data, do not need energy to maintain the data, and have longer lifespans than hard drives. However, their lifespans are still relatively short for archival purposes – about 50 years.

As someone who has been using computers since literally the 1970s, I have noticed that optical discs have passed their prime. In the 90s and early 2000s every computer I owned had an optical disc reader and writer, and I would back up documents and media files onto optical discs.

My latest computer doesn’t even have an optical disc reader, let alone writer. Software is downloaded, movies can be streamed, and I use a combination of large hard drives and online storage for backup. Optical discs just don’t really have a niche anymore. And I miss them. I liked having my critical data stored securely (offline) in a convenient medium that should last for at least 20 years, or more with the higher end discs.

Optical disc technology has simply not kept up. When Blu-ray discs had about 25 GB of storage capacity, it just wasn’t enough to justify the price, and they started to fall out of favor. They have advanced since then, but have continued to fall behind data needs. Current Blu-rays can hold 100 GB on a disc, which is nice, but still not enough to make them worthwhile. There are companies marketing 300 GB discs for archiving purposes, but now the market is aimed at large companies, not consumers, and they are just not cost effective.

Sony has a 3.3 TB optical disc archival system, but it looks like the drive itself costs $6,500, and the cartridges are selling for $188 each. This is just not priced for the consumer. It also may not be cost effective even for large data storage companies.
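A rough calculation puts numbers on that. This Python sketch uses only the prices quoted above ($6,500 for the drive, $188 per 3.3 TB cartridge) and a hypothetical helper, cost_per_tb, that spreads the one-time drive cost across however many cartridges you fill. It ignores everything else, such as failures, shipping, and price changes.

```python
DRIVE_PRICE_USD = 6500        # one-time cost of the drive, as quoted above
CARTRIDGE_PRICE_USD = 188     # price per cartridge, as quoted above
CARTRIDGE_CAPACITY_TB = 3.3   # capacity per cartridge, as quoted above


def cost_per_tb(n_cartridges: int) -> float:
    """Dollars per terabyte when one drive is shared by n_cartridges.

    Hypothetical helper for illustration only.
    """
    if n_cartridges < 1:
        raise ValueError("need at least one cartridge")
    total_cost = DRIVE_PRICE_USD + CARTRIDGE_PRICE_USD * n_cartridges
    total_capacity_tb = CARTRIDGE_CAPACITY_TB * n_cartridges
    return total_cost / total_capacity_tb


# The floor: media cost alone, as if the drive were free
media_only_per_tb = CARTRIDGE_PRICE_USD / CARTRIDGE_CAPACITY_TB

for n in (1, 10, 100):
    print(f"{n:>3} cartridges: ${cost_per_tb(n):,.2f} per TB")
print(f"media only: ${media_only_per_tb:,.2f} per TB")
```

With a single cartridge the drive dominates, at about $2,027 per TB. Even at 100 cartridges it works out to roughly $77 per TB, and it can never drop below the media-only floor of about $57 per TB.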

The technology is still advancing, but I don’t know if we will have the kind of breakthrough needed to make optical discs once again a good option for consumers. This article was prompted by a press release about a new optical disc technology that claims 10 TB on a disc, with a lifespan of 600 years. The technology is still at the university level, not ready for mass production. The discs also use gold, which, even in tiny amounts, will likely make them relatively expensive.

Depending on cost, this may be a good option for mass storage. The long lifespan means that the data does not have to be migrated frequently, and it can be used for historical archiving.
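To see how much lifespan matters, here is a small Python sketch that counts how many copies you would need to write to keep data alive for 600 years. It uses the figures mentioned in this post (hard drives migrated every two years, optical discs lasting about 50 years, the new disc claiming 600). It assumes each copy is replaced exactly at the end of its rated life, which is a simplification, and copies_needed is a hypothetical helper.

```python
import math


def copies_needed(horizon_years: int, media_lifespan_years: int) -> int:
    """Copies written over the horizon, counting the original.

    Assumes each copy is replaced exactly at the end of its rated life.
    """
    return math.ceil(horizon_years / media_lifespan_years)


HORIZON_YEARS = 600

# Lifespans taken from the figures in this post
lifespans = {
    "hard drive (migrated every 2 years)": 2,
    "optical disc (about 50 years)": 50,
    "claimed new disc (600 years)": 600,
}

for name, years in lifespans.items():
    copies = copies_needed(HORIZON_YEARS, years)
    # Migrations are every copy after the first
    print(f"{name}: {copies} copies, {copies - 1} migrations")
```

Under those assumptions, hard drives need 299 migrations over 600 years, ordinary optical discs need 11, and the new disc needs none.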

However, by the time this new disc gets to the market, 10 TB won’t be that impressive anymore. It still feels like optical disc technology, while progressing, is lagging behind the massive increase in big data. 4K video is the latest trend to drive the need for more data storage, and eventually we will be going to 8K. We are taking more pictures with more megapixels, and making and uploading more videos.

Now virtual reality is gaining in popularity, with increasing resolution, driving a massive increase in processing power and data storage needs. Why shoot regular video, when you can shoot high-definition 360-degree video?

For me personally, I like having all my critical personal data backed up multiple times. I used to keep a case of DVDs with data at my brother’s house, in case of fire or other disaster. Online storage is OK, but I have read reports of people losing their data because they were late with a payment, or because of data loss at the company, so choose your service well. And upload speeds are still limiting, if you generate a lot of data like I do.

Now, if there were an optical disc system that could store 10 TB on one optical disc, and was cost competitive for the consumer, we would have a winner. But I don’t see that happening. So for now I just have my critical data on multiple solid-state drives, and my documents online. If I’m missing something, please let me know.


