Frozen Data

By Chris Stones


How safe is your data?

Seriously, even NASA and JPL have trouble keeping data safe, and they
are organizations with multimillion-dollar budgets. Their magnetic
tape backup systems are slowly losing space probe data.
Demagnetization is just nature's way, so don't let what is happening
to them happen to you.

That's right. Your backup hard drive, or at least the one you 'should'
have, is at risk as well. In fact, it's quite a challenge to preserve
digital data in the present era. If earlier societies had relied on
floppies, we might never have read Shakespeare's works.

But they didn't.

They used much more solid forms of data storage, from chiseled stone
to paper of sorts. Pigment simply sits on a surface far longer than
any of our recent storage media hold their data. Of course, this issue
has sprouted numerous paid off-site data backup services. I prefer to
keep my data backed up for free and safe from on-site disasters.

So, when I came across someone who had written a paper backup system,
I was intrigued.

He wrote software to efficiently print digital data to paper with
nothing more than a regular laser printer, and the data is restored
with the help of a regular consumer-grade scanner. Packing data onto
paper remains quite a challenge, though: the yield is approximately
3 MB per page.

That doesn't make it very viable for backing up movies or other
multi-gigabyte files. But it's almost perfect for straight text. Just
imagine that all the text you have ever written can be compressed into
a small stack of papers. Even a quarter million words, a file weighing
in at about 1.6 megabytes, would easily fit on a single page. All you
would need is friends to hold onto copies, thus protecting your data
from the fires and electrical disasters the modern age is so prone
to.[1]
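
To put rough numbers on that, here is a minimal sketch of the
arithmetic. It takes the approximately 3 MB per page figure quoted
above at face value; the function name and the 4 GB movie size are my
own assumptions, chosen only for illustration.

```python
import math

# Assumed yield, taken from the ~3 MB per page figure quoted above.
BYTES_PER_PAGE = 3 * 1000 * 1000

def pages_needed(size_bytes, bytes_per_page=BYTES_PER_PAGE):
    """Whole pages required to hold size_bytes (never fewer than one)."""
    return max(1, math.ceil(size_bytes / bytes_per_page))

manuscript = 1_600_000   # the 1.6 MB, quarter-million-word file
movie = 4 * 1000**3      # a hypothetical 4 GB movie

print(pages_needed(manuscript))  # 1
print(pages_needed(movie))       # 1334
```

One page for a lifetime of text, over a thousand for a single movie.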

Plus, how many storage media do you know that can take a few blows
of a hammer and still be ready to deliver information?

I'm pretty interested in preserving my hard work, as I'm sure that
over the next decade I'll be making some very valuable stuff. But like
all good things, it is going to take some effort to really use his
code. You see, even the original author marked this as his "new open
source joke".[2] And as luck would have it, he wrote the code for
Windows.

But, joke or not, he spent a lot of time on the source and added in
many important data backup features such as error correction and CRC
checks. Of course, I looked for variants of the software that had
been successfully ported to Linux or Mac OS X, but nothing
satisfactory turned up. If I was going to protect my most important
data, I was going to have to go it alone.
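
For anyone wondering what a CRC check buys you, here is a minimal
sketch using Python's standard zlib.crc32. It is not the original
author's code or block layout; the seal and check helpers and the
4-byte trailer are assumptions made up for this illustration. A CRC
only detects damage. Repairing it is the job of the error correction.

```python
import zlib

def seal(block: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the block (hypothetical layout)."""
    return block + zlib.crc32(block).to_bytes(4, "big")

def check(sealed: bytes) -> bool:
    """Return True if the trailing CRC-32 still matches the payload."""
    payload, stored = sealed[:-4], sealed[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored

sealed = seal(b"To be, or not to be")
# Flip a single bit in the first byte, as a smudge on the page might.
damaged = bytes([sealed[0] ^ 0x01]) + sealed[1:]

print(check(sealed))   # True
print(check(damaged))  # False
```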

So I began to work.

Truthfully, I didn't realize what I was in for at the start of this
project. I thought I could just read some code and spit out a pretty
decent working prototype minus everything Windows related. But as
life would have it, the code leans on Windows for a lot more than
just its graphical elements. I've spent the better part of this week
combing through gibberish in an attempt to purge the Windows from it.
Once I exorcise that demon, I can polish the other parts into
something worth sharing.

This isn't easy, since the original developer didn't really describe
the terminology behind the overall process. Each night I'm solving a
murder mystery, but rather than a whodunit I've got a how-does-it to
solve. I may not be a fan of murder mysteries, but don't get me wrong.
I'm still grateful someone somewhere started this project, even if I
do have to cut it to pieces to glean anything portably useful from the
bits. Perhaps the most intriguing question still awaits.

Just how much data can you put on a page with off the shelf

The fellow never used color.

If the scanner-software combo could reliably spot color codes, I
think I could multiply the paper payload by 250 or so. The result
would be solid pages of random rainbow spots. And in theory, I just
might be able to fit in a gig of data.
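
As a rough sketch of the idea, here is how bytes could be mapped onto
colored dots and back. The four-color palette and both function names
are hypothetical, not anything from the original software. A dot that
can take one of N reliably distinguishable colors carries log2(N)
bits, so the real gain depends on how many colors a scanner can tell
apart and how tightly the dots can be packed.

```python
import math

# Hypothetical four-color palette; each dot carries log2(4) = 2 bits.
PALETTE = ["black", "cyan", "magenta", "yellow"]
BITS_PER_DOT = int(math.log2(len(PALETTE)))

def bytes_to_dots(data: bytes) -> list:
    """Split every byte into 2-bit groups, most significant first."""
    dots = []
    for byte in data:
        for shift in range(8 - BITS_PER_DOT, -1, -BITS_PER_DOT):
            dots.append(PALETTE[(byte >> shift) & (len(PALETTE) - 1)])
    return dots

def dots_to_bytes(dots: list) -> bytes:
    """Reassemble bytes from a dot sequence made by bytes_to_dots."""
    per_byte = 8 // BITS_PER_DOT
    out = bytearray()
    for i in range(0, len(dots), per_byte):
        value = 0
        for dot in dots[i:i + per_byte]:
            value = (value << BITS_PER_DOT) | PALETTE.index(dot)
        out.append(value)
    return bytes(out)

dots = bytes_to_dots(b"\x1b")
print(dots)                 # ['black', 'cyan', 'magenta', 'yellow']
print(dots_to_bytes(dots))  # b'\x1b'
```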

Imagine that, a gig of data per piece of paper.

Dear god, that's a wonderful little Everest, and probably just as
difficult to reach, if it is even possible. Trying to distinguish
colored dots so close together, with so many deviations from each
other, is a real challenge. We'll just have to see how deep the rabbit
hole goes.



[1] The paper backup is encrypted so your friends wouldn't be able to
read it or anything. They would just have a stack of papers sitting
around with a lot of gray colored gibberish.


[*] A page of dots would be really cool to print at the end of a book.
The entire contents of that book on just a few pages in the back. All
the source code examples and 3rd party libs ready to be loaded.