by Brian Tomasik
First written: 12 Apr. 2016; last update: 11 Dec. 2016
This piece describes some ways to back up your most important electronic data against the risk of magnetic disasters. Since "NASA puts the likelihood of [...] a geomagnetic super-storm at 12 percent per decade", the expected benefit of making backups seems worth the cost.
- 1 Summary
- 2 Types of disasters
- 3 Backup options
- 4 Backing up other people's websites
Types of disasters
An electromagnetic pulse (EMP) attack could potentially destroy electronic equipment in a wide region of the US. The probability of such an attack happening in my lifetime seems pretty low but maybe not less than a few percent? Basically no data centers are EMP-proof, so if Google/Facebook/etc. have only one copy of your data (do they?) and it was stored in an affected data center, the data would be lost.
This article reports:
According to [former Director of Central Intelligence James] Woolsey, a solar super-storm like the Carrington Event today would “collapse electric grids and life-sustaining critical infrastructures worldwide, putting at risk the lives of billions.”
Of course, if a geomagnetic storm were bad enough that society collapsed, then my writings would be less useful than in scenarios where that doesn't happen.
This article quotes "Doug Biesecker of the National Oceanic and Atmospheric Administration’s (NOAA) Space Weather Prediction Center" as saying:
The possibility of an extreme CME causing a very powerful geomagnetic storm is real. There’s considerable uncertainty to how frequent such storms are at the level where we worry about huge impacts on the power grid and the resulting impacts that a lack of electricity would have. Is it a 1 in 50, 1 in 100, or 1 in 1,000 year event? We just don’t know.
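To get a feel for what these frequencies mean over a span of decades, here is a minimal sketch that converts a return period into the chance of at least one event. It assumes each year is independent with a constant probability, which is a simplification and not something the quoted sources claim. The 50-year horizon is an arbitrary choice for illustration, and the last line applies the same arithmetic to the "12 percent per decade" figure quoted at the top of this piece.

```python
def prob_at_least_one(annual_prob: float, years: int) -> float:
    """Chance of at least one event in `years`, assuming independent years."""
    return 1.0 - (1.0 - annual_prob) ** years


HORIZON_YEARS = 50  # arbitrary illustrative horizon, not from the sources

# Return periods mentioned in the quote: 1 in 50, 1 in 100, 1 in 1,000 years.
for return_period in (50, 100, 1000):
    p = prob_at_least_one(1.0 / return_period, HORIZON_YEARS)
    print(f"1-in-{return_period}-year event: {p:.1%} over {HORIZON_YEARS} years")

# Same calculation starting from a per-decade probability (five decades).
per_decade = 0.12
p_decades = 1.0 - (1.0 - per_decade) ** (HORIZON_YEARS // 10)
print(f"12% per decade: {p_decades:.1%} over {HORIZON_YEARS} years")
```

The spread between the three return periods is large, which is one way to see why the uncertainty Biesecker describes matters for deciding how much effort backups deserve.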
Is a flash drive enough?
This article says that "A flash drive stored well away from any external electrical lines would very likely survive an EMP strike." One commenter on this thread echoes that "Flash drives and optical disks would be completely safe too." However, other people on that thread use Faraday cages -- is that because Faraday cages actually are necessary to protect a flash drive?
Backup across continents
For a local EMP, an easy solution could be to send a duplicate flash drive to a friend on another continent (maybe only once every decade or whatever, to reduce the annoyance of doing so). One answer here recommends this solution, but only for "Events much less severe as the Carrington event".
TODO: I plan to look more into how to buy a Faraday cage.
One concern I've seen discussed is that the cage itself may destroy the contents of a flash drive. So it seems prudent to have two copies of your data: one on a flash drive stored in a Faraday cage, and one on a flash drive not in a Faraday cage.
One answer here says that for big enough geomagnetic storms: "Theoretically, if your Faraday cage is good enough (enough layers, thick enough conducting layers), it might pull it off. But I'm not sure if that will work with such violent events."
Optical storage (CDs, DVDs)
Someone on this forum suggests: "Files can be put in encrypted 7z archives before giving to friends to store. as a CD is an optical medium (hard drives are magnetic, USB drives use 'flash memory') it can survive electromagnetic hazards better than other types of storage, it ought to survive EMP or [coronal mass ejection] CME."
One answer here says that CD/DVD storage is "The best solution" and would work even for "Events similar or bigger than the Carrington event".
Many new laptops don't have CD drives, but you can buy an external USB drive.
Unfortunately, regular CDs/DVDs don't last indefinitely:
It is hard to predict exactly how long an optical disc will last since it depends on so many different factors. Nevertheless, estimations are floating around that predict a life span of up to 200 years for recorded CD-Rs and Blu-Ray discs. The shortest life span with 5-10 years is predicted for unrecorded CD-Rs and CD-RWs, followed by recorded DVD-RWs with up to 30 years. Recorded CD-RWs and DVD-Rs have a predicted lifetime of 20-100 years. In other words, you should not rely on any of these media for lifelong storage of your precious data, as they are likely to fail sooner rather than later.
The longevity problems of regular optical disks are overcome by the M-DISC, which claims to last 1000 years. This article says that the M-DISCs tested could be read by 6 out of 8 of the DVD drives tested. However, a comment on that article claims that "MDISC is mostly marketing hype" and "the government recognizes that standard Blu-Ray are just as acceptable."
This article says "If you’re worried about optical drives disappearing, know that optical retains a very strong presence in the archival community, as well as the enterprise, so that should give you some reassurance."
Printing out on paper
You could collect the text that you most want to save and have it printed out (probably by a professional printing service to avoid wearing out your home printer). Printed pages can get lost, burn up in a house fire, etc., but their risks are pretty complementary to the risks that electronic data storage faces. (It's good to have at least one cloud backup and at least one in-your-home backup.)
Paper copies of documents aren't as flexible as digital data, but the pages could be scanned back to electronic format in the long run.
Backing up other people's websites
Usually you can get backups of other people's websites through Internet Archive. But would Internet Archive's data survive a Carrington Event-scale geomagnetic storm?
The answer here says "The main Internet Archive storage is a custom technology called the PetaBox, which is now in its second generation. It’s designed and built from the bottom up and the inside out to be pretty robust." But I'm currently not sure whether PetaBox storage would be resistant to a Carrington Event. (I'd like to explore this further.)
This answer, regarding Internet Archive's protection against EMPs, says Internet Archive uses "Globally distributed backups." But I'm not sure if this is sufficient protection against a Carrington Event.
There also remains some risk that Internet Archive will go away eventually, perhaps due to lack of funding, though one hopes that its content would remain accessible to those willing to pay for access.
In order to be on the safe side, I wanted to do my own backups of websites whose content I consider important. Many entire websites are less than 1 GB when compressed, which makes backing up hundreds of them potentially feasible, given that the M-DISC service mentioned above has a huge size limit: "The overall limit for our $8 [per month] plan is 1TB of data per year. If you plan to upload more than that in a year, we suggest you consider our Pro Plan".
I'm downloading other people's websites using HTTrack, which is free. Once you download a site, you can zip its folder and then back that up the way you would any of your other files.
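The zip step can also be scripted. The sketch below zips a downloaded folder and records a SHA-256 checksum so that a copy restored years later can be compared against the original. The checksum is an addition for illustration, not part of the workflow described above, and the folder and file names are hypothetical. The demo builds a throwaway folder in a temporary directory so that it runs anywhere.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def zip_site(site_dir: str, archive_base: str) -> Path:
    """Zip the folder `site_dir` into `<archive_base>.zip` and return its path."""
    return Path(shutil.make_archive(archive_base, "zip", root_dir=site_dir))


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a stand-in folder; a real run would point at HTTrack's download folder.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "example-site"  # hypothetical folder name
    site.mkdir()
    (site / "index.html").write_text("<html>hello</html>")
    # Keep the archive outside the folder being zipped so it cannot include itself.
    archive = zip_site(str(site), str(Path(tmp) / "example-site-backup"))
    print(archive.name, sha256_of(archive))
```

Storing the printed checksum alongside the archive (or on paper) gives a simple way to detect a corrupted copy later.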
I'm still a novice at HTTrack, but from my experience so far, I've found that it captures only ~90% of website pages on average. For some websites (like the one you're reading now), HTTrack seems to capture everything, but for other sites, it misses some pages. Maybe this is because of complications with redirects? I'm not sure. Still, ~90% backup is much better than 0%.
You can verify which pages got backed up by opening the domain's index.html file from HTTrack's download folder and browsing around using the files on your hard drive. It's best if you disconnect from the Internet when doing this because I found that if I was online when browsing around the downloaded file contents, some pages got loaded from the Internet, not from the local files that I was testing.
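Browsing by hand can be supplemented with a quick inventory of the saved files. This is only a sketch: it assumes the pages are stored as .htm or .html files somewhere under the download folder, and the list of expected pages is something you would have to supply yourself (the names below are made up for the demo).

```python
import tempfile
from pathlib import Path


def list_saved_pages(mirror_dir: str) -> list[str]:
    """Return sorted relative paths of all .htm/.html files under `mirror_dir`."""
    root = Path(mirror_dir)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in (".htm", ".html")
    )


def missing_pages(mirror_dir: str, expected: list[str]) -> list[str]:
    """Return the expected relative paths that are absent from the mirror."""
    saved = set(list_saved_pages(mirror_dir))
    return [page for page in expected if page not in saved]


# Demo with a stand-in folder; a real run would use HTTrack's download folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "blog").mkdir()
    (root / "index.html").write_text("<html></html>")
    (root / "blog" / "post.html").write_text("<html></html>")
    print(list_saved_pages(tmp))
    print(missing_pages(tmp, ["index.html", "about.html"]))
```

This only checks that files exist, not that their contents are complete, so it complements offline browsing without replacing it.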
Pictures don't seem to load offline, but you can check that they're still being downloaded. For example, for WordPress site downloads, look at the \wp-content\uploads folder.
I won't explain the full how-to steps of using HTTrack, but below are two problems that I ran into.
Troubleshooting: gets too many pages
When I tried to use HTTrack to download a single website using the program's default settings (as of Nov. 2016), I downloaded the website but also got some other random files from other domains, presumably from links on the main domain. In some cases, the number of links that the program tried to download grew without limit, and I had to cancel. In order to download files only from the desired domain, I had to do the following.
Step 1: Specify the domain(s) to download (as I had already been doing).
Step 2: Add a Scan Rules pattern like this: +https://*animalcharityevaluators.org/* . This way, only links on that domain will be downloaded.
Including a * before the main domain name is useful in case the site has subdomains. For example, the site https://animalcharityevaluators.org/ has a subdomain http://researchfund.animalcharityevaluators.org/ , which would be missed if you only used the pattern +https://animalcharityevaluators.org/* .
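The effect of the leading * can be illustrated with ordinary wildcard matching. The sketch below uses Python's fnmatch as a stand-in, on the assumption that it behaves like HTTrack's * wildcard for these simple cases; it is not HTTrack's actual matching code. The scheme (http:// or https://) is left off the addresses to keep the comparison to host names only, and example.com stands in for an unrelated outside link.

```python
from fnmatch import fnmatchcase

BROAD = "*animalcharityevaluators.org/*"   # leading * allows subdomains
NARROW = "animalcharityevaluators.org/*"   # main domain only

addresses = [
    "animalcharityevaluators.org/about",          # hypothetical page path
    "researchfund.animalcharityevaluators.org/",  # subdomain from the text
    "example.com/page",                           # unrelated domain
]

# Record which addresses each pattern would let through.
results = {
    address: (fnmatchcase(address, BROAD), fnmatchcase(address, NARROW))
    for address in addresses
}

for address, (broad_ok, narrow_ok) in results.items():
    print(f"{address}: broad={broad_ok} narrow={narrow_ok}")
```

One caveat of the broad pattern under this reading: a leading * would also match an unrelated domain whose name merely ends in the same string, which seems an acceptable trade-off for catching subdomains.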
Troubleshooting: Error: "Forbidden" (403)
Some pages gave me a "Forbidden" error, which prevented any content from being downloaded. I was able to fix this by clicking on "Set options...", choosing the "Browser ID" tab, and then changing "Browser 'Identity'" from the default of "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" to "Java1.1.4". I chose the Java identity because it didn't contain the substring "HTTrack", which may have been the reason I was being blocked.
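For anyone using HTTrack's command-line version instead of the graphical interface, the same two fixes can be expressed as arguments. The sketch below only builds and prints the command and does not run it. It assumes the command-line options -O (output folder) and -F (user-agent string) correspond to the GUI settings described above, and the output folder name is hypothetical.

```python
import shlex


def build_httrack_command(start_url: str, output_dir: str,
                          scan_rule: str, user_agent: str) -> list[str]:
    """Assemble an httrack argument list; nothing is executed here."""
    return [
        "httrack", start_url,
        "-O", output_dir,   # where the mirror is written
        scan_rule,          # "+pattern" keeps the crawl on the wanted domain
        "-F", user_agent,   # identity sent to the server
    ]


command = build_httrack_command(
    start_url="https://animalcharityevaluators.org/",
    output_dir="./ace-mirror",  # hypothetical folder name
    scan_rule="+*animalcharityevaluators.org/*",
    user_agent="Java1.1.4",
)

# shlex.join quotes the arguments so the line can be pasted into a shell.
print(shlex.join(command))
```

Printing the command first makes it easy to review the pattern and identity before starting a long download.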