Posts

March 31 is World Backup Day

March 31st is World Backup Day, an idea that grew out of a Reddit thread 3 years ago. There’s no shortage of reasons why it’s important to back up your digital data, and I feel most people already know that by now.

However, most people are also doing it wrong:

  • Backing up your PC to a hard drive in the same PC is #notabackup
  • Backing up to a hard drive that is 30 cm away from your computer is #notabackup
  • The photos that are still on your phone are #notabackup
  • Putting your stuff systematically in Dropbox is #notabackup

The 3-2-1 backup rule is widely regarded as a best-practice rule of thumb:

  • 3 copies of anything you care about – Two isn’t enough if it’s important.
  • 2 different formats – Combination of Dropbox / DVDs / Hard Drive / Memory Stick / CrashPlan
  • 1 off-site backup – If the house burns down, how will you get your memories back?
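The rule above is easy to check mechanically. Here is a minimal sketch in Python, assuming each copy is described by a medium and an off-site flag; the `Copy` class and the sample entries are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data. All field values are illustrative."""
    name: str
    medium: str    # e.g. "internal disk", "external disk", "cloud"
    offsite: bool  # True if stored away from the original


def satisfies_3_2_1(copies):
    """Return True when the copies meet the 3-2-1 rule of thumb."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical setup: the PC plus a drive sitting next to it.
same_desk = [
    Copy("PC", "internal disk", offsite=False),
    Copy("USB drive next to the PC", "external disk", offsite=False),
]
# Adding an off-site copy brings the total to three.
full_setup = same_desk + [Copy("CrashPlan", "cloud", offsite=True)]

print(satisfies_3_2_1(same_desk))   # False
print(satisfies_3_2_1(full_setup))  # True
```

The first setup fails on two counts: only two copies, and nothing off-site.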

Protect your legacy and back up your digital valuables properly.

Backup strategies

Windows 8 File History

This is by far the easiest way to have continuous backups of your personal files with Windows 8. You can even set your destination to a remote location on the network.

For more information on how to activate it, I highly recommend Scott Hanselman’s blog.

Windows 8 System Image Backup

I like to have a full system image of machines I really care about such as my own PC. It would take weeks to fully restore all my software in the case of a system crash. That’s why I create this full system image weekly. With that, I can go from total crash to full restore in just a few hours.

It’s almost hidden, but Windows 8 has a built-in System Image backup tool. Even better: there is a command-line version of this tool that can be used with scheduled tasks.

[code]wbadmin start backup -backuptarget:\\spock\backups -include:C:[/code]
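To run that command weekly as a scheduled task, one option is to register it with `schtasks`. The sketch below only builds the command line in Python and prints it. The task name, day, start time and `\\server\share` target are placeholders, and the `-quiet` switch is my addition so that the unattended run does not wait on a confirmation prompt.

```python
import subprocess


def build_schtasks_command(backup_target, task_name="WeeklySystemImage",
                           day="SUN", start_time="03:00"):
    """Build a schtasks command line that runs wbadmin once a week."""
    # -quiet suppresses the confirmation prompt, which a scheduled task
    # has nobody around to answer.
    wbadmin = (f"wbadmin start backup -backupTarget:{backup_target} "
               "-include:C: -quiet")
    return [
        "schtasks", "/Create",
        "/TN", task_name,      # hypothetical task name
        "/TR", wbadmin,        # the command the task will run
        "/SC", "WEEKLY",
        "/D", day,
        "/ST", start_time,
        "/RU", "SYSTEM",       # run without a logged-in user
    ]


# Placeholder UNC path; substitute your own backup share.
command = build_schtasks_command(r"\\server\share")
print(subprocess.list2cmdline(command))

# To actually register the task, run this from an elevated prompt on Windows:
# subprocess.run(command, check=True)
```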

Windows 8 Storage Spaces

The natural evolution of Drive Extender, a feature I know quite well from the good old Windows Home Server days. Windows 8 Storage Spaces improves upon the original idea of software RAID. You allocate physical drives to a “storage pool” and create “storage spaces” (buckets) that can each have a different RAID-like configuration.

Here’s an example of how I do it with 8 physical drives:

  • Multimedia: Parity
  • Backup: Simple (no resiliency)
  • Documents: Two-way mirror
  • Photos: Three-way mirror

This is not a proper backup per se, but data redundancy makes recovery quick and easy in case of a hard drive failure (which happens more than you’d like).
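To make the trade-off between these resiliency types concrete, here is a rough Python sketch. The failure counts are the ones generally quoted for each layout. The capacity figures are a simplified model: mirrors keep 2 or 3 full copies, and parity is modelled as one parity column out of an assumed 3. Real numbers depend on how the pool is configured.

```python
# Resiliency types named in the post, with the number of simultaneous drive
# failures each is generally described as tolerating.
FAILURES_TOLERATED = {
    "Simple": 0,
    "Two-way mirror": 1,
    "Three-way mirror": 2,
    "Parity": 1,
}

# Number of full data copies kept by each non-parity layout.
DATA_COPIES = {"Simple": 1, "Two-way mirror": 2, "Three-way mirror": 3}


def usable_fraction(resiliency, columns=3):
    """Rough fraction of raw capacity left for data (simplified model)."""
    if resiliency == "Parity":
        # Assumption: single parity, one column out of `columns` holds parity.
        return (columns - 1) / columns
    return 1 / DATA_COPIES[resiliency]


# The four spaces from the example above.
spaces = {
    "Multimedia": "Parity",
    "Backup": "Simple",
    "Documents": "Two-way mirror",
    "Photos": "Three-way mirror",
}

for name, resiliency in spaces.items():
    print(f"{name}: survives {FAILURES_TOLERATED[resiliency]} drive failure(s), "
          f"about {usable_fraction(resiliency):.0%} of raw capacity usable")
```

The pattern is the usual one: the more failures a space survives, the less of the raw disk space is left for data.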

Acronis True Image

Acronis True Image is commercial software that specializes in full system images. It features multiple backup schemes (full, incremental, differential), “encryption” and a wide range of other things.

I used True Image for two years, and while I find that it is a rich and powerful backup suite, the consolidation algorithm is very poor. Unusable, even. You’ve been warned.
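For readers unfamiliar with the three schemes mentioned above, here is a small Python sketch of the textbook definitions. It is not how True Image implements them; the file names and day numbers are made up.

```python
def files_to_copy(scheme, mtimes, last_full, last_backup):
    """Return the file names a backup run would copy.

    mtimes maps file name -> last modification time (any comparable number).
    last_full is the time of the last full backup; last_backup is the time
    of the most recent backup of any kind.
    """
    if scheme == "full":
        return sorted(mtimes)
    if scheme == "differential":
        # Everything changed since the last FULL backup.
        return sorted(f for f, t in mtimes.items() if t > last_full)
    if scheme == "incremental":
        # Only what changed since the most recent backup, full or not.
        return sorted(f for f, t in mtimes.items() if t > last_backup)
    raise ValueError(f"unknown scheme: {scheme}")


# Day numbers stand in for timestamps: full backup on day 0,
# an incremental backup on day 2.
mtimes = {"taxes.pdf": 1, "photo.jpg": 3, "old.txt": -5}

print(files_to_copy("full", mtimes, last_full=0, last_backup=2))
print(files_to_copy("differential", mtimes, last_full=0, last_backup=2))
print(files_to_copy("incremental", mtimes, last_full=0, last_backup=2))
```

A differential restore needs the full backup plus the latest differential, while an incremental restore needs the full backup plus every incremental since. That chain is what consolidation is meant to shorten.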

Cloud storage (Dropbox / Google Drive)

Most people won’t be able to fit all their pictures in 2 GB, which is what most cloud providers are offering in their free tier.

However, Google recently announced a substantial price cut for their Google Drive service. 2$ / month will get you 100 GB of storage, more than enough for most people willing to have an off-site backup.

CrashPlan

For about 5$ / month, CrashPlan offers unlimited online storage for 1 PC. Yes, you read that right: unlimited online storage! That means that if I can have a single location for ALL my backups and push them to CrashPlan Central from there, it will count as a single PC… brilliant!

This works perfectly for work laptops that are not always connected to my home network. They remotely sync with my home server at night between 2am and 8am, when Internet usage does not count towards my monthly limit. (Thanks, TekSavvy!) It is also during that window that I upload the backups to CrashPlan Central.

To date, they don’t seem to mind the 2 TB of data I’ve uploaded over the last 2 years.
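CrashPlan has its own scheduling settings, so no code is needed for the setup described above. If you script your own transfers and want to keep them inside an off-peak window like my 2am to 8am one, the check could look like this Python sketch (the function name is mine):

```python
from datetime import time


def in_window(now, start=time(2, 0), end=time(8, 0)):
    """Return True if `now` falls inside [start, end).

    Also handles windows that wrap past midnight, e.g. 22:00 to 06:00.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


print(in_window(time(3, 30)))   # True
print(in_window(time(12, 0)))   # False
```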

Going further

Encryption

What if your hard drives are stolen? You might think your personal data is of no interest but believe me, it is. Your backups should be encrypted.

SpiderOak, Acronis True Image and CrashPlan all have some kind of encryption capability in their offering. However, when it comes to encryption, I only trust software that is open source, which is why I highly recommend TrueCrypt, a long-time favourite in data privacy circles. (TrueCrypt has been discontinued. Use VeraCrypt for now.)

Backup cloud services

You should not trust the cloud. I would be sad if I lost all those years of email. Tools like Gmvault can help create local copies of your Gmail account.

Email is just an example; back up any data that lives only in the cloud.

Remember when Megaupload was one of the biggest cloud storage providers? They were raided and went out of business overnight.

Practice restoration

Backups always succeed. It’s restores that fail. Make sure you’ve tested your restore procedure. How do you test that procedure? Restore to a virtual machine! Virtualization software like VirtualBox can help you in that regard.
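For file-level backups, one simple way to check a test restore is to compare checksums between the original folder and the restored one. Here is a minimal Python sketch, assuming both trees are reachable as local paths; the function names and example paths are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file into memory, which is fine for
            # a sketch; read in chunks for very large files.
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_restore(original_root, restored_root):
    """Return (missing, changed): files absent from the restore and files
    whose content differs from the original."""
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    missing = sorted(set(original) - set(restored))
    changed = sorted(f for f in original
                     if f in restored and original[f] != restored[f])
    return missing, changed


# Example with hypothetical paths:
# missing, changed = compare_restore("C:/Users/me/Documents",
#                                    "R:/Restored/Documents")
```

Two empty lists mean every original file came back intact. Files edited after the backup was taken will show up as changed, so compare against a folder you have not touched since.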

My own solution

| What | Source | Destination | Scheduling | Technology |
| --- | --- | --- | --- | --- |
| Full system image | Simon-PC | On-premise home server | Weekly | Windows 8 System Image Backup |
| File-level backup | Personal PCs | On-premise home server | Continuous | Windows 8 File History |
| File-level backup | Work PCs | Home server | Nightly | CrashPlan |
| Off-site backup | Home server | CrashPlan Central | Nightly | CrashPlan |

In my case, my home server is a big part of the puzzle. All the data is in one physical place for on-premise restoration of files and system image, protected from (1-2 simultaneous) hard drive failure thanks to Windows 8 Storage Spaces. Combined with CrashPlan for local/remote backups and off-site backups to the cloud, this is a true backup solution.

Happy World Backup Day! What’s your backup strategy?

Go Paperless at Home with Google Drive

Every day, I’m handling and processing a lot of information that’s coming through that glowy rectangle that we call a computer screen. Even so, I’m having a hard time managing the few pages of paper I get in my mailbox on a daily basis.

Some of that paper is getting lost, important things that I should keep are recycled, graymail is lying on my desk indefinitely (not to mention spam), sensitive information is not handled properly… I don’t want to deal with any of this!

What I’m really targeting here is a paperless home. I want to be able to scan, archive, shred the physical copy and forget about it… right up until I need it, at which point I must be able to find what I’m looking for quickly.

Strategy

I have a multi-function printer that has the ability to scan to a specified FTP location at the touch of a single button. From that location, some software could pick it up and automatically process it.

I evaluated Google Drive against other services and software. Here is what I found:

| Feature | PDF from scanner | Foxit Phantom | Evernote* | Google Drive* |
| --- | --- | --- | --- | --- |
| Storage location | Local | Local | Cloud | Cloud |
| Maximum storage size | Local | Local | +1 GB / Month | Up to 5 GB |
| Maximum PDF size | None | None | 25 MB | 2 MB |
| Maximum number of pages | Infinite | Infinite | 100 | 10 |
| OCR | None | Manual | Automatic | Automatic |
| Price | Free | 80$ | 5$ / month | Free |

The text in the PDF must be searchable so I can find what I’m looking for quickly (that’s the whole point). That means the solution I choose must have OCR capabilities.

There is some commercial software that does exactly what I want, but it is targeted at businesses and costs quite a lot. Foxit PhantomPDF Standard is the cheapest PDF editor I found that includes an OCR engine, but there is no way to automate the process. A real deal breaker.

Let’s be realistic when comparing the remaining two cloud services: most if not all the paper I scan is at most a few pages. Longer documents are processed through a separate workflow and will be stored somewhere else. Evernote would have worked great but the 5$/month premium isn’t justifiable if I don’t use what it has over Google Drive.

How to

I think most scanners have a way to “one-button scan” to a target machine by now. Check the user’s manual of your scanner or multi-function printer. Once that is set up, all you have to do is install the Google Drive client on that machine, configure your scanner to output its files to the right folder, and you’re good to go!

Once you press the button, the paper will be scanned, moved to the cloud, OCR’ed and archived. Don’t forget to shred the original.
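If your scanner cannot write directly into the Google Drive folder (mine drops files on an FTP location), a small script can do the hand-off. Here is a minimal Python sketch, assuming the scanner saves PDFs to a local inbox folder. The folder names are hypothetical, and the 2 MB cutoff is the Google Drive limit from my comparison above, which may have changed since.

```python
import shutil
from pathlib import Path

# The 2 MB Google Drive PDF limit quoted in the comparison table.
MAX_PDF_BYTES = 2 * 1024 * 1024


def move_scans(inbox, drive_folder, max_bytes=MAX_PDF_BYTES):
    """Move scanned PDFs from inbox to drive_folder.

    Returns (moved, skipped) lists of file names. Files larger than
    max_bytes are left in the inbox for a separate workflow.
    """
    inbox, drive_folder = Path(inbox), Path(drive_folder)
    drive_folder.mkdir(parents=True, exist_ok=True)
    moved, skipped = [], []
    for pdf in sorted(inbox.glob("*.pdf")):
        if pdf.stat().st_size > max_bytes:
            skipped.append(pdf.name)  # too large; handle it separately
            continue
        shutil.move(str(pdf), str(drive_folder / pdf.name))
        moved.append(pdf.name)
    return moved, skipped


# Example with hypothetical folders:
# moved, skipped = move_scans("C:/ftp/scans", "C:/Users/me/Google Drive/Scans")
```

Run it from a scheduled task every few minutes and the Google Drive client takes care of the upload.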

Looking for something? Hit https://drive.google.com, type some keywords in the search bar and you’ll instantly get it. Google magic!

I can finally find that banana carrot muffin recipe, T4 tax form or receipt for that hard drive without going through that stack of paper that’s lying on my desk.

That makes me wonder: how do you manage your paper?

Archiving My Knowledge

I accumulated a lot of paper documentation during my 8 years of post-secondary studies: books, and course notes both in digital ink on a Tablet PC and handwritten. Needless to say, all those books and binders take up a considerable amount of physical space.

In its current state, this knowledge is almost lost because I have no way to index my content so I can find it easily. Imagine a library without the Dewey system. In which closet, which box, which binder, which divider, which page, which paragraph is the information I need? Fortunately, if there is one area where computers excel, it is indexing content!

The contents of my closet: nothing but course notes and books.

The benefits of going digital

  • Accessibility is much better: a PDF file is far simpler to open than a binder at the bottom of some box.
  • Searching is also greatly simplified. Moreover, if I take the trouble to run my documents through an OCR (optical character recognition) engine, I will be able to find answers to my questions very quickly.
  • Digital media is more durable than paper, which deteriorates over time. Combined with a good backup strategy, files are far more resilient to accidents and natural disasters.
  • The ability to share these documents is also an inherent advantage of the digital format. I could make all of my handwritten notes available for download.

Scanning methods

Scanning course notes is easy. I just go to the library, put my stack of sheets in the automatic feeder, enter my email address, and voilà! A nice PDF in my inbox.

Books are a bit more complicated because the binding has to be cut off to get loose sheets. It is a destructive method, but it is the only low-cost option, since non-destructive methods are either very complex or very expensive. As my books are neither precious nor rare, I can do without their physical form (which is rather the point of the exercise).

The adventure therefore starts with finding a place that offers a cutting service. I ended up going to Bureau en Gros.

The big cutter at Bureau en Gros

Difficulties encountered

In reality, the operation turned out to be more complicated than expected (surprise). I used a 1104-page book for my first test and ran into several problems.

  • Maximum email size. The scanner can send PDFs by email, but the PDF must not exceed the maximum size allowed. In my case, I worked in batches of 100 double-sided pages, which gave me files of about 17 MB each. That left me with 11 separate files that I had to merge at the end. Scanning the book took 45 minutes.
  • Non-standard page format and page rotation. This is surely down to my misuse of the scanner, but the resulting file was in 8.5×11 format when the book was actually half a page in size. Page rotation was also a problem: even pages were rotated 90 degrees one way and odd pages 90 degrees the other. Nothing serious, just a few minor adjustments that Foxit PhantomPDF handled well.
  • OCR took longer than expected. There is no point in having a digital copy if the text remains an image. Optical character recognition took a good two hours to finish.
  • Bookmarks. I was motivated and wanted to recreate the table of contents as bookmarks to make navigation within the file easier. That took me half an hour.

Conclusion

It is a long process that takes more time and work than I expected at the outset. I am still very satisfied with the result. This knowledge is now much easier to access, and I have reduced the number of physical things I own. It is obviously not within everyone’s reach yet, but I think my economics professor will be happy to finally be able to carry his reference everywhere on his iPad.

What is left of the book

Downloads