IT Nerd Space

Offsite Linux backup to Mega with duplicity

This is about a small personal project: I was looking for a solution to back up some Linux servers I run, both at home and in the cloud.

Ideally, my requirements were:

  • Backups should be offsite, preferably in the cloud
  • The solution should support some kind of backup policies (full, incremental, and retention time)
  • The solution should be secure, i.e. support encryption
  • The solution should be portable: it should play nicely with the servers and not mess with their OS (dependencies, …)
  • The solution should be as cheap as possible (ideally free!)

For the offsite storage I first thought of AWS Glacier, a cloud storage service meant for cold data that you archive and rarely need to retrieve, but it was not as cheap as it looks. While the storage itself is inexpensive, the cost of retrieving the data (in case you want to restore, which is the whole point of having a backup, right?) was kind of prohibitive for the budget I had in mind. So I started looking for alternatives.

For the backup solution, I wanted to use duplicity. Duplicity supports full and incremental backups using the rsync algorithm, and it supports a lot of storage backends: file, FTP, SSH, AWS S3, Azure, Dropbox, OneDrive, Mega, … Most of those backends are either unlimited but paid services, or free but rather limited in capacity. All but Mega, which offers 50 GB of storage for free, which is quite nice for backup purposes. A perfect fit in my case.

Regarding the portability requirement, I love Docker containers, and everything I deploy now for my personal projects is Dockerized. This wasn't going to be an exception. In particular, I'd hate to install all kinds of dependencies for duplicity and the storage backend plugins on all my servers!

So Docker it would be.

Now back to the storage layer of our solution: although duplicity supposedly has support for a Mega backend, it seems Mega changed their API/SDK, so the current plugin no longer works and would need a total rework. As an alternative, I turned to MegaFuse, a FUSE module that mounts Mega storage on Linux. The idea is to first mount the Mega account as a filesystem in Linux, and then use it as the destination of our backup with duplicity (using the file backend). Not as cool as having duplicity talk directly to Mega, but it seems to work equally well.

So to recap, we have a container with duplicity and MegaFuse. As the container is stateless, we'll map some volumes from the host into the container so the container gets the information it needs:

  • /vol/dupmega/megafuse.conf, containing the MegaFuse configuration, such as the credentials of the Mega account (see below)
  • duplicity and MegaFuse both keep a local cache of metadata; storing those inside the container would do no good, so I also put them in host-mapped folders (/vol/dupmega/cache/duplicity/ and /vol/dupmega/cache/megafuse/)
  • Of course, we want to back up the host, not the container, so we need to map the host filesystem into the container as well, at /source
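Before the first run, that host-side layout has to exist. The post doesn't show this step, so here is a minimal sketch; the mkdir/chmod commands are my assumption, and BASE defaults to a local folder so it is safe to try anywhere (on the real host it would be /vol/dupmega):

```shell
# Prepare the host-side folders the container expects.
# BASE defaults to ./dupmega so this is safe to run anywhere;
# on the real host you would set BASE=/vol/dupmega.
BASE="${BASE:-$PWD/dupmega}"
mkdir -p "$BASE/cache/duplicity" "$BASE/cache/megafuse"
touch "$BASE/megafuse.conf"
chmod 600 "$BASE/megafuse.conf"   # it will hold the Mega password
```

Restricting megafuse.conf to mode 600 matters because, as shown below, it stores the Mega password in plain text.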

The megafuse.conf contains:

USERNAME = [email protected]
PASSWORD = aV3ryComplexMegaP4ssw0rd
MOUNTPOINT = /mega
CACHEPATH = /dupmega/cache/megafuse

So the Mega account is mounted in /mega in the container, and the MegaFuse cache will be in /dupmega/cache/megafuse (host mounted volume).

Here is the Dockerfile I used to create my duplicity container with MegaFuse support. It's not yet published to Docker Hub, and it's still very rudimentary; there isn't even a CMD.

FROM ubuntu:14.04
MAINTAINER Alexandre Dumont <[email protected]>

ENV DEBIAN_FRONTEND=noninteractive

RUN sed -i -e '/^deb-src/ s/^/#/' /etc/apt/sources.list && \
echo "force-unsafe-io" > /etc/dpkg/dpkg.cfg.d/02apt-speedup && \
echo "Acquire::http {No-Cache=True;};" > /etc/apt/apt.conf.d/no-cache && \
apt-get update && \
apt-get -qy dist-upgrade && \
apt-get install -y libcrypto++-dev libcurl4-openssl-dev libfreeimage-dev libreadline-dev libfuse-dev libdb++-dev duplicity git g++ make && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
git clone https://github.com/matteoserva/MegaFuse && \
cd MegaFuse && \
make

I use the following command to build the image:

docker build -t adumont/dupmega --no-cache=true .

And here’s the image:

REPOSITORY        TAG      IMAGE ID       CREATED      SIZE
adumont/dupmega   latest   3bf1d313aa56   7 days ago   581.8 MB

I still have to work on simplifying the whole process of running a backup. For now I do it rather manually, but most likely it will become a script scheduled in cron.
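As a sketch of where this is heading, the docker run invocation below could be wrapped in a small script and dropped into cron. Everything here is an assumption (the script name, the DRY_RUN toggle); the dry-run mode just prints the command, so the script can be tried without Docker or a Mega account:

```shell
#!/bin/sh
# Hypothetical cron wrapper (e.g. /usr/local/bin/dupmega-backup).
# DRY_RUN=1 (the default here) only prints the command instead of
# executing it, so the script can be tested safely.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Same volume mappings as the interactive invocation, minus -ti,
# which has no place in a cron job.
run docker run --rm -h "$(hostname)" --privileged \
  -v /vol/dupmega:/dupmega \
  -v /root/.gnupg:/root/.gnupg \
  -v /vol/dupmega/cache/duplicity:/root/.cache/duplicity \
  -v /:/source:ro adumont/dupmega
```

A crontab entry like `0 3 * * * DRY_RUN=0 /usr/local/bin/dupmega-backup` would then schedule it nightly (again, an assumption, not my final setup).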

Right now that’s how I run it, notice the volume mapping from host to containers. Also notice the container has to run as privileged, so it can use Fuse from inside the container. The source to be backed up is mounted read-only (least privilege).

host# docker run --rm -h $( hostname ) -ti --privileged \
-v /vol/dupmega:/dupmega \
-v /root/.gnupg:/root/.gnupg \
-v /vol/dupmega/cache/duplicity:/root/.cache/duplicity \
-v /:/source:ro adumont/dupmega

Then from inside the container, I run:

mkdir /mega; MegaFuse/MegaFuse -c /dupmega/megafuse.conf &>/dev/null &
sleep 10
[ -d /mega/backups/$(hostname) ] || exit 1

export PASSPHRASE=AnotherVeryComplexPassphaseForGPGEncrypti0n

and finally the duplicity command, which backs up /source to /mega. The command is different on each server, as I tweak which files/folders I want to include/exclude in the backup:

duplicity --asynchronous-upload \
--include=/source/var/lib/docker/containers \
--include=/source/var/lib/plexmediaserver/Library/Application\ Support/Plex\ Media\ Server/Preferences.xml \
--exclude=/source/dev \
--exclude=/source/proc \
--exclude=/source/run \
--exclude=/source/sys \
--exclude=/source/zfs \
--exclude=/source/mnt \
--exclude=/source/media \
--exclude=/source/vol/dupmega/cache \
--exclude=/source/tank \
--exclude=/source/vbox \
--exclude=/source/var/lib/docker \
--exclude=/source/var/lib/plexmediaserver/ \
--exclude=/source/tmp \
--exclude=/source/var/tmp \
--exclude=/source/var/cache \
--exclude=/source/var/log \
/source/ file:///mega/backups/$(hostname)/ -v info

And this would be a sample output:

Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Thu Oct 20 22:51:23 2016
Deleting /tmp/duplicity-Cwz5Ju-tempdir/mktemp-0lEM40-2
Using temporary directory /root/.cache/duplicity/185b874acbbae73c5807a4cc767e4967/duplicity-kYzeVT-tempdir
Using temporary directory /root/.cache/duplicity/185b874acbbae73c5807a4cc767e4967/duplicity-jWJXmd-tempdir
AsyncScheduler: instantiating at concurrency 1
M boot/grub/grubenv
A etc
A etc/apparmor.d/cache
M etc/apparmor.d/cache/docker
[...]
---------------[ Backup Statistics ]--------------
StartTime 1477499247.82 (Wed Oct 26 16:27:27 2016)
EndTime 1477499514.97 (Wed Oct 26 16:31:54 2016)
ElapsedTime 267.15 (4 minutes 27.15 seconds)
SourceFiles 449013
SourceFileSize 3239446764 (3.02 GB)
NewFiles 75
NewFileSize 189457 (185 KB)
DeletedFiles 135
ChangedFiles 68
ChangedFileSize 176144101 (168 MB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 278
RawDeltaSize 9317325 (8.89 MB)
TotalDestinationSizeChange 1913375 (1.82 MB)
Errors 0
-------------------------------------------------
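One of my initial requirements, retention policies, isn't shown above yet; duplicity covers it with its remove-older-than subcommand, and restores go through the restore subcommand. Here is a hedged sketch of both, run from inside the same container (where PASSPHRASE is exported); the 2-month window and the restored file are illustrative examples, not my actual settings:

```shell
# Same destination URL as the backup command above.
TARGET="file:///mega/backups/$(hostname)/"

if command -v duplicity >/dev/null 2>&1; then
  # Drop backup chains older than 2 months (example window).
  duplicity remove-older-than 2M --force "$TARGET"

  # Restore a single file from the latest backup into /tmp.
  duplicity restore --file-to-restore etc/hostname \
    "$TARGET" /tmp/hostname.restored
else
  echo "duplicity not installed; commands shown for reference"
fi
```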

And the backups are really stored on Mega 😉 :

backup files on Mega
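Besides eyeballing the Mega web interface, duplicity itself can report what is stored at the destination. A quick sanity check with its collection-status subcommand (same target URL as the backup command) would look like this:

```shell
# List the backup chains and sets stored at the destination.
TARGET="file:///mega/backups/$(hostname)/"
if command -v duplicity >/dev/null 2>&1; then
  duplicity collection-status "$TARGET"
else
  echo "duplicity not installed; command shown for reference"
fi
```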