Bastrama - Backup Strategy Manager

Bastrama is a command-line tool for managing backup files that are stored on random-access storage media (e.g. hard drives). It implements an infinite grandfather-father-son strategy: it deletes a defined subset of the backup files, which saves storage space while still keeping some of the older backups in case something went wrong a long time ago and you need to restore from before that point.

The idea is that you can run full daily backups to a hard drive and practically never run out of space. Of course, you still have to create the backups yourself (using the backup tool of your choice). Bastrama only manages the resulting backup files.



Backup Strategies

From your backup files, numbered linearly starting from 0, Bastrama builds a "tree" in which every node has n children. From every level of the tree, Bastrama keeps the latest k files. The rest are deleted.
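The selection rule can be sketched in a few lines of Python. This is not Bastrama's actual code: the function name kept_backups and the reading of "level j of the tree" as "all backups whose number is a multiple of n**j" are assumptions, inferred from the description above and from the example tables further down this page, which the sketch reproduces for n=3 with k=3 and k=6.

```python
def kept_backups(latest, n, k):
    """Return the sorted numbers of the backups to keep, for backups 0..latest.

    Assumed rule (inferred, not taken from Bastrama's source): level j of the
    tree holds the backups whose number is a multiple of n**j, and the latest
    k backups of every level are kept.
    """
    if latest < 0 or n < 2 or k < 1:
        raise ValueError("need latest >= 0, n >= 2 and k >= 1")
    keep = set()
    step = 1  # n**j, the spacing between backups on level j
    while True:
        newest = latest - latest % step  # newest backup on this level
        for i in range(k):
            number = newest - i * step
            if number < 0:
                break  # ran past the oldest backup
            keep.add(number)
        if step > latest:
            break  # this level and all higher ones contain only backup 0
        step *= n
    return sorted(keep)


if __name__ == "__main__":
    latest = 10000
    for number in reversed(kept_backups(latest, n=3, k=3)):
        print(f"# {number}  age {latest - number}")
```

Every backup number that is not in the returned list would be a candidate for deletion. Because the topmost level contains only backup 0, the oldest backup always survives under this reading.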

Here are some charts to make this clear. Each chart shows a sequence of backups from left to right, numbered from 0 to 68, where 0 denotes the oldest backup and 68 the latest (e.g. "today's"). Green files are kept, gray ones are deleted.

Chart for n=2, k=2

Chart for n=2, k=4

Chart for n=3, k=3

Chart for n=3, k=6

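It can be seen that the oldest and the latest backup are always kept. Between them, kept backups become sparser with age: each level of the tree spaces its files n times further apart than the level below, so the gaps between kept backups grow exponentially from level to level.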

As a consequence, the required storage space grows only logarithmically over time, which is very slow. (This assumes that later backup files are not larger than older ones.)
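To get a feeling for how slowly the number of kept files grows, the following Python sketch counts them for increasing numbers of backup cycles. It is an illustration under an assumed selection rule (level j of the tree consists of the backups whose number is a multiple of n**j), not Bastrama's own code; count_kept and upper_bound are hypothetical helper names. Under that rule each level contributes at most k files, which gives the simple bound k * (number of levels) + 1.

```python
def count_kept(latest, n, k):
    """Count the kept backups among 0..latest under the assumed rule:
    keep the latest k multiples of n**j for every level j."""
    keep = set()
    step = 1  # n**j
    while True:
        newest = latest - latest % step
        keep.update(newest - i * step for i in range(k) if newest - i * step >= 0)
        if step > latest:
            break  # only backup 0 lives on this level and above
        step *= n
    return len(keep)


def upper_bound(latest, n, k):
    """k files for every level whose spacing n**j fits into 0..latest,
    plus one for backup 0, which is always kept."""
    levels = 0
    step = 1
    while step <= latest:
        levels += 1
        step *= n
    return k * levels + 1


if __name__ == "__main__":
    for latest in (10, 100, 1000, 10000, 100000, 1000000):
        print(latest, count_kept(latest, 3, 3), "<=", upper_bound(latest, 3, 3))
```

Multiplying the number of backup cycles by n adds one level to the tree, and therefore at most k further kept files.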

An example: after 10000 backup cycles with an n=3, k=3 strategy, 18 files are kept:

file kept   age (cycles)
# 10000     0
# 9999      1
# 9998      2
# 9996      4
# 9993      7
# 9990      10
# 9981      19
# 9963      37
# 9936      64
# 9882      118
# 9801      199
# 9720      280
# 9477      523
# 8748      1252
# 8019      1981
# 6561      3439
# 4374      5626
# 0         10000

If you have more space, use an n=3, k=6 strategy on the same backups and keep 33 of them:

file kept   age (cycles)
# 10000     0
# 9999      1
# 9998      2
# 9997      3
# 9996      4
# 9995      5
# 9993      7
# 9990      10
# 9987      13
# 9984      16
# 9981      19
# 9972      28
# 9963      37
# 9954      46
# 9936      64
# 9909      91
# 9882      118
# 9855      145
# 9801      199
# 9720      280
# 9639      361
# 9558      442
# 9477      523
# 9234      766
# 8991      1009
# 8748      1252
# 8019      1981
# 7290      2710
# 6561      3439
# 5832      4168
# 4374      5626
# 2187      7813
# 0         10000
