Skip to content
Bruno Sonnino
Menu
  • Home
  • About
Menu

Poor man’s backup

Posted on 20 February 2017

Introduction

When you have something digital, having backups is something fundamental to keep your data safe. There are many threats over there that can destroy your data: hardware failures, viruses, natural disasters are some of the ways to make all your data vanish from one moment to the other.

I use to keep my data in several places (you can never be 100% sure 😃), making cloud and local backups, and I can say that they have saved me more than once. For cloud backups, there are several services out there and I won’t discuss them, but for local backups, my needs are very specific, and I don’t need the fancy stuff out there (disk images, copying blocked data, and so on). I just need a backup that has these features:

  • Copies data in a compressed way – it would be better that it’s a standard format, like zip files, so I can open the backups with normal tools and don’t need to use the tool to restore data.
  • Allows the copy of selected folders in different drives. I don’t want to copy my entire disk (why keep a copy of the Windows installation, or the installed programs, if I can reinstall them when I need).
  • Allows the exclusion of files and folders in the copy (I just want to copy my source code, there is no need to copy the executables and dlls).
  • Allows incremental (only the files changed) or full backup (all files in a set)
  • Can use backup configurations (I want to backup only my documents or only my source code, and sometimes both)
  • Can be scheduled and run at specified times without the need of manually starting it.

With these requirements, I started to look for backup programs out there and I have found some free ones that didn’t do everything I wanted and some paid ones that did everything, but I didn’t want to pay what they were asking for. So, being a developer, I decided to make my own backup with the free tools out there.

The first requirement is a compressed backup, with a standard format. For zip files, I need zip64, as the backup files can be very large and the normal zip files won’t handle large files. So, I decided to use the DotNetZip library (https://github.com/DinoChiesa/DotNetZip), an open source library that is very simple to use and supports Zip64 files. Now I can go to the next requirements. That can be done with a normal .NET console program.

Creating the backup program

In Visual Studio, create a new Console Program and, in the Solution Explorer, right-click the References node and select “Manage NuGet packages” and add the DotNetZip package. I don’t want to add specific code for working with the command line options, so I added a second package, CommandLineParser (https://github.com/gsscoder/commandline), that does this for me. I just have to create a new class with the options I want and it does all the parsing for me:

class Options
{
    [Option(DefaultValue = "config.xml", 
      HelpText = "Configuration file for the backup.")]
    public string ConfigFile { get; set; }

    [Option('i', "incremental", DefaultValue= false,
      HelpText = "Does an increamental backap.")]
    public bool Incremental { get; set; }

    [HelpOption]
    public string GetUsage()
    {
        return HelpText.AutoBuild(this,
          (HelpText current) => HelpText.DefaultParsingErrorsHandler(this, current));
    }
}
C#

To use it, I just have to pass the command line arguments and have it parsed:

var options = new Options();
CommandLine.Parser.Default.ParseArguments(args, options);
C#

It will even give me a –help command line for help:

The next step is to process the configuration file. Create a new class and name it Config.cs:

public class Config
{
    public Config(string fileName)
    {
        if (!File.Exists(fileName))
            return;
        var doc = XDocument.Load(fileName);
        if (doc.Root == null)
            return;
        IncludePaths = doc.Root.Element("IncludePaths")?.Value.Split(';');
        ExcludeFiles = doc.Root.Element("ExcludeFiles")?.Value.Split(';') ?? new string[0] ;
        ExcludePaths = doc.Root.Element("ExcludePaths")?.Value.Split(';') ?? new string[0];
        BackupFile = $"{doc.Root.Element("BackupFile")?.Value}{DateTime.Now:yyyyMMddhhmmss}.zip";
        ExcludeFilesRegex =
            new Regex(string.Join("|", string.Join("|", ExcludeFiles), string.Join("|", ExcludePaths)));
    }

    public Regex ExcludeFilesRegex { get; }
    public IEnumerable IncludePaths { get; }
    public IEnumerable ExcludeFiles { get; }
    public IEnumerable ExcludePaths { get; }
    public string BackupFile { get; }
}
C#

To make it easy to select the paths and files to be excluded, I decided to give it a Regex style and create a Regex that will match all files. For example, if you want to remove all mp3 files, you would add something like “.mp3$” (starts with a “.”, then mp3 and then the end of the string). If you want to remove mp3 and mp4 files, you can add this: “.mp[34]$”. For the paths, you get the same thing, but they start and end with a slash (double slash, for the regex).

With this in place, we can start our backup. Create a new class and call it Backup.cs. Add this code to it:

class Backup
{
    private readonly FileFinder _fileFinder = new FileFinder();

    public async Task DoBackup(Config config, bool incremental)
    {
        var files = await _fileFinder.GetFiles(config.IncludePaths.ToArray(), 
               config.ExcludeFilesRegex, incremental);
        using (ZipFile zip = new ZipFile())
        {
            zip.UseZip64WhenSaving = Zip64Option.AsNecessary;
            foreach (var path in files)
                zip.AddFiles(path.Value, false, path.Key);
            zip.Save(config.BackupFile);
        }
        foreach (var file in files.SelectMany(f => f.Value))
            ResetArchiveAttribute(file);
        return 0;
    }

    public void ResetArchiveAttribute(string fileName)
    {
        var attributes = File.GetAttributes(fileName);
        File.SetAttributes(fileName, attributes & ~FileAttributes.Archive);
    }
}
C#

This class uses a FileFinder class to find all files that match the pattern we want and creates a zip file. The GetFiles method from FileFinder returns a dictionary structured like this:

  • The key is a path related to the search path. As the paths can be on any disk of your system and they can have the same names (ex C:\Temp and D:\Temp), and that would not be ok in the zip file, the paths are changed to reflect the same structure, but their names are changed to allow to be added to the zip files. That way, if I am searching in C:\Temp and in D:\Temp, the keys for this dictionary would be C_Drive\Temp and D_Drive\Temp. That way, both paths will be stored in the zip and they wouldn’t clash. These keys are used to change the paths when adding the files to the zip
  • The value is a list of files found in that path

The files are added to the zip and, after that, their Archive bit is reset. This must be done, so the incremental backup can work in the next time: incremental backups are based on the Archive bit: if it’s set, the file was modified and it should be backed up. If not, the file was untouched. This is not a foolproof method, but it works fine for most cases. A more foolproof way to do this would be to keep a log file every full backup with the last modified dates of the files and compare them with the current file dates. This log should be updated every backup. For my case, I think that this is too much and the archive bit is enough.

The FileFinder class is like this one:

class FileFinder
{
    public async Task<ConcurrentDictionary<string, List>> GetFiles(string[] paths, 
        Regex regex, bool incremental)
    {
        var files = new ConcurrentDictionary<string, List>();
        var tasks = paths.Select(path =>
            Task.Factory.StartNew(() =>
            {
                var rootDir = "";
                var drive = Path.GetPathRoot(path);
                if (!string.IsNullOrWhiteSpace(drive))
                {
                    rootDir = drive[0] + "_drive";
                    rootDir = rootDir + path.Substring(2);
                }
                else
                    rootDir = path;
                var selectedFiles = Enumerable.Where(GetFilesInDirectory(path), f => 
                     !regex.IsMatch(f.ToLower()));
                if (incremental)
                    selectedFiles = selectedFiles.Where(f => (File.GetAttributes(f) & FileAttributes.Archive) != 0);
                files.AddOrUpdate(rootDir, selectedFiles.ToList(), (a, b) => b);
            }));
        await Task.WhenAll(tasks);
        return files;
    }

    private List GetFilesInDirectory(string directory)
    {
        var files = new List();
        try
        {
            var directories = Directory.GetDirectories(directory);
            try
            {
                files.AddRange(Directory.EnumerateFiles(directory));
            }
            catch
            {
            }
            foreach (var dir in directories)
            {
                files.AddRange(GetFilesInDirectory(Path.Combine(directory, dir)));
            }
        }
        catch
        {
        }

        return files;
    }
}
C#

The main method of this class is GetFiles. It is an asynchronous method, I will create a new task for every search path. The result is a ConcurrentDictionary, and it has to be so, because there are many threads updating it at once and we could have concurrency issues. The ConcurrentDictionary handles locking when adding data from different threads.

The GetFilesInDirectory finds all files in one directory and, after all files are found, the data is filtered according to the Regex and, if the user asks for an incremental backup, the files are checked for their archive bit set. With this set of files, I can add them to the zip and have a backup file that can be read with standard programs.

Just one requirement remains: to have a scheduled backup. I could make the program stay in the system tray and fire the backup at the scheduled times, but there is an easier way to do it: use the Windows task scheduler. You just need to open a command prompt and type the command:

schtasks /create /sc daily /st "08:15" /tn "Incremental Backup" /tr "D:\Projetos\Utils\BackupData\BackupData\bin\Debug\Backupdata.exe -i"
PowerShell

That will create a scheduled task that will run the incremental backup every day at 8:15. The main program for this backup is very simple:

static void Main(string[] args)
{
    var options = new Options();
    CommandLine.Parser.Default.ParseArguments(args, options);
    if (string.IsNullOrWhiteSpace(options.ConfigFile))
        return;
    if (string.IsNullOrWhiteSpace(Path.GetDirectoryName(options.ConfigFile)))
    {
        var currentDir = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
        if (!string.IsNullOrWhiteSpace(currentDir))
            options.ConfigFile = Path.Combine(currentDir, options.ConfigFile);
    }
    var config = new Config(options.ConfigFile);
    var backup = new Backup();
    var result = backup.DoBackup(config, options.Incremental).Result;

}
C#

I will parse the arguments, read and parse the config file, create the backup and exit. As you can see, the last line calls DoBackup.Result. This is because the Main method cannot be async and, if I just run it without calling async, it would not wait and would exit without running the backup. Calling result, the program will wait for the task completion.

Just one issue, here – if you wait for the task schedule to fire, you will see that a console window appears, and we don’t want that this happens while we are doing something else. One way to hide the console window is to go to the app properties and set the output type as a Windows application. That will be enough to hide the console window:

Conclusions

As you can see, it’s not too difficult to make a backup program using open source tools we have available. This program is very flexible, small and not intrusive. It can run anytime you want and have different configuration files. Not bad, huh?

The full source code for the project is in https://github.com/bsonnino/BackupData

1 thought on “Poor man’s backup”

  1. Pingback: Backup Revisited – Bruno Sonnino

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • May 2025
  • December 2024
  • October 2024
  • August 2024
  • July 2024
  • June 2024
  • November 2023
  • October 2023
  • August 2023
  • July 2023
  • June 2023
  • May 2023
  • November 2022
  • October 2022
  • September 2022
  • August 2022
  • June 2022
  • April 2022
  • March 2022
  • February 2022
  • January 2022
  • July 2021
  • June 2021
  • May 2021
  • April 2021
  • March 2021
  • February 2021
  • January 2021
  • December 2020
  • October 2020
  • September 2020
  • April 2020
  • March 2020
  • January 2020
  • November 2019
  • September 2019
  • August 2019
  • July 2019
  • June 2019
  • April 2019
  • March 2019
  • February 2019
  • January 2019
  • December 2018
  • November 2018
  • October 2018
  • September 2018
  • August 2018
  • July 2018
  • June 2018
  • May 2018
  • November 2017
  • October 2017
  • September 2017
  • August 2017
  • June 2017
  • May 2017
  • March 2017
  • February 2017
  • January 2017
  • December 2016
  • November 2016
  • October 2016
  • September 2016
  • August 2016
  • July 2016
  • June 2016
  • May 2016
  • April 2016
  • March 2016
  • February 2016
  • October 2015
  • August 2013
  • May 2013
  • February 2012
  • January 2012
  • April 2011
  • March 2011
  • December 2010
  • November 2009
  • June 2009
  • April 2009
  • March 2009
  • February 2009
  • January 2009
  • December 2008
  • November 2008
  • October 2008
  • July 2008
  • March 2008
  • February 2008
  • January 2008
  • December 2007
  • November 2007
  • October 2007
  • September 2007
  • August 2007
  • July 2007
  • Development
  • English
  • Português
  • Uncategorized
  • Windows

.NET AI Algorithms asp.NET Backup C# Debugging Delphi Dependency Injection Desktop Bridge Desktop icons Entity Framework JSON Linq Mef Minimal API MVVM NTFS Open Source OpenXML OzCode PowerShell Sensors Silverlight Source Code Generators sql server Surface Dial Testing Tools TypeScript UI Unit Testing UWP Visual Studio VS Code WCF WebView2 WinAppSDK Windows Windows 10 Windows Forms Windows Phone WPF XAML Zip

  • Entries RSS
  • Comments RSS
©2025 Bruno Sonnino | Design: Newspaperly WordPress Theme
Menu
  • Home
  • About