Skip to content

regex

Selective pruning of old rsync backups

In yesterday's post, I described a couple rsync oddities, and how they'd led me to this modified command for pruning old (older than four days) backups:

find /path/to/backups/ -d 1 -type d -Bmin +$((60*4*24)) -maxdepth 1 -exec rm -r {} +

After getting this working, though, I wondered if it'd be possible to keep my backups from the first day of each month, even while clearing out the other dates. After some digging in the rsync man page, and testing in Terminal, it appears it's possible, with some help from regex.

My backup folders are named with a trailing date and time stamp, like this:

back-2017-05-01_2230
back-2017-05-02_0534
back-2017-05-02_1002

To keep any backups made on the first of any month, for my folder naming schema, the modified find command would look like this:

find /path/to/backups/ -d 1 -type d -Bmin +$((60*4*24)) -maxdepth 1 -not -regex ".*-01_.*" -exec rm -r {} +

The new bits, -not -regex ".*-01_.*" basically say "find only files that do not contain anything surrounding a string that is 'hyphen 01 underscore.' And because only backups made on the first of the month will contain that pattern, they're the only ones that will be left out of the purge.

This may be of interest to maybe two people out there; I'm documenting it so I remember how it works!



The little I know about regex…and where to learn more

First off, regex is shorthand for a regular expression. And what, exactly, is a regular expression? According to the linked Wikipedia page, a regular expression is…

…in theoretical computer science and formal language theory, a sequence of characters that define a search pattern. Usually this pattern is then used by string searching algorithms for "find" or "find and replace" operations on strings.

That's a mouthful, but what it means is that you can write some really bizarre looking code that will transform text from one form to another form. And if you know just a bit of regex, and where to go to look up what you don't know, then you can use regex to do many useful things.

For example, consider this filename on a scanned-to-PDF receipt:

The Party Place [party supplies] - 02-06-2017

Perhaps you'd prefer it if the date came first, in year-month-day order, so that your receipts were ordered by date, like this:

2017-02-06 - The Party Place [party supplies]

Sure, you could manually rename this one file, but what if you have 500 receipts that you need to rename? Enter regular expressions—they'll let you do this text manipulation, and many more. What follows is a very brief summary of my knowledge of regex, along with pointers to sites where I go when (very often) the problem I need to solve is beyond my regex skill level.

[continue reading…]