Loci is a Python script that backs up a directory to a server using rsync - it keeps track of the backups that have been done, and multiple backups may be kept. Because rsync handles the transfers, only what has changed is copied, and single files can be recovered from a backup if needed.

loci -b tag : Back up under the given tag (I used days of the week)
loci -l : List backups, showing which tags are unused, which backups are due to run, and which have been run more than 5 times. I refresh these.
loci -r tag : Refresh a tag’s backup - delete the files under that tag and its .backuplog entries, to prepare for a fresh backup with loci -b
~/.backuplog : A file in .csv format that keeps track of the backups done.
~/.config/loci/settings : The settings file. Fully commented (a sketch of the format is just below).
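In case the settings file is hard to picture: going by the commented header in the bash translation further down, it looks roughly like this. Every value here is a placeholder, not taken from a real config:

```ini
[backup]

server = backuphost
user = me
backup_root = backups
taglist = mon tue wed thu fri sat sun spc
exclude_files =
source_dir = /home/me/documents
```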
Nice work, but if I may suggest - it lacks hardlink support, so it’s quite wasteful in terms of disk space - the number of ‘tags’ (snapshots) you can afford to keep will be extremely limited.
At least two robust solutions that use rsync+hardlinks already exist: rsnapshot.org and dirvish.org (both written in perl). There’s definitely room for backup tools that produce plain copies, instead of packed chunk data like restic and Duplicacy, and a python or even bash-based tool might be nice, so keep at it.
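To illustrate the hardlink idea (just a sketch, not how loci works - the paths and names below are made up): rsync’s --link-dest hardlinks every unchanged file against the previous snapshot, so each additional snapshot only costs the space of what actually changed.

```bash
#!/bin/bash
# Hardlink-snapshot sketch (hypothetical paths, not part of loci).
src="/home/me/documents"        # directory being backed up
snap_root="/home/me/snapshots"  # snapshot directory on the backup server
new="$(date +%Y-%m-%d)"         # name of today's snapshot

# --link-dest is resolved relative to the destination directory, so
# ../latest points at the previous snapshot. On the very first run the
# link target doesn't exist yet; rsync warns and just does a full copy.
rsync -avh --delete \
      --link-dest="../latest" \
      "$src/" "me@server:$snap_root/$new/"

# Re-point 'latest' at the snapshot we just made.
ssh me@server "ln -snf '$new' '$snap_root/latest'"
```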
However, I liken backup software to encryption - extreme care must be taken when rolling and using your own. Whatever tool you use, test test test the backups. :)
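A cheap way to spot-check a plain rsync copy, for example, is a checksum dry run against the live data - it should report nothing left to transfer (paths are placeholders again):

```bash
# -n = dry run, -c = compare by checksum rather than size/mtime.
# Any file listed here differs between the source and the backup.
rsync -avhnc --delete /home/me/documents/ me@server:/home/me/backups/sun/
```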
@droolio@feddit.uk I see what you’re asking. You’re wondering if, instead of storing a duplicate file when another backup set already contains it, I could use a hardlink to point to the file already stored in that other set?
I have a system where I create a backup set for each day of the week. When I do a backup for that day, I update the set, or if it’s out of date, I replace it entirely with a fresh backup image (After 7 backups to that set). But if the backup sets became inter-dependent, removing or updating one set could lead to problems with others that rely on files in the first set.
Does that make sense? I am asking because I am not familiar with the utilities you mentioned and may be taking your post wrong.
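For reference, hardlinks don’t create that kind of dependency: each name is a separate directory entry pointing at the same inode, and the data is only freed when the last link goes away. A throwaway demo (made-up paths):

```bash
# Two "snapshot" directories sharing one file via a hardlink.
mkdir -p /tmp/hl-demo/sun /tmp/hl-demo/mon
echo "important data" > /tmp/hl-demo/sun/file.txt
ln /tmp/hl-demo/sun/file.txt /tmp/hl-demo/mon/file.txt

ls -li /tmp/hl-demo/*/file.txt   # same inode number, link count 2

# Removing the whole 'sun' snapshot leaves 'mon' intact:
rm -rf /tmp/hl-demo/sun
cat /tmp/hl-demo/mon/file.txt    # still prints "important data"
```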
A hilariously unnecessary Python script that could have easily been done in bash since it’s literally just a wrapper around rsync. 😅
When you’ve only got a Python-sized hammer in your toolbox, everything looks like a Python nail, I guess.
```bash
#!/bin/bash

# Function to read settings
# Settings file format:
# ~/.config/loci/settings
# [backup]
#
# server = <<Name of server>>
# user = <<server user login>>
# backup_root = <<Directory off user's home Directory>>
# taglist = mon tue wed thu fri sat sun spc
# exclude_files = <<not implemented yet - leave blank>>
# source_dir = <<the local directory we are backing up>>
read_settings() {
    settings_file="$HOME/.config/loci/settings"
    if [[ -f "$settings_file" ]]; then
        while IFS='=' read -r key value || [[ -n "$key" ]]; do
            if [[ ! -z "$key" && ! "$key" =~ ^# && ! "$key" =~ ^\[ ]]; then
                key=$(echo "$key" | xargs)
                value=$(echo "$value" | xargs)
                declare -g "$key"="$value"
            fi
        done < "$settings_file"
    else
        echo "Settings file not found: $settings_file"
        exit 1
    fi
}

# Function to perform the backup
backup() {
    local tag="$1"
    read_settings

    # Create backup directory if it doesn't exist
    backup_dest="$backup_root/$tag"
    mkdir -p "$backup_dest" 2>/dev/null

    # Rsync command for backup
    target="$user@$server:/home/$user/$backup_root/$tag"
    rsync_cmd="rsync -avh $source_dir $target"

    # If exclude_files is defined and not empty, add it to rsync command
    if [[ -n "$exclude_files" ]]; then
        rsync_cmd="rsync -avh --exclude='$exclude_files' $source_dir $target"
    fi

    echo "Command:$rsync_cmd"
    eval "$rsync_cmd"

    # Log the backup information
    log_path="$HOME/.backuplog"
    timestamp=$(date +"%Y-%m-%d %H:%M")
    echo "\"$tag\",$timestamp,$rsync_cmd,$timestamp" >> "$log_path"

    echo "Backup for '$tag' completed and logged."
}

# Function to remove the backup
remove_backup() {
    local tag="$1"
    read_settings

    # Rsync remove command
    rmfile="/home/$user/$backup_root/$tag"
    rm_cmd="ssh $user@$server rm -rf $rmfile"
    eval "$rm_cmd"

    # Remove log entries
    log_path="$HOME/.backuplog"
    if [[ -f "$log_path" ]]; then
        # Create a temporary file
        temp_file=$(mktemp)
        # Copy lines not starting with the tag to temp file
        grep -v "^\"$tag\"," "$log_path" > "$temp_file"
        # Replace the original with filtered content
        mv "$temp_file" "$log_path"
    fi

    echo "Backup '$tag' removed."
}

# Function to list the backups
list_backups() {
    read_settings
    log_path="$HOME/.backuplog"

    # Loop through each tag in the taglist
    for tag in $taglist; do
        # Count occurrences of the tag in the log
        count=0
        youngest=""
        if [[ -f "$log_path" ]]; then
            # Get count of tag occurrences
            count=$(grep -c "^\"$tag\"," "$log_path")
            # Get the newest backup date for this tag
            if [[ $count -gt 0 ]]; then
                # Extract dates and find the newest one
                dates=$(grep "^\"$tag\"," "$log_path" | cut -d',' -f2)
                youngest=$(echo "$dates" | sort -r | head -1)
            fi
        fi

        # Determine status
        if [[ $count -eq 0 ]]; then
            status="Missing"
        elif [[ $count -gt 5 ]]; then
            status="Needs renewal"
        elif [[ ! -z "$youngest" ]]; then
            # Calculate days since last backup
            youngest_seconds=$(date -d "$youngest" +%s)
            now_seconds=$(date +%s)
            days_diff=$(( (now_seconds - youngest_seconds) / 86400 ))
            if [[ $days_diff -gt 7 ]]; then
                status="Needs to be run"
            else
                status="Up to date"
            fi
        else
            status="Missing"
        fi

        echo "Tag: $tag, Status: $status, Count: $count, Last Backup: ${youngest:-N/A}"
    done
}

# Main function
main() {
    if [[ "$1" == "-b" || "$1" == "--backup" ]] && [[ ! -z "$2" ]]; then
        backup "$2"
    elif [[ "$1" == "-r" || "$1" == "--remove" ]] && [[ ! -z "$2" ]]; then
        remove_backup "$2"
    elif [[ "$1" == "-l" || "$1" == "--list" ]]; then
        list_backups
    else
        echo "Usage: loci -b <tag> | loci -r <tag> | loci -l"
    fi
}

# Execute the script
main "$@"
```
It’s also to help me learn python. And it works for me. : ^ )
Bash does seem like a better fit for this kind of script since it is a lot more portable.
I.e.: it comes by default on many Linux distributions. For Windows, a Git Bash install will get you most of the utilities needed for large, reliable scripts (grep, scp, find, sort, uniq, cat, tr, ls, etc.).
With that said, you should write it in whatever language you want, especially if it is for learning purposes - that’s where the fun comes from :)
Don’t mind him. Any time someone shares code, there’s always someone else who did nothing talking about how much better your code could have been. Just noise from the peanut gallery.
Yeah, no problem… I started out with just bare rsync - but I did the backup infrequently and needed my notes to remember the command. Then I wrote a simple shell script to run the rsync for me. Then I decided I needed more than one backup - redundancy is good. Then I wanted to keep track of the backups, so I had it write to .backuplog. Then that file started getting dated (every time I run a “sun” backup, the record of the previous one is useless), so finally, ta-da, loci is born.
lol, you’re braver than me. No one ever sees the “code” I’ve written.
That’s OK. Like any landing you can walk away from - any code that runs to spec is good, though much could be better.
Looks like a line-by-line translation from the Python. Will you use it to back up your home directory?
No.
It doesn’t really do anything I particularly need.
No need to be a dick
Do you wanna share a bash script, then?
Especially one that lets you know how long it’s been since you last took the time to run a backup, keeps track of which backup sets could be updated and which should be refreshed, and keeps a log file up to date in .csv format so you can mess with it in a spreadsheet?
```bash
#!/bin/bash

read_settings() {
    settings_file="$HOME/.config/loci/settings"
    if [[ -f "$settings_file" ]]; then
        while IFS='=' read -r key value || [[ -n "$key" ]]; do
            if [[ ! -z "$key" && ! "$key" =~ ^# && ! "$key" =~ ^\[ ]]; then
                key=$(echo "$key" | xargs)
                value=$(echo "$value" | xargs)
                declare -g "$key"="$value"
            fi
        done < "$settings_file"
    else
        echo "Settings file not found: $settings_file"
        exit 1
    fi
}

# Function to perform the backup
backup() {
    local tag="$1"
    read_settings

    log_path="$HOME/.backuplog"

    # Check if header exists in log file, if not, create it
    if [[ ! -f "$log_path" ]]; then
        echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$log_path"
    elif [[ $(head -1 "$log_path") != "\"tag\",\"timestamp\",\"command\",\"completion_time\"" ]]; then
        # Add header if it doesn't exist
        temp_file=$(mktemp)
        echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$temp_file"
        cat "$log_path" >> "$temp_file"
        mv "$temp_file" "$log_path"
    fi

    # Create backup directory if it doesn't exist
    backup_dest="$backup_root/$tag"
    mkdir -p "$backup_dest" 2>/dev/null

    # Rsync command for backup
    target="$user@$server:/home/$user/$backup_root/$tag"
    rsync_cmd="rsync -avh $source_dir $target"

    # If exclude_files is defined and not empty, add it to rsync command
    if [[ -n "$exclude_files" ]]; then
        rsync_cmd="rsync -avh --exclude='$exclude_files' $source_dir $target"
    fi

    echo "Starting backup for tag '$tag' at $(date '+%Y-%m-%d %H:%M:%S')"
    echo "Command: $rsync_cmd"

    # Record start time
    start_timestamp=$(date +"%Y-%m-%d %H:%M:%S")

    # Execute the backup
    eval "$rsync_cmd"
    backup_status=$?

    # Record completion time
    completion_timestamp=$(date +"%Y-%m-%d %H:%M:%S")

    # Calculate duration
    start_seconds=$(date -d "$start_timestamp" +%s)
    end_seconds=$(date -d "$completion_timestamp" +%s)
    duration=$((end_seconds - start_seconds))

    # Format duration
    if [[ $duration -ge 3600 ]]; then
        formatted_duration="$((duration / 3600))h $((duration % 3600 / 60))m $((duration % 60))s"
    elif [[ $duration -ge 60 ]]; then
        formatted_duration="$((duration / 60))m $((duration % 60))s"
    else
        formatted_duration="${duration}s"
    fi

    # Log the backup information as proper CSV
    echo "\"$tag\",\"$start_timestamp\",\"$rsync_cmd\",\"$completion_timestamp\"" >> "$log_path"

    if [[ $backup_status -eq 0 ]]; then
        echo -e "\e[32mBackup for '$tag' completed successfully\e[0m"
        echo "Duration: $formatted_duration"
        echo "Logged to: $log_path"
    else
        echo -e "\e[31mBackup for '$tag' failed with status $backup_status\e[0m"
    fi
}

# Function to remove the backup
remove_backup() {
    local tag="$1"
    read_settings

    echo "Removing backup for tag '$tag'..."

    # Rsync remove command
    rmfile="/home/$user/$backup_root/$tag"
    rm_cmd="ssh $user@$server rm -rf $rmfile"

    # Execute the removal command
    eval "$rm_cmd"
    rm_status=$?

    if [[ $rm_status -ne 0 ]]; then
        echo -e "\e[31mError: Failed to remove remote backup for tag '$tag'\e[0m"
        echo "Command failed: $rm_cmd"
        return 1
    fi

    # Remove log entries while preserving header
    log_path="$HOME/.backuplog"
    if [[ -f "$log_path" ]]; then
        # Create a temporary file
        temp_file=$(mktemp)
        # Copy header (first line) if it exists
        if [[ -s "$log_path" ]]; then
            head -1 "$log_path" > "$temp_file"
            # Only copy non-matching lines after header
            tail -n +2 "$log_path" | grep -v "^\"$tag\"," >> "$temp_file"
        else
            # If log is empty, add header
            echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$temp_file"
        fi
        # Replace the original with filtered content
        mv "$temp_file" "$log_path"
        echo -e "\e[32mBackup '$tag' removed successfully\e[0m"
        echo "Log entries for '$tag' have been removed from $log_path"
    else
        echo -e "\e[32mBackup '$tag' removed successfully\e[0m"
        echo "No log file found at $log_path"
    fi
}

# Function to list the backups with detailed timing information
list_backups() {
    read_settings
    log_path="$HOME/.backuplog"

    echo "Backup Status Report ($(date '+%Y-%m-%d %H:%M:%S'))"
    echo "========================================================="
    printf "%-8s %-15s %-10s %-20s %-15s\n" "TAG" "STATUS" "COUNT" "LAST BACKUP" "DAYS AGO"
    echo "--------------------------------------------------------"

    # Check if header exists in log file, if not, create it
    if [[ ! -f "$log_path" ]]; then
        echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$log_path"
        echo "Created new log file with CSV headers."
    elif [[ $(head -1 "$log_path") != "\"tag\",\"timestamp\",\"command\",\"completion_time\"" ]]; then
        # Add header if it doesn't exist
        temp_file=$(mktemp)
        echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$temp_file"
        cat "$log_path" >> "$temp_file"
        mv "$temp_file" "$log_path"
        echo "Added CSV headers to existing log file."
    fi

    # Loop through each tag in the taglist
    for tag in $taglist; do
        # Count occurrences of the tag in the log
        count=0
        youngest=""
        days_ago="N/A"
        if [[ -f "$log_path" ]]; then
            # Skip header when counting
            count=$(grep -c "^\"$tag\"," "$log_path")
            # Get the newest backup date for this tag
            if [[ $count -gt 0 ]]; then
                # Extract dates and find the newest one
                dates=$(grep "^\"$tag\"," "$log_path" | cut -d',' -f2)
                youngest=$(echo "$dates" | sort -r | head -1)
                # Calculate days since last backup
                if [[ ! -z "$youngest" ]]; then
                    youngest_seconds=$(date -d "$youngest" +%s)
                    now_seconds=$(date +%s)
                    days_diff=$(( (now_seconds - youngest_seconds) / 86400 ))
                    days_ago="$days_diff days"
                fi
            fi
        fi

        # Determine status with colored output
        if [[ $count -eq 0 ]]; then
            status="Missing"
            status_color="\e[31m$status\e[0m"   # Red
        elif [[ $count -gt 5 ]]; then
            status="Needs renewal"
            status_color="\e[33m$status\e[0m"   # Yellow
        elif [[ ! -z "$youngest" ]]; then
            # Calculate days since last backup
            youngest_seconds=$(date -d "$youngest" +%s)
            now_seconds=$(date +%s)
            days_diff=$(( (now_seconds - youngest_seconds) / 86400 ))
            if [[ $days_diff -gt 7 ]]; then
                status="Needs to be run"
                status_color="\e[33m$status\e[0m"   # Yellow
            else
                status="Up to date"
                status_color="\e[32m$status\e[0m"   # Green
            fi
        else
            status="Missing"
            status_color="\e[31m$status\e[0m"   # Red
        fi

        printf "%-8s %-15b %-10s %-20s %-15s\n" "$tag" "$status_color" "$count" "${youngest:-N/A}" "$days_ago"
    done

    echo "--------------------------------------------------------"
    echo "CSV log file: $log_path"
    echo "Run 'loci -l' to refresh this status report"
}

# Function to show backup stats
show_stats() {
    read_settings
    log_path="$HOME/.backuplog"

    if [[ ! -f "$log_path" ]]; then
        echo "No backup log found at $log_path"
        return 1
    fi

    echo "Backup Statistics"
    echo "================="

    # Total number of backups
    total_backups=$(grep -v "^\"tag\"" "$log_path" | wc -l)
    echo "Total backups: $total_backups"

    # Backups per tag
    echo -e "\nBackups per tag:"
    for tag in $taglist; do
        count=$(grep "^\"$tag\"," "$log_path" | wc -l)
        echo " $tag: $count"
    done

    # Last backup time for each tag
    echo -e "\nLast backup time:"
    for tag in $taglist; do
        latest=$(grep "^\"$tag\"," "$log_path" | cut -d',' -f2 | sort -r | head -1)
        if [[ -z "$latest" ]]; then
            echo " $tag: Never"
        else
            # Calculate days ago
            latest_seconds=$(date -d "$latest" +%s)
            now_seconds=$(date +%s)
            days_diff=$(( (now_seconds - latest_seconds) / 86400 ))
            echo " $tag: $latest ($days_diff days ago)"
        fi
    done

    echo -e "\nBackup log file: $log_path"
    echo "To view in a spreadsheet: cp $log_path ~/backups.csv"
}

# Function to export log to CSV
export_csv() {
    read_settings
    log_path="$HOME/.backuplog"
    export_path="${1:-$HOME/backup_export.csv}"

    if [[ ! -f "$log_path" ]]; then
        echo "No backup log found at $log_path"
        return 1
    fi

    # Copy the log file to export location
    cp "$log_path" "$export_path"
    echo "Backup log exported to: $export_path"
    echo "You can now open this file in your spreadsheet application."
}

# Main function
main() {
    if [[ "$1" == "-b" || "$1" == "--backup" ]] && [[ ! -z "$2" ]]; then
        backup "$2"
    elif [[ "$1" == "-r" || "$1" == "--remove" ]] && [[ ! -z "$2" ]]; then
        remove_backup "$2"
    elif [[ "$1" == "-l" || "$1" == "--list" ]]; then
        list_backups
    elif [[ "$1" == "-s" || "$1" == "--stats" ]]; then
        show_stats
    elif [[ "$1" == "-e" || "$1" == "--export" ]]; then
        export_csv "$2"
    elif [[ "$1" == "-h" || "$1" == "--help" ]]; then
        echo "Loci Backup Management Tool"
        echo "Usage:"
        echo "  loci -b, --backup <tag>    Create a backup with the specified tag"
        echo "  loci -r, --remove <tag>    Remove a backup with the specified tag"
        echo "  loci -l, --list            List all backup statuses"
        echo "  loci -s, --stats           Show backup statistics"
        echo "  loci -e, --export [path]   Export backup log to CSV (default: ~/backup_export.csv)"
        echo "  loci -h, --help            Show this help message"
    else
        echo "Usage: loci -b <tag> | loci -r <tag> | loci -l | loci -s | loci -e [path] | loci -h"
    fi
}

# Execute the script
main "$@"
```
Ah, Improvements!
Can you please articulate why Python and Bash are so different in your eyes?
One needs to be ~~compiled~~ installed and the other is literally the de facto scripting language installed everywhere and intended for exactly this purpose.
My system came with Python3 installed. Debian 12.
Python does not need to be compiled, have you ever used it?
Saved for trying out later, ty!