File Organization Troubles

Backstory

So here’s the story. I have a ton of files that I’ve accumulated over the decades, and since they’ve been disorganized from the start (as far back as 1998), it has been very difficult to keep them together in a way that makes sense. Compounding the problem, I have so many interests that my collection is kind of a library of different stuff, if you put a tornado in that library. So last year, I bought two big hard drives with the intent of FINALLY getting it all organized. I copied everything from my other drives to both of the new ones, then put one in my server and the other in my workstation. I started shuffling things around on the workstation drive, but didn’t have time to finish.

So here I am, a year later, and each drive has a bunch of new files that aren’t on the other, and I can’t remember which is which. Normally, you’d just use rsync in both directions, but since most of the same files exist on both drives, just under different paths, that’s not going to work.

I needed a script. Something that can read the contents of both drives, compare JUST the files, regardless of the path, and spit out the files that are on drive 1 but not drive 2, and vice versa.

The Solution

#!/bin/bash
 
DRIVE1=/path/to/drive1
DRIVE2=/path/to/drive2

# List every regular file on each drive, with its full path.
find "$DRIVE1" -type f > drive1filesfullpath.txt
find "$DRIVE2" -type f > drive2filesfullpath.txt

# Strip the directory part to get bare file names, sorted and de-duplicated for comm.
sed 's!.*/!!' drive1filesfullpath.txt | sort -u > drive1filenames.txt
sed 's!.*/!!' drive2filesfullpath.txt | sort -u > drive2filenames.txt
 

# Names that appear on one drive only (the name lists are already sorted).
comm -23 drive1filenames.txt drive2filenames.txt > drive1missingfiles.txt
comm -23 drive2filenames.txt drive1filenames.txt > drive2missingfiles.txt
 
echo "Files on Drive 1 that aren't on Drive 2: "
echo "============================================"
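# Print every full path whose last component is exactly the missing name.
# (A plain grep "$line" treats the name as a regex and also matches partial names.)
while IFS= read -r line; do
    line="$line" awk -F/ '$NF == ENVIRON["line"]' drive1filesfullpath.txt
done < drive1missingfiles.txt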

echo ;

echo "Files on Drive 2 that aren't on Drive 1: "
echo "============================================"
# Same exact-name lookup for the second drive.
while IFS= read -r line; do
    line="$line" awk -F/ '$NF == ENVIRON["line"]' drive2filesfullpath.txt
done < drive2missingfiles.txt

There really isn’t that much going on here, and it could use some improvements, but it seems to work in this case. The next step would be to either take command line arguments for the input drives, or to write a simple GUI frontend for it.
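As a rough sketch of the command-line-argument idea, something like the following could work. The function name compare_drives and the temporary file names are placeholders I made up for illustration, and it still compares by file name only, so two different files that share a name are treated as the same file.

```shell
#!/bin/bash
# Sketch: the same name-only comparison, with both directories passed as arguments.
# compare_drives is a hypothetical helper name, not part of the original script.

compare_drives() {
    if [ "$#" -ne 2 ] || [ ! -d "$1" ] || [ ! -d "$2" ]; then
        echo "usage: compare_drives DIR1 DIR2" >&2
        return 1
    fi

    local dir1=$1 dir2=$2 tmp
    tmp=$(mktemp -d) || return 1

    # Full path lists for each directory.
    find "$dir1" -type f > "$tmp/paths1"
    find "$dir2" -type f > "$tmp/paths2"

    # Bare file names, sorted and de-duplicated so comm can compare them.
    sed 's!.*/!!' "$tmp/paths1" | sort -u > "$tmp/names1"
    sed 's!.*/!!' "$tmp/paths2" | sort -u > "$tmp/names2"

    echo "Files in $dir1 that aren't in $dir2:"
    # comm -23 keeps names found only in the first list.
    comm -23 "$tmp/names1" "$tmp/names2" | while IFS= read -r name; do
        # Print each full path whose last component equals the name exactly.
        name="$name" awk -F/ '$NF == ENVIRON["name"]' "$tmp/paths1"
    done

    echo
    echo "Files in $dir2 that aren't in $dir1:"
    # comm -13 keeps names found only in the second list.
    comm -13 "$tmp/names1" "$tmp/names2" | while IFS= read -r name; do
        name="$name" awk -F/ '$NF == ENVIRON["name"]' "$tmp/paths2"
    done

    rm -rf "$tmp"
}

# Run the comparison only when two arguments are given on the command line.
if [ "$#" -eq 2 ]; then
    compare_drives "$1" "$2"
fi
```

Saved under a name like comparedrives.sh (also made up), it would be run as `bash comparedrives.sh /path/to/drive1 /path/to/drive2`. Keeping the intermediate lists in a temporary directory means nothing gets left behind in the current directory.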

Of course, this doesn’t move any files; it only lists the ones that are not on the other drive. But it’s hugely helpful for my case, where MOST of the files are present on both drives, but not all.
