File Organization Troubles

Backstory

So here’s the story. I have a ton of files that I’ve accumulated over the decades, and since they’ve been disorganized from the start (which was as far back as 1998) it has been very difficult to keep it all together in a way that makes sense. Compounding my problem is the fact that I have so many interests, I kinda have a library of different stuff, if you put a tornado in that library. So last year, I bought 2 big hard drives with the intent of FINALLY getting it organized. I took everything from my other drives and copied it to both the new ones. Then I put one in my server and the other in my workstation. Then I started shuffling things around on the workstation drive, but didn’t have time to finish it.

So here I am, a year later, and I have a bunch of new files on each drive that aren’t on the other, and I can’t remember which is which. Normally, you’d just use rsync in both directions, but since most of the same files are on both drives, just in different paths, that’s not going to work.

I needed a script. Something that can read the contents of both drives, compare JUST the files, regardless of the path, and spit out the files that are on drive 1 but not drive 2, and vice versa.

The Solution

#!/bin/bash

DRIVE1=/path/to/drive1
DRIVE2=/path/to/drive2

# List every file on each drive, with its full path
find "$DRIVE1" -type f > drive1filesfullpath.txt
find "$DRIVE2" -type f > drive2filesfullpath.txt

# Strip the directories, leaving a sorted, de-duplicated list of file names
sed 's!.*/!!' drive1filesfullpath.txt | sort -u > drive1filenames.txt
sed 's!.*/!!' drive2filesfullpath.txt | sort -u > drive2filenames.txt

# comm -23 prints the lines found only in the first (already sorted) file
comm -23 drive1filenames.txt drive2filenames.txt > drive1missingfiles.txt
comm -23 drive2filenames.txt drive1filenames.txt > drive2missingfiles.txt

echo "Files on Drive 1 that aren't on Drive 2: "
echo "============================================"
# Load the missing names, then print each full path whose last
# component is exactly one of them (no regex or substring matches)
awk -F/ 'NR == FNR { missing[$0]; next } $NF in missing' \
    drive1missingfiles.txt drive1filesfullpath.txt

echo

echo "Files on Drive 2 that aren't on Drive 1: "
echo "============================================"
awk -F/ 'NR == FNR { missing[$0]; next } $NF in missing' \
    drive2missingfiles.txt drive2filesfullpath.txt

There really isn’t that much going on here, and it could use some improvements, but it seems to work in this case. The next step would be to either take command line arguments for the input drives, or to write a simple GUI frontend for it.

Of course, this doesn’t move any files, it only reads out which ones are not on the other drive, but it’s hugely helpful for my case, where MOST of the files are present on both drives, but not all.
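
As a rough sketch of the "take command line arguments" step mentioned above, here is one way it could look. The function name only_on_first and the temp file are made up for illustration, and the sketch assumes file names contain no newlines. Like the script above, it compares names only, not contents.

```shell
#!/bin/bash
# Sketch only: compare two directory trees by file name, ignoring paths.
# only_on_first is a hypothetical helper name, not part of the original script.

# Print the full path of every file under $1 whose name appears nowhere under $2.
# The body runs in a subshell, so its variables stay private to the function.
only_on_first() (
  names=$(mktemp) || exit 1
  # Every file name (directories stripped) found under the second tree
  find "$2" -type f | sed 's!.*/!!' > "$names"
  # pass=1: remember the names from the second tree
  # pass=2: print paths from the first tree whose last component was not seen
  find "$1" -type f | awk -F/ '
    pass == 1 { seen[$0] = 1; next }
    !($NF in seen)
  ' pass=1 "$names" pass=2 -
  rm -f "$names"
)

if [ "$#" -eq 2 ]; then
  echo "Files on $1 that aren't on $2:"
  echo "============================================"
  only_on_first "$1" "$2"
  echo
  echo "Files on $2 that aren't on $1:"
  echo "============================================"
  only_on_first "$2" "$1"
elif [ "$#" -ne 0 ]; then
  echo "usage: $0 DRIVE1 DRIVE2" >&2
  exit 1
fi
```

Saved under a name like comparedrives.sh (a placeholder), it would be run as ./comparedrives.sh /path/to/drive1 /path/to/drive2. Doing the filtering in a single awk pass also avoids re-reading the full path list once per missing file, which matters on drives with a lot of files.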

Samba Domain Controller Woes

Long ago I found myself at work, needing a place to put my work files. At the time, everything was saved to the desktop of a sketchy gateway computer. I set out to have a single network storage area for both of our computers, and that took the form of a 500GB hard drive in a USB enclosure connected to a hacked PogoPlug. Over time, the growth of my department started to require separate user accounts, and that eventually meant creating a Windows domain. That was big and scary at the time, so I left it as a side project. Much later on, we had some people come in and set up a “proper” server, complete with MS Windows Server and a domain controller. When the IT company that built it dissolved, it eventually came down to me to figure out what other people had set up. Turns out, they had built in absolutely NO redundancy. There was a single VM running as a domain controller, a single fileserver, and a single DNS and DHCP server. I also didn’t like the way it was set up, as it used a .local domain suffix. Our company owns a proper domain name for our website, so it makes sense to use that for our internal domain, since that’s generally regarded as better practice than using a bogus suffix. So I set out to finally understand how this works.

First off, I wanted to get away from Microsoft, since it’s expensive and very restrictive. If I build a Linux server, I can make 1000 copies of it and nobody cares. Enter Turnkey Linux. They have a ton of prebuilt appliances just for servers. They also have a domain controller appliance that saves me the 100 lines of bash to set up all the prerequisites for a Samba DC.

So I quickly spun up one of the TKL VMs, set it up as my primary domain controller, added all the users, and done! I added a fileserver as well, copied our stuff to it, and it’s been running happily ever since. Except for one thing. It’s running on a VMware host, and the vCenter server appliance that’s running on it was set up by the aforementioned dissolved IT company, so I don’t have the login to get in and fully control this host. After much searching, it looks like the only way around this is to wipe the host and start over. Great. And all our services are running on that host with no redundancy. Thanks guys. Further, the rest of it is all Windows Server based, and since Windows VM licensing is about as clear as an alien language written in the quantum vacuum, I don’t really know if I can just copy those VMs to my other host (the one I set up) and leave them powered off until I’m ready to make the transfer. WS2K12 is too expensive to buy extra licenses just to have them sit powered off until I can make the switch, and afterwards they would be extras I don’t need.

So anyway, I’ve finally come to a point where I feel it’s very wise to start building in redundancy, so I can work toward rebuilding the host I don’t like.

I call it Lenny.

After a ton of researching, I found nothing that actually helps with the failover part of this equation, since almost all the pages I find on “failover samba AD DC” setups are of the form:

“install all this stuff before you install samba. Setup all the kerberos stuff. Add second server. Done.”

What I needed was between the “Add second server” and “Done” parts.

Also, as much as I like Turnkey Linux, I have a very hard time finding specific info for solving problems with their appliances, so I have to look for general Linux advice.

The first step I took was to take a copy of my existing DC, made once it was set up and working, and import it into VirtualBox, keeping the network set to internal so it doesn’t interfere with the rest of the work network. The production DC can’t be taken down or reprovisioned, or I’d have to rejoin ALL the machines in our network. I really don’t want to do that. Then I imported a second domain controller appliance. It’s worth noting that the two have to be the same version, specifically the same Samba version, or they won’t sync properly.

When DC2 runs for the first time, it wants to set itself up as the primary controller, which I don’t want, so I just accepted the defaults to get through the setup process. After that, delete everything in /var/lib/samba/, as well as the smb.conf file in /etc/samba/

# rm -rf /var/lib/samba/*

# rm /etc/samba/smb.conf

Change the hostname.

# nano /etc/hostname

This defaults to dc1, change it to dc2.

Then sync their clocks.

# service ntp stop

# ntpdate -s dc1

# service ntp start

Reboot.

# reboot

Then join the domain as an additional domain controller (that’s what the DC argument does). Note the uppercase domain name in the username.

# samba-tool domain join example.com DC -U"EXAMPLE\administrator"

DC2 should now show up in the Domain Controllers list in the Windows Active Directory Users and Computers tool.

At this point you’d think we are done. There it is in the list, after all. But no. It’s not that simple. And I could NOT find anywhere that said how to fix the next parts. The last two problems at this point are server replication and failover. I found several pages showing how to check if replication is working, with MANY steps and workarounds to try to diagnose it, but it turns out it’s pretty simple. In my case, it boiled down to the DNS server on DC1. When DC1 was provisioned, it added itself all through the DNS records in many different places that I really don’t fully understand, but when you add DC2 to the directory, those records are not automatically created for it. After I added these records for DC2, replication was working, and failover appeared to work as well. If I disconnect DC1 from the internal network, my Windows machine can’t see it and hangs for a minute until DC1 times out, then it looks to DC2 for whatever it needs.
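
The exact records aren’t listed above, so treat the following as a hedged sketch rather than the actual fix. It assumes the missing entries are the two a newly joined Samba DC typically needs: a host (A) record in the domain zone, and an objectGUID CNAME record in the _msdcs zone. The IP address, the GUID, and the run helper are all placeholders. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/bash
# Sketch only. Every value below is a placeholder, not taken from the setup above.
DOMAIN=example.com
DC1=dc1
DC2=dc2
DC2_IP=192.0.2.12                                   # hypothetical address for DC2
DC2_GUID=${DC2_GUID:-REPLACE-WITH-DC2-OBJECTGUID}   # see the ldbsearch note below

# Print each command instead of running it, unless RUN=1 is set.
# run is a hypothetical helper name.
run() {
  if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# To look up DC2's objectGUID, something like this should work on a DC
# (the sam.ldb path is an assumption and varies by distro):
#   ldbsearch -H /var/lib/samba/private/sam.ldb '(invocationId=*)' --cross-ncs objectguid

# 1. Host (A) record for DC2 in the domain zone, stored on DC1's DNS server
run samba-tool dns add "$DC1" "$DOMAIN" "$DC2" A "$DC2_IP" -U administrator

# 2. objectGUID CNAME in the _msdcs zone, which DCs use to find replication partners
run samba-tool dns add "$DC1" "_msdcs.$DOMAIN" "$DC2_GUID" CNAME "$DC2.$DOMAIN" -U administrator

# 3. Show inbound and outbound replication status for this DC
run samba-tool drs showrepl
```

Once the printed commands look right, running the script with RUN=1 set would execute them for real. If samba-tool drs showrepl reports successful replication in both directions, the DNS side is probably in good shape.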

It’s worth noting at this point that if the DHCP server is running on DC1, as it is in my case, then once DC1 goes down, so do DHCP and DNS with it. To keep DHCP running, you need a second DHCP failover server, which I won’t cover here, or you can just move the service to another Linux VM, since DHCP is dead-simple by comparison. Also, the DHCP server must have DC2 in its DNS servers list as well, so clients can still resolve names when DC1 is down.

Raspberry Pi 4 Blender Benchmarks

So I got my hands on a Raspberry Pi 4 and wanted to see how it compares to the Pi 3 in terms of CPU power. Turns out there really is quite a practical difference!

Very Simple Scene

In this very simple example scene, I have the standard “Default Cube” with a basic Principled shader and no effects, and a 1K HDRI map from www.hdrihaven.com as the background. I left it at 128 samples and 50% size, which is 960×540.

The Pi 3 renders this example scene in 2 minutes and 8 seconds.

The Pi 4 does this in just over 45 seconds!

As a point of comparison, my Ryzen 7 2700X takes 20.2 seconds to render this, single threaded. Not really a fair comparison, but it does show that the Pi 4 can be almost half as fast as a single desktop thread in some workloads. It does, however, get pretty hot doing this if you only have a heatsink and no fan attached.
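
To put rough numbers on that, here is a quick back-of-the-envelope calculation using the timings above. It treats “just over 45 seconds” as a flat 45, so the Pi 4 figures come out slightly flattering. The ratio helper is just a made-up name for the sketch.

```shell
#!/bin/bash
# Render times in seconds, taken from the timings above
pi3=128      # 2 minutes 8 seconds
pi4=45       # "just over 45 seconds", rounded down
ryzen=20.2   # Ryzen, single threaded

# ratio A B prints A/B to two decimal places (hypothetical helper name)
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

echo "Pi 4 speedup over Pi 3:         $(ratio "$pi3" "$pi4")x"   # 2.84x
echo "Pi 4 speed vs one Ryzen thread: $(ratio "$ryzen" "$pi4")"  # 0.45
```

So the Pi 4 is a bit under three times as fast as the Pi 3 on this scene, and runs at roughly 45% of the speed of one Ryzen thread.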

Archiviz

Here is my latest work in Blender. It’s taken longer than it probably should have, but I took a few diversions to study GI probes in Eevee, and light baking. I ended up just rendering with Cycles, where I found much better results using the branched path tracer.

Condo kitchen cycles

I found a few resources on lighting, which I think helped me out a lot. It turns out Cycles lights are physically based. I never knew what the values meant before. Sun lamps and environment maps are measured in watts per square meter, and according to one site, the sun should be set to around 441, with the environment map at around 27. That corresponds to real light measurements. Initially I thought it was too high, but after tweaking the exposure, I think it turned out really well.

first render

This was my first render, and the lighting doesn’t feel right to me. This was before I set the environment lighting to correct values, and the interior lights are too white. To fix them, I used the blackbody node for color, set to around 4100K. They also use some IES light textures to make them light more like real light bulbs.

This is an old render, from December 2018. It was my second attempt at an archiviz render.

Here is version 3, made shortly after version 2. I think there is a lot of progress from these to the latest render, even though the newest one doesn’t yet have any of the extra items like stools, coffee maker etc.

The birth of chaos

I have finally decided to put some of these rattling things in one place. I’ve tried for years to organize the things I find for my projects and ideas, but it has never been built permanently. I have set up WordPress sites in my home lab several times, but I always change stuff around and lose whatever it was I built. Well, today I decided to actually break the ice and start this up as a place to organize my projects, or whatever it turns into over time.