• Setting up a Storj node on Alpine Linux 3.23.4 using iSCSI and a Synology NAS

    I recently replaced my old Synology NAS with a newer model. The old NAS is still in great shape, has good disks, and more than a terabyte of storage, so I figured I’d try setting up a Storj storage node with it.

    However, there’s a limitation: the older NAS can’t run Docker containers, and so cannot host the storage node software directly. Instead, I need to run it on another local system that uses the NAS as storage. Storj is pretty clear that iSCSI is the only proper option for accessing storage hardware over the network:

    A network-attached storage location may work, but this is neither supported nor recommended! Please consider running the node locally on your file server/NAS instead. If that is not possible, then the only working network protocol for network storage is iSCSI.

    Fair enough. Let’s get things started. This post documents my process for future reference, and I’m sharing it publicly in the hopes that it will be useful for others.

    Here’s my setup:

    • A Proxmox cluster running a variety of VMs and LXC containers for my homelab.
    • The old Synology NAS.
    • A switched gigabit ethernet network with several VLANs (one for internal services, one for internet-exposed services, etc.).

    For security, I want to run an internet-facing service like a Storj node on a VM that runs its own kernel (rather than an LXC, which shares the host kernel), in the internet-service VLAN (40, in my case), and use Alpine Linux because it’s extremely lightweight.

    Basic Setup & Networking

    1. After creating a VM, assigning its interface to vlan40, and installing and updating Alpine, I have a basic VM running on 10.200.40.151.
    2. My NAS is running on a separate VLAN as 192.168.3.65.
    3. I have a dynamic DNS address that automatically updates to my public IPv4 address.
    4. Following the instructions from Storj, I set up port forwarding on my firewall (running OPNsense, for what it’s worth) so that TCP and UDP port 28967 are forwarded to that VM.
    5. I also increased the UDP buffer size on the VM using the steps here, which work as expected on Alpine.
    6. Since the NAS and VM are on separate VLANs that are isolated by a firewall, I configured a firewall rule to allow the VM to access TCP port 3260 on the NAS.

    Install & Configure iSCSI

    iSCSI lets you connect to a remote network device and attach one of its logical storage devices as a block device, that is, as if it were a local hard disk. This differs from, for example, NFS, which shares files rather than blocks. Block-level access is apparently what the Storj storage node software needs.

    The terminology can be a bit overwhelming, but in brief:

    • An iSCSI “initiator” is the client. In my case, the Alpine Linux system.
    • An iSCSI “target” is the “server” running on the NAS.
    • An iSCSI “LUN” is the virtual disk I want to share.
    • An iSCSI “IQN” is a unique name used to identify initiators and targets on the network.

    Set Up the Synology NAS as an iSCSI Target

    1. Log into the NAS, open the menu, and select iSCSI Manager.
    2. Select “Target”.
    3. Select “Create”
    4. In the “Create a new iSCSI Target” menu, enter a descriptive name in the Name field. I chose “storj-target”. Leave the IQN as-is, but make a note of it for later use (mine is iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219 but yours will be different). I also leave CHAP authentication disabled since the NAS will only have this single iSCSI target and there’s a firewall rule that allows it to only be accessed by the VM. Your mileage may vary and you may need to enable it. Click Next.
    5. Select “Create a new iSCSI LUN” and click Next.
    6. Pick a descriptive name for the LUN. Mine is “storj-lun”. I wanted to share 1.5 TB, so I specified the size of the LUN as (1.5 TB * 1024 GB/TB) + 1 GB = 1537 GB to leave a little buffer. I selected Thin Provisioning, as I read that its performance is essentially identical to Thick Provisioning without having to preallocate all the storage required by the LUN. Click “Next” and “Apply”.
    7. If all goes well, the target and LUN are created and the target has the LUN mapped. By default, the target allows any initiator to connect to it. I use firewall rules to restrict access.
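
    The LUN sizing in step 6 is simple arithmetic. As a minimal sketch, it can be reproduced in the shell; lun_size_gb is a made-up helper name for illustration, not part of any Synology tooling. Shell arithmetic is integer-only, so 1.5 TB is passed as 15 tenths of a TB.

```sh
#!/bin/sh
# Illustrative helper (hypothetical name): compute a LUN size in GB from a
# capacity given in tenths of a TB plus a small buffer in GB.
lun_size_gb() {
    tenths_of_tb="$1"   # e.g. 15 for 1.5 TB
    buffer_gb="$2"      # e.g. 1 for a 1 GB buffer
    echo $(( tenths_of_tb * 1024 / 10 + buffer_gb ))
}

lun_size_gb 15 1   # prints 1537
```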

    Set Up Alpine Linux as an iSCSI Initiator

    Alpine Linux is extremely lightweight and comes with minimal software. We need to install and configure the iSCSI package. All commands in this section are run as root on the VM.

    1. Read the iSCSI instructions on the Alpine Linux wiki, then:
      • Run apk add open-iscsi
      • Start the iSCSI service by running rc-service iscsid start
      • Set the iSCSI service to start at boot: rc-update add iscsid
      • Query the NAS to verify connectivity and list the LUNs by running iscsiadm --mode discovery --type sendtargets --portal IP_OF_TARGET
        • Specific example: iscsiadm --mode discovery --type sendtargets --portal 192.168.3.65
        • It should return something like this, 192.168.3.65:3260,1 iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219, which indicates it was able to connect to the NAS and find the target.
      • Next, let’s log in to the target so its LUN shows up as a local disk: iscsiadm --mode node --targetname NAME_OF_TARGET --portal IP_OF_TARGET --login
        • Specific example: iscsiadm --mode node --targetname iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219 --portal 192.168.3.65 --login
        • It should return something like this:
          Logging in to [iface: default, target: iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219, portal: 192.168.3.65,3260]
          Login to [iface: default, target: iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219, portal: 192.168.3.65,3260] successful.
    2. Next, we need to partition the new disk.
      • Let’s confirm the device name of the attached iSCSI disk. Running iscsiadm -m session -P 3 will show a lot of detail about the current iSCSI session, including device names. On the last line you should see something like Attached scsi disk sdb State: running, which tells us that the iSCSI disk is attached as /dev/sdb. It may be different on your system.
      • Alpine doesn’t come with lsblk, which is a handy utility, so let’s install it: apk add lsblk
      • Now, run lsblk to show all block devices connected to the computer. I see /dev/sda as my VM’s main disk (along with its several partitions) and /dev/sdb as an unpartitioned disk.
      • Again, on my system the iSCSI disk is attached as /dev/sdb; it may be different on your system. Please verify that you’re referring to the correct device in the following steps, as partitioning it will destroy any existing data on the device.
      • Let’s partition the disk: fdisk /dev/sdb
        • Type “n” for a new partition, then follow the prompts and select “p” for primary partition, then “1” for the partition number, and accept the default size.
        • Type “p” to print the partition table. Confirm that everything is in order.
        • Note: If you wish to quit without saving your changes, type “q” and press Enter.
        • (WARNING: THIS IS DESTRUCTIVE) Save and write the changes to disk by typing “w” and press Enter.
    3. Now that we have the iSCSI disk partitioned, let’s format it. ext4 is fine.
      • Run mkfs.ext4 /dev/sdb1 and follow the prompts.
    4. In some cases the device name or partition name (e.g. /dev/sdb and /dev/sdb1, respectively) might change, so it’s useful to unambiguously identify the partition in configuration files by its UUID. The UUID can be found by running blkid /dev/sdb1. In my case, it’s f88fc19f-f4ab-4a33-9383-ce9a72b3d6ca. Yours will be different. Make a note of it for now.
    5. Disconnect and reconnect the iSCSI connection by running iscsiadm --mode node --targetname NAME_OF_TARGET --portal IP_OF_TARGET -u and iscsiadm --mode node --targetname NAME_OF_TARGET --portal IP_OF_TARGET --login so the system will recognize it as a partitioned disk.
    6. Make a mount point where you’ll mount the partition. In my case, I want it to be in /mnt/storj-iscsi, so I run mkdir -p /mnt/storj-iscsi (the “-p” ensures it creates any needed intermediate directories like /mnt if they don’t already exist).
    7. Let’s configure the /etc/fstab file so the system can mount the partition. Open it in a text editor and add the following line: /dev/disk/by-uuid/YOUR_DISK_UUID /path/to/mount/point ext4 defaults,_netdev 0 0
      • Specific example: /dev/disk/by-uuid/f88fc19f-f4ab-4a33-9383-ce9a72b3d6ca /mnt/storj-iscsi ext4 defaults,_netdev 0 0
      • Note: The _netdev option tells the system that it’s a network drive and should wait until after the networking stack is up before trying to mount it. However, it can take a few seconds for the iSCSI system to connect to the target after networking is up, and if it’s not connected by the time the system tries to mount the partition then it will remain unmounted. We’ll set up a more robust mounting method later, so stay tuned.
    8. Test that the partition mounts by running mount -a -v. Assuming it works, let’s unmount it to finish up a few more things by running umount /mnt/storj-iscsi/.
    9. Configure the iSCSI client to make the connection persistent (that is, so it’ll start at boot) by running iscsiadm -m node -T NAME_OF_TARGET -p IP_OF_TARGET --op update -n node.conn[0].startup -v automatic
      • Specific example: iscsiadm -m node -T iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219 -p 192.168.3.65 --op update -n node.conn[0].startup -v automatic
      • Note: “node.conn[0].startup” is the term used in the Alpine Linux wiki. The “conn[0]” refers to the first TCP connection in that session. The wiki calls it out to be explicit since it’s possible that iSCSI sessions have multiple connections, but in practice this is rarely used so you can also simply use “node.startup” instead if that’s simpler to understand.
      • Note: To remove the connection’s persistence, run the previous command with -n node.conn[0].startup -v manual instead.
      • Note: To remove the whole connection entirely, run iscsiadm -m node -T iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219 --op delete
      • Note: On Alpine Linux, iSCSI node information and configurations are available in /var/lib/iscsi/nodes and its subdirectories.
    10. Configure the iSCSI system to automatically start connections marked as “automatic” by opening the /etc/iscsi/iscsid.conf file and changing node.startup = manual to node.startup = automatic.
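
    To cut down on copy-paste mistakes in the steps above, here is a minimal dry-run sketch that only prints the commands and the fstab line with your values substituted; it doesn’t run anything. The function names print_iscsi_steps and print_fstab_line are made up for illustration, and the values are the examples from this post, so replace them with your own.

```sh
#!/bin/sh
# Dry run: print (do not execute) the initiator commands and fstab line
# from the steps above. Values are this post's examples; replace them.
TARGET="iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219"
PORTAL="192.168.3.65"
DISK_UUID="f88fc19f-f4ab-4a33-9383-ce9a72b3d6ca"
MOUNT_POINT="/mnt/storj-iscsi"

print_iscsi_steps() {
    echo "iscsiadm --mode discovery --type sendtargets --portal ${PORTAL}"
    echo "iscsiadm --mode node --targetname ${TARGET} --portal ${PORTAL} --login"
    # Quoting the setting name keeps the shell from treating [0] as a glob.
    echo "iscsiadm -m node -T ${TARGET} -p ${PORTAL} --op update -n 'node.conn[0].startup' -v automatic"
}

print_fstab_line() {
    echo "/dev/disk/by-uuid/${DISK_UUID} ${MOUNT_POINT} ext4 defaults,_netdev 0 0"
}

print_iscsi_steps
print_fstab_line
```

    Keeping the target name, portal, and UUID in one place means a typo only has to be fixed once.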

    I noticed that localmount (which mounts fstab entries) was running before the iSCSI target was connected, so the system would read the fstab and try to mount /dev/sdb1 to /mnt/storj-iscsi/, but the mount would fail since /dev/sdb1 wasn’t available yet. I didn’t want to change the default init scripts for the system, so I just added a new init script called iscsi-wait. It verifies that the iSCSI target is connected and that the /dev/sdb1 partition (identified by UUID) is available, and then tries to mount the entries in fstab again.

    Create the file /etc/init.d/iscsi-wait with the following contents, and make it executable with chmod +x /etc/init.d/iscsi-wait:

    #!/sbin/openrc-run
    
    description="Check iSCSI session and mount fstab entries"
    
    # Change the ISCSI_TARGET and DISK_UUID to values specific for your system.
    ISCSI_TARGET="iqn.2000-01.com.synology:DiskStation.storj-target.9bdcdcf219"
    DISK_UUID="f88fc19f-f4ab-4a33-9383-ce9a72b3d6ca"
    ISCSI_WAIT_TIMEOUT="10"
    UUID_WAIT_TIMEOUT="10"
    
    # Load this script after the network is up and iscsid has started. Be sure to run this before Docker.
    depend() {
        need net iscsid
        before docker
    }
    
    start() {
        ebegin "Waiting for iSCSI session for ${ISCSI_TARGET}"
        local retries="${ISCSI_WAIT_TIMEOUT}"
        while [ "${retries}" -gt 0 ]; do
            iscsiadm -m session 2>/dev/null | grep -q "${ISCSI_TARGET}" && break
            sleep 1
            retries=$((retries - 1))
        done
        if ! iscsiadm -m session 2>/dev/null | grep -q "${ISCSI_TARGET}"; then
            eend 1 "No active iSCSI session found for ${ISCSI_TARGET}"
            return 1
        fi
        eend 0
    
        ebegin "Waiting for /dev/disk/by-uuid/${DISK_UUID}"
        retries="${UUID_WAIT_TIMEOUT}"
        while [ "${retries}" -gt 0 ]; do
            [ -e "/dev/disk/by-uuid/${DISK_UUID}" ] && break
            sleep 1
            retries=$((retries - 1))
        done
        if [ ! -e "/dev/disk/by-uuid/${DISK_UUID}" ]; then
            eend 1 "Timed out waiting for /dev/disk/by-uuid/${DISK_UUID}"
            return 1
        fi
        eend 0
    
        ebegin "Mounting iSCSI fstab entries"
        local mount_output mount_status
        mount_output=$(mount -a 2>&1)
        mount_status=$?
        [ -n "${mount_output}" ] && einfo "${mount_output}"
        eend ${mount_status} "Failed to mount fstab entries"
    }

    Set the script to start at boot:

    rc-update add iscsi-wait

    It’d probably be a good idea to reboot your system and make sure that the iSCSI disk connects as expected and the partition is mounted automatically. Repeating this several times for good measure to make sure it works probably wouldn’t hurt.
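
    One quick way to confirm the result after each reboot is to look for the mount point in /proc/mounts. Here’s a minimal sketch; is_mounted is a made-up helper name, and its optional second argument (an alternate mounts file) is only there to make the function easy to test.

```sh
#!/bin/sh
# Illustrative check (hypothetical helper): succeed if the given mount point
# appears in a mounts table. The table defaults to /proc/mounts, where the
# second whitespace-separated field of each line is the mount point.
is_mounted() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="${mount_point}" '$2 == mp { found = 1 } END { exit !found }' "${mounts_file}"
}

# Note: no trailing slash, since /proc/mounts lists the path without one.
if is_mounted /mnt/storj-iscsi; then
    echo "iSCSI partition is mounted"
else
    echo "iSCSI partition is NOT mounted"
fi
```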

    Here’s what my startup process looks like with that script on a test system (that’s talking to Target-2 rather than storj-target):

    As you can see, localmount tries mounting the filesystem early on (lines 4-5) but fails. However, later, iscsid loads and the script above immediately loads, checks the iSCSI target is connected, then mounts the filesystem. Docker starts immediately after, and so the Storj storage node container will see the filesystem it needs as expected.

    Install and Configure the Storj Storage Node

    1. The Storj storage node requires docker, so install it with apk add docker, start the service with service docker start, and have it load when the system starts up with rc-update add docker default.
    2. Create a Storj identity file following the instructions here. This can be time-consuming and may be better done on a more powerful computer than the Alpine VM and then transferred over to the NAS. Be sure to back up the file.
    3. With the iSCSI disk connected and the partition mounted, create a subdirectory on the iSCSI directory: mkdir -p /mnt/storj-iscsi/storagenode/ — this can be named whatever you want, but I call it “storagenode”. Copy your identity directory into the storagenode directory.
    4. Install the Storj storage node in a Docker container in accordance with their directions. In the docker commands, be sure to correctly configure your email, dynamic DNS address, port, payout wallet address to receive your earnings (I entered my Storj account’s deposit address, since that’s all I use STORJ tokens for), and the amount of storage you’re offering (I selected 1.5 TB; enter what you’re making available, and make sure it’s less than the iSCSI LUN size after accounting for the space lost to partitioning and formatting).
      • In both of the docker run commands during the installation process, I set <identity-dir> to /mnt/storj-iscsi/identity/storagenode/ and the <storage-dir> to /mnt/storj-iscsi/config. I also changed -p 127.0.0.1:14002:14002 to -p 14002:14002 so the dashboard interface would be reachable from other machines on my LAN (I set up a firewall rule so I could access the dashboard interface from my other VLAN by going to http://10.200.40.151:14002). With the 127.0.0.1 present, Docker publishes the port only on the VM’s loopback interface, so you can only reach the dashboard from the VM itself, which isn’t very useful.
        • Don’t expose the dashboard to the internet. Keep it local to your LAN.
        • You might also prefer to keep the identity file on the VM itself, which you can do. In my case, I keep it on the NAS so it’s easily accessible if I need to create a new VM or migrate it somewhere else.
    5. Install Watchtower to make sure the Storj storage node image is updated periodically.
    6. Check the dashboard to make sure everything’s running properly, you’re connected to the network, the vetting process has started, and you’re getting test data.

    Conclusion

    I hope you found this post to be useful. I welcome any feedback, suggestions, corrections, etc. Any bugs or errors are my own. Your mileage may vary. Please take care to understand the consequences of any commands rather than just blindly copy-pasting them.

  • I built a thing: an improved solar battery charger

    First off, my apologies for a long time away from the blog: between family, life, work, travel, and other commitments things have been busy and the blog had fallen by the wayside.

    Background

    That said, recently I’ve started getting involved with Meshtastic, a license-free, open-source, off-grid mesh network that can run on cheap, low-power hardware and allows people to exchange text messages with others, either through direct messages or through private or public chat “channels” (akin to a “chat room” of old). I’m interested in this both as a fun way to socialize with nerdy people in the area and as something that looks very handy for communication in emergency situations, like after an earthquake.

    I recently installed a solar-powered Meshtastic node on my roof’s antenna mount (currently also host to my house’s GPS antenna) using RAK Wireless WisBlock hardware which is known for being quite low-power. These modules can run from USB, a battery (e.g., a lithium-ion cell), and even have an onboard TP4054 lithium-ion battery charger module connected to a port labeled “solar”, and their product descriptions refer to the device being intended to charge a lithium-ion cell from a small, 5V nominal solar panel.

    RAK’s choice of the TP4054 as an onboard charger chip was a bit puzzling to me: it is a fine charger chip if used with a fixed supply voltage (like from USB), requires minimal external components (one resistor and two capacitors), and does a great job charging Li-ion cells, but it lacks any ability to reduce its current consumption on the fly to match a variable-output supply voltage like a solar panel that might be able to supply plenty of current during the peak sunny hours during the day, but only a limited amount during the mornings, evenings, or during cloudy days.

    Various charger chips exist to charge batteries from solar panels, including the CN3065 linear and CN3791 “MPPT” switch-mode charger. Both adjust their charging current automatically to not overload a solar panel, but the CN3065 has a more limited input range than the CN3791.

    Although CN3791-based charger boards are available from various vendors for reasonable prices, I don’t like the most commonly available model for a variety of reasons. In particular, it tries to keep the solar panel voltage above a value set by fixed resistors, but I wanted to be able to easily adjust the target voltage.

    My Chargers

    To rectify these shortcomings, I designed a charger board based on the CN3791 chip that has, among other improvements, an adjustable MPP voltage, reverse-polarity protection, an (optional) battery protection chip, and a separate load path so the charger can accurately measure the charging current and battery voltage.

    I have more information about the chargers here. Current pricing (no pun intended) for small quantities of the Mod 3 is $20 each. They are hand-made in the US by me from US-made PCBs and globally-sourced components. Larger orders and orders for different Mods (e.g. with different battery protection chips, or none at all) are available with a little more lead time. Contact me if you’re interested in buying some.

  • The DS3231 is dead, long live the DS3231#!

    Maxim Integrated has a handy website listing various products that have reached the end of life (EOL), and breaks out some products into categories such as “not recommended for new designs” (in that they’re slated to be discontinued in the near future) and “no longer available”.

    It turns out that several DS3231 variants are on the list at the time of this post:

    Status	Part Number
    NRND	DS3231MZ/V+
    NRND	DS3231MZ/V+T
    NLA	DS3231N/DIP
    NLA	DS3231S
    NLA	DS3231S#-W
    NLA	DS3231S#C14
    NLA	DS3231S/T&R
    NLA	DS3231S/T&R#C15
    NLA	DS3231SN
    NLA	DS3231SN-C16
    NLA	DS3231SN/T&R

    Uh-oh. Does this mean the DS3231 is no more, as Ed Mallon mentions in an addendum to his post?

    Fortunately, no. The DS3231 (both crystal-based and MEMS-based) is still widely available in several variants.

    I contacted Maxim and inquired what was going on, and the response was that the non-RoHS versions (e.g. with lead solder) are being discontinued, while the RoHS-compliant versions (designated by a “#” mark, like “DS3231SN#”) are still being actively produced and are recommended for use in new designs.

    The RoHS-compliant versions have the same timekeeping specifications as the non-RoHS versions.

    Hopefully this helps to clarify things.

  • 3 year update on a completely-offline DS3231N

    On July 31st 2017 at 2:31:00am CEST, I synced the time on a DS3231N to GPS via NTP (+/- a few microseconds). I had previously tuned the clock by setting the offset register to 0x09 and verifying the stability against my Trimble Thunderbolt using an oscilloscope.

    At the time I set it, my oscilloscope-and-Thunderbolt measurements indicated it had a short-term stability (over the course of a few minutes) of 1.38 ppm, which is within the +/- 2 ppm specs. I then removed the module (it had a coin cell battery backup) from the NTP server Raspberry Pi that had set it and transferred it to the Raspberry Pi I have set up as a strictly “offline” system to store some PGP keys away from the prying eyes of the internet.

    Since this system would never see the internet, having an accurate RTC meant that I wouldn’t need to set the system clock from my wristwatch when I turned it on — that’d be inconvenient.

    Slightly over three years later on August 12th 2020 I turned on the offline Pi and logged in via a serial link (no network connectivity at all) to the terminal. At 11:18:47 PM PDT as measured by my GPS-backed NTP server, I compared the time between the NTP server and the Pi. Adjusting for time zones, the offline Pi reported the time as 11:19:13 PM, or 26 seconds fast.

    95,838,467 seconds of actual time elapsed between the two measurements and the clock only gained 26 seconds. That’s a long-term stability of 0.27 ppm.
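
    The ppm figure is just the seconds gained divided by the seconds elapsed, times one million. A minimal shell sketch of that arithmetic follows; drift_ppm is a made-up helper name, and awk does the division because POSIX shell arithmetic is integer-only.

```sh
#!/bin/sh
# Drift in parts per million: seconds gained (or lost) divided by the
# elapsed seconds, multiplied by one million.
drift_ppm() {
    awk -v drift="$1" -v elapsed="$2" 'BEGIN { printf "%.2f\n", drift / elapsed * 1000000 }'
}

drift_ppm 26 95838467   # prints 0.27
```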

    I’m impressed, especially considering that the clock was running off a CR2032 battery rather than regulated power from the Pi (which had not been plugged in at all during that time). During those three years it also went through several flights and car journeys as part of an international move, and it was never kept in a specifically regulated thermal environment (just ordinary residential rooms).

  • A look inside the DS3231M real-time clock

    A differential interference contrast (DIC) microscope image of the DS3231M real-time clock.

    I’ve previously written on what lies within the package of the DS3231 real-time clock, a temperature-compensated 32.768 kHz crystal oscillator and RTC. I’ve also discussed the difference between the crystal-based DS3231 and the MEMS oscillator-based DS3231M. Now it’s time to look within the DS3231M.

    This particular chip was a free sample from Maxim Integrated (thanks, guys!) direct from their factory/distribution system and so is guaranteed to be authentic. Unlike my previous attempt (which involved physically grinding the package down and delicately picking out the silicon die and crystal), this chip I dissolved in hot nitric acid in a fume hood for about 30-45 minutes. I then washed it with deionized water followed by an isopropanol rinse. Unfortunately, this process left the die either with some sort of residue, or incompletely etched certain parts, so there’s what looks to be “water marks” around the chip (this is distinctly noticeable in the large block in the top-left).

    The image is a composite. To make it, I first focus-stacked (using PICOLAY) a dozen or so images that cover the same, small area of the chip with different depths of field in each photo. I then moved the field of view to a neighboring region of the chip and repeated the process. The focus-stacked images were then stitched using the Microsoft Image Composite Editor resulting in a single large, high-resolution image whereby everything is (I hope) in focus.

    Unsurprisingly, the die layout is completely different from that of the DS3231, presumably due to the differences in driving and collecting signals from a crystal oscillator and a MEMS oscillator. The crystal-based DS3231 switches in or out tiny on-die capacitors to slightly change the frequency of the oscillator to compensate for variations in temperature while the DS3231M uses “digital adjustment controller logic” to adjust the final 1 Hz output that one can access on the SQW pin and which also is fed into the internal clock circuitry.

    I’m uncertain if the MEMS oscillator is supposed to be on the die (if so, I didn’t see it — perhaps readers might have better luck) or in the package itself (though I didn’t see it when I picked the die out of the acid, but there was a lot of gunk from the dissolved package).

    What’s the deal with the enormous metal section in the lower-middle-left? I haven’t the foggiest.

    Any thoughts or ideas? If you have any insight I’d love to hear it.

  • DS3231 Drift Results (5 months)

    It’s been just under 5 months since I simultaneously synchronized ten DS3231/DS3231M RTCs as part of a long-term experiment to measure their drift. Of the ten, seven are crystal-based DS3231 chips, while three are DS3231M chips. Since they’re all on the same breadboard connected to the same power supply, all of them have been subject to the same physical conditions of temperature, movement, voltage, etc. insofar as I can control for them in my apartment.

    In the table below, “Number” is the number I arbitrarily assigned to uniquely identify each chip; “Type” is the chip type (all the DS3231 chips are marked as the wide-temperature-range SN type, while the DS3231Ms are listed as M; the official DS3231M chip (#2) I received directly from Maxim is marked with an asterisk); “Offset” is the aging offset in register 0x10 expressed in decimal form; “PPM” is the stability in parts per million; and “Drift” is the number of seconds the clock has drifted since the start. For both the PPM and Drift columns, a positive value indicates that the RTC has run faster than the NTP-synchronized system clock, while a negative value indicates the RTC has run slower than the system clock.

    All results were collected over a 26 minute period starting 12904549 seconds after the clocks were first synchronized. Each clock was measured three times and the resulting values averaged and rounded to two decimal places. Keep in mind that this dataset consists of just two data points (zero drift at the start, and the measured drift now) for each clock: unlike Dan, who continuously collected data and made many nice graphs, I set the clocks and essentially ignored them for five months.

    Anyway, I digress. Here are the results:

    Number	Type	Offset	PPM	Drift
    0	SN	-6	0.19	2.46
    1	SN	0	-0.69	-8.96
    2	M*	0	-1.62	-20.85
    3	M	0	-3.06	-39.54
    4	M	0	-2.76	-35.65
    5	SN	0	0.16	2.07
    6	SN	0	0.01	0.10
    7	SN	-15	0.33	4.32
    8	SN	0	0.08	0.98
    9	SN	0	-0.05	-0.60
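
    The PPM column follows directly from the Drift column and the 12904549 seconds that had elapsed: ppm = drift / elapsed * 1,000,000. As a minimal sketch, the column can be recomputed like this; recompute_ppm is a made-up helper name, and the input is the Number and Drift columns from the table above.

```sh
#!/bin/sh
# Recompute the PPM column from the Drift column.
ELAPSED=12904549   # seconds since the clocks were synchronized

recompute_ppm() {
    # Input lines: "<number> <drift in seconds>"; output: "<number> <ppm>".
    awk -v elapsed="${ELAPSED}" '{ printf "%s %.2f\n", $1, $2 / elapsed * 1000000 }'
}

recompute_ppm <<'EOF'
0 2.46
1 -8.96
2 -20.85
3 -39.54
4 -35.65
5 2.07
6 0.10
7 4.32
8 0.98
9 -0.60
EOF
```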
    

    Some commentary, bullet-pointed for your reading pleasure:

    • All clocks are within their advertised tolerances (2 ppm for the crystal-based clocks and 5 ppm for the MEMS-based clocks).
    • Five of the seven crystal-based DS3231 chips run fast, while two run slow. All three of the MEMS-based DS3231M chips run slow.
    • Clock #6 has essentially no drift whatsoever. There’s nothing particularly noteworthy about it: just luck of the draw.
    • The Raspberry Pi used to set the clocks in September was also used to measure the offset today. It has run continuously since the clocks were set, and has been synchronized continuously to another Raspberry Pi (+/- 0.1 ms), which was in turn synchronized to GPS using a Motorola Oncore UT+ receiver (+/- 150 ns). Time errors on the Pis are negligible.
    • I have two PCA9548A I2C switches that allow me to wire up all the clock chips and switch between them using software commands rather than needing to physically move wires around. This makes life easy.
    • At the start of the measurement, I had observed each of the clocks’ outputs using my oscilloscope and compared them to a GPS-synchronized PPS signal. Clocks #0 and #7 drifted faster than the others but were still within the advertised specs (unfortunately I didn’t write down how fast they were drifting and have since forgotten). I adjusted the aging register until the short-term drift was minimal; the results are acceptable, though I note that, apart from #1, they drifted the most of any of the crystal-based clocks.
    • All ten clocks are part of a cheap module available on eBay from various Chinese sellers. The module comes with either a DS3231 or DS3231M chip (the sellers don’t sort them so you can get either type at random) of various vintages. The oldest I’ve seen is from 2006. None of the chips seem to be new, with various smudges and wear visible on the face of the chip, so they’re likely pulled from old equipment and reused on these boards. Even so, they work well.
    • Each board also comes with a 24C32A I2C EEPROM, which is nice, but not strictly necessary. I use them for storing the aging offset in case I remove the backup battery and want to tune it again without using the oscilloscope.
    • The boards also come with a holder for a backup coin cell battery. Critically, the boards are also wired with a 1N4148 diode and a 200-ohm series resistor feeding the positive terminal of the battery, presumably for use with a (not included) LIR2032 rechargeable coin cell. If you use a non-rechargeable CR2032 coin cell you must remove either the diode or resistor or else the circuit will try to charge the coin cell battery, which can damage the battery. I’ve removed the battery holder entirely from one or two other test boards and the charging circuit works well to charge a backup supercapacitor, but the charging circuit must be removed or disabled if you use non-rechargeable coin cell batteries.
    • The data here is just a brief summary; I have more detailed data in a spreadsheet that’s available upon request.
  • High Precision DS3231 Reads

    My ensemble of DS3231 and DS3231M RTCs has been running since early September 2017 and I’m looking to start gathering data on their performance in the near future. Thus, I’ve taken one of the DS3231 RTCs from the ensemble (leaving ten) and done some experiments with it to figure out how best to read the time from the clock chips at the highest precision I can using a Raspberry Pi.

    This is complicated by three issues:

    1. Although the DS3231 RTC has an internal 15-bit counter that counts individual crystal ticks, that counter is not accessible by the user. Instead, we can only access the timekeeping registers with a precision of one second.
    2. Due to weird historical quirks that trace back to the original MC146818 RTC used in the first PC/AT standard, the hwclock utility sets the RTC so there’s a half-second difference between the system clock and the RTC.
    3. With the reference NTP daemon running on a stock Raspbian system, the kernel enters an “eleven minute mode” where it will periodically set the RTC’s clock to the NTP-synchronized system clock. This is undesirable, but turning it off requires recompiling the kernel and I’m lazy.
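
    To put the first issue in perspective: a 32.768 kHz crystal ticks 2^15 = 32768 times per second, which is why a 15-bit counter wraps exactly once per second. A minimal sketch of the resolution that counter would offer if it were readable:

```sh
#!/bin/sh
# A 15-bit counter holds 2^15 = 32768 values, one per crystal tick in a second.
ticks_per_second=$(( 1 << 15 ))
echo "${ticks_per_second}"   # prints 32768

# Duration of one tick in microseconds (awk handles the fractional division).
awk -v n="${ticks_per_second}" 'BEGIN { printf "%.2f microseconds per tick\n", 1000000 / n }'
```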

    First Issue

    To address the first issue, I use the hwclock -c command, which repeatedly reads the timekeeping registers on the RTC until the second changes. With the help of an oscilloscope and logic analyzer, I’ve confirmed that the seconds register (address 0x00, or “00h” in the notation the datasheet uses) is updated on the next read after the 1 Hz square wave output falls.

    It’s worth quoting this part of the DS3231 datasheet:

    On an I2C START or address pointer incrementing to location 00h, the current time is transferred to a second set of registers. The time information is read from these secondary registers, while the clock may continue to run. This eliminates the need to reread the registers in case the main registers update during a read.

    This means that nothing weird will happen if the 1 Hz counter ticks in the middle of a read and the next read from the seconds register after the tick will have the latest value. Perfect. This means we can mark the new second with a precision of +/- 1 I2C read packet, which is ~1 ms when using hwclock to read the data. Not bad.
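    The polling idea can be sketched in a few lines. This is a simplified illustration, not hwclock's actual code: read_seconds and now are hypothetical callables, and FakeRtc is a made-up stand-in that simulates a 1 ms read so the example runs without hardware. The rollover is bracketed by the start of the last read that returned the old value and the end of the first read that returned the new one.

```python
import time


def wait_for_rollover(read_seconds, now=time.monotonic):
    """Poll until the seconds value changes.

    Returns (new_value, earliest, latest): the rollover happened somewhere
    between `earliest` (start of the last read that saw the old value) and
    `latest` (end of the first read that saw the new value).
    """
    last_old_start = now()
    previous = read_seconds()
    while True:
        start = now()
        current = read_seconds()
        end = now()
        if current != previous:
            return current, last_old_start, end
        last_old_start = start


class FakeRtc:
    """Hypothetical simulated RTC: each read costs `read_cost` seconds."""

    def __init__(self, rollover_at, read_cost=0.001):
        self.t = 0.0
        self.rollover_at = rollover_at
        self.read_cost = read_cost

    def now(self):
        return self.t

    def read_seconds(self):
        # The value is latched at the start of the read, as on the DS3231.
        value = 1 if self.t >= self.rollover_at else 0
        self.t += self.read_cost
        return value


rtc = FakeRtc(rollover_at=0.0105)
value, earliest, latest = wait_for_rollover(rtc.read_seconds, rtc.now)
print(f"second rolled over between {earliest:.3f} s and {latest:.3f} s")
```

    With a 1 ms read, the window is two reads wide, which matches the "+/- 1 read" precision described above.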

    Second Issue

    For the second issue, we need to account for a half-second offset between the system clock and the RTC. If we let the kernel automatically set the RTC using the “eleven minute mode”, the RTC is a half-second ahead of the system clock. The output of hwclock -c looks like this:

    hw-time system-time freq-offset-ppm tick
    1517492769 1517492768.514489 -3 -0
    1517492780 1517492779.514081 -3 -0
    1517492791 1517492790.513630 -3 -0
    1517492802 1517492801.513253 -3 -0

    That is, going by the last line, the RTC is 0.486747 seconds ahead of the system clock. The reported precision is misleadingly high: I’ve measured a difference of 3-4 milliseconds between the value reported by hwclock and the actual offset between the RTC’s 1 Hz pulse and the UTC-aligned one-pulse-per-second (PPS) signal from a timing GPS receiver.
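    To make the arithmetic explicit, here is a small sketch that parses the sample output above and computes the RTC-minus-system offset for each line. The function name rtc_minus_system is made up for illustration; Decimal is used so the microsecond digits survive the subtraction exactly.

```python
from decimal import Decimal

SAMPLE = """\
hw-time system-time freq-offset-ppm tick
1517492769 1517492768.514489 -3 -0
1517492780 1517492779.514081 -3 -0
1517492791 1517492790.513630 -3 -0
1517492802 1517492801.513253 -3 -0
"""


def rtc_minus_system(text):
    """Return hw-time minus system-time (in seconds) for each data line."""
    offsets = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any blank lines.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        offsets.append(Decimal(fields[0]) - Decimal(fields[1]))
    return offsets


offsets = rtc_minus_system(SAMPLE)
for offset in offsets:
    print(offset)
```

    A positive result means the RTC is ahead of the system clock.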

    However, if we set the RTC manually using hwclock -w, the system clock is a half-second ahead of the RTC (at least until the eleven minute mode resets the RTC). Here’s an example:

    hw-time system-time freq-offset-ppm tick
    1517493232 1517493232.502916
    1517493244 1517493244.502209 -59 -1
    1517493255 1517493255.502850 -3 -0
    1517493266 1517493266.502547 -11 -0
    1517493277 1517493277.502220 -15 -0

    Keep this in mind if you decide to do a similar test. For consistency, I used hwclock -w to set all the clocks in the ensemble simultaneously. One nice thing about the DS3231 and DS3231M is that they reset their internal 15-bit fractional seconds counter whenever the seconds register is written, so once set you only need to account for the half-second offset and not any leftover fractional seconds. (See the “Clock and Calendar” section of the datasheet: “The countdown chain is reset whenever the seconds register is written.”)

    The RTC’s 1 Hz pulse goes high 500 ms after the seconds register is written, with the falling edge (which delineates the seconds) occurring 500 ms after that. Thus, the output of the 1 Hz pulse is synchronized to the seconds boundary (modulo the half-second offset discussed above) whenever the RTC is set.

    This is rather interesting, as I had assumed the 1 Hz output from the RTC to simply be the buffered, divided-by-32768 output of the crystal with no additional processing, but it’s actually something the chip can adjust the timing of based on user input. That’s really cool and could come in handy when measuring the phase drift of two or more such clocks (keeping in mind that the 1 Hz pulse is aligned to the closest crystal tick).

    As an experiment, I removed the battery from the RTC to reset it to factory defaults, enabled the 1 Hz output, and measured its offset relative to the GPS PPS pulse (it was about 115 ms, but it was free-wheeling). After setting the RTC with hwclock -w, the rising edge was within 1 ms of the GPS pulse. Yup, it works.

    Let’s look at this visually:

    In this image, the RTC time is set using hwclock -w at the A1 marker (there’s a short burst of SDA/SCL data that’s a bit hard to see at this level of zoom). The RTC’s 1 Hz signal (“DS3231 SQW”) goes high 500 ms later, with the rising edge aligned with the GPS PPS marker (A2). At ~1.9 seconds hwclock starts repeatedly reading the RTC (the long burst of SDA/SCL data) to catch the boundary between RTC seconds, which occurs at the falling edge of the 1 Hz signal. Note the 0.5 second offset between the rising edge of the GPS PPS (which marks the UTC second) and the falling edge of the RTC pulse.

    Third Issue

    The solution to this issue is easy: I basically ignore it. I have two PCA9548A I2C multiplexers connected to my Raspberry Pi’s I2C bus, so I can programmatically switch between each RTC (all of which have the same fixed I2C address, necessitating the multiplexers). Thus, I can rapidly scan through each of the RTCs, gather data from each, and finally set the multiplexer to a position where there is no RTC.
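    For readers unfamiliar with the PCA9548A: a channel is selected by writing a single control byte to the multiplexer's own I2C address, with bit n enabling channel n and 0x00 disabling all channels. The sketch below shows that scheme under some assumptions: the addresses 0x70 and 0x71 are guesses (the part can be strapped anywhere from 0x70 to 0x77), RecordingBus is a made-up stand-in so the example runs without hardware, and parking is done here by disabling every channel rather than by selecting an empty one. On a real Pi, an smbus2 SMBus object could be passed instead, since it provides write_byte(i2c_addr, value).

```python
MUX_ADDRESSES = (0x70, 0x71)  # assumed addresses for the two multiplexers


def control_byte(channel):
    """PCA9548A control byte that enables exactly one channel (0-7)."""
    if not 0 <= channel <= 7:
        raise ValueError("PCA9548A channels are numbered 0-7")
    return 1 << channel


def select_channel(bus, active_mux, channel):
    """Enable one channel on one multiplexer and disable all the others.

    Passing active_mux=None disables every channel on every multiplexer.
    """
    for addr in MUX_ADDRESSES:
        value = control_byte(channel) if addr == active_mux else 0x00
        bus.write_byte(addr, value)


class RecordingBus:
    """Hypothetical stand-in for an I2C bus that just logs the writes."""

    def __init__(self):
        self.writes = []

    def write_byte(self, i2c_addr, value):
        self.writes.append((i2c_addr, value))


bus = RecordingBus()
select_channel(bus, 0x71, 3)      # talk to the RTC on mux 0x71, channel 3
select_channel(bus, None, None)   # park: no RTC is reachable
print([(hex(addr), hex(value)) for addr, value in bus.writes])
```

    Disabling the unused multiplexer on every switch matters here because all the RTCs share address 0x68, so only one channel across both multiplexers should be open at a time.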

    The probability of the kernel setting the RTC on its eleven minute mode schedule during this quick burst of activity is non-zero but low enough that I don’t worry. If I wanted to turn off the eleven minute mode entirely, I could recompile the kernel with that option disabled, but I’m lazy.

    Conclusion

    Based on measurements of both the software and hardware timing, I can conservatively read the time from the RTC within +/- 10 ms. That’s both better than I expected and more than adequate for my testing. Going into this, I was expecting around 200-500 ms precision.

    The major (and I use that term relatively) issue I face is the half-second offset. Since I know I used hwclock -w to set all the RTCs simultaneously, I know they were all set 0.5 seconds behind the system time at the moment they were set. Also, since the output of hwclock -c is based on the tick of the RTC clock, the system clock would show as 0.5 seconds ahead. Later, when I measure the difference between the RTC and system clock I need to make sure to subtract 0.5 seconds from the reported system time to get a proper comparison.
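    As a worked example of that correction, the sketch below applies it to the last line of the hwclock -w sample shown earlier. The function name rtc_error is made up for illustration, and the sign convention (positive means the RTC is ahead) is a choice made here, not something hwclock defines.

```python
from decimal import Decimal

HALF_SECOND = Decimal("0.5")


def rtc_error(hw_time, system_time):
    """RTC minus system clock after removing the hwclock -w half-second offset.

    Both arguments are strings taken from `hwclock -c` output.
    Positive means the RTC is ahead of the system clock.
    """
    return Decimal(hw_time) - (Decimal(system_time) - HALF_SECOND)


# Last line of the hwclock -w sample output.
error = rtc_error("1517493277", "1517493277.502220")
print(f"RTC error: {error} s")
```

    Here the corrected difference is about 2 ms, which is inside the measurement uncertainty discussed above.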

    Lastly, datasheets have a lot of interesting details that are really easy to overlook. Having a logic analyzer (I really like Saleae) and a cheap timing GPS with a PPS output (an old Motorola Oncore UT+ I use for my house NTP server; it’s old but keeps ticking along) makes it really easy and fun to explore these details. I enjoyed digging into the behavior of the hwclock software and DS3231 hardware and hope this information is of some use to you.

  • My DS3231 test setup

    I wanted to test several DS3231 (M and non-M variants) boards for drift, so I mounted eleven of them (including one known-genuine DS3231M, the leftmost one on the front row, with a green bodge wire) to a breadboard, connected a regulated power supply (AMS1117) at 3.3V to both power rails, and made sure they all worked.

    Eleven DS3231 (including 3 M variants) on a breadboard for testing.

    Yup, they all work. The boards have either orange or red LEDs, so they emit a pleasing glow at night that prevents me from crashing into things in my office at home.

    Why use 3.3V? One, it makes interfacing with the 3.3V I2C pins on a Raspberry Pi easy since I don’t need a level-shifter, and two, it minimizes drift in the event that I need to disconnect the power and have the clocks run on their coin cell backups. The CR2032 batteries have a nominal voltage of 3.0V, but all currently measure 3.3V (they’re brand-new, Energizer-brand cells from Digi-Key). The DS3231 datasheet says the drift can change by up to 1ppm/volt, so I want to minimize the voltage difference between the normal power supply and the coin cells.

    To ease the comparison of drift, I want to ensure all the clocks start counting at the same moment. I could set them all one at a time, but this is complicated because (a) I don’t have an I2C multiplexer chip, and (b) setting them sequentially means they’re not all set at the same moment. It probably wouldn’t matter much in the long run, but it would make me happy to set them at the same time.

    The DS3231 modules all have the I2C address of 0x68, and it cannot be changed. Normally, you cannot have multiple chips with the same address on the same I2C bus, as they’ll talk all over each other and the resulting signals will be garbage.

    Fortunately, we don’t need the DS3231s to talk; they need only listen to the master and make the appropriate ACK/NAK signals as needed. They should all send the same ACK/NAK signals at the same time so, in theory, there shouldn’t be a problem.

    Next, we need to worry about bus current. Each module has a 4.7k ohm pull-up resistor for the I2C bus. With eleven modules in parallel, that means the effective pull-up resistance is ~430 ohms. At 3.3V working voltage, a device would need to sink nearly 8mA to correctly signal a logic low. The Raspberry Pi I have can sink 16mA per GPIO pin, so that’s fine. The DS3231 datasheet says the IOL is 3mA, though I spoke with an engineer at Maxim Integrated and they said the absolute maximum current for the process used on the chips is 10mA. 8mA is close to that limit, but the current would hopefully be spread across many devices and would only be for a few microseconds in total, so it should be fine.
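    The numbers above follow from treating the eleven pull-ups as identical resistors in parallel and applying Ohm's law. A quick check, assuming the line is pulled all the way to 0V (the worst case for sink current):

```python
PULLUP_OHMS = 4700.0   # per-module pull-up resistor
MODULES = 11           # modules wired in parallel on the bus
SUPPLY_VOLTS = 3.3

# N identical resistors in parallel behave like R / N.
parallel_ohms = PULLUP_OHMS / MODULES

# Current a device must sink to hold the line low (assumes ~0 V when low).
sink_ma = SUPPLY_VOLTS / parallel_ohms * 1000

print(f"{parallel_ohms:.0f} ohms, {sink_ma:.2f} mA")
```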

    I was satisfied I wasn’t going to blow anything up (and if I did, replacements are cheap), so I connected all eleven modules in parallel to the same I2C bus and commanded them to set their date and time to an arbitrary date in the past. If this was successful, I could send a command to read the time and, if all the modules had the same time, it would come through without an error. If things didn’t work, garbage would come in and I’d have to check them individually for the correct time. One read to all of the devices simultaneously, and I had valid data for that arbitrary time and date. Excellent. It worked!

    Using the Raspberry Pi, synchronized to within a millisecond of a local NTP server (another Raspberry Pi running NTP with a GPS reference clock), I sent the command to set the date and time on all the modules to the current time on Friday 8 Sep 11:18:16 UTC 2017 (unix time: 1504869496). Reading the date and time from all the modules confirmed they all have the correct date and time with no errors.
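    For reference, the DS3231 stores the time as binary-coded decimal in registers 00h through 06h (seconds, minutes, hours, day of week, date, month, year). The sketch below converts the unix timestamp above and builds those seven bytes. It is an illustration only, not necessarily how the clocks were actually set: the helper names are made up, the day-of-week value is user-defined on the chip (Monday=1 is assumed here), and the hours are written in 24-hour mode with the century bit left clear.

```python
from datetime import datetime, timezone


def to_bcd(n):
    """Encode 0-99 as binary-coded decimal (tens in the high nibble)."""
    return ((n // 10) << 4) | (n % 10)


def ds3231_time_registers(dt):
    """Bytes for DS3231 registers 00h-06h, 24-hour mode, century bit clear."""
    return bytes([
        to_bcd(dt.second),
        to_bcd(dt.minute),
        to_bcd(dt.hour),          # bit 6 clear selects 24-hour mode
        dt.isoweekday(),          # day of week is user-defined; Monday=1 here
        to_bcd(dt.day),
        to_bcd(dt.month),         # bit 7 (century) left clear
        to_bcd(dt.year % 100),
    ])


set_time = datetime.fromtimestamp(1504869496, tz=timezone.utc)
registers = ds3231_time_registers(set_time)
print(set_time.strftime("%A %d %b %Y %H:%M:%S UTC"), registers.hex())
```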

    Now I’ll let them run for a while to see how they drift. A few have hand-tuned aging registers, so they should hopefully drift less than the others, while others use the default aging register of 0.

     

  • Major differences between the DS3231 and DS3231M RTC chips

    As should be clear from one of my earlier posts, I’m really interested in clocks and precision timekeeping. In particular, I rather like the Dallas Semiconductor DS3231 series of temperature compensated RTC/TCXO (real-time clock/temperature compensated crystal oscillator) modules.

    Recently, I had ordered several DS3231 boards from my regular eBay vendor in Shenzhen for some testing, only to find two oddities: first, the factory had evidently gotten an incorrect chip with the same sized 0.300″ SOIC package as the DS3231. This chip was the wholly-incompatible DS1315. It happens, particularly at this price point and via gray market suppliers. No worries, I contacted the seller and they sent me a replacement board.

    (more…)

  • At long last…

    After 6 years of intense study, I’ve finally earned my PhD in Physics (with a Master’s along the way) as of last Friday.

    That was a heck of a ride. What to do and where to go next?