/r/zfs

Don't be a jerk.

Don't be nasty to other people. If you think somebody's wrong, you can say that without casting aspersions or being super sarcastic. Just be nice to people, ok?

Don't spam.

It's fine to link to youtube videos, blog posts, what have you. Even if you're the one who created them. BUT, only if it's materially useful to answer a question, or offer information, in some sense other than "this will get people to give me money."

This isn't an issue we usually have trouble with, so let's just keep not having trouble with it. NOTE: sometimes Reddit's auto-spam system flags links it shouldn't. If your post or comment gets hidden, send modmail and we'll take a look.

All ZFS platforms are cool.

If there's useful information about a difference in implementation or performance between OpenZFS on FreeBSD and/or Linux and/or Illumos - or even Oracle ZFS! - great. But please don't flame people for not using your own personal One True Platform. Thanks.

No dirty deletes.

If I catch anybody else deleting their question and all their comments on it immediately after getting an answer, they're getting an instant banhammer.

Half the point of asking questions in a public sub is so that everyone can benefit from the answers—which is impossible if you go deleting everything behind yourself once you've gotten yours.

33,273 Subscribers

1

Announcing bzfs-1.6.0

I'm pleased to announce the availability of bzfs-1.6.0. In the spirit of rsync, bzfs supports a variety of powerful include/exclude filters that can be combined to select which ZFS datasets, snapshots and properties to replicate or delete or compare.

This release contains performance and documentation enhancements as well as new features, including ...

  • On exit also terminate still-running processes started via subprocess.run()
  • --compare-snapshot-lists is now typically much faster than standard 'zfs list -t snapshot' CLI usage because the former issues requests with a higher degree of parallelism than the latter. The degree is configurable with the --threads option.
  • Also run nightly tests on zfs-2.2.6
  • Progress viewer: also display the total size (in GB, TB, etc) of all incremental snapshots that are to be transferred for the current dataset, as well as the total size of all incremental snapshots that have already been transferred for the current dataset as part of the current job.
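
To illustrate the parallelism point above, here is a minimal Python sketch of listing snapshots for several datasets concurrently instead of in one serial pass. This is not bzfs's actual code: the function names, the `threads` parameter and the injectable `list_fn` are assumptions made for illustration; only the `zfs list` flags are standard CLI usage.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def zfs_list_snapshots(dataset: str) -> List[str]:
    """List the snapshot names of one dataset via the standard zfs CLI."""
    out = subprocess.run(
        ["zfs", "list", "-t", "snapshot", "-H", "-o", "name", "-d", "1", dataset],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def list_snapshots_parallel(
    datasets: Iterable[str],
    list_fn: Callable[[str], List[str]] = zfs_list_snapshots,  # hypothetical hook
    threads: int = 8,  # hypothetical default, loosely mirroring a --threads option
) -> Dict[str, List[str]]:
    """Run list_fn for every dataset on a thread pool and collect the results."""
    datasets = list(datasets)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so zip() pairs each dataset with its result
        return dict(zip(datasets, pool.map(list_fn, datasets)))
```

Passing a different `list_fn` (for example one that runs the command over ssh) keeps the fan-out logic unchanged.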

All users are encouraged to upgrade.

For more details, see https://github.com/whoschek/bzfs

0 Comments
2024/12/03
19:48 UTC

8

Why you should enable AMD's TSME (or Intel's TME) for ZFS servers - especially if you aren't running ECC

(Yes, "you should use ECC", obviously. But crazy enough, not everyone IS running ECC. I know you want to make that comment anyway, so please read the rest of the post first.)

On AMD, TSME stands for "Transparent Secure Memory Encryption". On Intel it's "Total Memory Encryption". When enabled in BIOS, it encrypts all memory, per 64-byte page, without requiring the OS's active involvement or even awareness.

Why enable it: Data reliability, even if you ARE running ECC. (Also security, but that's obvious. Reliability is less obvious.)

With TSME/TME enabled, a single flipped bit changes the contents of an entire memory page, or 512 bits.

Without TSME (and without ECC), a single flipped memory bit can crash a program or halt the entire OS - if you're lucky. But if unlucky, a flipped bit can silently corrupt your data. The filesystem you're running is completely irrelevant.

But with TSME enabled, a single flipped bit will catastrophically change the entire memory page it's in, which will have much more obvious results and will be far more likely to crash the program if not the OS. (Which is better than silent corruption.)

It can help ECC too, at the cost of only about 6-10ns extra latency (and no cost to throughput). Although vastly less likely mathematically, ECC can also experience silent corruption.

The standard refrain for ZFS servers is, of course, "use ECC". So much so that some people new to ZFS actually think it's required, and that ZFS won't even run without it. (And then those new, low-information users are sadly the ones most likely to knee-jerk echo the refrain, thinking it makes them sound smart, without really understanding it.)

It's a refrain which needlessly hurts ZFS adoption. ZFS "requires" ECC no more than any other filesystem. Memory errors are bad, plain and simple. ECC is just a good idea for data integrity in general, and becomes a better and better idea the more data passes through your memory bus (e.g. active servers). That ZFS and Btrfs are checksumming filesystems has nothing to do with an ECC requirement.

ZFS checksumming and memory ECC improve data reliability independently. Together, even better.

And indeed, you can't even order an enterprise ZFS server without ECC. (And if you value your job, I wouldn't try. If anything goes wrong, even unrelated, it will still be your head. Can't say I'd blame them.)

If you're setting up a new ZFS server at home, there's no good argument for NOT running ECC. The only real argument period, is "affordable Intel CPUs don't support ECC" - so just go with AMD instead, and make sure the motherboard supports ECC.

But here's the rub: there is a significant home server segment of budget and/or eco-minded people who turn old desktops that don't support ECC into multi-HDD file servers.

For example, running Linux and ZFS. Or Windows and NTFS RAID 1. Or whatever.

Then there's another set of ZFS users - possibly overlapping - who run an array from what they consider to be, say, a primary gaming rig - and/or who otherwise prioritize absolute performance over ECC.

(For example, there is no >=3600 MHz ECC RAM to my knowledge. And none to my knowledge with factory-installed heat-spreaders, which some users may not feel comfortable installing themselves. Some people just have to have their 4000 MHz RAM. Not me, but also not for me to judge.)

Those users may be well aware of the risks - in fact they are probably more objectively aware of the risks than those with the knee-jerk "never use non-ECC RAM for ZFS" refrain - and have weighed the economic and data loss pros and cons and made a rational decision that works for them. (Maybe not for you, but it's not your choice is it. Some people just have a hard time thinking of a world outside their own narrow interests and needs. And yet, 4000 MHz memory modules somehow exist.)

FWIW, I run multiple ZFS and Btrfs arrays on ECC and non-ECC machines, with multiple HDD, SATA SSD, and NVMe arrays. Of course I prefer ECC memory, and only run critical data on ECC. Currently all non-ECC arrays run with TSME enabled. But I've never had a single flipped bit on a storage array in 35 years that was logged (or, in more recent years, that TSME crashed on). Of course I've had plenty of crashes on non-critical non-ECC machines, like anyone. Whether they were due to cosmic rays, driver bugs, or other hardware problems is sometimes impossible to know, but I'm usually pretty sure it was just buggy drivers.

References:

A (rare) sober analysis which has been posted a few times here, with links to several studies: "To ECC or Not To ECC"

https://blog.codinghorror.com/to-ecc-or-not-to-ecc/

More studies, less optimistic:

https://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf

https://www.cs.rochester.edu/~kshen/papers/usenix2010-li.pdf

15 Comments
2024/12/03
16:56 UTC

2

Why does the number of blocks in the volume keep changing? Second column in df output.

root@debian: [ ~ ]# df | grep "^zzyzx/mrbobbly"
zzyzx/mrbobbly  27756416 1239424  26516992   5% /zzyzx/mrbobbly

root@debian: [ ~ ]# df | grep "^zzyzx/mrbobbly"
zzyzx/mrbobbly  27757312 1242112  26515200   5% /zzyzx/mrbobbly
 
root@debian: [ ~ ]# df | grep "^zzyzx/mrbobbly"
zzyzx/mrbobbly  27757440 1242624  26514816   5% /zzyzx/mrbobbly
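
One thing the three samples have in common: the second column is exactly the sum of the third (used) and fourth (available) columns. A quick Python check, with the numbers copied from the output above (the helper name is just for illustration):

```python
# (total, used, available) in 1K blocks, copied from the three df runs above
samples = [
    (27756416, 1239424, 26516992),
    (27757312, 1242112, 26515200),
    (27757440, 1242624, 26514816),
]


def total_matches(total: int, used: int, available: int) -> bool:
    """True when the df 'total' column equals used + available."""
    return total == used + available


for total, used, available in samples:
    print(total, used + available, total_matches(total, used, available))
```

So the total shown for the dataset is not a fixed quantity here; it moves whenever used or available space moves. Why the two don't change by equal and opposite amounts can't be told from this output alone.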
2 Comments
2024/12/03
16:08 UTC

0

Is it possible to configure an SSD to act as a sort of write-cache to speed up large incoming file transfers?

Hey all, I'm an enthusiast who's eager to learn more about ZFS. I'm setting up a ZFS server currently and looking at different configurations, and I've been reading a lot of different posts online.

Firstly, I'm coming at this more so with a "is it possible" mindset than a "is it optimal/worth-it" - I'm curious purely for educational purposes whether there's a way to make it work as I was expecting.

I'm wondering if it's possible to set up an SSD (or two) to act as a sort of cache when copying large amounts of data onto my storage server. The thought process being that I'd like to be able to see the file transfer as "complete" on the machine that's sending the file as fast as possible. My ZFS server would finish copying to the SSDs first and the file transfer would be complete, and it would then, on its own time/pace, finish copying from the SSDs to its HDD array without holding up the other devices that were sending data.

If I could use SSDs for a purpose like this it would enable me to saturate my ethernet connection for much larger file transfers I believe.

Does anyone know if this is possible in some way?

(I would be okay with any dataloss from a rare SSD failure during the middle of this process - my thoughts being that most of my use case is making system backups, and small dataloss on the most recent backup would be the least damaging type possible as any new files would likely still exist on the source system, and any old files would exist in older backups too).

If additional context helps, I'm looking at 5 HDDs with double parity (planning to expand with a few more drives and switch to triple parity eventually). For the SSDs, I'm currently considering adding two with no parity to optimize the speed of large transfers, if the above concept works. (And yes, I'm aware of using SSDs as special metadata devices; I have plans for that as well, but it seemed like a separate topic for now.)

30 Comments
2024/12/03
00:53 UTC

6

Dumb Past Self causes Future me Write Speed Problems

Since 2021 I've had a tumultuous time with WD drives, and I heavily regret not doing further research / increasing my budget before building a TrueNAS machine. But here we are; hindsight is always 20/20.

My original setup was in an N54L. I've now moved everything into a z87 3770K build, because I have always had issues with write performance, I guess as soon as RAM gets full, and I wanted to ensure the CPU and RAM were not the bottleneck. Once a few GB of data was written, the write speed would drop down into kilobytes per second. This happened especially when dumping tons of DSLR images onto the dataset.

A bunch of drive failures and hasty replacements have not helped, but my write issues persist even after moving to the 3770K PC with 32GB RAM. While looking into whether zLOG could fix the issue, I've now discovered SMR and CMR. And I think I'm cooked.

What I currently have in the array are as follows (Due to said failures)
3x WDC_WD40EZAZ-00SF3B0 - SMR
1x WDC_WD40EZAX-22C8UB0 - CMR

TLDR: bought some SMR drives - write performance has always been dreadful.

Now that's out of the way - the questions:

Does a massive drop-off in sustained heavy write performance sound like the SMR drives being terrible? Or is it possible there is some other issue causing this?

Do all drives in the array need to be the same model realistically?

Do I need to sell a kidney to just replace it all with SSDs, or is that not worth it these days?

Anyone got a way to send emails to the past to tell me to google SMR vs CMR?

thanks in advance

20 Comments
2024/12/02
21:41 UTC

5

What is this write during scrub?

I'm running scrub on a 7-drive raidz1 SSD pool while watching smartctl (as I always do in case of errors). The pool is completely idle except for scrub - double checked and triple checked.

I noticed my LBA-written counters steadily go up during scrub, at EXACTLY 80 LBA per second per drive on all 7 drives. That works out to 40KB/s per drive. That shouldn't happen given scrub is theoretically read-only, but my googling hasn't yielded anything useful in terms of what could be writing.

The LBA increase stops immediately once scrub is paused, so I'm 100% sure it's scrub that is doing the writing. Does anyone know why, please? And is there any tuning I can do to reduce that?

I'm not too concerned, but given it works out to about 1.2TBW / year, if there's any tuning I can do to reduce that, it would be appreciated too.
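
For reference, the arithmetic behind those figures can be sketched in Python. The assumptions here are mine: the SMART counter is taken to count 512-byte LBAs, and the yearly figure pretends the scrub runs non-stop all year, so it is an upper bound per drive.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def write_rate_bytes(lbas_per_second: int, sector_bytes: int = 512) -> int:
    """Bytes written per second, assuming each LBA is sector_bytes long."""
    return lbas_per_second * sector_bytes


def bytes_per_year(rate_bytes_per_second: int) -> int:
    """Total bytes written if that rate were sustained for a whole year."""
    return rate_bytes_per_second * SECONDS_PER_YEAR


rate = write_rate_bytes(80)    # 40960 B/s, i.e. 40 KiB/s per drive
yearly = bytes_per_year(rate)  # 1291714560000 bytes per drive
print(f"{rate / 1024:.0f} KiB/s, {yearly / 1e12:.2f} TB/year, {yearly / 2**40:.2f} TiB/year")
```

That lands between roughly 1.17 TiB and 1.29 TB per drive per year, consistent with the ~1.2TBW estimate; with scrubs running only occasionally, the real number would be far lower.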

3 Comments
2024/12/02
13:36 UTC

1

zfs auto balance problem?

I have a zfs pool with eight identical hard drives. I have noticed that the data is not evenly distributed. One of the mirror pairs has less data than the others. Why is there such a big difference?

root@dl360g8:~# zpool iostat -vy 1

                              capacity      operations     bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
LOCAL-R10                    391G  2.87T    660      0  81.1M      0
  mirror-0                   102G   730G    156      0  19.9M      0
    scsi-35000cca02298e348      -      -     94      0  12.1M      0
    scsi-35000cca0576f0f7c      -      -     62      0  7.77M      0
  mirror-1                  86.4G   746G    182      0  22.3M      0
    scsi-35000c50076aecd67      -      -     82      0  10.2M      0
    scsi-35000cca0576ea544      -      -    100      0  12.1M      0
  mirror-2                   102G   730G    169      0  20.3M      0
    scsi-35000cca0576e0a60      -      -     95      0  11.6M      0
    scsi-35000cca07116b00c      -      -     74      0  8.69M      0
  mirror-3                   101G   731G    149      0  18.6M      0
    scsi-35000cca05764eb34      -      -     70      0  8.70M      0
    scsi-35000cca05763f458      -      -     79      0  9.87M      0
--------------------------  -----  -----  -----  -----  -----  -----
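
To put a number on the imbalance, here is a small Python sketch that computes each mirror's fill ratio from the alloc/free columns above. The values are copied from the iostat output in GiB as displayed (so they are rounded); the variable names are just for illustration.

```python
# (alloc, free) per mirror vdev, in G as shown by zpool iostat above
vdevs = {
    "mirror-0": (102.0, 730.0),
    "mirror-1": (86.4, 746.0),
    "mirror-2": (102.0, 730.0),
    "mirror-3": (101.0, 731.0),
}

# Fill ratio = allocated space divided by the vdev's total (alloc + free)
fill = {name: alloc / (alloc + free) for name, (alloc, free) in vdevs.items()}

for name, ratio in fill.items():
    print(f"{name}: {ratio:.1%} full")

least_full = min(fill, key=fill.get)
print("least full:", least_full)
```

By this measure the gap is about two percentage points of each vdev's capacity, with every mirror still under 13% full.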

8 Comments
2024/12/01
21:14 UTC

0

upgrade to rc?

I'm running zfs-2.2.6-1 on debian 12. I have two pools, a primary 6x12TB zpool (zstore), and a 3x12TB zpool (backup) which contains a copy of the filesystems on the primary. I plan to add three 12TB drives to backup so it matches zstore in capacity. The three extra 12TB drives arrive soon. Is it fairly safe to install 2.3 RC3 from git to use raidz expansion to add the three new drives to zpool backup? Or should I just blow away my backup zpool and rebuild it (three days rsync, yikes!).

2 Comments
2024/12/01
04:29 UTC

1

Recommendations for setting up NAS with different size/types drives

I have the following hardware:
- AMD 3900x (12 core)/64 GB RAM, dual 10G NIC
- Two NVME drives (1TB, 2TB)
- Two 22TB HDD
- Two 24TB HDD

What I was thinking is to set up Proxmox on the 1TB drive and dedicate the other 5 drives to a TrueNAS VM running in Proxmox.

I don't think I have strong requirements... basically:

- I would like to have Encryption for all storage if possible (but we can ignore the Proxmox host drive for now to keep things simpler)

- I read that you need to have ZFS have access to host controller so, if I understand correctly, I may need to invest in an expansion card? Recommendations? and then redirect this to the TrueNAS VM (with all but the 1TB drive connected)

- The TrueNAS VM virtual volume would be on the 1TB host SSD

Assuming the above is done then we can focus on setting up TrueNAS with the 5 drives.

This leads me to some thoughts/questions for the NAS itself and ZFS configuration:

- I think I would be ok with one single zpool? or are there reasons I would not? (see below for more details)

- I *think* it would be ok to have 2x24TB (mirrored) and 2x22TB (mirrored)... would this give me 46TB of usable space in the pool? does it cause problems if the drives are different sizes?

- Presumably, the above would give me both redundancy and performance gains? basically I would only lose data if 2 drives in the same mirror set (vdev?) failed?

- What type of performance could I expect? Would ZFS essentially spread data across all 4 disks and potentially allow 4x read speeds? I don't think I will be able to max out a 10Gb NIC with just 4 HDDs, but I hope it is realistic to at least get 500MB/s+?

- What would make sense to do with the 2TB NVME drive? this is where it gets more complex with cache drive?

Thoughts/Suggestions?

Thanks

9 Comments
2024/12/01
02:27 UTC

0

16x 7200 RPM HDD w/striped mirror (8 vdev) performance?

Does anyone have performance metrics on a 16x 7200 RPM HDD w/striped mirror (8 vdev)? I recently came across some cheap 12TB HDDs for sale on ebay. Got me thinking about doing a ZFS build.

https://www.ebay.com/itm/305422566233

I wonder if I'm doing the calculations right

  • ~100 IOPS per HDD
  • 128KiB block size = 1024 Bytes/KiB * 128 KiB = 131072 Bytes
  • 128KiB * 100 IOPS/ HDD = 13.1 MB/s
  • 13.1 MB/s * 8 vdevs = 104 MB/s (834.4 Mbps)
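
The same back-of-the-envelope model written out in Python, as a sketch only. It keeps the assumptions from the list above (about 100 random IOPS per vdev, every I/O a full 128 KiB record, perfect scaling across 8 vdevs), which makes it a pessimistic small-random-I/O estimate rather than a prediction; the function name is made up for illustration.

```python
KIB = 1024


def pool_throughput(iops_per_vdev: int, record_kib: int, vdevs: int):
    """Return (bytes/s, MB/s, Mbit/s) for the simple IOPS x record size model."""
    bytes_per_sec = iops_per_vdev * record_kib * KIB * vdevs
    mb_per_sec = bytes_per_sec / 1e6
    mbit_per_sec = bytes_per_sec * 8 / 1e6
    return bytes_per_sec, mb_per_sec, mbit_per_sec


total_bytes, mb, mbit = pool_throughput(iops_per_vdev=100, record_kib=128, vdevs=8)
print(f"{mb:.1f} MB/s, {mbit:.1f} Mbps")
```

Without intermediate rounding this comes out slightly higher than the figures in the list (about 104.9 MB/s, or roughly 839 Mbps), but the order of magnitude is the same.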

My storage needs aren't amazing. Most of my stuff fits in a 1 TB NVMe drive. The storage needs are mostly based on VM performance rather than storage density, but having a few extra TBs of storage wouldn't hurt as I look to do file and media storage.

This is for home lab so light IOPS per VM is ok but there are times when I need to spin a ton of VMs up (like 50+). What are tools I can use to get a baseline understanding of my disk IO requirements for VMs?

834.4 Mbps seems a bit underwhelming for disk performance. I feel like getting a 4x NVMe stripe with a smaller HDD array would be better for me. Will an NVMe SLOG help with these VM workloads?

I'm a little confused here as well because there is the ARC for caching. For reference, I'm just running vanilla OpenZFS on Ubuntu 24.04. I'm not running anything like Proxmox or TrueNAS.

I guess I can shell out some money for a smaller test setup, but I was hoping to learn from everyone's experience here rather than potentially having a giant paper weight NAS collecting dust.

17 Comments
2024/11/30
19:02 UTC

0

Having issues correcting my RaidZ1 mistake.

Hey there,
I've set up a RaidZ1 pool, but I used improper identifiers, e.g. sda, sdb, and sdd.
I wanted to correct my mistake, but when I do `sudo zpool export media-vault` I'm getting:
`cannot export 'media-vault': pool is busy`
But to my knowledge there is nothing interacting with the pool.

I've tried:
- Restarting my server.
- Unmounting the zpool.
- When using the mount | grep zfs command it returns nothing.
- I don't have any shares running that are accessing this zpool.
- There are also no terminal sessions in that.

Any help is greatly appreciated! Cuz I really don't know what to do anymore.
Thank you. :)

19 Comments
2024/11/30
17:35 UTC

4

ZFS-Send Questions

According to the manpage for ZFS-Send, output can be redirected to a file. Can that output be mounted or viewed after it is created? Or can it only be used by ZFS-Receive?

Also, do the ZFS properties affect the resulting send file? For example, if the copies property is set to 2, does ZFS-Send export 2 copies of the file?

6 Comments
2024/11/30
02:26 UTC

5

Drive suggestions for backup server?

My backup server is running my old PC's hardware:

  1. MOBO: Gigabyte H610I
  2. CPU: i5 13500
  3. RAM: 32GB RAM
  4. SSD: Gigabyte SSD M.2 PCIE NVMe 256GB
  5. NIC: ConnectX4 (10GB SFP+)

Both the backup server and the main server are connected via a 10Gbps SFP+ port.

There's no available PCIE or M.2 slots, only 4 Sata connections that I need to fill.

My main backup server has about 40TB, but in reality 80% of that is for usenet media which I don't need to backup.

I want to get the fastest storage + highest capacity that I could use GIVEN MY HARDWARE'S CONSTRAINTS. I want to maximize that 10gbps port when I back up.

What would you suggest for the 4 available SATA slots?

Note: My main server is a beast and can saturate that 10Gbps link without sweating, and my networking gear (switch, firewall, etc) can also easily eat this requirement. I only need to not make my backup server the bottleneck.

3 Comments
2024/11/29
22:41 UTC

1

Current 4x8TB raidz1, adding 4x8TB drives, what are some good options?

I currently have a single vdev 4x8TB raidz1 pool. I have 4 more 8TB drives I would like to use to expand the pool. Is my only good option here to create a second 4x8TB raidz1 vdev and add that to the pool, or is there another path available, such as to an 8x8TB raidz2 vdev? Unfortunately I don't really have an external storage volume capable of holding all the data currently in the pool (with redundancy of course).

I'm running unraid 6.12.14 so at the moment I'm stuck on zfs 2.1.15-1 unfortunately, which I'm guessing doesn't have the new vdev expansion feature. I'd be open to booting some other OS temporarily to run the vdev expansion as long as the pool was still importable in unraid with its older zfs version, not sure how backward compatible that kind of thing is.
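
For comparing the two layouts mentioned above on capacity alone, a rough Python sketch follows. It uses the usual approximation that a raidz vdev's usable space is (disks minus parity) times disk size, and ignores padding, metadata and TB/TiB differences; the function name is hypothetical.

```python
def raidz_usable_tb(disks: int, parity: int, disk_tb: float) -> float:
    """Approximate usable capacity of one raidz vdev: data disks x disk size."""
    if disks <= parity:
        raise ValueError("a raidz vdev needs more disks than parity devices")
    return (disks - parity) * disk_tb


two_raidz1_vdevs = 2 * raidz_usable_tb(disks=4, parity=1, disk_tb=8)
one_raidz2_vdev = raidz_usable_tb(disks=8, parity=2, disk_tb=8)
print(two_raidz1_vdevs, one_raidz2_vdev)
```

Under this approximation both layouts give the same raw usable space, so the choice comes down to redundancy and performance characteristics rather than capacity.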

21 Comments
2024/11/29
18:43 UTC

5

zfs disk cloning

I have a bootable disk that I am trying to clone. The disk has 2 zfs filesystems (/ and /boot called rpool/ROOT/uuid and bpool/BOOT/uuid) , a swap partition and a fat32 efi partition.

I used sgdisk to copy the source partition layout to the destination disk:

sgdisk --backup=/tmp/sgdisk-backup.gpt "$SOURCE_DISK" 
sgdisk --load-backup=/tmp/sgdisk-backup.gpt "$DEST_DISK" 
rm /tmp/sgdisk-backup.gpt

I created new zfs pools on the target disk (with different name from the source pools using today's date in the name of the pool)

I created filesystem datasets for the destination root and boot filesystems:

zfs create -o canmount=off -o mountpoint=none "rpool_$DATE/ROOT"
zfs create -o canmount=off -o mountpoint=none "bpool_$DATE/BOOT"
zfs create -o canmount=off -o mountpoint=/ -o com.ubuntu.zsys:bootfs=yes -o com.ubuntu.zsys:last-used=$(date +%s) "rpool_$DATE/ROOT/uuid"
zfs create -o canmount=off -o mountpoint=/boot "bpool_$DATE/BOOT/uuid"

I use zfs send/recv to copy the source filesystems to the destination ones:

source_datasets=$(zfs mount | awk '{print $1}' | sort -u)
echo "Cloning ZFS datasets from source to destination..."
for dataset in $source_datasets; do
  SOURCE_DATASET=$dataset
  # -E turns on extended regex, so ( ), {n} and ? act as grouping and quantifiers
  DEST_DATASET=$(echo "$dataset" | sed -E "s/([rb]pool)([0-9]{4}[A-Za-z]{3}[0-9]{2}[0-9]{4})?/\1_${DATE}/g")
  zfs snapshot -r "${SOURCE_DATASET}@backup_$DATE"
  zfs send -Rv "${SOURCE_DATASET}@backup_$DATE" | zfs receive -u -F "$DEST_DATASET"
done

I then mount the destination filesystems at /mnt and /mnt/boot

I remove everything from /mnt/etc/fstab

I create the swap space and the efi partition on the destination disk and add those entries in /etc/fstab

I copy everything from my /boot/efi partition to /mnt/boot/efi

echo "Copying everything from /boot/efi/ to $MOUNTPOINT/boot/efi/..." 
rsync -aAXHv /boot/efi/ $MOUNTPOINT/boot/efi/

I install grub on the destination disk:

echo "Installing the boot loader (grub-install)..." 
grub-install --boot-directory=$MOUNTPOINT/boot $DEST_DISK

Sounds like this would work yes?

Sadly no: I am stuck at the point where grub.cfg does not correctly point to my root filesystem because it has a different name (rpool instead of rpool_$DATE). I can change this manually or script it and I think it will work but here is my question:

-- Is there an easier way?

Please help. I think I may be overthinking this. I want to make sure I can do this live, while the system is online. So far I think the method above would work minus the last step.

Does zpool/zfs offer a mirroring solution that I could un-mirror and have 2 useable disks that are clones of each other?

6 Comments
2024/11/29
18:20 UTC

0

Have I setup my RaidZ1 pool correctly?

Hello,

I've set up a ZFS pool, but I'm not 100% sure if I set it up correctly.
I'm using 2 16TB drives and 1 14TB drive.
Was expecting to have between 24TB and 28TB available since it would be 3 x 14TB in the raid and I'd lose one 14TB space for redundancy, but it ended up being 38.2TB which is way more than expected.

Does this mean I have not set up the RaidZ1 pool correctly which would mean no redundancy? Or is there something I'm missing?
Hope someone can explain.

Thanks in advance!

zpool status command result

zpool list command result

lsblk command result
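
One way to sanity-check the 38.2TB figure, sketched in Python under two assumptions: that the number comes from `zpool list`, which for raidz reports raw size including parity and uses binary units, and that every disk is truncated to the smallest member (14 TB, decimal). The function name is made up for illustration.

```python
TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, shown as "T" by the zfs tools


def raidz_sizes_tib(disk_sizes_tb, parity: int = 1):
    """Return (raw TiB incl. parity, approx. usable TiB) for one raidz vdev."""
    smallest = min(disk_sizes_tb) * TB       # each member counts as the smallest disk
    raw = smallest * len(disk_sizes_tb)      # raw size, parity included
    usable = smallest * (len(disk_sizes_tb) - parity)
    return raw / TIB, usable / TIB


raw_tib, usable_tib = raidz_sizes_tib([16, 16, 14], parity=1)
print(f"raw: {raw_tib:.1f} TiB, usable: about {usable_tib:.1f} TiB")
```

If those assumptions hold, 38.2 is simply 3 x 14 TB expressed in TiB with parity counted in, and the usable space (what `zfs list` would show, minus some overhead) is closer to the expected two disks' worth.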

15 Comments
2024/11/29
11:41 UTC

2

Suggestions for M.2 to SATA adapter and HBA card

I am looking to expand my pool but I've run out of SATA ports on my board. I have an M.2 slot and a PCIe x16 slot available.

I would prefer to get the M.2 adapter since I am considering the idea of adding a GPU in the future (not decided yet).

However, I've seen a lot of contradictory opinions regarding these types of adapters. Some people say they produce a lot of errors, others that they work without a problem.

I would like to know your opinion and also get a recommendation for both M.2 adapter and hba card.

Thanks in advance.

8 Comments
2024/11/29
01:32 UTC

0

Anyone tested stride/stripe-width when creating EXT4 in VM-guest to be used with ZFS on VM-host?

It's pretty much common knowledge that you don't select ZFS if you want performance; the reason to use ZFS is mainly its features.

But having said that, I'm looking through various optimization tips to make life easier for my VM-host (Proxmox), which will be using ZFS through zvols to store the virtual drives of VM-guests.

Except for the usual suspects of:

  • Adjust ARC.
  • Set compression=lz4 (or off for NVMe).
  • Set atime=off.
  • Set xattr=sa.
  • Consider sync=disabled along with txg_timeout=5 (or 1 for NVMe).
  • Adjust async/sync/scrub min/max.
  • Decompress data in ARC.
  • Use linear buffers for ARC Buffer Data (ABD) scatter/gather feature.
  • Rethink if you want to use default volblocksize of 16k or 32k.
  • Reformat NVMe's to use 4k instead of 512b blocks.
  • etc...

Some of these do have an effect; for others it's more debatable whether they have any effect or just increase the risk to data integrity.

For example, the volblocksize seems to have an effect on both lowering write amplification and increasing IOPS performance of ZFS for databases.

That is, selecting 16k rather than 32k or even 64k (mainly Linux/BSD VM-guests in my case).

So I have now ended up at --stride and --stripe-width when creating EXT4, which in theory might help better utilize the available storage.

Has anyone in here tested this or seen benchmarks/performance results regarding this?

That is, does this have any measurable effect when used in a VM-guest running Linux where the VM-host runs ZFS zvols?

A summary of this EXT2/3/4-feature:

https://thelastmaimou.wordpress.com/2013/05/04/magic-soup-ext4-with-ssd-stripes-and-strides/

1 Comment
2024/11/28
21:26 UTC

2

Correct way to install ZFS in Debian

I'd like to use ZFS on a Debian 12 Bookworm netinstall (very barebones) that is my home network drive. It's for a single SSD that holds all our important stuff (it's backed up to the cloud). I have been using ext4 and have not encountered any corrupted files yet, but reading about this makes me anxious and I want something with checksumming.

I've been using Linux for years, but am nowhere near an expert and know enough to get by most of the time. I cannot get this to work. I tried following the guide on https://www.cyberciti.biz/faq/installing-zfs-on-debian-12-bookworm-linux-apt-get/ since it's for this specific Debian version, but I get install errors related to not being able to create the module and dependency conflicts. I first tried the instructions at https://wiki.debian.org/ZFS but got similar issues. I tried purging the packages and installing again, but similar errors appear. I also tried apt-get upgrade then rebooting, but no improvement. Sorry I'm not being too specific here, but I've tried multiple things and now I'm at a point where I just want to know if either of these are the best way to do this. One thing I'm not sure about is the Backport. As I understand, they are not stable releases (I think?) and I'd prefer a stable release even if it isn't the newest.

What is the correct way to install this? Each webpage referenced above gives a little different process.

24 Comments
2024/11/28
19:16 UTC

2

HELP: Encrypted dataset recovery

Many moons ago, I set myself up with LUKS-encrypted ZFS on Ubuntu. A couple of weeks ago, my laptop crashed due to a partial SSD failure, with a couple of megabytes from rpool which could not be read. When trying to boot, I'd land in initramfs, which showed an error that rpool could not be imported because no device was found.

I can import rpool from the copy in read only mode, and can see the datasets, albeit encrypted.
The key location for rpool is somewhere in `file:///run/keystore/rpool/system.key`. Knowing that I did not set up my system with ZFS disk encryption directly, is there a way of generating this file? I have the passphrase I would be prompted for when booting.
Or is the data lost forever? I do have some backups, but they do not include a couple of weeks of very useful work :/ Any help would be greatly appreciated!

5 Comments
2024/11/28
14:15 UTC

3

Can I use different size drives in RaidZ1?

I would like to set up a RaidZ1 pool, but I have 2 16TB drives and 1 14TB drive. Is this possible? I'd understand if I'd lose 2TB on each of the 16TB drives; I can live with that.

Couldn't really find a similar situation on the internet. Sorry if this is an obvious thing.

Thanks in advance!

10 Comments
2024/11/28
00:49 UTC

3

Should I periodically trim my zpool, or does autotrim suffice?

I have autotrim enabled on my zpool. Should I also set up a monthly cron job to trim my zpool? I have heard mixed info about this. I read the zpoolprops page, and I see no indication that I need to run a manual trim in addition to the autotrim.

Just wondering what the best practice is, thanks.

5 Comments
2024/11/27
23:17 UTC

4

Can't import zpool: FAULTED, corrupted data

I recently tried to remove a drive from my pool. It went fine, but after rebooting the pool disappeared. Then I ran zpool import. Is there any way to import mirror 1, replace the faulted drive, or otherwise recover the data?

root@pve:~# zpool import -a
cannot import 'dpool': one or more devices is currently unavailable
root@pve:~# zpool import -o readonly=yes
   pool: dpool
     id: 10389576891323462784
  state: FAULTED
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

dpool                                           FAULTED  corrupted data
 ata-WDC_WD40EZAX-00C8UB0_WD-WXH2D232Y65Z      FAULTED  corrupted data
 mirror-1                                      ONLINE
   ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D7EKF8  ONLINE
   ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D68C80  ONLINE

root@pve:~# zdb -e dpool

Configuration for import:
        vdev_children: 2
        version: 5000
        pool_guid: 10389576891323462784
        name: 'dpool'
        state: 0
        hostid: 952000300
        hostname: 'pve'
        vdev_tree:
            type: 'root'
            id: 0
            guid: 10389576891323462784
            children[0]:
                type: 'missing'
                id: 0
                guid: 0
            children[1]:
                type: 'mirror'
                id: 1
                guid: 2367893751909554525
                metaslab_array: 88
                metaslab_shift: 34
                ashift: 12
                asize: 2000384688128
                is_log: 0
                create_txg: 56488
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 14329578362940330027
                    whole_disk: 1
                    DTL: 41437
                    create_txg: 56488
                    path: '/dev/disk/by-id/ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D7EKF8-part1'
                    devid: 'ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D7EKF8-part1'
                    phys_path: 'pci-0000:00:11.4-ata-3.0'
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 6802284438884037621
                    whole_disk: 1
                    DTL: 41436
                    create_txg: 56488
                    path: '/dev/disk/by-id/ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D68C80-part1'
                    devid: 'ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D68C80-part1'
                    phys_path: 'pci-0000:00:1f.2-ata-1.0'
        load-policy:
            load-request-txg: 18446744073709551615
            load-rewind-policy: 2
zdb: can't open 'dpool': No such device or address

ZFS_DBGMSG(zdb) START:
spa.c:6521:spa_import(): spa_import: importing dpool
spa_misc.c:418:spa_load_note(): spa_load(dpool, config trusted): LOADING
vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D7EKF8-part1': best uberblock found for spa dpool. txg 1287246
spa_misc.c:418:spa_load_note(): spa_load(dpool, config untrusted): using uberblock with txg=1287246
vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/ata-WDC_WD40EZAX-00C8UB0_WD-WXH2D232Y65Z-part1': vdev_validate: vdev label pool_guid doesn't match config (7539688533288770386 != 10389576891323462784)
spa_misc.c:404:spa_load_failed(): spa_load(dpool, config trusted): FAILED: cannot open vdev tree after invalidating some vdevs
vdev.c:213:vdev_dbgmsg_print_tree():   vdev 0: root, guid: 10389576891323462784, path: N/A, can't open
vdev.c:213:vdev_dbgmsg_print_tree():     vdev 0: disk, guid: 2781254482063008702, path: /dev/disk/by-id/ata-WDC_WD40EZAX-00C8UB0_WD-WXH2D232Y65Z-part1, can't open
vdev.c:213:vdev_dbgmsg_print_tree():     vdev 1: mirror, guid: 2367893751909554525, path: N/A, healthy
vdev.c:213:vdev_dbgmsg_print_tree():       vdev 0: disk, guid: 14329578362940330027, path: /dev/disk/by-id/ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D7EKF8-part1, healthy
vdev.c:213:vdev_dbgmsg_print_tree():       vdev 1: disk, guid: 6802284438884037621, path: /dev/disk/by-id/ata-WDC_WD2003FZEX-00SRLA0_WD-WMC6N0D68C80-part1, healthy
spa_misc.c:418:spa_load_note(): spa_load(dpool, config trusted): UNLOADING
ZFS_DBGMSG(zdb) END
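
The line that stands out in the debug output is the vdev_validate message: the label on the WD40EZAX partition carries a different pool_guid than the one the pool config expects. Below is a minimal sketch for pulling those two values out of a saved log. It assumes the exact message format shown above (other zdb versions may word it differently), and the name `find_guid_mismatch` is made up for illustration.

```python
import re

# Debug line copied from the zdb output above.
LINE = ("vdev.c:161:vdev_dbgmsg(): disk vdev "
        "'/dev/disk/by-id/ata-WDC_WD40EZAX-00C8UB0_WD-WXH2D232Y65Z-part1': "
        "vdev_validate: vdev label pool_guid doesn't match config "
        "(7539688533288770386 != 10389576891323462784)")

# Pattern written for this one message format only (an assumption).
MISMATCH = re.compile(
    r"disk vdev '(?P<path>[^']+)': vdev_validate: vdev label pool_guid "
    r"doesn't match config \((?P<label>\d+) != (?P<config>\d+)\)"
)


def find_guid_mismatch(line):
    """Return (device path, label pool_guid, config pool_guid) or None."""
    match = MISMATCH.search(line)
    if match is None:
        return None
    return match.group("path"), int(match.group("label")), int(match.group("config"))


if __name__ == "__main__":
    path, label_guid, config_guid = find_guid_mismatch(LINE)
    print(f"{path}: label says {label_guid}, config expects {config_guid}")
```

The config-side value is the same pool_guid that zdb -e prints for dpool (10389576891323462784), so only the label on that one disk disagrees.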
6 Comments
2024/11/27
19:29 UTC

3

How do I delete corrupted files?

I am on TrueNAS and recently my boot SSD died. I reinstalled and imported my pool, but if I run
zpool status -v I get this message:

https://preview.redd.it/tyyiumd3kh3e1.png?width=787&format=png&auto=webp&s=3b25fdedf581cd37c9c9af020eb908b81dc2ee7b

How do I deal with the 2 errors on my Main_ZPool? I tried deleting those files, but Main_ZPool/.system isn't mounted anywhere, and if I run zfs list it says that it has a legacy mountpoint:

https://preview.redd.it/9ufsu8xekh3e1.png?width=738&format=png&auto=webp&s=93f2f015fd19a73d526b246c59e711d7e883e0e4

What can I do here? I tried mounting whatever .system is to delete the two files like this:
https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaln/index.html
but haven't been able to do it.
Any help is much appreciated :D

1 Comment
2024/11/27
18:19 UTC

0

Good/bad idea to expand a striped mirror vdev pool with a new mirror of bigger size?

I've got a striped pool of mirror vdevs composed of six 18TB WD Red Pros (so three mirrored pairs). I want to add a fourth pair, and WD's Black Friday sale has the 20TB Red Pro on sale for way less than the 18TB ($320 vs. $375). I'd rather stick to every mirror being the same size, but that price difference is hard to swallow.

What are the practical implications if I throw a mirrored 20TB pair in with the 18s? If I understand correctly, slightly larger stripes would be written to the larger vdev? For reference, I plan to zfs send the contents of the pool elsewhere and recreate it from scratch, so assume that all data will be written fresh to the theoretical new pool with two different vdev sizes.
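
One rough way to picture the mixed-size case is sketched below. It is a deliberately simplified model that assumes new writes are spread across top-level vdevs in proportion to each vdev's free space; the real allocator is more involved, so treat the numbers as an illustration and not a prediction. The function name `write_shares` is made up for this sketch.

```python
def write_shares(free_tb_per_vdev):
    """Fraction of new writes per vdev, assuming proportional-to-free-space."""
    total = sum(free_tb_per_vdev)
    return [free / total for free in free_tb_per_vdev]


# Freshly created pool from the post: three 18 TB mirrors plus one 20 TB mirror.
vdevs = [18, 18, 18, 20]
shares = write_shares(vdevs)

for size, share in zip(vdevs, shares):
    print(f"{size} TB mirror: {share:.1%} of new writes")
```

Under that assumption each 18TB mirror takes about 24% of the writes and the 20TB mirror about 27%, so the imbalance on an empty pool is small.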

4 Comments
2024/11/27
16:38 UTC

2

Anyone experienced "missing label" on NVMe?

Hi!
I have a 2x2 mirror pool with NVMe on Ubuntu 24.04. I suddenly had an issue where one member of each vdev was missing ("missing label"). I could see them with lsblk, but they were not available in the pool.

After just rebooting the server, they were back up and now resilvering.

I'm pretty sure there's nothing wrong with the hardware, so I'm trying to understand what could've happened here. Thoughts?

9 Comments
2024/11/27
12:07 UTC

2

expansion from mirror

Looking for recommendations for the best setup to expand from.

I'm currently running two 16TB drives in a mirror and I'm about at 80% capacity now. For backups, I have 6x 14TB drives in raidz2 that yield about 56TB of usable space.

Option 1: Continue adding mirrors. There are a few BF deals to shuck 20TB drives and I would most likely add just one mirror for now and add more as needed.

Option 2: I can also keep the mirror and create a 4 drive raidz1 array of either 14 or 12TB recertified drives.

Option 3 (Most Expensive): Buy 4x 16TB recertified drives and convert the current mirror to a 6 drive raidz2 array for 64TB of usable space. Not even sure how complicated it would be to convert the current mirror. This is a larger volume than my backups, but I don't plan on filling it up anytime soon so that doesn't concern me much. This gains two-drive parity.

Or other possible options?
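
The capacity figures in the options above can be checked with a quick sketch. It computes raw usable space only, ignoring metadata overhead, padding and the TB vs. TiB difference, so real pools will show somewhat less. The helper names are made up for illustration.

```python
def raidz_usable_tb(drives, drive_tb, parity):
    """Raw usable space of one raidz vdev: data drives times drive size."""
    if drives <= parity:
        raise ValueError("a raidz vdev needs more drives than parity")
    return (drives - parity) * drive_tb


def mirror_usable_tb(member_tb):
    """A mirror holds as much as its smallest member."""
    return min(member_tb)


print("current mirror, 2x 16TB:  ", mirror_usable_tb([16, 16]), "TB")
print("backup raidz2, 6x 14TB:   ", raidz_usable_tb(6, 14, 2), "TB")
print("option 2 raidz1, 4x 14TB: ", raidz_usable_tb(4, 14, 1), "TB")
print("option 2 raidz1, 4x 12TB: ", raidz_usable_tb(4, 12, 1), "TB")
print("option 3 raidz2, 6x 16TB: ", raidz_usable_tb(6, 16, 2), "TB")
```

This matches the 56TB and 64TB figures quoted in the post for the two raidz2 layouts.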

3 Comments
2024/11/27
00:31 UTC

0

Help sizing first server/NAS

Hi everyone, I'm in the middle of a predicament here. I've got a Dell 7710 lying around that I would like to set up as my first server/home lab. I already have Proxmox with a couple of VMs and am now going to add Plex and piHole, and then I want it to also be a sort of high-speed NAS.

I have two dedicated NVMe slots, and managed to confirm just today that the WWAN slot also works with an NVMe drive. I also have one 2.5" SATA 3 slot.

Because I'm limited to 2TB on the WWAN slot (2230/2242 format limit), I feel like it would be a waste of money buying 2x 4TB NVMe if I would be limited to the size of the smaller 2TB disk? I was planning on running the 2.5" SATA drive as a boot disk, by the way, as I already have a 500GB SSD there anyway.

That said, and keep in mind that I'm a total noob here, could I do a mirror of 4TB + 2TB in one pool? Can you mix mirrored and non-mirrored drives in a pool? Or am I better off saving some money and just getting 3x 2TB for 4TB of usable raidz?

I also have the option of adding some USB 3.0 external drives for weekly backups and "cold storage", I guess?

I plan on mainly doing 4K video editing from it; that's the major KPI. I've already got 10GbE Thunderbolt 3 Ethernet adapters sorted.

Thanks

6 Comments
2024/11/26
21:47 UTC

0

Need help with specific question

I have a Synology NAS running BTRFS that had a problem with its power supply adapter: not all 4 hard drives could spin up (they clicked). Messages in /var/log showed one of the 4 drives being unplugged every 30-60 mins. I got a new power adapter and the issue no longer happens. I have a UPS, but the power adapter sits between the UPS and the NAS, so the UPS is irrelevant here.

Because of the issue, the file system got corrupted and I was not able to repair it; it goes into read-only mode. I was getting I/O errors when trying to access and copy some folders via the GUI, but I recovered all the data by copying it to USB via SSH (except for a couple of unreadable files, which is OK; in the GUI I wasn’t able to copy anything from some folders).

My question is whether ZFS offers better recovery than BTRFS (for example, can it take copies of the file system that I can go back and restore from?), or whether it can also crash and become unrecoverable in a similar event. I am not concerned about speed or any other feature differences between the two file systems, only about the ability to recover.

This is the second time I have had this issue with my NAS, and I am looking to get a QNAP so I can use ZFS. I don’t expose my NAS to the internet (I log in through VPS on my security gateway, so ransomware etc. is not a concern for me); I'm just trying to find out whether ZFS would do better in this power issue scenario.

2 Comments
2024/11/26
18:43 UTC

1

Looking for a genius to fix: corrupted metadata / mixed up HDD IDs?!

Hey everyone,
cross posting this here from a thread I started over on the openzfsonosx forum - hope that's ok!

I already spent a couple of hours on research, testing and trying things, but that didn't get me anywhere.
I have the following problem:

- Had a ZFS RAIDZ1 pool running on my Mac Pro 2012 running 11.7.3, consisting of 4x 4TB HDDs
- moved the drives to another machine (TrueNAS Scale VM with dedicated HBA), but didn't export the pool before doing that
- couldn't import my pool on the TrueNAS VM, so moved the drives back to my Mac Pro
- now zpool import won't let me import the pool

Depending on which parameters I use for the import, I get different clues about the errors:

Simple zpool import (-f and -F give the same output as well):

sudo zpool import                                                 
   pool: tank
     id: 7522410235045551686
  state: UNAVAIL
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
devices and try again.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

tank                                            UNAVAIL  insufficient replicas
  raidz1-0                                      UNAVAIL  insufficient replicas
    disk4                                       ONLINE
    media-5A484847-B333-3E44-A0B3-632CF3EC20A6  UNAVAIL  cannot open
    media-9CEF4C13-418D-3F41-804B-02355E699FED  ONLINE
    media-7F264D47-8A0E-3242-A971-1D0BD7D755F4  UNAVAIL  cannot open

When specifying a device:

sudo zpool import -d /dev/disk4s1
   pool: tank
     id: 7522410235045551686
  state: FAULTED
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

tank                                            FAULTED  corrupted data
  raidz1-0                                      DEGRADED
    media-026CF59D-BEBE-F043-B0A3-95F3FC1D4EDF  ONLINE
    disk4                                       ONLINE
    media-9CEF4C13-418D-3F41-804B-02355E699FED  ONLINE
    disk6                                       FAULTED  corrupted data

Specifying disk6s1 even returns all drives as ONLINE:

sudo zpool import -d /dev/disk6s1 
   pool: tank
     id: 7522410235045551686
  state: FAULTED
status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

tank                                            FAULTED  corrupted data
  raidz1-0                                      ONLINE
    media-026CF59D-BEBE-F043-B0A3-95F3FC1D4EDF  ONLINE
    media-17A0A5DF-B586-114C-8606-E1FB316FA23D  ONLINE
    media-9CEF4C13-418D-3F41-804B-02355E699FED  ONLINE
    disk6                                       ONLINE

What I've tried so far:

- looked at zdb -l for all the relevant partitions
- discovered that not all symlinks have been created, for example media-5A484847-B333-3E44-A0B3-632CF3EC20A6 is missing in /private/var/run/disk/by-id and /var/run/disk/by-id. Creating these manually didn't help.

I was thinking about somehow modifying the metadata that is shown with zdb -l, as it's different for each drive (especially the part that references the other drives), but not sure if that is even possible. What led me to think about that was when specifying disk6s1, all drives show as online and also have different IDs than in .
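
To make the differences between the three listings easier to see, here is a small sketch that lines them up row by row. The names are copied from the outputs above; the assumption (not confirmed anywhere in this thread) is that each listing prints the raidz1-0 children in the same slot order, so row N in one view corresponds to row N in the others. `stable_slots` is a made-up helper name.

```python
# Child names of raidz1-0 as printed by each import attempt above.
VIEWS = {
    "zpool import": [
        "disk4",
        "media-5A484847-B333-3E44-A0B3-632CF3EC20A6",
        "media-9CEF4C13-418D-3F41-804B-02355E699FED",
        "media-7F264D47-8A0E-3242-A971-1D0BD7D755F4",
    ],
    "-d /dev/disk4s1": [
        "media-026CF59D-BEBE-F043-B0A3-95F3FC1D4EDF",
        "disk4",
        "media-9CEF4C13-418D-3F41-804B-02355E699FED",
        "disk6",
    ],
    "-d /dev/disk6s1": [
        "media-026CF59D-BEBE-F043-B0A3-95F3FC1D4EDF",
        "media-17A0A5DF-B586-114C-8606-E1FB316FA23D",
        "media-9CEF4C13-418D-3F41-804B-02355E699FED",
        "disk6",
    ],
}


def stable_slots(views):
    """Return the row indexes whose name is identical in every view."""
    rows = zip(*views.values())
    return [i for i, names in enumerate(rows) if len(set(names)) == 1]


if __name__ == "__main__":
    for i, names in enumerate(zip(*VIEWS.values())):
        flag = "same" if len(set(names)) == 1 else "DIFFERS"
        print(f"slot {i} [{flag}]")
        for view, name in zip(VIEWS, names):
            print(f"    {view:<16} {name}")
    print("stable slots:", stable_slots(VIEWS))
```

Only the media-9CEF4C13 row is reported identically by all three attempts; every other row is named differently depending on which device the import was pointed at.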

Does anyone have ideas about how to solve this? Help is greatly appreciated!

4 Comments
2024/11/26
13:27 UTC
