{"id":9,"date":"2015-07-18T22:03:22","date_gmt":"2015-07-19T02:03:22","guid":{"rendered":"http:\/\/bhoey.com\/blog\/?p=9"},"modified":"2020-06-25T17:02:32","modified_gmt":"2020-06-25T21:02:32","slug":"3-way-disk-mirrors-with-zfsonlinux","status":"publish","type":"post","link":"https:\/\/bhoey.com\/blog\/3-way-disk-mirrors-with-zfsonlinux\/","title":{"rendered":"3-way Disk Mirrors With ZFSOnLinux"},"content":{"rendered":"<h3>Background<\/h3>\n<p>ZFS is a member of the newer generation of filesystems that include advanced features beyond simple file storage. Its capabilities are extensive, covering a wide range of pain points encountered with previous filesystems. <a href=\"https:\/\/en.wikipedia.org\/wiki\/ZFS#Features\">The Wikipedia page<\/a> details them all nicely, but for the purposes of this post we will be focusing on its ability to create N-way sets of disk mirrors.<\/p>\n<p>Traditionally, mirrored disk sets in Linux and other operating systems have been limited to two devices (note: devices in this context could be disks, partitions or even other raid groups, as is the case in raid 10 setups).&nbsp;
While mirroring has the benefit over other raid levels in that each mirrored device contains a complete copy of the data, the two-device limit became inadequate as disk sizes ballooned. In the age of multi-TB drives, simply rebuilding a degraded mirrored array <a href=\"https:\/\/en.wikipedia.org\/wiki\/RAID#Increasing_rebuild_time_and_failure_probability\">could actually cause the surviving device to fail,<\/a> eliminating the very redundancy one was expecting.<\/p>\n<p>ZFS addresses this particular problem in several ways through <a href=\"https:\/\/blogs.oracle.com\/bonwick\/entry\/zfs_end_to_end_data\" target=\"_blank\" rel=\"noopener noreferrer\">data checksums<\/a>, <a href=\"https:\/\/blogs.oracle.com\/timc\/entry\/demonstrating_zfs_self_healing\" target=\"_blank\" rel=\"noopener noreferrer\">self-healing<\/a> and <a href=\"http:\/\/docs.oracle.com\/cd\/E23823_01\/html\/819-5461\/gbbvf.html#gbcus\" target=\"_blank\" rel=\"noopener noreferrer\">smart resilvering<\/a>, instead of blindly rebuilding full array members even if only 1% of the disk space is in use.<\/p>\n<p>And to top it off, ZFS also includes the ability to specify <em>N<\/em> number of devices in a mirrored set. In this post we will create a sample 3-way mirrored set using loopback devices and run a series of test scenarios against it.<\/p>\n<p style=\"margin-left: 60px; margin-right: 60px; padding-left: 10px; padding-right: 10px; text-align: justify; border: 2px solid black; background-color: #f7f8fa;\">For those unfamiliar, a loopback device allows you to expose a file as a block device. Using loopback devices we can create file-based \"disks\" that we can use as mirror array members in our test.<\/p>\n<h3>Testbed Setup<\/h3>\n<p>For this exercise I am using a fresh Debian Jessie (8.1) x86_64 vanilla system installed into a KVM\/QEMU virtual machine.&nbsp;
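Since ZFS should only be run on 64-bit hosts, it is worth confirming the kernel and architecture before installing anything. A minimal sketch (the output shown is what a Jessie amd64 testbed like the one above would report; yours may differ):

```shell
# Confirm a 64-bit kernel before installing ZFS.
# On the Jessie testbed described above this prints something like
# "3.16.0-4-amd64 x86_64"; a 32-bit host would report i686 or similar.
uname -rm
```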
The kernel currently shipped with Jessie is 3.16.0-4-amd64 and the ZFSOnLinux package currently available for Debian is 0.6.4-1.2-1.<\/p>\n<p>It should be especially noted that ZFS <a href=\"http:\/\/zfsonlinux.org\/debian.html\" target=\"_blank\" rel=\"noopener noreferrer\">should only be used on 64-bit hosts<\/a>.<\/p>\n<h3>Installation<\/h3>\n<p>Following the <a href=\"http:\/\/zfsonlinux.org\/debian.html\" target=\"_blank\" rel=\"noopener noreferrer\">Debian instructions on the ZFSOnLinux website<\/a>, the following commands were run:<\/p>\n<pre class=\"brush: plain; notranslate\">$ su -\n# apt-get install lsb-release\n# wget http:\/\/archive.zfsonlinux.org\/debian\/pool\/main\/z\/zfsonlinux\/zfsonlinux_6_all.deb\n# dpkg -i zfsonlinux_6_all.deb\n# apt-get update\n# apt-get install debian-zfs\n<\/pre>\n<p>This will add \/etc\/apt\/sources.list.d\/zfsonlinux.list, install the software and dependencies, then proceed to build the ZFS\/SPL kernel modules.<\/p>\n<h3>Preparing the loopback devices<\/h3>\n<h4>Finding the first available loopback device<\/h4>\n<pre class=\"brush: plain; notranslate\"># losetup -a<\/pre>\n<p>If you see anything listed, change 1 2 3 in the commands below to start at the next available number and increment accordingly.<\/p>\n<h4>Creating the files<\/h4>\n<pre class=\"brush: plain; notranslate\"># for i in 1 2 3; do dd if=\/dev\/zero of=\/tmp\/zfsdisk_$i bs=1M count=250; done\n250+0 records in\n250+0 records out\n262144000 bytes (262 MB) copied, 0.371318 s, 706 MB\/s\n250+0 records in\n250+0 records out\n262144000 bytes (262 MB) copied, 0.614396 s, 427 MB\/s\n250+0 records in\n250+0 records out\n262144000 bytes (262 MB) copied, 0.824889 s, 318 MB\/s\n<\/pre>\n<h4>Set up the loopback mappings<\/h4>\n<pre class=\"brush: plain; notranslate\"># for i in 1 2 3; do losetup \/dev\/loop$i \/tmp\/zfsdisk_$i; done<\/pre>\n<h4>Verify the mappings<\/h4>\n<pre class=\"brush: plain; notranslate\"># losetup -a\n\/dev\/loop1: [65025]:399320&nbsp;
(\/tmp\/zfsdisk_1)\n\/dev\/loop2: [65025]:399323 (\/tmp\/zfsdisk_2)\n\/dev\/loop3: [65025]:399324 (\/tmp\/zfsdisk_3)\n<\/pre>\n<h3>Create the ZFS 3-Way Mirror<\/h3>\n<pre class=\"brush: plain; notranslate\"># zpool \\\n    create \\\n    -o ashift=12 \\\n    -m \/mnt\/zfs\/mymirror \\\n    mymirror \\\n    mirror \\\n    \/dev\/loop1 \\\n    \/dev\/loop2 \\\n    \/dev\/loop3\n<\/pre>\n<p>A couple of things to note:<\/p>\n<ol>\n<li>-o ashift=12<br \/>\nThis tells ZFS to align along 4KB sectors. It is generally a good idea to always set this option, since modern disks use 4KB sectors and once a pool has been created with a given sector size it cannot be changed later. The net result is that if you created a pool with 512-byte sectors using, say, 1TB drives, you could not later change the sector size to 4KB when adding 3TB drives (resulting in abysmal performance on the newer drives). So as a rule of thumb, always set -o ashift=12.<\/li>\n<li>-m \/mnt\/zfs\/mymirror<br \/>\nThis indicates where the pool should be mounted.<\/li>\n<li>\/dev\/loopN<br \/>\nThe devices that make up the mirrored set.&nbsp;
If these were physical disks, you would likely want to use the appropriate disk symlinks under \/dev\/disk\/by-id\/.<\/li>\n<\/ol>\n<h4>Verify The ZFS Pool<\/h4>\n<pre class=\"brush: plain; notranslate\"># zpool list\nNAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT\nmymirror   244M   408K   244M         -     0%     0%  1.00x  ONLINE  -<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\n  scan: none requested\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0     0\n\nerrors: No known data errors\n<\/pre>\n<h3>Poking The Bear<\/h3>\n<p>So now that we have our test 3-way mirror running, let's test its resiliency.<\/p>\n<p><strong>!!!&nbsp;
WARNING NOTE: ALTHOUGH ZFS IS BUILT TO RECOVER FROM ERRORS, ONLY RUN THE FOLLOWING COMMANDS IN A TEST ENVIRONMENT, OTHERWISE YOU WILL SUFFER DATA LOSS!!!<\/strong><\/p>\n<h4>Setting The Stage<\/h4>\n<p>Create a random file that takes up ~50% of the disk space:<\/p>\n<pre class=\"brush: plain; notranslate\"># dd if=\/dev\/urandom of=\/mnt\/zfs\/mymirror\/test.dat bs=1M count=125\n125+0 records in\n125+0 records out\n131072000 bytes (131 MB) copied, 16.8152 s, 7.8 MB\/s\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool list\nNAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT\nmymirror   244M   126M   118M         -    20%    51%  1.00x  ONLINE  -\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool scrub mymirror\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\n  scan: scrub repaired 0 in 0h0m with 0 errors on Sun Jul 19 20:20:12 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0     0\n<\/pre>\n<h4>Complete Corruption Of A Single Disk<\/h4>\n<p>Wipe the disk with all ones (to differentiate it from the earlier initialization with&nbsp;
\/dev\/zero to demonstrate how ZFS resilvers)<\/p>\n<pre class=\"brush: plain; notranslate\"># dd if=\/dev\/zero bs=1M count=250 | tr '\\000' '\\001' &gt; \/tmp\/zfsdisk_3\n250+0 records in\n250+0 records out\n262144000 bytes (262 MB) copied, 0.708197 s, 370 MB\/s\n<\/pre>\n<p>This will wipe out the ZFS disk label along with everything else, simulating the state where a disk is alive but corrupt.<\/p>\n<pre class=\"brush: plain; notranslate\"># zpool scrub mymirror\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool list\nNAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT\nmymirror   244M   127M   117M         -    21%    51%  1.00x  ONLINE  -\n\n# zpool status\n  pool: mymirror\n state: ONLINE\nstatus: One or more devices could not be used because the label is missing or\n        invalid.  Sufficient replicas exist for the pool to continue\n        functioning in a degraded state.\naction: Replace the device using 'zpool replace'.\n   see: http:\/\/zfsonlinux.org\/msg\/ZFS-8000-4J\n  scan: scrub repaired 0 in 0h0m with 0 errors on Sun Jul 19 20:39:45 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   UNAVAIL      0&nbsp;
0     0  corrupted data\n\nerrors: No known data errors\n<\/pre>\n<p>Replacing the disk:<\/p>\n<pre class=\"brush: plain; notranslate\"># zpool replace -o ashift=12 mymirror loop3<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\n  scan: resilvered 126M in 0h0m with 0 errors on Sun Jul 19 20:42:51 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0     0\n<\/pre>\n<p>Note that only 126MB needed to be resilvered.&nbsp;
ZFS synchronizes only the blocks that are in use, not empty blocks, and not blocks that already match on the new device (which is why we corrupted the disk with all ones rather than zeros).<\/p>\n<h4>Complete Corruption Of 2 Out Of 3 Disks<\/h4>\n<p>Check the file first:<\/p>\n<pre class=\"brush: plain; notranslate\"># md5sum \/mnt\/zfs\/mymirror\/test.dat\nc253c4c5421d793f4fefe34af5a5ecc1  \/mnt\/zfs\/mymirror\/test.dat<\/pre>\n<p>Corrupt disks 2 and 3:<\/p>\n<pre class=\"brush: plain; notranslate\"># dd if=\/dev\/zero bs=1M count=250 | tr '\\000' '\\001' &gt; \/tmp\/zfsdisk_2\n250+0 records in\n250+0 records out\n262144000 bytes (262 MB) copied, 0.660485 s, 397 MB\/s\n# dd if=\/dev\/zero bs=1M count=250 | tr '\\000' '\\001' &gt; \/tmp\/zfsdisk_3\n250+0 records in\n250+0 records out\n262144000 bytes (262 MB) copied, 0.718505 s, 365 MB\/s\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool scrub mymirror<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\nstatus: One or more devices could not be used because the label is missing or\n        invalid.  Sufficient replicas exist for the pool to continue\n        functioning in a degraded state.\naction: Replace the device using 'zpool replace'.\n   see: http:\/\/zfsonlinux.org\/msg\/ZFS-8000-4J\n  scan: scrub repaired 0 in 0h0m with 0 errors on Sun Jul 19 22:39:05 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n
            loop2   UNAVAIL      0     0     0  corrupted data\n            loop3   UNAVAIL      0     0     0  corrupted data\n\nerrors: No known data errors\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># md5sum \/mnt\/zfs\/mymirror\/test.dat\nc253c4c5421d793f4fefe34af5a5ecc1  \/mnt\/zfs\/mymirror\/test.dat\n<\/pre>\n<p>The file still looks good. Now replace both drives (run this way so we can catch the resilver in progress):<\/p>\n<pre class=\"brush: plain; notranslate\"># zpool replace -o ashift=12 mymirror loop2 &amp; \\\n  zpool replace -o ashift=12 mymirror loop3 &amp; \\\n  sleep 1 &amp;&amp; \\\n    zpool status &amp;\n\nstate: ONLINE\n scan: resilvered 127M in 0h0m with 0 errors on Sun Jul 19 22:45:17 2015\nconfig:\n\nNAME STATE READ WRITE CKSUM\n mymirror ONLINE 0 0 0\n mirror-0 ONLINE 0 0 0\n    loop1 ONLINE 0 0 0\n    replacing-1 UNAVAIL 0 0 0\n      old UNAVAIL 0 0 0 corrupted data\n      loop2 ONLINE 0 0 0\n    replacing-2 UNAVAIL 0 0 0\n      old UNAVAIL 0 0 0 corrupted data\n      loop3 ONLINE 0 0 0\n\nerrors: No known data errors\n<\/pre>\n<p>And finally, both drives are replaced:<\/p>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\n  scan: resilvered 127M in 0h0m with 0 errors on Sun Jul 19 22:45:17 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE      &nbsp;
0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0     0\n\nerrors: No known data errors\n<\/pre>\n<p>And finally check the file:<\/p>\n<pre class=\"brush: plain; notranslate\"># md5sum \/mnt\/zfs\/mymirror\/test.dat\nc253c4c5421d793f4fefe34af5a5ecc1  \/mnt\/zfs\/mymirror\/test.dat\n<\/pre>\n<h4>Corrupting A File<\/h4>\n<p>In this test we'll inject bad data into the file using the zinject testing tool included with ZFS.<\/p>\n<pre class=\"brush: plain; notranslate\"># zinject -t data -f 1 \/mnt\/zfs\/mymirror\/test.dat\nAdded handler 5 with the following properties:\n  pool: mymirror\nobjset: 21\nobject: 24\n  type: 0\n level: 0\n range: all\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool scrub mymirror\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\n  scan: scrub in progress since Sun Jul 19 21:54:23 2015\n    88.4M scanned out of 127M at 3.84M\/s, 0h0m to go\n    2.12M repaired, 69.51% done\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0  (repairing)\n            loop2  &nbsp;
ONLINE       0     0     0  (repairing)\n            loop3   ONLINE       0     0     0  (repairing)\n<\/pre>\n<p>ZFS found the bad data and is in the process of repairing it.<\/p>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\n  scan: scrub repaired 3M in 0h0m with 0 errors on Sun Jul 19 21:54:55 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0     0\n\nerrors: No known data errors\n<\/pre>\n<p>Finished repairing 3M of bad data.<\/p>\n<p>Cleanup: If you are testing this yourself, remember to remove the zinject handler afterward:<\/p>\n<pre class=\"brush: plain; notranslate\"># zinject\n ID  POOL             OBJSET  OBJECT  TYPE       LVL  RANGE\n---  ---------------  ------  ------  --------   ---  ---------------\n  5 &nbsp;
mymirror&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 21&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 24&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp; all\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zinject -c 5\nremoved handler 5\n<\/pre>\n<h4>Partial Drive Corruption<\/h4>\n<p>Inject random bytes into one of the files backing a loopback device (mirrored array member) with dd<\/p>\n<pre class=\"brush: plain; notranslate\"># dd if=\/dev\/urandom of=\/tmp\/zfsdisk_3 bs=1K count=10 seek=200000\n10+0 records in\n10+0 records out\n10240 bytes (10 kB) copied, 0.00324266 s, 3.2 MB\/s\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool scrub mymirror\n<\/pre>\n<pre class=\"brush: plain; notranslate\"># zpool status\n&nbsp; pool: mymirror\n&nbsp;state: ONLINE\nstatus: One or more devices has experienced an unrecoverable error.&nbsp; An\n&nbsp;&nbsp; &nbsp;attempt was made to correct the error.&nbsp; Applications are unaffected.\naction: Determine if the device needs to be replaced, and clear the errors\n&nbsp;&nbsp; &nbsp;using 'zpool clear' or replace the device with 'zpool replace'.\n&nbsp;&nbsp; see: http:\/\/zfsonlinux.org\/msg\/ZFS-8000-9P\n&nbsp; scan: scrub in progress since Sun Jul 19 22:08:26 2015\n&nbsp;&nbsp;&nbsp; 127M scanned out of 127M at 31.8M\/s, 0h0m to go\n&nbsp;&nbsp;&nbsp; 24.8M repaired, 99.91% done\nconfig:\n\n&nbsp;&nbsp; &nbsp;NAME&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; STATE&nbsp;&nbsp;&nbsp;&nbsp; READ WRITE CKSUM\n&nbsp;&nbsp; &nbsp;mymirror&nbsp;&nbsp;&nbsp; ONLINE&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp; 0\n&nbsp;&nbsp; &nbsp;&nbsp; mirror-0&nbsp; ONLINE&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp; 0\n&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; loop1&nbsp;&nbsp; ONLINE&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp; 0\n&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; 
loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0   260  (repairing)\n\nerrors: No known data errors<\/pre>\n<p>ZFS found the corruption and is fixing it.<\/p>\n<pre class=\"brush: plain; notranslate\"># zpool status\n  pool: mymirror\n state: ONLINE\nstatus: One or more devices has experienced an unrecoverable error.  An\n        attempt was made to correct the error.  Applications are unaffected.\naction: Determine if the device needs to be replaced, and clear the errors\n        using 'zpool clear' or replace the device with 'zpool replace'.\n   see: http:\/\/zfsonlinux.org\/msg\/ZFS-8000-9P\n  scan: scrub repaired 24.8M in 0h0m with 0 errors on Sun Jul 19 22:08:30 2015\nconfig:\n\n        NAME        STATE     READ WRITE CKSUM\n        mymirror    ONLINE       0     0     0\n          mirror-0  ONLINE       0     0     0\n            loop1   ONLINE       0     0     0\n            loop2   ONLINE       0     0     0\n            loop3   ONLINE       0     0   260\n\nerrors: No known data errors<\/pre>\n<p>24.8M of drive corruption repaired.<\/p>\n<h3>Conclusion<\/h3>\n<p>Setting up 3-way disk mirrors using ZFS provides robust error detection and recovery from a wide&nbsp;
variety of damage scenarios. Its ability to target healing to only the affected data allows it to resilver efficiently and recover faster than traditional array configurations.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Traditionally mirrored disk sets in Linux and other operating systems have been limited to two devices. While mirroring has the benefit over other raid levels in that each mirrored device contains a complete copy of the data, the two device limit became inadequate as disk sizes ballooned. In the age of multi-TB drives, simply rebuilding a degraded mirrored array could actually cause the surviving device to fail, eliminating the very redundancy one was expecting.<\/p>\n<p>ZFS addresses this particular problem in several ways through data checksums, self-healing and smart resilvering instead of blindly rebuilding full array members even if only 1% of disk space is being used.&nbsp;<a href=\"https:\/\/bhoey.com\/blog\/3-way-disk-mirrors-with-zfsonlinux\/\">[Continue&nbsp;reading...]&nbsp;
<span class=\"screen-reader-text\">3-way Disk Mirrors With ZFSOnLinux<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":845,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[9,10,4,11,3,5,6],"_links":{"self":[{"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/posts\/9"}],"collection":[{"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/comments?post=9"}],"version-history":[{"count":5,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/posts\/9\/revisions"}],"predecessor-version":[{"id":483,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/posts\/9\/revisions\/483"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/media\/845"}],"wp:attachment":[{"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/media?parent=9"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/categories?post=9"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bhoey.com\/blog\/wp-json\/wp\/v2\/tags?post=9"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}