Building a ZFS Deduplication System

December 24th, 2009

The news of Sun integrating an in-line deduplication feature into ZFS has created quite a buzz in storage circles, and our clients have been asking us how to gain access to this new feature. This blog post describes the steps needed to build an OpenSolaris server, integrate the deduplication feature, and enable it.

For details about the ZFS deduplication feature, what it does, and how it does it, have a look at Jeff Bonwick’s blog post on the topic. He was the lead engineer on the project so you can take his word on it.

Deduplication was integrated into OpenSolaris build 128, which takes a little explanation. Solaris is Sun’s current commercial operating system. OpenSolaris has two flavors: the semiannual supportable release and the frequently updated developer release. The current supportable release is called 2009.06 and is available for download here. Also at that location is the latest “SXCE” build. That distribution is more like Solaris 10: a big ol’ DVD including all the bits of all the packages. OpenSolaris is the acknowledged future of Solaris, including a new package manager (more like Linux) and a live-CD image that can be booted for exploration and then installed as the core release. More packages can be added to that core via the package manager.

For this example I started by downloading the 2009.06 OpenSolaris distribution. I then clicked on the desktop “install” icon to install OpenSolaris to my hard drive (in this case inside of VMware Fusion on Mac OS X, but it can be installed anywhere good OSes can live). At the conclusion of that step, I rebooted the system into 2009.06. The good news is that 2009.06 is a valid release to run for production use. You can pay for support on it, and important security fixes and patches are made available to those with a support contract. The bad news is that deduplication is not available in that release. Instead, we need to point OpenSolaris at a package repository that contains the latest OpenSolaris developer release. Note that the developer release is not supported, and performing these next steps on OpenSolaris 2009.06 makes your system unsupported by Sun. But until an official OpenSolaris distribution ships that includes the deduplication code, this is the only way to get ZFS deduplication.

pbg@opensolaris:~$ pfexec pkg set-publisher -O http://pkg.opensolaris.org/dev opensolaris.org
Refreshing catalog
Refreshing catalog 1/1 opensolaris.org
Caching catalogs ...

Now we tell OpenSolaris to update itself, creating a new boot environment in which the current packages are replaced by any newer packages:

pbg@opensolaris:~$ pfexec pkg image-update
Refreshing catalog
Refreshing catalog 1/1 opensolaris.org
Creating Plan . . .
DOWNLOAD                                  PKGS       FILES    XFER (MB)
entire                                   0/690     0/21250    0.0/449.4
SUNW1394                                 1/690     1/21250    0.0/449.4
. . .

A few hundred megabytes of downloads later, OpenSolaris adds a new GRUB (on x86) boot entry as the default boot environment, pointing at the updated version.

A clone of opensolaris-1 exists and has been updated and activated.
On the next boot the Boot Environment opensolaris-2 will be mounted on '/'.
Reboot when ready to switch to this updated BE.

---------------------------------------------------------------------------
NOTE: Please review release notes posted at:

http://opensolaris.org/os/project/indiana/resources/relnotes/200906/x86/

---------------------------------------------------------------------------

A reboot to that new environment brings up the latest OpenSolaris developer distribution, in this case build 129:

pbg@opensolaris:~$ cat /etc/release
                       OpenSolaris Development snv_129 X86
           Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 04 December 2009
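
Since deduplication arrived in build 128, it can be handy to check the build number in a script before going further. The following is only a minimal sketch: the function names build_number and has_dedup_build are my own inventions, and it assumes the release text contains a string of the form snv_NNN as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the first snv build number found on stdin.
build_number() {
    sed -n 's/.*snv_\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if the build on stdin is 128 or later,
# the build in which deduplication was integrated.
has_dedup_build() {
    build=$(build_number)
    [ -n "$build" ] && [ "$build" -ge 128 ]
}

# On a live system this would be: has_dedup_build < /etc/release
# Here a sample line from the output above is used instead.
sample='                       OpenSolaris Development snv_129 X86'
if printf '%s\n' "$sample" | has_dedup_build; then
    echo "build includes dedup"
else
    echo "build predates dedup"
fi
```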

Finally, ZFS deduplication is available in this system.

pbg@opensolaris:~$ zfs get dedup rpool
NAME   PROPERTY  VALUE          SOURCE
rpool  dedup     off            default

Let’s try using it:

pbg@opensolaris:~$ pfexec zfs set dedup=on rpool/export/home/pbg
cannot set property for 'rpool/export/home/pbg':
pool and or dataset must be upgraded to set this property or value

Hmm, the on-disk ZFS format is still the one from the 2009.06 release. We need to upgrade it to gain access to the deduplication feature.

pbg@opensolaris:~$ zpool upgrade
This system is currently running ZFS pool version 22.

The following pools are out of date, and can be upgraded.  After being
upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  ------------
14   rpool

Use 'zpool upgrade -v' for a list of available versions and their associated features.

pbg@opensolaris:~$ zpool upgrade -v
This system is currently running ZFS pool version 22.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
 14  passthrough-x aclinherit
 15  user/group space accounting
 16  stmf property support
 17  Triple-parity RAID-Z
 18  Snapshot user holds
 19  Log device removal
 20  Compression using zle (zero-length encoding)
 21  Deduplication
 22  Received properties

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.
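
Because an upgraded pool can no longer be read by older software, it may be worth listing exactly which pools sit below version 21 (Deduplication) before upgrading anything. Here is a rough sketch that filters the “zpool upgrade” output shown above. The function name pools_needing_upgrade is hypothetical, and the parsing assumes the two-column VER/POOL table layout from this build.

```shell
#!/bin/sh
# Hypothetical helper: read "zpool upgrade" output on stdin and print
# every pool whose on-disk version is below 21 (Deduplication).
pools_needing_upgrade() {
    awk '
        /^---/ { in_table = 1; next }          # dashed rule: pool rows follow
        in_table && NF == 0 { in_table = 0 }   # blank line ends the table
        in_table && $1 + 0 < 21 { print $2 " (version " $1 ")" }
    '
}

# On a live system this would be: zpool upgrade | pools_needing_upgrade
# Here the output from this article is fed in as sample text.
pools_needing_upgrade <<'EOF'
This system is currently running ZFS pool version 22.

The following pools are out of date, and can be upgraded.  After being
upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  ------------
14   rpool

Use 'zpool upgrade -v' for a list of available versions and their associated features.
EOF
```

With the sample text above, the only pool reported is rpool at version 14.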

pbg@opensolaris:~$ pfexec zpool upgrade -a
This system is currently running ZFS pool version 22.

Successfully upgraded 'rpool'

Now we are ready to start using deduplication.

pbg@opensolaris:~$ zfs get dedup rpool
NAME   PROPERTY  VALUE          SOURCE
rpool  dedup     off            default
pbg@opensolaris:~$ pfexec zfs set dedup=on rpool
pbg@opensolaris:~$ zfs get dedup rpool
NAME   PROPERTY  VALUE          SOURCE
rpool  dedup     on             local
pbg@opensolaris:~$ zpool list rpool
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool  19.9G  10.7G  9.19G    53%  1.00x  ONLINE  -
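
The DEDUP column reads 1.00x here because nothing has been written since the property was turned on. To watch that number over time, something like the following sketch could pull it out of the “zpool list” output. The function name dedup_ratio is hypothetical, and it assumes the header row contains a column literally named DEDUP, as it does above.

```shell
#!/bin/sh
# Hypothetical helper: read "zpool list" output on stdin, find the DEDUP
# column from the header row, and print each pool's ratio without the "x".
dedup_ratio() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "DEDUP") col = i; next }
        col     { sub(/x$/, "", $col); print $col }
    '
}

# On a live system this would be: zpool list rpool | dedup_ratio
# Here the output from this article is used as sample text.
sample='NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool  19.9G  10.7G  9.19G    53%  1.00x  ONLINE  -'

ratio=$(printf '%s\n' "$sample" | dedup_ratio)
echo "dedup ratio: $ratio"
```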

Ben Rockwood provides a nice blog entry discussing its use, so I refer you to that site rather than repeating the information.

According to this “PSARC” (architecture plan), deduplication also applies to replication, so in essence a deduplicated stream is used when replicating data. Let’s take a look:

pbg@opensolaris:~$ pfexec zfs snapshot rpool/export/home/pbg@friday
pbg@opensolaris:~$ pfexec zfs send -D rpool/export/home/pbg@friday > /var/tmp/pbg-friday-dedupe
pbg@opensolaris:~$

Unfortunately, the current “zfs send -D” functionality is only a subset of what is really needed. With -D, a given block is sent only once within that “send” (and is thus deduplicated). However, if additional duplicate blocks are written, executing the same “zfs send -D” again would send the same set of blocks again. ZFS has no knowledge of whether a block already exists at the destination of the send. If it had such knowledge, “zfs send” would transmit a given block only once to a given target. In that case ZFS could become an even better replacement for backup tape: a ZFS system in production replicating to a ZFS system at a DR site, sending only blocks that the DR site has not seen before. Hopefully such functionality is in the ZFS development pipeline.
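
To make the limitation concrete, here is a toy model in plain shell. It is not ZFS code and does not touch any pool: each input line stands in for one block, and the hypothetical function send_dedup drops repeats within a single stream only, keeping no memory between runs.

```shell
#!/bin/sh
# Toy model, not ZFS code: one line of input represents one block.
# Within a single stream, a block that was already emitted is skipped.
# Nothing is remembered between invocations.
send_dedup() {
    awk '!seen[$0]++'
}

# Five blocks, of which only three (A, B, C) are unique.
blocks='A
B
A
C
B'

# Two separate "sends" of the same data.
first=$(printf '%s\n' "$blocks" | send_dedup | wc -l | tr -d ' ')
second=$(printf '%s\n' "$blocks" | send_dedup | wc -l | tr -d ' ')

echo "first send:  $first blocks"
echo "second send: $second blocks"
```

Each run emits the three unique blocks, so two runs move six blocks in total even though the receiver already held all three after the first. A sender that knew what the target already had would have sent nothing the second time.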

As it stands, ZFS deduplication is a powerful new feature. Once integrated into production-ready operating system releases and appliances, it could provide a breakthrough in low-cost data reduction and management. We plan to track that progress here, so stay tuned.

  1. jkotran
    December 24th, 2009 at 12:55 | #1

    Peter,

    Thank you for this post. I was planning to experiment with deduplication over the Christmas break. You have helped me to prepare. Do you have experience with build 129? Is it stable enough for experimentation by sys admins?

    Regards,

    Joe

  2. December 28th, 2009 at 09:08 | #2

    Hi Joe,

    Yes, build 129 seems to work fine. No issues so far in playing with deduplication.

    –Peter

  3. Rick Otten
    December 28th, 2009 at 16:57 | #3

    Cool stuff! Thanks!