Intro

While learning about ZFS dedup and compression I came across the command zdb -DD, which reports the dedup and compression ratios for a pool. I didn't really understand the output and couldn't find much on the net to explain it, so here is what I worked out.

My findings

Most of this is a wild ass guess.

Allocated is the space actually allocated in the pool.

Referenced counts each block once for every reference to it, so it is the space the same data would need without dedup. Referenced LSIZE is roughly how large things would be if all the data was copied to a non-ZFS filesystem.

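LSIZE appears to be the logical size: the size of the data -before- compression takes place. Dedup on its own does not change it (in the dedup-only example below LSIZE and PSIZE are identical); dedup shows up as the difference between the allocated and referenced columns.

PSIZE is related to zfs compression and appears to be the physical size: the size -after- compression takes place.

DSIZE is related to zfs copies and appears to be the size on disk -after- copies takes place (in the copies=3 example below, a PSIZE of 2.50K becomes a DSIZE of 7.50K).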

It would seem that DSIZE is the final answer to “How big is it?” for both Allocated and Referenced.

You can use the three values to break down the effects of dedup, compression, and copies individually and see how each one contributes to the total size.
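
To make that breakdown concrete, here is a small Python sketch that recomputes the summary line from the Total row of a histogram. The function name ddt_ratios and the three formulas are my assumptions: they reproduce the numbers in the examples on this page, but I have not confirmed that this is how zdb calculates them.

```python
def ddt_ratios(alloc_dsize, ref_lsize, ref_psize, ref_dsize):
    """Recompute the zdb -DD summary line from a histogram Total row.

    Assumed formulas (checked only against the examples on this page):
      dedup    = referenced DSIZE / allocated DSIZE
      compress = referenced LSIZE / referenced PSIZE
      copies   = referenced DSIZE / referenced PSIZE
    All four sizes must be in the same unit.
    """
    dedup = ref_dsize / alloc_dsize
    compress = ref_lsize / ref_psize
    copies = ref_dsize / ref_psize
    return dedup, compress, copies, dedup * compress / copies


# Total row of the dedup + gzip-9 + copies=3 example, sizes in K
dedup, compress, copies, overall = ddt_ratios(7.5, 58.5, 22.5, 67.5)
print(f"dedup = {dedup:.2f}, compress = {compress:.2f}, "
      f"copies = {copies:.2f}, dedup * compress / copies = {overall:.2f}")
# prints: dedup = 9.00, compress = 2.60, copies = 3.00, dedup * compress / copies = 7.80
```

Feeding in the Total rows of the other two examples gives 9.00 and 23.40 for the last figure, matching the output shown for them.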

Basic Examples

All examples are for a single pool holding 9 copies of an identical file.

  • Shows only dedup enabled
DDT-sha256-zap-duplicate: 1 entries, size 4608 on disk, 8192 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     8        1   6.50K   6.50K   6.50K        9   58.5K   58.5K   58.5K
 Total        1   6.50K   6.50K   6.50K        9   58.5K   58.5K   58.5K

dedup = 9.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 9.00

  • Shows dedup enabled + gzip-9 compression
DDT-sha256-zap-duplicate: 1 entries, size 4608 on disk, 8192 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     8        1   6.50K   2.50K   2.50K        9   58.5K   22.5K   22.5K
 Total        1   6.50K   2.50K   2.50K        9   58.5K   22.5K   22.5K

dedup = 9.00, compress = 2.60, copies = 1.00, dedup * compress / copies = 23.40
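
The combined figure of 23.40 in this example also equals referenced LSIZE divided by allocated DSIZE (58.5K / 2.50K), which reads as "logical data stored per unit of disk actually used". The Python sketch below works that out along with the fraction of space saved. The helper names are hypothetical, and reading the last number this way is my interpretation of the figures on this page, not something taken from the zdb documentation.

```python
def overall_ratio(ref_lsize, alloc_dsize):
    """Logical data referenced per unit of space allocated (same units)."""
    return ref_lsize / alloc_dsize


def space_saved(ref_lsize, alloc_dsize):
    """Fraction of the logical size that never had to be written to disk."""
    return 1 - alloc_dsize / ref_lsize


# dedup + gzip-9 example: 58.5K referenced LSIZE, 2.50K allocated DSIZE
print(f"{overall_ratio(58.5, 2.5):.2f}")   # prints: 23.40
print(f"{space_saved(58.5, 2.5):.1%}")     # prints: 95.7%
```
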

  • Shows dedup enabled + gzip-9 compression + copies=3
DDT-sha256-zap-duplicate: 1 entries, size 4608 on disk, 8192 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     8        1   6.50K   2.50K   7.50K        9   58.5K   22.5K   67.5K
 Total        1   6.50K   2.50K   7.50K        9   58.5K   22.5K   67.5K

dedup = 9.00, compress = 2.60, copies = 3.00, dedup * compress / copies = 7.80
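
For larger pools it is easier to pull the numbers out of the histogram with a script than to read them by eye. Below is a minimal Python sketch that parses one histogram row (a refcnt bucket or the Total line) into numbers. The names to_number and parse_row are made up for illustration, and I am assuming the K/M/G/T suffixes are powers of 1024 and that every row has exactly nine whitespace-separated fields, as in the output above.

```python
import re

# Assumption: zdb prints sizes with binary suffixes (K = 1024, and so on).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
COLUMNS = ("blocks", "LSIZE", "PSIZE", "DSIZE")


def to_number(text):
    """Convert a value such as '6.50K' or '9' to a plain number."""
    match = re.fullmatch(r"([\d.]+)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised value: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def parse_row(line):
    """Split one histogram row into its allocated and referenced halves."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    values = [to_number(field) for field in fields[1:]]
    return {
        "bucket": fields[0],
        "allocated": dict(zip(COLUMNS, values[:4])),
        "referenced": dict(zip(COLUMNS, values[4:])),
    }


row = parse_row(" Total        1   6.50K   2.50K   7.50K        9   58.5K   22.5K   67.5K")
ratio = row["referenced"]["DSIZE"] / row["allocated"]["DSIZE"]
print(f"{ratio:.2f}")  # prints: 9.00
```

The header, separator, and summary lines do not have nine fields, so parse_row raises ValueError on them; a caller reading real zdb output would need to skip those lines.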

Real-World Example

This output shows some real-world (for me) data. It consists of nightly backups of our users' folder server. In total that system stores about 900GB of individual user data, ranging from Outlook archive files, music, photos, and office documents to other random things. The backup runs nightly and there are about 10 nights of data here. Each night it does a --link-dest against the previous night's data, so a large number of hard-linked files already reduces the total size substantially.

… to be added

 
howto/understand_zdb_-s_or_-dd_output.txt · Last modified: 2011/06/21 16:50 by bruce
 