ZFS, a short overview
ZFS is a file system designed by Sun Microsystems. The file system is based on the Copy-on-Write (CoW) paradigm. When a file is modified, the blocks that must be changed are never modified in place. Instead the following operations are executed:
- Copy the block
- Change the new block
A block is never ever modified in place: a copy is created and then modified.
Like some other modern file systems, ZFS uses a tree to keep track of the blocks that form a file: each node stores the addresses of its children. When a block is to be changed, a copy is made and the copy is modified. This means that the parent has to be updated whenever one of its children changes. Thus the following operations happen:
- Copy the block
- Modify the new block
- Copy the parent block
- Update the address of the child in the new parent block
- Loop until the root block (Über block) is reached
- Atomically update the Über block
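The steps above can be sketched in a few lines of Python. This is only a toy model of the idea, not the ZFS on-disk format: the `Leaf` and `Node` classes and the `cow_update` function are names I made up for the illustration, and the "Über block" is simply the variable that holds the current root.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """A data block (hypothetical model, immutable once written)."""
    data: bytes


@dataclass(frozen=True)
class Node:
    """An indirect block: it only stores references to its children."""
    children: Tuple[Union["Node", Leaf], ...]


def cow_update(node, path, new_data):
    """Return a new root where the leaf found by following `path` holds `new_data`.

    Every block on the way from the root to the leaf is copied;
    nothing reachable from the old root is modified.
    """
    if not path:
        # We reached the data block: write the new content to a fresh block.
        return Leaf(new_data)
    index, rest = path[0], path[1:]
    new_child = cow_update(node.children[index], rest, new_data)
    # Copy the parent and point it at the new child.
    children = node.children[:index] + (new_child,) + node.children[index + 1:]
    return Node(children)


# Toy tree: a root with one indirect block and one data block.
old_root = Node((Node((Leaf(b"secret"), Leaf(b"b"))), Leaf(b"c")))

# Rewriting the first data block builds a new path up to a new root.
new_root = cow_update(old_root, (0, 0), b"random")

# The last step would be the atomic switch of the root pointer.
uber_block = new_root
```

After the update, `old_root` still leads to the block containing `b"secret"`: the old data is only unreferenced, not destroyed, which is the point of the rest of this post.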
Erasing a file
Classical file system
When deleting a file, most file systems only remove the references to the blocks that form the file and leave the blocks themselves unchanged. That's why files can sometimes be restored after a deletion: the blocks are still present on the hard drive.
Sometimes you might want to erase a file and ensure that the data is no longer present on the hard drive. A tool called shred has been developed for this purpose.
$ cat private
Really important information that must be removed.
$ shred private
$ hexdump -C private
00000000  c9 b9 75 91 02 1f a6 6f  71 d0 8a 9f 3c b5 f7 0f  |..u....oq...<...|
00000010  a4 9d 7c fb 56 ac 41 b3  a5 dc be f8 8d c4 41 5d  |..|.V.A.......A]|
..............
The file content is now overwritten with random data (this process must be repeated several times to ensure that the data cannot be recovered by some special tools).
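To make the idea concrete, here is a minimal sketch of an in-place overwrite in Python. It is not how shred is actually implemented (the real tool is more careful about block sizes and overwrite patterns); the function name `overwrite_in_place` and the default of 3 passes are my own choices for the example.

```python
import os


def overwrite_in_place(path, passes=3):
    """Overwrite the content of `path` with random bytes, `passes` times.

    Simplified illustration: the file is opened for update (not truncated),
    so the file system is asked to rewrite the existing bytes.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            # Push the data out of the Python and OS buffers to the device.
            f.flush()
            os.fsync(f.fileno())
```

This only destroys the old data if the file system really writes the new bytes over the old blocks, which is exactly the assumption that a CoW file system breaks.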
On a ZFS file system, the same set of commands will show the same result: the file is replaced by some random data. But as we have seen before, ZFS is based on CoW, which means that data blocks can still be present on the hard drive. Let's have a look at the hard drive to see if we can find the deleted data.
For the sake of the demonstration, I am using a regular file as the backing device for the ZFS pool. With a real device the operations are exactly the same.
$ zpool create zpool_test /root/zfs_partition
$ zfs mount zpool_test
$ cat /zpool_test/private
Really important information that must be removed.
$ shred /zpool_test/private
$ hexdump -C /zpool_test/private
00000000  c7 cc 86 60 d6 a3 f4 45  37 d5 e7 68 4d 49 c4 43  |...`...E7..hMI.C|
00000010  a8 87 ae e8 8c ac 21 37  aa e7 c1 34 a2 d5 1d ad  |......!7...4....|
..............
Shred seems to do its job, but if we look directly at the partition:
$ hexdump -C /root/zfs_partition
[...]
0040f000  52 65 61 6c 6c 79 20 69  6d 70 6f 72 74 61 6e 74  |Really important|
0040f010  20 69 6e 66 6f 72 6d 61  74 69 6f 6e 20 74 68 61  | information tha|
0040f020  74 20 6d 75 73 74 20 62  65 20 72 65 6d 6f 76 65  |t must be remove|
0040f030  64 2e 0a 00 00 00 00 00  00 00 00 00 00 00 00 00  |d...............|
[...]
Here we find the data that should have been removed by shred. Because ZFS is a CoW file system, the old blocks stay on the hard drive as long as they are not reused.
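Reading a full hexdump by eye quickly becomes painful on a larger image, so the check can be automated. The sketch below scans a file for a byte string and returns the offsets where it appears. The function name `find_pattern` and its parameters are mine, and it assumes the data is stored as plain bytes (no compression or encryption on the pool).

```python
def find_pattern(image_path, pattern, chunk_size=1 << 20):
    """Return the byte offsets at which `pattern` occurs in the file `image_path`.

    The file is read chunk by chunk; the last len(pattern) - 1 bytes of each
    buffer are kept so that a match spanning two chunks is not missed.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    overlap = len(pattern) - 1
    with open(image_path, "rb") as f:
        base = 0   # offset in the file of the first byte of `buf`
        buf = b""
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            start = 0
            while True:
                i = buf.find(pattern, start)
                if i < 0:
                    break
                offsets.append(base + i)
                start = i + 1
            # Keep only the tail that could still begin a match.
            keep = buf[-overlap:] if overlap else b""
            base += len(buf) - len(keep)
            buf = keep
    return offsets
```

On the image used in this post, a call such as `find_pattern("/root/zfs_partition", b"Really important")` should report the offset seen in the hexdump, plus any other copy of the block that may still be lying around.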
This issue occurs with every CoW file system, such as the promising Btrfs. I don't know of any way to erase the data other than wiping the entire partition. Maybe a specific tool will be developed for this purpose...