2007/09/18

wishful feature: zfs splitfs

Example:
I have a tank/data filesystem, with my important "stuff" in it, including /tank/data/oracle and /tank/data/webcontent. This is a production system, so I can't shut down to move the data around. I need to quota off the web content so it doesn't run Oracle out of space.

So what I'd like to do is...
zfs splitfs tank/data/webcontent
zfs set quota=5g tank/data/webcontent

Conceptually, it seems simple enough. Just create the appropriate new ZFS filesystem entry in the pool, with its root inode pointing at the existing directory. No data copying necessary.

Unfortunately, I think it would not work, because there may be open files on the newly split-off filesystem, so the (fsid,inode) pair on those open files would have to be changed to (newid,inode) in all processes. Atomically. As part of the update to the zpool metadata. Or else the kernel would have to be able to realize that the same inode is referenced by two different filesystems. :(
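To make the (fsid,inode) idea concrete, here is a small sketch that prints the device-id:inode pair the kernel reports for a path. It assumes GNU-style stat with the -c option (Solaris of this era may not ship one), and file_id is just a helper name made up for the illustration. Two hard links to the same file report the same pair; a splitfs would have to change the first half of that pair underneath every process holding the file open.

```shell
#!/bin/sh
# Sketch only. Assumes a GNU-style `stat` that understands -c.
# file_id is a hypothetical helper, not a ZFS or system command.
file_id() {
    # %d = device (filesystem) id, %i = inode number
    stat -c '%d:%i' "$1"
}

tmpdir=$(mktemp -d)
echo hello > "$tmpdir/a"
ln "$tmpdir/a" "$tmpdir/b"   # hard link: a second name for the same inode

file_id "$tmpdir/a"
file_id "$tmpdir/b"          # same device:inode pair as the line above

rm -rf "$tmpdir"
```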

--Joe

1 comment:

Constantin said...

Hi Joe,

this could be done quite easily:

zfs snapshot tank/data@split
zfs clone tank/data@split tank/webcontent2
mv /tank/webcontent2/webcontent/* /tank/webcontent2
rmdir /tank/webcontent2/webcontent
svcadm disable -t $service_that_uses_webcontent
mv /tank/data/webcontent /tank/data/webcontent_old
zfs set mountpoint=/tank/data/webcontent tank/webcontent2
svcadm enable $service_that_uses_webcontent

You can then proceed and delete the old webcontent stuff, promote the clone and get rid of the snapshot if you want to recover space. You can also rename the webcontent2 filesystem to webcontent after the migration for cosmetic purposes.

Yes, this involves a short service interruption (which can be minimized to a few seconds with some scripting), but even if there were a zfs split command, it would still need a short service interruption to perform the unmount/mount necessary to introduce the new filesystem.

Of course the above is quick'n'dirty, but after some scripting and testing it should do what you want.
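As a starting point for that scripting, here is a rough sketch of the cutover step only (disable, move aside, remount, enable), so the interruption is limited to those four commands. The run and cutover functions, the DRY_RUN variable and the service name svc:/site/webserver are all made up for the illustration; the paths follow Joe's /tank/data/webcontent layout. By default it only prints what it would do, so it can be checked before being pointed at a real pool.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# run, cutover and the service name below are hypothetical.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"          # dry run: show the command, do not execute
    else
        "$@" || exit 1       # real run: stop at the first failure
    fi
}

# cutover SERVICE CLONE LIVE_PATH
cutover() {
    svc=$1; clone=$2; live=$3
    run svcadm disable -t "$svc"             # temporary disable
    run mv "$live" "${live}_old"             # move the old directory aside
    run zfs set "mountpoint=$live" "$clone"  # mount the clone in its place
    run svcadm enable "$svc"
}

cutover svc:/site/webserver tank/webcontent2 /tank/data/webcontent
```

Run it once as-is to read the plan, then with DRY_RUN=0 once the clone has been prepared and checked.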

Hope this helps,
Constantin