7th European BSD Conference: Oct 18-19 2008, Strasbourg, France
Dynamic memory allocation for dirhash in UFS2
Nick Barkas
Abstract
Hello
My name is Nick Barkas. I'm a master's student studying scientific
computing at Kungliga Tekniska högskolan (KTH) in Stockholm, Sweden. I
have just begun work on a Google Summer of Code project with FreeBSD:
"Dynamic memory allocation for dirhash in UFS2". I would like to
present my results from this project at EuroBSDCon
this year. The project is still very much a work in progress, so it is
difficult to summarize what I will ultimately present, but I will try
to describe an outline.
First I will give background information on dirhash: an explanation of
the directory data structure in UFS2, how directory lookups in this
structure necessitate a linear search, and how dirhash speeds up these
lookups without changing anything about the directory data
structure. Next I will explain the current limitation that dirhash's
maximum memory use must be manually specified by administrators, or
left at a small conservative default of 2 MB. I will explain the
different methods I will have explored to make this maximum memory
limit increase and decrease dynamically as the system has more or
less free memory, and which method I will have ultimately settled on
and implemented. Then I will present performance test results for
operations on very large directories, with and without dynamic
memory allocation enabled for dirhash.
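To make the background concrete, here is a minimal sketch of the idea in Python. It is a toy model, not FreeBSD code: the directory is assumed to be a flat list of (name, inode) pairs, a Python dict stands in for the in-memory hash table, and the function names are hypothetical. It only illustrates why a lookup without an index has to scan the entries in order, and how an index kept beside the unchanged entry list avoids that scan.

```python
# Toy model only: a directory is a flat list of (name, inode) entries,
# standing in for the on-disk directory data. All names are hypothetical.

def linear_lookup(entries, name):
    """Scan the entries in order; return (inode, comparisons made)."""
    comparisons = 0
    for entry_name, inode in entries:
        comparisons += 1
        if entry_name == name:
            return inode, comparisons
    return None, comparisons


def build_hash(entries):
    """Build an in-memory index from name to position in the entry list.

    The entry list itself is not modified, mirroring how dirhash leaves
    the directory data structure unchanged.
    """
    return {entry_name: pos for pos, (entry_name, _) in enumerate(entries)}


def hashed_lookup(entries, index, name):
    """Use the index to jump straight to the entry; return its inode."""
    pos = index.get(name)
    if pos is None:
        return None
    return entries[pos][1]


# A large directory: 10000 entries named file00000 .. file09999.
entries = [("file%05d" % i, 1000 + i) for i in range(10000)]
index = build_hash(entries)

# Looking up the last entry costs one comparison per entry when scanning.
inode, comparisons = linear_lookup(entries, "file09999")
print(inode, comparisons)                             # 10999 10000
print(hashed_lookup(entries, index, "file09999"))     # 10999
```

Note that building the index is itself a full scan of the directory, so the gain comes from repeated lookups in the same directory while the index stays in memory.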
Next I will talk about how speed gains from dirhash are limited by the
fact that the hash tables exist only in memory. They must be recreated
after each system boot, as big directories are scanned for the first
time, and recreated again for any directory that has not been scanned
in some time if its dirhash has been discarded to free memory.
These problems can be eliminated by using an on-disk index for
directory entries. I will talk about some of the challenges of
implementing on-disk indexing, such as remaining backwards compatible
with older versions of UFS2 and interoperating properly with
softupdates. Then, if my SoC project has permitted me time to work on
this aspect of it, I will explain some possible methods for adding
directory indexing to UFS2 that meet these challenges, and which of
those ideas I will have implemented. Finally I will present results of
some benchmarks on a filesystem with on-disk indices, and compare them
to performance with dirhash, and with neither indices nor dirhash.
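The cost of keeping hash tables only in memory, described above, can be sketched with a toy cache in Python. Everything here is an assumption made for illustration: the class name, the budget counted in directory entries rather than bytes, and the least-recently-used discard order are not taken from the FreeBSD implementation. The sketch only shows that when the memory limit is too small for the working set, a discarded table has to be rebuilt by a full directory scan the next time that directory is used.

```python
from collections import OrderedDict


class HashCache:
    """Toy cache of per-directory hash tables under a total size budget."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # hypothetical budget, in entries
        self.tables = OrderedDict()     # directory name -> index, oldest first
        self.used = 0                   # entries currently held in all tables
        self.rebuilds = 0               # full directory scans performed

    def get(self, dirname, entries):
        """Return the hash table for a directory, rebuilding it on a miss."""
        if dirname in self.tables:
            self.tables.move_to_end(dirname)  # mark as most recently used
            return self.tables[dirname]
        # Miss: the table was never built or was discarded, so scan it all.
        index = {name: pos for pos, (name, _) in enumerate(entries)}
        self.rebuilds += 1
        self.tables[dirname] = index
        self.used += len(index)
        # Discard the least recently used tables until within the budget,
        # always keeping the table that was just built.
        while self.used > self.max_entries and len(self.tables) > 1:
            _, old = self.tables.popitem(last=False)
            self.used -= len(old)
        return index


def make_dir(prefix, count):
    """Create a toy directory: a list of (name, inode) pairs."""
    return [("%s%04d" % (prefix, i), i) for i in range(count)]


dir_a = make_dir("a", 100)
dir_b = make_dir("b", 100)

small = HashCache(max_entries=150)   # cannot hold both tables at once
large = HashCache(max_entries=200)   # holds both tables

for cache in (small, large):
    cache.get("a", dir_a)
    cache.get("b", dir_b)
    cache.get("a", dir_a)            # a hit only if "a" was not discarded

print(small.rebuilds, large.rebuilds)   # 3 2
```

With the smaller budget the table for the first directory is discarded when the second is built, so the third access pays for another full scan. An on-disk index would avoid this rebuild, since it would survive both memory pressure and reboots.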
Keywords
dirhash, ufs2, filesystems, performance tuning