7th European BSD Conference: Oct 18-19 2008, Strasbourg, France
        
    
    Dynamic memory allocation for dirhash in UFS2
Nick Barkas
Abstract
Hello
My name is Nick Barkas. I'm a master's student studying scientific  
computing at Kungliga Tekniska högskolan (KTH) in Stockholm, Sweden. I  
have just begun work on a Google Summer of Code project with FreeBSD:
Dynamic memory allocation for dirhash in UFS2. I would like to present
my results from this project at EuroBSDCon this year. The project is
still very much a work in progress, so it is difficult to summarize
what I would ultimately present. I will try to describe an outline,
though.
First I will give background information on dirhash: an explanation of  
the directory data structure in UFS2, how directory lookups in this  
structure necessitate a linear search, and how dirhash speeds these  
lookups up without having to change anything about the directory data  
structure. Next I will explain the current limitation that dirhash's  
maximum memory use must be manually specified by administrators, or  
left at a small, conservative default of 2 MB. I will describe the
different methods I will have explored for making this memory limit
grow and shrink dynamically as the system has more or less free
memory, and which method I will have ultimately settled on and
implemented. Then I will present test results comparing the
performance of operations on very large directories with and without
dynamic memory allocation enabled for dirhash.
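To make the background concrete, here is a minimal sketch in C of the idea behind dirhash: the directory itself stays a flat sequence of entries that can only be searched linearly, while a separate in-memory hash table maps names to entry positions. Everything in it is an assumption made for illustration: the `toy_` names, the fixed-size entry layout, the FNV-1a hash, and the linear-probing table are hypothetical and are not the actual UFS2 structures or the FreeBSD dirhash code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a directory entry (not the real UFS2 layout). */
struct toy_dirent {
    uint32_t ino;        /* inode number; 0 marks an unused entry */
    char     name[32];   /* NUL-terminated file name */
};

#define TOY_HASH_SLOTS 64   /* must exceed the number of entries */
#define TOY_EMPTY      (-1)

/* In-memory index: each slot holds a position in the directory array. */
struct toy_dirhash {
    int slots[TOY_HASH_SLOTS];
};

/* FNV-1a string hash, chosen here only for brevity. */
static uint32_t toy_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Lookup without a hash: scan the entries until the name matches. */
static int toy_lookup_linear(const struct toy_dirent *dir, int n,
                             const char *name)
{
    for (int i = 0; i < n; i++)
        if (dir[i].ino != 0 && strcmp(dir[i].name, name) == 0)
            return i;
    return -1;
}

/* Build the index in one pass; the directory itself is left unchanged. */
static void toy_dirhash_build(struct toy_dirhash *dh,
                              const struct toy_dirent *dir, int n)
{
    assert(n < TOY_HASH_SLOTS);   /* guarantees at least one empty slot */
    for (int i = 0; i < TOY_HASH_SLOTS; i++)
        dh->slots[i] = TOY_EMPTY;
    for (int i = 0; i < n; i++) {
        if (dir[i].ino == 0)
            continue;
        uint32_t slot = toy_hash(dir[i].name) % TOY_HASH_SLOTS;
        while (dh->slots[slot] != TOY_EMPTY)   /* linear probing */
            slot = (slot + 1) % TOY_HASH_SLOTS;
        dh->slots[slot] = i;
    }
}

/* Lookup with the hash: probe from the hashed slot until a match
 * or an empty slot is found. */
static int toy_lookup_hashed(const struct toy_dirhash *dh,
                             const struct toy_dirent *dir, const char *name)
{
    uint32_t slot = toy_hash(name) % TOY_HASH_SLOTS;
    while (dh->slots[slot] != TOY_EMPTY) {
        int i = dh->slots[slot];
        if (strcmp(dir[i].name, name) == 0)
            return i;
        slot = (slot + 1) % TOY_HASH_SLOTS;
    }
    return -1;
}
```

In this sketch the hash table costs memory in proportion to the number of entries and has to be rebuilt by a full scan whenever it is thrown away, which is why a limit on total dirhash memory matters for very large directories.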
Next I will talk about how the speed gains from dirhash are limited
because the hash tables exist only in memory. They must be rebuilt
after each system boot, when big directories are scanned for the first
time, and may even have to be rebuilt for a directory that has not
been scanned in some time if its dirhash has been discarded to free
memory.
These problems can be eliminated by using an on-disk index for  
directory entries. I will talk about some of the challenges of  
implementing on-disk indexing, such as remaining backwards compatible  
with older versions of UFS2 and interoperating properly with  
softupdates. Then, if my SoC project has left me time to work on this
aspect, I will explain some possible methods for adding directory
indexing to UFS2 that meet these challenges, and which of those ideas
I will have implemented. Finally, I will present benchmark results for
the filesystem with on-disk indices and compare them to performance
with dirhash and with neither indices nor dirhash.
Keywords
dirhash, ufs2, filesystems, performance tuning