indexing
I hit unexpected limits today. Here's the thing.
There's already an nginx module that can serve
HTTP from a tarchive. This e2http malarkey was in part so that I could
lean on an index for lookups. But I found out today that the
libext2fs userspace library doesn't use the directory index for reading.
The root of the aeolus build already contains about 15k directories,
one for each post, and each of those contains one file. Resolving
names in the root directory is slow because the namei_follow that
I blindly followed does a linear search.
The quadratic collapse wasn't noticeable to me until I began work on
building out image layering for aeolus. tar2e2 worked well enough
reading directly from an image, but connecting over nbd caused an 8x
slowdown in name resolution. And then I could see that even in the
"direct" case there was a measurable resolution penalty for entries
toward the end of the directory.
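
To make the shape of that penalty concrete, here is a toy model in Python. It is only a sketch: the directory is a plain list of names rather than real ext2 directory blocks, and linear_lookup and total_comparisons are hypothetical helpers made up for illustration, not anything from libext2fs. It counts string comparisons, which is a stand-in for the real cost, not a measurement.

```python
# Toy model only: a "directory" here is a Python list of names,
# not real ext2 on-disk data, and nothing below calls libext2fs.

def linear_lookup(entries, name):
    """Scan entries in order; return (position, comparisons made)."""
    for comparisons, entry in enumerate(entries, start=1):
        if entry == name:
            return comparisons - 1, comparisons
    raise KeyError(name)


def total_comparisons(entries):
    """Cost of resolving every name once with a linear scan."""
    return sum(linear_lookup(entries, name)[1] for name in entries)


if __name__ == "__main__":
    n = 2000  # kept small so the demo runs quickly; the real root has ~15k
    entries = [f"post-{i:05d}" for i in range(n)]
    print("first entry:", linear_lookup(entries, entries[0])[1], "comparison(s)")
    print("last entry: ", linear_lookup(entries, entries[-1])[1], "comparison(s)")
    print("all entries:", total_comparisons(entries), "comparisons")
```

In this model the last entry costs n comparisons and resolving every name once costs n(n+1)/2, which for 15,000 entries works out to about 112 million comparisons. That is the quadratic part, and it is why entries toward the end of the directory are the ones that hurt.
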
There's logic for this in the kernel's filesystem driver. There's
logic for it in e2fsprogs! The library takes care to update
indexes when entries are linked, and the code to do that includes
index lookup functionality, but it's all private. I feel I'm being
"loudly encouraged" to work through a kernel emulation again. But I
want a namei_follow that can use directory indexes from userspace!