
@freemo @p @tmy Well IDK about that you just seem mad that he won't make a benchmark. A benchmark isn't the only way to test efficiency, as P is proving here, although it's probably better.

Berating P over not making a benchmark just seems childish to me imho...y'all have different opinions on how this sort of thing should be handled, just agree to disagree man

@freemo @tmy Dude, next time I want to know what the HN hivemind has to say about best practices, I'm glad you're available as a resource, but I'd rather be dead than bored.
@freemo @p I think you both are over-estimating the level of content taught in CS 101.
@freemo

> A proper benchmark would be nice to back up your claim

If I managed to make a hash table lookup in RAM slower than a pointer chase across the disk, I would publish that. This is just a property of the data structures and their respective storage media. It's an in-memory hash table versus an on-disk linked list; that'd be like devising and running a benchmark to determine that a search index is faster than a full scan of the disk. The only reason I bothered to eyeball it was to make sure I hadn't completely botched it by doing something stupid.

> I just wanted to know where the claim came from and if it had any weight.

Okay, the FS's block size is 4kB. Dirents are not contiguous because they are appended piecemeal, so between two blocks that represent a directory, you will find thousands of blocks representing the files that have been written to disk in the interim. You can't predict where the next block is because it's a linked list. (Every clever thought you're having right now about putting an index in has to account for the physical medium and the failure modes and space overhead and performance. If you have any intuition about this, it is almost certainly tuned for RAM rather than disk, and most of the clever ideas have been tried already, and they did not result in a reliable, performant filesystem.) A pointer-chase across *random-access* memory is already bad, but even on SATA or NVMe disks, a seek is costlier.

Here is a hasty diagram. This is CS 101 stuff, dude.
filesystem_pointer_chase.png
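
For anyone who wants the same idea in code rather than a picture, here is a rough Python sketch. It is a toy model, not the filesystem's actual on-disk format: `DirBlock`, `build_directory`, `chained_lookup`, and `build_index` are made-up names, and the 100-entries-per-block figure is an arbitrary assumption. It only counts how many blocks a lookup has to touch when the directory is a chain of blocks, versus a single in-memory dict lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class DirBlock:
    """Hypothetical directory block: a few entries plus a pointer to the next block."""
    entries: Dict[str, int] = field(default_factory=dict)  # name -> inode number
    next_block: Optional["DirBlock"] = None


def build_directory(names, entries_per_block: int) -> DirBlock:
    """Append names piecemeal, starting a new block whenever the last one is full."""
    head = tail = DirBlock()
    for inode, name in enumerate(names):
        if len(tail.entries) >= entries_per_block:
            tail.next_block = DirBlock()
            tail = tail.next_block
        tail.entries[name] = inode
    return head


def chained_lookup(head: DirBlock, name: str) -> Tuple[Optional[int], int]:
    """Walk the chain block by block; return (inode or None, blocks read)."""
    blocks_read = 0
    block = head
    while block is not None:
        blocks_read += 1  # on a real disk, each of these may be a separate seek
        if name in block.entries:
            return block.entries[name], blocks_read
        block = block.next_block
    return None, blocks_read


def build_index(head: DirBlock) -> Dict[str, int]:
    """Read the whole chain once and keep a name -> inode hash table in memory."""
    index: Dict[str, int] = {}
    block = head
    while block is not None:
        index.update(block.entries)
        block = block.next_block
    return index


names = [f"file{i:05d}" for i in range(10_000)]
head = build_directory(names, entries_per_block=100)  # assumed block capacity
index = build_index(head)

print(chained_lookup(head, "file09999"))  # last entry: the whole chain is read
print(chained_lookup(head, "missing"))    # a miss also reads the whole chain
print(index.get("file09999"))             # one dict lookup, no blocks touched
```

The chain walk touches every block for the last entry or for a miss; the dict answers either case without touching a block at all.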
@freemo I eyeballed it locally and I ran awk against the server logs. The uploads dir for FSE was 9.6MB, so just a full readdir was a 9.6MB pointer-chase across non-consecutive 4kB blocks of the disk; now it is a lookup in a hash table in memory. The worst-case scenario for the old setup was obscene: it was like ten seconds to do an 'ls' when the cache was cold. Across the board, HEAD requests are now pretty uniform and stay under 1ms of backend time.

If you need something more scientific than that, then feel free to not believe me. I'm already hacking on the next thing.
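
A back-of-envelope version of those numbers, as a small Python sketch. The 9.6MB and 4kB figures come from the post above (read here as 9,600,000 and 4,096 bytes, which is an assumption); the per-block seek cost is a made-up placeholder, not a measurement, and the function names are hypothetical.

```python
import math

DIR_SIZE_BYTES = 9_600_000    # 9.6MB uploads dir (decimal MB assumed)
BLOCK_SIZE_BYTES = 4096       # 4kB filesystem block (4096 bytes assumed)
ASSUMED_SEEK_SECONDS = 0.004  # hypothetical cost per cold, non-consecutive block read


def blocks_to_read(dir_size: int, block_size: int) -> int:
    """Number of blocks a full readdir has to visit."""
    return math.ceil(dir_size / block_size)


def cold_readdir_seconds(dir_size: int, block_size: int, seek_seconds: float) -> float:
    """Rough cold-cache estimate: one seek per non-consecutive block."""
    return blocks_to_read(dir_size, block_size) * seek_seconds


n_blocks = blocks_to_read(DIR_SIZE_BYTES, BLOCK_SIZE_BYTES)
estimate = cold_readdir_seconds(DIR_SIZE_BYTES, BLOCK_SIZE_BYTES, ASSUMED_SEEK_SECONDS)
print(f"{n_blocks} blocks, roughly {estimate:.1f}s cold under the assumed seek cost")
```

Swap in whatever seek cost fits your hardware; the point is only that the total scales with the number of scattered blocks.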
soyboys then: wHy do yOU neEd GuNs? dOnt yOu kNow TeH Us hAS nUkes?

soyboys now: PLASTIC BARRICADES ARE NOW IN PLACE. WE HAVE SECEDED FROM THE UNITED STATES. OUR DEMANDS MUST BE MET.

@Lainyboy@neckbeard.xyz Hmm can you rephrase your statement? It's really hard to understand.

thinks will eventually bend the knee and eventually fire and . IDK about that, but if they do, it's no big loss

@realcaseyrollins I suppose it depends when the beatdown comes and whether or not Trump is still president when it happens
