Thursday, September 01, 2005

The Old New Thing : Understanding hash codes

The Old New Thing : Understanding hash codes: "# Re:Understanding hash codes
Wednesday, August 31, 2005 1:22 PM by Andrew Fedoniouk
Back to original task of creating
sting-to-numeric-id functions/maps.

Ternary search trees are generally better approach for that (compact and fast) and do no require non-deterministic hash functions.

Tested by myself on real tasks (HTML/DOM parsing/rendering). Indeed, it shows significant gains in some cases. Especially in places where hash function were not designed properly.
In general it is hard to create good hash functions - it's design is a serious research as a rule.

Canonical TST paper:
http://www.cs.princeton.edu/~rs/strings/

And about perfect hashes. There is a well known utility named gperf.exe (google:gperf.exe) - it allows to generate automaticly perfect and minimal string-to-id functions for C/C++. "

0 Comments:

Post a Comment

Links to this post:

Create a Link

<< Home