HomeUser Control Panel (unavailable in archive)ForumsTutorialsArt GalleryResourcesMaps

Integer maps again

08-12-2007, 07:53 AM#1
PipeDream
The hash table submissions theoretically have problems with inconsistent performance. If we have all the keys before we need to do any insertions or retrievals, then we can definitely avoid this, although I make no claim that it is faster than gamecache.

Collapse JASS:
// perfect hashing, intended to be O(1) on all operations
// JASS Design Constraints:
//   Very few arithmetic operators
//   Fixed array size
//   But, space is cheap.
// So, implement a hash function using arrays to lookup integer chunks
globals
    integer array hash1    //each hash array subset of Z < 4096 to 0..nhash1
    integer array hash2
    integer array hash3
    constant integer tsize = 4096
    constant integer nhash1 //nhash1*nhash2*nhash3 <= 8190
    constant integer nhash2
    constant integer nhash3
endglobals

//The hash function
//Annoyingly hand eliminated variables, but left constants unpropagated
function lookup takes integer key returns integer
    local integer k2 = key / tsize
    local integer k1 = key - k2*tsize
    return hash1[k1] + nhash1*hash2[(key - k2*tsize)/tsize] + nhash1*nhash2*hash3[k2/tsize]
endfunction

//Add a key to the hash function
function add takes integer key returns nothing
    local integer k2 = key / tsize
    local integer k1 = key - k2*tsize
    local integer k3 = k2/tsize
    set k2 = (key-k2*tsize)/tsize

    if hash1[k1] == 0 then
        set nhash1 = nhash1 + 1
        set hash1[k1] = nhash1
    endif
    if hash2[k2] == 0 then
        set nhash2 = nhash2 + 1
        set hash2[k2] = nhash2
    endif
    if hash3[k3] == 0 then
        set nhash3 = nhash3 + 1
        set hash3[k3] = nhash3
    endif
endfunction

Can this be made dynamic? I am not sure. Deletion could be added by keeping reference counts and stacks of free indices. However, changing nhash's would require rehashing the entire table unless you keep the slots allocated to each table fixed to some product less than 8190. This may be a practical limitation for some applications. For example, if your map is guaranteed to allocate less than 32768 handles, and you need a table on at most 512 of those, you can put a zero for nhash3 (i.e. remove it), an 8 for nhash2, and 512 for nhash1.
08-12-2007, 08:57 AM#2
PitzerMike
Very good idea Pipe.
I don't know how the algorithm works, but I know that for a set of known keys given enough time it's possible to construct a minimal perfect hash function.

Quote:
Minimal perfect hashing guarantees that n keys will map to 0..n-1 with no collisions at all.

Now considering we know how many handles are created (or supported for that matter) at most (eg. 32768 in your example) and we know the first handle index (i think it's 0x100000 or something) there is no need to know the algorithm and there is no need to implement it in JASS.

All we need to do is use a minimal perfect hash-generator (like this for example: http://burtleburtle.net/bob/hash/perfect.html ) to generate a hashfunction that maps numbers 0x100000..0x100000+32768 to 0..32768-1 and port the generated hash function to JASS.

Now we only need max_supported_objects / JASS_MAX_ARRAY_SIZE - 1 arrays to hold the data. (that's 4 + a tiny bit of a fifth array for our supposed limit of 32768.
After getting the hash key we just use key / JASS_MAX_ARRAY_SIZE - 1 to find out which array it goes to and ModuloInteger(key, JASS_MAX_ARRAY_SIZE - 1) to find out the index in that array.

Now considering that declaring some additional arrays doesn't have any effect on performance we practically don't need to limit the number of supported handles at all.


Things to consider: Special types like texttags and lightnings don't need the hash function since their handle ids start at 0 anyway, they only need their own arrays to store their data.
For string ids i don't know if there's a certain range where they can be found. Who knows?
08-12-2007, 09:05 AM#3
PipeDream
Perfectly mapping 0x100000 .. 0x100000+N to 0..N-1 is just subtraction. Then you branch N/8191 or log N / log 8191 times (linear or bisection) to choose the array. So really handles was a stupid example for me to bring up, as we have had that solution for a long time.
08-12-2007, 09:12 AM#4
PitzerMike
That's what I was thinking just now, hmm.
Don't know what else it could be used for though.
String hashing maybe in special cases.
Anyway one of those perfect hash-function generators could be ported to a JASS preprocessor if that's of any use.
08-12-2007, 09:19 AM#5
cohadar
Quote:
Originally Posted by PipeDream
Perfectly mapping 0x100000 .. 0x100000+N to 0..N-1 is just subtraction. Then you branch N/8191 or log N / log 8191 times (linear or bisection) to choose the array. So really handles was a stupid example for me to bring up, as we have had that solution for a long time.

I just wanted to agree, completely.
PS: Binary interpolation is the best method here but I didn't see anyone using it.
08-12-2007, 09:22 AM#6
PipeDream
Well my original motivation was for static fourcc codes. Using an external program to implement the perfect hashing would definitely work better as you can easily imagine that once you have cube root of 8190 codes the generator I suggest will fail if they're different in each group of 12 bits. Hopefully they won't be though if they're for example mostly 'I0xx' for item ids, but you have a few non custom odd ones that you need to take care of.

Then I wondered if there was a way to do dynamic stuff better, or at least for some special use case, but considering the above problem, probably not.
08-12-2007, 09:38 AM#7
cohadar
Quote:
Originally Posted by PipeDream
Then I wondered if there was a way to do dynamic stuff better, or at least for some special use case, but considering the above problem, probably not.

LoL you kidding me?

For external programs (I assume this has something to do with pJASS)
simply use java hashtables.
08-12-2007, 09:55 AM#8
PipeDream
No, we mean an external program to generate the lookup tables or algebra, like gperf targeted for JASS instead of C.
08-12-2007, 10:10 AM#9
cohadar
I don't follow you.
What data exactly do you need hashing for?

JASS bytecode?
Item/Ability IDs or strings?
Some sort of map optimization?
08-12-2007, 10:15 AM#10
PitzerMike
The aim is a JASS implementation of a perfect hash-function for a given set of object ids or strings.
08-12-2007, 12:57 PM#11
Toadcop
Quote:
The aim is a JASS implementation of a perfect hash-function for a given set of object ids or strings.
thats just toys ^^

strings - begins from 1... so it doesnt need some hash functions. but it not save/load compatible like lightnings,images,ubersplats,textags. (and maybe some others) so to attach something on that types makes no sense.
i like cohadars idea (but not directly) it makes the best work. you need mod + some ifs and thats all... the loop is useless.
maybe i will make some example.
08-12-2007, 01:43 PM#12
cohadar
Man you really piss me off with that oh I hate loops thing.

LOOPS HAPPEN ONLY WHEN HASH IS COLLIDING = ONCE IN MILLION YEARS.

Thats it, I am mad now,
I am going to make the version with less loops just to prove you wrong.
08-12-2007, 02:05 PM#13
Toadcop
cohadar drink a cool nice beer and relax ^^ btw it was a compliment from my side...

// TcX is comming fucking awesome soon ^^ ...
08-12-2007, 08:32 PM#14
Veev
//ok