| 08-12-2007, 07:53 AM | #1 |
The hash table submissions theoretically have problems with inconsistent performance. If we have all the keys before we need to do any insertions or retrievals, then we can definitely avoid this, although I make no claim that it is faster than gamecache. JASS:// perfect hashing, intended to be O(1) on all operations // JASS Design Constraints: // Very few arithmetic operators // Fixed array size // But, space is cheap. // So, implement a hash function using arrays to lookup integer chunks globals integer array hash1 //each hash array subset of Z < 4096 to 0..nhash1 integer array hash2 integer array hash3 constant integer tsize = 4096 constant integer nhash1 //nhash1*nhash2*nhash3 <= 8190 constant integer nhash2 constant integer nhash3 endglobals //The hash function //Annoyingly hand eliminated variables, but left constants unpropagated function lookup takes integer key returns integer local integer k2 = key / tsize local integer k1 = key - k2*tsize return hash1[k1] + nhash1*hash2[(key - k2*tsize)/tsize] + nhash1*nhash2*hash3[k2/tsize] endfunction //Add a key to the hash function function add takes integer key returns nothing local integer k2 = key / tsize local integer k1 = key - k2*tsize local integer k3 = k2/tsize set k2 = (key-k2*tsize)/tsize if hash1[k1] == 0 then set nhash1 = nhash1 + 1 set hash1[k1] = nhash1 endif if hash2[k2] == 0 then set nhash2 = nhash2 + 1 set hash2[k2] = nhash2 endif if hash3[k3] == 0 then set nhash3 = nhash3 + 1 set hash3[k3] = nhash3 endif endfunction Can this be made dynamic? I am not sure. Deletion could be added by keeping reference counts and stacks of free indices. However, changing nhash's would require rehashing the entire table unless you keep the slots allocated to each table fixed to some product less than 8190. This may be a practical limitation for some applications. For example, if your map is guaranteed to allocate less than 32768 handles, and you need a table on at most 512 of those, you can put a zero for nhash3 (i.e. remove it), an 8 for nhash2, and 512 for nhash1. |
| 08-12-2007, 08:57 AM | #2 | |
Very good idea Pipe. I don't know how the algorithm works, but I know that for a set of known keys given enough time it's possible to construct a minimal perfect hash function. Quote:
Now considering we know how many handles are created (or supported for that matter) at most (eg. 32768 in your example) and we know the first handle index (i think it's 0x100000 or something) there is no need to know the algorithm and there is no need to implement it in JASS. All we need to do is use a minimal perfect hash-generator (like this for example: http://burtleburtle.net/bob/hash/perfect.html ) to generate a hashfunction that maps numbers 0x100000..0x100000+32768 to 0..32768-1 and port the generated hash function to JASS. Now we only need max_supported_objects / JASS_MAX_ARRAY_SIZE - 1 arrays to hold the data. (that's 4 + a tiny bit of a fifth array for our supposed limit of 32768. After getting the hash key we just use key / JASS_MAX_ARRAY_SIZE - 1 to find out which array it goes to and ModuloInteger(key, JASS_MAX_ARRAY_SIZE - 1) to find out the index in that array. Now considering that declaring some additional arrays doesn't have any effect on performance we practically don't need to limit the number of supported handles at all. Things to consider: Special types like texttags and lightnings don't need the hash function since their handle ids start at 0 anyway, they only need their own arrays to store their data. For string ids i don't know if there's a certain range where they can be found. Who knows? |
| 08-12-2007, 09:05 AM | #3 |
Perfectly mapping 0x100000 .. 0x100000+N to 0..N-1 is just subtraction. Then you branch N/8191 or log N / log 8191 times (linear or bisection) to choose the array. So really handles was a stupid example for me to bring up, as we have had that solution for a long time. |
| 08-12-2007, 09:12 AM | #4 |
That's what I was thinking just now, hmm. Don't know what else it could be used for though. String hashing maybe in special cases. Anyway one of those perfect hash-function generators could be ported to a JASS preprocessor if that's of any use. |
| 08-12-2007, 09:19 AM | #5 | |
Quote:
I just wanted to agree, completely. PS: Binary interpolation is the best method here but I didn't see anyone using it. |
| 08-12-2007, 09:22 AM | #6 |
Well my original motivation was for static fourcc codes. Using an external program to implement the perfect hashing would definitely work better as you can easily imagine that once you have cube root of 8190 codes the generator I suggest will fail if they're different in each group of 12 bits. Hopefully they won't be though if they're for example mostly 'I0xx' for item ids, but you have a few non custom odd ones that you need to take care of. Then I wondered if there was a way to do dynamic stuff better, or at least for some special use case, but considering the above problem, probably not. |
| 08-12-2007, 09:38 AM | #7 | |
Quote:
LoL you kidding me? For external programs (I assume this has something to do with pJASS) simply use java hashtables. |
| 08-12-2007, 09:55 AM | #8 |
No, we mean an external program to generate the lookup tables or algebra, like gperf targeted for JASS instead of C. |
| 08-12-2007, 10:10 AM | #9 |
I don't follow you. What data exactly do you need hashing for? JASS bytecode? Item/Ability IDs or strings? Some sort of map optimization? |
| 08-12-2007, 10:15 AM | #10 |
The aim is a JASS implementation of a perfect hash-function for a given set of object ids or strings. |
| 08-12-2007, 12:57 PM | #11 | |
Quote:
strings - begins from 1... so it doesnt need some hash functions. but it not save/load compatible like lightnings,images,ubersplats,textags. (and maybe some others) so to attach something on that types makes no sense. i like cohadars idea (but not directly) it makes the best work. you need mod + some ifs and thats all... the loop is useless. maybe i will make some example. |
| 08-12-2007, 01:43 PM | #12 |
Man you really piss me off with that oh I hate loops thing. LOOPS HAPPEN ONLY WHEN HASH IS COLLIDING = ONCE IN MILLION YEARS. Thats it, I am mad now, I am going to make the version with less loops just to prove you wrong. |
| 08-12-2007, 02:05 PM | #13 |
cohadar drink a cool nice beer and relax ^^ btw it was a compliment from my side... // TcX is comming fucking awesome soon ^^ ... |
| 08-12-2007, 08:32 PM | #14 |
//ok |
