| 11-25-2009, 04:19 AM | #1 |
Updated:: Dictionary is an associative array, like Vexorian's table. Unlike Table however, Dictionary keeps track of elements stored in it and provides a means to iterate through the collection. This is basically a linked list and a hashtable combined. There is a readonly array struct that will inline, and doesn't allow mutation. The macro is instanced currently for 3 types Dict_integer Dict_string Dict_agent Thankfully it is a macro so you can instance your own. just at the bottom of Dictionary library instance to your hearts content. The macro takes 3 parameters //! runtextmacro DictTemplate (T,HashFunc,DefaultVal) where T is the type, HashFunc is a function to generate a hash, and defaultVal is the default for the type. Another included type is Set. Set is like a unitgroup but for any type T. Set provides a readonly node as well, and a fast contain operator. It too is all constant time operations except for its clear op. More text in the script header. Zinc://! zinc //! zinc library Dictionary { // Thanks goes to Jesus4lyf for his clever optimizations and showing // the light of typecast abuses. // Also thanks to Vexorian for Zinc and JassHelper. Zinc makes writing Jass // enjoyable and quick. // Note: T is a type. // by default T is either integer,agent, or string. //============================================================================= // Dict_T //============================================================================= // Dictionary is an associative array that keeps track of elements // stored in the array. All its operations are o(1) unless stated otherwise. // Hashtable calls # are displayed in [] next to the method description. // Property Index:: // ---------------- // Count->integer // returns number of elements in this dictionary. // // operator[T key]->integer [1 ht load] // returns the Value associated with this key. 0 if it fails. // // operator[T key]=integer [max 1 ht loads,1 ht save] // associate key with value in the dictionary. // Method Index:: // -------------- // Contains(T key)->boolean [1 ht load] // returns if this key is in the dictionary. // // Add(T key,integer value)->boolean [max 1 ht load, 1 ht save] // tries to add key,value to the dictionary, if and only if the key // isn't already part of the collection. Returns true if key was added. // // Remove(T key)->integer [max 1 ht remove, 1 ht load] // tries to remove this key from the collection. Returns the value // that was associated if it succeeded, 0 if it failed. // // GetFirstEntry()->DictEntry_T // returns a new dictEntry a readonly accessor through the collection //see further documentation For the enumeration paradigm. // // ForEach(ActionDict_T func) // applies func to each key,value in the collection. Useful for generating // debug messages, or when performance isn't critical. // Zinc is great here because of anon funcs. If for instance you want // you want to print all the entries in the collection, but only while testing // you can pull it off as a oneliner. /* example:: debug dict.ForEach(function(integer key,integer value){ BJDebugMsg("key= " + I2S(key) + "value= " + I2S(value));}); */ // Clear() (O(n) operation) [1 ht child hashtable flush] // Clears the collection of all elements in it. //============================================================================= // Set_T //============================================================================= // Set is a uniqueness enforcing collection structure. Only one // instance of any object can be stored in the collection at a time. // It is like a unitgroup but for any type T that you can instance the macro for. // Internally it is a thin-wrapper for a dictionary, and thus should inline // around the calls to dictionary. // Property Index:: // ---------------- // Count->integer // returns number of elements in this dictionary. // // Method Index:: // -------------- // Contains(T object)->boolean [1 ht load] // returns if this object is in the dictionary. // // Add(T object)->boolean [max 1 ht load, 1 ht save] // tries to add object to the dictionary, if and only if the object // isn't already part of the set. Returns if the object was added. // // Remove(T object)->boolean [max 1 ht remove, 1 ht load] // tries to remove object from the collection. Returns if the // the object was removed. // // GetFirstEntry()->SetEntry_T // returns the first entry in the set further documentation For // the enumeration paradigm (it kind of sucks!) // // ForEach(ActionSet_T func) // applies func to all members of the set. It is useful in non // performance critical application // Clear() (O(n) operation) [1 ht child hashtable flush] // Clears the collection of all elements in it. //============================================================================= // SetEntry/DictEntry_T //============================================================================= // Set/DictEntry_T is a safe way to enumerate through a collection, // it is high performance as it will inline, and doesn't use any hashtable // calls. They are also array structs meaning that no create or destroy // method should ever be called (may also be an error!) //Unfortunately, the enumeration paradigm is pretty awful // Zinc -- this is actually pretty clean to be honest this is almost // as good as it gets! /* here for copy and paste usage:: DictEntry_integer node; integer key; integer value; for(node = dict.GetFirstEntry(); node != dict; node = node.Next) { key = node.Key; value = node.value; // code goes here } */ // Vjass -- this ofcourse is much uglier /* local DictEntry_integer node = dict.GetFirstEntry(); local integer key local integer value loop exitwhen node == dict set key = node.Key set value = node.Value // code goes here set node = node.Next endloop */ // DictEntry_T Property Index:: // ============================== // Key-> T // returns the key at this point in the enumeration. // Value -> integer // returns the value at this point in the enumeration. // SetEntry_T Property Index:: // ============================= // Object -> T // returns the object stored at this point in the enumeration. // Shared Properties:: // ==================== // Next -> thistype // returns the next node in a safe readonly wrapper. //============================================================================= hashtable Table = InitHashtable(); interface INode { INode Next; INode Prev; integer Value; } //! textmacro DictTemplate takes T,HashFunc,DefaultVal // Node is private, and should stay that way. // A consumer should never have to know about this class. public type ActionDict_$T$ extends function($T$,integer); public type ActionSet_$T$ extends function($T$); struct Node_$T$ extends INode { $T$ Key; static method create($T$ key,integer value)->thistype { thistype n = thistype.allocate(); n.Key = key; n.Value = value; return n; } } public struct SetEntry_$T$[] { method operator Next()->thistype { return Node_$T$(this).Next; } method operator Object()->$T$ { return Node_$T$(this).Key; } } public struct DictEntry_$T$[] { method operator Next()->thistype { return Node_$T$(this).Next; } method operator Key()->$T$ { return Node_$T$(this).Key; } method operator Value()->integer { return Node_$T$(this).Value; } } public struct Dict_$T$ { private method Fetch($T$ key)->Node_$T$ { return LoadInteger(Table,integer(this),$HashFunc$(key)); } method Contains($T$ key)->boolean { return HaveSavedInteger(Table,integer(this),$HashFunc$(key)); } private method operator Prev()->Node_$T$ { return Node_$T$(this).Prev; } private method operator Prev=(Node_$T$ node) { Node_$T$(this).Prev = node; } private method operator Next()->Node_$T$ { return Node_$T$(this).Next; } method operator Count()->integer { return Node_$T$(this).Value; } private method operator Count=(integer value) { Node_$T$(this).Value = value; } method operator[]($T$ key)->integer { return Fetch(key).Value; } method operator[]=($T$ key,integer value) { Node_$T$ node = Fetch(key); if(node == 0) { node = Node_$T$.create(key,value); Prev.Next = node; node.Next = this; Prev = node; Count+=1; SaveInteger(Table,integer(this),$HashFunc$(key),node); } else { node.Value = value; } } method Add($T$ key,integer value)->boolean { Node_$T$ node; if(Contains(key)) return false; node = Node_$T$.create(key,value); Prev.Next = node; node.Next = this; Prev = node; Count+=1; SaveInteger(Table,integer(this),$HashFunc$(key),node); return true; } method Remove($T$ key)->integer { Node_$T$ node = Fetch(key); integer retVal = node.Value; if(node != 0) { RemoveSavedInteger(Table,integer(this),$HashFunc$(key)); node.Prev.Next = node.Next; node.Next.Prev = node.Prev; node.destroy(); Count-=1; } return retVal; } method GetFirstEntry()->DictEntry_$T$ { return DictEntry_$T$(Next); } method Clear() { Node_$T$ node = this.Next; while(node != this) { node.destroy(); node = node.Next; } FlushChildHashtable(Table,integer(this)); Count = 0; } static method create()->thistype { Node_$T$ n = Node_$T$.create($DefaultVal$,0); n.Next = n; n.Prev = n; return n; } method ForEach(ActionDict_$T$ func) { Node_$T$ node; for(node = this.Next; node != this; node = node.Next) { func.evaluate(node.Key,node.Value); } } private method onDestroy() { Clear(); Node_$T$(this).destroy(); } } public struct Set_$T$ { method operator Count()->integer { return Dict_$T$(this).Count; } method Add($T$ object)->boolean { return Dict_$T$(this).Add(object,-1); } method Remove($T$ object)->boolean { return Dict_$T$(this).Remove(object) == -1; } method Contains($T$ object)->boolean { return Dict_$T$(this).Contains(object); } method Clear() { Dict_$T$(this).Clear(); } private method onDestroy() { Dict_$T$(this).destroy(); } method GetFirstEntry()->SetEntry_$T$ { return SetEntry_$T$(Node_$T$(this).Next); } method ForEach(ActionSet_$T$ func) { Node_$T$ node; for(node = Node_$T$(this).Next; node != this; node = node.Next) { func.evaluate(node.Key); } } static method create()->thistype { return Dict_$T$.create(); } } //! endtextmacro //! runtextmacro DictTemplate("integer","","0") //! runtextmacro DictTemplate("agent","GetHandleId","null") //! runtextmacro DictTemplate("string","StringHash","null") } //! endzinc Here is fibonacci. You can try running the Fib method which has exponential runtime, and quickly blow out your op limit, or you can use the faster MemoizedFib which can compute any value in a worse case of o(n). Also demonstrated is how to enumerate through the values of the dictionary. Using the for loop feature of zinc. Its slightly miserable and a bit long, but it will compile cleanly to some nice jass. You'll note that you don't have to destroy the entry at the end, you are done as it is an array struct and isn't "instanced". Alternatively, a slower fashion of generating a loop through all entries is the forEach method. You can use it for debug purposes or whatever. Its not terribly less code to use the callback, but maybe you can use it do something interesting. Zinc:library fib { Dict_integer Dict; public function Fib(integer n)->integer { if(n < 2) return n; return Fib(n-1)+Fib(n-2); } public function MemoizedFib(integer n)->integer { if(Dict.Contains(n)) return Dict[n]; if(n < 2) { Dict[n] = n; return n; } Dict[n] = MemoizedFib(n-1)+MemoizedFib(n-2); return Dict[n]; } function onInit() { RO_DictNode_integer node; Dict = Dict_integer.create(); BJDebugMsg(I2S(MemoizedFib(15))); for(node = Dict.GetReadOnlyNode(); node != Dict; node = node.Next) { BJDebugMsg("key= " + I2S(node.Key) + "value= " + I2S(node.Value)); } function onInit() { DictEntry_integer node; Dict = Dict_integer.create(); MemoizedFact(15); static if(!DEBUG_MODE) { for(node = Dict.GetFirstEntry(); node != Dict; node = node.Next) { BJDebugMsg("key= " + I2S(node.Key) + "value= " + I2S(node.Value)); } } else { dict.ForEach(function(integer key,integer value){ BJDebugMsg("key= " + I2S(key) + "value= " + I2S(value));}); } } } UPDATED again:: This time I switched Dict_handle/Set_handle->Dict_agent/Set_agent as the fogstate return bug is unsupported and is the only way to make handles useful. Agent typecasting while still a grey arey is more supported. |
| 11-27-2009, 02:36 AM | #2 |
I also quickly wrote an example HashSet for demonstration purposes to show that its easy to wrap a hashset around it. Zinc://! zinc library HashSet requires Dictionary { //! textmacro HashSetTemplate takes T public struct Set_$T$ { Dict_$T$ m_dict; method operator Count()->integer { return m_dict.Count; } method Add($T$ object)->boolean { return m_dict.Add(object,-1); } method Remove($T$ object)->boolean { return m_dict.Remove(object) == -1; } method Contains($T$ object)->boolean { return m_dict.Contains(object); } method Clear() { m_dict.Clear(); } private method onDestroy() { m_dict.destroy(); } method GetEnumerator()->DictEnumerator_$T$ { return m_dict.GetEnumerator(); } static method create()->thistype { thistype s = thistype.allocate(); s.m_dict = m_dict.create(); return s; } } //! endtextmacro //! runtextmacro HashSetTemplate("integer") //! runtextmacro HashSetTemplate("handle") //! runtextmacro HashSetTemplate("string") } //! endzinc |
| 11-29-2009, 12:18 AM | #3 |
|
| 11-29-2009, 04:39 AM | #4 |
1. I can move the documentation up. That's not a concern. It was simply easier to write the documentation above the code in my Ide, as I could peer down a little bit. 2. I'm not sure what you'd like. I'm not going to document Node, as you can't create a node. I won't document constructors as Node or Enumerator's constructor can't be called by the end user. And the constructor for dictionary doesn't do anything. 3. I really don't like camel case starting with a lower case method names.I program in C# as my day job and the convention is capital method names. :/ I can try to fix that, but I really hate it. 4. The add method is not redundant, as it reports its insertion or failure. I can see a possible usage in something like:: Zinc:if(!Dict.Add(myKey,Value)) { // this key has some value associated do something else. } 6. I'd rather it wasn't. I can't really think of any improvements, but this and collections (which probably will just be renamed List) are going to be used in my fixed version of ItemMenu as I want to change the ItemMenus list on the hero to a hashset, and add a Hero's dictionary which will contain all the heroes instantiated as a unit-struct key-value collection. My question: Is the convention for properties lower case letters too? Because that is ugly. I really do not like the idea of seeing Dict_integer myDict = Dict_integer.create(); return myDict.count; That's horrible :/ Ps: Grim you're the guy who made me paranoid about HT calls so part of the goal of this dictionary is to avoid excess HT calls. I'm actually thinking about make a private insert method, so that Add and operator[]= can call that internally so as to avoid the double fetch that []= does when it doesn't find a node. |
| 11-29-2009, 10:40 AM | #5 |
5. You could use some obscure integer that nobody will ever use instead of 0 as the empty return. |
| 11-29-2009, 12:49 PM | #6 |
That's true, but [] currently inlines to the hashtable native, and 0 is a better return value imo. As this should be used mostly wtih structs, where 0 is nil. If you aren't using structs then that's why TryGetValue exists. |
| 11-29-2009, 02:35 PM | #7 |
Actually Grim, I just realized you probably need a typecast library to use this with handle types as the Node_handle isn't upcastable without it. So its probably best to make it instancable. If only there was some way to pass what Node type to use in the dictionary, then you'd only need one copy of it (you'd still have a generic handle taking key method like [], but when it compiled it would error out on you if you tried to use something like trigger instead of unit). |
| 12-05-2009, 04:18 AM | #8 |
I've updated the code and the documentation. Its more efficient now. I've included HashSet as its basically free. If there is desire I can probably throw in a switch in the macro so that you only have to instance the type you want. I've removed the enumerators as it seems silly to create a heavy object, when you want a light-safe way to enumerate a collection. The replacement is the RO_Dict/SetNode_T types. The enumeration pattern is described in the first post and displayed in my fib library. |
| 12-09-2009, 11:40 PM | #9 |
I've updated once more, and removed trygetvalue testing showed it was slower then if(contains) and another fetch. Is there any other criticism of the library? It performs nearly as fast as table for retrievable, it enumerates as quick as a linked list, updating or removing elements is lightening fast. The only slow point is the add method which still is reasonable. 1000 elements added in 20 ms, vs the about 11 seconds of a linked list. |
| 12-10-2009, 04:14 AM | #10 |
Anyway, you have done a lot of work on this, it is looking pretty useful now. |
| 12-12-2009, 04:52 AM | #11 |
1,2,3 fixable. TryGetValue is gone because it was an awful performer (function space is slower then 2 natives in Jass). Add is prety standard naming convention, I'd like Set and Dict to share the same name. RO_DictNode_T is an awful name I agree, But I can't think of a better one. ReadOnlyDictNodeT is really long. RODN_T/ROSN_T is shorter but pretty bad too. DictNode_T and SetNode_T might be better. Value is of type integer always, where as Object is of type T. I don't want to cause confusion. Yes, I am missing the call. Good catch! I'll fix the documentation. I'll probably introduce a ForEach(func) method, and an enumerator type (usable for any dictionary or set, just requires casting). I'll also be making a modification so that all the Node_T inherit from some interface so as to avoid extra array creation. |
| 12-15-2009, 01:43 AM | #12 |
Updated again. RO_DictNode/RO_SetNode->DictEntry/SetEntry Fixed the documentation, added the ForEach method for set and dictionary (from here I'll refer to them as maps as saying dict/set is really annoying). Made Node_T implement an interface called INode so that less arrays are made. It should be noted that this means in theory your got smaller amount of maps possible as all types here secretly inherit from Nodes. What do you think? |
| 04-16-2010, 03:52 PM | #13 |
Grim, final verdict on this? I haven't really been looking it over like you have, but I feel like I can at least see its use in some sparse cases, so I wouldn't be averse to approving it with your go ahead. |
