| 08-13-2008, 07:21 PM | #1 |
JASS:function StringMatch takes string text,string mask,boolean case returns boolean local string a local string b local string x local string y local integer i = 0 local integer j = 0 local boolean m if case then set a = text set b = mask else set a = StringCase(text, true) set b = StringCase(mask, true) endif set x = SubString(a, 0, 1) set y = SubString(b, 0, 1) if x == null and y == null then return false endif loop if y == null then return x == null elseif y == "#" then exitwhen not (x == "0" or x == "1" or x == "2" or x == "3" or x == "4" or x == "5" or x == "6" or x == "7" or x == "8" or x == "9") elseif y == "\\" then set j = j + 1 set y = SubString(b, j, j + 1) exitwhen x != y elseif y == "?" then // nothing to do elseif y == "[" then set j = j + 1 set y = SubString(b, j, j + 1) exitwhen y == null set m = false if y == "!" then loop set j = j + 1 set y = SubString(b, j, j + 1) if y == null then return false endif exitwhen y == "]" if x == y then set m = true endif endloop exitwhen y != "]" exitwhen m == true else loop if y == null then return false endif if x == y then set m = true endif set j = j + 1 set y = SubString(b, j, j + 1) exitwhen y == "]" endloop exitwhen y != "]" exitwhen m == false endif elseif y == "*" then loop set j = j + 1 set y = SubString(b, j, j + 1) exitwhen y != "*" endloop // if y == null then if StringLength(y) < 1 then return true endif set a = SubString(a, i, StringLength(a)) set b = SubString(b, j, StringLength(b)) loop exitwhen a == null if StringMatch(a, b, case) then return true endif set i = i + 1 set a = SubString(a, 1, StringLength(a)) endloop return false else exitwhen x != y endif set i = i + 1 set j = j + 1 set x = SubString(a, i, i + 1) set y = SubString(b, j, j + 1) endloop return false endfunction Basic case (in)sensitive pattern matching. Supported wildcard characters: * - matches 0 or more any characters ? - matches exactly 1 any character # - matches any digit, 0-9 [list] - matches any character in <list> (l, i, s and t in this example) [!list] - matches any character that isn't in the list Use \\* or \\? or \\[ to match a * or ? or [ respectively. To get a ] in a list, put it as first character of the list. To get a ! in a list, don't put it first. By common convention, special characters *, ? and # have no meaning when used in a list. If "case" is true, the matching is case sensitive. Examples: Code:
match("AceHart", "", case) returns false
match("AceHart", "*", case) returns true
match("AceHart", "*acehart*", false) returns true
match("AceHart", "*e*", case) returns true
match("AceHart", "**test**", case) returns false
match("AceHart", "Ace?art") returns true
match("Gray", "gr[ea]y", false) returns true
"####" matches any 4 digits
"[!0123456789]" matches any character that is not a digit Previous:function match takes string text,string mask returns boolean local string t = SubString(text, 0, 1) local string m = SubString(mask, 0, 1) if m == null then if t == null then return true endif set t = null return false endif if m == "?" then if t == null then return false endif return match(SubString(text, 1, StringLength(text)), SubString(mask, 1, StringLength(mask))) elseif m == "*" then set t = text set m = SubString(mask, 1, StringLength(mask)) if m == null then set t = null return true endif loop if match(t, m) then set t = null set m = null return true endif set t = SubString(t, 1, StringLength(t)) exitwhen t == null endloop if m == null then return true endif set m = null return false elseif m == "\\" then set mask = SubString(mask, 1, StringLength(mask)) if mask == null then set m = null set t = null return false endif set m = SubString(mask, 0, 1) endif if StringCase(m, false) != StringCase(t, false) then set t = null set m = null return false endif set t = null set m = null return match(SubString(text, 1, StringLength(text)), SubString(mask, 1, StringLength(mask))) endfunction |
| 08-13-2008, 07:55 PM | #2 |
did you tested oplimit, what is maximum string lenght allowed? |
| 08-13-2008, 08:39 PM | #3 |
No idea. It depends more on the number of "*" than string length. 500 characters and some mask line "**a**b**c**d**e**f**" can still work. It might fail if it doesn't find anything. My current best guess is that you hit some recursion limit rather than operations limit. |
| 08-14-2008, 01:34 AM | #4 |
This could be really helpful for chat driven npc conversations. Good work! idea: JASS:function onChat takes nothing returns nothing local string in = GetEventPlayerChatString() if match(in, "*hi*") or match(in, "*hello*") then call ConversationStart() elseif match(in, "*bye*") or match(in, "*see ya*") then call ConversationEnd() endif endfunction |
| 08-14-2008, 07:35 AM | #5 |
It definitely needs a better naming convention, though. "match" just doesn't cut it. Maybe "IsStringMatch" or something would work better. |
| 08-15-2008, 01:07 AM | #6 |
Sweet. I was just working on a chat-driven NPC dialogue system. |
| 08-15-2008, 03:25 AM | #7 |
I got my doubts about the implementation, It seems at first glance the recursion is not helping. I got the impression the "*" algorithm got factorial time complexity which would be quite bad, I am asleep to really read code, will do a more serious review later. Edit: You don't need to null strings. Edit II: Yes, factorial time complexity. if match("AAAAAAAAAAA","*A*A*A*A*A*A*B") then //crash the thread Edit III: I am working on a version that will have polynomial complexity, if it works, I'll post it... Tide-Arc Ephemera: Removed your spam. |
| 08-15-2008, 05:04 AM | #8 |
This got O( length(text) * length(mask) ) time complexity: ouch:// // Seems 1.5 years of programming contest really help! ... In war3 map making stuff... // // // Sorry, but this is the sort of thing that requires a 2d array, and this is vJass for // a 2d array... if you want, you can manage to do the whole thing using a single array // and internally doing all that offset stuff, it will just make the match code even harder // tounderstand though, that's the reason I didn't do it. // private struct dp static integer width=50 static method operator [] takes integer x returns dp return dp(x* dp.width) endmethod private boolean value method operator [] takes integer i returns boolean return dp( integer(this)+i ).value endmethod method operator []= takes integer i, boolean v returns nothing set dp( integer(this)+i ).value=v endmethod endstruct function match takes string text,string mask returns boolean local integer lt=StringLength(text) local integer lm=StringLength(mask) local integer i local integer k local integer j local string ch set text=StringCase(test,false) set mask=StringCase(mask,false) set dp.width=lm+1 set dp[lt][lm] = true set i=lt-1 loop exitwhen (i<0) set dp[i][lm]=false set i=i-1 endloop set i=lt loop exitwhen (i<0) set j=lm-1 loop exitwhen(j<0) set ch=SubString(mask,j,j+1) if(ch=="\\") then set ch=SubString(mask,j+1,j+2) set dp[i][j] = ( (dp[i+1][j+2]) and (ch==SubString(text,i,i+1)) ) elseif(ch=="?") then set dp[i][j] = dp[i+1][j+1] elseif(ch=="*") then set dp[i][j] = ( dp[i][j+1] or ( (i<lt) and dp[i+1][j]) ) else set dp[i][j] = ( (dp[i+1][j+1]) and (ch==SubString(text,i,i+1)) ) endif set j=j-1 endloop set i=i-1 endloop //call dp.show(lt,lm) return dp[0][0] endfunction The problem is that the original match() implementation has around O(text) best case complexity, while mine takes O(text*mask) time on every test case, so, with some simple masks and long texts my implementation will thread crash while AceHart's won't. However, mine is more predictable and you would know that with a certain text size and mask size it will always take the same amount of time/operations, regardless of what the mask or text contain. The implementation I posted also 'leaks' less strings. If there was a quick way to fill an array with -1, I could do memo-ization instead of dp and get an implementation that got the best of both worlds. Now that I notice (while typing this paragraph) there is a way to do memo-ization without filling up with -1, the code will look funny though, going to try tomorrow when there's more time. |
| 08-15-2008, 05:31 PM | #9 |
Well, this DFA certainly gets my vote for top runner in the next OJCC (Obfuscated JASS Code Contest). Post 1 has been updated ever so slightly. Tail recursion removed, less SubStrings and some additional features. |
| 08-15-2008, 06:54 PM | #10 |
im making a Regular Expression system for wc3 to further improve this idea of matching strings |
| 08-15-2008, 07:20 PM | #11 |
if you can script Perl-like regex (or at least php-like regex), you're my personal hero... |
| 08-15-2008, 08:15 PM | #12 |
tats what im trying to do |
| 08-15-2008, 10:27 PM | #13 | |
Quote:
|
| 08-16-2008, 05:31 AM | #14 |
On a vaguely related note, consider the following two functions: JASS:function feature1 takes nothing returns string local string s = "aaa" local string c local integer i = 0 loop set c = SubString(s, i, i + 1) set i = i + 1 exitwhen c == null endloop if c == null then return "NULL" endif return "???" endfunction function feature2 takes nothing returns string local string s = "aaa" local string c local integer i = 0 loop set c = SubString(s, i, i + 1) set i = i + 1 exitwhen c != "a" endloop if c == null then return "NULL" endif return "???" endfunction And some basic call: call DisplayTextToPlayer(GetLocalPlayer(), 0, 0, feature1()) call DisplayTextToPlayer(GetLocalPlayer(), 0, 0, feature2()) Personally, I was expecting both to say "NULL". However, only the first one does... EDIT: Continued here. |
| 08-16-2008, 12:37 PM | #15 |
It seems that is better to use an empty string "" instead of null Also display the iterations (i) and the string c of your loops it's pretty strange. |
