HomeUser Control Panel (unavailable in archive)ForumsTutorialsArt GalleryResourcesMaps

StringMatch

08-13-2008, 07:21 PM#1
AceHart
Collapse JASS:
function StringMatch takes string text,string mask,boolean case returns boolean
    local string a
    local string b
    local string x
    local string y
    local integer i = 0
    local integer j = 0
    local boolean m

    if case then
        set a = text
        set b = mask
    else
        set a = StringCase(text, true)
        set b = StringCase(mask, true)
    endif

    set x = SubString(a, 0, 1)
    set y = SubString(b, 0, 1)
    if x == null and y == null then
        return false
    endif

    loop
        if y == null then
            return x == null
        elseif y == "#" then
            exitwhen not (x == "0" or x == "1" or x == "2" or x == "3" or x == "4" or x == "5" or x == "6" or x == "7" or x == "8" or x == "9")
        elseif y == "\\" then
            set j = j + 1
            set y = SubString(b, j, j + 1)
            exitwhen x != y
        elseif y == "?" then
            // nothing to do
        elseif y == "[" then
            set j = j + 1
            set y = SubString(b, j, j + 1)
            exitwhen y == null
            set m = false
            if y == "!" then
                loop
                    set j = j + 1
                    set y = SubString(b, j, j + 1)
                    if y == null then
                        return false
                    endif
                    exitwhen y == "]"
                    if x == y then
                        set m = true
                    endif
                endloop
                exitwhen y != "]"
                exitwhen m == true
            else
                loop
                    if y == null then
                        return false
                    endif
                    if x == y then
                        set m = true
                    endif
                    set j = j + 1
                    set y = SubString(b, j, j + 1)
                    exitwhen y == "]"
                endloop
                exitwhen y != "]"
                exitwhen m == false
            endif
        elseif y == "*" then
            loop
                set j = j + 1
                set y = SubString(b, j, j + 1)
                exitwhen y != "*"
            endloop
//			if y == null then
            if StringLength(y) < 1 then
                return true
            endif
            set a = SubString(a, i, StringLength(a))
            set b = SubString(b, j, StringLength(b))
            loop
                exitwhen a == null
                if StringMatch(a, b, case) then
                    return true
                endif
                set i = i + 1
                set a = SubString(a, 1, StringLength(a))
            endloop
            return false
        else
            exitwhen x != y
        endif

        set i = i + 1
        set j = j + 1
        set x = SubString(a, i, i + 1)
        set y = SubString(b, j, j + 1)
    endloop
    return false
endfunction


Basic case (in)sensitive pattern matching.

Supported wildcard characters:
* - matches 0 or more any characters
? - matches exactly 1 any character
# - matches any digit, 0-9
[list] - matches any character in <list> (l, i, s and t in this example)
[!list] - matches any character that isn't in the list


Use \\* or \\? or \\[ to match a * or ? or [ respectively.
To get a ] in a list, put it as first character of the list.
To get a ! in a list, don't put it first.
By common convention, special characters *, ? and # have no meaning when used in a list.

If "case" is true, the matching is case sensitive.


Examples:
Code:
match("AceHart", "", case) returns false
match("AceHart", "*", case) returns true
match("AceHart", "*acehart*", false) returns true
match("AceHart", "*e*", case) returns true
match("AceHart", "**test**", case) returns false
match("AceHart", "Ace?art") returns true

match("Gray", "gr[ea]y", false) returns true

"####" matches any 4 digits
"[!0123456789]" matches any character that is not a digit


Expand Previous:
08-13-2008, 07:55 PM#2
DioD
did you tested oplimit, what is maximum string lenght allowed?
08-13-2008, 08:39 PM#3
AceHart
No idea.

It depends more on the number of "*" than string length.

500 characters and some mask line "**a**b**c**d**e**f**" can still work.
It might fail if it doesn't find anything.
My current best guess is that you hit some recursion limit rather than operations limit.
08-14-2008, 01:34 AM#4
Ammorth
This could be really helpful for chat driven npc conversations. Good work!

idea:
Collapse JASS:
function onChat takes nothing returns nothing
    local string in = GetEventPlayerChatString()
    if match(in, "*hi*") or match(in, "*hello*") then
        call ConversationStart()
    elseif match(in, "*bye*") or match(in, "*see ya*") then
        call ConversationEnd()
    endif
endfunction
08-14-2008, 07:35 AM#5
Rising_Dusk
It definitely needs a better naming convention, though. "match" just doesn't cut it. Maybe "IsStringMatch" or something would work better.
08-15-2008, 01:07 AM#6
Szythe
Sweet. I was just working on a chat-driven NPC dialogue system.
08-15-2008, 03:25 AM#7
Vexorian
I got my doubts about the implementation, It seems at first glance the recursion is not helping. I got the impression the "*" algorithm got factorial time complexity which would be quite bad, I am asleep to really read code, will do a more serious review later.

Edit: You don't need to null strings.

Edit II: Yes, factorial time complexity.

if match("AAAAAAAAAAA","*A*A*A*A*A*A*B") then //crash the thread
Edit III: I am working on a version that will have polynomial complexity, if it works, I'll post it...


Tide-Arc Ephemera: Removed your spam.
08-15-2008, 05:04 AM#8
Vexorian
This got O( length(text) * length(mask) ) time complexity:

Expand ouch:

The problem is that the original match() implementation has around O(text) best case complexity, while mine takes O(text*mask) time on every test case, so, with some simple masks and long texts my implementation will thread crash while AceHart's won't. However, mine is more predictable and you would know that with a certain text size and mask size it will always take the same amount of time/operations, regardless of what the mask or text contain. The implementation I posted also 'leaks' less strings.

If there was a quick way to fill an array with -1, I could do memo-ization instead of dp and get an implementation that got the best of both worlds. Now that I notice (while typing this paragraph) there is a way to do memo-ization without filling up with -1, the code will look funny though, going to try tomorrow when there's more time.
08-15-2008, 05:31 PM#9
AceHart
Well, this DFA certainly gets my vote for top runner in the next OJCC (Obfuscated JASS Code Contest).

Post 1 has been updated ever so slightly.
Tail recursion removed, less SubStrings and some additional features.
08-15-2008, 06:54 PM#10
MaD[Lion]
im making a Regular Expression system for wc3 to further improve this idea of matching strings
08-15-2008, 07:20 PM#11
Skater
if you can script Perl-like regex (or at least php-like regex), you're my personal hero...
08-15-2008, 08:15 PM#12
MaD[Lion]
tats what im trying to do
08-15-2008, 10:27 PM#13
Vexorian
Quote:
Originally Posted by AceHart
Well, this DFA certainly gets my vote for top runner in the next OJCC (Obfuscated JASS Code Contest).

Post 1 has been updated ever so slightly.
Tail recursion removed, less SubStrings and some additional features.
Well whatever, if people actually use such complicated masks that could make this crash it is their fault, approved under the subtitle "Could use a better name"
08-16-2008, 05:31 AM#14
AceHart
On a vaguely related note, consider the following two functions:

Collapse JASS:
function feature1 takes nothing returns string
    local string s = "aaa"
    local string c
    local integer i = 0

    loop
        set c = SubString(s, i, i + 1)
        set i = i + 1
        exitwhen c == null
    endloop
    if c == null then
        return "NULL"
    endif
    return "???"
endfunction

function feature2 takes nothing returns string
    local string s = "aaa"
    local string c
    local integer i = 0

    loop
        set c = SubString(s, i, i + 1)
        set i = i + 1
        exitwhen c != "a"
    endloop
    if c == null then
        return "NULL"
    endif
    return "???"
endfunction

And some basic call:
call DisplayTextToPlayer(GetLocalPlayer(), 0, 0, feature1())
call DisplayTextToPlayer(GetLocalPlayer(), 0, 0, feature2())

Personally, I was expecting both to say "NULL".
However, only the first one does...

EDIT:
Continued here.
08-16-2008, 12:37 PM#15
Troll-Brain
It seems that is better to use an empty string "" instead of null

Also display the iterations (i) and the string c of your loops it's pretty strange.