
StringLib

07-18-2009, 12:30 PM#1
ToukoAozaki
StringLib is a set of functions/structs to help manipulate strings.

Requires JassHelper.

Available APIs:
  • function StringTrim takes string str returns string
    Creates and returns a new string with all leading and trailing whitespace trimmed from str.
  • function StringTrimLeft takes string str returns string
    Creates and returns a new string with all leading whitespace trimmed from str.
  • function StringTrimRight takes string str returns string
    Creates and returns a new string with all trailing whitespace trimmed from str.
  • function StringIndexOf takes string source, string toFind, boolean caseSensitive returns integer
    Finds the index of toFind in source, searching from the start. Returns STRING_INDEX_NONE if toFind is not found.
  • function StringIndexOfReverse takes string source, string toFind, boolean caseSensitive returns integer
    Finds the index of toFind in source, searching from the end. Returns STRING_INDEX_NONE if toFind is not found.
  • function StringReplace takes string source, string old, string new, boolean caseSensitive returns string
    Creates and returns a new string by replacing every occurrence of old in source with new.
  • function StringHashCS takes string s returns integer
    Computes a case-sensitive hash of the string s.
  • function IsStringAscii takes string s returns boolean
    If s consists only of ASCII characters, returns true. Otherwise, returns false.
  • function StringAsciiAt takes string s, integer index returns integer
    Returns the ASCII value of the character at position index in s.
  • function StringLengthUtf8 takes string s returns integer
    Returns the actual length of s in terms of Unicode characters.
  • function IsStringValidUtf8 takes string s returns boolean
    If s is a valid UTF-8 string, returns true. Otherwise, returns false.
  • struct StringSegments
    StringSegments splits a source string into multiple strings, divided by a delimiter (separator) string you provide. It can be used to split and extract the components of a text command from a single input string typed by the user, such as color components or a list of items.
    The interfaces are inspired by StringTokenizer in Java.

    Interfaces:
    • public static method create takes string source, string delimiter returns StringSegments
      Creates a StringSegments instance with supplied source and delimiter (separator) strings.
    • public method countSegments takes nothing returns integer
      Returns the number of .nextSegment calls remaining.
    • public method hasMoreSegments takes nothing returns boolean
      If there are more segments to return, returns true. Otherwise, returns false.
    • public method nextSegment takes nothing returns string
      Gets the next string segment using the current delimiter (separator), which defaults to the one supplied to .create(). If there are no more segments, this method returns STRING_INVALID_SEGMENT.
    • public method nextSegmentEx takes string delimiter returns string
      Gets the next string segment using delimiter as the delimiter (separator). After the call, delimiter becomes the current delimiter. If there are no more segments, this method returns STRING_INVALID_SEGMENT.

    How to use:
    1. Create an instance by calling StringSegments.create(source, delimiter)
    2. Check .hasMoreSegments(). If true, continue processing.
    3. Use .nextSegment() or .nextSegmentEx(delimiter) to get segments.
    4. When done, call .destroy() to clean up properly.
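
The four steps above can be sketched in vJass roughly as follows. This is only a minimal sketch, assuming StringLib is already in the map and that the API behaves as described in the list above; the scope name StringSegmentsDemo, the function PrintSegments, and the sample input are made up for illustration.

```jass
// Illustration only: assumes the StringLib code from this thread is in the map.
// A scope is used so the code is placed after all libraries without
// having to guess the library name in a "requires" clause.
scope StringSegmentsDemo

    // Hypothetical helper: prints every comma-separated piece of input.
    public function PrintSegments takes string input returns nothing
        // Step 1: create the instance with a source and a delimiter.
        local StringSegments segs = StringSegments.create(StringTrim(input), ",")

        loop
            // Step 2: stop once no segments remain.
            exitwhen not segs.hasMoreSegments()
            // Step 3: fetch the next segment and trim whitespace around it.
            call BJDebugMsg("[" + StringTrim(segs.nextSegment()) + "]")
        endloop

        // Step 4: clean up the struct instance.
        call segs.destroy()
    endfunction

endscope
```

From another trigger this could be called as StringSegmentsDemo_PrintSegments("255, 0, 128"), which should print one bracketed line per color component if splitting works as documented.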

07-18-2009, 03:27 PM#2
Seshiro
Looks good, but could you maybe add a .reset() method to reset the tokenizer so it can be reused, and maybe add something that saves all members when they are used, so a second use will be faster :)

Greez
07-18-2009, 03:39 PM#3
Kwah
Tokenizer is a pretty damn strange name for it.
07-18-2009, 03:53 PM#4
Seshiro
Maybe Php-Style: exploder :D
07-18-2009, 04:01 PM#5
Kwah
Split, would seem like a more normal one to me, but meh. Good job on it though.
07-18-2009, 04:07 PM#6
ToukoAozaki
Thanks for the feedback. StringSplitter seems better to me, too. However, I don't have any good method names for that :(

Quote:
Looks good, but could you maybe add a .reset() method to reset the tokenizer so it can be reused, and maybe add something that saves all members when they are used, so a second use will be faster :)

Good idea, but won't be possible due to custom delimiter...
07-18-2009, 05:40 PM#7
Rising_Dusk
Your submissions should have paragraphs explaining the value, purpose, and intended use in layman's terms. Right now, I'm sitting here reading the first post to no avail.
07-18-2009, 05:44 PM#8
Kwah
Splits strings into sub-strings based on a character separator, which is often whitespace?
07-18-2009, 06:28 PM#9
Seshiro
You could also use commas for a kind of comma-separated list. That would be fucking cool for a map with many combinable game modes like DotA... just separate the game modes by comma or whitespace...

And what would be great too, though I think it's kinda unnecessary, would be the whole thing in reverse order...

Greez
07-18-2009, 10:24 PM#10
Anitarf
What if I want the delimiter to be ", "?
07-19-2009, 01:04 AM#11
ToukoAozaki
Quote:
Originally Posted by Anitarf
What if I want the delimiter to be ", "?

No problem at all. The code is designed to take care of delimiters longer than one character; the only drawback is the longer the delimiter, the more combinations will be added to the string table.
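
As a rough illustration of a multi-character delimiter, the sketch below splits a list of game modes on ", ". It assumes the StringSegments API described in the first post; the scope name, the function ListModes, and the mode strings in the usage note are hypothetical.

```jass
// Illustration only: assumes the StringLib code from this thread is in the map.
scope GameModeParsing

    // Hypothetical helper: lists each mode from a ", "-separated string.
    public function ListModes takes string modes returns nothing
        // The delimiter is two characters long: a comma followed by a space.
        local StringSegments segs = StringSegments.create(modes, ", ")

        // countSegments reports how many nextSegment calls are left.
        call BJDebugMsg(I2S(segs.countSegments()) + " mode(s) given")

        loop
            exitwhen not segs.hasMoreSegments()
            call BJDebugMsg("mode: " + segs.nextSegment())
        endloop

        call segs.destroy()
    endfunction

endscope
```

A call such as GameModeParsing_ListModes("ap, sd, em") would be one way to try it.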
08-20-2009, 09:59 PM#12
Anitarf
Quote:
Originally Posted by ToukoAozaki
No problem at all. The code is designed to take care of delimiters longer than one character; the only drawback is the longer the delimiter, the more combinations will be added to the string table.

It would be quite possible to avoid that if the string were compared to the delimiter one character at a time.
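
A character-by-character comparison along those lines might look like the sketch below. It is not taken from StringLib; DelimiterMatchesAt is a hypothetical helper that uses only the SubString and StringLength natives. Because it only ever creates one-character substrings, it should add at most one string table entry per distinct character, rather than one per delimiter-length window of the source.

```jass
// Hypothetical helper, not part of StringLib.
// Returns true if delimiter occurs in source starting at index pos.
function DelimiterMatchesAt takes string source, integer pos, string delimiter returns boolean
    local integer len = StringLength(delimiter)
    local integer i = 0

    // An empty delimiter or an out-of-range window never matches.
    if len == 0 or pos < 0 or pos + len > StringLength(source) then
        return false
    endif

    loop
        exitwhen i >= len
        // Compare one character at a time; only one-character strings are created.
        if SubString(source, pos + i, pos + i + 1) != SubString(delimiter, i, i + 1) then
            return false
        endif
        set i = i + 1
    endloop

    return true
endfunction
```

The trade-off mentioned in the next reply still applies: the loop costs more operations per position than comparing a single SubString against the delimiter.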
08-21-2009, 02:18 PM#13
ToukoAozaki
Quote:
Originally Posted by Anitarf
It would be quite possible to avoid that if the string were compared to the delimiter one character at a time.

That is possible, but I'm afraid it would add extra complexity. Also, I now think a few extra entries in the string table would be a trivial issue.

BTW I'm trying to make some test cases, making sure it guarantees the same behavior as StringTokenizer from Java.
09-09-2009, 07:19 PM#14
Rising_Dusk
I would rename the library to "StringSegments," because that makes it more obvious as to what the heck this does. I've since figured out what it does, but at first glance the name Tokenizer is very confusing.

Also, I can see a very niche application for this. If you have tested this thoroughly and have verified that it works for all of your test cases, then fix the script name and I think that this can be approved. (Unless Anitarf wants to chime in and follow-up on his previous comment about the string table growth)
09-14-2009, 04:53 AM#15
ToukoAozaki
Quote:
Originally Posted by Rising_Dusk
I would rename the library to "StringSegments," because that makes it more obvious as to what the heck this does. I've since figured out what it does, but at first glance the name Tokenizer is very confusing.

Also, I can see a very niche application for this. If you have tested this thoroughly and have verified that it works for all of your test cases, then fix the script name and I think that this can be approved. (Unless Anitarf wants to chime in and follow-up on his previous comment about the string table growth)

Aye. StringSegments sounds good enough for me. BTW the work is still in progress...

Edit: It seems that the one in Java treats the delimiter string as a set of delimiter characters. This means a single delimiter cannot be longer than one character. Now I have to decide whether to stick with the original behavior or follow the Java one. Any suggestions on this?
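
To make the difference concrete, here is a small sketch of the Java-style reading, where every character of the delimiter string is a separator on its own. IsDelimiterChar is a hypothetical helper, not something from StringLib.

```jass
// Hypothetical helper, not part of StringLib.
// Java-style reading: delimiters is a set of single characters.
// c is expected to be a one-character string.
function IsDelimiterChar takes string c, string delimiters returns boolean
    local integer len = StringLength(delimiters)
    local integer i = 0

    loop
        exitwhen i >= len
        // c is a separator if it equals any single character of the set.
        if SubString(delimiters, i, i + 1) == c then
            return true
        endif
        set i = i + 1
    endloop

    return false
endfunction
```

With the delimiter string ", " and the source "a, b,c d", Java's StringTokenizer yields a, b, c and d, because both the comma and the space separate tokens. A whole-string delimiter, as in the original code, would presumably split only at the one ", " and give "a" and "b,c d".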