Has anyone implemented a Regex and/or Xml parser around StringBuilders or Streams?

误落风尘 2020-12-16 02:20

I'm building a stress-testing client that hammers servers and analyzes responses using as many threads as the client can muster. I'm constantly finding myself throttled by

3 answers
  • 2020-12-16 02:56

    Try this. Everything is char-based and relatively low-level for efficiency. Any number of wildcards can be used in a query; however, your * is now ✪ and your ? is now ★. Around three days of work went into this to make it as clean as possible. You can even run multiple queries in one sweep!

    Example usage: wildcard(new StringBuilder("Hello and welcome"), "hello✪w★l", "be") results in "become".

    ////////////////////////////////////////////////////////////////////////////////////////////////////////
    ///////////// Search for a string/s inside 'text' using the 'find' parameter, and replace with a string/s using the replace parameter
    // ✪ represents multiple wildcard characters (non-greedy)
    // ★ represents a single wildcard character
    public StringBuilder wildcard(StringBuilder text, string find, string replace, bool caseSensitive = false)
    {
        return wildcard(text, new string[] { find }, new string[] { replace }, caseSensitive);
    }
    public StringBuilder wildcard(StringBuilder text, string[] find, string[] replace, bool caseSensitive = false)
    {
        if (text.Length == 0) return text;          // Degenerate case
    
        StringBuilder sb = new StringBuilder();     // The new adjusted string with replacements
        for (int i = 0; i < text.Length; i++)   {   // Go through every letter of the original large text
    
            bool foundMatch = false;                // Assume match hasn't been found to begin with
            for(int q=0; q< find.Length; q++) {     // Go through each query in turn
                if (find[q].Length == 0) continue;  // Ignore empty queries
    
                int f = 0;  int g = 0;              // Query cursor and text cursor
                bool multiWild = false;             // multiWild is ✪ symbol which represents many wildcard characters
                int multiWildPosition = 0;          
    
                while(true) {                       // Loop through query characters
                    if (f >= find[q].Length || (i + g) >= text.Length) break;       // Bounds checking
                    char cf = find[q][f];                                           // Character in the query (f is the offset)
                    char cg = text[i + g];                                          // Character in the text (g is the offset)
                    if (!caseSensitive) { cg = char.ToLowerInvariant(cg); cf = char.ToLowerInvariant(cf); }   // Fold both sides so a mixed-case query still matches
                    if (cf != '★' && cf != '✪' && cg != cf && !multiWild) break;        // Break search, and thus no match is found
                    if (cf == '✪') { multiWild = true; multiWildPosition = f; f++; continue; }              // Multi-char wildcard activated. Move query cursor, and reloop
                    if (multiWild && cg != cf && cf != '★') { f = multiWildPosition + 1; g++; continue; }   // Mismatch while MultiWild is active: let ✪ absorb this text char and return query cursor to just after the MultiWild position
                    f++; g++;                                                           // Reaching here means that a single character was matched, so move both query and text cursor along one
                }
    
                if (f == find[q].Length) {          // If true, query cursor has reached the end of the query, so a match has been found!!!
                    sb.Append(replace[q]);          // Append replacement
                    foundMatch = true;
                    if (find[q][f - 1] == '✪') { i = text.Length; break; }      // If the MultiWild is the last char in the query, then the rest of the string is a match, and so close off
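                    i += g - 1;                                                 // Move text cursor along by the amount equivalent to its found match
                    break;                                                      // Stop trying further queries at this position, otherwise a second replacement could be appended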
                }
            }
            if (!foundMatch) sb.Append(text[i]);    // If a match wasn't found at that point in the text, then just append the original character
        }
        return sb;
    }
    
  • 2020-12-16 02:56

    The Mono project has switched the license for its core libraries to the MIT/X11 license. If you need a regex library customized for performance in your particular application, you should be able to start from the latest code in Mono's implementation of the System library.

  • 2020-12-16 03:02

    XmlReader is a stream-based XML parser. See http://msdn.microsoft.com/en-us/library/756wd7zs.aspx
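
    As a rough sketch of what that looks like in practice: XmlReader.Create accepts a Stream directly, so a response never has to be copied into a string first. The class name StreamingXmlSketch and the helper CountElements are hypothetical names made up for this illustration, and counting element names is only a stand-in for whatever analysis the client actually needs.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Xml;

    public static class StreamingXmlSketch
    {
        // Hypothetical helper: tallies element names while reading
        // forward-only from the stream, without building a DOM or a string.
        public static Dictionary<string, int> CountElements(Stream input)
        {
            var counts = new Dictionary<string, int>();
            var settings = new XmlReaderSettings
            {
                IgnoreWhitespace = true,
                IgnoreComments = true
            };

            using (XmlReader reader = XmlReader.Create(input, settings))
            {
                // Read() advances one node at a time; only the current node is held.
                while (reader.Read())
                {
                    if (reader.NodeType != XmlNodeType.Element) continue;

                    int seen;
                    counts.TryGetValue(reader.Name, out seen);   // seen stays 0 if the name is new
                    counts[reader.Name] = seen + 1;
                }
            }
            return counts;
        }
    }
    ```

    In a client like the one described in the question, the Stream passed in would presumably be the response stream itself (an assumption about the setup, since the question does not show it); for a quick check, a MemoryStream over a few UTF-8 bytes of XML works the same way.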
