Need to perform Wildcard (*,?, etc) search on a string using Regex

执笔经年 2020-11-27 04:35

I need to perform Wildcard (*, ?, etc.) search on a string. This is what I have done:

string input = "Message";
string pattern =          


        
10 Answers
  • 2020-11-27 05:01

    You may want to use the WildcardPattern class from the System.Management.Automation assembly. See my answer here.

  • 2020-11-27 05:02

    You need to convert your wildcard expression to a regular expression. For example:

        private bool WildcardMatch(String s, String wildcard, bool case_sensitive)
        {
            // Replace the * with an .* and the ? with a dot. Put ^ at the
            // beginning and a $ at the end
            String pattern = "^" + Regex.Escape(wildcard).Replace(@"\*", ".*").Replace(@"\?", ".") + "$";
    
            // Now, run the Regex as you already know
            Regex regex;
            if(case_sensitive)
                regex = new Regex(pattern);
            else
                regex = new Regex(pattern, RegexOptions.IgnoreCase);
    
            return(regex.IsMatch(s));
        } 
    
  • 2020-11-27 05:12

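    In a regular expression, d* matches zero or more "d" characters, so every string is a valid match (an empty match always succeeds). Try d+ instead!

    To support wildcard patterns, I would replace the wildcards with their regex equivalents: * becomes .* and ? becomes . (or .? if ? should also be allowed to match zero characters). Your expression above then becomes d.*, which should be anchored as ^d.*$ so that it has to match the whole string.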
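
    To make the difference concrete, here is a minimal sketch that runs the patterns discussed above against the question's input. The class name and the Finds helper are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarQuantifierDemo
{
    // Hypothetical helper: true if the regex finds a match anywhere in the input.
    public static bool Finds(string input, string regexPattern) =>
        Regex.IsMatch(input, regexPattern, RegexOptions.IgnoreCase);

    public static void Demo()
    {
        // As a regex, d* also matches the empty string, so it succeeds on any input.
        Console.WriteLine(Finds("Message", "d*"));     // True
        // d+ requires at least one "d", and "Message" contains none.
        Console.WriteLine(Finds("Message", "d+"));     // False
        // The anchored translation of the wildcard d* requires the string to start with "d".
        Console.WriteLine(Finds("Message", "^d.*$"));  // False
        Console.WriteLine(Finds("Dog", "^d.*$"));      // True (case-insensitive)
    }
}
```

    Calling StarQuantifierDemo.Demo() prints the four results in the order shown in the comments.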

  • 2020-11-27 05:14

    From http://www.codeproject.com/KB/recipes/wildcardtoregex.aspx:

    public static string WildcardToRegex(string pattern)
    {
        return "^" + Regex.Escape(pattern)
                          .Replace(@"\*", ".*")
                          .Replace(@"\?", ".")
                   + "$";
    }
    

    So something like foo*.xls? will get transformed to ^foo.*\.xls.$.
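
    As a rough usage sketch, the converted pattern can be passed straight to Regex.IsMatch. The WildcardToRegex method is the one quoted above; the class name and the IsWildcardMatch wrapper are hypothetical, and the case-insensitive option is an assumption that you may not want.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardToRegexDemo
{
    // Same conversion as quoted above: escape everything, then re-enable * and ?.
    public static string WildcardToRegex(string pattern) =>
        "^" + Regex.Escape(pattern)
                   .Replace(@"\*", ".*")
                   .Replace(@"\?", ".")
            + "$";

    // Hypothetical wrapper: case-insensitive match of the whole input against a wildcard.
    public static bool IsWildcardMatch(string input, string wildcard) =>
        Regex.IsMatch(input, WildcardToRegex(wildcard), RegexOptions.IgnoreCase);

    public static void Demo()
    {
        Console.WriteLine(WildcardToRegex("foo*.xls?"));               // ^foo.*\.xls.$
        Console.WriteLine(IsWildcardMatch("foo1.xlsx", "foo*.xls?"));  // True
        // "?" needs exactly one more character after "xls".
        Console.WriteLine(IsWildcardMatch("foo1.xls", "foo*.xls?"));   // False
        // The question's input: "Message" does not start with "d".
        Console.WriteLine(IsWildcardMatch("Message", "d*"));           // False
        Console.WriteLine(IsWildcardMatch("Message", "m*"));           // True
    }
}
```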

  • 2020-11-27 05:15

    Windows and *nix treat wildcards differently. Windows processes *, ? and . in a very complex way: the presence or position of one can change the meaning of another. *nix keeps it simple and performs one straightforward pattern match. In addition, Windows lets ? match zero or one character, whereas Linux requires it to match exactly one character.
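
    The ? difference alone can be illustrated with a small sketch. It only swaps the regex fragment substituted for ? and does not attempt to reproduce the full Windows rules; the class, method and constant names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuestionMarkDemo
{
    public const string ExactlyOne = ".";     // *nix-style: "?" is exactly one character
    public const string ZeroOrOne = "[^.]?";  // Windows-like: "?" is zero or one non-dot character

    // Hypothetical helper: translate the wildcard, using the given regex fragment for "?".
    public static bool Matches(string fileName, string wildcard, string questionMarkRegex)
    {
        var regex = "^" + Regex.Escape(wildcard)
                               .Replace(@"\*", ".*")
                               .Replace(@"\?", questionMarkRegex)
                    + "$";
        return Regex.IsMatch(fileName, regex, RegexOptions.IgnoreCase);
    }

    public static void Demo()
    {
        // "file.txt" has no character between "file" and ".txt".
        Console.WriteLine(Matches("file.txt", "file?.txt", ExactlyOne));   // False
        Console.WriteLine(Matches("file.txt", "file?.txt", ZeroOrOne));    // True
        // With exactly one character in that position, both styles agree.
        Console.WriteLine(Matches("file1.txt", "file?.txt", ExactlyOne));  // True
        Console.WriteLine(Matches("file1.txt", "file?.txt", ZeroOrOne));   // True
    }
}
```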

    I could not find authoritative documentation on this, so what follows is only my conclusion from days of testing on Windows 8/XP (the command line, specifically the dir command; the Directory.GetFiles method follows the same rules) and Ubuntu Server 12.04.1 (the ls command). I got dozens of common and uncommon cases to work, although there are many failing cases too.

    The current answer by Gabe works like *nix. If you also want a Windows-style matcher and are willing to accept its imperfections, here it is:

        /// <summary>
        /// <para>Tests if a file name matches the given wildcard pattern, uses the same rule as shell commands.</para>
        /// </summary>
        /// <param name="fileName">The file name to test, without folder.</param>
        /// <param name="pattern">A wildcard pattern which can use char * to match any amount of characters; or char ? to match one character.</param>
        /// <param name="unixStyle">If true, use the *nix style wildcard rules; otherwise use windows style rules.</param>
        /// <returns>true if the file name matches the pattern, false otherwise.</returns>
        public static bool MatchesWildcard(this string fileName, string pattern, bool unixStyle)
        {
            if (fileName == null)
                throw new ArgumentNullException("fileName");
    
            if (pattern == null)
                throw new ArgumentNullException("pattern");
    
            if (unixStyle)
                return WildcardMatchesUnixStyle(pattern, fileName);
    
            return WildcardMatchesWindowsStyle(fileName, pattern);
        }
    
        private static bool WildcardMatchesWindowsStyle(string fileName, string pattern)
        {
            var dotdot = pattern.IndexOf("..", StringComparison.Ordinal);
            if (dotdot >= 0)
            {
                for (var i = dotdot; i < pattern.Length; i++)
                    if (pattern[i] != '.')
                        return false;
            }
    
            var normalized = Regex.Replace(pattern, @"\.+$", "");
            var endsWithDot = normalized.Length != pattern.Length;
    
            var endWeight = 0;
            if (endsWithDot)
            {
                var lastNonWildcard = normalized.Length - 1;
                for (; lastNonWildcard >= 0; lastNonWildcard--)
                {
                    var c = normalized[lastNonWildcard];
                    if (c == '*')
                        endWeight += short.MaxValue;
                    else if (c == '?')
                        endWeight += 1;
                    else
                        break;
                }
    
                if (endWeight > 0)
                    normalized = normalized.Substring(0, lastNonWildcard + 1);
            }
    
            var endsWithWildcardDot = endWeight > 0;
            var endsWithDotWildcardDot = endsWithWildcardDot && normalized.EndsWith(".");
            if (endsWithDotWildcardDot)
                normalized = normalized.Substring(0, normalized.Length - 1);
    
            normalized = Regex.Replace(normalized, @"(?!^)(\.\*)+$", @".*");
    
            var escaped = Regex.Escape(normalized);
            string head, tail;
    
            if (endsWithDotWildcardDot)
            {
                head = "^" + escaped;
                tail = @"(\.[^.]{0," + endWeight + "})?$";
            }
            else if (endsWithWildcardDot)
            {
                head = "^" + escaped;
                tail = "[^.]{0," + endWeight + "}$";
            }
            else
            {
                head = "^" + escaped;
                tail = "$";
            }
    
            if (head.EndsWith(@"\.\*") && head.Length > 5)
            {
                head = head.Substring(0, head.Length - 4);
                tail = @"(\..*)?" + tail;
            }
    
            var regex = head.Replace(@"\*", ".*").Replace(@"\?", "[^.]?") + tail;
            return Regex.IsMatch(fileName, regex, RegexOptions.IgnoreCase);
        }
    
        private static bool WildcardMatchesUnixStyle(string pattern, string text)
        {
            var regex = "^" + Regex.Escape(pattern)
                                   .Replace("\\*", ".*")
                                   .Replace("\\?", ".")
                        + "$";
    
            return Regex.IsMatch(text, regex);
        }
    

    Funnily enough, even the Windows API function PathMatchSpec does not agree with FindFirstFile. Try a1*. as the pattern: FindFirstFile says it matches a1, while PathMatchSpec says it does not.

  • 2020-11-27 05:16

    The correct regular expression formulation of the glob expression d* is ^d, which means match anything that starts with d.

        string input = "Message";
        string pattern = @"^d";
        Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
    

    (The @ quoting is not necessary in this case, but good practice since many regexes use backslash escapes that need to be left alone, and it also indicates to the reader that this string is special).
