glob pattern matching in .NET

遇见更好的自我 2020-11-29 01:52

Is there a built-in mechanism in .NET to match patterns other than Regular Expressions? I'd like to match using UNIX style (glob) wildcards (* = any number of any characte

14 Answers
  • 2020-11-29 02:28

    I found the actual code for you:

    Regex.Escape( wildcardExpression ).Replace( @"\*", ".*" ).Replace( @"\?", "." );
    
  • 2020-11-29 02:29

    I like my code a little more semantic, so I wrote this extension method:

    using System.Text.RegularExpressions;
    
    namespace Whatever
    {
        public static class StringExtensions
        {
            /// <summary>
            /// Compares the string against a given pattern.
            /// </summary>
            /// <param name="str">The string.</param>
            /// <param name="pattern">The pattern to match, where "*" means any sequence of characters, and "?" means any single character.</param>
            /// <returns><c>true</c> if the string matches the given pattern; otherwise <c>false</c>.</returns>
            public static bool Like(this string str, string pattern)
            {
                return new Regex(
                    "^" + Regex.Escape(pattern).Replace(@"\*", ".*").Replace(@"\?", ".") + "$",
                    RegexOptions.IgnoreCase | RegexOptions.Singleline
                ).IsMatch(str);
            }
        }
    }
    

    (change the namespace and/or copy the extension method to your own string extensions class)

    Using this extension, you can write statements like this:

    if (file.Name.Like("*.jpg"))   // assuming file is a FileInfo instance
    {
        // ...
    }
    

    Just sugar to make your code a little more legible :-)

  • 2020-11-29 02:29

    I have written a globbing library for .NET Standard, with tests and benchmarks. My goal was to produce a library for .NET, with minimal dependencies, that doesn't use Regex and outperforms Regex.

    You can find it here:

    • github.com/dazinator/DotNet.Glob
    • https://www.nuget.org/packages/DotNet.Glob/
  • 2020-11-29 02:30

    From C# you can use .NET's LikeOperator.LikeString method. That's the backing implementation for VB's LIKE operator. It supports patterns using *, ?, #, [charlist], and [!charlist].

    You can use the LikeString method from C# by adding a reference to the Microsoft.VisualBasic.dll assembly, which is included with every version of the .NET Framework. Then you invoke the LikeString method just like any other static .NET method:

    using Microsoft.VisualBasic;
    using Microsoft.VisualBasic.CompilerServices;
    ...
    bool isMatch = LikeOperator.LikeString("I love .NET!", "I love *", CompareMethod.Text);
    // isMatch should be true.
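
    The other pattern characters listed above (#, [charlist] and [!charlist]) work the same way. Below is a minimal sketch; the LikeDemo class and the IsLike helper are names made up for this example, and it assumes the Microsoft.VisualBasic assembly is referenced as described above:

    ```csharp
    using Microsoft.VisualBasic;
    using Microsoft.VisualBasic.CompilerServices;

    public static class LikeDemo
    {
        // Hypothetical helper: a thin wrapper around LikeOperator.LikeString.
        // CompareMethod.Binary makes the comparison case-sensitive.
        public static bool IsLike(string source, string pattern)
        {
            return LikeOperator.LikeString(source, pattern, CompareMethod.Binary);
        }
    }
    ```

    With this helper, LikeDemo.IsLike("report7.txt", "report#.txt") should match because # stands for a single digit, "[bc]at" should match "cat" but not "rat", and "[!bc]at" should do the opposite.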
    
  • 2020-11-29 02:31

    If you want to avoid regular expressions, this is a basic glob implementation:

    public static class Globber
    {
        public static bool Glob(this string value, string pattern)
        {
            int pos = 0;
    
            while (pattern.Length != pos)
            {
                switch (pattern[pos])
                {
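                    case '?':
                        // '?' must consume exactly one character, so fail if the value has run out.
                        // Without this check a pattern such as "a?b" against "a" indexes past the end of value.
                        if (value.Length == pos)
                        {
                            return false;
                        }
                        break;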
    
                    case '*':
                        for (int i = value.Length; i >= pos; i--)
                        {
                            if (Glob(value.Substring(i), pattern.Substring(pos + 1)))
                            {
                                return true;
                            }
                        }
                        return false;
    
                    default:
                        if (value.Length == pos || char.ToUpper(pattern[pos]) != char.ToUpper(value[pos]))
                        {
                            return false;
                        }
                        break;
                }
    
                pos++;
            }
    
            return value.Length == pos;
        }
    }
    

    Use it like this:

    Assert.IsTrue("text.txt".Glob("*.txt"));
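
    The recursive version above allocates a new substring for every attempt at a '*'. As an alternative, here is a sketch of an iterative matcher that remembers the last '*' and backtracks to it instead of recursing. This is a different technique from the code above, not a drop-in copy of it; the IterativeGlob class and its IsMatch method are names made up for this example. It keeps the same semantics: '*' matches any run of characters, '?' matches exactly one, and letters compare case-insensitively.

    ```csharp
    public static class IterativeGlob
    {
        public static bool IsMatch(string value, string pattern)
        {
            int v = 0, p = 0;    // current positions in value and pattern
            int starP = -1;      // position of the most recent '*' in pattern, -1 if none seen
            int starV = 0;       // position in value where that '*' currently stops matching

            while (v < value.Length)
            {
                if (p < pattern.Length && pattern[p] == '*')
                {
                    // Remember the star and first try letting it match zero characters.
                    starP = p++;
                    starV = v;
                }
                else if (p < pattern.Length &&
                         (pattern[p] == '?' ||
                          char.ToUpperInvariant(pattern[p]) == char.ToUpperInvariant(value[v])))
                {
                    // Single-character match: advance both positions.
                    p++;
                    v++;
                }
                else if (starP >= 0)
                {
                    // Mismatch: go back to the last star and let it absorb one more character.
                    p = starP + 1;
                    v = ++starV;
                }
                else
                {
                    return false;
                }
            }

            // The value is exhausted; only trailing stars may remain in the pattern.
            while (p < pattern.Length && pattern[p] == '*')
            {
                p++;
            }
            return p == pattern.Length;
        }
    }
    ```

    It is called as a plain static method, for example IterativeGlob.IsMatch("text.txt", "*.txt").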
    
  • 2020-11-29 02:33

    Out of curiosity I looked into Microsoft.Extensions.FileSystemGlobbing, but it pulled in quite large dependencies on quite a few libraries, so I decided to try writing something similar myself.

    That is easier said than done. I quickly noticed that it is not such a trivial function after all: for example, "*.txt" should match files in the current directory only, while "**.txt" should also pick up files in sub-folders.
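
    To illustrate that distinction on plain path strings, without touching the file system, here is a minimal sketch that translates a glob into an anchored regex. The PathGlob class and its methods are names made up for this example, and it assumes both separators are normalised to "/" and that "?" should not match a separator:

    ```csharp
    using System.Text.RegularExpressions;

    public static class PathGlob
    {
        // Converts a path glob to an anchored regex.
        // "**" may cross directory separators; "*" and "?" may not.
        public static Regex ToRegex(string glob)
        {
            string escaped = Regex.Escape(glob.Replace('\\', '/'));
            string body = escaped
                .Replace(@"\*\*", ".*")      // double asterisk first, so it is not consumed as two single ones
                .Replace(@"\*", "[^/]*")
                .Replace(@"\?", "[^/]");
            return new Regex("^" + body + "$", RegexOptions.IgnoreCase);
        }

        public static bool IsMatch(string relativePath, string glob)
        {
            return ToRegex(glob).IsMatch(relativePath.Replace('\\', '/'));
        }
    }
    ```

    Under these assumptions, PathGlob.IsMatch("sub/a.txt", "*.txt") is rejected because "*" stops at the separator, while PathGlob.IsMatch("sub/a.txt", "**.txt") is accepted.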

    Microsoft also tests some odd pattern sequences like "./*.txt". I'm not sure who actually needs a "./" prefix, since it is removed during processing anyway. (https://github.com/aspnet/FileSystem/blob/dev/test/Microsoft.Extensions.FileSystemGlobbing.Tests/PatternMatchingTests.cs)

    Anyway, I've written my own function. There are two copies of it: one in SVN (which I might bug-fix later on), and one pasted here for demonstration purposes. I recommend copying it from the SVN link.

    SVN Link:

    https://sourceforge.net/p/syncproj/code/HEAD/tree/SolutionProjectBuilder.cs#l800 (Search for matchFiles function if not jumped correctly).

    And here is a local copy of the function:

    // Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Text.RegularExpressions;
    /// <summary>
    /// Matches files in folder _dir against a glob file pattern.
    /// In the pattern, * matches any file or folder name, ** matches any path (including sub-folders),
    /// and ? matches any single character.
    /// 
    /// A library for similar matching also exists, 'Microsoft.Extensions.FileSystemGlobbing',
    /// but it pulled in a lot of dependencies, so I decided to do without it.
    /// </summary>
    /// <returns>The files that match the pattern, as paths relative to _dir.</returns>
    static public String[] matchFiles( String _dir, String filePattern )
    {
        if (filePattern.IndexOfAny(new char[] { '*', '?' }) == -1)      // Fast path: without a wildcard the pattern can only be a plain file path.
        {
            String path = Path.Combine(_dir, filePattern);
            if (File.Exists(path))
                return new String[] { filePattern };
            return new String[] { };
        }
    
        String dir = Path.GetFullPath(_dir);        // Make it absolute so that relative paths can be extracted later on.
        String[] pattParts = filePattern.Replace("/", "\\").Split('\\');
        List<String> scanDirs = new List<string>();
        scanDirs.Add(dir);
    
        //
        //  By default, "*" in a glob pattern matches any file or folder name,
        //  i.e. any characters except the folder separator; in regex that is "[^\\]*".
        //  Glob matching also allows a double asterisk "**", which additionally recurses into sub-folders.
        //  Each part of the pattern is split out here and matched separately.
        //
        for (int iPatt = 0; iPatt < pattParts.Length; iPatt++)
        {
            bool bIsLast = iPatt == (pattParts.Length - 1);
            bool bRecurse = false;
    
            String regex1 = Regex.Escape(pattParts[iPatt]);         // Escape special regex control characters ("*" => "\*", "." => "\.")
            String pattern = Regex.Replace(regex1, @"\\\*(\\\*)?", delegate (Match m)
                {
                    if (m.ToString().Length == 4)   // "**" => "\*\*" (escaped) - we need to recurse into sub-folders.
                    {
                        bRecurse = true;
                        return ".*";
                    }
                    else
                        return @"[^\\]*";
                }).Replace(@"\?", ".");
    
            if (pattParts[iPatt] == "..")                           // Special kind of control, just to scan upper folder.
            {
                for (int i = 0; i < scanDirs.Count; i++)
                    scanDirs[i] = scanDirs[i] + "\\..";
    
                continue;
            }
    
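            // Anchor the pattern so that the whole name must match, not just a substring of it
            // (otherwise "*.txt" would also accept "a.txt.bak").
            Regex re = new Regex("^" + pattern + "$", RegexOptions.Compiled | RegexOptions.IgnoreCase);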
            int nScanItems = scanDirs.Count;
            for (int i = 0; i < nScanItems; i++)
            {
                String[] items;
                if (!bIsLast)
                    items = Directory.GetDirectories(scanDirs[i], "*", (bRecurse) ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);
                else
                    items = Directory.GetFiles(scanDirs[i], "*", (bRecurse) ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);
    
                foreach (String path in items)
                {
                    String matchSubPath = path.Substring(scanDirs[i].Length + 1);
                    if (re.Match(matchSubPath).Success)
                        scanDirs.Add(path);
                }
            }
            scanDirs.RemoveRange(0, nScanItems);    // Remove the items that we have just scanned.
        } //for
    
        //  Make relative and return.
        return scanDirs.Select( x => x.Substring(dir.Length + 1) ).ToArray();
    } //matchFiles
    

    If you find any bugs, I'll be glad to fix them.
