Natural Sort Order in C#

野性不改 2020-11-21 04:54

Does anyone have a good resource, or can anyone provide a sample, of a natural order sort in C# for a FileInfo array? I am implementing the IComparer interface in

17 answers
  • 2020-11-21 05:28

    This is my code for sorting strings that contain both alphabetic and numeric characters.

    First, this extension method:

    public static IEnumerable<string> AlphanumericSort(this IEnumerable<string> me)
    {
        return me.OrderBy(x => Regex.Replace(x, @"\d+", m => m.Value.PadLeft(50, '0')));
    }
    

    Then, simply use it anywhere in your code like this:

    List<string> test = new List<string>() { "The 1st", "The 12th", "The 2nd" };
    // AlphanumericSort returns IEnumerable<string>, so convert back to a list.
    test = test.AlphanumericSort().ToList();
    

    How does it work? By left-padding each number with zeros (the table shows three digits for brevity; the code pads to 50):

      Original  | Regex Replace |      The      |   Returned
        List    | Apply PadLeft |    Sorting    |     List
                |               |               |
     "The 1st"  |  "The 001st"  |  "The 001st"  |  "The 1st"
     "The 12th" |  "The 012th"  |  "The 002nd"  |  "The 2nd"
     "The 2nd"  |  "The 002nd"  |  "The 012th"  |  "The 12th"
    

    It also works with multiple numbers:

     Alphabetical Sorting | Alphanumeric Sorting
                          |
     "Page 21, Line 42"   | "Page 3, Line 7"
     "Page 21, Line 5"    | "Page 3, Line 32"
     "Page 3, Line 32"    | "Page 21, Line 5"
     "Page 3, Line 7"     | "Page 21, Line 42"
    

    Hope that helps.

  • 2020-11-21 05:30

    Adding to Greg Beech's answer (because I've just been searching for that): if you want to use this from LINQ, you can use the OrderBy overload that takes an IComparer. For example:

    var items = new List<MyItem>();
    
    // fill items
    
    var sorted = items.OrderBy(item => item.Name, new NaturalStringComparer());
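
    Since the question is about a FileInfo array, the same idea can be carried over with a small adapter. The following is only a sketch: FileInfoNameComparer is a hypothetical name that does not come from this thread, and StringComparer.Ordinal is just a stand-in for whichever natural-order IComparer<string> you end up using.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;

    // Hypothetical adapter (not from the thread): compares FileInfo objects by Name
    // using whatever string comparer is supplied.
    public sealed class FileInfoNameComparer : IComparer<FileInfo>
    {
        private readonly IComparer<string> _nameComparer;

        public FileInfoNameComparer(IComparer<string> nameComparer)
        {
            _nameComparer = nameComparer ?? throw new ArgumentNullException(nameof(nameComparer));
        }

        public int Compare(FileInfo x, FileInfo y)
        {
            // Treat null as smaller than any non-null FileInfo.
            if (ReferenceEquals(x, y)) return 0;
            if (x is null) return -1;
            if (y is null) return 1;
            return _nameComparer.Compare(x.Name, y.Name);
        }
    }

    public static class FileInfoSortDemo
    {
        public static void Main()
        {
            // FileInfo does not require the files to exist on disk.
            FileInfo[] files =
            {
                new FileInfo("file10.txt"),
                new FileInfo("file2.txt"),
                new FileInfo("file1.txt"),
            };

            // StringComparer.Ordinal is only a stand-in; pass a natural-order comparer here.
            Array.Sort(files, new FileInfoNameComparer(StringComparer.Ordinal));

            foreach (FileInfo f in files) Console.WriteLine(f.Name);
        }
    }
    ```

    Keeping the string comparison separate from the FileInfo wrapper means any of the comparers in this thread can be swapped in without touching the adapter.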
    
  • 2020-11-21 05:33

    Just thought I'd add to this (with the most concise solution I could find):

    public static IOrderedEnumerable<T> OrderByAlphaNumeric<T>(this IEnumerable<T> source, Func<T, string> selector)
    {
        int max = source
            .SelectMany(i => Regex.Matches(selector(i), @"\d+").Cast<Match>().Select(m => (int?)m.Value.Length))
            .Max() ?? 0;
    
        return source.OrderBy(i => Regex.Replace(selector(i), @"\d+", m => m.Value.PadLeft(max, '0')));
    }
    

    The above pads any numbers in the string to the max length of all numbers in all strings and uses the resulting string to sort.

    The cast to (int?) is to allow for collections of strings without any numbers (.Max() on an empty enumerable throws an InvalidOperationException).
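
    To see why the cast matters, here is a small standalone sketch of the two Max() behaviours. The class and method names (EmptyMaxDemo, MaxLengthOrZero) are made up for illustration.

    ```csharp
    using System;
    using System.Linq;

    public static class EmptyMaxDemo
    {
        public static int MaxLengthOrZero(string[] values)
        {
            // Max() over int? returns null for an empty sequence, so ?? supplies 0.
            return values.Select(v => (int?)v.Length).Max() ?? 0;
        }

        public static void Main()
        {
            var none = new string[0];
            Console.WriteLine(MaxLengthOrZero(none));                 // prints 0
            Console.WriteLine(MaxLengthOrZero(new[] { "7", "123" })); // prints 3

            try
            {
                // Without the cast, Max() over int throws on an empty sequence.
                none.Select(v => v.Length).Max();
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("InvalidOperationException");
            }
        }
    }
    ```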

  • 2020-11-21 05:33

    You do need to be careful -- I vaguely recall reading that StrCmpLogicalW, or something like it, was not strictly transitive, and I have observed .NET's sort methods to sometimes get stuck in infinite loops if the comparison function breaks that rule.

    A transitive comparison will always report that a < c if a < b and b < c. There exists a function that does a natural sort order comparison that does not always meet that criterion, but I can't recall whether it is StrCmpLogicalW or something else.
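
    If you want to check a particular comparer yourself, a brute-force test over sample data is one option. This is only a sketch: FindTransitivityViolation is a hypothetical helper, it runs in cubic time, and finding no violation in a sample does not prove the comparer is transitive in general. The cyclic comparer in Main is deliberately broken to show what a violation looks like.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class ComparerChecks
    {
        // Hypothetical helper: returns the first triple (a, b, c) with a < b and b < c
        // but not a < c, or null if no violation is found among the samples.
        public static Tuple<T, T, T> FindTransitivityViolation<T>(IList<T> samples, IComparer<T> comparer)
        {
            foreach (T a in samples)
            {
                foreach (T b in samples)
                {
                    if (comparer.Compare(a, b) >= 0) continue;
                    foreach (T c in samples)
                    {
                        if (comparer.Compare(b, c) < 0 && comparer.Compare(a, c) >= 0)
                            return Tuple.Create(a, b, c);
                    }
                }
            }
            return null;
        }

        public static void Main()
        {
            // A deliberately non-transitive comparer over 0, 1, 2: 0 < 1, 1 < 2, but 2 < 0.
            IComparer<int> cyclic = Comparer<int>.Create((x, y) =>
            {
                int d = (y - x + 3) % 3;
                return d == 0 ? 0 : (d == 1 ? -1 : 1);
            });

            var violation = FindTransitivityViolation(new List<int> { 0, 1, 2 }, cyclic);
            Console.WriteLine(violation != null ? "violation found" : "no violation found");
        }
    }
    ```

    The same helper can be pointed at a natural-order comparer and a list of representative file names before trusting it in a sort.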

  • 2020-11-21 05:34

    My solution:

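    void Main()
    {
        // Dump() is a LINQPad extension method; outside LINQPad, print the results with Console.WriteLine instead.
        new[] {"a4","a3","a2","a10","b5","b4","b400","1","C1d","c1d2"}.OrderBy(x => x, new NaturalStringComparer()).Dump();
    }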
    
    public class NaturalStringComparer : IComparer<string>
    {
        private static readonly Regex _re = new Regex(@"(?<=\D)(?=\d)|(?<=\d)(?=\D)", RegexOptions.Compiled);
    
        public int Compare(string x, string y)
        {
            x = x.ToLower();
            y = y.ToLower();
            if(string.Compare(x, 0, y, 0, Math.Min(x.Length, y.Length)) == 0)
            {
                if(x.Length == y.Length) return 0;
                return x.Length < y.Length ? -1 : 1;
            }
            var a = _re.Split(x);
            var b = _re.Split(y);
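            // Compare part by part. The bounds check avoids an IndexOutOfRangeException
            // for inputs such as "a1" and "a01", whose parts all compare equal.
            for(int i = 0; i < a.Length && i < b.Length; ++i)
            {
                int r = PartCompare(a[i], b[i]);
                if(r != 0) return r;
            }
            return a.Length.CompareTo(b.Length);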
        }
    
        private static int PartCompare(string x, string y)
        {
            int a, b;
            if(int.TryParse(x, out a) && int.TryParse(y, out b))
                return a.CompareTo(b);
            return x.CompareTo(y);
        }
    }
    

    Results:

    1
    a2
    a3
    a4
    a10
    b4
    b5
    b400
    C1d
    c1d2
    
  • 2020-11-21 05:36

    Matthew Horsley's answer is the fastest method that doesn't change behaviour depending on which version of Windows your program is running on. However, it can be made even faster by creating the regex once and using RegexOptions.Compiled. I also added the option of passing in a string comparer so you can ignore case if needed, and improved readability a bit.

        public static IEnumerable<T> OrderByNatural<T>(this IEnumerable<T> items, Func<T, string> selector, StringComparer stringComparer = null)
        {
            var regex = new Regex(@"\d+", RegexOptions.Compiled);
    
            int maxDigits = items
                          .SelectMany(i => regex.Matches(selector(i)).Cast<Match>().Select(digitChunk => (int?)digitChunk.Value.Length))
                          .Max() ?? 0;
    
            return items.OrderBy(i => regex.Replace(selector(i), match => match.Value.PadLeft(maxDigits, '0')), stringComparer ?? StringComparer.CurrentCulture);
        }
    

    Use it like this:

    var sortedEmployees = employees.OrderByNatural(emp => emp.Name);
    

    This takes 450 ms to sort 100,000 strings, compared to 300 ms for the default .NET string comparison. Pretty fast!
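
    Timings like these depend heavily on the machine and the data, so it is worth measuring on your own input. Below is a rough sketch of such a measurement using Stopwatch. The test data ("item" plus a random number), the fixed seed and the pad width of 7 are all assumptions made for illustration, and this is not the benchmark that produced the numbers above.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Text.RegularExpressions;

    public static class SortTimingSketch
    {
        public static void Main()
        {
            var random = new Random(12345); // fixed seed so every run sorts the same data
            string[] data = Enumerable.Range(0, 100000)
                .Select(_ => "item" + random.Next(0, 1000000))
                .ToArray();

            var digits = new Regex(@"\d+", RegexOptions.Compiled);
            const int width = 7; // assumed wide enough for the numbers generated above

            // Baseline: default string ordering.
            var sw = Stopwatch.StartNew();
            string[] plain = data.OrderBy(s => s).ToArray();
            sw.Stop();
            Console.WriteLine("default ordering: " + sw.ElapsedMilliseconds + " ms");

            // Natural ordering via zero-padded sort keys.
            sw.Restart();
            string[] natural = data
                .OrderBy(s => digits.Replace(s, m => m.Value.PadLeft(width, '0')))
                .ToArray();
            sw.Stop();
            Console.WriteLine("padded-key ordering: " + sw.ElapsedMilliseconds + " ms");
        }
    }
    ```

    A single run like this includes JIT and regex compilation cost, so repeat the measurement a few times before drawing conclusions.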
