Sorting mixed numbers and strings

28,498

Solution 1

Perhaps you could go with a more generic approach and use a natural sorting algorithm such as the C# implementation here.
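Since the linked implementation is not reproduced here, the following is only a minimal sketch of what a natural-sort comparer could look like, not the code the answer pointed to. The class name NaturalSortComparer and its helpers are hypothetical. It assumes ASCII digits only, compares text chunks ordinally (so it is case-sensitive), and treats digit runs that differ only in leading zeros as equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sketch of a natural-sort comparer (not the linked implementation).
public class NaturalSortComparer : IComparer<string>
{
    // Splits a string into alternating runs of ASCII digits and non-digits.
    private static readonly Regex Chunker = new Regex("[0-9]+|[^0-9]+");

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;   // nulls sort first
        if (y == null) return 1;

        string[] xs = Chunker.Matches(x).Cast<Match>().Select(m => m.Value).ToArray();
        string[] ys = Chunker.Matches(y).Cast<Match>().Select(m => m.Value).ToArray();

        int shared = Math.Min(xs.Length, ys.Length);
        for (int i = 0; i < shared; i++)
        {
            int result = IsDigitRun(xs[i]) && IsDigitRun(ys[i])
                ? CompareDigitRuns(xs[i], ys[i])          // numeric comparison
                : string.CompareOrdinal(xs[i], ys[i]);    // digits sort before letters
            if (result != 0) return result;
        }

        // All shared chunks are equal: the string with fewer chunks sorts first.
        return xs.Length.CompareTo(ys.Length);
    }

    private static bool IsDigitRun(string chunk)
    {
        return chunk[0] >= '0' && chunk[0] <= '9';
    }

    // Compares two digit runs by value without parsing, so long runs cannot overflow.
    private static int CompareDigitRuns(string a, string b)
    {
        a = a.TrimStart('0');
        b = b.TrimStart('0');
        if (a.Length != b.Length) return a.Length.CompareTo(b.Length);
        return string.CompareOrdinal(a, b);
    }
}
```

It can be passed to OrderBy like any other comparer, for example input.OrderBy(s => s, new NaturalSortComparer()).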

Solution 2

Two ways come to mind, not sure which is more performant. Implement a custom IComparer:

class MyComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int xVal, yVal;
        var xIsVal = int.TryParse( x, out xVal );
        var yIsVal = int.TryParse( y, out yVal );

        if (xIsVal && yIsVal)   // both are numbers...
            return xVal.CompareTo(yVal);
        if (!xIsVal && !yIsVal) // both are strings...
            return x.CompareTo(y);
        if (xIsVal)             // x is a number, sort first
            return -1;
        return 1;               // x is a string, sort last
    }
}

var input = new[] {"a", "1", "10", "b", "2", "c"};
var e = input.OrderBy( s => s, new MyComparer() );

Or split the sequence into numbers and non-numbers, sort each subgroup, and finally concatenate the sorted results; something like:

var input = new[] {"a", "1", "10", "b", "2", "c"};

var result = input.Where( s => s.All( x => char.IsDigit( x ) ) )
                  .OrderBy( r => { int z; int.TryParse( r, out z ); return z; } )
                  // Concat rather than Union: Union would silently drop duplicate entries
                  .Concat( input.Where( m => m.Any( x => !char.IsDigit( x ) ) )
                                .OrderBy( q => q ) );

Solution 3

Use the other overload of OrderBy that takes an IComparer parameter.

You can then implement your own IComparer that uses int.TryParse to tell if it's a number or not.

Solution 4

I had a similar problem and landed here: sorting strings that have a numeric suffix as in the following example.

Original:

"Test2", "Test1", "Test10", "Test3", "Test20"

Default sort result:

"Test1", "Test10", "Test2", "Test20", "Test3"

Desired sort result:

"Test1", "Test2", "Test3", "Test10", "Test20"

I ended up using a custom Comparer:

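using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Implements the generic IComparer<string> so it can be passed to OrderBy.
public class NaturalComparer : IComparer<string>
{
    // Matches a run of digits at the end of the string.
    private readonly Regex _regex = new Regex(@"\d+$");

    // Zero-pads the numeric suffix to 10 digits so that a plain
    // string comparison orders the suffixes numerically.
    private string matchEvaluator(Match m)
    {
        return Convert.ToInt32(m.Value).ToString("D10");
    }

    public int Compare(string x, string y)
    {
        string paddedX = _regex.Replace(x, matchEvaluator);
        string paddedY = _regex.Replace(y, matchEvaluator);

        return string.Compare(paddedX, paddedY);
    }
}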

Usage:

var input = new List<MyObject>(){...};
var sorted = input.OrderBy(o=>o.SomeStringMember, new NaturalComparer());

HTH ;o)

Solution 5

I'd say you could split the values into numbers and words using a regular expression (assuming every number is an int) and then merge them back together.

//create two lists to start
string[] data = //whatever...
List<int> numbers = new List<int>();
List<string> words = new List<string>();

//check each value
foreach (string item in data) {
    if (Regex.IsMatch(item, @"^\d+$")) {
        numbers.Add(int.Parse(item));
    }
    else {
        words.Add(item);
    }
}

Then with your two lists you can sort each of them and then merge them back together in whatever format you want.
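The sort-and-merge step could be sketched as follows. This is one possible reading of "whatever format you want": numbers first in numeric order, then words in ordinal order. The MixedSorter class and SortNumbersThenWords method are hypothetical names, and int.TryParse is added as an assumption so that digit strings too large for an int fall back to the word list instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the split / sort / merge approach.
public static class MixedSorter
{
    public static List<string> SortNumbersThenWords(IEnumerable<string> data)
    {
        var numbers = new List<int>();
        var words = new List<string>();

        // Split: all-digit items that fit in an int go to numbers, the rest to words.
        foreach (string item in data)
        {
            int value;
            if (Regex.IsMatch(item, @"^\d+$") && int.TryParse(item, out value))
                numbers.Add(value);
            else
                words.Add(item);
        }

        // Sort each group on its own terms.
        numbers.Sort();
        words.Sort(StringComparer.Ordinal);

        // Merge: numbers first, then words.
        // Note: converting back from int drops leading zeros ("007" becomes "7").
        List<string> result = numbers.Select(number => number.ToString()).ToList();
        result.AddRange(words);
        return result;
    }
}
```

Calling MixedSorter.SortNumbersThenWords(new[] { "a", "1", "10", "b", "2", "c" }) yields the numbers in numeric order followed by the letters.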

Author: Boris Callens

Senior .net programmer. Belgium(Antwerp) based. linked-in My real email is gmail.

Updated on July 31, 2022

Comments

  • Boris Callens
    Boris Callens almost 2 years

    I have a list of strings that can contain a letter or a string representation of an int (max 2 digits). They need to be sorted either alphabetically or (when it is actually an int) on the numerical value it represents.

    Example:

    IList<string> input = new List<string>()
        {"a", 1.ToString(), 2.ToString(), "b", 10.ToString()};
    
    input.OrderBy(s=>s)
      // 1
      // 10
      // 2
      // a
      // b
    

    What I would want is

      // 1
      // 2
      // 10
      // a
      // b
    

    I have an idea that involves trying to parse each value and, if the TryParse succeeds, formatting it with my own custom string formatter so that it gets leading zeros. I'm hoping for something simpler and more performant.

    Edit
    I ended up making an IComparer I dumped in my Utils library for later use.
    While I was at it I threw doubles in the mix too.

    public class MixedNumbersAndStringsComparer : IComparer<string> {
        public int Compare(string x, string y) {
            double xVal, yVal;
    
            if(double.TryParse(x, out xVal) && double.TryParse(y, out yVal))
                return xVal.CompareTo(yVal);
            else 
                return string.Compare(x, y);
        }
    }
    
    //Tested on int vs int, double vs double, int vs double, string vs int, string vs double, string vs string.
    //Not gonna put those here
    [TestMethod]
    public void RealWorldTest()
    {
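        // Note: "2,0" parses as 2.0 only under a culture whose decimal separator is a comma.
        List<string> input = new List<string>() { "a", "1", "2,0", "b", "10" };
        List<string> expected = new List<string>() { "1", "2,0", "10", "a", "b" };
        input.Sort(new MixedNumbersAndStringsComparer());
        // AreEqual checks the order; AreEquivalent would pass for any ordering of the same elements.
        CollectionAssert.AreEqual(expected, input);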
    }
    
  • Coryza
    Coryza almost 15 years
    Yeah, this is simpler than my approach. +1
  • LukeH
    LukeH almost 15 years
    Your IComparer doesn't return non-numeric strings in the correct (alphabetical) order. Your LINQ query does.
  • Boris Callens
    Boris Callens almost 15 years
    I added my final code to the OP. I also noticed the string thing. Furthermore, I tried short-circuiting before every parse. I don't know if it makes much sense performance-wise, but it took me exactly as much effort to reorder them as it would have taken me to test it ;)
  • Boris Callens
    Boris Callens almost 15 years
    Made the code a whole lot shorter. By applying short-circuiting (literally translated from the Dutch "Kortsluitingsprincipe") I only do as many TryParse calls as needed.
  • Boris Callens
    Boris Callens almost 15 years
    Not really simpler than what I posted, as far as I know. It could be more performant, but it's not critical enough to put perf over simplicity.
  • Peter Turner
    Peter Turner over 14 years
    Very cool indeed, I just found a Delphi wrapper for this too irsoft.de/web/strnatcmp-and-natsort-for-delphi
  • AH.
    AH. over 9 years
    This will not work in all cases. Suppose you have the following list of items: "0 / 30" "0 / 248" "0 / 496" "0 / 357.6". This order will be kept after sorting, which is not what you may expect.
  • Toshi
    Toshi over 6 years
    Instead of pasting some urls you should add the code here to avoid dead links. Now this answer isn't more than a comment
  • Aakash Bashyal
    Aakash Bashyal over 3 years
    This is exactly what I needed, but my list of strings is inside a list of objects. Can you show how to use this method?
  • mike
    mike over 3 years
    @AakashBashyal I added an example to my answer above.
  • Aakash Bashyal
    Aakash Bashyal over 3 years
    Will just creating a list-type object sort the items inside the list?
  • Aakash Bashyal
    Aakash Bashyal over 3 years
    but there is error on x = _regex.Replace(x.ToString, matchEvaluator); which says Argument 1: cannot convert from 'method group' to 'string'
  • mike
    mike over 3 years
    @AakashBashyal sorry, my bad. i fixed the code. thanks!