I have a method returning a list of Strings that needs to be sorted. However, I'm running into the old String number sorting issue and was wondering if anyone could assist with a Comparator implementation or point me in the direction of one.
The list is going to return something like this:
State Lower Legislative District 1
State Lower Legislative District 11
State Lower Legislative District 12
...
State Lower Legislative District 2
...
State Lower Legislative District 100
...
State Upper Legislative District 1
State Upper Legislative District 11
...
So, first I need to do a basic String sort, but then I need to sort by the number. The number to sort on should always trail, and may be 2 or 3 digits.
(Edit) My initial thought is to split the string on space, run StringUtils.isNumeric on the number portion, then sort. However, it seems a bit of a kludge to me.
Can anyone assist?
There is an article about this on Coding Horror. This is called natural sorting, where you effectively treat a group of digits as a single "character". See this question for some Java implementations of the idea.
Sorting for Humans : Natural Sort Order
The default sort functions in almost every programming language are poorly suited for human consumption. What do I mean by that? Well, consider the difference between sorting filenames in Windows explorer, and sorting those very same filenames via Array.Sort() code:
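To make the "group of digits as a single character" idea concrete, here is a minimal sketch (not taken from the linked article or question). It splits each string into runs of digits and runs of non-digits, compares digit runs numerically and everything else as plain text. The class name NaturalOrderSketch and its members are made up for illustration, and it only treats ASCII digits as numbers.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a sketch of natural ordering, not a library API.
class NaturalOrderSketch {
    // Matches either a run of ASCII digits or a run of non-digits.
    private static final Pattern CHUNK = Pattern.compile("[0-9]+|[^0-9]+");

    // Splits "District 11" into ["District ", "11"].
    static List<String> chunks(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = CHUNK.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    private static boolean startsWithDigit(String chunk) {
        char c = chunk.charAt(0);
        return c >= '0' && c <= '9';
    }

    // Compares chunk by chunk: numerically when both chunks are digit runs,
    // lexicographically otherwise. The string with fewer chunks sorts first
    // when all shared chunks are equal.
    static final Comparator<String> NATURAL = (a, b) -> {
        List<String> ca = chunks(a);
        List<String> cb = chunks(b);
        int n = Math.min(ca.size(), cb.size());
        for (int i = 0; i < n; i++) {
            String x = ca.get(i);
            String y = cb.get(i);
            int c = (startsWithDigit(x) && startsWithDigit(y))
                    ? new BigInteger(x).compareTo(new BigInteger(y)) // BigInteger avoids overflow on long digit runs
                    : x.compareTo(y);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(ca.size(), cb.size());
    };

    public static void main(String[] args) {
        List<String> values = new ArrayList<>(Arrays.asList(
                "State Lower Legislative District 11",
                "State Upper Legislative District 1",
                "State Lower Legislative District 2",
                "State Lower Legislative District 100",
                "State Lower Legislative District 1"));
        values.sort(NATURAL);
        for (String v : values) {
            System.out.println(v);
        }
    }
}
```

Note that this sketch treats "01" and "1" as equal, so it is not consistent with String.equals; whether that matters depends on how the comparator is used (for example as a TreeSet ordering).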
I wrote a variation on String.compareTo that compares the lengths of the numbers found in the two strings. When it encounters two numbers of the same length, the alphanumeric compare resumes as normal. It also skips leading zeros.
public static int compareNatural(String a, String b) {
    int la = a.length();
    int lb = b.length();
    int ka = 0;
    int kb = 0;
    while (true) {
        if (ka == la)
            return kb == lb ? 0 : -1;
        if (kb == lb)
            return 1;
        if (a.charAt(ka) >= '0' && a.charAt(ka) <= '9' && b.charAt(kb) >= '0' && b.charAt(kb) <= '9') {
            int na = 0;
            int nb = 0;
            while (ka < la && a.charAt(ka) == '0')
                ka++;
            while (ka + na < la && a.charAt(ka + na) >= '0' && a.charAt(ka + na) <= '9')
                na++;
            while (kb < lb && b.charAt(kb) == '0')
                kb++;
            while (kb + nb < lb && b.charAt(kb + nb) >= '0' && b.charAt(kb + nb) <= '9')
                nb++;
            if (na > nb)
                return 1;
            if (nb > na)
                return -1;
            if (ka == la)
                return kb == lb ? 0 : -1;
            if (kb == lb)
                return 1;
        }
        if (a.charAt(ka) != b.charAt(kb))
            return a.charAt(ka) - b.charAt(kb);
        ka++;
        kb++;
    }
}
One way would be to use a simple regex to parse out the fields of interest in your comparator and then compare them manually. Here's an untested example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern pattern = Pattern.compile("^State (Lower|Upper) Legislative District (\\d+)$");

public int compare(String a, String b) {
    Matcher matcher1 = pattern.matcher(a);
    Matcher matcher2 = pattern.matcher(b);
    if (matcher1.matches() && matcher2.matches()) {
        // compare upper/lower
        int upperLowerComparison = matcher1.group(1).compareTo(matcher2.group(1));
        if (upperLowerComparison != 0) {
            return upperLowerComparison;
        }
        // number comparison
        return Integer.valueOf(matcher1.group(2)).compareTo(Integer.valueOf(matcher2.group(2)));
    }
    // What to do if they don't match is left open here. Rejecting the input is
    // one option that keeps the method compilable and the ordering consistent.
    throw new IllegalArgumentException("Unexpected format: \"" + a + "\" or \"" + b + "\"");
}
You have two options. The first is to create a class with two fields, the name and the number, parsing each string into them up front. The comparator then compares the name first and the number second. The second option is to do the parsing in place, inside the compare method. Choose whichever is more appropriate for you.
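A rough sketch of the first option, parsing once and then sorting, might look like this. The names DistrictSortSketch, Entry and sort are invented for the example, and it assumes every string ends with a space followed by digits; anything else will throw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical names throughout; assumes every value ends with " <digits>".
class DistrictSortSketch {
    // Holds the parsed name and number, plus the original string for output.
    static final class Entry {
        final String name;
        final int number;
        final String original;

        Entry(String value) {
            String trimmed = value.trim();
            int split = trimmed.lastIndexOf(' ');
            this.name = trimmed.substring(0, split);
            this.number = Integer.parseInt(trimmed.substring(split + 1));
            this.original = value;
        }
    }

    // Parses each string once, sorts by name and then by number,
    // and returns the original strings in the new order.
    static List<String> sort(List<String> values) {
        List<Entry> entries = new ArrayList<>();
        for (String v : values) {
            entries.add(new Entry(v));
        }
        entries.sort(Comparator.comparing((Entry e) -> e.name)
                .thenComparingInt(e -> e.number));
        List<String> out = new ArrayList<>();
        for (Entry e : entries) {
            out.add(e.original);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sorted = sort(Arrays.asList(
                "State Upper Legislative District 1",
                "State Lower Legislative District 11",
                "State Lower Legislative District 2"));
        for (String s : sorted) {
            System.out.println(s);
        }
    }
}
```

Parsing up front means each string is split only once, rather than on every comparison, which is the main practical difference from parsing inside compare.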
Have a look at this implementation:
public static int naturalCompare(String a, String b, boolean ignoreCase) {
    if (ignoreCase) {
        a = a.toLowerCase();
        b = b.toLowerCase();
    }
    int aLength = a.length();
    int bLength = b.length();
    int minSize = Math.min(aLength, bLength);
    char aChar, bChar;
    boolean aNumber, bNumber;
    boolean asNumeric = false;
    int lastNumericCompare = 0;
    for (int i = 0; i < minSize; i++) {
        aChar = a.charAt(i);
        bChar = b.charAt(i);
        aNumber = aChar >= '0' && aChar <= '9';
        bNumber = bChar >= '0' && bChar <= '9';
        if (asNumeric)
            if (aNumber && bNumber) {
                if (lastNumericCompare == 0)
                    lastNumericCompare = aChar - bChar;
            } else if (aNumber)
                return 1;
            else if (bNumber)
                return -1;
            else if (lastNumericCompare == 0) {
                if (aChar != bChar)
                    return aChar - bChar;
                asNumeric = false;
            } else
                return lastNumericCompare;
        else if (aNumber && bNumber) {
            asNumeric = true;
            if (lastNumericCompare == 0)
                lastNumericCompare = aChar - bChar;
        } else if (aChar != bChar)
            return aChar - bChar;
    }
    if (asNumeric)
        if (aLength > bLength && a.charAt(bLength) >= '0' && a.charAt(bLength) <= '9') // as number
            return 1; // a has bigger size, thus b is smaller
        else if (bLength > aLength && b.charAt(aLength) >= '0' && b.charAt(aLength) <= '9') // as number
            return -1; // b has bigger size, thus a is smaller
        else
            return lastNumericCompare;
    else
        return aLength - bLength;
}
It should be fast, without any regular expressions or array manipulation, just a couple of flags and a lot of cases.
This should sort any combination of numbers inside strings; when two numbers are equal, it moves on and compares the rest of the string.
A simple implementation would be like this one (this works with any string that ends with a number):
import java.util.Comparator;

public class SplitComparator implements Comparator<String> {
    // Splits a value into the text before the last space and the trailing number.
    static class Pair implements Comparable<Pair> {
        private String name;
        private Integer number;

        public Pair(String value) {
            value = value.trim();
            this.name = value.substring( 0, value.lastIndexOf(" ") );
            this.number = Integer.valueOf( value.substring( value.lastIndexOf(" ") + 1, value.length() ) );
        }

        @Override
        public int compareTo( Pair right) {
            int result = this.name.compareTo( right.name );
            if ( result == 0 ) {
                result = this.number.compareTo( right.number );
            }
            return result;
        }
    }

    @Override
    public int compare(String left, String right) {
        return new Pair( left ).compareTo( new Pair( right ) );
    }

    public static void main( String ... args ) {
        String[] values = { "State Lower Legislative District 1",
                "State Lower Legislative District 11",
                "State Upper Legislative District 1",
                "State Upper Legislative District 11"};
        SplitComparator comparator = new SplitComparator();
        System.out.println( comparator.compare( values[1] , values[0]) );
        System.out.println( comparator.compare( values[0] , values[1]) );
        System.out.println( comparator.compare( values[0] , values[3]) );
    }
}
I usually do this by prefixing zeros to the number and handling the whole entity as a string, then sorting it.
See this:
public abstract class MyNumberComparator {
    protected int doCompare(final String number1, final String number2) {
        // fillUpLeftWithZeros is not shown in this answer; presumably it
        // left-pads its argument with '0' up to the given length.
        String strNumber1 = fillUpLeftWithZeros(number1, 30);
        String strNumber2 = fillUpLeftWithZeros(number2, 30);
        return strNumber1.toUpperCase().compareTo(strNumber2.toUpperCase());
    }
}
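Since the padding helper isn't shown, here is one way the zero-padding idea could be sketched for the strings in the question. This is an assumption about what was meant, not the answerer's code: padTrailingNumber and ZeroPadSketch are invented names, only the trailing number is padded, and the width must be at least as large as the longest number or the ordering breaks.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical names; a sketch of sorting by a zero-padded key.
class ZeroPadSketch {
    // Group 1: everything before the trailing digits. Group 2: the trailing digits.
    private static final Pattern TRAILING_NUMBER = Pattern.compile("^(.*?)([0-9]+)$");

    // Left-pads the trailing number with zeros up to the given width.
    // Strings without a trailing number are returned unchanged.
    static String padTrailingNumber(String s, int width) {
        Matcher m = TRAILING_NUMBER.matcher(s);
        if (!m.matches()) {
            return s;
        }
        String digits = m.group(2);
        StringBuilder sb = new StringBuilder(m.group(1));
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        String[] values = {
                "State Lower Legislative District 11",
                "State Lower Legislative District 2",
                "State Upper Legislative District 1",
                "State Lower Legislative District 100"};
        // Sort by the padded key; the original strings are left untouched.
        Arrays.sort(values, Comparator.comparing((String s) -> padTrailingNumber(s, 10)));
        for (String v : values) {
            System.out.println(v);
        }
    }
}
```

With the padded keys, a plain String comparison puts "…002" before "…011" before "…100", which is the numeric order the question asks for.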
Source: https://stackoverflow.com/questions/7270447/java-string-number-comparator