1. Introduction
When working with Strings, we often need to determine whether a String represents a valid number.
In this tutorial, we’ll explore multiple ways to detect if the given String is numeric, first using plain Java, then regular expressions and finally by using external libraries.
Once we're done discussing various implementations, we'll use benchmarks to get an idea of which methods are optimal.
2. Prerequisites
Let's start with some prerequisites before we move on to the main content.
In the latter part of this article, we'll use the external Apache Commons Lang 3 library, so let's add its dependency to our pom.xml:
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.9</version>
    </dependency>
The latest version of this library can be found on Maven Central.
3. Using Plain Java
Perhaps the easiest and most reliable way to check whether a String is numeric is to parse it using one of Java's built-in methods:
- Integer.parseInt(String)
- Float.parseFloat(String)
- Double.parseDouble(String)
- Long.parseLong(String)
- new BigInteger(String)
If these methods don't throw a NumberFormatException, the parsing was successful and the String is numeric:
    public static boolean isNumeric(String strNum) {
        if (strNum == null) {
            return false;
        }
        try {
            double d = Double.parseDouble(strNum);
        } catch (NumberFormatException nfe) {
            return false;
        }
        return true;
    }
Let's see this method in action:
    assertThat(isNumeric("22")).isTrue();
    assertThat(isNumeric("5.05")).isTrue();
    assertThat(isNumeric("-200")).isTrue();
    assertThat(isNumeric("10.0d")).isTrue();
    assertThat(isNumeric(" 22 ")).isTrue();

    assertThat(isNumeric(null)).isFalse();
    assertThat(isNumeric("")).isFalse();
    assertThat(isNumeric("abc")).isFalse();
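One caveat worth keeping in mind: Double.parseDouble(String) also accepts the literals "NaN" and "Infinity", so the method above reports them as numeric. If that's not what we want, a stricter variant could check the parsed value as well. The following is only a sketch; the class and the isFiniteNumeric() name are our own, hypothetical choices:

```java
class StrictDoubleCheck {

    // Hypothetical helper: behaves like isNumeric(), but rejects NaN and infinities
    static boolean isFiniteNumeric(String strNum) {
        if (strNum == null) {
            return false;
        }
        try {
            // parseDouble succeeds for "NaN" and "Infinity", so inspect the result too
            return Double.isFinite(Double.parseDouble(strNum));
        } catch (NumberFormatException nfe) {
            return false;
        }
    }
}
```

Note that this variant also rejects inputs that overflow the double range, such as "1e400", since they parse to an infinite value.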
In our isNumeric() method, we're only checking for values of type Double, but the method can be modified to check for Integer, Float, Long and large numbers by using any of the parse methods listed earlier.
These methods are also discussed in the Java String Conversions article.
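As a minimal sketch of such modifications, here are two variants built on Integer.parseInt(String) and new BigInteger(String). The class and method names are hypothetical and only serve the illustration:

```java
import java.math.BigInteger;

class TypedNumericChecks {

    // Hypothetical variant: true only for values that fit into an int
    static boolean isInteger(String strNum) {
        if (strNum == null) {
            return false;
        }
        try {
            Integer.parseInt(strNum);
            return true;
        } catch (NumberFormatException nfe) {
            return false;
        }
    }

    // Hypothetical variant: true for whole numbers of any size
    static boolean isBigInteger(String strNum) {
        if (strNum == null) {
            return false; // new BigInteger(null) would throw a NullPointerException
        }
        try {
            new BigInteger(strNum);
            return true;
        } catch (NumberFormatException nfe) {
            return false;
        }
    }
}
```

Unlike Double.parseDouble(String), Integer.parseInt(String) doesn't ignore surrounding whitespace, so isInteger(" 22 ") returns false here, and a value such as "2147483648" is rejected by isInteger() but accepted by isBigInteger().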
4. Using Regular Expressions
Now let's use the regex -?\d+(\.\d+)? to match numeric Strings consisting of positive or negative integers and decimal numbers.
It goes without saying that we can modify this regex to identify and handle a wider range of rules. Here, we'll keep it simple.
Let’s break down this regex and see how it works:
- -? – this part matches an optional minus sign: the hyphen “-” is matched literally, and the question mark “?” marks it as optional
- \d+ – this searches for one or more digits
- (\.\d+)? – this part identifies the fractional part of a decimal number. Here we're searching for a period followed by one or more digits. The question mark at the end signifies that this complete group is optional
Regular expressions are a very broad topic. To get a brief overview, check our tutorial on the Java regular expressions API.
For now, let's create a method using the above regular expression:
    private Pattern pattern = Pattern.compile("-?\\d+(\\.\\d+)?");

    public boolean isNumeric(String strNum) {
        if (strNum == null) {
            return false;
        }
        return pattern.matcher(strNum).matches();
    }
Let's now look at some assertions for the above method:
    assertThat(isNumeric("22")).isTrue();
    assertThat(isNumeric("5.05")).isTrue();
    assertThat(isNumeric("-200")).isTrue();

    assertThat(isNumeric(null)).isFalse();
    assertThat(isNumeric("abc")).isFalse();
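To illustrate how the simple regex could be extended to cover more rules, here's a sketch of a pattern that additionally allows a leading plus sign, a fraction without a leading digit and an exponent. The pattern and the class name are our own assumptions about what "more rules" might mean, not part of the original implementation:

```java
import java.util.regex.Pattern;

class ExtendedNumericRegex {

    // Optional sign, then either digits with an optional fraction or a bare
    // fraction such as ".5", then an optional exponent such as "e+8"
    private static final Pattern EXTENDED =
      Pattern.compile("[+-]?(\\d+(\\.\\d+)?|\\.\\d+)([eE][+-]?\\d+)?");

    static boolean isNumeric(String strNum) {
        if (strNum == null) {
            return false;
        }
        return EXTENDED.matcher(strNum).matches();
    }
}
```

With this pattern, inputs such as "+5", ".5" and "2.99e+8" match, while "5." and "1e" are still rejected because the fraction and the exponent both require at least one digit.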
5. Using Apache Commons
In this section, we'll discuss various methods available in the Apache Commons library.
5.1. NumberUtils.isCreatable(String)
NumberUtils from Apache Commons provides a static method NumberUtils.isCreatable(String) which checks whether a String is a valid Java number or not.
This method accepts:
- Hexadecimal numbers starting with 0x or 0X
- Octal numbers starting with a leading 0
- Scientific notation (for example 1.05e-10)
- Numbers marked with a type qualifier (for example 1L or 2.2d)
If the supplied string is null or empty/blank, then it's not considered a number and the method will return false.
Let's run some tests using this method:
    assertThat(NumberUtils.isCreatable("22")).isTrue();
    assertThat(NumberUtils.isCreatable("5.05")).isTrue();
    assertThat(NumberUtils.isCreatable("-200")).isTrue();
    assertThat(NumberUtils.isCreatable("10.0d")).isTrue();
    assertThat(NumberUtils.isCreatable("1000L")).isTrue();
    assertThat(NumberUtils.isCreatable("0xFF")).isTrue();
    assertThat(NumberUtils.isCreatable("07")).isTrue();
    assertThat(NumberUtils.isCreatable("2.99e+8")).isTrue();

    assertThat(NumberUtils.isCreatable(null)).isFalse();
    assertThat(NumberUtils.isCreatable("")).isFalse();
    assertThat(NumberUtils.isCreatable("abc")).isFalse();
    assertThat(NumberUtils.isCreatable(" 22 ")).isFalse();
    assertThat(NumberUtils.isCreatable("09")).isFalse();
Note how we get true for the hexadecimal number “0xFF”, the octal number “07” and the scientific notation “2.99e+8”.
Also, the string “09” returns false because the leading “0” indicates an octal number, and 9 is not a valid octal digit.
For every input that returns true with this method, we can use NumberUtils.createNumber(String) which will give us the valid number.
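The octal rule isn't specific to Apache Commons. The JDK's own Integer.decode(String) follows similar prefix conventions for int values (0x or 0X for hexadecimal, a leading 0 for octal), so we can use it to see why "09" is rejected. This is only an illustration in plain Java, not a description of how NumberUtils works internally, and decodeOrNull() is a hypothetical helper:

```java
class DecodeDemo {

    // Hypothetical helper: the decoded int, or null if the text isn't a valid int literal
    static Integer decodeOrNull(String text) {
        if (text == null) {
            return null; // Integer.decode(null) would throw a NullPointerException
        }
        try {
            return Integer.decode(text);
        } catch (NumberFormatException nfe) {
            return null;
        }
    }
}
```

Here, decodeOrNull("0xFF") yields 255 and decodeOrNull("07") yields 7, whereas decodeOrNull("09") yields null because 9 can't appear in an octal number.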
5.2. NumberUtils.isParsable(String)
The NumberUtils.isParsable(String) method checks whether the given String is parsable or not.
Parsable numbers are those that can be parsed successfully by a parse method such as Integer.parseInt(String), Long.parseLong(String), Float.parseFloat(String) or Double.parseDouble(String).
Unlike NumberUtils.isCreatable(), this method won't accept hexadecimal numbers, scientific notation or strings ending with a type qualifier, that is, 'f', 'F', 'd', 'D', 'l' or 'L'.
Let's look at some assertions:
    assertThat(NumberUtils.isParsable("22")).isTrue();
    assertThat(NumberUtils.isParsable("-23")).isTrue();
    assertThat(NumberUtils.isParsable("2.2")).isTrue();
    assertThat(NumberUtils.isParsable("09")).isTrue();

    assertThat(NumberUtils.isParsable(null)).isFalse();
    assertThat(NumberUtils.isParsable("")).isFalse();
    assertThat(NumberUtils.isParsable("6.2f")).isFalse();
    assertThat(NumberUtils.isParsable("9.8d")).isFalse();
    assertThat(NumberUtils.isParsable("22L")).isFalse();
    assertThat(NumberUtils.isParsable("0xFF")).isFalse();
    assertThat(NumberUtils.isParsable("2.99e+8")).isFalse();
Unlike NumberUtils.isCreatable(), this method doesn't treat a number with a leading “0” as octal but as a normal decimal number, so “09” returns true.
We can use this method as a replacement for what we did in section 3, where we tried to parse a number and checked for an error.
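To make the idea of an exception-free check more concrete, here's a hand-rolled sketch in the same spirit. The rules it applies (an optional leading minus sign, ASCII digits and at most one decimal point that isn't the last character) are our own assumption, chosen to agree with the assertions above; it's not the library's implementation, and looksParsable() is a hypothetical name:

```java
class ParsableCheck {

    // Hypothetical scanner: no parsing and no exceptions, just one pass over the characters
    static boolean looksParsable(String str) {
        if (str == null || str.isEmpty()) {
            return false;
        }
        int start = str.charAt(0) == '-' ? 1 : 0;
        if (start == str.length()) {
            return false; // a lone "-"
        }
        if (str.charAt(str.length() - 1) == '.') {
            return false; // a trailing decimal point, as in "5."
        }
        boolean seenPoint = false;
        boolean seenDigit = false;
        for (int i = start; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c == '.') {
                if (seenPoint) {
                    return false; // a second decimal point, as in "0..005"
                }
                seenPoint = true;
            } else if (c >= '0' && c <= '9') {
                seenDigit = true;
            } else {
                return false; // type qualifiers, 'x', 'e', a second '-' and so on
            }
        }
        return seenDigit;
    }
}
```

Because nothing is thrown for malformed input, a check like this keeps the cost of rejecting a String close to the cost of accepting one.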
5.3. StringUtils.isNumeric(CharSequence)
The method StringUtils.isNumeric(CharSequence) checks strictly for Unicode digits. This means:
- Any Unicode digit, from any language, is acceptable
- Since a decimal point isn't a Unicode digit, it's not valid
- Leading signs (either positive or negative) are also not acceptable
Let's now see this method in action:
    assertThat(StringUtils.isNumeric("123")).isTrue();
    assertThat(StringUtils.isNumeric("١٢٣")).isTrue();
    assertThat(StringUtils.isNumeric("१२३")).isTrue();

    assertThat(StringUtils.isNumeric(null)).isFalse();
    assertThat(StringUtils.isNumeric("")).isFalse();
    assertThat(StringUtils.isNumeric("  ")).isFalse();
    assertThat(StringUtils.isNumeric("12 3")).isFalse();
    assertThat(StringUtils.isNumeric("ab2c")).isFalse();
    assertThat(StringUtils.isNumeric("12.3")).isFalse();
    assertThat(StringUtils.isNumeric("-123")).isFalse();
Note that the second and third inputs represent the number 123 in Arabic and Devanagari digits respectively. Since they're valid Unicode digits, this method returns true for them.
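The same rules can be reproduced in plain Java with Character.isDigit(char), which returns true for decimal digits of any script. The following is a minimal sketch of such a check under a hypothetical name, isAllDigits(); it's meant as an approximation of the behavior described above, not as the library's source:

```java
class UnicodeDigitCheck {

    // Hypothetical helper: true only if every character is a Unicode digit
    static boolean isAllDigits(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isDigit(cs.charAt(i))) {
                return false; // signs, decimal points, spaces and letters all end up here
            }
        }
        return true;
    }
}
```

Since the loop only inspects characters, it never throws for malformed input.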
5.4. StringUtils.isNumericSpace(CharSequence)
The StringUtils.isNumericSpace(CharSequence) method checks strictly for Unicode digits and/or spaces. It's the same as StringUtils.isNumeric(), except that it also accepts spaces: not only leading and trailing spaces, but also spaces between digits. Unlike isNumeric(), it also returns true for an empty String:
    assertThat(StringUtils.isNumericSpace("123")).isTrue();
    assertThat(StringUtils.isNumericSpace("١٢٣")).isTrue();
    assertThat(StringUtils.isNumericSpace("")).isTrue();
    assertThat(StringUtils.isNumericSpace("  ")).isTrue();
    assertThat(StringUtils.isNumericSpace("12 3")).isTrue();

    assertThat(StringUtils.isNumericSpace(null)).isFalse();
    assertThat(StringUtils.isNumericSpace("ab2c")).isFalse();
    assertThat(StringUtils.isNumericSpace("12.3")).isFalse();
    assertThat(StringUtils.isNumericSpace("-123")).isFalse();
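The empty-String result is easier to understand with a plain-Java sketch of a check that behaves this way. As before, this is an approximation under a hypothetical name, isDigitsOrSpaces(), rather than the library's code:

```java
class DigitOrSpaceCheck {

    // Hypothetical helper: true if every character is a Unicode digit or a space
    static boolean isDigitsOrSpaces(CharSequence cs) {
        if (cs == null) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            char c = cs.charAt(i);
            if (c != ' ' && !Character.isDigit(c)) {
                return false;
            }
        }
        // Also reached for "", because an empty input contains nothing to reject
        return true;
    }
}
```

If an empty or all-space input shouldn't count as numeric in our application, we'd need an additional check before calling a method like this.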
6. Benchmarks
Before we conclude this article, let's go through some benchmark results to help us analyze which of the above-mentioned methods are best for our use case.
6.1. Simple Benchmark
First, we take a simple approach. We pick one string value – for our test we use Integer.MAX_VALUE. Then, that value will be tested against all our implementations:
    Benchmark                                     Mode  Cnt    Score    Error  Units
    Benchmarking.usingCoreJava                    avgt   20   57.241 ±  0.792  ns/op
    Benchmarking.usingNumberUtils_isCreatable     avgt   20   26.711 ±  1.110  ns/op
    Benchmarking.usingNumberUtils_isParsable      avgt   20   46.577 ±  1.973  ns/op
    Benchmarking.usingRegularExpressions          avgt   20  101.580 ±  4.244  ns/op
    Benchmarking.usingStringUtils_isNumeric       avgt   20   35.885 ±  1.691  ns/op
    Benchmarking.usingStringUtils_isNumericSpace  avgt   20   31.979 ±  1.393  ns/op
As we can see, the regular-expression approach is the most costly, followed by our core Java-based solution.
Moreover, note that the methods from the Apache Commons library perform broadly the same, and all of them are faster than the other two approaches.
6.2. Enhanced Benchmark
Let's use a more diverse set of tests, for a more representative benchmark:
- 95 values are numeric (0-94 and Integer.MAX_VALUE)
- 3 contain numbers but are still malformatted: "x0", "0..005", and "--11"
- 1 contains only text
- 1 is a null
Upon executing the same tests, we'll see the results:
    Benchmark                                     Mode  Cnt      Score     Error  Units
    Benchmarking.usingCoreJava                    avgt   20  10162.872 ± 798.387  ns/op
    Benchmarking.usingNumberUtils_isCreatable     avgt   20   1703.243 ± 108.244  ns/op
    Benchmarking.usingNumberUtils_isParsable      avgt   20   1589.915 ± 203.052  ns/op
    Benchmarking.usingRegularExpressions          avgt   20   7168.761 ± 344.597  ns/op
    Benchmarking.usingStringUtils_isNumeric       avgt   20   1071.753 ±   8.657  ns/op
    Benchmarking.usingStringUtils_isNumericSpace  avgt   20   1157.722 ±  24.139  ns/op
The most important difference is that two of our tests – the regular expressions solution and the core Java-based solution – have traded places.
From this result, we learn that throwing and handling the NumberFormatException, which occurs in only 5% of the cases, has a relatively big impact on overall performance. So we conclude that the optimal solution depends on our expected input.
Also, we can safely conclude that we should use the methods from the Commons library or a method implemented similarly for optimal performance.
7. Conclusion
In this article, we explored different ways to find out whether a String is numeric. We looked at both kinds of solutions: built-in methods and external libraries.
As always, the implementation of all examples and code snippets given above including the code used to perform benchmarks can be found over on GitHub.
Source: oschina
Link: https://my.oschina.net/ciet/blog/3161977