Hi, I've got a simple date format set up with a custom format string: MMddyy
and I give it the following value to parse: 4 1 01
I don't think it should parse
This is expected behaviour: you are telling the DateFormat object to expect a 6-character String representation of a date, and that is what you passed in. Spaces are parsed OK. However, if you used "4x1x01" you would get an error. Note that when parsing, leniency defaults to true, for example:
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

// Each snippet below is independent (df and date are redeclared each time).

DateFormat df = new SimpleDateFormat("MMddyy");
Date date = df.parse("4 1 01"); // lenient by default - runs successfully (as you know)

DateFormat df = new SimpleDateFormat("MMddyy");
Date date = df.parse("41 01"); // 5 character String - runs successfully

DateFormat df = new SimpleDateFormat("MMddyy");
df.setLenient(false); // strict parsing
Date date = df.parse("41 01"); // 5 character String - causes exception

DateFormat df = new SimpleDateFormat("MMddyy");
Date date = df.parse("999999"); // 6 character String - runs successfully

DateFormat df = new SimpleDateFormat("MMddyy");
df.setLenient(false); // strict parsing
Date date = df.parse("999999"); // 6 character String - causes exception
When leniency is set to true (the default behaviour), the parser makes an effort to interpret invalid input, e.g. the 35th day of a 31-day month becomes the 4th day of the next month.
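The rollover described above can be reproduced with a small sketch. It uses a delimited pattern ("MM/dd/yyyy") rather than the original MMddyy so that field boundaries are unambiguous; the class name LenientRollover and the helper parseToIso are made up for illustration, and Locale.US is an assumption to keep the result independent of the machine's default locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class LenientRollover {
    // Hypothetical helper: parses text with the given pattern and returns the
    // result as yyyy-MM-dd, or null if the parser rejects the input.
    static String parseToIso(String pattern, String text, boolean lenient) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setLenient(lenient);
        try {
            Date date = parser.parse(text);
            return new SimpleDateFormat("yyyy-MM-dd", Locale.US).format(date);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Lenient: day 35 of January rolls over to February 4th.
        System.out.println(parseToIso("MM/dd/yyyy", "01/35/2001", true));
        // Strict: the same text is rejected, so the helper returns null.
        System.out.println(parseToIso("MM/dd/yyyy", "01/35/2001", false));
    }
}
```

Returning null instead of rethrowing is only a convenience for the demo; real code would normally let the ParseException propagate.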
When parsing, the size of a pattern field (the number of repeated pattern letters) is not necessarily the expected size of the corresponding text. From the javadoc, for the relevant presentation types:
- Number: For parsing, the number of pattern letters is ignored unless it's needed to separate two adjacent fields.
- Year: During parsing, only strings consisting of exactly two digits […] will be parsed into the default century. Any other numeric string, such as a one digit string, a three or more digit string, or a two digit string that isn't all digits (for example, "-1"), is interpreted literally. So "01/02/3" or "01/02/003" are parsed, using the same pattern, as Jan 2, 3 AD.
- Month: If the number of pattern letters is 3 or more, the month is interpreted as text; otherwise, it is interpreted as a number.
The whitespace makes the parser stop reading the current field (trailing spaces are not valid in a number) and move on to the next one. Since the pattern has no space between these two fields, the space is not consumed and becomes part of the second field (leading spaces are valid). So the year text is not "exactly two digits" and is not resolved into the default century.
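The year rule quoted from the javadoc can be checked with a short sketch. It uses the javadoc's own delimited examples with the pattern "MM/dd/yy"; the class name TwoDigitYearRule and the helper parsedYear are hypothetical, and Locale.US is an assumption to avoid locale-specific calendars.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

class TwoDigitYearRule {
    // Hypothetical helper: strictly parses text and returns the calendar year.
    static int parsedYear(String pattern, String text) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setLenient(false);
        try {
            Calendar cal = new GregorianCalendar();
            cal.setTime(parser.parse(text));
            return cal.get(Calendar.YEAR);
        } catch (ParseException e) {
            throw new IllegalArgumentException("cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Exactly two digits: resolved into the default century, a window
        // from 80 years before to 20 years after the parser was created.
        System.out.println(parsedYear("MM/dd/yy", "01/11/12"));
        // One digit or three digits: the year is taken literally.
        System.out.println(parsedYear("MM/dd/yy", "01/02/3"));
        System.out.println(parsedYear("MM/dd/yy", "01/02/003"));
    }
}
```

Because the default century depends on when the SimpleDateFormat instance is created, the first result is tied to the current date, while the literal years are not.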
Parsing tests (lenient set to false):
FORMAT   TEXT     RESULT (ISO yyyy-MM-dd)
-------------------------------------------------
ddyy     01011    2011-01-10
ddyy     10 11    0011-01-10   (year is 3 chars: " 11")
ddyy     10 1     0001-01-10   (year is 2 chars but not 2 digits: " 1")
ddy      01011    2011-01-10   ("y" same as "yy")
dd yy    10 11    2011-01-10   (ok, whitespace is consumed, year: "11")
d/y      3/4      0004-01-03   (year is not 2 digits)
d/y      3/04     2004-01-03
M/d/y    4/6/11   2011-04-06
The two-digit year is ambiguous, so it is assumed to be 0001, the first year that would have ended in 01. Can you convert to four-digit years, perhaps using String manipulation?
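One way the String manipulation could look is sketched below. It rests on loud assumptions that are not in the question: the input is always month, day and two-digit year separated by whitespace, and every year belongs to the 2000s. The class FourDigitYearNormalizer and its methods are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class FourDigitYearNormalizer {
    // Hypothetical helper: rewrites "4 1 01" as "04/01/2001".
    // ASSUMPTION: three whitespace-separated fields, years are 2000-2099.
    static String normalize(String raw) {
        String[] parts = raw.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected three fields: " + raw);
        }
        int month = Integer.parseInt(parts[0]);
        int day = Integer.parseInt(parts[1]);
        int year = 2000 + Integer.parseInt(parts[2]);
        return String.format(Locale.ROOT, "%02d/%02d/%04d", month, day, year);
    }

    // Normalizes the text, parses it strictly, and returns it as yyyy-MM-dd.
    static String toIso(String raw) {
        SimpleDateFormat parser = new SimpleDateFormat("MM/dd/yyyy", Locale.US);
        parser.setLenient(false);
        try {
            Date date = parser.parse(normalize(raw));
            return new SimpleDateFormat("yyyy-MM-dd", Locale.US).format(date);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a valid date: " + raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("4 1 01"));
        System.out.println(toIso("4 1 01"));
    }
}
```

Adding explicit separators and a four-digit year before parsing removes both the whitespace ambiguity and the century guess, and strict parsing then rejects values such as month 13.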