I\'m making a cross-platform application that renames files based on data retrieved online. I\'d like to sanitize the Strings I took from a web API for the current platform.
or just do this:
String filename = "A20/B22b#öA\\BC#Ä$%ld_ma.la.xps";
String sane = filename.replaceAll("[^a-zA-Z0-9\\._]+", "_");
Result: A20_B22b_A_BC_ld_ma.la.xps
Explanation:
[a-zA-Z0-9\\._]
matches a letter from a-z lower or uppercase, numbers, dots and underscores
[^a-zA-Z0-9\\._]
is the inverse. i.e. all characters which do not match the first expression
[^a-zA-Z0-9\\._]+
is a sequence of characters which do not match the first expression
So every sequence of characters which does not consist of characters from a-z, 0-9 or . _ will be replaced.
Here is the code I use:
public static String sanitizeName( String name ) {
if( null == name ) {
return "";
}
if( SystemUtils.IS_OS_LINUX ) {
return name.replaceAll( "[\u0000/]+", "" ).trim();
}
return name.replaceAll( "[\u0000-\u001f<>:\"/\\\\|?*\u007f]+", "" ).trim();
}
SystemUtils
is from Apache commons-lang3