I'm receiving a string from an external process. I want to use that string to make a filename, and then write to that file. Here's my code snippet to do this:
You could remove the invalid characters ('/', '\', '?', '*') and then use the result.
This is probably not the most efficient way, but it shows how to do it using Java 8 stream pipelines:
import java.util.stream.Collectors;

private static String sanitizeFileName(String name) {
    return name
            .chars()
            .mapToObj(i -> (char) i)
            // Replace whitespace with underscores.
            .map(c -> Character.isWhitespace(c) ? '_' : c)
            // Keep only letters, digits, '-' and '_'.
            .filter(c -> Character.isLetterOrDigit(c) || c == '-' || c == '_')
            .map(String::valueOf)
            .collect(Collectors.joining());
}
The solution could be improved by writing a custom collector backed by a StringBuilder, so that each lightweight char does not have to be converted to a heavyweight String.
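The StringBuilder idea can be sketched without a hand-written Collector by using the three-argument IntStream.collect. This is only a minimal sketch: the class name BuilderSanitizer is hypothetical, and it assumes the same character rules as the pipeline above.

```java
final class BuilderSanitizer {
    // Same rules as the pipeline above: whitespace becomes '_',
    // then only letters, digits, '-' and '_' are kept.
    static String sanitizeFileName(String name) {
        return name.chars()
                .map(c -> Character.isWhitespace(c) ? '_' : c)
                .filter(c -> Character.isLetterOrDigit(c) || c == '-' || c == '_')
                // Accumulate straight into a StringBuilder; no per-character String.
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }
}
```

Staying on the IntStream also avoids boxing every char into a Character.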
For those looking for a general solution, these might be common criteria:
To achieve this we can use regex to match illegal characters, percent-encode them, then constrain the length of the encoded string.
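import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern PATTERN = Pattern.compile("[^A-Za-z0-9_\\-]");
private static final int MAX_LENGTH = 127;

public static String escapeStringAsFilename(String in) {
    StringBuffer sb = new StringBuffer();
    // Apply the regex.
    Matcher m = PATTERN.matcher(in);
    while (m.find()) {
        // Percent-encode each UTF-8 byte of the match as exactly two hex digits,
        // so that URLDecoder can reverse it (a bare toHexString yields "%9" for a tab).
        StringBuilder replacement = new StringBuilder();
        for (byte b : m.group().getBytes(StandardCharsets.UTF_8)) {
            replacement.append(String.format("%%%02X", b & 0xFF));
        }
        m.appendReplacement(sb, replacement.toString());
    }
    m.appendTail(sb);
    String encoded = sb.toString();
    // Truncate the string.
    int end = Math.min(encoded.length(), MAX_LENGTH);
    return encoded.substring(0, end);
}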
Patterns
The pattern above is based on a conservative subset of the portable filename character set in the POSIX spec (letters, digits, '.', '_' and '-').
If you want to allow the dot character, use:
private static final Pattern PATTERN = Pattern.compile("[^A-Za-z0-9_\\-\\.]");
Just be wary of strings like "." and "..", which refer to the current and parent directories rather than to a file.
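One possible way to handle those two names is to escape them after encoding. This is only a sketch: the class and method names are hypothetical, and writing each dot as %2E is an assumption, chosen because URLDecoder turns %2E back into a dot.

```java
final class DotNameGuard {
    // If the encoded name is exactly "." or "..", percent-encode its dots
    // so the result can no longer be read as a directory reference.
    static String guardDotNames(String encoded) {
        if (encoded.equals(".") || encoded.equals("..")) {
            return encoded.replace(".", "%2E");
        }
        return encoded;
    }
}
```

Names that merely contain dots, such as "notes.txt", pass through unchanged.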
If you want to avoid collisions on case-insensitive filesystems, you'll need to escape capitals:
private static final Pattern PATTERN = Pattern.compile("[^a-z0-9_\\-]");
Or escape lowercase letters instead:
private static final Pattern PATTERN = Pattern.compile("[^A-Z0-9_\\-]");
Rather than using a whitelist, you may choose to blacklist reserved characters for your specific filesystem. For example, this regex suits FAT32 filesystems:
private static final Pattern PATTERN = Pattern.compile("[%\\.\"\\*/:<>\\?\\\\\\|\\+,;=\\[\\]]");
Length
On Android, 127 characters is the safe limit. Many filesystems allow 255 characters.
If you prefer to retain the tail, rather than the head of your string, use:
// Truncate the string, keeping the tail.
int start = Math.max(0, encoded.length() - MAX_LENGTH);
return encoded.substring(start);
Decoding
To convert the filename back to the original string, use:
URLDecoder.decode(filename, "UTF-8");
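A round trip can be sketched as follows. The class RoundTripDemo and its method names are hypothetical. The encoder here is an assumption: it writes each illegal character as its UTF-8 bytes in two-digit hex form, which is the form URLDecoder expects, and it skips truncation so that decoding is lossless.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RoundTripDemo {
    private static final Pattern PATTERN = Pattern.compile("[^A-Za-z0-9_\\-]");

    // Percent-encode every non-whitelisted character as its UTF-8 bytes (no truncation).
    static String encode(String in) {
        StringBuffer sb = new StringBuffer();
        Matcher m = PATTERN.matcher(in);
        while (m.find()) {
            StringBuilder hex = new StringBuilder();
            for (byte b : m.group().getBytes(StandardCharsets.UTF_8)) {
                hex.append(String.format("%%%02X", b & 0xFF));
            }
            m.appendReplacement(sb, hex.toString());
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Reverse the encoding; the checked exception cannot occur for UTF-8.
    static String decode(String filename) {
        try {
            return URLDecoder.decode(filename, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For instance, encode("report 2024/Q1.txt") gives "report%202024%2FQ1%2Etxt", and decode returns the original string. A literal '+' is never left unescaped by this whitelist, so URLDecoder's habit of reading '+' as a space does not interfere.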
Limitations
Because longer strings are truncated, name collisions are possible when encoding, and decoding can fail or return corrupted text, for example when the cut falls in the middle of a percent escape.
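The split-escape case can be softened by backing up the cut point. This is a sketch under two assumptions: every '%' in the encoded string starts an escape of exactly two hex digits, and the names SafeTruncate and truncate are hypothetical. It does not prevent collisions, and it can still cut between the bytes of a multi-byte UTF-8 character.

```java
final class SafeTruncate {
    // Cut `encoded` to at most maxLength chars without splitting a "%XX" escape.
    static String truncate(String encoded, int maxLength) {
        if (encoded.length() <= maxLength) {
            return encoded;
        }
        int end = maxLength;
        // A '%' in either of the last two kept positions starts an escape
        // that would be cut off, so back up to just before it.
        for (int i = Math.max(0, end - 2); i < end; i++) {
            if (encoded.charAt(i) == '%') {
                end = i;
                break;
            }
        }
        return encoded.substring(0, end);
    }
}
```

The result may be up to two characters shorter than the limit, but it never ends in a partial escape.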