I have a function that uses Pattern#compile and a Matcher to search a list of strings for a pattern. This function is used in multiple threads.
To sum up, you can reuse the compiled Pattern(s), for example by keeping them in static variables, and ask them for a new Matcher whenever you need to validate a string against one of those regex patterns:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Validation helpers.
 */
public final class Validators {

    private static final String EMAIL_REGEX = "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,})$";

    // Compiled once; a Pattern is immutable and safe to share between threads.
    private static final Pattern EMAIL_PATTERN = Pattern.compile(EMAIL_REGEX);

    /**
     * Checks whether an e-mail address is valid.
     */
    public static boolean isValidEmail(String email) {
        // A new Matcher per call, so no Matcher state is shared between threads.
        Matcher matcher = EMAIL_PATTERN.matcher(email);
        return matcher.matches();
    }
}
See http://zoomicon.wordpress.com/2012/06/01/validating-e-mails-using-regular-expressions-in-java/ (near the end) for details on the regex used above for validating e-mails, in case the version posted here doesn't fit your needs.
While you need to remember that thread safety has to take into account the surrounding code as well, you appear to be in luck. The fact that Matchers are created using the Pattern's matcher factory method and lack public constructors is a positive sign. Likewise, you use the compile static method to create the encompassing Pattern.
So, in short, if you do something like the example:
Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher("aaaaab");
boolean b = m.matches();
you should be doing pretty well.
Follow-up to the code example for clarity: note that this example strongly implies that the Matcher thus created is confined to the same thread as the Pattern and the test. I.e., you should not expose the Matcher thus created to any other threads.
Frankly, that's the risk of any thread-safety question. The reality is that any code can be made thread-unsafe if you try hard enough. Fortunately, there are wonderful books that teach us a whole bunch of ways that we could ruin our code. If we stay away from those mistakes, we greatly reduce our own probability of threading problems.
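As a rough illustration of the pattern described above, here is a minimal sketch in which one compiled Pattern is shared by several pool threads while every task creates its own Matcher. The class name SharedPatternDemo and its methods are hypothetical names chosen for this example; they are not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Pattern;

// Hypothetical demo class: one shared Pattern, one Matcher per call.
final class SharedPatternDemo {

    // Compiled once; Pattern is documented as immutable and thread-safe.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // The Matcher is a local value and never leaves the calling thread.
    static boolean matches(String input) {
        return A_STAR_B.matcher(input).matches();
    }

    // Tests every input on a pool thread and counts the full matches.
    static int countMatches(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String input : inputs) {
                Callable<Boolean> task = () -> matches(input);
                results.add(pool.submit(task));
            }
            int count = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    count++;
                }
            }
            return count;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("matching task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The only object the threads have in common is the Pattern; each Matcher lives and dies inside a single call to matches().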
A quick look at the code for Matcher.java shows a bunch of member variables, including the text that is being matched, arrays for groups, a few indexes to maintain location and a few booleans for other state. This all points to a stateful Matcher that would not behave well if accessed by multiple Threads. So does the JavaDoc:
Instances of this class are not safe for use by multiple concurrent threads.
This is only an issue if, as @Bob Cross points out, you go out of your way to allow use of your Matcher in separate Threads. If you need to do this, and you think that synchronization will be an issue for your code, an option you have is to use a ThreadLocal storage object to maintain a Matcher per working thread.
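A minimal sketch of the ThreadLocal option might look like the following. The class name PerThreadMatcher and the pattern "a*b" are assumptions made for illustration only; ThreadLocal.withInitial requires Java 8 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: keeps one Matcher per thread instead of one per call.
final class PerThreadMatcher {

    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Each thread lazily gets its own Matcher on first access.
    private static final ThreadLocal<Matcher> MATCHER =
            ThreadLocal.withInitial(() -> A_STAR_B.matcher(""));

    static boolean matches(CharSequence input) {
        // reset(CharSequence) points this thread's Matcher at the new input
        // and clears any state left over from the previous match.
        return MATCHER.get().reset(input).matches();
    }
}
```

Whether this beats simply calling Pattern.matcher() per use depends on the workload, so it is worth measuring before adopting it.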
Yes. From the Java API documentation for the Pattern class:
Instances of this (Pattern) class are immutable and are safe for use by multiple concurrent threads. Instances of the Matcher class are not safe for such use.
If you are writing performance-critical code, consider resetting the Matcher instance with the reset() method instead of creating new instances. This resets the state of the Matcher instance, making it usable for the next regex operation. In fact, it is precisely the state maintained in the Matcher instance that makes it unsafe for concurrent access.
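A small sketch of reusing one Matcher through reset, assuming the Matcher stays in a local variable and is therefore only ever seen by one thread. ResetDemo and findContaining are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: searches a list of strings with a single reused Matcher.
final class ResetDemo {

    // Returns the lines in which the pattern occurs at least once.
    static List<String> findContaining(Pattern pattern, List<String> lines) {
        List<String> hits = new ArrayList<>();
        // Created once per call and kept local, so it is never shared.
        Matcher matcher = pattern.matcher("");
        for (String line : lines) {
            // reset(CharSequence) rebinds the Matcher to the next input.
            if (matcher.reset(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```

The Pattern passed in can still be a shared static instance; only the Matcher is confined to the call.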
Thread-safety with regular expressions in Java
SUMMARY:
The Java regular expression API has been designed to allow a single compiled pattern to be shared across multiple match operations.
You can safely call Pattern.matcher() on the same pattern from different threads and safely use the resulting matchers concurrently, as long as each matcher is used by only one thread. Constructing matchers with Pattern.matcher() requires no synchronization. Although the method isn't synchronized, internal to the Pattern class, a volatile variable called compiled is always set after constructing a pattern and read at the start of the call to matcher(). This forces any thread referring to the Pattern to correctly "see" the contents of that object.
On the other hand, you shouldn't share a Matcher between different threads. Or at least, if you ever did, you should use explicit synchronization.
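If a Matcher really does have to be shared, explicit synchronization could be sketched roughly as below. SynchronizedMatcher is a hypothetical wrapper written for this illustration, not a class from the JDK; it serializes each complete reset-and-match operation on the wrapper's own lock.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper: guards a single shared Matcher with a lock.
final class SynchronizedMatcher {

    private final Matcher matcher;

    SynchronizedMatcher(Pattern pattern) {
        this.matcher = pattern.matcher("");
    }

    // The whole reset-then-match sequence runs under one lock, so no other
    // thread can change the Matcher's input or state partway through.
    synchronized boolean matches(CharSequence input) {
        return matcher.reset(input).matches();
    }
}
```

Because every caller contends for the same lock, this tends to be less attractive than giving each thread or each call its own Matcher.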