I am running a regex in a Java function to parse a document and return true if it finds the string specified by the regex and false if it doesn't. But the problem
You don't show the function that actually performs the regex, so I'll assume that it reads lines from the file and executes the regex over each line.
If that is the case, then a better solution is to pass a timeout value to that function. After every N lines (whatever N might be), it checks whether the timeout has elapsed and gives up if it has.
The real problem that you'll have is with blocking IO, for example reading from a network. In that case a periodic timeout check cannot help, because the thread is blocked inside the OS kernel and never reaches the check.
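The "check the timeout every N lines" idea can be sketched roughly as follows. This is a minimal sketch under assumptions, not the asker's actual function: the class DeadlineLineScanner, the Result enum, and the parameter names are hypothetical, and the lines are assumed to be already available as an Iterable<String>. The check only runs between lines, so it does not bound a single slow match on one very long line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DeadlineLineScanner {
    // Hypothetical result type: a match was found, no line matched, or time ran out.
    enum Result { FOUND, NOT_FOUND, TIMED_OUT }

    // Runs the regex over each line; every checkEvery lines it compares the
    // clock against the deadline and gives up if the deadline has passed.
    static Result scan(Iterable<String> lines, Pattern pattern,
                       long timeoutMillis, int checkEvery) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long lineCount = 0;
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                return Result.FOUND;
            }
            lineCount++;
            if (lineCount % checkEvery == 0 && System.nanoTime() - deadline > 0) {
                return Result.TIMED_OUT;
            }
        }
        return Result.NOT_FOUND;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("first line", "second line with needle", "third line");
        System.out.println(scan(doc, Pattern.compile("needle"), 1000, 100)); // prints FOUND
    }
}
```

A larger N means fewer clock reads but a coarser timeout; which trade-off is right depends on how long a single line takes to process.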
What you've done looks mostly fine to me. Here's how I'd modify it:
// Holds the worker's result; stays null until method2 has produced one.
final AtomicReference<String> resultXml = new AtomicReference<String>();
RegexpThread rt = new RegexpThread() {
    public void run() {
        method2(m, urlCopy, document, resultXml);
    }
};
rt.start();
try {
    rt.join(6 * 1000); // wait at most 6 seconds for the worker
} catch (InterruptedException e) {
    return "y";
}
if (resultXml.get() == null) {
    rt.interrupt(); // timed out: ask the worker to stop
    return "g";
}
return resultXml.get();
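One caveat with the interrupt call above: a running regex match does not check the thread's interrupt flag on its own. A commonly used workaround, which is a different technique from what this answer shows, is to wrap the input in a CharSequence whose charAt() checks the flag. The sketch below assumes that approach; the names TimedRegex, matchesWithin and InterruptibleCharSequence are hypothetical, and it relies on the regex engine reading its input through charAt().

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.regex.Pattern;

class TimedRegex {
    // Hypothetical helper: returns TRUE/FALSE if the match finished in time, or null on timeout.
    static Boolean matchesWithin(Pattern pattern, CharSequence input, long timeoutMillis) {
        AtomicReference<Boolean> result = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                result.set(pattern.matcher(new InterruptibleCharSequence(input)).matches());
            } catch (IllegalStateException e) {
                // Matching was abandoned after an interrupt; result stays null.
            }
        });
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        if (result.get() == null) {
            worker.interrupt(); // the worker's next charAt() call will throw
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(matchesWithin(Pattern.compile("a+b"), "aaab", 1000)); // prints true
    }
}

// Wrapper that makes every character read an interruption point.
class InterruptibleCharSequence implements CharSequence {
    private final CharSequence inner;

    InterruptibleCharSequence(CharSequence inner) {
        this.inner = inner;
    }

    @Override
    public char charAt(int index) {
        if (Thread.currentThread().isInterrupted()) {
            throw new IllegalStateException("regex matching interrupted");
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new InterruptibleCharSequence(inner.subSequence(start, end));
    }

    @Override
    public String toString() {
        return inner.toString();
    }
}
```

The per-character flag check adds some overhead to every match, so this is probably best reserved for patterns or inputs that are not trusted.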
This answer is perhaps late for the post, and the Java version has changed since then, but the mechanism described below works for me.
The central idea is to replace the input text being evaluated with an empty string while the matching is in progress. The input for the test below is based on the OWASP ReDoS example; it was lengthened because the original input was too short to trigger the complexity.
package org.test.xpath;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InterruptableMatcherTest {
    public static void main(String[] args) throws Exception {
        // Nested quantifiers: prone to heavy backtracking on input that does not match
        Pattern pattern = Pattern.compile("^(([a-z])+.)+[A-Z]([a-z])+$");
        String input = "aaaaaaaaaaaaaaaaaaaaaffffdffffdffffdffffdffffdffffdffffdffffdffffdffffdffffdffffdffffdaaaaaaaaaaaa!";
        PatternMatcher patternMatcher = new PatternMatcher(pattern, input);
        Thread thread = new Thread(patternMatcher);
        thread.start();
        Thread.sleep(1 * 1000);
        System.out.println("Done sleeping ...");
        if (patternMatcher.running) patternMatcher.reset(); // Without this call the program will hang
        thread.join();
    } // main closing
} // class closing

class PatternMatcher implements Runnable {
    Pattern pattern;
    Matcher matcher;
    // volatile: written by the worker thread, read by the main thread
    volatile boolean running = false;

    PatternMatcher(Pattern pattern, String input) {
        this.pattern = pattern;
        matcher = this.pattern.matcher(input);
    } // constructor closing

    @Override
    public void run() {
        running = true;
        matcher.matches();
        running = false;
    } // run closing

    void reset() {
        System.out.println("Reset called ...");
        matcher.reset(""); // swap the input for an empty string while matching is in progress
    } // reset closing
} // class closing
The reset() method resets the matcher's input to an empty string. See the source of the Matcher class: Matcher.reset(CharSequence input) calls Matcher.reset(), which in turn sets the start and end of the text region to be matched to 0, so the match stops at its next step. This mechanism works for me, terminating the matching process after a set timeout.
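The effect of reset("") on the region can be checked in a single thread. The sketch below is illustrative only: the class ResetRegionDemo and its method are hypothetical names, and it does not exercise the cross-thread behaviour the answer relies on. Note that the Matcher documentation describes instances as not safe for use by multiple concurrent threads, so the cross-thread effect depends on implementation details.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetRegionDemo {
    // Returns {regionStart, regionEnd} after the input is swapped for an empty string.
    static int[] regionAfterReset(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        matcher.reset("");
        return new int[] { matcher.regionStart(), matcher.regionEnd() };
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("[a-z]+");
        Matcher matcher = pattern.matcher("hello");
        System.out.println(matcher.regionEnd()); // prints 5, the length of "hello"
        int[] region = regionAfterReset(pattern, "hello");
        System.out.println(region[0] + " " + region[1]); // prints 0 0
    }
}
```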
You can use AOP and a @Timeable annotation from jcabi-aspects (I'm a developer):
@Timeable(limit = 1, unit = TimeUnit.SECONDS)
String yourMethod() {
// execution as usual
}
Make sure that somewhere in your method you check Thread#isInterrupted():
if (Thread.currentThread().isInterrupted()) {
    throw new IllegalStateException("time out");
}
When the time limit is reached, your thread's isInterrupted() flag is set to true, and it's your job to handle this situation correctly and stop execution.
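To show where such a check usually sits, here is a minimal sketch of a long-running loop that polls the interrupt flag on each iteration. It uses only the standard Thread API and does not involve jcabi-aspects; the class InterruptAwareWork and the method runSteps are hypothetical stand-ins for your own method.

```java
class InterruptAwareWork {
    // Performs the given number of steps, aborting as soon as the
    // current thread's interrupt flag is found to be set.
    static long runSteps(long steps) {
        long done = 0;
        while (done < steps) {
            if (Thread.currentThread().isInterrupted()) {
                throw new IllegalStateException("time out");
            }
            done++; // stand-in for one unit of real work
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println(runSteps(1000)); // prints 1000
    }
}
```

The check only helps if each unit of work between two checks is short; a single long call, such as one pathological regex match, will not notice the flag until it returns.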