Is there an algorithm that can produce a regular expression (maybe limited to a simplified grammar) from a set of strings such that the evaluation of all possible strings th
You can try using the Aho-Corasick algorithm to build a finite state machine from the input strings (the automaton is essentially a trie of those strings), after which it should be fairly easy to generate a simplified regex. Taking your input strings as an example:
h_q1_a
h_q1_b
h_q1_c
h_p2_a
h_p2_b
h_p2_c
will generate a finite state machine that most probably looks like this:
        [h_]          <-level 0
        /  \
     [q1]  [p2]       <-level 1
        \  /
        [_]           <-level 2
        /\ \
       /  \ \
      a    b c        <-level 3
Now, for every level/depth of the trie, all the strings at that level (if there is more than one) go inside OR brackets, so
h_(q1|p2)_(a|b|c)
L0  L1   L2 L3
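Here is a minimal Python sketch of that idea. Note that it is not Aho-Corasick itself: instead of building the automaton explicitly, it works on the set of strings directly, pulling out the common prefix and suffix and merging branches that share the same set of tails, which gives the same "collapse each level" effect. The names `build_regex` and `common_suffix` are made up for illustration, and this is only one possible way to do it.

```python
import os
import re


def common_suffix(strings):
    """Longest suffix shared by every string (hypothetical helper)."""
    return os.path.commonprefix([s[::-1] for s in strings])[::-1]


def build_regex(strings):
    """Return a regex that matches exactly the given strings (a sketch)."""
    strings = list(dict.fromkeys(strings))  # drop duplicates, keep input order
    if not strings:
        raise ValueError("need at least one string")
    if len(strings) == 1:
        return re.escape(strings[0])
    if "" in strings:
        # The empty string is allowed, so everything else becomes optional.
        return "(" + build_regex([s for s in strings if s]) + ")?"

    # Factor out whatever all strings share at the front, then at the back.
    prefix = os.path.commonprefix(strings)
    if prefix:
        return re.escape(prefix) + build_regex([s[len(prefix):] for s in strings])
    suffix = common_suffix(strings)
    if suffix:
        return build_regex([s[:-len(suffix)] for s in strings]) + re.escape(suffix)

    # Split into branches by first character, as the children of a trie node.
    groups = {}
    for s in strings:
        groups.setdefault(s[0], []).append(s)

    # Branches with the same set of tails can share one copy of those tails.
    merged = {}
    for group in groups.values():
        head = os.path.commonprefix(group)
        tails = [s[len(head):] for s in group]
        merged.setdefault(frozenset(tails), ([], tails))[0].append(head)

    alternatives = []
    for heads, tails in merged.values():
        if tails == [""]:
            # Plain leaves: each head is its own alternative.
            alternatives.extend(re.escape(h) for h in heads)
        else:
            alternatives.append(build_regex(heads) + build_regex(tails))

    if len(alternatives) == 1:
        return alternatives[0]
    return "(" + "|".join(alternatives) + ")"


words = ["h_q1_a", "h_q1_b", "h_q1_c", "h_p2_a", "h_p2_b", "h_p2_c"]
print(build_regex(words))  # h_(q1|p2)_(a|b|c)
```

For the six strings above this prints h_(q1|p2)_(a|b|c). When the input is not a clean product of levels, the levels cannot all be collapsed and the sketch falls back to nested alternations, for example (a(b)?|cd) for "ab", "cd", "a".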