How can I match irreducible fractions with regex?
For example, 23/25, 3/4, 5/2, 100/101, etc.
First of all, I have no idea about the gcd-algorithm realization in
Since the poster requested a single regex that matches against strings like "36/270", but says it doesn’t matter how legible it is, that regex is:
my $reducible_rx = qr{^(\d+)/(\d+)$(?(?{(1x$1."/".1x$2)=~m{^(?|1+/(1)|(11+)\1*/\1+)$}})|^)};
But if, like me, you believe that an illegible regex is absolutely unacceptable, you will write that more legibly as:
my $reducible_rx = qr{
    # first match a fraction:
    ^ ( \d+ ) / ( \d+ ) $
    # now for the hard part:
    (?(?{ ( 1 x $1 . "/" . 1 x $2 ) =~ m{
            ^
            (?|    1+      / (1)    # trivial case: GCD=1
              |  (11+) \1* / \1+    # find the GCD
            )
            $
        }x
    })
        # more portable version of (*PASS)
      | ^   # more portable version of (*FAIL)
    )
}x;
You can improve maintainability by splitting out the version that matches the unary version from the one that matches the decimal version like this:
# this one assumes unary notation
my $unary_rx = qr{
    ^
    (?|    1+      / (1)
      |  (11+) \1* / \1+
    )
    $
}x;
# this one assumes decimal notation and converts internally
my $decimal_rx = qr{
    # first match a fraction:
    ^ ( \d+ ) / ( \d+ ) $
    # now for the hard part:
    (?(?{( 1 x $1 . "/" . 1 x $2 ) =~ $unary_rx})
        # more portable version of (*PASS)
      | ^   # more portable version of (*FAIL)
    )
}x;
Isn’t that much easier when it is separated into two named regexes? That would now make $reducible_rx the same as $decimal_rx, but the unary version is its own thing. That’s how I would do it, but the original poster wanted a single regex, so you’d have to interpolate the nested one for that, as I first presented above.
Either way, you can plug into the test harness below using:
if ($frac =~ $reducible_rx) {
    cmp_ok($frac, "ne", reduce($i, $j), "$i/$j is $test");
} else {
    cmp_ok($frac, "eq", reduce($i, $j), "$i/$j is $test");
}
And you will see that it is a correct regex that passes all tests, and does so moreover using a single regex, wherefore having now passed all requirements of the original question, I declare Qᴜᴏᴅ ᴇʀᴀᴛ ᴅᴇᴍᴏɴsᴛʀᴀɴᴅᴜᴍ: “Quit, enough done.”
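For readers who want to try the unary trick outside Perl, here is a minimal Python sketch of the same idea: convert both numbers to runs of 1s, then let a backreference look for a block of two or more 1s that tiles both sides. The names `is_reducible` and `UNARY_RX` are made up for this illustration. Python's `re` has no `(?|...)` branch reset, so the pattern uses a single capture group instead. As in the regexes above, a denominator of 1 counts as reducible, and zero is not handled.

```python
import re

# Unary form of "a/b" is "1"*a + "/" + "1"*b.
# First branch: the denominator is 1 (the trivial case in the Perl version).
# Second branch: a block of two or more 1s tiles both sides, i.e. a common divisor > 1.
UNARY_RX = re.compile(r"1+/1|(11+)\1*/\1+")
DECIMAL_RX = re.compile(r"([0-9]+)/([0-9]+)")


def is_reducible(frac: str) -> bool:
    """Return True if the decimal fraction string can be reduced.

    Assumes positive integers; the unary strings grow with the values,
    so this is only practical for small numbers.
    """
    m = DECIMAL_RX.fullmatch(frac)
    if m is None:
        raise ValueError(f"not a fraction: {frac!r}")
    a, b = int(m.group(1)), int(m.group(2))
    return UNARY_RX.fullmatch("1" * a + "/" + "1" * b) is not None


print(is_reducible("36/270"))  # True
print(is_reducible("3/4"))     # False
```

The cost is the same as in the Perl version: the match time and memory depend on the magnitude of the numbers, not on the number of digits.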
If you write the numbers in unary, and use ":" as the division sign, I think this matches reducible fractions:
/^1+:1$|^(11+?)\1*:\1+$/
You can then use !~ to find strings that don't match.
Based on this: http://montreal.pm.org/tech/neil_kandalgaonkar.shtml
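The "use the negation" idea can be sketched in Python as a filter: build the unary string for each pair and keep the pairs that do not match. The pattern and the helper name `irreducible_pairs` are assumptions written for this illustration. The pattern treats a denominator of 1 as reducible.

```python
import re

# Assumed colon-separated unary pattern for this sketch:
# either the denominator is 1, or a block of two or more 1s tiles both sides.
REDUCIBLE_UNARY = re.compile(r"1+:1|(11+?)\1*:\1+")


def irreducible_pairs(limit: int) -> list[tuple[int, int]]:
    """Return all (a, b) with 1 <= a, b <= limit whose unary form does NOT match."""
    return [
        (a, b)
        for a in range(1, limit + 1)
        for b in range(1, limit + 1)
        if REDUCIBLE_UNARY.fullmatch("1" * a + ":" + "1" * b) is None
    ]


print(irreducible_pairs(4))
# [(1, 2), (1, 3), (1, 4), (2, 3), (3, 2), (3, 4), (4, 3)]
```

Comparing such a list against a plain gcd computation over a small range is a cheap way to catch a pattern that misses cases such as 2/4, where the common divisor equals the numerator.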
Nope, it cannot be done. Like a good computer scientist I will ignore the specifics of the regex tool and assume you are asking whether there is a regular expression. I do not know enough about regex features to be sure that the tool is restricted to regular expressions. That caveat aside, on with the show.
Rewording this we get:
Let L be the language {"a/b" | where a and b are natural numbers encoded in a radix r, and a and b are coprime}. Is L regular?

Assume such a language is regular. Then there exists a DFA that can decide membership in L. Let N be the number of states of such a DFA. There are infinitely many primes, so there are arbitrarily many primes greater than the largest number encodable in N digits in the radix r. (Note: for r ≥ 2 the largest such number is r^N − 1. I am using this weird wording to show how to accommodate unary.) Select N+1 primes that are greater than this number. All of these primes are encoded using at least N+1 digits (in the radix r). Enumerate these primes p₀ to pₙ, with n = N. Let sᵢ be the state the DFA is in immediately after reading pᵢ followed by the /. By the pigeonhole principle, there are N states and N+1 values sᵢ, so there exists at least one pair of distinct indexes (j,k) such that sⱼ = sₖ. So starting from the initial state of the DFA, the inputs pₖ/ and pⱼ/ lead to the same state sⱼ (or sₖ), and pⱼ and pₖ are distinct primes.

L must contain all pairs of distinct primes p/q, as they are coprime, and exclude all primes divided by themselves p/p, as p is not coprime to p. Now the language contains pⱼ/pₖ, so there is a sequence of states from sⱼ, reading the string pₖ, to an accepting state; call this sequence β. Let α be the sequence of states reading pₖ/ starting from the initial state. The sequence of states for the DFA starting at the initial state for the string pₖ/pₖ must be the same as α followed by β. This sequence starts in the initial state, goes to sₖ (by reading the input pₖ/), and reaches an accepting state by reading pₖ. So the DFA accepts pₖ/pₖ, which means pₖ/pₖ is in L. But pₖ is not coprime to pₖ, and therefore pₖ/pₖ is not in L. Contradiction. Therefore the language L is not regular; that is, no regular expression for it exists.
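The key step of the argument can be written compactly. The notation is an assumption added for illustration and does not appear in the answer itself: δ* is the DFA's extended transition function, q₀ its initial state, and F its set of accepting states.

```latex
\begin{align*}
\delta^*(q_0,\; p_j/) &= s_j = s_k = \delta^*(q_0,\; p_k/) \\
\delta^*(q_0,\; p_k/p_k) &= \delta^*(s_k,\; p_k) = \delta^*(s_j,\; p_k)
                          = \delta^*(q_0,\; p_j/p_k) \in F
\end{align*}
```

So the DFA would have to accept pₖ/pₖ, even though gcd(pₖ, pₖ) = pₖ ≠ 1.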
You can tell that a number ending in 0 or 5 is divisible by 5, and that one ending in 0, 2, 4, 6, or 8 is divisible by 2.
For 3, 4, 6, 7, 8, and 9 as divisors I wouldn't expect such a simple possibility, and not for arbitrary divisors either.
I guess you know the method for deciding divisibility by 3: build the recursive digit sum, which has to be divisible by 3 for the number to be divisible by 3. So you can eliminate all 3s, 6s, and 9s from the number, as well as the 0s.
For an arbitrary number, you would proceed as follows; if the result is empty, the number was divisible by 3:
echo ${RANDOM}${RANDOM}${RANDOM} | sed 's/[0369]//g;s/[47]/1/g;s/[58]/2/g;s/2/11/g;s/1\{3\}//g'
A similar approach could work for 9, where you have a similar rule. But a general approach for arbitrary divisors?
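The sed pipeline for divisibility by 3 can be sketched step by step in Python. The function names `mod3_unary` and `divisible_by_3` are hypothetical. The idea is that every digit is replaced by its remainder mod 3 written in unary, and groups of three 1s are then dropped, so the surviving string has length n mod 3.

```python
import re


def mod3_unary(decimal: str) -> str:
    """Reduce a decimal digit string to '', '1', or '11' (its value mod 3 in unary)."""
    s = re.sub(r"[0369]", "", decimal)  # digits that are 0 mod 3 contribute nothing
    s = re.sub(r"[47]", "1", s)         # 4 and 7 are 1 mod 3
    s = re.sub(r"[58]", "2", s)         # 5 and 8 are 2 mod 3
    s = s.replace("2", "11")            # write every 2 as two 1s
    s = re.sub(r"1{3}", "", s)          # drop complete groups of three 1s
    return s


def divisible_by_3(decimal: str) -> bool:
    """A number is divisible by 3 exactly when nothing is left over."""
    return mod3_unary(decimal) == ""


print(divisible_by_3("123456"))  # True
print(mod3_unary("1000"))        # 1
```

Each substitution is itself a regular operation, but the chain of rewrites is a small program and not a single match, which is part of why the same trick does not obviously carry over to arbitrary divisors.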