Are Perl regexes Turing complete?

Submitted by 房东的猫 on 2019-11-27 18:41:36

Excluding any kind of embedded code, such as (?{ }), they probably don't cover all context-free languages, much less everything a Turing machine can compute. They might, but to my knowledge, nobody has actually proven it one way or another. Given that people have been trying to solve certain context-free problems with Perl regexes for a while and haven't come up with a solution yet, it's likely that they do not cover all context-free languages.
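As a small data point on the context-free question: Perl 5.10 and later support recursive subpatterns such as (?1), which handle at least some context-free languages without any embedded code. The sketch below matches balanced parentheses. The name is_balanced is made up for illustration, and the example says nothing about whether every context-free language can be covered this way.

```perl
use strict;
use warnings;
use v5.10;    # recursive subpatterns such as (?1) need Perl 5.10 or later

# Group 1 is one balanced pair: "(", any number of nested balanced pairs, ")".
# (?1) recurses into group 1; the outer * allows a sequence of such pairs.
my $balanced = qr/^ ( \( (?1)* \) )* $/x;

# Hypothetical helper: returns 1 if the string is a balanced sequence of parentheses.
sub is_balanced {
    my ($s) = @_;
    return $s =~ $balanced ? 1 : 0;
}

for my $s ('', '()', '(()())', '(()', ')(') {
    printf "%-8s %s\n", "'$s'", is_balanced($s) ? 'balanced' : 'not balanced';
}
```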

There is an interesting discussion to be had about which features are merely convenient and which actually add power. For instance, matching 0^n 1 0^n (that's notation for "any number of zeros, followed by a one, followed by the same number of zeros as before") is not something that can be done with pure regexes, meaning regular expressions in the formal sense. You can prove this with the Pumping Lemma, but the simple, informal argument is that the regex would have to count an arbitrary number of zeros, and regexes can't count.
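For readers who want the Pumping Lemma argument spelled out, here is a short sketch for the language of n zeros, a one, then n zeros again. The symbols L, p, w, x, y, z and k are the usual textbook names, introduced here for illustration.

```latex
\begin{aligned}
&\text{Let } L = \{\, 0^n 1 0^n : n \ge 0 \,\} \text{ and suppose } L \text{ is regular with pumping length } p. \\
&\text{Take } w = 0^p 1 0^p \in L \text{ and write } w = xyz \text{ with } |xy| \le p,\ |y| \ge 1. \\
&\text{Then } y = 0^k \text{ for some } k \ge 1, \text{ lying entirely in the first block of zeros.} \\
&\text{Pumping once gives } x y^2 z = 0^{p+k} 1 0^p \notin L, \text{ contradicting the lemma, so } L \text{ is not regular.}
\end{aligned}
```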

However, backreferences can match that with:

/^ (0*) 1 \1 $/x;

So that means backreferences give you more power, and are not a mere convenience. What else might give us more power, I wonder?
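A quick way to try the backreference pattern is sketched below. It is anchored with ^ and $ so that the whole string must have the required shape, not just some substring of it. The helper name is_0n_1_0n is made up for this example.

```perl
use strict;
use warnings;

# (0*) captures the leading zeros; \1 then demands exactly the same run again.
my $same_zeros = qr/^ (0*) 1 \1 $/x;

# Hypothetical helper: returns 1 if the string is n zeros, a one, then n zeros.
sub is_0n_1_0n {
    my ($s) = @_;
    return $s =~ $same_zeros ? 1 : 0;
}

for my $s ('1', '010', '00100', '0010', '0100', '0110') {
    printf "%-6s %s\n", $s, is_0n_1_0n($s) ? 'match' : 'no match';
}
```

The first three strings match and the last three do not, because their zero counts differ or they contain a second one.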

Also, Perl 6 (since renamed Raku) "patterns" (they're not even pretending they're regexes anymore) are designed to look kinda like Perl 5 regexes (so you don't need to relearn much), but they have enough features added to be fully context-free. They're actually designed so you can use them to alter the way the language is parsed within a lexical scope.

There are at least two discussions, "Turing completeness and regular expressions" and "Are Perl patterns universal?", with further references.

The consensus (to my untrained eye) seems to be that the answer is "no", but I am not sure if I understand everything correctly.

usr

For regexes in Perl there are two cases:

  1. With embedded code: They are of course Turing-complete.
  2. Without embedded code: They always halt so they are not general Turing machines.
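To make the first case concrete, here is a minimal sketch of embedded code: a (?{ }) block is ordinary Perl that runs while the pattern is being matched, so the regex can do anything Perl can. The names $zeros and count_leading_zeros are made up for this example.

```perl
use strict;
use warnings;

our $zeros;    # package variable, visible to the code block on any Perl version

# Hypothetical helper: counts the zeros before a final "1" using embedded code.
# Returns the count on a match and undef otherwise.
sub count_leading_zeros {
    my ($input) = @_;
    $zeros = 0;
    # The (?{ ... }) block runs each time the matcher passes it.
    # Its side effects are not undone on backtracking unless local is used.
    return $input =~ /^ (?: 0 (?{ $zeros++ }) )* 1 $/x ? $zeros : undef;
}

my $n = count_leading_zeros('0001');
print defined $n ? "zeros: $n\n" : "no match\n";
```

Counting is only the simplest illustration; the block could just as well contain loops, recursion, or calls to other subroutines.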

Every regular language can be accepted by a finite automaton. Its input must be a finite string.

[...] a deterministic finite automaton (DFA)—also known as deterministic finite state machine—is a finite state machine that accepts/rejects finite strings of symbols [...].
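As an illustration of the quoted definition, a DFA can be sketched as a transition table plus a loop over the symbols of a finite input string. The automaton below is a made-up example that accepts strings over 0 and 1 containing an even number of zeros. The names %delta and dfa_accepts are hypothetical.

```perl
use strict;
use warnings;

# Transition table: current state -> input symbol -> next state.
my %delta = (
    even => { 0 => 'odd',  1 => 'even' },
    odd  => { 0 => 'even', 1 => 'odd'  },
);

# Hypothetical helper: runs the DFA on a finite string, returns 1 on acceptance.
sub dfa_accepts {
    my ($input) = @_;
    my $state = 'even';    # start state, which is also the only accepting state
    for my $symbol (split //, $input) {
        # Reject any symbol outside the alphabet {0, 1}.
        return 0 unless exists $delta{$state}{$symbol};
        $state = $delta{$state}{$symbol};
    }
    return $state eq 'even' ? 1 : 0;
}

for my $s ('', '0', '1001', '0100') {
    printf "%-6s %s\n", "'$s'", dfa_accepts($s) ? 'accepted' : 'rejected';
}
```

The loop runs once per input symbol and then stops, which is the sense in which such a machine always halts on finite input.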

The same goes for Turing machines: The formal definition does not even have a separate input. Any input must be encoded in the finite number of states.

Alternative (equivalent) definitions include input, but it must be finite.
