dfa

Create set of all possible matches for a given regex

夙愿已清 submitted on 2019-12-02 22:16:12
I'm wondering how to find the set of all matches for a given regex that has a finite number of matches. For all of these examples, you can assume the pattern starts with ^ and ends with $:
`hello?` -> (hell, hello)
`[1-9][0-9]{0,3}` -> (1, 2, 3, ..., 9998, 9999)
`My (cat|dog) is awesome!` -> (My cat is awesome!, My dog is awesome!)
`1{1,10}` -> (1, 11, ..., 111111111, 1111111111)
`1*` -> //error
`1+` -> //error
`(1|11){2}` -> (1,11,111,1111) //notice how it doesn't repeat any of the possibilities
I'd also be interested if there was a way of retrieving a count of the unique solutions to the regex or if there is a
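One way to approach the enumeration is to treat each regex operator as an operation on finite sets of strings. The sketch below is a minimal illustration under that assumption: it does not parse regex syntax, and the helper names `lit`, `alt`, `cat`, and `rep` are hypothetical, introduced only for this example. Unbounded repetition (`*`, `+`) is rejected with an error, as the question asks. Note that under the usual semantics `(1|11){2}` yields 11, 111, and 1111, but not a lone 1.

```python
from itertools import product


def lit(s):
    """Language containing exactly the string s."""
    return {s}


def alt(*langs):
    """Union of languages (regex '|' or a character class)."""
    return set().union(*langs)


def cat(*langs):
    """Concatenation of languages, left to right."""
    return {"".join(parts) for parts in product(*langs)}


def rep(lang, lo, hi):
    """Bounded repetition lang{lo,hi}; hi=None models '*' or '+'."""
    if hi is None:
        raise ValueError("unbounded repetition has infinitely many matches")
    out = set()
    for n in range(lo, hi + 1):
        out |= cat(*[lang] * n)  # n copies of lang concatenated
    return out


# hello?
hello = cat(lit("hell"), rep(lit("o"), 0, 1))

# [1-9][0-9]{0,3}
nonzero = alt(*(lit(c) for c in "123456789"))
digit = alt(*(lit(c) for c in "0123456789"))
numbers = cat(nonzero, rep(digit, 0, 3))

# (1|11){2}: duplicates collapse automatically because results are sets
ones = rep(alt(lit("1"), lit("11")), 2, 2)

print(sorted(hello))
print(len(numbers))
print(sorted(ones, key=len))
```

Because every intermediate result is a set, the count of unique matches is simply `len(...)` of the final set. For large languages, materializing the whole set may be impractical, and counting on a minimized DFA would scale better.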

How to find the intersection of two NFA

删除回忆录丶 submitted on 2019-12-01 07:46:13
For DFAs, we can intersect two automata by taking the cross product of their states and accepting those product states that are accepting in both of the original automata. Union is performed similarly. However, although I can easily build the union of two NFAs using ε-transitions, how do I build their intersection?
You can use the cross-product construction on NFAs just as you would on DFAs. The only change is how you handle ε-transitions. Specifically, for each state (q_i, r_j) in the cross-product automaton, you add an ε-transition from that state to each pair of states (q_k, r_j)
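The cross-product construction described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the representation (a dict mapping `(state, symbol)` to a set of target states, with `None` standing for ε) and the function names are assumptions made for this example. On a real symbol both components move together; on ε exactly one component moves while the other stays put.

```python
EPS = None  # label used for ε-transitions in this sketch


def states_of(delta, start, accepting):
    """Collect every state mentioned by an NFA."""
    found = {start} | set(accepting)
    for (q, _), targets in delta.items():
        found.add(q)
        found |= targets
    return found


def intersect(n1, n2):
    """Product NFA; each NFA is a (delta, start, accepting) triple."""
    d1, s1, f1 = n1
    d2, s2, f2 = n2
    delta = {}
    for q in states_of(d1, s1, f1):
        for r in states_of(d2, s2, f2):
            # ε-moves: one side advances, the other side waits.
            for q2 in d1.get((q, EPS), ()):
                delta.setdefault(((q, r), EPS), set()).add((q2, r))
            for r2 in d2.get((r, EPS), ()):
                delta.setdefault(((q, r), EPS), set()).add((q, r2))
            # Symbol moves: both sides must read the same symbol.
            for (p, a), targets in d1.items():
                if p != q or a is EPS:
                    continue
                for q2 in targets:
                    for r2 in d2.get((r, a), ()):
                        delta.setdefault(((q, r), a), set()).add((q2, r2))
    accepting = {(q, r) for q in f1 for r in f2}
    return delta, (s1, s2), accepting


def eps_closure(delta, states):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in delta.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(nfa, text):
    """Simulate an NFA on text by tracking the set of active states."""
    delta, start, accepting = nfa
    current = eps_closure(delta, {start})
    for ch in text:
        moved = {t for q in current for t in delta.get((q, ch), ())}
        current = eps_closure(delta, moved)
    return bool(current & accepting)


# Strings over {a, b} that end in 'a' (with an ε-move out of the start state).
ends_in_a = ({("s", EPS): {0}, (0, "a"): {0, 1}, (0, "b"): {0}}, "s", {1})
# Strings over {a, b} of even length.
even_len = ({(0, "a"): {1}, (0, "b"): {1}, (1, "a"): {0}, (1, "b"): {0}}, 0, {0})

both = intersect(ends_in_a, even_len)
print(accepts(both, "ba"), accepts(both, "a"))
```

The product is built eagerly over all state pairs for clarity. Building only the pairs reachable from the start pair would usually produce a smaller automaton.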

How to find the intersection of two NFA

和自甴很熟 submitted on 2019-12-01 04:26:24
Question: For DFAs, we can intersect two automata by taking the cross product of their states and accepting those product states that are accepting in both of the original automata. Union is performed similarly. However, although I can easily build the union of two NFAs using ε-transitions, how do I build their intersection?
Answer 1: You can use the cross-product construction on NFAs just as you would on DFAs. The only change is how you handle ε-transitions. Specifically, for each state (q_i, r_j)

boost string matching DFA

拜拜、爱过 submitted on 2019-11-30 22:59:15
Given a string, I have to test whether it ends with one of a known set of suffixes. The set of suffixes is not very small, and every word in the document has to be checked against it. Every character in the word and in the suffix is a char32_t, so naive iterative matching would be expensive. Although most of the suffixes are not a sub-suffix or prefix of another suffix, most of them are built from a small set of characters. Most of the checks will be misses rather than hits, so I want to build a DFA of the suffixes to minimize the cost of a miss. I can manually parse
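One common way to get a DFA for this kind of suffix test is a trie built over the reversed suffixes, scanned from the end of the word. The sketch below illustrates that idea in Python rather than with Boost, and it is only an assumption about what would fit the question: the names `build_suffix_trie` and `matching_suffix` are hypothetical. A miss usually stops after one or two character lookups, which is the property the question is after.

```python
END = object()  # sentinel key marking "a complete suffix ends here"


def build_suffix_trie(suffixes):
    """Build a trie over the reversed suffixes; each node is a dict."""
    root = {}
    for suffix in suffixes:
        node = root
        for ch in reversed(suffix):
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def matching_suffix(word, trie):
    """Return the longest known suffix that word ends with, or None."""
    node = trie
    best = None
    # Walk the word backwards; each step is one DFA transition.
    for i in range(len(word) - 1, -1, -1):
        node = node.get(word[i])
        if node is None:
            break  # dead state: no known suffix continues this way
        if END in node:
            best = word[i:]
    return best


trie = build_suffix_trie(["ing", "ed", "ness"])
print(matching_suffix("walking", trie))
print(matching_suffix("walk", trie))
```

The same structure carries over to C++ with a map or a sorted vector of `char32_t` keys per node. Which container is fastest depends on the alphabet size and would need measuring.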

Implementing a Regular Expression Engine in Python (Part 3)

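生来就可爱ヽ(ⅴ<●) submitted on 2019-11-30 11:52:35
Project: Regex in Python
The previous two posts built an NFA-based regular expression engine. The next step is to go further: convert the NFA to a DFA and then minimize the DFA.
Definition of the DFA
The NFA-to-DFA conversion algorithm mainly merges NFA state nodes that can be merged, so that every state node has exactly one target node for each input character. A DFA node therefore holds a set of NFA state nodes, a unique identifier, and a flag indicating whether it is an accepting state.
    class Dfa(object):
        # Counter used to hand out a unique number to each DFA state.
        STATUS_NUM = 0
        def __init__(self):
            self.nfa_sets = []      # NFA nodes merged into this DFA state
            self.accepted = False   # True if any merged NFA node is accepting
            self.status_num = -1    # unique identifier, assigned on creation
        @classmethod
        def nfas_to_dfa(cls, nfas):
            """Build one DFA state from a collection of NFA nodes."""
            dfa = cls()
            for n in nfas:
                dfa.nfa_sets.append(n)
                # An NFA node with no outgoing edges is the accepting node.
                if n.next_1 is None and n.next_2 is None:
                    dfa.accepted = True
            dfa.status_num = Dfa.STATUS_NUM
            Dfa.STATUS_NUM = Dfa.STATUS_NUM + 1
            return dfa
Converting the NFA to a DFA
The ultimate goal of converting the NFA to a DFA is to obtain a jump table, which is somewhat similar to the parse table from the earlier C compiler.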
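The jump table mentioned above is typically produced by the subset construction. The sketch below shows that technique on a dict-based NFA rather than on the post's own node classes, which are not shown in this excerpt. The representation (`(state, symbol)` keys, `None` for ε) and the names `subset_construction` and `run` are assumptions made for illustration.

```python
from collections import deque

EPS = None  # label used for ε-transitions in this sketch


def eps_closure(delta, states):
    """All NFA states reachable from `states` via ε-transitions only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in delta.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def subset_construction(delta, start, accepting, alphabet):
    """Return (jump_table, accepting_ids); DFA state 0 is the start state."""
    start_set = frozenset(eps_closure(delta, {start}))
    ids = {start_set: 0}          # set of NFA states -> DFA state number
    table = {}                    # DFA state number -> {symbol: DFA state}
    dfa_accepting = set()
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(ids[current])
        row = table.setdefault(ids[current], {})
        for ch in alphabet:
            moved = {t for q in current for t in delta.get((q, ch), ())}
            if not moved:
                continue  # no entry means "reject" for this symbol
            target = frozenset(eps_closure(delta, moved))
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            row[ch] = ids[target]
    return table, dfa_accepting


def run(table, accepting, text):
    """Drive the jump table over text, starting from DFA state 0."""
    state = 0
    for ch in text:
        state = table[state].get(ch)
        if state is None:
            return False
    return state in accepting


# NFA for ab*: 0 -a-> 1, 1 -ε-> 2, 2 -b-> 1, accepting state 2.
nfa_delta = {(0, "a"): {1}, (1, EPS): {2}, (2, "b"): {1}}
table, accepting = subset_construction(nfa_delta, 0, {2}, "ab")
print(table, accepting)
```

Each DFA state here corresponds to what the post's `Dfa` class stores: a set of NFA states, a unique number, and an accepting flag.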

How are finite automata implemented in code?

空扰寡人 submitted on 2019-11-30 07:09:13
Question: How does one implement a DFA, or an NFA for that matter, in Python code? What are some good ways to do it in Python? And are they ever used in real-world projects?
Answer 1: A straightforward way to represent a DFA is as a dictionary of dictionaries. For each state, create a dictionary keyed by the letters of the alphabet, and then a global dictionary keyed by the states. For example, the following DFA from the Wikipedia article on DFAs can be represented by a dictionary like this:

NFA/DFA implementation in C#

筅森魡賤 submitted on 2019-11-30 00:50:00
Does anyone know of a good NFA and DFA implementation in C#, ideally one that also implements conversions between the two? I would like to construct an NFA and then convert it automatically to a DFA, without having to write my own code, which would take a very long time. There is this Python code which perhaps I could use and integrate with C# using IronPython, but Python is slow.
Take a look at my series of posts about this subject:
Regular Expression Engine in C# (the Story)
Regex engine in C# - the Regex Parser
Regex engine in C# - the NFA
Regex engine in C# - the DFA

[Series][Compiler Principles] Lexical Analysis

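爷,独闯天下 submitted on 2019-11-29 20:34:20
Lexical analysis is the first step in turning text that humans understand into text that a computer understands: a scanner reads the source program, interprets it, and splits it into a sequence of tokens. Because scanning is a special case of pattern matching, we need to study how patterns are specified and recognized during scanning. The two most important tools for this are regular expressions and finite automata.
The usual approach to lexical analysis is:
  1. Write a regular expression for each class of token.
  2. Construct an NFA from the regular expressions.
  3. Convert the NFA to a DFA.
  4. Implement the lexer from the DFA.
Below, a lexer is built from the lexical rules of C-Minus, taking ID|NUM as the example.
First, write the regular expressions for ID and NUM:
  ID = letter letter*
  NUM = digit digit*
  letter = [a-zA-Z]
  digit = [0-9]
Then use Thompson's construction to convert the regular expression ID|NUM to an NFA:
  First, construct the machines for letter and digit.
  Next, construct the machines for ID and NUM.
  Finally, construct the NFA for ID|NUM.
Next, use the subset construction to convert the resulting NFA to a DFA.
Finally, write the program from the DFA.
Simulating the NFA with the subset construction
Minimizing the number of DFA states
Source: http://www.cnblogs.com/cwblaze/archive/2010/01/28/1658109.html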
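The last step, writing the program from the DFA, can be sketched for ID|NUM as below. This is an illustration in Python under stated assumptions, not the post's own program: the state names `START`, `IN_ID`, and `IN_NUM`, the whitespace skipping, and the function names are hypothetical. It uses the definitions above (ID = letter letter*, NUM = digit digit*) and takes the longest match for each token.

```python
import string

LETTERS = set(string.ascii_letters)  # letter = [a-zA-Z]
DIGITS = set(string.digits)          # digit  = [0-9]

START, IN_ID, IN_NUM = range(3)
ACCEPTING = {IN_ID: "ID", IN_NUM: "NUM"}  # accepting state -> token kind


def step(state, ch):
    """DFA transition function; returns None when there is no move."""
    if state == START:
        if ch in LETTERS:
            return IN_ID
        if ch in DIGITS:
            return IN_NUM
    elif state == IN_ID and ch in LETTERS:
        return IN_ID
    elif state == IN_NUM and ch in DIGITS:
        return IN_NUM
    return None


def tokenize(text):
    """Split text into (kind, lexeme) pairs using longest match."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        state, j = START, i
        # Run the DFA until it gets stuck or the input ends.
        while j < len(text):
            nxt = step(state, text[j])
            if nxt is None:
                break
            state, j = nxt, j + 1
        if state not in ACCEPTING:
            raise ValueError(f"unexpected character {text[i]!r} at {i}")
        tokens.append((ACCEPTING[state], text[i:j]))
        i = j
    return tokens


print(tokenize("count 42 x9"))
```

Because ID contains only letters in this grammar, an input such as `x9` is split into an ID followed by a NUM rather than treated as a single identifier.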

Modelling a Finite Deterministic Automaton via this data

好久不见. submitted on 2019-11-29 15:50:11
I have this input file: 2 3 2 1 ab 1 0 2 0 2 0 2 0 3 abaa aab aba 3 3 2 ade 0 1 2 1 2 0 2 1 0 1 2 2 2 a de The first line gives the number of test cases. Each test case starts with 3 integers: the first is the number of states of the automaton, the next is the number of symbols in the alphabet, and the last is the number of final states. The next line is the alphabet, with the symbols written together. Then come as many lines as there are states, describing the transition function. The first line of this group represents the transition function for the first state in the

How are finite automata implemented in code?

最后都变了- submitted on 2019-11-29 03:43:05
How does one implement a DFA, or an NFA for that matter, in Python code? What are some good ways to do it in Python? And are they ever used in real-world projects?
A straightforward way to represent a DFA is as a dictionary of dictionaries. For each state, create a dictionary keyed by the letters of the alphabet, and then a global dictionary keyed by the states. For example, the following DFA from the Wikipedia article on DFAs can be represented by a dictionary like this:
dfa = {0:{'0':0, '1':1}, 1:{'0':2, '1':0}, 2:{'0':1, '1':2}}
To "run" a dfa against an input string drawn
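A minimal sketch of running such a dictionary-of-dictionaries DFA is shown below. The excerpt does not state the start state or the accepting states, so taking state 0 as both is an assumption here. With that choice, the table tracks the value of a binary string modulo 3 and accepts multiples of 3. The function name `accepts` is hypothetical.

```python
dfa = {0: {'0': 0, '1': 1},
       1: {'0': 2, '1': 0},
       2: {'0': 1, '1': 2}}


def accepts(transitions, start, accepting, text):
    """Follow one transition per character; reject on unknown symbols."""
    state = start
    for ch in text:
        if ch not in transitions[state]:
            return False  # symbol outside the alphabet
        state = transitions[state][ch]
    return state in accepting


# Assumed: start state 0, accepting states {0}.
print(accepts(dfa, 0, {0}, "110"))  # binary 6
print(accepts(dfa, 0, {0}, "111"))  # binary 7
```

An NFA can be represented the same way by letting each inner dictionary map a symbol to a set of states and tracking a set of current states instead of a single one.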