lex | 易学教程

Setting up a lex and yacc debugging environment in Visual Studio 2008

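This article is original; part of the bat code comes from a blog post by Xiong Chunlei (熊春雷): http://blog.csdn.net/pandaxcl/archive/2006/07/04/873898.aspx

Why use lex and yacc

A recent project required me to write a Language Service, so working with lex and yacc became unavoidable. lex and yacc were originally two UNIX tools for writing programs that analyze text. On Linux, two GNU tools, flex and bison, replace the original lex and yacc. The blog post above explains how to obtain the Win32 builds of flex and bison and how to configure them in a Win32 environment. That setup also needs a Windows version of GCC, which makes it fairly cumbersome.

Why debug lex and yacc programs in Visual Studio 2008

As noted above, configuring that environment and using GCC is a hassle for most Windows programmers. In fact, the C/C++ compiler that ships with Visual Studio is all we need. With Visual Studio 2008's powerful editing environment, you can debug your lex and yacc programs easily.

Configuration approach

flex and bison compile *.l and *.y files, respectively, into C source code. The C/C++ compiler that ships with Visual Studio can then compile that C code into an executable. (Incidentally, when developing a Language Service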

Source code reading 1

LEX_STRING

service_thd_alloc.h

    struct st_mysql_lex_string {
        char *str;
        size_t length;
    };

my_global.h

    typedef struct st_mysql_lex_string LEX_STRING;

Source: https://www.cnblogs.com/jie828/p/12410875.html

Problems with PLY LEX and YACC

Question

I am trying to run the first part of a simple PLY example, but I get a strange error. When I run the following code, it gives me an error regarding lex.lex(). Does anyone know what the problem is?

    import ply.lex as lex

    tokens = [ 'NAME','NUMBER','PLUS','MINUS','TIMES', 'DIVIDE', 'EQUALS' ]

    t_ignore = '\t'
    t_PLUS = r'\+'
    t_MINUS = r'-'
    t_TIMES = r'\*'
    t_DIVIDE = r'/'
    t_EQUALS = r'='
    t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    lex.lex() #

Using LEX and YACC, part 3

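2.4.3 How yacc resolves ambiguity and conflicts

Section 2.3.8 already touched on ambiguity and conflicts. This section covers them together, because they come up often when writing Yacc source programs. Ambiguity leads to conflicts. Section 2.3.8 showed that yacc can resolve conflicts caused by ambiguity by assigning precedence and associativity to operators, but some of these conflicts are hard to resolve through precedence, as in this well-known example:

    stat : IF bexp THEN stat
         | IF bexp THEN stat ELSE stat
         ;

For conflicts caused by this kind of ambiguity, and for some conflicts that are not caused by ambiguity, Yacc provides the following two disambiguating rules:

A1. On a shift/reduce conflict, shift.
A2. On a reduce/reduce conflict, reduce by the production that appears first in the yacc source program.

Applying these two rules to the IF statement above gives exactly the result we want, so the user does not have to rewrite the grammar into an unambiguous one. When Yacc uses these two rules to remove an ambiguity, it reports that it has done so.

The following describes, a little more rigorously, how Yacc uses precedence and associativity to resolve conflicts. A production in a Yacc source program also has a precedence and an associativity: those of the last terminal or literal character on its right-hand side. When a %prec clause is used, the production's precedence and associativity are determined by that clause. If the last terminal or literal character on the right-hand side has no precedence or associativity, the production has none either. Based on the terminal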
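The effect of rule A1 on the IF example can be sketched with a small hand-written parser. This is not yacc output and not the article's code: it is a recursive-descent sketch in Python in which "b" stands in for bexp and "s" is a hypothetical atomic statement added so the grammar has something to bottom out on. Consuming ELSE whenever it is the next token plays the role of preferring the shift.

```python
def parse_stat(tokens, pos=0):
    """Parse one stat starting at tokens[pos]; return (tree, next_pos).

    Grammar from the text:
        stat : IF bexp THEN stat
             | IF bexp THEN stat ELSE stat
    "b" is a placeholder for bexp and "s" is a hypothetical atomic statement.
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "s":
        return "s", pos + 1
    if tokens[pos:pos + 3] != ["IF", "b", "THEN"]:
        raise SyntaxError(f"expected IF b THEN at position {pos}")
    then_branch, pos = parse_stat(tokens, pos + 3)
    # Mirrors rule A1: if ELSE is the next token, take it now ("shift")
    # rather than finishing the shorter IF form ("reduce").
    if pos < len(tokens) and tokens[pos] == "ELSE":
        else_branch, pos = parse_stat(tokens, pos + 1)
        return ("if", then_branch, else_branch), pos
    return ("if", then_branch), pos


tree, end = parse_stat("IF b THEN IF b THEN s ELSE s".split())
print(tree)  # ('if', ('if', 's', 's'))
```

In the printed tree the ELSE branch belongs to the inner IF, which is the usual reading of a dangling else and the one the text says rule A1 produces.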

lex/flex notes

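Lex's matching strategy:

1. The token is chosen by the longest-match principle.
2. If a string can be matched by several regular expressions, the one listed first is used.

Writing a lex source program: a Lex source program must follow the Lex language specification, and its core is a set of lexical rules (regular expressions). In general, a Lex source program has three sections, separated by the %% symbol:

    definitions section
    %%
    lexical rules section
    %%
    auxiliary functions section

Variables and functions commonly used in Lex source programs:

yyin and yyout: the input and output file pointers predefined by Lex. They specify where the generated lexer reads its input and where it writes its output. Defaults: keyboard input, screen output.
yytext and yyleng: also predefined by lex, and can be used directly.
yytext: pointer to the currently recognized lexeme.
yyleng: length of the current lexeme.
ECHO: a macro predefined by Lex that can appear in actions. It is equivalent to fprintf(yyout, "%s", yytext), that is, it outputs the currently matched lexeme.
yylex(): the lexer driver function. The lex.yy.c file generated by the Lex translator always contains it.
yywrap(): called when the lexer reaches end of file, to decide what to do next. If yywrap() returns 0, scanning continues; if it returns 1, the lexer returns the 0 token that reports end of file.

1. Command for compiling a lex source program with the lex translator (assuming filename.l is the name of the lex source program): flex filename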
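The two matching rules in these notes (longest match first, then the earliest rule on a tie) can be sketched with Python's re module. This is only an illustration, not how flex is implemented: the rule table and the names IF, ID, NUMBER and WS are hypothetical, and the patterns are simple enough that Python's backtracking engine returns the longest match for each of them.

```python
import re

# Hypothetical rule table, in priority order (earlier rules win ties).
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"[ \t\n]+")),
]


def tokenize(text):
    """Return (rule_name, lexeme) pairs chosen by longest match, then rule order."""
    pos = 0
    tokens = []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earliest rule when two matches have equal length.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WS":  # whitespace is matched but not reported
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("if iffy 42"))
# [('IF', 'if'), ('ID', 'iffy'), ('NUMBER', '42')]
```

For "if", both IF and ID match two characters, so the earlier rule wins. For "iffy", ID matches four characters and beats the two-character IF match.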

Flex and Bison: Beginning a sentence with a specific keyword

Question

I am working on a program using Flex and Bison. My task can be done using only Flex (using start conditions etc.), but I have understood that using Bison might make my life easier. My task is to design a program which recognizes the declaration part of a programming language. Its syntax and logic can be understood through my code below. My problem is that I want my program to accept as a valid declaration part only code that begins with the " var " keyword! Until now, I have

Lexer rule (regex) for TeX equation

Question

In TeX, equations are written between $...$. How can I define a lexer rule for lex that matches any number of any characters between two dollar signs? I tried:

    equation \$[^\$]*\$

without success.

Answer 1:

You can try using start conditions if you don't want the dollar signs to be included as part of the equation:

    %x EQN
    %%
    \$ { BEGIN(EQN); } /* switch to EQN start condition upon seeing $ */
    <EQN>{
    \$ { BEGIN(INITIAL); } /* return to initial state upon seeing another $ */
    [^\$]*

tokenizing ints vs floats in lex/flex

Question

I'm teaching myself a little flex/bison for fun. I'm writing an interpreter for the 1975 version of MS Extended BASIC (Extended as in "has strings"). I'm slightly stumped by one issue though. Floats can be identified by looking for a . or an E (etc), and then fail over to an int otherwise. So I did this...

    [0-9]*[0-9.][0-9]*([Ee][-+]?[0-9]+)? { yylval.d = atof(yytext); return FLOAT; }
    [0-9]+ { yylval.i = atoi(yytext); return INT; }

sub-fields in the yylval union are .d for double, .i for int
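One property of the two patterns as quoted can be checked outside flex. The sketch below uses Python's re module and a hypothetical helper, classify, that applies the lex convention of longest match with the earlier rule winning ties. It only illustrates how the patterns overlap and makes no claim about what the rest of the question goes on to ask.

```python
import re

# The two patterns from the question, in the same rule order.
FLOAT_RE = re.compile(r"[0-9]*[0-9.][0-9]*([Ee][-+]?[0-9]+)?")
INT_RE = re.compile(r"[0-9]+")


def classify(lexeme):
    """Return (rule_name, matched_text) using longest match, then rule order."""
    best_name, best_len = None, 0
    for name, pattern in (("FLOAT", FLOAT_RE), ("INT", INT_RE)):
        m = pattern.match(lexeme)
        # Strict '>' means the earlier rule keeps the win on a tie.
        if m and m.end() > best_len:
            best_name, best_len = name, m.end()
    return best_name, lexeme[:best_len]


print(classify("123"))    # ('FLOAT', '123')
print(classify("1.5e3"))  # ('FLOAT', '1.5e3')
```

Because [0-9.] also accepts a digit, the first pattern matches any digit-only string in full. Both rules then match the same length and the earlier one is chosen, so under this convention a plain "123" is classified as FLOAT.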

How a compiler distinguishes minus from a negative number during parsing

Question

Hey, I've recently become involved in compiler development, and I've run into a problem with the minus sign (-) and negative numbers (-1). Suppose I have 5--3 or 5+-3: how do I write a grammar rule so that yacc produces a correct abstract syntax tree? My grammar is like:

    expr : constant {}
         | id {}
         | exec_expr {}

    exec_expr : expr PLUS expr {}
              | expr MINUS expr {}
              | expr MUL expr {}
              | expr DIV expr {}

My thought for now is to have a UMINUS symbol with highest
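The general idea of treating a minus in operand position as a separate, tighter-binding operator can be sketched without yacc. The code below is a hand-written precedence-climbing parser in Python, not a yacc grammar and not the asker's solution. The precedence table, the "neg" node label and all function names are assumptions made for illustration, and the sketch assumes unary minus binds more tightly than every binary operator.

```python
import re

# Hypothetical precedence table: larger numbers bind more tightly.
BINARY_PREC = {"+": 1, "-": 1, "*": 2, "/": 2}
UNARY_PREC = 3  # assumed to be higher than every binary operator


def tokenize(src):
    """Split the input into integer literals and single-character operators."""
    return re.findall(r"\d+|[-+*/]", src)


def parse_operand(tokens, pos):
    """Parse an integer, or a minus sign in operand position (unary minus)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "-":
        operand, pos = parse_expr(tokens, pos + 1, UNARY_PREC)
        return ("neg", operand), pos
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(tokens, pos, min_prec):
    """Precedence climbing: fold in binary operators of at least min_prec."""
    left, pos = parse_operand(tokens, pos)
    while (pos < len(tokens) and tokens[pos] in BINARY_PREC
           and BINARY_PREC[tokens[pos]] >= min_prec):
        op = tokens[pos]
        # The +1 makes binary operators left-associative.
        right, pos = parse_expr(tokens, pos + 1, BINARY_PREC[op] + 1)
        left = (op, left, right)
    return left, pos


def parse(src):
    """Parse a whole expression string into a nested-tuple syntax tree."""
    tokens = tokenize(src)
    tree, pos = parse_expr(tokens, 0, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("5--3"))  # ('-', 5, ('neg', 3))
print(parse("5+-3"))  # ('+', 5, ('neg', 3))
```

The lexer never emits a negative literal. Whether a minus is unary or binary is decided by position alone: it is binary when it follows an operand and unary when an operand is expected.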

Lex: identifier vs integer

Question

I'm trying to create my own simple programming language. For this I need to insert some regex into Lex. I'm using the following regex to match identifiers and integers.

    [a-zA-Z][a-zA-Z0-9]* /* identifier */ return IDENTIFIER;
    ("+"|"-")?[0-9]+ /* integer */ return INTEGER;

Now when I check, for example, an illegal identifier like:

    0a = 1;

the leading zero is recognized as an integer, followed by the 'a' recognized as an identifier. Instead of this I want this token '0a' to be recognized as an
