lexer

About a Prolog tokenizer

﹥>﹥吖頭↗ posted on 2019-12-25 03:29:23
Question: One of my assignments asks us to build a Prolog tokenizer. So far I have written a predicate that replaces spaces and tabs with newlines, but I don't know how to integrate it into the main program. The replace part looks like this:

    replace(_, _, [], []).
    replace(O, R, [O|T], [R|T2]) :- replace(O, R, T, T2).
    replace(O, R, [H|T], [H|T2]) :- H \= O, replace(O, R, T, T2).

The main part has a predicate called removewhite(list1 list2). So how can I make removewhite call replace?

Answer 1: You are a bit
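
As a rough illustration of the question above, the following Python sketch mirrors what the replace predicate does and shows one way a removewhite wrapper could call it. The function names simply reuse the predicate names from the question. Reading removewhite as "turn every space and tab into a newline" is an assumption based on the description, not the assignment's actual specification.

```python
def replace(old, new, items):
    """Return a copy of items with every occurrence of old swapped for new.

    Mirrors the three Prolog clauses: empty list, head matches, head differs.
    """
    return [new if item == old else item for item in items]


def removewhite(chars):
    """Hypothetical wrapper: map spaces and tabs to newlines via replace."""
    result = list(chars)
    for whitespace in (" ", "\t"):
        result = replace(whitespace, "\n", result)
    return result


if __name__ == "__main__":
    print(removewhite("a b\tc"))
```

The wrapper only chains two calls to replace, which is the same shape a Prolog removewhite/2 would take if it called replace/4 once per whitespace character.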

Is it possible to make Antlr4 generate lexer from base grammar lexer instead of gener Lexer?

帅比萌擦擦* posted on 2019-12-25 03:26:45
Question: I have a lexer grammar called BasicTokens, which is the set of basic tokens for my language, with tokens like null, true, false, etc. I then create a parser grammar, say BasicGrammar, which refers to (imports) BasicTokens, and another grammar called InheritedGrammar, which imports BasicGrammar. When Antlr4 generates the parser for InheritedGrammar, it includes all the rules already defined in BasicGrammar. Is there a way to make Antlr generate only the rules defined in InheritedGrammar and not in

How to approach parsing through a javascript file?

可紊 posted on 2019-12-24 18:30:58
Question: I want to parse a JavaScript file and find all the variable declarations, assignments, and calls to functions from a specific library. What would be the best approach: regular expressions, a lexer, or an existing tool that already does this (does one exist?)? What I actually want is to ensure, through static analysis, that an object's namespace and methods are not modified.

Answer 1: You cannot do it with regexes, and you probably also do not want to write your own implementation of

Oracle Text Contains and technical content

无人久伴 posted on 2019-12-24 10:08:15
Question: I am searching for the technical term "AN-XYZ99", so I use

    SELECT * FROM foo WHERE CONTAINS(bar, 'AN{-}XYZ99') > 0

but I also get results like "FO-XYZ99" or "BAR-XYZ99". What can I do to get only the expected result? I used

    BEGIN
      CTX_DDL.CREATE_PREFERENCE('FOO','BASIC_LEXER');
      CTX_DDL.SET_ATTRIBUTE('FOO', 'ALTERNATE_SPELLING', 'GERMAN');
      CTX_DDL.SET_ATTRIBUTE('FOO', 'COMPOSITE', 'GERMAN');
      CTX_DDL.SET_ATTRIBUTE('FOO', 'MIXED_CASE', 'NO');
    END;

Sample data from column "bar" (VARCHAR2(4000)):

ANTLR Lua long string grammar rules

旧巷老猫 posted on 2019-12-24 07:39:20
Question: I'm trying to create an ANTLR parser for Lua. I took the grammar produced by Nicolai Mainero (available at ANTLR's site, Lua 5.1 grammar) and began to work with it. The grammar is good, but one thing is not working: long strings. The Lua specification says: Literal strings can also be defined using a long format enclosed by long brackets. We define an opening long bracket of level n as an opening square bracket followed by n equal signs followed by another opening square bracket. So, an opening long bracket of level 0
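
To make the quoted rule concrete, here is a small Python sketch (not ANTLR, and not the grammar from the question) of how a hand-written scanner might read a long string: it counts the equal signs in the opening bracket and then searches for a closing bracket of the same level. The function name read_long_string and its return shape are hypothetical. The sketch assumes the closing bracket is a closing square bracket, the same number of equal signs, and another closing square bracket, and it leaves out Lua's rule about skipping a newline that directly follows the opening bracket.

```python
import re

# An opening long bracket: '[', then n '=' signs (n may be 0), then '['.
LONG_OPEN = re.compile(r"\[(=*)\[")


def read_long_string(source, start=0):
    """Read a long-bracket string beginning at index start.

    Returns (level, body, end_index), or None if no opening long bracket
    begins at start. Raises ValueError if the string is never closed.
    """
    match = LONG_OPEN.match(source, start)
    if match is None:
        return None
    level = len(match.group(1))
    # The closing bracket must carry the same number of '=' signs.
    closing = "]" + "=" * level + "]"
    body_start = match.end()
    body_end = source.find(closing, body_start)
    if body_end == -1:
        raise ValueError("unterminated long string")
    return level, source[body_start:body_end], body_end + len(closing)


if __name__ == "__main__":
    print(read_long_string("[==[a]]b]==] rest"))
```

Because the closing delimiter depends on a count taken from the opening one, a single fixed token pattern cannot express it, which is why grammar-based lexers typically need a custom action or predicate for this token.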

Why my simple Ragel grammar use all memory and crash

耗尽温柔 posted on 2019-12-24 07:26:33
Question: I am trying to convert a set of regular expressions from Adblock Plus rules into an optimized function I can call from C++. I expected to be able to use a lexer generator such as Ragel for this, but when I try it with a very small set of regexes, memory usage climbs very high (> 30 GB) and Ragel quits without an error message and without producing the output file. I have included the toy grammar below; I am trying to understand whether I am doing anything stupid that could be optimized to solve the issue.

Is this handling of ambiguities in dypgen normal or is it not?

风流意气都作罢 posted on 2019-12-24 01:26:29
Question: I would like to know whether this is a bug or behavior intended by the author. Here is a minimal example of a dypgen grammar:

    { open Parse_tree
      let dyp_merge = Dyp.keep_all }
    %start main
    %layout [' ' '\t']
    %%
    main:
      | a "\n" { $1 }
    a:
      | ms b { Mt ($1,$2) }
      | b <Mt(_,_)> kon1 b { Koo ($1, $2, $3) }
      | b <Mt(_,_)> kon2 b { Koo ($1, $2, $3) }
      | b { $1 }
    b:
      | k { $1 }
      | ns b { Nt ($1,$2) }
      /* If you comment this line out, it will work with the permutation, but I need the 'n' ! */
      /* | b

ANTLR: Lexer does not recognize token

半腔热情 posted on 2019-12-24 01:24:30
Question: Given the following lexer grammar:

    lexer grammar CodeTableLexer;
    CodeTabHeader : '[code table 1.0]';
    Code : 'code';
    Table : 'table';
    End : 'end';
    Row : 'row';
    Naming : 'naming';
    Dfltlang : 'dfltlang';
    Language : 'english' | 'german' | 'french' | 'italian' | 'spanish';
    Null : 'null';
    Number : Int ('.' Digit*)? ;
    Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '$' | '.' | Digit)* ;
    String @after { setText(getText().substring(1, getText().length() - 1).replaceAll("\\\\(.)",

boost-spirit parser lex->qi : Getting the “undocumented” on_success mechanism to work

我们两清 posted on 2019-12-23 10:16:13
Question: Edit: I have ripped out the lexer, as it does not integrate cleanly with Qi and just obfuscates grammars (see here). on_success isn't well documented, and I am trying to wire it up to my parser. The examples dealing with on_success only cover parsers built purely on qi, i.e., without lex. This is how I am trying to introduce the construct:

    using namespace qi::labels;
    qi::on_success(event_entry_, std::cout << _val << _1);

But it won't compile. I fear the problem is lex. Could someone tell

QScintilla: how to create a new lexer or modify an existing one?

穿精又带淫゛_ posted on 2019-12-23 03:09:22
Question: I find the default lexer for C++ highlighting not specific enough. I want to at least be able to specify a different color for:

- type keywords (void, int, float, etc.)
- instruction keywords (if, while, do, return, for, etc.)
- class-related keywords (template, class, virtual, friend)
- type modifier keywords (static, const, extern, unsigned, etc.)

I found this in the LexerCPP source:

    const char *QsciLexerCPX::keywords(int set) const
    {
        if (set == 1)
            return "and and_eq asm auto bitand bitor bool break case "
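
The grouping the question asks for can be sketched independently of QScintilla. The Python snippet below is only an illustration of splitting one keyword string into several named sets so that each set can be styled separately. The group names, the KEYWORD_GROUPS table, and the classify helper are all hypothetical, the word lists contain only the examples given in the question, and nothing here is QScintilla API.

```python
# Hypothetical keyword groups, one space-separated string per group,
# in the same style as the strings returned by a keywords() method.
KEYWORD_GROUPS = {
    "type": "void int float",
    "instruction": "if while do return for",
    "class": "template class virtual friend",
    "modifier": "static const extern unsigned",
}


def build_lookup(groups):
    """Map each keyword to the name of the group it belongs to."""
    lookup = {}
    for group_name, words in groups.items():
        for word in words.split():
            lookup[word] = group_name
    return lookup


def classify(word, lookup):
    """Return the group for word, or 'identifier' if it is not a keyword."""
    return lookup.get(word, "identifier")


if __name__ == "__main__":
    lookup = build_lookup(KEYWORD_GROUPS)
    for token in ["static", "int", "foo", "while"]:
        print(token, classify(token, lookup))
```

A lexer that exposes several keyword sets could then assign one style per group name instead of one style for all keywords.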