Implementing a matcher for the regex '[ab][^r]+r' in assembly

名媛妹妹 2021-01-21 20:36

I need help with my assembly code. I need to write code that finds the range of a string that matches my regular expression.

My regex: [ab][^r]+r, so first I'm loo

2 answers
  •  一个人的身影
    2021-01-21 20:40

    I realize this doesn't qualify as an 'answer', since the assignment requires you to use the specific format provided by your instructor. However, since I feel that using inline asm is a poor way to learn asm, I want to show how this would look if you wrote it as pure asm. Trying to cram this into the other (already very long) answer seems like a poor fit.

    Instead I propose 2 files. The first is pure C code:

    #include <stdio.h>
    
    extern int __cdecl DoRegEx(const char *s, int *startpos);
    
    int main(void) 
    {
      const char *s = "fqr  b qabxx  xryc pqr";
      int startpos, len;
    
      len = DoRegEx(s, &startpos);
    
      printf("%d, %d\n", startpos, len);
      return 0; 
    }
    

    That's much easier to read/maintain than what you end up with using inline asm. But more importantly, here's the asm file:

    # foo2.s - Searches for regex "[ab][^r]+r" in string
    #
    # Called from C with:
    #
    #    extern int __cdecl DoRegEx(const char *s, int *startpos);
    #
    # On input:
    #
    #   [esp+4] is s
    #   [esp+8] is pointer to startpos.
    #
    # On output:
    #
    #   startpos is the (zero based) offset into (s) where match begins.
    #   Length of match (or -1 if match not found) is returned in eax.
    #
    # __cdecl allows the callee (that's us) to modify any of EAX, ECX, 
    # and EDX. All other registers must be returned unchanged.
    #
    
    # Use intel syntax
    .intel_syntax noprefix
    
    # export our symbol (note __cdecl prepends an underscore to names).
    .global _DoRegEx
    
    # Start code segment
    .text
    
    _DoRegEx:
       mov ecx, [esp+4] # Load pointer to (s)
    
    Phase1:
       mov dl, [ecx]    # Read next byte
    
       test dl, dl 
       jz NotFound      # Hit end of string
    
       inc ecx          # Move to next byte
    
       cmp dl, 'a'      # Check for 'a'
       je Phase2
    
       cmp dl, 'b'      # Check for 'b'
       jne Phase1
    
       ... blah blah blah ...
    
       mov edx, [esp+8]          # get pointer to startpos
       mov DWORD PTR [edx], ecx  # write startpos
    
       ret
    

    You can compile and link both files at once using gcc -m32 -o foo.exe foo1.c foo2.s (assuming the C file is saved as foo1.c).
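
    Since the asm listing is only an outline, it may help to have a plain C version of the same search to check results against. What follows is a minimal sketch, not part of the original assignment: the name RefRegEx is hypothetical, and it assumes leftmost-match semantics with the same return convention as DoRegEx (length of the match, or -1 if there is none, with the zero-based offset written through startpos).

```c
/* Hypothetical reference matcher for "[ab][^r]+r" (illustration only).
 * Returns the match length, or -1 if nothing matches.
 * On success, *startpos receives the zero-based offset of the match;
 * on failure, *startpos is left untouched. */
int RefRegEx(const char *s, int *startpos)
{
    for (int i = 0; s[i] != '\0'; i++) {
        /* [ab]: the match must start on 'a' or 'b' */
        if (s[i] != 'a' && s[i] != 'b')
            continue;

        /* [^r]+: skip everything up to the next 'r' or the end */
        int j = i + 1;
        while (s[j] != '\0' && s[j] != 'r')
            j++;

        /* r: need a closing 'r' and at least one non-'r' before it */
        if (s[j] == 'r' && j > i + 1) {
            *startpos = i;
            return j - i + 1;
        }
    }
    return -1;
}
```

    For the sample string used in main above, this sketch reports a start offset of 5 and a length of 11, which is the substring "b qabxx  xr".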

    If you end up working with assembler for a living, it's more likely to look like this than what you see using gcc's extended asm (which is ugly at the best of times). It also deals with common real-world concepts like reading parameters from the stack, preserving registers and using assembler directives (.text, .global, etc). Those things are mostly hidden from you when inlining this into C, but are essential components of working in and understanding assembly language.

    FWIW.

    PS Did you get your code working? If the other answer gave sufficient information to create your program, don't forget to 'accept' it. If you are stuck again, edit your original post to add your current code, and include a description of what still doesn't work right.
