JavaScript regex for extracting the filename from a Content-Disposition header

忘掉有多难 2021-01-11 11:58

The Content-Disposition header contains a filename that is easy to extract, but sometimes the filename is in double quotes, sometimes it is unquoted, and there are probably some other va

6 answers
  • 2021-01-11 12:12

    Here is my regular expression. It works in JavaScript.

    filename\*?=((['"])[\s\S]*?\2|[^;\n]*)
    

    I used this in my project.
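A minimal sketch of how this pattern might be used. The names `FILENAME_RE` and `extractFilename` are made up for illustration, and the quote-stripping step is an assumption about what a caller usually wants; the regex itself leaves the quotes inside group 1.

```javascript
// Pattern from the answer above. Group 1 holds the raw value, quotes included.
const FILENAME_RE = /filename\*?=((['"])[\s\S]*?\2|[^;\n]*)/;

// Hypothetical helper: returns the filename, or null if the header has none.
function extractFilename(header) {
  const match = FILENAME_RE.exec(header);
  if (match === null) {
    return null;
  }
  let value = match[1].trim();
  const first = value[0];
  // Strip one pair of matching quotes, if the value is wrapped in them.
  if (
    value.length >= 2 &&
    (first === '"' || first === "'") &&
    value[value.length - 1] === first
  ) {
    value = value.slice(1, -1);
  }
  return value;
}

console.log(extractFilename('attachment; filename="report 2021.pdf"')); // report 2021.pdf
console.log(extractFilename("attachment; filename=plain.txt; size=10")); // plain.txt
```

For a `filename*=UTF-8''...` parameter this sketch returns the value with the `UTF-8''` prefix still attached, because the pattern does not treat that prefix specially.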

  • 2021-01-11 12:14

    You could try something in this spirit:

    filename[^;=\n]*=((['"]).*?\2|[^;\n]*)
    
    filename      # match filename, followed by
    [^;=\n]*      # anything but a ;, a = or a newline
    =
    (             # first capturing group
        (['"])    # either single or double quote, put it in capturing group 2
        .*?       # anything up until the first...
    \2        # matching quote (single if we found single, double if we found double)
    |             # OR
        [^;\n]*   # anything but a ; or a newline
    )
    

    Your filename is in the first capturing group: http://regex101.com/r/hJ7tS6
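Because `[^;=\n]*` also lets the pattern match `filename*=`, a header that carries both a plain and an extended parameter produces two matches. The sketch below collects group 1 from each of them; `ROBIN_RE` and `allFilenameValues` are illustrative names, and adding the `g` flag is an assumption made so that `matchAll` can be used.

```javascript
// Same pattern as above, with the g flag so every occurrence is visited.
const ROBIN_RE = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/g;

// Hypothetical helper: returns the raw value of every filename parameter,
// in header order. Quotes are kept, exactly as group 1 captures them.
function allFilenameValues(header) {
  return Array.from(header.matchAll(ROBIN_RE), (m) => m[1]);
}

const header = `attachment; filename="fallback.txt"; filename*=UTF-8''na%C3%AFve.txt`;
console.log(allFilenameValues(header));
// [ '"fallback.txt"', "UTF-8''na%C3%AFve.txt" ]
```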

  • 2021-01-11 12:14
    /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i
    

    https://regex101.com/r/hJ7tS6/51

    Edit: You can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js
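With this pattern the filename lands in group 2 when the value is quoted and in group 3 otherwise, so a caller has to check both. A rough sketch, where `QUOTE_AWARE_RE` and `filenameFromGroups` are hypothetical names:

```javascript
// Pattern from the answer above.
// Group 1: the opening quote (optionally backslash-escaped).
// Group 2: the text between the quotes.
// Group 3: an unquoted value, after an optional charset'lang' prefix.
const QUOTE_AWARE_RE =
  /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i;

// Hypothetical helper: picks whichever of group 2 or group 3 took part
// in the match. The value is returned as-is, without percent-decoding.
function filenameFromGroups(header) {
  const m = QUOTE_AWARE_RE.exec(header);
  if (m === null) {
    return null;
  }
  return m[2] !== undefined ? m[2] : m[3];
}

console.log(filenameFromGroups('attachment; filename="a b.txt"')); // a b.txt
console.log(filenameFromGroups("attachment; filename*=UTF-8''na%C3%AFve.txt")); // na%C3%AFve.txt
console.log(filenameFromGroups("attachment; filename=plain.txt")); // plain.txt
```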

  • 2021-01-11 12:30
    filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$\2|[^;\n]*)?
    

    I have upgraded Robin’s solution to do two more things:

    1. Capture filename even if it has escaped double quotes.

    2. Capture UTF-8'' part as a separate group.

    This is an ECMAScript solution.

    https://regex101.com/r/7Csdp4/3/
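A small sketch of reading the two groups in JavaScript. `UPGRADED_RE` and `splitCharsetAndName` are illustrative names. For the two inputs tried here, group 1 holds the `UTF-8''` prefix when it is present, and a quoted value comes back in group 2 with its quotes still attached.

```javascript
// Pattern from the answer above, used without flags.
const UPGRADED_RE = /filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$\2|[^;\n]*)?/;

// Hypothetical helper: returns the charset prefix (or null) and the raw value.
function splitCharsetAndName(header) {
  const m = UPGRADED_RE.exec(header);
  if (m === null) {
    return null;
  }
  return {
    charsetPrefix: m[1] === undefined ? null : m[1],
    value: m[2] === undefined ? "" : m[2],
  };
}

console.log(splitCharsetAndName("attachment; filename*=UTF-8''na%C3%AFve.txt"));
// { charsetPrefix: "UTF-8''", value: 'na%C3%AFve.txt' }
console.log(splitCharsetAndName('attachment; filename="a.txt"'));
// { charsetPrefix: null, value: '"a.txt"' }
```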

  • 2021-01-11 12:34

    Disclaimer: the following answer only works in regex engines that support conditional groups, such as PCRE (PHP) or Python's re module. If you have to use JavaScript, use Robin's answer.


    This modified version of Robin's regex strips the quotes:

    filename[^;\n=]*=(['\"])*(?:utf-8\'\')?(.*)(?(1)\1|)
    
    filename        # match filename, followed by
    [^;=\n]*        # anything but a ;, a = or a newline
    =
    (['"])*         # either single or double quote, put it in capturing group 1
    (?:utf-8\'\')?  # removes the utf-8 part from the match
    (.*)            # second capturing group, will contain the filename
    (?(1)\1|)       # if clause: if first capturing group is not empty,
                    # match it again (the quotes), else match nothing
    

    https://regex101.com/r/hJ7tS6/28

    The filename is in the second capturing group.

  • 2021-01-11 12:36

    Slightly modified to fit my use case: it strips all quotes and UTF tags.

    filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?

    https://regex101.com/r/UhCzyI/3
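A hedged sketch of this pattern in use. `STRIPPING_RE` and `cleanFilename` are made-up names, and the percent-decoding step is an addition that is not part of the answer; it assumes that only the `filename*=` form carries percent-encoded text.

```javascript
// Pattern from the answer above. Group 1 is the filename without quotes
// or a leading UTF-8'' tag.
const STRIPPING_RE = /filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?/;

// Hypothetical helper: returns the cleaned filename, or null if absent.
function cleanFilename(header) {
  const match = STRIPPING_RE.exec(header);
  if (match === null) {
    return null;
  }
  const raw = match[1];
  // Assumption: only the extended filename*= form is percent-encoded.
  if (match[0].startsWith("filename*")) {
    try {
      return decodeURIComponent(raw);
    } catch (e) {
      return raw; // malformed escape sequence: fall back to the raw text
    }
  }
  return raw;
}

console.log(cleanFilename('attachment; filename="a b.txt"')); // a b.txt
console.log(cleanFilename("attachment; filename*=UTF-8''na%C3%AFve.txt")); // naïve.txt
```

Since the character class in group 1 excludes both quote characters, a name that contains an apostrophe is cut short: `filename="it's.txt"` yields `it`.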
