The Content-Disposition header contains a filename which can be easily extracted, but sometimes it contains double quotes, sometimes no quotes and there are probably some other va
Here is my regular expression. It works in JavaScript.
filename\*?=((['"])[\s\S]*?\2|[^;\n]*)
I used this in my project.
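A minimal sketch of how this pattern might be used in JavaScript. The helper name `extractRawFilename` and the sample headers are made up for illustration. Note that group 1 still includes the surrounding quotes when the value is quoted.

```javascript
// Regex from the answer above; group 1 holds the raw value (quotes included).
const FILENAME_RE = /filename\*?=((['"])[\s\S]*?\2|[^;\n]*)/;

// Hypothetical helper: returns the raw filename value, or null if absent.
function extractRawFilename(header) {
  const match = FILENAME_RE.exec(header);
  return match ? match[1] : null;
}

console.log(extractRawFilename('attachment; filename="report.pdf"')); // "report.pdf" (with the quotes)
console.log(extractRawFilename('attachment; filename=report.pdf'));   // report.pdf
```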
You could try something in this spirit:
filename[^;=\n]*=((['"]).*?\2|[^;\n]*)
filename # match filename, followed by
[^;=\n]* # anything but a ;, a = or a newline
=
( # first capturing group
(['"]) # either single or double quote, put it in capturing group 2
.*? # anything up until the first...
\2 # matching quote (single if we found single, double if we found double)
| # OR
[^;\n]* # anything but a ; or a newline
)
Your filename is in the first capturing group: http://regex101.com/r/hJ7tS6
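As a rough sketch, the pattern above could be wrapped like this in JavaScript. The function name `getFilename` and the test headers are hypothetical. Since group 1 keeps the quotes, the sketch strips them when group 2 (the opening quote) is set.

```javascript
// Pattern from the answer above.
const ROBIN_RE = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;

// Hypothetical helper: returns the filename without surrounding quotes,
// or null when the header has no filename parameter.
function getFilename(header) {
  const match = ROBIN_RE.exec(header);
  if (!match) return null;
  const raw = match[1];
  // match[2] is the opening quote when the value was quoted.
  return match[2] ? raw.slice(1, -1) : raw.trim();
}

console.log(getFilename("attachment; filename='a b.txt'"));          // a b.txt
console.log(getFilename('attachment; filename=plain.txt; size=10')); // plain.txt
```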
/filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i
https://regex101.com/r/hJ7tS6/51
Edit: You can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js
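With the case-insensitive pattern above, the name appears to land in group 2 when the value is quoted and in group 3 otherwise, so the caller has to check both. A small sketch, with a hypothetical helper name and sample headers:

```javascript
// Pattern from the answer above (note the i flag).
const CI_FILENAME_RE = /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i;

// Hypothetical helper: picks whichever of group 2 (quoted) or
// group 3 (unquoted / charset-prefixed) took part in the match.
function filenameFromGroups(header) {
  const match = CI_FILENAME_RE.exec(header);
  if (!match) return null;
  return match[2] !== undefined ? match[2] : match[3];
}

console.log(filenameFromGroups('attachment; filename="a.txt"'));
console.log(filenameFromGroups("attachment; filename*=UTF-8''a%20b.txt"));
```

The second call returns the value still percent-encoded; decoding it is a separate step.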
filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$\2|[^;\n]*)?
I have upgraded Robin’s solution to do two more things:
Capture filename even if it has escaped double quotes.
Capture UTF-8'' part as a separate group.
This is an ECMAScript solution.
https://regex101.com/r/7Csdp4/3/
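Once the UTF-8'' part is separated out, the rest of the value is normally percent-encoded and still needs decoding. The sketch below does not use the regex above. It uses a simpler, hypothetical pattern for the `filename*=charset'language'value` form and assumes that only UTF-8 needs handling.

```javascript
// Hypothetical pattern (not from the answers above): captures the charset and
// the percent-encoded value of an extended parameter such as
//   filename*=UTF-8''na%C3%AFve.txt
const EXT_FILENAME_RE = /filename\*=([\w-]+)'[^']*'([^;\n]*)/i;

// Hypothetical helper: returns the decoded filename, or null if the header
// has no filename* parameter.
function decodeExtendedFilename(header) {
  const match = EXT_FILENAME_RE.exec(header);
  if (!match) return null;
  const charset = match[1].toLowerCase();
  const value = match[2];
  // decodeURIComponent assumes UTF-8, so other charsets are returned undecoded.
  if (charset !== 'utf-8') return value;
  try {
    return decodeURIComponent(value);
  } catch (e) {
    return value; // malformed percent-encoding: fall back to the raw text
  }
}

console.log(decodeExtendedFilename("attachment; filename*=UTF-8''na%C3%AFve.txt"));
```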
Disclaimer: the following answer relies on a conditional group, which works in PCRE (e.g. PHP) and in Python's re module but not in JavaScript. If you have to use JavaScript, use Robin's answer.
This modified version of Robin's regex strips the quotes:
filename[^;\n=]*=(['\"])*(?:utf-8\'\')?(.*)(?(1)\1|)
filename # match filename, followed by
[^;=\n]* # anything but a ;, a = or a newline
=
(['"])* # either single or double quote, put it in capturing group 1
(?:utf-8\'\')? # removes the utf-8 part from the match
(.*) # second capturing group, will contain the filename
(?(1)\1|) # if clause: if first capturing group is not empty,
# match it again (the quotes), else match nothing
https://regex101.com/r/hJ7tS6/28
The filename is in the second capturing group.
Slightly modified to match my use case (strips all quotes and UTF tags):
filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?
https://regex101.com/r/UhCzyI/3
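A quick sketch of this last pattern in JavaScript. The helper name `bareFilename` and the sample headers are invented for illustration. Group 1 comes back without quotes or the UTF-8 tag, but still percent-encoded.

```javascript
// Pattern from the answer above; group 1 is the bare filename.
const STRIPPED_RE = /filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?/;

// Hypothetical helper: returns the bare filename, or null if absent.
function bareFilename(header) {
  const match = STRIPPED_RE.exec(header);
  return match ? match[1] : null;
}

console.log(bareFilename('attachment; filename="report.pdf"'));            // report.pdf
console.log(bareFilename("attachment; filename*=UTF-8''file%20name.txt")); // file%20name.txt
```

As written the pattern is case-sensitive, so a lowercase `utf-8''` prefix is not stripped; adding the `i` flag is one way to handle that.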