Extract filename and path from URL in bash script

别跟我提以往 2021-01-30 14:17

In my bash script I need to extract just the path from the given URL. For example, from the variable containing string:

http://login:password@example.com/one/more/dir/fi

13 Answers
  •  粉色の甜心
    2021-01-30 14:33

    The Perl snippet is intriguing and, since Perl ships with most Linux distros, quite useful, but it doesn't do the job completely. Specifically, it does not decode the percent-encoded UTF-8 bytes of the URL/URI into the Unicode characters of the file path. Here is an example of the problem. The original URI may be:

    file:///home/username/Music/Jean-Michel%20Jarre/M%C3%A9tamorphoses/01%20-%20Je%20me%20souviens.mp3
    

    The corresponding path would be:

    /home/username/Music/Jean-Michel Jarre/Métamorphoses/01 - Je me souviens.mp3
    

    %20 became a space and %C3%A9 became 'é'. Is there a Linux command, bash feature, or Perl script that can handle this transformation, or do I have to write a long series of sed substitutions? What about the reverse transformation, from path to URL/URI?
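
    For comparison, here is a rough pure-bash sketch of both directions. It is only an illustration under stated assumptions, not a replacement for a real URI library: the function names urldecode and urlencode_path are made up for this example, the input is assumed to contain no backslashes and only well-formed %HH escapes, and the encoder assumes bash 4 or later plus the POSIX od utility. It also leaves "/" unescaped, which is a choice suited to paths rather than a general rule.

```shell
#!/usr/bin/env bash

# urldecode (hypothetical helper): rewrite every "%HH" as "\xHH" and let
# printf's %b turn each escape into the corresponding byte.
# Assumes the input has no backslashes and no stray "%" characters.
urldecode() {
    local s=$1
    printf '%b' "${s//%/\\x}"
}

# urlencode_path (hypothetical helper): walk the string byte by byte using
# od's hex dump, keep ASCII letters, digits and "._~/-" as they are, and
# percent-encode every other byte.
urlencode_path() {
    local hex ch out=
    for hex in $(printf '%s' "$1" | od -An -v -tx1); do
        if (( 16#$hex < 128 )); then
            printf -v ch "\\x$hex"          # hex pair back to its ASCII character
            case $ch in
                [A-Za-z0-9._~/-]) out+=$ch; continue ;;
            esac
        fi
        out+=%${hex^^}                      # upper-case hex, e.g. %C3
    done
    printf '%s\n' "$out"
}

uri='file:///home/username/Music/Jean-Michel%20Jarre/M%C3%A9tamorphoses/01%20-%20Je%20me%20souviens.mp3'

# Drop the "file://" prefix, then decode the remainder into a plain path.
path=$(urldecode "${uri#file://}")
printf '%s\n' "$path"

# Reverse direction: encode the path and put the prefix back.
printf 'file://%s\n' "$(urlencode_path "$path")"
```

    Working on the bytes from od rather than on characters keeps the encoder independent of the current locale, so a multi-byte character such as 'é' is encoded as its two UTF-8 bytes.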

    (Follow-up)

    Looking at http://search.cpan.org/~gaas/URI-1.54/URI.pm, I first saw the as_iri method, but that was apparently missing from my Linux installation (or is somehow not applicable). It turns out the solution is to replace the "->path" part with "->file". You can then break the result down further using basename, dirname, etc. The solution is thus:

    path=$( echo "$url" | perl -MURI -le 'chomp($url = <>); print URI->new($url)->file' )
    

    Oddly, using "->dir" instead of "->file" does NOT extract the directory part: rather, it formats the URI so it can be used as an argument to mkdir and the like.
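
    The basename/dirname step mentioned above could look roughly like this. It is a minimal sketch: the path value is just the decoded example from earlier in this answer, and the variable names (dir, file, stem, ext and the _pe variants) are chosen only for illustration.

```shell
#!/usr/bin/env bash

# Example value: the decoded path shown earlier in this answer.
path='/home/username/Music/Jean-Michel Jarre/Métamorphoses/01 - Je me souviens.mp3'

# External tools; "--" guards against paths that begin with a dash.
dir=$(dirname -- "$path")
file=$(basename -- "$path")

# Parameter expansions that give the same result for a path like this one
# (it contains a slash and has no trailing slash), without a subprocess.
dir_pe=${path%/*}      # remove the shortest trailing "/..." match
file_pe=${path##*/}    # remove the longest leading ".../" match

# Split the file name into stem and extension at the last dot.
stem=${file%.*}
ext=${file##*.}

printf 'dir:  %s\nfile: %s\nstem: %s\next:  %s\n' "$dir" "$file" "$stem" "$ext"
```

    Quoting "$path" everywhere matters here, because the directory and file names contain spaces.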

    (Further follow-up)

    Any reason why the line cannot be shortened to this?

    path=$( echo "$url" | perl -MURI -le 'print URI->new(<>)->file' )
    
