Question
Say we have the following URLs:
1. http://example.com#hash0
2. http://example.com#hash0#hash1
3. http://example.com#hash0/sample.net/
4. http://example.com#hash0/sample.net/#hash1
5. http://example.com#hash0/image.jpg
6. http://example.com#hash0/image.jpg#hash1
7. something.php#?type=abc&id=123
8. something.php#?type=abc&id=123#hash0
9. something.php/?type=abc&id=#123
....................................
and more permutations of this kind; you get the point. How can I selectively remove the "irrelevant" hashes from URLs like these without affecting their functionality (so that they remain complete links or image links)?
For example, from number 1 in this list I would like #hash0 to be removed; from 2, both #hash0 and #hash1; 3 should be kept as is, since the hash is followed by a continuation of the path (yes, it's possible, check here); from 4, remove only #hash1; 5 should be kept as is, but from 6 remove just #hash1; ...; and 9 should probably be kept, since the hash might be relevant to the query (I'm not sure about that, though), and so on. Basically, I'd like to remove only the hashes that have nothing usable (paths, queries, image files, etc.) after them: "irrelevant" hashes like #top or #bottom that only refer to a position on the current page.
I'm working on something that also involves getting absolute URLs from relative ones (with the help of either a new anchor's href or a new URL object's href), so a solution (like here) that can "blend in" with the location object's properties (.protocol, .host, .pathname, .search, .hash, etc.) is preferable, since a built-in is likely to be more "trustworthy". A good (and shorter) regex would be acceptable as well. All in all, shorter solutions are preferable, as I don't want my project to do unnecessary extra work for every link or image link it encounters while parsing the current page.
Answer 1:
Maybe this is what you want, using a regular expression.
var urls = [
        'http://example.com#hash0',                    // remove
        'http://example.com#hash0#hash1',              // remove
        'http://example.com#hash0/sample.net/',        // keep
        'http://example.com#hash0/sample.net/#hash1',  // remove #hash1
        'http://example.com#hash0/image.jpg',          // keep
        'http://example.com#hash0/image.jpg#hash1',    // remove #hash1
        'something.php#?type=abc&id=123',              // keep
        'something.php#?type=abc&id=123#hash0',        // remove #hash0
        'something.php/?type=abc&id=#123',             // remove #123
    ],
    // Strip a trailing run of hashes that contain no '#', '/', '?' or '.'
    result = urls.map(h => h.replace(/(?:#[^#\/\?\.]*)*#[^#\/\?\.]*$/gi, ''));

console.log(result);
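Since the question prefers something that blends in with the built-in URL properties, the same idea can be sketched on top of the standard `URL` constructor: resolve the link to an absolute URL first, then apply the pattern to `.hash` only. This is a minimal sketch rather than a definitive implementation. The helper name `stripTrailingHashes` and the base `http://example.com/` are assumptions made for illustration, and the regex is a shortened equivalent of the one above.

```javascript
// Hypothetical helper (the name is illustrative, not from the original answer).
// Resolves `href` against an optional `base`, then drops trailing hashes
// that contain no '#', '/', '?' or '.' characters.
function stripTrailingHashes(href, base) {
  const url = new URL(href, base); // throws if `href` is relative and no base is given
  // One or more "plain" hash segments anchored at the end of the fragment.
  url.hash = url.hash.replace(/(?:#[^#\/?.]*)+$/, '');
  return url.href;
}

// Example base URL, assumed only so the relative links can be resolved.
const base = 'http://example.com/';

console.log(stripTrailingHashes('http://example.com#hash0#hash1'));
console.log(stripTrailingHashes('http://example.com#hash0/sample.net/#hash1'));
console.log(stripTrailingHashes('something.php#?type=abc&id=123#hash0', base));
console.log(stripTrailingHashes('something.php/?type=abc&id=#123', base));
```

Because the pattern only ever touches `url.hash`, the protocol, host, path and query are left to the URL parser, and the returned `href` is already absolute.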
Source: https://stackoverflow.com/questions/45886463/javascript-selectively-remove-hash-or-hashes-from-urls-so-that-the-url-remai