Relative to absolute paths in HTML (asp.net)

Submitted by 瘦欲@ on 2019-12-01 04:26:00
jo_asakura

One possible way to solve this is to use the HtmlAgilityPack library.

An example (fixing links):

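// Requires: using System.Net; using System.Text; using HtmlAgilityPack;
WebClient client = new WebClient();
byte[] requestHTML = client.DownloadData(sourceUrl);
string sourceHTML = Encoding.UTF8.GetString(requestHTML);

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(sourceHTML);

// SelectNodes returns null when nothing matches, so guard before iterating.
HtmlNodeCollection links = htmlDoc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        if (!string.IsNullOrEmpty(att.Value))
        {
            // AbsoluteUrlByRelative is the author's own helper (not shown here).
            att.Value = this.AbsoluteUrlByRelative(att.Value);
        }
    }
}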

If the request comes in from your site (same-domain links), you can use this:

new Uri(Request.Url, "/img/welcome.png").ToString();

If you're in a non-web app, or you want to hardcode the domain name:

new Uri(new Uri("http://www.mysite.com"), "/img/welcome.png").ToString();
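
For comparison, here is a minimal sketch of the same base-plus-relative resolution in JavaScript (the language of the browser snippet further down this thread), using the standard `URL` constructor. The helper name `toAbsoluteUrl` and the example page URL are assumptions made for illustration, not part of the original answers.

```javascript
// Hypothetical helper (not from the original answers): resolve a link
// against the URL of the page it was found on.
function toAbsoluteUrl(baseUrl, link) {
  // new URL(relative, base) applies standard URL resolution rules;
  // a link that is already absolute ignores the base.
  return new URL(link, baseUrl).href;
}

const page = "http://www.mysite.com/news/index.html"; // assumed example page

console.log(toAbsoluteUrl(page, "/img/welcome.png"));          // root-relative
console.log(toAbsoluteUrl(page, "img/welcome.png"));           // document-relative
console.log(toAbsoluteUrl(page, "../img/welcome.png"));        // parent directory
console.log(toAbsoluteUrl(page, "https://example.org/a.png")); // already absolute
```

Note that only the root-relative form (leading slash) is independent of the page's directory; the other relative forms depend on the full URL of the page, so the base should be the page URL rather than just the domain.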

You have some options:

  1. You can convert your byte array to a string and do a find-and-replace.
  2. You can convert the byte array to a string, load it into a DOM object, and prepend the base URL to the attributes where needed (basically you are looking for any src or href attribute that doesn't start with http: or https:).
    ' Method 1: download the page, then patch root-relative src/href values by string replacement.
    ' Requires: Imports System.Net and Imports System.Text
    Console.Write(ControlChars.Cr + "Please enter a Url(for example, http://www.msn.com): ")
    Dim remoteUrl As String = Console.ReadLine()
    Dim myWebClient As New WebClient()
    Console.WriteLine("Downloading " + remoteUrl)
    Dim myDataBuffer As Byte() = myWebClient.DownloadData(remoteUrl)
    Dim download As String = Encoding.ASCII.GetString(myDataBuffer)
    ' Strings are immutable: Replace returns a new string, so assign the result back.
    download = download.Replace("src=""/", "src=""" & remoteUrl & "/")
    download = download.Replace("href=""/", "href=""" & remoteUrl & "/")
    Console.WriteLine(download)
    Console.WriteLine("Download successful.")

This is super contrived, and the bulk of it is taken directly from http://msdn.microsoft.com/en-us/library/xz398a3f.aspx, but it illustrates the basic principle behind method 1.
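
The find-and-replace idea can be stretched a little to cover document-relative links as well as root-relative ones. Below is a rough JavaScript sketch, assuming double-quoted attribute values; `absolutizeAttributes` is a hypothetical name, and the regular expression is a simplification that can also match text outside tags, so a real HTML parser remains the safer choice.

```javascript
// Hypothetical helper: rewrite every double-quoted src/href value in an
// HTML string to an absolute URL, using pageUrl as the base.
function absolutizeAttributes(html, pageUrl) {
  // Require whitespace before the name so that e.g. data-src is not matched.
  return html.replace(/(\s)(src|href)="([^"]*)"/gi, (match, space, name, value) => {
    // Leave empty values, fragments, and non-navigational schemes untouched.
    if (value === "" || /^(#|mailto:|javascript:|data:)/i.test(value)) {
      return match;
    }
    try {
      return `${space}${name}="${new URL(value, pageUrl).href}"`;
    } catch (e) {
      return match; // value could not be parsed as a URL; keep the original
    }
  });
}

// Assumed sample input for illustration only.
const sample = '<a href="/a/b.html">x</a><img src="pic.png">';
console.log(absolutizeAttributes(sample, "http://www.msn.com/news/index.html"));
```

Links that are already absolute pass through `new URL` unchanged apart from normalization, so there is no need to test for http: or https: separately.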

Just use this function:

    ' Converts a relative URL to an absolute URI
    Function RelativeToAbsoluteUrl(ByVal baseURI As Uri, ByVal RelativeUrl As String) As Uri
        ' Parse the input, which may be relative or absolute
        Dim uriReturn As Uri = New Uri(RelativeUrl, UriKind.RelativeOrAbsolute)
        ' Make it absolute if it's relative
        If Not uriReturn.IsAbsoluteUri Then
            uriReturn = New Uri(baseURI, uriReturn)
        End If
        Return uriReturn
    End Function

Instead of resolving or completing the relative paths yourself, you can try adding a <base> element whose href attribute is set to the original base URI.

When it is placed as the first child of the <head> element, the browser should resolve all subsequent relative paths against the original location, not against wherever the document (for example, a newsletter) is located or served from.
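
As a sketch of how the base-element approach might be applied to a downloaded HTML string, the JavaScript below inserts the element right after the opening head tag. The function name `insertBaseElement` and the sample values are hypothetical, and the regex-based matching assumes the document actually contains a head tag.

```javascript
// Hypothetical helper: insert <base href="..."> as the first child of <head>.
function insertBaseElement(html, baseHref) {
  // Escape characters that would break out of the attribute value.
  const safeHref = baseHref.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  const baseTag = `<base href="${safeHref}">`;
  // Match the opening <head> tag, with or without attributes (but not <header>).
  const headOpen = /<head(\s[^>]*)?>/i;
  if (!headOpen.test(html)) {
    return html; // no <head> found; leave the document unchanged
  }
  // A replacer function avoids special handling of "$" in the inserted text.
  return html.replace(headOpen, (tag) => tag + baseTag);
}

// Assumed sample document and base URI for illustration only.
const doc = "<html><head><title>News</title></head><body></body></html>";
console.log(insertBaseElement(doc, "http://www.mysite.com/news/"));
```

This leaves every src and href in the document as written and relies entirely on the browser (or mail client) honoring the base element.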

In Firefox, a seemingly tautological round trip (reading every src/href property and writing the same value straight back) results in complete, absolute paths being written into the serialized HTML document, which makes them scriptable, saveable, and so on:

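var d = document;
var n = d.querySelectorAll('[src]'); // do the same for [href] ...
var op = "";
for (var i = 0; i < n.length; i++) {
    var ops = n[i].src;   // reading the property yields the resolved, absolute URL
    op = op + ops + "\n";
    n[i].src = ops;       // writing it back stores the absolute URL in the attribute
}
alert(op);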

Of course, none of the solutions above look at url() references: neither those in <style> elements (for background images or content rules, for example) nor those in style attributes on individual nodes.
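
If those url() references need rewriting too, one possible approach is a separate pass over the CSS text. The JavaScript sketch below is an illustration under simple assumptions (no escaped quotes or parentheses inside the URL); `absolutizeCssUrls` is a hypothetical name. For inline style elements and style attributes, the base would be the URL of the document itself.

```javascript
// Hypothetical helper: make url(...) references in a CSS string absolute.
function absolutizeCssUrls(css, baseUrl) {
  // Matches url(x), url('x') and url("x"); \1 requires a matching closing quote.
  return css.replace(/url\(\s*(['"]?)([^'")]+)\1\s*\)/gi, (match, quote, value) => {
    if (/^data:/i.test(value)) {
      return match; // inline data stays as it is
    }
    try {
      return `url(${quote}${new URL(value.trim(), baseUrl).href}${quote})`;
    } catch (e) {
      return match; // not parseable as a URL; keep the original text
    }
  });
}

// Assumed sample CSS and base URL for illustration only.
const css = "body { background-image: url('/img/bg.png'); }";
console.log(absolutizeCssUrls(css, "http://www.mysite.com/news/index.html"));
```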

Therefore, bringing the base-element approach to a valid, tested state (with a browser compatibility list) seems the more promising idea to me.
