wordpress - Insert posts programmatically while maintaining links

Submitted by 早过忘川 on 2021-01-28 05:41:57

Question


I am working on a migration script that inserts articles from XML into WordPress.

So far I have parsed the XML into PHP arrays. I loop through these arrays and insert the articles one by one into WordPress with the following code:

$post = array(
    'post_title'    => wp_strip_all_tags($article['title']),
    'post_content'  => $article['description'],
    'post_status'   => 'publish',
    'post_author'   => 1,
    'ping_status'   => 'closed',
    'post_date'     => $dateTime->format('Y-m-d H:i:s'),
    'post_type'     => $post_type
);

$result = wp_insert_post($post);

That all works. The issue is this: the XML files are an export from a website (unfortunately I do not know which CMS), and the content can contain links to files on the same site, for example:

<![CDATA[<p><strong>Shortcuts:</strong></p>
<p/>
<ul>
<li><a href="http://www.testsite.fi/julkaisut/5440/julkaisut?contentPath=fi/julkaisut/esitteet/elakkeen_hakeminen_ulkomailta">(Booklet in Finnish)</a> 
</li>
<li><a href="http://www.testsite.fi/julkaisut/5440/julkaisut?contentPath=fi/julkaisut/esitteet/sa_har_soker_du_pension_fran_utlandet">(Booklet in Swedish)</a> 
</li>
<li><a href="http://www.testsite.fi/julkaisut/5440/julkaisut?contentPath=fi/julkaisut/esitteet/pensioni_taotlemine_valismaalt">(Booklet in Estonian)</a> 
</li>
<li><a href="http://www.testsite.fi/julkaisut/5440/julkaisut?contentPath=fi/julkaisut/esitteet/poluchenie_pensii_iz_drugih_stran">(Booklet in Russian)</a> 
</li>
</ul>]]>

Testsite.fi is my own site, so these are internal links.

Those links point to PDFs, which should also end up in WordPress, but obviously the links will be different. I have the PDFs being referred to (for example elakkeen_hakeminen_ulkomailta.pdf; they are in the same folder as this script), so all that is required is to upload each file to WordPress programmatically, or move it manually to the correct location, and then update the links so that they still work.

Any clue how to do this? I am guessing it involves regular expressions, but I can't really figure it out.


Answer 1:


To change all internal links you can use this:

// [^"]* stops at the closing quote, so several links on one line are rewritten separately.
$content = preg_replace('%href="http://www\.testsite\.fi/([^"]*)"%', 'href="' . get_bloginfo('wpurl') . '/$1"', $article['description'], -1);

$post = array(
    'post_title'    => wp_strip_all_tags($article['title']),
    'post_content'  => $content,
    'post_status'   => 'publish',
    'post_author'   => 1,
    'ping_status'   => 'closed',
    'post_date'     => $dateTime->format('Y-m-d H:i:s'),
    'post_type'     => $post_type
);

$result = wp_insert_post($post);

Since the PDF links in your example do not end in a file extension, they cannot be identified by extension alone. If they did end in .pdf, it would be something along the lines of:

$upload_dir = wp_upload_dir();
$content = preg_replace('%href="http://www\.testsite\.fi/([^"]*)/([^"/]*)\.pdf"%', 'href="' . $upload_dir['url'] . '/$2.pdf"', $article['description'], -1);

where $2 is the file name of the PDF, without the extension.
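
If, as the question's example suggests, the last segment of contentPath equals the file name of the PDF (elakkeen_hakeminen_ulkomailta for elakkeen_hakeminen_ulkomailta.pdf), the links can still be matched against the list of files you actually have. The following is a minimal sketch under that assumption, not a tested migration routine. The names rewrite_content_path_links, $uploadsUrl and $knownPdfs are hypothetical; in WordPress, $uploadsUrl would typically come from wp_upload_dir()['url'].

```php
<?php
/**
 * Rewrite old-site links whose contentPath ends in the name of a PDF we have.
 * All names here are illustrative; none of them are part of WordPress.
 *
 * @param string   $html       Article HTML taken from the XML export.
 * @param string   $uploadsUrl Base URL of the folder the PDFs were uploaded to.
 * @param string[] $knownPdfs  Base names (without ".pdf") of the available PDFs.
 */
function rewrite_content_path_links(string $html, string $uploadsUrl, array $knownPdfs): string
{
    // Capture the contentPath value; [^"&]* stops at the closing quote
    // or at the start of a following query parameter.
    $pattern = '%href="http://www\.testsite\.fi/[^"]*\bcontentPath=([^"&]*)[^"]*"%';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($uploadsUrl, $knownPdfs): string {
            // Last path segment, e.g. "elakkeen_hakeminen_ulkomailta".
            $name = basename(urldecode($m[1]));

            // Assumption: only links that map to a known PDF are rewritten.
            if (!in_array($name, $knownPdfs, true)) {
                return $m[0];
            }

            return 'href="' . rtrim($uploadsUrl, '/') . '/' . $name . '.pdf"';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}
```

The list of known names could be built from the script's folder, for example with array_map(function ($f) { return basename($f, '.pdf'); }, glob(__DIR__ . '/*.pdf')). Links that do not correspond to a file in that list are left untouched, so they can be reviewed by hand afterwards.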

Note:

The href part of the regex is not necessary, but it ensures that you do not change URLs that are outside an href attribute. Depending on the scenario, you can leave that part out.
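
The question also asks how to get the PDFs themselves into WordPress. One possible approach, sketched here under the assumption that the code runs inside WordPress (for example in the same migration script, after WordPress has been loaded), is to copy each file into the uploads folder with wp_upload_bits() and register it in the media library with wp_insert_attachment(). The helper name migrate_upload_pdf and its parameters are hypothetical.

```php
<?php
/**
 * Copy a local PDF into the WordPress uploads folder and register it
 * as an attachment. Must run inside WordPress; migrate_upload_pdf is
 * a hypothetical helper name, not a WordPress function.
 *
 * @param string $path         Local path to the PDF.
 * @param int    $parentPostId Post to attach the file to (0 for none).
 * @return string|null Public URL of the uploaded file, or null on failure.
 */
function migrate_upload_pdf(string $path, int $parentPostId = 0): ?string
{
    if (!is_readable($path)) {
        return null;
    }

    // wp_upload_bits() writes the bytes into the current uploads folder
    // and returns the new file path and URL, or an error message.
    $upload = wp_upload_bits(basename($path), null, file_get_contents($path));
    if (!empty($upload['error'])) {
        return null;
    }

    $filetype = wp_check_filetype($upload['file']);

    // Register the file in the media library.
    $attachmentId = wp_insert_attachment(
        array(
            'post_mime_type' => $filetype['type'],
            'post_title'     => pathinfo($path, PATHINFO_FILENAME),
            'post_content'   => '',
            'post_status'    => 'inherit',
        ),
        $upload['file'],
        $parentPostId
    );
    if (!$attachmentId) {
        return null;
    }

    $url = wp_get_attachment_url($attachmentId);
    return $url ? $url : null;
}
```

The returned URL can then be used as the replacement target when rewriting the links in the post content, instead of assuming that every file sits directly under $upload_dir['url'].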



Source: https://stackoverflow.com/questions/31870958/wordpress-insert-posts-programmatically-while-maintaining-links
