Convert HTML Table to plain Text in Power BI

Posted by 烂漫一生 on 2021-02-10 14:57:09

Question


I am a beginner in Power BI. I have to create a report from SharePoint data. I have imported the data into a dataset. However, some columns contain text with HTML table tags or styles, like below:

<div class="ExternalClass5DA0D04953B047459697675F266FEABF">
   <p>​</p>
   <table width="395" border="0" cellspacing="0" cellpadding="0" style="width&#58;296pt;">
  <tbody>
     <tr height="115" style="height&#58;86.4pt;">
        <td width="395" height="115" class="xl64" style="width&#58;296pt;height&#58;86.4pt;">
        I am working on issue. I shall update the progress.&#160;<br>
        </td>
     </tr>
  </tbody>
   </table>
   <p><br></p>
</div>

But I would like to show only the plain text, which is "I am working on issue. I shall update the progress."


Answer 1:


In this community thread, you can find a handy function for stripping all the HTML tags.

Here's the core logic (ignoring the documentation metadata for readability):

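let
    func = (HTML) =>
        let
            // Fall back to an empty string when the value is not recognized as text (for example, null)
            Check = if Value.Is(Value.FromText(HTML), type text) then HTML else "",
            Source = Text.From(Check),
            // Split on "<" and ">" so the pieces alternate between text and tag contents
            SplitAny = Text.SplitAny(Source, "<>"),
            // Keep every other piece, starting with the first: the text outside the tags
            ListAlternate = List.Alternate(SplitAny, 1, 1, 1),
            // Drop the empty pieces left between adjacent tags
            ListSelect = List.Select(ListAlternate, each _ <> ""),
            TextCombine = Text.Combine(ListSelect, "")
        in
            TextCombine
in
    func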
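
To see what each step contributes, it may help to trace the same logic on a small input. The following is only an illustrative sketch: the Sample string is made up for the example and is not part of the original function or the SharePoint data.

```m
let
    // Hypothetical sample input, chosen only to illustrate the steps
    Sample = "<p>Hello <b>world</b></p>",
    // Splitting on "<" or ">" alternates between text and tag contents:
    // {"", "p", "Hello ", "b", "world", "/b", "", "/p", ""}
    SplitAny = Text.SplitAny(Sample, "<>"),
    // Keep one, skip one, starting with the first item:
    // {"", "Hello ", "world", "", ""}
    ListAlternate = List.Alternate(SplitAny, 1, 1, 1),
    // Remove the empty strings: {"Hello ", "world"}
    ListSelect = List.Select(ListAlternate, each _ <> ""),
    // Join what is left: "Hello world"
    TextCombine = Text.Combine(ListSelect, "")
in
    TextCombine
```

Note that the approach relies on every "<" and ">" belonging to a tag, so text containing a literal angle bracket would throw the alternation off.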

To use it, create a new blank query, paste the code above into the Advanced Editor, and give the query a name, say, TextFromHTML.

Once you have that function defined, you can use it in any of your queries. For example, here's what a step to transform the column ColWithHTML might look like:

Table.TransformColumns(#"Prior Step", {{"ColWithHTML", each TextFromHTML(_), type text}})
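
The function removes tags only, so on the sample from the question the result would still contain the &#160; entity and the line breaks and indentation between the tags. A possible follow-up, sketched here under assumptions, is a small wrapper that cleans those up. The name CleanTextFromHTML is hypothetical, only the one entity that appears in the sample text is handled, and the zero-width space replacement assumes the empty paragraph contains that character, as rich-text fields sometimes do.

```m
let
    // Hypothetical wrapper around the TextFromHTML query defined above
    CleanTextFromHTML = (HTML) =>
        let
            Stripped = TextFromHTML(HTML),
            // Replace the non-breaking-space entity seen in the sample with a plain space
            NoNbsp = Text.Replace(Stripped, "&#160;", " "),
            // Drop zero-width spaces (U+200B), assumed to be present in empty paragraphs
            NoZeroWidth = Text.Replace(NoNbsp, "#(200B)", ""),
            // Split on spaces, tabs, and line breaks, then drop the empty pieces
            Words = List.Select(
                Text.SplitAny(NoZeroWidth, " #(tab)#(cr)#(lf)"),
                each _ <> ""
            ),
            // Rejoin with single spaces so runs of whitespace collapse
            Result = Text.Combine(Words, " ")
        in
            Result
in
    CleanTextFromHTML
```

Saved as its own query, it could replace TextFromHTML in the Table.TransformColumns step above. Other entities, such as &amp;, would need their own replacements.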


Source: https://stackoverflow.com/questions/62329169/convert-html-table-to-plain-text-in-power-bi
