I have read that it's a bad idea to parse XML/HTML using regular expressions. The alternative suggestion is to use an XML parser. Does one exist in the BigQuery Standard SQ
Here is the documentation on how to use JavaScript UDFs in BigQuery, as Elliot mentioned.
https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions
I imagine the UDF might look something like this:
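CREATE TEMPORARY FUNCTION XML(x STRING)
RETURNS STRING
LANGUAGE js AS """
  // fromXML is defined by the library loaded in OPTIONS below.
  var data = fromXML(x);
  // Return the text of the top-level title element.
  return data.title;
"""
OPTIONS(
  library="gs:///from-xml.min.js"
);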
SELECT XML(a) FROM UNNEST(["<title>Title of Page</title>"]) AS a;
Here, from-xml.min.js comes from the from-xml library and has been uploaded to a bucket in your GCS account.
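If some rows might contain NULLs, malformed markup, or a title that is not plain text, the UDF body could be made a bit more defensive. The following is only a rough sketch: `extractTitle` is a hypothetical helper name, and the parser is passed in as the `parse` argument (standing in for `fromXML`) so the logic can be tried locally without BigQuery. Whether the real parser throws on bad input, or returns arrays or objects for repeated or nested elements, is an assumption here.

```javascript
// Hypothetical helper mirroring the UDF body. `parse` stands in for the
// library's fromXML and is injected so this can run outside BigQuery.
function extractTitle(xml, parse) {
  // BigQuery hands SQL NULL to a JavaScript UDF as null.
  if (xml === null || xml === undefined) return null;
  var data;
  try {
    data = parse(xml);
  } catch (e) {
    // Assumption: if the parser throws, return NULL instead of failing the query.
    return null;
  }
  var title = data ? data.title : undefined;
  // Only return plain strings; anything else (missing, array, object) becomes NULL.
  return typeof title === "string" ? title : null;
}
```

Inside the UDF, the helper would be pasted into the JavaScript body and the last line would become `return extractTitle(x, fromXML);`.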