BigQuery: SPLIT() returns only one value

时光说笑 2020-12-05 11:35

I have a page URL column whose components are delimited by /. I ran the SPLIT() function on it in BigQuery, but it only returns the first value.

5 answers
  • 2020-12-05 12:18

    2018 standardSQL update:

    #standardSQL
    SELECT SPLIT(path, '/')[OFFSET(0)] part1,
           SPLIT(path, '/')[OFFSET(1)] part2,
           SPLIT(path, '/')[OFFSET(2)] part3
    FROM (SELECT "/a/b/aaaa?c" path)
    

    Now I understand you want them in different columns.

    An alternative to the query you provided (legacy SQL):

    SELECT FIRST(SPLIT(path, '/')) part1,
           NTH(2, SPLIT(path, '/')) part2,
           NTH(3, SPLIT(path, '/')) part3
    FROM (SELECT "/a/b/aaaa?c" path)
    

    NTH(X, SPLIT(s)) returns the Xth value from the SPLIT. FIRST(s) is the same as NTH(1, s).

  • 2020-12-05 12:18

    You can also use the SPLIT function as follows. However, you need to know how many '/' characters your URL contains, or add enough entries so that a URL with more '/' characters still gets all of its values in separate columns:

      SPLIT(`url`, '/')[safe_ordinal(1)] AS `Col1`, 
      SPLIT(`url`, '/')[safe_ordinal(2)] AS `Col2`,
      SPLIT(`url`, '/')[safe_ordinal(3)] AS `Col3`, 
      SPLIT(`url`, '/')[safe_ordinal(4)] AS `Col4`,
      .
      .
      SPLIT(`url`, '/')[safe_ordinal(N)] AS `ColN`
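
    The fragment above can be expanded into a complete query. A minimal sketch, assuming standard SQL and a hypothetical inline `url` value standing in for a real table; it splits once in a subquery (the `parts` alias is illustrative) and relies on SAFE_ORDINAL returning NULL when the position is out of range:

    ```sql
    #standardSQL
    SELECT parts[SAFE_ORDINAL(1)] AS Col1,  -- empty string: the text before the leading '/'
           parts[SAFE_ORDINAL(2)] AS Col2,
           parts[SAFE_ORDINAL(3)] AS Col3,
           parts[SAFE_ORDINAL(4)] AS Col4,
           parts[SAFE_ORDINAL(5)] AS Col5   -- NULL: this sample URL has only four parts
    FROM (
      -- Hypothetical sample row; replace with your own table and column.
      SELECT SPLIT(url, '/') AS parts
      FROM (SELECT "/a/b/aaaa?c" AS url)
    )
    ```

    Splitting once and indexing the resulting array avoids repeating the SPLIT call for every column.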
    
  • 2020-12-05 12:19

    Solved it in a way.

    SELECT
      date,
      hits_time,
      fullVisitorId,
      visitNumber,
      hits_hitNumber,
      X.page_path,
      REGEXP_EXTRACT(X.page_path, r'/(\w*)\/') AS one,
      REGEXP_EXTRACT(X.page_path, r'/\w*\/(\w*)') AS two,
      REGEXP_EXTRACT(X.page_path, r'/\w*\/\w*\/(\w*)') AS three,
      REGEXP_EXTRACT(X.page_path, r'/\w*/\w*/\w*\/(\w*)\/.*') AS four
    FROM (
      SELECT
        date, hits_time, fullVisitorId, visitNumber, hits_hitNumber,
        REGEXP_REPLACE(hits_page_pagePath, '-', '') AS page_path
      FROM [Intent.All2mon]
    ) X
    LIMIT 1000
    
  • 2020-12-05 12:25

    In standard SQL, you can index into an array with the following operators:

    array[OFFSET(zero_based_offset)]
    array[ORDINAL(one_based_ordinal)]
    

    For example:

    SELECT SPLIT(path, '/')[OFFSET(1)] part2,
           SPLIT(path, '/')[ORDINAL(2)] part2_again,
           SPLIT(path, '/')[ORDINAL(3)] part3
    FROM (SELECT "/a/b/aaaa?c" path)
    
    part2   part2_again part3    
    a       a           b
    

    part1, in this case, is the empty string (the text before the first slash).
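
    If the leading empty string is unwanted, one option is to filter it out before indexing. A sketch in standard SQL using the same sample path; the `parts`, `part` and `pos` aliases are only illustrative, and WITH OFFSET plus ORDER BY is there to keep the elements in their original order:

    ```sql
    #standardSQL
    SELECT parts[SAFE_OFFSET(0)] AS part1,
           parts[SAFE_OFFSET(1)] AS part2,
           parts[SAFE_OFFSET(2)] AS part3
    FROM (
      -- Rebuild the array without empty elements, preserving order.
      SELECT ARRAY(
               SELECT part
               FROM UNNEST(SPLIT(path, '/')) AS part WITH OFFSET AS pos
               WHERE part != ''
               ORDER BY pos
             ) AS parts
      FROM (SELECT "/a/b/aaaa?c" AS path)
    )
    ```

    For this path, the three columns should come out as a, b and aaaa?c. Note that the filter also drops empty segments in the middle of a path (for example from a doubled slash), which may or may not be what you want.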

  • 2020-12-05 12:27

    This works for me:

    SELECT SPLIT(path, '/') part
    FROM (SELECT "/a/b/aaaa?c" path)
    
    Row part     
    1   a    
    2   b    
    3   aaaa?c
    

    Not sure why it wouldn't work for you. What does your data look like?
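
    In standard SQL, SPLIT returns a single ARRAY value per input row, so getting one row per part, as in the output above, takes an UNNEST. A sketch, where the aliases `t`, `part` and `pos` are illustrative:

    ```sql
    #standardSQL
    SELECT pos, part
    FROM (SELECT "/a/b/aaaa?c" AS path) AS t,
         UNNEST(SPLIT(t.path, '/')) AS part WITH OFFSET AS pos  -- pos is the zero-based position
    ORDER BY pos
    ```

    Unlike the output above, this also returns a row for the empty string before the first slash (at pos 0).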
