Test for numeric value?

前端 未结 3 964
别那么骄傲
别那么骄傲 2021-01-20 17:44

The vendor data we load in our staging table is rather dirty. One column in particular captures number data but 40% of the time has garbage characters or random strings.

相关标签:
3条回答
  • 2021-01-20 18:15
    WITH x (stringval) AS 
    (
    VALUES ('x2'),(''),('2.2.'),('5-'),('-5-'),('--5'),('.5'),('2 2'),('0.5-'),(' 1 '),('2  '),('3.'),('-4.0')
    )
    SELECT stringval,
    CASE WHEN (
    
    -- Whitespace must not appear in the middle of a number 
    -- (but trailing and/or leading whitespace is permitted)
    RTRIM(LTRIM( stringval )) NOT LIKE '% %'
    
    -- A number cannot start with a decimal point
    AND LTRIM( stringval ) NOT LIKE '.%'
    
    -- A negative decimal number must contain at least one digit between 
    -- the negative sign and the decimal point
    AND LTRIM( stringval ) NOT LIKE '-.%'
    
    -- The negative sign may only appear at the beginning of the number
    AND LOCATE( '-', LTRIM(stringval)) IN ( 0, 1 )
    
    -- A number must contain at least one digit
    AND TRANSLATE( stringval, '0000000000', '123456789') LIKE '%0%'
    
    -- Allow up to one negative sign, followed by up to one decimal point
    AND REPLACE( 
        TRANSLATE( RTRIM(LTRIM(stringval)), '000000000', '123456789'), 
        '0', '') IN ('','-','.','-.')
    )
    THEN 'VALID'
    ELSE 'INVALID'
    END AS stringisvalidnumber
    FROM x
    ;
    
    0 讨论(0)
  • 2021-01-20 18:20

    A fairly reliable (but somewhat hackish) way is to compare the string to its upper- and lower-case self (numbers don't have different cases). As long as your data that is bringing in characters only includes Latin characters, you should be fine:

    SELECT input, CASE
        WHEN UPPER(input) = LOWER(input) THEN TO_NUMBER(input)
        ELSE 0
    END AS output
    FROM source
    

    Another option would be to use the TRANSLATE function:

    SELECT input,
        CASE 
            WHEN TRANSLATE(CAST(input as CHAR(10)), '~~~~~~~~~~~~~', '0123456789-. ') = '~~~~~~~~~~' THEN CAST(input AS DECIMAL(12, 2))
            ELSE 0
        END AS num
    FROM x
    
    0 讨论(0)
  • 2021-01-20 18:31

    Check this out:

    SELECT Mobile, 
     TRANSLATE(Mobile, '~~~~~~~~~~', '0123456789') AS FirstPass, 
     TRANSLATE(TRANSLATE(Mobile, '~~~~~~~~~~', '0123456789'), '', '~') AS Erroneous, 
    REPLACE(TRANSLATE(Mobile, '', TRANSLATE(TRANSLATE(Mobile, '~~~~~~~~~~', '0123456789'), '', '~')), ' ', '') AS Corrected 
     FROM Person WHERE Mobile <> '' FETCH FIRST 100 ROWS ONLY
    

    The table is "Person" and the field that you want to check is "Mobile". If you work a little bit more on this, you can build an UPDATE to fix the entire table

    0 讨论(0)
提交回复
热议问题