Left padding a string in pig

情话喂你 2021-01-23 18:40

I would like to left-pad a string field with zeros so that every value has a fixed length of 40 characters. Is there any way to do that in Pig?

thanks in advance, Clairvoyant

1 Answer
  •  礼貌的吻别
    2021-01-23 19:20

    The number of zeros has to be generated dynamically from the length of each input string, so I don't think this is possible in native Pig.
    It is straightforward with a UDF, though.
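
To make the "generated dynamically" part concrete, here is a minimal plain-Java sketch of the calculation: the number of zeros is the target width minus the length of the input. The class `LeftPadSketch` and the method `leftPadZeros` are hypothetical names used only for illustration; they are not part of Pig or commons-lang. The sketch assumes the same edge-case behaviour as `StringUtils.leftPad` (a null input stays null, and a value already at or beyond the target width is returned unchanged).

```java
public class LeftPadSketch {

    // Hypothetical helper for illustration only; not a Pig or commons-lang API.
    public static String leftPadZeros(String input, int width) {
        if (input == null) {
            return null; // assumption: mirror StringUtils.leftPad, null in -> null out
        }
        // The zero count depends on each row's length, hence "dynamic".
        int missing = width - input.length();
        if (missing <= 0) {
            return input; // already at or beyond the target width
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = 0; i < missing; i++) {
            sb.append('0');
        }
        return sb.append(input).toString();
    }

    public static void main(String[] args) {
        // Sample values taken from input.txt
        System.out.println(leftPadZeros("33", 40));
        System.out.println(leftPadZeros("apachepig", 40));
    }
}
```

The UDF in this answer delegates this same calculation to `StringUtils.leftPad`, so a hand-written loop like this is only needed if you want to avoid the commons-lang dependency.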

    input.txt

    11111
    222222222
    33
    org.apache.hadoop.util.NativeCodeLoader
    apachepig
    

    PigScript:

    REGISTER leftformat.jar;
    
    A = LOAD 'input.txt' USING PigStorage() AS(f1:chararray);
    B = FOREACH A GENERATE format.LEFTPAD(f1);
    DUMP B;
    

    Output:

    (0000000000000000000000000000000000011111)
    (0000000000000000000000000000000222222222)
    (0000000000000000000000000000000000000033)
    (0org.apache.hadoop.util.NativeCodeLoader)
    (0000000000000000000000000000000apachepig)
    

    UDF code: the Java class below is compiled and packaged as leftformat.jar.
    LEFTPAD.java

    package format;
    import java.io.IOException;
    import org.apache.commons.lang.StringUtils;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;
    
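    public class LEFTPAD extends EvalFunc<String> {
        @Override
        public String exec(Tuple arg) throws IOException {
            try {
                // The first field of the input tuple is the value to pad.
                String input = (String) arg.get(0);
                // Left-pad with "0" to 40 characters; null stays null and
                // values of 40 or more characters are returned unchanged.
                return StringUtils.leftPad(input, 40, "0");
            } catch (Exception e) {
                throw new IOException("Caught exception while processing the input row ", e);
            }
        }
    }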
    

    UPDATE:

    1. Download the jar files (apache-commons-lang.jar, piggybank.jar, pig-0.11.0.jar and hadoop-common-2.6.0-cdh5.4.5). Links for the first three:
    http://www.java2s.com/Code/Jar/a/Downloadapachecommonslangjar.htm
    http://www.java2s.com/Code/Jar/p/Downloadpiggybankjar.htm
    http://www.java2s.com/Code/Jar/p/Downloadpig0110jar.htm
    
    2. Add the three jar files to your classpath (adjust the paths and file names to match what you downloaded)
      >> export CLASSPATH=/tmp/pig-0.11.1.jar:/tmp/piggybank.jar:/tmp/apache-commons-lang.jar
    
    3. Create a directory named format
        >>mkdir format
    
    4. Compile LEFTPAD.java; make sure all three jars are on the classpath, otherwise compilation will fail
        >>javac LEFTPAD.java
    
    5. Move the class file into the format folder
        >>mv  LEFTPAD.class format
    
    6. Create a jar file named leftformat.jar
        >>jar -cf leftformat.jar format/
    
    7. The jar file is now created; REGISTER it in your Pig script
    
    Example from command line:
    $ mkdir format
    $ javac LEFTPAD.java 
    $ mv LEFTPAD.class format/
    $ jar -cf leftformat.jar format/
    $ ls
    LEFTPAD.java    format      input.txt   leftformat.jar  script.pig
    
