Blob fields in SAS gets truncated

坚强是说给别人听的谎言 提交于 2019-12-25 07:35:03

问题


I have been working on a SAS job that extracts a table from SQL server and then loads that table to an Oracle table.

One of the fields there in SQL server is blob and they can be as big as 1G. I am getting length warnings when I run this blobs on oracle table seems to be truncated and as a result files there are corrupt.

I have seen SAS stating that character variable can be max 32K but SAS also states it can access blobs up to 2G.

How can we achieve that?

proc sql;
create view work.W2K3NU8 as
  select
     ID,
     DNUMBER,
     FILENAME,
     FILE   
        format = $HEX2048.
        informat = $HEX2048.,
     (input(compress(DATEENTERED),YYMMDD10.)) as DATEENTERED length = 8
        format = date.
        informat = date.
        label = 'DATEENTERED',
     (input(compress(DATEADDED),YYMMDD10.)) as DATEADDED length = 8
        format = date.
        informat = date.
        label = 'DATEADDED',
     (input(compress(DATECHANGED),YYMMDD10.)) as DATECHANGED length = 8
        format = date.
        informat = date.
        label = 'DATECHANGED',
     TYPE
from &SYSLAST;
quit;

AND here is data step

      data trd.GAFILES
          (dbnull = (
                     ID = NO
                     DNUMBER = YES
                     FILENAME = YES
                     GA_FILE = YES
                     DATEENTERED = YES
                     DATAADDED = YES
                     DATECHANGED = YES
                     TYPE = YES
                     ETL_CREATE = YES
                     ETL_UPDATE = YES));
     attrib ID length = $255
        format = $255.
        informat = $255.
        label = 'ID'; 
     attrib DNUMBER length = $10
        format = $10.
        informat = $10.
        label = 'DNUMBER'; 
     attrib FILENAME length = $255
        format = $255.
        informat = $255.
        label = 'FILENAME'; 
     attrib GA_FILE length = $4096
        format = $HEX2048.
        informat = $HEX2048.
        label = 'GA_FILE'; 
     attrib DATEENTERED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATEENTERED'; 
     attrib DATAADDED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATAADDED'; 
     attrib DATECHANGED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATECHANGED'; 
     attrib TYPE length = $100
        format = $100.
        informat = $100.
        label = 'TYPE'; 
     attrib ETL_CREATE length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'ETL_CREATE'; 
     attrib ETL_UPDATE length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'ETL_UPDATE'; 
     call missing(of _all_);
     stop;
  run;

回答1:


SAS datasets won't support > 32767 character size. I'm not sure where you saw that it supports greater; you're probably reading the ACCESS reference, which is describing the different data types of the DBMSs (ie, in the DB2 section, it describes BLOBs and CLOBs as allowing up to 2GB in size, but that's describing what DB2 supports - not SAS's support).

SAS will happily access BLOBs, but it won't take more than 32767 from it. You would have to read in chunks, or use DBMS-specific language in a pass through session (which would have to pass it through without touching it). You can read in chunks like so (fill in appropriate substring function and connection information):

proc sql;
connect to <>;
create table SASTBL as 
  select * from connection to <> (
   select substring_Function(blobfield,1,32767) as blob_1,
          substring_Function(blobfield,32768,32767) as blob_2,
          substring_Function(blobfield,65535,32767) as blob_3,
(... etc ... )
  from your_tbl;
);
quit;

If you have 9.4, you also might be able to use FedSQL to do the conversion; I'm not very familiar with FedSQL, but the purpose of it is to support more data types than SAS supports. It doesn't explicitly say that it can support BLOBs (the notes on BLOB are consistently 'mapped to a similar data type', which presumably means char or varchar) but it might be worth a shot if you have 9.4.



来源:https://stackoverflow.com/questions/24919190/blob-fields-in-sas-gets-truncated

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!