Replace double quotes (within qualifiers) in CSV for SSIS import

╄→гoц情女王★ submitted on 2019-12-04 17:38:12

To replace double quotes with single quotes according to your specifications, use this simple regex. This regex will allow whitespace at the beginning and/or end of lines.

string pattern = @"(?<!^\s*|,)""(?!,""|\s*$)";
string resultString = Regex.Replace(subjectString, pattern, "'", RegexOptions.Multiline);

This is the explanation of the pattern:

// (?<!^\s*|,)"(?!,"|\s*$)
// 
// Options: ^ and $ match at line breaks
// 
// Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!^\s*|,)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «^\s*»
//       Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
//       Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
//          Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
//       Match the character “,” literally «,»
// Match the character “"” literally «"»
// Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!,"|\s*$)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «,"»
//       Match the characters “,"” literally «,"»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «\s*$»
//       Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
//          Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//       Assert position at the end of a line (at the end of the string or before a line break character) «$»
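As a quick check of how the pattern behaves, here is a minimal sketch that runs it against two sample rows. The class name, method name and sample data are made up for illustration and are not from the original question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerQuoteDemo
{
    // Same pattern as above: a double quote that is not at the start of a line,
    // not right after a comma, not followed by ," and not at the end of a line.
    private const string Pattern = @"(?<!^\s*|,)""(?!,""|\s*$)";

    public static string ReplaceInnerQuotes(string csvText)
    {
        // Multiline makes ^ and $ match at every line break, not only at the string ends.
        return Regex.Replace(csvText, Pattern, "'", RegexOptions.Multiline);
    }

    public static void Main()
    {
        // Hypothetical sample rows with stray quotes inside the second field.
        string sample = "\"1\",\"John \"Johnny\" Smith\",\"NY\"\r\n" +
                        "\"2\",\"5\" pipe\",\"TX\"";

        Console.WriteLine(ReplaceInnerQuotes(sample));
        // "1","John 'Johnny' Smith","NY"
        // "2","5' pipe","TX"
    }
}
```

One thing to keep in mind: a stray quote that sits directly after a comma inside a field (for example `"a,"b"`) looks like an opening qualifier to this pattern, so it is left unchanged.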

You can split the columns with this regex match pattern:

/(?:(?<=^")|(?<=",")).*?(?:(?="\s*$)|(?=","))/g

See this demo.
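For use from C#, the same pattern could be applied roughly like this. This is only a sketch: the `/.../g` delimiters and flag are dropped because `Regex.Matches` already returns every match, and the class name, method name and sample row are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitColumnsDemo
{
    // The pattern above written as a .NET verbatim string (each " is doubled).
    private const string FieldPattern =
        @"(?:(?<=^"")|(?<="","")).*?(?:(?=""\s*$)|(?="",""))";

    // Expects a single line; ^ and $ are the string boundaries here
    // because RegexOptions.Multiline is not set.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Regex.Matches(line, FieldPattern))
        {
            fields.Add(m.Value);
        }
        return fields;
    }

    public static void Main()
    {
        // Hypothetical sample row with quotes inside the second field.
        string line = "\"1\",\"John \"Johnny\" Smith\",\"NY\"";
        foreach (string field in SplitLine(line))
        {
            Console.WriteLine(field);
        }
        // 1
        // John "Johnny" Smith
        // NY
    }
}
```

The pattern treats only `","` as a field boundary, so it assumes every column is qualified; unquoted columns would not be split correctly.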

When loading a CSV that contains double quotes and commas, there is one limitation: extra double quotes are added, and the data is also enclosed in double quotes. You can check this in the preview of the source file. So add a Derived Column task and give it the expression below:

REPLACE(REPLACE(REPLACE(RIGHT(SUBSTRING(TRIM(COL2),1,LEN(COL2) - 1),LEN(COL2) - 2)," ","@"),"\"\"","\""),"@"," ")

The RIGHT(SUBSTRING(...)) part removes the double quotes that enclose the data.
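To see what the expression does to a value, the same steps can be sketched in C#, for instance to try them on a few sample strings before wiring up the Derived Column. This is an approximation, not the SSIS expression itself: it skips the temporary space-to-`@` swap, it only strips the outer characters when they really are double quotes, and the names and sample value are made up.

```csharp
using System;

public static class UnqualifyDemo
{
    // Strip the enclosing qualifiers, then collapse each doubled quote to a single one.
    public static string Unqualify(string raw)
    {
        string trimmed = raw.Trim();
        if (trimmed.Length < 2 || trimmed[0] != '"' || trimmed[trimmed.Length - 1] != '"')
        {
            return trimmed; // not a qualified value, leave it alone
        }

        // Drop the first and the last character (the enclosing quotes).
        string inner = trimmed.Substring(1, trimmed.Length - 2);

        // "" inside the data stands for one literal ".
        return inner.Replace("\"\"", "\"");
    }

    public static void Main()
    {
        // Hypothetical raw value as it might show up in the source preview.
        Console.WriteLine(Unqualify("\"He said \"\"hi\"\"\""));
        // He said "hi"
    }
}
```

Because the SSIS expression turns spaces into `@` and back again, any `@` already present in the data would come out as a space; that is worth checking if the column can contain e-mail addresses.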

Try this and do let me know if this is helpful

user11096939

Use " as the text qualifier for the CSV destination. Before inserting values into the CSV destination, add a derived column with this expression:

REPLACE(REPLACE([Column1],",",""),"\"","")

This will retain " in your text field
