Question
I am trying to automate a few reports that are built from CSV exports from NetSuite (our ERP software). The files never import into Power Query correctly because they start with 6 "header" rows. These rows do not contain the correct number of commas, so Power Query shows only 1 column of data. Currently I open each file in Notepad++, delete those 6 rows, and then import the file into Power Query.
Is there a way to skip the first 6 rows using Power Query code so that the CSV is read correctly? Below is an example of the data I am working with.
Fake Cocoa Company LLC
"Fake Cocoa Company, LLC (Consolidated)"
Sales Order Detail - Hotel - BH
"January 1, 2016 - December 31, 2016"
"Options: Show ZerosFilters: Customer/Project (equal to FCC - Hotel Hotel ), Validated Status (not equal to Cancelled, Closed )"
Document Number ,Date ,Ship To ,Item: Description (Sales) ,Quantity ,Validated Status ,Unit Price ,Aggregate Amount
Sales Orders,,,,,,,
669,9/15/2016,Receiving - CCLV Hotel 2880 Some Place Blvd South Hotel Hotel Some Place CA 91089,100% Country Caf Liquid Cocoa,5,Billed,$75.68,$378.40
660,,,,,,,
,9/15/2016,Receiving - MAIN OCEAN Hotel 4300 Some Place Blvd SO Some Place CA 91089,100% Country Caf Liquid Cocoa,10,Billed,$7.68,$75.80
,9/15/2016,Receiving - MAIN OCEAN Hotel 4300 Some Place Blvd SO Some Place CA 91089,Fake Cocoa Grand - Whole Bean 5/5LB,8,Billed,$17.80,$72.00
,9/15/2016,Receiving - MAIN OCEAN Hotel 4300 Some Place Blvd SO Some Place CA 91089,Fake Cocoa Grand 28/9oz,6,Billed,$5.54,$39.24
,9/15/2016,Receiving - MAIN OCEAN Hotel 4300 Some Place Blvd SO Some Place CA 91089,Fake Cocoa Grand 42/2oz,4,Billed,$1.32,$7.28
,9/15/2016,Receiving - MAIN OCEAN Hotel 4300 Some Place Blvd SO Some Place CA 91089,Fake Cocoa Caf - Whole Bean 5/5LB,2,Billed,$2.80,$28.00
Total - 660,,,,,,,"$203.32"
Answer 1:
Using the native Power Query CSV parser:
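let
    file_path = "C:\your_path\csv.txt",
    file = File.Contents(file_path),
    // Read the file as a list of raw text lines
    src = Lines.FromBinary(file),
    // Drop the 6 title lines that precede the real header row
    skip = List.Skip(src, 6),
    // Rejoin the remaining lines and let the CSV parser split the columns
    combine = Text.Combine(skip, "#(lf)"),
    csv = Csv.Document(combine),
    promote = Table.PromoteHeaders(csv)
in
    promote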
Answer 2:
Table.Skip can do what you want.
The second parameter can be either a number (e.g. 6) or a condition (e.g. (#"Position of ""Options: Show ZerosFilters: Customer/Project (equal to FCC - Hotel Hotel ), Validated Status (not equal to Cancelled, Closed )""" + 1)).
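To illustrate both forms of the second parameter, here is a minimal sketch rather than a definitive implementation. It assumes a hypothetical file path, that the export has 8 columns as in the sample data, and that the real header row begins with "Document Number". Passing Columns = 8 to Csv.Document is my own addition, not part of this answer: it is meant to keep the parser from collapsing the table to one column because of the short title lines.

```powerquery
let
    // Hypothetical path; replace it with the location of your export
    FilePath = "C:\your_path\export.csv",
    // Assumption: 8 columns, as in the sample. Forcing the column count
    // keeps the short title lines from reducing the table to 1 column.
    Source = Csv.Document(
        File.Contents(FilePath),
        [Delimiter = ",", Columns = 8, QuoteStyle = QuoteStyle.Csv]
    ),
    // Form 1: skip a fixed number of rows (shown for comparison, not used below)
    SkippedByCount = Table.Skip(Source, 6),
    // Form 2: skip rows while the condition is true, i.e. until the first
    // row whose first field starts with "Document Number"
    SkippedByCondition = Table.Skip(
        Source,
        each [Column1] = null or not Text.StartsWith([Column1], "Document Number")
    ),
    Promoted = Table.PromoteHeaders(SkippedByCondition)
in
    Promoted
```

The condition form does not depend on the exact number of title rows, which may help if NetSuite adds or removes a line in a future export.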
Answer 3:
The issue is that you import the file into Power Query using "From CSV" as the source. The descriptive lines at the top of the file break the automatic transformation. To prevent this, you have to import the file into PQ in another way. The problem is well described on the Excelguru blog by Ken (BTW: I warmly recommend his book).
Here is the code:
let
    /* Get the raw line by line contents of the file, preventing PQ from interpreting it */
    fnRawFileContents = (fullpath as text) as table =>
        let
            Value = Table.FromList(Lines.FromBinary(File.Contents(fullpath)), Splitter.SplitByNothing())
        in
            Value,
    /* Use function to load file contents */
    Source = fnRawFileContents("D:\yourfile.csv"),
    #"Removed Top Rows" = Table.Skip(Source, 6),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Top Rows", "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6", "Column1.7", "Column1.8"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter", {{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", type text}, {"Column1.4", type text}, {"Column1.5", type text}, {"Column1.6", type text}, {"Column1.7", type text}, {"Column1.8", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type")
in
    #"Promoted Headers"
Source: https://stackoverflow.com/questions/39535275/skip-6-rows-before-reading-into-powerquery