Question
I need to build an ETL process for the following file:
LIST OF TRANSACTIONS
COD. SE ,COMERCIAL NAME ,TYPE ,DATE OP ,TIME OP ,DATE TX ,ID UNIQUE ,
1010101 ,CARL GAME ,1244 ,09/12/2020 ,190047 ,201207 ,73777777777777777777777,
2020202 ,UNIQUE KINGDOM ,1244 ,08/12/2020 ,84943 ,201208 ,73777888888888888888888,
Cantidad de Registros : 2
Cantidad Importe
Soles 3 0000000.00
Dolares 0 000000.00
I download this kind of file from a data server, and there is no way I can change its format.
I need the following process in Python:
1. Extract the contents of the zip file.
2. Take the .csv file and delete the first line and the last 4 lines, because they contain no useful information.
3. Parse the data and assign types, because the columns hold numbers, strings, and dates.
4. Load the data in one of two ways:
4.1. Append the data to a SQL table.
4.2. Replace the data in a SQL table.
5. Create a script that runs every day so the whole process is automated.
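Steps 1 and 2 can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: the function name `iter_data_lines` is hypothetical, and it assumes the archive holds exactly one .csv member and that the export is Latin-1 encoded. It streams the file line by line and holds back the most recent 4 lines, so the summary block is dropped without loading the whole file into memory.

```python
import io
import zipfile
from collections import deque


def iter_data_lines(zip_path, skip_head=1, skip_tail=4):
    """Yield the lines of the .csv inside the archive, minus the first
    `skip_head` lines and the last `skip_tail` lines."""
    with zipfile.ZipFile(zip_path) as zf:
        # Assumption: the archive holds exactly one .csv member.
        name = next(n for n in zf.namelist() if n.lower().endswith(".csv"))
        with zf.open(name) as raw:
            # Assumption: the export is Latin-1 encoded; adjust if needed.
            text = io.TextIOWrapper(raw, encoding="latin-1")
            held = deque()  # the most recent lines, not yet known to be data
            for i, line in enumerate(text):
                if i < skip_head:
                    continue  # title line
                held.append(line)
                if len(held) > skip_tail:
                    # Anything older than the last `skip_tail` lines is data.
                    yield held.popleft()
```

The column-header line is still the first line yielded, so the result can be passed to `csv.reader` or to any loader that accepts an iterable of text lines.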
There are two kinds of files:
- One that appends data, which can be about 320 MB in size.
- Another that replaces the existing data, which is about 3 GB in size.
Please let me know what I can do. So far I have only tried this:
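Given those sizes, one option is to load the rows in batches rather than all at once. The sketch below uses `sqlite3` purely as a stand-in, since the question does not name the target database. The function `load_rows`, the table name, and the column names are all hypothetical. It expects an iterable of text lines whose first line is the column header, and it supports both the append and the replace case.

```python
import csv
import sqlite3
from itertools import islice

# Hypothetical column names derived from the header row in the question.
COLUMNS = ["cod_se", "comercial_name", "tx_type", "date_op",
           "time_op", "date_tx", "id_unique"]


def load_rows(conn, lines, table="transactions", mode="append",
              batch_size=50_000):
    """Insert CSV lines into `table` in batches and return the row count.

    mode="append" adds to the existing rows; mode="replace" empties the
    table first. Everything is committed once at the end.
    """
    if mode not in ("append", "replace"):
        raise ValueError("mode must be 'append' or 'replace'")
    col_defs = ", ".join(f"{c} TEXT" for c in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")
    if mode == "replace":
        conn.execute(f"DELETE FROM {table}")
    placeholders = ", ".join("?" for _ in COLUMNS)
    insert = f"INSERT INTO {table} VALUES ({placeholders})"

    reader = csv.reader(lines)
    next(reader, None)  # skip the column-header row
    total = 0
    while True:
        # Trim the space padding and drop the empty field that the
        # trailing comma on every row produces.
        batch = [[v.strip() for v in row[:len(COLUMNS)]]
                 for row in islice(reader, batch_size)]
        if not batch:
            break
        conn.executemany(insert, batch)
        total += len(batch)
    conn.commit()
    return total
```

Storing every column as text keeps the sketch short; real column types and date conversion would depend on the target database. Because the delete and the inserts are committed together, a failed replace run should leave the previous data in place.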
import pandas as pd

# Read the raw file without treating any row as a header.
df = pd.read_csv(r'C:\\Users\....\OP00077.csv', header=None)
filas = len(df.index)  # "filas" means "rows"
print("Filas: ", filas)

# Drop the first row (title) and the last four rows (summary block).
df = df.iloc[1:-4]
filas = len(df)
print("Filas: ", filas)
print(df)
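As an alternative to dropping rows after loading, `pandas.read_csv` can skip leading and trailing lines itself through `skiprows` and `skipfooter` (the latter requires `engine="python"`). The sketch below runs on the sample from the question. The function name `read_transactions` is hypothetical, and the date formats are assumptions read off the sample: `DATE OP` as dd/mm/yyyy and `DATE TX` as yymmdd.

```python
import io

import pandas as pd

# Sample copied from the question; a real run would pass the file path.
SAMPLE = """LIST OF TRANSACTIONS
COD. SE ,COMERCIAL NAME ,TYPE ,DATE OP ,TIME OP ,DATE TX ,ID UNIQUE ,
1010101 ,CARL GAME ,1244 ,09/12/2020 ,190047 ,201207 ,73777777777777777777777,
2020202 ,UNIQUE KINGDOM ,1244 ,08/12/2020 ,84943 ,201208 ,73777888888888888888888,
Cantidad de Registros : 2
Cantidad Importe
Soles 3 0000000.00
Dolares 0 000000.00
"""


def read_transactions(source):
    """Read the report, skipping the title line and the 4 summary lines."""
    df = pd.read_csv(
        source,
        skiprows=1,       # title line
        skipfooter=4,     # summary block; only the python engine supports it
        engine="python",
        dtype=str,        # read as text first so the 23-digit id stays exact
    )
    # The trailing comma on every row creates an empty, unnamed last column.
    df = df.dropna(axis=1, how="all")
    # Header names and values are padded with spaces in the export.
    df.columns = [c.strip() for c in df.columns]
    df = df.apply(lambda col: col.str.strip())
    # Assumed formats: DATE OP is dd/mm/yyyy, DATE TX is yymmdd.
    df["DATE OP"] = pd.to_datetime(df["DATE OP"], format="%d/%m/%Y")
    df["DATE TX"] = pd.to_datetime(df["DATE TX"], format="%y%m%d")
    df["TYPE"] = df["TYPE"].astype(int)
    return df


df = read_transactions(io.StringIO(SAMPLE))
print(df.dtypes)
```

Reading with `dtype=str` and converting column by column avoids pandas guessing a numeric type for `ID UNIQUE`, which is too long for a 64-bit integer.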
Source: https://stackoverflow.com/questions/65330842/flat-file-python-sql-automatize-process-etl-script