Question
I have a dbf database encoded in cp1250, and I am reading it with the following code:
import csv
from dbfpy import dbf
import os
import sys
filename = sys.argv[1]
if filename.endswith('.dbf'):
    print "Converting %s to csv" % filename
    csv_fn = filename[:-4]+ ".csv"
    with open(csv_fn,'wb') as csvfile:
        in_db = dbf.Dbf(filename)
        out_csv = csv.writer(csvfile)
        names = []
        for field in in_db.header.fields:
            names.append(field.name)
        #out_csv.writerow(names)
        for rec in in_db:
            out_csv.writerow(rec.fieldData)
        in_db.close()
        print "Done..."
else:
    print "Filename does not end with .dbf"
The problem is that the resulting csv file is wrong: its encoding is ANSI and some characters are corrupted. How can I read the dbf file correctly?
EDIT 1
I tried different code from https://pypi.python.org/pypi/simpledbf/0.2.4, but it fails with an error.
Source 2:
from simpledbf import Dbf5
import os
import sys
dbf = Dbf5('test.dbf', codec='cp1250');
dbf.to_csv('junk.csv');
Output:
python program2.py
Traceback (most recent call last):
  File "program2.py", line 5, in <module>
    dbf = Dbf5('test.dbf', codec='cp1250');
  File "D:\ProgramFiles\Anaconda\lib\site-packages\simpledbf\simpledbf.py", line 557, in __init__
    assert terminator == b'\r'
AssertionError
I really don't know how to solve this problem.
Answer 1:
Try using my dbf library:
import dbf
with dbf.Table('test.dbf') as table:
    dbf.export(table, 'junk.csv')
Answer 2:
I wrote simpledbf. The line that is causing you problems was from some testing I was doing when developing the module. First of all, you might want to update your installation, as 0.2.6 is the most recent. Then you can try removing that particular line (#557) from the file "D:\ProgramFiles\Anaconda\lib\site-packages\simpledbf\simpledbf.py". If that doesn't work, you can ping me at the GitHub repo for simpledbf, or you could try Ethan's suggestion for the dbf module.
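Before deleting the assertion, it may help to check where the header terminator actually sits in the file. The following is a minimal Python 3 sketch, assuming the commonly documented dBASE layout (record count at bytes 4-7, header length at bytes 8-9, record length at bytes 10-11, then 32-byte field descriptors starting at offset 32 and ending with a 0x0D byte). It is not how simpledbf itself parses the file, and `inspect_dbf_header` is a hypothetical helper name. The header below is synthetic; for a real file, pass the result of `open('test.dbf', 'rb').read()` instead.

```python
import struct

def inspect_dbf_header(data):
    """Report basic layout facts from the raw bytes of a DBF file (assumed dBASE layout)."""
    # Little-endian: uint32 record count, uint16 header length, uint16 record length.
    record_count, header_len, record_len = struct.unpack_from("<IHH", data, 4)

    # Walk the 32-byte field descriptors until the 0x0D terminator (or end of data).
    offset = 32
    field_names = []
    while offset < len(data) and data[offset:offset + 1] != b"\r":
        raw_name = data[offset:offset + 11]  # field name, NUL-padded to 11 bytes
        field_names.append(raw_name.split(b"\x00", 1)[0].decode("ascii", "replace"))
        offset += 32

    return {
        "record_count": record_count,
        "header_length": header_len,
        "record_length": record_len,
        "field_names": field_names,
        "terminator_offset": offset,
        "terminator_found": data[offset:offset + 1] == b"\r",
    }

# Synthetic header for illustration: one character field called NAME, width 10.
main_header = struct.pack("<B3sIHH20x", 0x03, b"\x73\x07\x07", 2, 65, 11)
field_desc = struct.pack("<11sc4xBB14x", b"NAME", b"C", 10, 0)
info = inspect_dbf_header(main_header + field_desc + b"\r")
print(info)
```

If `terminator_offset` is not `header_length - 1`, the file probably carries extra bytes after the field descriptors, which could explain why a strict terminator check fails.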
Answer 3:
You can decode and encode as necessary. dbfpy assumes strings are utf8 encoded; since your data is not in that encoding, you can decode each string and then encode it again with the right encoding.
import csv
from dbfpy import dbf
import os
import sys
filename = sys.argv[1]
if filename.endswith('.dbf'):
    print "Converting %s to csv" % filename
    csv_fn = filename[:-4]+ ".csv"
    with open(csv_fn,'wb') as csvfile:
        in_db = dbf.Dbf(filename)
        out_csv = csv.writer(csvfile)
        names = []
        for field in in_db.header.fields:
            names.append(field.name)
        #out_csv.writerow(names)
        for rec in in_db:
            # re-encode string fields; leave numbers, dates, etc. untouched
            row = [i.decode('utf8').encode('cp1250') if isinstance(i, str) else i for i in rec.fieldData]
            # write the re-encoded row, not the original rec.fieldData
            out_csv.writerow(row)
        in_db.close()
        print "Done..."
else:
    print "Filename does not end with .dbf"
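The same decode-then-write idea can be sketched in Python 3, where the distinction between bytes and text is explicit. This is only an illustration under stated assumptions: the sample rows stand in for `rec.fieldData` and are assumed to hold raw cp1250 bytes, `recode_value` and `write_rows` are hypothetical helper names, and the output is plain text that the file object then encodes (UTF-8 is suggested here instead of cp1250).

```python
import csv
import io

def recode_value(value, source_encoding="cp1250"):
    """Decode raw bytes into text; pass numbers, dates, etc. through unchanged."""
    if isinstance(value, bytes):
        return value.decode(source_encoding)
    return value

def write_rows(rows, out_file, source_encoding="cp1250"):
    """Write rows to an open text file as CSV, decoding byte fields first."""
    writer = csv.writer(out_file)
    for row in rows:
        writer.writerow([recode_value(v, source_encoding) for v in row])

# Stand-in for rec.fieldData: cp1250 bytes for "Příliš" and "Žluťoučký".
sample_rows = [[b"P\xf8\xedli\x9a", 42], [b"\x8elu\x9dou\xe8k\xfd", 3.5]]

buffer = io.StringIO(newline="")
write_rows(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the `StringIO` buffer with `open(csv_fn, "w", encoding="utf-8", newline="")`. Decoding with the wrong codec is what produces the corruption described in the question: the cp1250 byte for "č" (0xE8) reads as "è" under cp1252.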
Source: https://stackoverflow.com/questions/31270429/dbf-encoding-cp1250