Problems opening DBF files in Python

Posted by ♀尐吖头ヾ on 2020-05-08 15:47:05

Question


I am trying to open several DBF files and convert them to dataframes. Most of them worked fine, but for one of the files I receive the error: "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 15: invalid start byte"

I have seen this error discussed in other threads about opening CSV, XLSX, and other files. The proposed solution there was to pass encoding = 'utf-8' when reading the file. Unfortunately I haven't found a solution for DBF files, and I have very limited knowledge of the format.

What I have tried so far:

1)

import pandas as pd
from dbfread import DBF
dbf = DBF('file.DBF')
dbf = pd.DataFrame(dbf)

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 8: character maps to <undefined>

2)

from simpledbf import Dbf5
dbf = Dbf5('file.DBF')
dbf = dbf.to_dataframe()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 15: invalid start byte

3)

# this block of code copied from https://gist.github.com/ryan-hill/f90b1c68f60d12baea81 
import pysal as ps

def dbf2DF(dbfile, upper=True): #Reads in DBF files and returns Pandas DF
    db = ps.table(dbfile) #Pysal to open DBF
    d = {col: db.by_col(col) for col in db.header} #Convert dbf to dictionary
    #pandasDF = pd.DataFrame(db[:]) #Convert to Pandas DF
    pandasDF = pd.DataFrame(d) #Convert to Pandas DF
    if upper == True: #Make columns uppercase if wanted 
        pandasDF.columns = map(str.upper, db.header) 
    db.close() 
    return pandasDF

dfb = dbf2DF('file.DBF')

AttributeError: module 'pysal' has no attribute 'open'

Finally, when I try to install the dbfpy module, I get: SyntaxError: invalid syntax

Any suggestions on how to solve this?


Answer 1:


Try using my dbf library:

import dbf

table = dbf.Table('file.DBF')

Print it to see if an encoding is present in the file:

print(table)    # print table in Python 2

One of my test tables looks like this:

    Table:         tempy.dbf
    Type:          dBase III Plus
    Codepage:      ascii (plain ol ascii)
    Status:        DbfStatus.CLOSED
    Last updated:  2019-07-26
    Record count:  1
    Field count:   2
    Record length: 31 
    --Fields--
      0) name C(20)
      1) desc M

The important line is the Codepage line; it sounds like that is not properly set for your DBF file. If you know what it should be, you can either open it with that codepage (temporarily) with:

table = dbf.Table('file.DBF', codepage='...')

Or you can change it permanently (updates the DBF file) with:

table.open()
table.codepage = dbf.CodePage('cp1252') # for example
table.close()
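
As a rough illustration of where the Codepage value comes from: in the common dBase/FoxPro layout the DBF header stores a one-byte language driver ID at offset 29. The sketch below reads that byte with the standard library only. The helper names and the small ID-to-codec map are assumptions made for illustration (they are not part of the dbf library), and the map is deliberately partial.

```python
from typing import Optional

# Partial, illustrative map of DBF language driver IDs to Python codec names.
# Real files may use IDs that are not listed here.
LANGUAGE_DRIVER_CODECS = {
    0x01: "cp437",   # US MS-DOS
    0x02: "cp850",   # International MS-DOS
    0x03: "cp1252",  # Windows ANSI
    0x64: "cp852",   # Eastern European MS-DOS
    0x65: "cp866",   # Russian MS-DOS
    0xC8: "cp1250",  # Eastern European Windows
    0xC9: "cp1251",  # Russian Windows
}

LANGUAGE_DRIVER_OFFSET = 29  # position of the ID byte inside the 32-byte header


def language_driver_id(header: bytes) -> int:
    """Return the language driver byte from raw DBF header bytes."""
    if len(header) <= LANGUAGE_DRIVER_OFFSET:
        raise ValueError("DBF header is too short")
    return header[LANGUAGE_DRIVER_OFFSET]


def guess_codec(header: bytes) -> Optional[str]:
    """Return a codec name for the header, or None if the ID is unset or unknown."""
    return LANGUAGE_DRIVER_CODECS.get(language_driver_id(header))


def read_header(path: str) -> bytes:
    """Read the first 32 bytes of a DBF file (hypothetical helper)."""
    with open(path, "rb") as fh:
        return fh.read(32)
```

Calling guess_codec(read_header('file.DBF')) and getting None would suggest the file either carries no code page mark or uses an ID outside this partial map, in which case the codepage has to be supplied by hand as shown above.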



Answer 2:


from simpledbf import Dbf5
dbf2 = Dbf5('/Users/joselin.ceron/Documents/Joselin Ceron/OD/bd_eod_2017_dbf/TCAT_MUNICIPIOS.dbf', codec='latin')
df2 = dbf2.to_dataframe()
df2.head(3)
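
If it is unclear which codec to pass, one option is to decode a few suspicious bytes under several candidates and see which result looks like real text. The following is a minimal sketch using only the standard library; the function name, the sample bytes, and the candidate list are assumptions chosen for illustration, not values taken from the file in the question.

```python
from typing import Dict, Iterable, Optional


def preview_decodings(raw: bytes, codecs: Iterable[str]) -> Dict[str, Optional[str]]:
    """Decode raw under each codec; map the codec to None if decoding fails."""
    results: Dict[str, Optional[str]] = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical sample containing byte 0x81, one of the bytes from the tracebacks.
sample = b"K\x81che"
candidates = ["utf-8", "cp1252", "cp850", "latin-1"]
for codec_name, text in preview_decodings(sample, candidates).items():
    # repr() keeps control characters visible instead of printing them raw.
    print(codec_name, "->", "cannot decode" if text is None else repr(text))
```

latin-1 never raises because it maps every byte to some character, so a result without an error is not proof that the codec is the right one; the decoded text still needs a visual check.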



Source: https://stackoverflow.com/questions/57215656/problems-opening-dbf-files-in-python
