Read MDF file into Pandas DataFrame

梦想的初衷 submitted on 2019-12-11 15:19:32

Question


I'm attempting to read a Microsoft SQL Server MDF file into a Python pandas DataFrame. I'm extremely lost on the subject and would really appreciate any push in the right direction. Please let me know any information that would make answering this question easier.

Here are some of the resources I've found but can't seem to turn into a finished product:
https://pandas.pydata.org/pandas-docs/stable/io.html#engine-connection-examples
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql.html
Python open Microsoft SQL Server MDF file


Answer 1:


I had to (1) attach the .mdf file to my local instance of SQL Server via SQL Server Management Studio, and then (2) open a port on my computer to allow access to SQL Server. Once the .mdf is available in SQL Server, we can connect to it from pandas/SQLAlchemy with pd.read_sql() and a SQL Server connection string.

  1. Go to the folder containing your .mdf file and share the file with everyone on the machine. On Windows 10, I did this by right-clicking the file, hovering over Give access to, and selecting Specific people.... I then added Everyone from the drop-down and gave it read/write permissions. This step is important, as I could not attach the file to SQL Server otherwise. Next, open SQL Server Management Studio, connect to your instance of SQL Server, and right-click Databases. Choose Attach... and follow the dialog to select your .mdf file. Your .mdf is now attached to SQL Server.

  2. Now that your .mdf is in SQL Server, we still need to be able to access it. This can be resolved by opening a port on your machine to allow access to SQL Server. I followed this guide to do that: https://docs.microsoft.com/en-us/sql/relational-databases/lesson-2-connecting-from-another-computer?view=sql-server-2017. Once you finish this how-to, your SQL Server instance should be reachable. There may be easier ways to access SQL Server on a local machine, but the nice thing about this method is that it should also work if you want to read from SQL Server on a remote machine. All you'll have to do is change the IP address in the connection string (below).
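
Before moving on, it can help to confirm that the port from step 2 actually accepts connections. The following is only a minimal sketch using Python's standard library; the helper name port_is_open is hypothetical and not part of the original answer, and it only checks that something is listening on the port, not that it is SQL Server.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration: it verifies TCP reachability only,
    not that the listener is SQL Server or that a login would succeed.
    """
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the values used in this answer, the call would be port_is_open('127.0.0.1', 1433). A False result suggests revisiting the firewall and TCP/IP settings from the guide linked in step 2.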

Now that we have an accessible instance of SQL Server with our .mdf file attached as a database, we can read it from pandas like any other database. I used the following setup to read my .mdf file.

import pandas as pd
from sqlalchemy import create_engine

# MS SQL Server Config
server = '127.0.0.1'  # mssql is set up on localhost
port = '1433'  # the port I opened to access mssql
database = 'database_name'
username = 'username'  # only needed if SQL Server requires a login
password = 'password'  # only needed if SQL Server requires a login
driver = 'SQL+SERVER'  # ODBC driver name, with the space URL-encoded as +
schema = 'dbo'

# create a sqlAlchemy engine with the above settings (no credentials)
connection_str = f'mssql+pyodbc://{server}:{port}/{database}?driver={driver}'
engine = create_engine(connection_str)

# read our mdf file!
query = 'select * from table_name;'
df = pd.read_sql(query, engine)

My SQL Server instance isn't password protected since I am only running locally. If yours is, adjust the connection string as follows:

connection_str = f'mssql+pyodbc://{username}:{password}@{server}:{port}/{database}?driver={driver}'
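
One caveat with the string above: if the user name or password contains characters such as @ or /, they clash with the URL syntax and need to be percent-encoded first. The following is a minimal sketch of one way to do that with the standard library; the function build_connection_str is a hypothetical helper, not part of the original answer, and it assumes the driver value is already URL-encoded as in the config above.

```python
from urllib.parse import quote_plus


def build_connection_str(username, password, server, port, database, driver):
    """Build an mssql+pyodbc URL with percent-encoded credentials.

    Hypothetical helper for illustration. The driver argument is expected
    to be URL-encoded already (for example 'SQL+SERVER').
    """
    # quote_plus escapes characters such as '@' and '/' that would
    # otherwise be read as URL delimiters.
    user = quote_plus(username)
    pwd = quote_plus(password)
    return (
        f"mssql+pyodbc://{user}:{pwd}@{server}:{port}/{database}"
        f"?driver={driver}"
    )
```

The returned string can be passed to create_engine() in the same way as connection_str.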

With that, you should now have table_name from your .mdf file read into a pandas dataframe.
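
The config defines schema = 'dbo', but the query never uses it. If your table lives in a different schema, or its name contains unusual characters, a schema-qualified query may be needed. Below is a minimal sketch, assuming SQL Server's bracket quoting for identifiers; quote_identifier and select_all_query are hypothetical helper names introduced here for illustration.

```python
def quote_identifier(name: str) -> str:
    """Wrap a SQL Server identifier in brackets, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def select_all_query(schema: str, table: str) -> str:
    """Build a SELECT * statement for a schema-qualified table."""
    return f"SELECT * FROM {quote_identifier(schema)}.{quote_identifier(table)};"
```

For example, select_all_query(schema, 'table_name') produces a query string that can be passed to pd.read_sql() in place of query.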



Source: https://stackoverflow.com/questions/51734803/read-mdf-file-int-pandas-dataframe
