Question
I'm attempting to read a Microsoft SQL Server MDF file into a Python pandas DataFrame. I'm extremely lost on the subject and would really appreciate any push in the right direction. Please let me know any information that would make answering this question easier.
Here are some of the resources I've found but can't seem to turn into a working solution: https://pandas.pydata.org/pandas-docs/stable/io.html#engine-connection-examples http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql.html and the related question "Python open Microsoft SQL Server MDF file".
Answer 1:
I had to (1) attach the .mdf file to my local instance of SQL Server via SQL Server Management Studio and then (2) open a port on my computer to allow read/write access to SQL Server. Once the .mdf is available in SQL Server, we can connect to it from pandas/SQLAlchemy with pd.read_sql() and a SQL Server connection string.
Go to the folder containing your .mdf file and share the file with everyone on the machine. On Windows 10, I did this by right-clicking the file, hovering over "Give access to", and selecting "Specific people...". I then added "Everyone" from the drop-down, giving them read/write permissions. This step is important, as I could not attach the file to SQL Server otherwise. Now open SQL Server Management Studio, connect to your instance of SQL Server, and right-click "Databases". There you have an "Attach..." option, where you can follow the menu to select your .mdf file. Your .mdf is now attached to SQL Server.
Now that your .mdf is in SQL Server, we still need to be able to access it. This can be resolved by opening a port on your machine to allow access to SQL Server. I followed this guide to do that: https://docs.microsoft.com/en-us/sql/relational-databases/lesson-2-connecting-from-another-computer?view=sql-server-2017 (once you finish this how-to, your SQL Server should be reachable). There may be easier ways to access SQL Server on a local machine, but the nice thing about this method is that it should also work if you want to read from SQL Server on a remote machine. All you'll have to do is change the IP address in the connection string (below).
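Before moving on to pandas, it can help to confirm that the port is actually reachable. The following is a minimal sketch using only the standard library; the helper name port_is_open is made up for illustration, and it only checks that something accepts TCP connections on that port, not that it is SQL Server.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only proves that something is listening on the
    port, not that the listener is SQL Server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling port_is_open('127.0.0.1', 1433) with the address and port used in this answer should return True once the port is open; for a remote machine, pass that machine's IP address instead.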
Now that we have an accessible instance of SQL Server with our .mdf file attached as a database, we can read it from pandas like any other database. I used the following setup to read my .mdf file.
import pandas as pd
from sqlalchemy import create_engine
# MS SQL Server config
server = '127.0.0.1'  # mssql is set up on localhost
port = '1433'  # the port I opened to access mssql
database = 'database_name'
username = 'username'  # only needed for the password-protected variant
password = 'password'  # only needed for the password-protected variant
driver = 'SQL+SERVER'  # URL-encoded name of the "SQL Server" ODBC driver
schema = 'dbo'
# create a SQLAlchemy engine with the above settings (no credentials in the URL)
connection_str = f'mssql+pyodbc://{server}:{port}/{database}?driver={driver}'
engine = create_engine(connection_str)
# read our mdf file!
query = 'select * from table_name;'
df = pd.read_sql(query, engine)
My SQL Server isn't password protected since I'm only running it locally, but if yours is, adjust the connection string as follows:
connection_str = f'mssql+pyodbc://{username}:{password}@{server}:{port}/{database}?driver={driver}'
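One thing to watch for with the credentialed form: a password containing characters such as @ or / will break the URL unless it is percent-encoded. Here is a minimal sketch that builds the same style of connection string with urllib.parse.quote_plus; the helper name build_mssql_url is hypothetical, and the driver is passed as the plain name 'SQL Server' so the function can do the encoding itself.

```python
from urllib.parse import quote_plus


def build_mssql_url(server, port, database, driver, username=None, password=None):
    """Build an mssql+pyodbc URL, percent-encoding the parts that may
    contain special characters (hypothetical helper for illustration)."""
    auth = ""
    if username is not None:
        auth = quote_plus(username)
        if password is not None:
            auth += ":" + quote_plus(password)
        auth += "@"
    return (
        f"mssql+pyodbc://{auth}{server}:{port}/{database}"
        f"?driver={quote_plus(driver)}"
    )
```

For example, build_mssql_url('127.0.0.1', '1433', 'database_name', 'SQL Server', 'username', 'p@ss/word') encodes the password as p%40ss%2Fword, and the resulting string can be passed to create_engine() as before.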
With that, you should now have table_name from your .mdf file read into a pandas DataFrame.
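The config above defines schema = 'dbo' but the query never uses it. If your table lives in a specific schema, or its name contains spaces or other unusual characters, you may want a schema-qualified, bracket-quoted name in the query. A rough sketch, with quote_ident and select_all as made-up helper names:

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def select_all(schema: str, table: str) -> str:
    """Return a SELECT * statement for a schema-qualified table."""
    return f"SELECT * FROM {quote_ident(schema)}.{quote_ident(table)};"
```

With the placeholder names from this answer, select_all('dbo', 'table_name') gives 'SELECT * FROM [dbo].[table_name];', which can be handed to pd.read_sql() in place of the plain query string.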
Source: https://stackoverflow.com/questions/51734803/read-mdf-file-int-pandas-dataframe