Convert BibTeX file to database entries using Python


Given a BibTeX file, I need to add the respective fields (author, title, journal, etc.) to a table in a MySQL database (with a custom schema).

After doing some initial re


5 Answers

既然无缘

2021-02-02 15:45
My workaround is to use bibtexparser to export the relevant fields to a CSV file:
```
import bibtexparser
import pandas as pd

with open("../../bib/small.bib") as bibtex_file:
    bib_database = bibtexparser.load(bibtex_file)
    
df = pd.DataFrame(bib_database.entries)
selection = df[['doi', 'number']]
selection.to_csv('temp.csv', index=False)
```
Then write the CSV to a table in the database and delete temp.csv.

This avoids some complications I ran into with pybtex.
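
The "write the CSV to a table" step could look roughly like the sketch below. It is only an illustration under stated assumptions: the standard-library `sqlite3` module stands in for MySQL so the example runs anywhere, the table name `refs` is hypothetical, and the CSV content is made-up sample data with the same two columns as the selection above.

```python
import csv
import io
import sqlite3

# Stand-in for temp.csv: same columns as the DataFrame selection,
# but the two rows are made-up sample data.
csv_text = "doi,number\n10.1000/example.1,3\n10.1000/example.2,7\n"

# sqlite3 is used only so the sketch is self-contained. With MySQL you would
# open the connection through a DB-API driver instead, and the placeholder
# style is typically %s rather than ?.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE refs (doi TEXT PRIMARY KEY, number TEXT)")  # hypothetical table

reader = csv.DictReader(io.StringIO(csv_text))
rows = [(row["doi"], row["number"]) for row in reader]

# Parameterized statement: values are passed separately from the SQL text.
with conn:
    conn.executemany("INSERT INTO refs (doi, number) VALUES (?, ?)", rows)

loaded = conn.execute("SELECT doi, number FROM refs ORDER BY doi").fetchall()
print(loaded)
```

With a real file, `io.StringIO(csv_text)` would be replaced by `open("temp.csv", newline="")`.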
伪装坚强ぢ

2021-02-02 15:49

You could use the Perl package Bib2ML (a.k.a. Bib2HTML). It contains a bib2sql tool that generates an SQL database from a BibTeX database.

Alternative tools: bibsql and bibtosql.

Then you can feed it to your schema by writing some SQL conversion queries.
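
A conversion query of that kind is usually an `INSERT ... SELECT` from the imported table into your own. The sketch below only shows the pattern: every table and column name in it is hypothetical (it does not reproduce the schema any of these tools actually generate), and `sqlite3` is used as a stand-in for MySQL so the example is runnable as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table standing in for whatever the import tool generated.
    CREATE TABLE imported_entry (
        citekey TEXT, entrytype TEXT, title TEXT, journal TEXT, year TEXT
    );
    INSERT INTO imported_entry VALUES
        ('smith2020', 'article', 'A Sample Title', 'Journal of Examples', '2020'),
        ('doe2019', 'book', 'Another Title', NULL, '2019');

    -- Hypothetical custom target schema.
    CREATE TABLE publication (
        ref_key TEXT PRIMARY KEY, title TEXT NOT NULL, venue TEXT, pub_year INTEGER
    );
""")

# The conversion itself: rename columns, fill in missing values, cast types.
with conn:
    conn.execute("""
        INSERT INTO publication (ref_key, title, venue, pub_year)
        SELECT citekey, title, COALESCE(journal, ''), CAST(year AS INTEGER)
        FROM imported_entry
    """)

converted = conn.execute(
    "SELECT ref_key, title, venue, pub_year FROM publication ORDER BY ref_key"
).fetchall()
print(converted)
```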


醉话见心

2021-02-02 15:50

Old question, but I am doing the same thing at the moment using the Pybtex library, which has a built-in parser:

```
from pybtex.database.input import bibtex

# Open and parse a BibTeX file
parser = bibtex.Parser()
bibdata = parser.parse_file("myrefs.bib")

# Loop through the individual references
for bib_id, entry in bibdata.entries.items():
    fields = entry.fields
    try:
        # Change these lines to build a SQL INSERT
        print(fields["title"])
        print(fields["journal"])
        print(fields["year"])
        # Deal with multiple authors (name parts are lists of strings)
        for author in entry.persons["author"]:
            print(" ".join(author.first_names), " ".join(author.last_names))
    except KeyError:
        # A field may not exist for a given reference
        continue
```
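
To turn the print statements into actual inserts, one option is a parameterized `INSERT` per reference plus a separate table for authors, since a reference can have several. This is a minimal sketch under assumptions, not the asker's schema: the tables `bib_entry` and `bib_author` are hypothetical, the sample data is made up but shaped like what the loop above extracts, and `sqlite3` stands in for MySQL so it runs without a server.

```python
import sqlite3

# Made-up sample data, shaped like the fields and persons read in the loop above.
references = [
    {"key": "smith2020", "title": "A Sample Title", "journal": "Journal of Examples",
     "year": "2020", "authors": [("Jane", "Smith"), ("John", "Doe")]},
    {"key": "doe2019", "title": "Another Title", "journal": "Example Letters",
     "year": "2019", "authors": [("John", "Doe")]},
]

conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
conn.executescript("""
    CREATE TABLE bib_entry (bib_key TEXT PRIMARY KEY, title TEXT, journal TEXT, year INTEGER);
    CREATE TABLE bib_author (bib_key TEXT, author_order INTEGER, first_name TEXT, last_name TEXT);
""")

with conn:  # commit everything together, roll back on error
    for ref in references:
        conn.execute(
            "INSERT INTO bib_entry (bib_key, title, journal, year) VALUES (?, ?, ?, ?)",
            (ref["key"], ref["title"], ref["journal"], int(ref["year"])),
        )
        # One row per author, keeping the original author order.
        conn.executemany(
            "INSERT INTO bib_author (bib_key, author_order, first_name, last_name) "
            "VALUES (?, ?, ?, ?)",
            [(ref["key"], i, first, last)
             for i, (first, last) in enumerate(ref["authors"], start=1)],
        )

author_rows = conn.execute(
    "SELECT bib_key, author_order, last_name FROM bib_author "
    "ORDER BY bib_key, author_order"
).fetchall()
print(author_rows)
```

Passing values as parameters rather than formatting them into the SQL string matters here, because titles often contain quotes and other characters that would break a hand-built statement.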


春和景丽

2021-02-02 15:52

Converting to XML is a fine idea.

XML exists as an application-independent data format, so that you can parse it with readily-available libraries; using it as an intermediary has no particular drawbacks. In fact, you can usually import XML into a database without even going through a programming language such as Python (although the amount of Python you'd have to write for a task like this is trivial).
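
As a rough illustration of how little Python the XML route needs, the sketch below reads entries with the standard-library `xml.etree.ElementTree`. The XML layout (`bibliography`, `entry`, `title`, and so on) is a hypothetical one chosen for the example; a real BibTeX-to-XML converter will use its own element names, so the lookups would need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; real converters use their own element names.
xml_text = """
<bibliography>
  <entry id="smith2020" type="article">
    <title>A Sample Title</title>
    <journal>Journal of Examples</journal>
    <year>2020</year>
    <author>Jane Smith</author>
    <author>John Doe</author>
  </entry>
</bibliography>
"""

root = ET.fromstring(xml_text)

rows = []
for entry in root.findall("entry"):
    rows.append({
        "key": entry.get("id"),
        "title": entry.findtext("title"),
        "journal": entry.findtext("journal"),  # None if the element is missing
        "year": entry.findtext("year"),
        "authors": [a.text for a in entry.findall("author")],
    })

print(rows)
```

Each dictionary in `rows` can then be passed to a parameterized `INSERT` against the target table.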

As far as I know, there is no direct, mature BibTeX reader for Python.

梦谈多话

2021-02-02 15:56

You can also use Python BibtexParser: https://github.com/sciunto/python-bibtexparser

Documentation: https://bibtexparser.readthedocs.org

It's very straightforward (I use it in production).

For the record, I am not the developer of this library.
