How to give multiple values to a single key using a dictionary?

问题

I have a html form which has Firstname, LastName, Age and Gender and a ADD button. I enter the data into the form and that gets into the Berkeelys db. What my code does is it prints only the last values. I want that it should show all the values related to particular key

#!/usr/bin/python

import bsddb
import cgi

form = cgi.FieldStorage()

print "Content-type:text/html\n"
Fname = form.getvalue('firstname', '')
Lname = form.getvalue('lastname', '')
Age = form.getvalue('age', 0)
Gender = form.getvalue('gender', '')

#print Fname, Lname, Age 

db = bsddb.hashopen("/home/neeraj/public_html/database/mydb.db","w")
db['FirstName'] = Fname  
db['LastName'] = Lname
db['Age'] = Age 
db['Gender'] = Gender
db.close()
db = bsddb.hashopen("/home/neeraj/public_html/database/mydb.db","r")
#db = bsddb.hashopen("/home/neeraj/public_html/database/mydb.db")
print db['FirstName'], db['LastName'], db['Age'], db['Gender']
db.close()
print db

回答1:

You should use an SQL database instead of the dict-based interface, since SQL databases already handle multiple tuples in a table.

Anyway, if you want to have a dict interface you can use the shelve module (bsddb is deprecated, so you should avoid it) and save each value in a list:

import shelve

COLUMNS = ('FirstName', 'LastName', 'Age', 'Sex')

the_db = shelve.open('test.db', writeback=True)
for col_name in COLUMNS:
    if col_name not in the_db:
        the_db[col_name] = []

records = [
    ('John', 'Deer', 20, 'M'),
    ('Ada', 'Lovelace', 23, 'F'),
]

for record in records:
    for col_name, value in zip(COLUMNS, record):
        the_db[col_name].append(value)

the_db.close()

the_db = shelve.open('test.db')

for record in zip(*(the_db[col_name] for col_name in COLUMNS)):
    print(record)

the_db.close()

The above code outputs:

('John', 'Deer', 20, 'M')       
('Ada', 'Lovelace', 23, 'F')

If you want to use an SQL database you could use the sqlite3 module. For example:

import sqlite3

conn = sqlite3.connect('test.sqlite')

cursor = conn.cursor()

cursor.execute('''
CREATE TABLE people (
    FirstName text,
    LastName text,
    Age int,
    Sex text
    )''')

cursor.execute('''
INSERT INTO people values ('John', 'Deer', 20, 'M')''')

cursor.execute('''
INSERT INTO people values ('Ada', 'Lovelace', 23, 'F')''')

conn.commit()

for record in cursor.execute('''SELECT * FROM people'''):
    print(record)

The above code outputs:

(u'John', u'Deer', 20, u'M')
(u'Ada', u'Lovelace', 23, u'F')

(Note the u'...' simply means that the strings are unicode, it doesn't change their value)

However this code has some problems (e.g. try to run it twice...), but if you want to follow this path then you must learn SQL first, so go ahead and stufy it (there are a lot of online tutorials. For example w3schools ones).

回答2:

The correct method to associate multiple values to a single key is to pack a list or dictionary value using for instance "json.dumps".

Here is an exemple:

#!/usr/bin/python
from json import dumps
from json import loads
import bsddb

# write
users = bsddb.hashopen("users.db","w")
primarykey = 'amz'
users[primarykey] = dumps(dict(username="amz", age=30, bio="craftsman"))
users.close()

# read

users = bsddb.hashopen("users.db","r")

for key in users.keys():
    print loads(users[key])

users.close()

This is the basic pattern to be used with bsddb and is applicable to other key/value dbs like leveldb.

Extra:

Given the fact that bsddb hashmap keys are ordered lexicographically (ie. like python 2 strings) you can build hashmaps with keys having a predicatable order saving you the trouble of going through all the table.

To put that feature to good use you have to build useful keys. Again you need a packing function that translates python sort order to lexigraphic order (ie. 11 > 2 but "11" < "2") . Here is an exemple such packing function:

def pack(*values):
    def __pack(value):
        if type(value) is int:
            return '1' + struct.pack('>q', value)
        elif type(value) is str:
            return '2' + struct.pack('>q', len(value)) + value
        else:
            data = dumps(value, encoding='utf-8')
            return '3' + struct.pack('>q', len(data)) + data
    return ''.join(map(__pack, values))

This is a kind of naive, you could go the extra mile and support float and better pack int to save space.

For instance, given the simplified schema User(username, age) you can build another hashmap we call age_index with which you can easily retrieve every user at age of 30. The hashmap can look like the following:

   key    | value
 -----------------
  29  tom |   X
  30  amz |   X
  30  joe |   X
  30  moh |   X

This is a human readable view of the hasmap: the key is actually packed with the above pack function. As you can see the key is the composition of the age and the primarykey of the the item stored previously. In this case the value is not used because we don't need it. Mind the fact that each key is and must be unique.

Once that schema in place, you do "select queries", called range queries in bsddb using Cursor.set_range(key). This will set the cursor at the nearest key and return the key/value pair associated (the semantic can be a bit different depending on the database).

For instance to retrieve the first person that has age=30, one use the following code:

def get_first_person_with_age_thirty()
    key, _ = age_index.set_range(pack(30))  # we don't need value
    age, pk = unpack(key)
    # set_range, will set the key to "30" and a pk
    # or with a key prefix superior to 30.
    #  So we need to check that age is really 30.
    if age == 30:
        return loads(users[pk])

This will return the document associated with user amz.

To go further one needs to make use of the other bsddb interface which entry point is bsddb.db.DB and bsddb.db.DBEnv (documentation]. With this interface you can have several hashmap bound to the same transaction context even if it's not required to use transactions.

回答3:

You can save multiple values against single key in berkeley db by setting the duplicate flag

filename = '/path/to/berkeley/db/file'
fruitDB = db.DB()
fruitDB.set_flags(db.DB_DUP)
fruitDB.open(filename, None, db.DB_BTREE, db.DB_CREATE)
fruitDB.put(str(1), "Waqar")
fruitDB.put(str(1), "Umer")
fruitDB.put(str(2), "x")
fruitDB.put(str(2), "y")
fruitDB.put(str(4), "z")
fruitDB.put(str(5), "e")

But you cant retrieve all of them using 'Get' method of BDB you have to use cursor to retrieve items see documentation for this or you can retrieve all the items belongs to single key using

cursor = fruitDB.cursor()
cursor.set(str(1))
record = cursor.current()
listTup = []
while record:
    print record
    listTup.append(record)
    record = cursor.next_dup()

output will be

('1', 'Waqar')
('1', 'Umer')

This will return list of tuple having all the values belong to key '1' Hope this helps.

回答4:

I suppose opening the database everytime with w option, you are overwriting everytime & only storing the last entry. You should instead use a.

db = bsddb.hashopen("/home/neeraj/public_html/database/mydb.db","a")

Extract from the official Python manual -->

    open() returns a file object, and is most commonly used with two arguments:
    open(filename, mode).

f = open('workfile', 'w') The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be 'r' when the file will only be read, 'w' for only writing (an existing file with the same name will be erased), and 'a' opens the file for appending; any data written to the file is automatically added to the end.

回答5:

The correct answer is: do not use the BDB for such a case. For simple applications you can use the builtin sqlite module that was first introduced in python 2.5.

来源：https://stackoverflow.com/questions/19873620/how-to-give-multiple-values-to-a-single-key-using-a-dictionary

标签

python

dictionary

cgi

berkeley-db

bsddb