Do you know a solution to the following interview question?
Design a data structure for a phone book that can safely and efficiently search a number
Such data structures are known as multi-index containers. They are not very common in most programming languages, because the interface can get quite complex. There are implementations in Java, C# and - most prominently - C++ with the Boost library, see Boost.MultiIndex Documentation, in particular example 4 about bidirectional maps:
A bidirectional map is a container of (const FromType,const ToType) pairs such that no two elements exists with the same first or second component (std::map only guarantees uniqueness of the first component). Fast lookup is provided for both keys. The program features a tiny Spanish-English dictionary with online query of words in both languages.
The basic idea of multi-index containers is that a lot of containers store their elements in nodes that contain pointers/references to other nodes (e.g. double linked lists). Instead of only storing the pointers/references for a single container, a node contains the links for several index structures. This works at least with linked lists, sorted trees and unique hash indices. And the implementation is very efficient, because only one memory allocation is required per element.
Well.. I agree with the multi-index and it is the right way to do it. However, for a job interview, they probably want you to think about it and explain something, not just say use boost. If they ask about the internals of boost, it might be embarrassing if you can't explain it properly.
So, here is a possible solution.
First of all, do not use a hashtable for this. Phones and names can be easily sorted and you should probably use some balanced search tree, or a trie if you want interactive search (http://en.wikipedia.org/wiki/Trie). IMHO, a hashtable is a big waste of space in this case.
Let us assume that names are unique and numbers are unique. Then, you can do:
1- Data structure to hold your data
struct Phone {
// implement the phone here whatever they need
// assume that whatever representation used can be converted into a unique id (number)
}; //
struct PhoneBookEntry {
std::string name;
Phone number;
};
2- Create two trees one for the name and one for the phone
BalancedSearchTree<PhoneBookEntry> tree_by_name;
BalancedSearchTree<PhoneBookEntry> tree_by_number;
3- That is it. Look up each tree for whatever you need
bool PhoneBook::getByName(const std::string &name, PhoneBookEntry &o) {
tree_by_name.get(name, o);
return !o.empty();
}
bool PhoneBook::getByNumber(const Phone &p, PhoneBookEntry &o) {
tree_by_number.get(p, o);
return !o.empty();
}
Good luck
Isn't this easy?
Using an array of record/struct/tuple of the value pair (telephone-number, name).
Do a linear search looking for the search key; O(n/2) for match, O(n) for miss,
return record/struct/tuple and do whatever needs to be done.
Edit:
This algorithm can be improved in many ways.
I think this interview question might be deliberately under specified in order to find out how the interviewee reacts. (That is what I do when I interview). So it may be more important to treat it as such, rather than assume it is just a Computer Science question.
I think it is worth engaging with the interviewer. For example:
etc.
I think it is more important to engage with the interviewer, rather than only focusing on the technical solution. When I interview, I am looking for someone who tries to actually understand the whole problem rather than the (usually small) part which is easy to define.
Hash everything (i.e. both directions) into the same table.
You can use a 36-ary tree (26 for alphabets and 10 for digits) and have links pointing from both an alphabet as well as a digit to the same node. (This won't be a "tree" strictly speaking, but you get the idea). This way you won't get a constant time lookup but you won't have to repeat data and the lookup would still be pretty quick.
Of course, you'd have to handle collisions, but you'd have to handle them while using hashtables too. Maybe you can use two pointers to point to the next collided node by name and by number.