Question
This is CLRS exercise 22.1-8 (I am self-studying, not enrolled at any university):
Suppose that instead of a linked list, each array entry Adj[u] is a hash table containing the vertices v for which (u,v) ∈ E. If all edge lookups are equally likely, what is the expected time to determine whether an edge is in the graph? What disadvantages does this scheme have? Suggest an alternate data structure for each edge list that solves these problems. Does your alternative have disadvantages compared to the hash table?
So, if I replace each linked list with a hash table, the questions are:
- What is the expected time to determine whether an edge is in the graph?
- What are the disadvantages?
- Suggest an alternate data structure for each edge list that solves these problems.
- Does your alternative have disadvantages compared to the hash table?
I have the following partial answers:
- I think the expected time is O(1), because I just do Hashtable t = Adj[u]; and then return t.get(v);
- I think the disadvantage is that a Hashtable takes more space than a linked list.
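To make the lookup in the first point concrete, here is a minimal sketch in Python rather than Java. The class name Graph and its methods add_edge and has_edge are hypothetical names chosen for illustration, and Python's built-in set stands in for the Hashtable.

```python
from collections import defaultdict


class Graph:
    """Directed graph whose adjacency "lists" are hash sets (illustrative only)."""

    def __init__(self):
        # adj[u] is the set of vertices v such that (u, v) is an edge.
        self.adj = defaultdict(set)

    def add_edge(self, u, v):
        self.adj[u].add(v)

    def has_edge(self, u, v):
        # .get avoids creating an empty set when u has no outgoing edges.
        return v in self.adj.get(u, ())


g = Graph()
g.add_edge(1, 2)
g.add_edge(1, 3)
print(g.has_edge(1, 2))  # True
print(g.has_edge(2, 1))  # False
```

The membership test is one hash lookup, which is where the expected O(1) comes from.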
For the other two questions I have no idea.
Can anyone give me a hint?
Answer 1:
It depends on the hash table and how it handles collisions. For example, assume that each slot of our hash table points to a list of the elements that hash to that slot (chaining).
If the elements are distributed sufficiently uniformly, the average cost of a lookup depends only on the average number of elements per list (the load factor). With n elements stored in a table of m slots, that average is n/m.
- The expected time to determine whether an edge is in the graph is O(1 + n/m), which is O(1) as long as the load factor stays bounded by a constant.
- More space than a linked list and slower queries than an adjacency matrix. If the hash table resizes dynamically, we pay extra time to move the elements from the old table to the new one. If it does not, each table needs O(n) slots up front to keep O(1) expected query time (a vertex can have up to n - 1 neighbors in an n-vertex graph), which adds up to O(n^2) space. Also, O(1) is only the expected query time; in the worst case a query costs as much as with a linked list, O(degree(u)). So it seems better to use an adjacency matrix, which gives deterministic O(1) query time for O(n^2) space.
- See above: an adjacency matrix.
- Yes. For example, if we know that every vertex of the graph has at most d adjacent vertices and d is less than n, then hash tables need O(nd) space instead of O(n^2) and still give expected O(1) query time.
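The load-factor argument can be checked with a toy chained table. This is only a sketch: ChainedSet and its methods are hypothetical names, the table has a fixed number of slots m with no resizing, and the keys are small non-negative integers (which hash to themselves in Python), so the chain lengths are easy to predict.

```python
class ChainedSet:
    """Toy hash set with separate chaining and a fixed number of slots m."""

    def __init__(self, m):
        self.slots = [[] for _ in range(m)]
        self.size = 0

    def _chain(self, key):
        # Pick the chain (list) for the slot this key hashes to.
        return self.slots[hash(key) % len(self.slots)]

    def add(self, key):
        chain = self._chain(key)
        if key not in chain:
            chain.append(key)
            self.size += 1

    def __contains__(self, key):
        # The cost of a lookup is proportional to the length of one chain.
        return key in self._chain(key)

    def load_factor(self):
        return self.size / len(self.slots)


neighbors = ChainedSet(m=8)
for v in range(20):  # a vertex with 20 neighbors, numbered 0..19
    neighbors.add(v)

print(neighbors.load_factor())               # 2.5
print(max(len(c) for c in neighbors.slots))  # 3
```

With n = 20 and m = 8 the average chain holds n/m = 2.5 entries, so a lookup inspects a handful of elements instead of all 20.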
Answer 2:
The answer to question 3 could be a binary search tree.
In an adjacency matrix, each vertex owns an array of V entries. This O(V) space per vertex buys fast, O(1)-time, edge lookups.
In an adjacency list, each vertex owns a list that contains only its n adjacent vertices. This is space-efficient but makes searching slow, O(n).
A hash table is a compromise between the array and the list. It uses less space than V entries, but lookups have to handle collisions.
A binary search tree is another compromise: the space cost is as low as that of a list, and the average search time is O(lg n).
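Python has no built-in binary search tree, so the sketch below swaps in a sorted list searched with the standard bisect module. It is not a BST: lookups are O(lg n) and the space is O(n) as described above, but insertion costs O(n) because elements are shifted, where a balanced tree would take O(lg n). SortedNeighbors is a hypothetical name.

```python
from bisect import bisect_left


class SortedNeighbors:
    """Neighbors of one vertex kept in sorted order (stand-in for a BST)."""

    def __init__(self):
        self.items = []

    def add(self, v):
        i = bisect_left(self.items, v)
        if i == len(self.items) or self.items[i] != v:
            # list.insert shifts elements: O(n), unlike a balanced tree.
            self.items.insert(i, v)

    def __contains__(self, v):
        # Binary search: O(lg n) comparisons, no hashing and no collisions.
        i = bisect_left(self.items, v)
        return i < len(self.items) and self.items[i] == v

    def __iter__(self):
        # Scanning all neighbors is a plain walk over the list, in sorted order.
        return iter(self.items)


adj_u = SortedNeighbors()
for v in [5, 1, 3, 1]:
    adj_u.add(v)

print(list(adj_u))  # [1, 3, 5]
print(3 in adj_u)   # True
print(4 in adj_u)   # False
```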
Answer 3:
Questions 3 and 4 are very open. Besides the points in the other two answers, one problem with a hash table is that it is not efficient for scanning all elements from beginning to end. In practice it is pretty common to enumerate all the neighbors of a given vertex (e.g., in BFS or DFS), and that somewhat undermines the use of a plain hash table.
One possible solution is to chain the existing entries of the hash table together so that they form a doubly-linked list. Every time a new element is added, connect it to the end of the list; whenever an element is removed, unlink it from the list and fix the neighboring links accordingly. When you want to do a full scan, just walk this list.
The drawback of this strategy, of course, is more space: there is a two-pointer overhead per element. Also, adding or removing an element takes more time because the links have to be built or fixed.
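A minimal sketch of that idea, with hypothetical names (LinkedHashSet here is my own toy class, and a Python dict plays the role of the hash table):

```python
class _Node:
    __slots__ = ("key", "prev", "next")

    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class LinkedHashSet:
    """Hash set whose entries are also threaded on a doubly-linked list."""

    def __init__(self):
        self.nodes = {}    # key -> node, gives expected O(1) membership
        self.head = None   # oldest element still present
        self.tail = None   # most recently added element

    def add(self, key):
        if key in self.nodes:
            return
        node = _Node(key)
        self.nodes[key] = node
        if self.tail is None:
            self.head = self.tail = node
        else:
            # Append to the end of the list.
            node.prev = self.tail
            self.tail.next = node
            self.tail = node

    def remove(self, key):
        node = self.nodes.pop(key)  # raises KeyError if the key is absent
        # Unlink the node and repair its neighbors' pointers.
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev

    def __contains__(self, key):
        return key in self.nodes

    def __iter__(self):
        # A full scan touches only stored elements, never empty buckets.
        node = self.head
        while node is not None:
            yield node.key
            node = node.next


s = LinkedHashSet()
for v in (4, 2, 7):
    s.add(v)
s.remove(2)
print(list(s))  # [4, 7]
print(7 in s)   # True
```

The two extra references per node are the space overhead mentioned above. Java's LinkedHashSet is documented as the same combination of a hash table and a linked list.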
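I'm not too worried about collisions. The hash table of a vertex stores its neighbors, each of which is unique, so the same key is never inserted twice. Two different neighbors can still hash to the same bucket, but with a reasonable hash function those chains stay short.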
Source: https://stackoverflow.com/questions/9667571/graph-what-are-the-disadvantages-if-i-replace-each-linked-list-in-adjacency-li