The deadline for this project is closing in very quickly and I don\'t have much time to deal with what it\'s left. So, instead of looking for the best (and probably more complic
List all users in the graph network given a distance X
A distance X
from what? from a starting node or a distance X
between themselves? Can you give an example? This may or may not be as simple as doing a BF search or running Dijkstra.
Assuming you start at a certain node and want to list all nodes that have distances X
to the starting node, just run BFS from the starting node. When you are about to insert a new node in the queue, check if the distance from the starting node to the node you want to insert the new node from + the weight of the edge from the node you want to insert the new node from to the new node is <= X
. If it's strictly lower, insert the new node and if it is equal just print the new node (and only insert it if you can also have 0 as an edge weight).
List all users in the graph network given a distance X and the type of relation
See above. Just factor in the type of relation into the BFS: if the type of the parent is different than that of the node you are trying to insert into the queue, don't insert it.
Calculate the shortest path between 2 users on the graph network given a type of relation
The algorithm depends on a number of factors:
Since you want easy, the easiest are Roy-Floyd and Dijkstra's.
Here are C implementations: Roy-Floyd and Dijkstra_1, Dijkstra_2. You can find a lot on google with "<algorithm name> c implementation"
.
Edit: Roy-Floyd is out of the question for 18 000 nodes, as is an adjacency matrix. It would take way too much time to build and way too much memory. Your best bet is to either use Dijkstra's algorithm for each query, but preferably implementing Dijkstra using a heap - in the links I provided, use a heap to find the minimum. If you run the classical Dijkstra on each query, that could also take a very long time.
Another option is to use the Bellman-Ford algorithm on each query, which will give you O(Nodes*Edges)
runtime per query. However, this is a big overestimate IF you don't implement it as Wikipedia tells you to. Instead, use a queue similar to the one used in BFS. Whenever a node updates its distance from the source, insert that node back into the queue. This will be very fast in practice, and will also work for negative weights. I suggest you use either this or the Dijkstra with heap, since classical Dijkstra might take a long time on 18 000 nodes.
Calculate the maximum distance between 2 users on the graph network
The simplest way is to use backtracking: try all possibilities and keep the longest path found. This is NP-complete, so polynomial solutions don't exist.
This is really bad if you have 18 000 nodes, I don't know any algorithm (simple or otherwise) that will work reasonably fast for so many nodes. Consider approximating it using greedy algorithms. Or maybe your graph has certain properties that you could take advantage of. For example, is it a DAG (Directed Acyclic Graph)?
Calculate the most distant connected users on the graph network
Meaning you want to find the diameter of the graph. The simplest way to do this is to find the distances between each two nodes (all pairs shortest paths - either run Roy-Floyd or Dijkstra between each two nodes and pick the two with the maximum distance).
Again, this is very hard to do fast with your number of nodes and edges. I'm afraid you're out of luck on these last two questions, unless your graph has special properties that can be exploited.
Do you think it would help if I "converted" the graph to an adjacency matrix to represent the links weight and relation type? Would it be easier to perform the algorithm on that instead of the linked lists? I could easily implement a function to do that conversion when needed. I'm saying this because I got the feeling it would be easier after reading a couple of pages about the subject, but I could be wrong.
No, adjacency matrix and Roy-Floyd are a very bad idea unless your application targets supercomputers.
The simplest algorithm to compute shortest path between two nodes is Floyd-Warshall. It's just triple-nested for loops; that's it.
It computes ALL-pairs shortest path in O(N^3)
, so it may do more work than necessary, and will take a while if N
is huge.
This assumes O(E log V)
is an acceptable running time, if you're doing something online, this might not be, and it would require some higher powered machinery.
Djikstra's algorithm is good for this, for one time use. You can save the result for future use, with a linear scan through all the vertices (or better yet, sort and binary search).
Might be nearly the same as above -- just use some function where the weight would be infinity if it is not of the correct relation.
Same as above, essentially, just determine early if you match the two users. (Alternatively, you can "meet in the middle", and terminate early if you find someone on both shortest path spanning tree)
Longest path is an NP-complete problem.
This is the diameter of the graph, which you can read about on Math World.
As for the adjacency list vs adjacency matrix question, it depends on how densely populated your graph is. Also, if you want to cache results, then the matrix might be the way to go.