I am having difficulty representing a dataframe as a network using networkx. The problem seems to be related to the size of the dataframe or, to explain it better, to the
The issue arises because some of the items in your data are duplicated. To fix it, call drop_duplicates in the relevant places:
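import networkx as nx

# Default every row to blue, then mark the two highlighted sources green.
df["color"] = "blue"
df.loc[df.Src.isin(["x.serm.cool", "cdc.fre.gh"]), "color"] = "green"
# Parse each "[a,b]" string into a list, then expand to one row per (Src, Dst) pair.
df["Dst"] = df.Dst.apply(lambda x: x[1:-1].split(","))
df = df.explode("Dst").drop_duplicates()
G = nx.from_pandas_edgelist(df, "Src", "Dst")
# One color per unique Src; this assumes every node in G also appears in Src,
# in the same order as G.nodes().
colors = df[["Src", "color"]].drop_duplicates()["color"]
nx.draw(G, node_color=colors)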
The output: [figure: drawing of the resulting network]
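The color list in the snippet above is built from the unique Src values, so it only lines up with the graph when every node also occurs as a Src and in the same order. A minimal sketch of a more defensive variant follows. It assumes the Dst strings look like "[a,b]", and it uses made-up sample rows (including the hypothetical node abc.de.fg) that are not from the question. It builds the color list directly from G.nodes(), so there is always exactly one color per node.

```python
import networkx as nx
import pandas as pd

# Hypothetical sample data; the "[a,b]" format of Dst is an assumption.
df = pd.DataFrame({
    "Src": ["x.serm.cool", "cdc.fre.gh", "x.serm.cool"],
    "Dst": ["[cdc.fre.gh,abc.de.fg]", "[x.serm.cool]", "[cdc.fre.gh,abc.de.fg]"],
})

# Turn each "[a,b]" string into a list, stripping stray whitespace around names.
df["Dst"] = df["Dst"].apply(lambda s: [d.strip() for d in s[1:-1].split(",")])

# One row per (Src, Dst) pair, with repeated pairs removed.
edges = df.explode("Dst").drop_duplicates(ignore_index=True)

G = nx.from_pandas_edgelist(edges, "Src", "Dst")

# One color per node, in the same order that G.nodes() yields them.
green_nodes = {"x.serm.cool", "cdc.fre.gh"}
node_colors = ["green" if n in green_nodes else "blue" for n in G.nodes()]
```

Passing node_color=node_colors to nx.draw(G, ...) should then color nodes that only ever appear as a Dst too, since the list is derived from the graph rather than from the dataframe.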