spark-graphx

Cartesian product between vertices of a GraphX graph

故事扮演 submitted on 2019-12-11 05:36:38
Question: I would like to compute the cartesian product of the nodes of a graph, in order to build their distance matrix. Maybe this is not a very good approach, so any suggestion is welcome. This is my code, and it is not working; I get no warning and no exception, it simply does nothing. I think it may be because I am taking the cartesian product of an RDD with itself, but I don't know how to fix it, or how to write a nested loop or something else that computes this matrix.

    val indexes1 = graph
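The self-cartesian idea can be sketched on plain Scala collections (with an RDD you would call `vertices.cartesian(vertices)` instead; note that a self-cartesian of n vertices materializes n² pairs, which is why it often appears to hang on large graphs). The `dist` function below is a hypothetical placeholder for whatever metric the matrix should hold:

```scala
// Sketch: build a distance matrix as the cartesian product of a vertex set
// with itself. `dist` is a hypothetical stand-in for a real graph distance.
object CartesianSketch {
  def dist(a: Long, b: Long): Double = math.abs(a - b).toDouble // placeholder metric

  def distanceMatrix(vertexIds: Seq[Long]): Map[(Long, Long), Double] =
    (for {
      a <- vertexIds
      b <- vertexIds
    } yield ((a, b), dist(a, b))).toMap

  def main(args: Array[String]): Unit = {
    val m = distanceMatrix(Seq(1L, 2L, 3L))
    println(m.size)       // 9 entries: 3 x 3 pairs
    println(m((1L, 3L)))  // 2.0 under the placeholder metric
  }
}
```

On an RDD the same shape would be `graph.vertices.cartesian(graph.vertices).map { case ((a, _), (b, _)) => ((a, b), dist(a, b)) }`.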

I need to do join/joinVertices or add a field to a tuple in a graph with Spark GraphX

给你一囗甜甜゛ submitted on 2019-12-08 00:27:32
Question: I have an RDF graph (link) with (s,p,o) tuples, and I built a property graph from it. My RDF property graph is obtained by the following code (complete code):

    val propGraph = Graph(vertexArray,edgeArray).cache()
    propGraph.triplets.foreach(println(_))

with output of the form ((vId_src,src_att),(vId_dst,dst_att),property), and RDF data such as:

    ((0,<http://umkc.edu/xPropGraph#franklin>),(1,<http://umkc.edu/xPropGraph#rxin>),<http://umkc.edu/xPropGraph#advisor>)
    ((1,<http://umkc.edu/xPropGraph#rxin>),(2,<http://umkc.edu/xPropGraph#jgonzal>),<http://umkc.edu/xPropGraph#collab>)
    ((2147483648,<http://umkc.edu
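In GraphX the usual way to attach a new field to every vertex is `graph.joinVertices` (or `outerJoinVertices`, which also covers vertices with no match) against a second `RDD[(VertexId, B)]`. A Spark-free sketch of that join, with hypothetical attribute names, shows the shape of the operation:

```scala
// Local sketch of what Graph.outerJoinVertices does: combine each vertex's
// existing attribute with an optional new field from a second collection.
object JoinSketch {
  def joinVertices[A, B, C](verts: Map[Long, A], extra: Map[Long, B])
                           (merge: (Long, A, Option[B]) => C): Map[Long, C] =
    verts.map { case (id, attr) => id -> merge(id, attr, extra.get(id)) }

  def main(args: Array[String]): Unit = {
    val verts    = Map(0L -> "franklin", 1L -> "rxin")
    val advisors = Map(1L -> "franklin") // hypothetical extra field keyed by vertex id
    val joined = joinVertices(verts, advisors)((_, name, adv) => (name, adv.getOrElse("none")))
    println(joined(1L)) // (rxin,franklin)
    println(joined(0L)) // (franklin,none)
  }
}
```

With the question's property graph, `extra` would be a `VertexRDD` built from the predicate you want to fold into each vertex tuple.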

How to create a VertexId in Apache Spark GraphX using a Long data type?

风流意气都作罢 submitted on 2019-12-07 09:55:10
Question: I'm trying to create a Graph using some Google Web Graph data, which can be found here: https://snap.stanford.edu/data/web-Google.html

    import org.apache.spark._
    import org.apache.spark.graphx._
    import org.apache.spark.rdd.RDD

    val textFile = sc.textFile("hdfs://n018-data.hursley.ibm.com/user/romeo/web-Google.txt")
    val arrayForm = textFile.filter(_.charAt(0)!='#').map(_.split("\\s+")).cache()
    val nodes = arrayForm.flatMap(array => array).distinct().map(_.toLong)
    val edges = arrayForm.map(line => Edge(line(0).toLong,line(1).toLong))
    val graph = Graph(nodes,edges)

Unfortunately, I get this error:
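In GraphX, `VertexId` is simply a type alias for `Long`, so parsing each whitespace-separated line into a pair of Longs is all the conversion needed. A Spark-free sketch of the parsing step (with a tiny made-up sample of web-Google-style lines):

```scala
// Sketch: parse web-Google-style edge lines ("srcId<TAB>dstId") into Long
// pairs, skipping '#' comment lines, mirroring the RDD pipeline above.
object ParseSketch {
  def parseEdges(lines: Seq[String]): Seq[(Long, Long)] =
    lines.filterNot(_.startsWith("#"))
         .map(_.split("\\s+"))
         .map(a => (a(0).toLong, a(1).toLong))

  def main(args: Array[String]): Unit = {
    val sample = Seq("# FromNodeId ToNodeId", "0\t11342", "0\t824020")
    println(parseEdges(sample)) // List((0,11342), (0,824020))
  }
}
```

One likely cause of the error above: `Graph.apply` expects the vertex argument as an `RDD[(VertexId, VD)]` of id/attribute pairs, not a bare `RDD[Long]`, so something like `nodes.map(id => (id, id))` would probably be needed before calling `Graph(...)`.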

Problems running Spark GraphX algorithms on generated graphs

隐身守侯 submitted on 2019-12-06 07:24:16
Question: I have created a graph in Spark GraphX using the following code (see my question and solution):

    import scala.math.random
    import org.apache.spark._
    import org.apache.spark.graphx._
    import org.apache.spark.rdd.RDD
    import scala.util.Random
    import org.apache.spark.HashPartitioner

    object SparkER {
      val nPartitions: Integer = 4
      val n: Long = 100
      val p: Double = 0.1

      def genNodeIds(nPartitions: Int, n: Long)(i: Int) = {
        (0L until n).filter(_ % nPartitions == i).toIterator
      }

      def genEdgesForId(p:
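The truncated `genEdgesForId` presumably emits each candidate edge with probability `p`, i.e. an Erdős–Rényi G(n, p) generator. A plain-Scala sketch of that generator (the signature and the seeded `Random` are assumptions, chosen for reproducibility):

```scala
import scala.util.Random

// Sketch: Erdős–Rényi style edge generation — for a given source id,
// keep each candidate destination (other than the source itself) with
// probability p.
object ErSketch {
  def genEdgesForId(p: Double, n: Long, rng: Random)(src: Long): Seq[(Long, Long)] =
    (0L until n).filter(dst => dst != src && rng.nextDouble() < p)
                .map(dst => (src, dst))

  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val edges = (0L until 100L).flatMap(genEdgesForId(0.1, 100L, rng))
    println(edges.size) // expected around 0.1 * 100 * 99 ≈ 990 edges
  }
}
```

In the Spark version, each partition would run this per-id generator inside `mapPartitionsWithIndex` over the ids produced by `genNodeIds`, then feed the result to `Graph.fromEdgeTuples`.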

Does Spark GraphX have visualization like Gephi?

我只是一个虾纸丫 submitted on 2019-12-06 01:43:06
Question: Hi, I am new to the graph world. I have been assigned to work on graph processing; since I know Apache Spark, I thought of using its GraphX library to process large graphs. Then I came across Gephi, which provides a nice GUI to manipulate graphs. Does GraphX have such tools, or is it mainly a parallel graph-processing library? Can I import JSON graph data exported from Gephi into GraphX? Please guide me; I know it's a basic but valid question. Thanks in advance.

Answer 1: Adding to that, you can also try GraphLab: https://dato.com

Weekly Aggregation using Window Functions in Spark

独自空忆成欢 submitted on 2019-12-05 07:16:44
Question: I have data from 1 Jan 2017 to 7 Jan 2017 (one week) and I want a weekly aggregate. I used the window function in the following manner:

    val df_v_3 = df_v_2.groupBy(window(col("DateTime"), "7 day"))
      .agg(sum("Value") as "aggregate_sum")
      .select("window.start", "window.end", "aggregate_sum")

The data in the dataframe looks like:

    DateTime,value
    2017-01-01T00:00:00.000+05:30,1.2
    2017-01-01T00:15:00.000+05:30,1.30
    --
    2017-01-07T23:30:00.000+05:30,1.43
    2017-01-07T23:45:00.000+05:30,1.4

I am
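One common surprise with this query: Spark's tumbling windows are aligned to the Unix epoch (1970-01-01, a Thursday), so a plain "7 day" window generally does not start on 2017-01-01; the optional `startTime` argument of `window()` shifts the alignment. The bucketing arithmetic can be sketched locally (the 3-day offset is the one that makes epoch-aligned weeks start on a Sunday):

```scala
import java.time.{Duration, Instant}

// Sketch: epoch-aligned tumbling-window assignment, mimicking the semantics
// of Spark's window(col, "7 days"), plus an optional startTime-style offset.
object WindowSketch {
  val week: Long = Duration.ofDays(7).toMillis

  def windowStart(ts: Instant, offsetMillis: Long = 0L): Instant = {
    val t = ts.toEpochMilli - offsetMillis
    Instant.ofEpochMilli(Math.floorDiv(t, week) * week + offsetMillis)
  }

  def main(args: Array[String]): Unit = {
    val ts = Instant.parse("2017-01-01T00:00:00Z")
    println(windowStart(ts))                              // 2016-12-29T00:00:00Z (epoch-aligned, Thursday)
    println(windowStart(ts, Duration.ofDays(3).toMillis)) // 2017-01-01T00:00:00Z (Sunday-aligned)
  }
}
```

So in the DataFrame query, something like `window(col("DateTime"), "7 days", "7 days", "3 days")` would likely snap the window to the Sunday boundary the data starts on (the exact offset also depends on the session time zone, +05:30 here).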

Timeout Exception in Apache Spark during Program Execution

◇◆丶佛笑我妖孽 submitted on 2019-12-05 02:40:48
I am running a Bash script on macOS. The script calls a Spark method, written in Scala, a large number of times; I am currently trying to call it 100,000 times in a for loop. The code exits with the following exception after a small number of iterations, around 3,000:

    org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
      at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
      at org.apache.spark.rpc
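The message itself points at `spark.executor.heartbeatInterval` (default 10s). Raising it, together with `spark.network.timeout` (which must stay larger than the heartbeat interval), is the usual first mitigation; the exact values below are assumptions to be tuned, not recommendations from the original post:

```scala
import org.apache.spark.SparkConf

// Config sketch: raise the executor heartbeat and the network timeout.
// spark.network.timeout must remain (much) larger than the heartbeat interval.
val conf = new SparkConf()
  .setAppName("long-running-loop")
  .set("spark.executor.heartbeatInterval", "60s") // default: 10s
  .set("spark.network.timeout", "600s")           // default: 120s
```

The same settings can be passed as `--conf spark.executor.heartbeatInterval=60s --conf spark.network.timeout=600s` to spark-submit. Separately, launching 100,000 Spark invocations from a Bash loop pays the JVM/context startup cost each time; driving the loop from inside one long-lived SparkContext is usually both faster and less likely to hit heartbeat trouble.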

Spark GraphX Aggregation Summation

孤者浪人 submitted on 2019-12-04 22:28:16
Question: I'm trying to compute the sum of node values in a Spark GraphX graph. In short, the graph is a tree, and the top node (root) should sum up its children and their children. My graph is actually a tree of the following shape, and the expected summed value should be 1850:

    [ASCII tree diagram, garbled in extraction: VertexId 20 is the root; its child VertexId 11 holds the sum of vertices 14 and 24; VertexId 14 has value 1000 and VertexId 24 has value 550.]
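In GraphX this kind of rollup is typically done with `Pregel` or repeated `aggregateMessages` passes propagating sums from the leaves upward. The underlying computation is just a recursive subtree sum, sketched here on a plain adjacency map; the value for vertex 20 is a hypothetical choice that makes the total come out to the expected 1850:

```scala
// Sketch: recursive subtree sum over a parent -> children adjacency map.
// Vertex 20's own value (300) is assumed so that the root total is 1850.
object TreeSumSketch {
  def subtreeSum(children: Map[Long, Seq[Long]], values: Map[Long, Long])(root: Long): Long =
    values.getOrElse(root, 0L) +
      children.getOrElse(root, Seq.empty).map(subtreeSum(children, values)).sum

  def main(args: Array[String]): Unit = {
    val children = Map(20L -> Seq(11L), 11L -> Seq(14L, 24L))
    val values   = Map(20L -> 300L, 14L -> 1000L, 24L -> 550L) // 300 + 1000 + 550 = 1850
    println(subtreeSum(children, values)(20L)) // 1850
  }
}
```

In the distributed version, each aggregateMessages round would send every vertex's accumulated sum to its parent, iterating until the values stop changing (at most tree-depth rounds).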