Question
I need to generate XML with the following structure:
<parent>
    <name>parent</name>
    <childs>
        <child>
            <name>child1</name>
        </child>
        <child>
            <name>child1</name>
            <grandchilds>
                <grandchild>
                    <name>grand1</name>
                </grandchild>
                <grandchild>
                    <name>grand2</name>
                </grandchild>
                <grandchild>
                    <name>grand3</name>
                </grandchild>
            </grandchilds>
        </child>
        <child>
            <name>child1</name>
        </child>
    </childs>
</parent>
As you can see, a parent has one or more child nodes, and a child node may have grandchild nodes.
https://github.com/databricks/spark-xml#conversion-from-dataframe-to-xml
I understand from the spark-xml documentation that for a nested array structure the DataFrame should look like this:
+------------------------------------+
| a|
+------------------------------------+
|[WrappedArray(aa), WrappedArray(bb)]|
+------------------------------------+
Could you please help me with a small example of how to build a flattened DataFrame for my desired XML? I am working with Spark 2.x and spark-xml 0.4.5 (the latest).
My schema:
StructType categoryMapSchema = new StructType(new StructField[]{
    new StructField("name", DataTypes.StringType, true, Metadata.empty()),
    new StructField("childs", new StructType(new StructField[]{
        new StructField("child",
            DataTypes.createArrayType(new StructType(new StructField[]{
                new StructField("name", DataTypes.StringType, true, Metadata.empty()),
                new StructField("grandchilds", new StructType(new StructField[]{
                    new StructField("grandchild",
                        DataTypes.createArrayType(new StructType(new StructField[]{
                            new StructField("name", DataTypes.StringType, true,
                                Metadata.empty())
                        })), true, Metadata.empty())
                }), true, Metadata.empty())
            })), true, Metadata.empty())
    }), true, Metadata.empty()),
});
My Row RDD data (not the actual code, but roughly like this):
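final JavaRDD<Row> rowRdd = mapAttributes
    .map(parent -> {
        return RowFactory.create(
            parent.getParentName(),
            // "childs" struct wrapping the child array; the (Object) cast keeps
            // the array from being spread across the varargs parameter
            RowFactory.create(RowFactory.create((Object) parent.getChild()))
        );
    });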
What I have tried so far is nesting a WrappedArray within the parent WrappedArray, which does not work.
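To make the target shape concrete, here is a minimal sketch of how the values would need to nest to line up with the schema above. It deliberately does not use Spark so that it runs on its own: `Row` below is a hypothetical stand-in for `org.apache.spark.sql.Row`, `new Row(...)` stands in for `RowFactory.create(...)`, and `toXml` only mimics the element layout (it is not the spark-xml writer). The idea is that every StructType becomes one row, every ArrayType becomes an array of rows, and the `childs` / `grandchilds` wrappers are single-field rows holding that array. Using `null` for a child without grandchildren is an assumption.

```java
/** Stand-in sketch: models the nested row shape without Spark on the classpath. */
final class NestedRowShape {

    /** Hypothetical stand-in for org.apache.spark.sql.Row (values by position). */
    static final class Row {
        final Object[] values;

        Row(Object... values) {
            this.values = values;
        }
    }

    /** Builds the parent row for the sample XML in the question. */
    static Row sampleParent() {
        // ArrayType(StructType(name)): one row per <grandchild>.
        Row[] grandchildren = { new Row("grand1"), new Row("grand2"), new Row("grand3") };

        // Each child is (name, grandchilds). "grandchilds" is a one-field struct
        // wrapping the array; null here means "no grandchildren" (assumption).
        // The (Object) cast stops the array from being spread over the varargs.
        Row[] children = {
            new Row("child1", null),
            new Row("child1", new Row((Object) grandchildren)),
            new Row("child1", null)
        };

        // The parent is (name, childs); "childs" wraps the child array the same way.
        return new Row("parent", new Row((Object) children));
    }

    private static final String[] TAGS = { "parent", "child", "grandchild" };
    private static final String[] WRAPPERS = { "childs", "grandchilds" };

    /** Renders the nested rows with the element names used in the schema. */
    static String toXml(Row parent) {
        StringBuilder out = new StringBuilder();
        write(out, 0, parent);
        return out.toString();
    }

    private static void write(StringBuilder out, int depth, Row row) {
        out.append('<').append(TAGS[depth]).append('>');
        out.append("<name>").append(row.values[0]).append("</name>");
        // Field 1, when present and non-null, is the wrapper struct around the array.
        if (row.values.length > 1 && row.values[1] != null) {
            Row wrapper = (Row) row.values[1];
            out.append('<').append(WRAPPERS[depth]).append('>');
            for (Row item : (Row[]) wrapper.values[0]) {
                write(out, depth + 1, item);
            }
            out.append("</").append(WRAPPERS[depth]).append('>');
        }
        out.append("</").append(TAGS[depth]).append('>');
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleParent()));
    }
}
```

With Spark, the same nesting would presumably be built from `RowFactory.create(...)` calls with `Row[]` arrays in the array positions. The `(Object)` cast matters there too, since `RowFactory.create` takes varargs.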
Source: https://stackoverflow.com/questions/50007809/spark-xml-array-within-an-array-in-dataframe-to-generate-xml