Question
I am interested in performing Big Data geospatial analysis on Apache Spark. My data is stored in Azure Data Lake, and I am restricted to using Azure Databricks. Is there any way to install GeoMesa on Databricks? Moreover, I would like to use the Python API; what should I do?
Any help is much appreciated!!
Answer 1:
As a starting point, without knowing any more details, you should be able to use the GeoMesa filesystem data store against files stored in WASB.
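For illustration, a minimal PySpark sketch of what that could look like from a Databricks notebook follows, assuming the GeoMesa FileSystem data store and its Spark SQL integration are already on the cluster classpath. The WASB path, the feature type name ("gdelt"), and the exact data store parameter keys ("fs.path", "fs.encoding") are assumptions here; verify them against the GeoMesa documentation for your version.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical WASB path and feature type name -- replace with your own.
# The parameter keys ("fs.path", "fs.encoding") follow the GeoMesa
# FileSystem data store convention; check them against your GeoMesa version.
fsds_params = {
    "fs.path": "wasbs://mycontainer@myaccount.blob.core.windows.net/geomesa/",
    "fs.encoding": "parquet",
}

df = (
    spark.read.format("geomesa")          # GeoMesa's Spark SQL data source
    .options(**fsds_params)
    .option("geomesa.feature", "gdelt")   # feature type stored under fs.path
    .load()
)

df.createOrReplaceTempView("gdelt")
spark.sql("SELECT count(*) FROM gdelt").show()
```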
Answer 2:
You can install the GeoMesa library directly onto your Databricks cluster.
1) Select the Libraries option on your cluster; a new window will open.
2) Select the Maven option and click 'Search Packages'.
3) Search for the required library, choose the library/jar version, and click 'Select'.
That's it.
After the installation of the library/jar, restart your cluster.
Now import the required classes in your Databricks notebook.
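Since the question asks about the Python API, note that the Maven step above only attaches the JVM jar; the usual Python route is GeoMesa's geomesa_pyspark package, which is distributed with GeoMesa and would need to be installed on the cluster separately. A minimal sketch of the notebook initialization, assuming the GeoMesa Spark runtime jar is already attached via the Library UI (function names follow the geomesa_pyspark docs, but check them against your GeoMesa version):

```python
# Assumes the GeoMesa Spark runtime jar was attached through the cluster's
# Library UI (the Maven step above) and that geomesa_pyspark was installed
# on the cluster as a Python library.
import geomesa_pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Register GeoMesa's user-defined types and spatial SQL functions
# (st_contains, st_intersects, ...) on this Spark session.
geomesa_pyspark.init_sql(spark)
```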
I hope it helps. Happy coding!
Answer 3:
Running GeoMesa within Databricks is not straightforward:
- GeoMesa’s artifacts are published on Maven Central, but require dependencies that are only available on third-party repositories, which is cumbersome given Databricks’ library import mechanism.
- GeoMesa conflicts with an older version of the scala-logging library shipped with the Databricks runtime (the infamous JAR Hell problem); a quick way to check from a notebook is sketched below.
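As a small diagnostic, the following sketch lists any scala-logging jar bundled with the Databricks runtime on the driver. The /databricks/jars path is an assumption based on common Databricks layouts and may differ between runtime versions.

```python
import glob

# Hypothetical location of the Databricks runtime jars; it can differ
# between Databricks runtime versions.
matches = glob.glob("/databricks/jars/*scala-logging*")
print("\n".join(matches) if matches else "no scala-logging jar found at this path")
```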
Reference: Use GeoMesa in Databricks
Hope this helps.
Source: https://stackoverflow.com/questions/58596798/how-can-i-download-geomesa-on-azure-databricks