Apache Drill vs Spark

有刺的猬 2021-02-05 14:30

I have some experience with Apache Spark and Spark-SQL. Recently I've found the Apache Drill project. Could you describe what are the most significant advantages/differences bet

3 answers
  •  执念已碎
    2021-02-05 14:41

    Apache Spark-SQL:

    • You need to write code (Scala, Java or Python) to access the data and process it.
    • SQL queries can be executed against Dataframes.
    • Execution can be done in a distributed fashion (cluster).
    • Almost every data store has a Spark driver or connector.
    • Used for massively parallel computing and data analytics.
    • Supports stream processing.
    • Has a bigger support community.
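
To make the first two Spark points concrete, here is a minimal PySpark sketch of running SQL against a DataFrame. It assumes the `pyspark` package is installed and uses a local session; the `people` view, the column names, and the `count_by_city` helper are made up for illustration.

```python
# Illustrative query; the "people" view and its columns are hypothetical.
QUERY = (
    "SELECT city, COUNT(*) AS n FROM people "
    "GROUP BY city ORDER BY n DESC, city"
)


def count_by_city(rows):
    """Load (name, city) tuples into a DataFrame and aggregate them with SQL."""
    # Imported inside the function so the module still loads where pyspark
    # is not installed (assumption: pyspark is available when this is called).
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")          # local mode; a real job would target a cluster
        .appName("spark-sql-sketch")
        .getOrCreate()
    )
    try:
        df = spark.createDataFrame(rows, ["name", "city"])
        # Register the DataFrame under a name so SQL can refer to it.
        df.createOrReplaceTempView("people")
        return [(r["city"], r["n"]) for r in spark.sql(QUERY).collect()]
    finally:
        spark.stop()
```

Calling `count_by_city([("Ann", "Oslo"), ("Bob", "Oslo"), ("Cy", "Rome")])` would start a local Spark session, run the query, and return the per-city counts. The code around the SQL string is the part the answer means by "you need to write code".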

    Apache Drill:

    • No need to write code: Drill explores the data source and builds its own data catalog.
    • Easier to use, just SQL.
    • Execution can be done in a distributed fashion (cluster).
    • It can read data from many sources, such as MongoDB, Parquet files, MySQL, and any JDBC database.
    • Used for ad-hoc data exploration.
    • It does not support stream processing.
    • It has a smaller support community.
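
For comparison, the "just SQL" point can be sketched by sending a query string to Drill's REST endpoint. This assumes a Drillbit running locally with its web interface on the default port 8047, and it queries the `cp.`employee.json`` sample file that ships on Drill's classpath. The helper names `build_request` and `run_query` are illustrative, not part of Drill.

```python
import json
import urllib.request

# Assumption: a local Drillbit with the web/REST interface on port 8047.
DRILL_URL = "http://localhost:8047/query.json"

# Sample data bundled with Drill; no schema or table definition is needed.
SAMPLE_SQL = "SELECT * FROM cp.`employee.json` LIMIT 3"


def build_request(sql, url=DRILL_URL):
    """Build the POST request that carries a SQL query to Drill."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query and return the result rows as a list of dicts."""
    with urllib.request.urlopen(build_request(sql, url)) as resp:
        return json.load(resp).get("rows", [])
```

With a Drillbit running, `run_query(SAMPLE_SQL)` would return the first three rows of the sample file. The only data-specific part is the SQL string itself, which is the contrast with Spark that the answer draws.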
