Hive service, HiveServer2 & MetaStore service?

情话喂你 2021-02-04 13:12

I am trying to understand Hive's architecture, and I am referring to Tom White's book on Hadoop.

I came across the following terms in regards to

2 answers
  •  予麋鹿 (original poster)
    2021-02-04 13:34

    Hive Services

    • HiveServer2
    • Hive Metastore
    • HCatalog + WebHCat
    • Beeline & Hive CLI
    • Thrift client
    • FileSystem :: HDFS and other compatible filesystems like S3
    • Execution engine :: MapReduce, Tez, Spark
    • Hive Web UI (added in Hive 2.x). The Tez and Spark UIs are related, but they belong to the execution engines rather than to Hive itself

    Driver

    The JDBC/ODBC and Thrift interfaces each have client drivers.
    Separately, there are the processes that interpret the query and compile it down to code for the execution engine. I personally call that part an interpreter or compiler, not a driver.
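
    To make the client-driver side concrete, here is a minimal sketch of how a JDBC client such as Beeline addresses HiveServer2. The host name and user are hypothetical; port 10000 and the `default` database are the usual HiveServer2 defaults, and your cluster may differ. The helper functions are illustrative, not part of any Hive library.

```python
import shlex


def hiveserver2_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/db."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, user, query):
    """Return the argv list for a one-off Beeline query.

    -u is the JDBC URL, -n the user name, -e the query to execute.
    """
    return ["beeline", "-u", url, "-n", user, "-e", query]


# Hypothetical host and user, for illustration only.
url = hiveserver2_jdbc_url("hs2.example.com")
cmd = beeline_command(url, "analyst", "SHOW TABLES")

# Print a shell-safe version of the command line.
print(shlex.join(cmd))
```

    The JDBC driver here only carries the statement to HiveServer2. Parsing and compiling the query happens on the server.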

    Metastore Server

    Not part of HiveServer2. It is literally a process running on top of an RDBMS (yes, you still need one when running Hive & Hadoop).

    Supported backing databases for a remote metastore include Oracle, MySQL and Postgres.
    The embedded metastore (not recommended for production) uses Derby.

    See Hive Wiki
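
    As a rough illustration of "a process running on top of an RDBMS", the sketch below renders the two `javax.jdo.option.*` properties that point the metastore at its backing database into `hive-site.xml` form. The database host and schema name are hypothetical, and the MySQL driver class shown is the older Connector/J name; newer connectors use `com.mysql.cj.jdbc.Driver`. Treat this as a sketch, not a complete configuration.

```python
import xml.etree.ElementTree as ET

# Embedded Derby: the usual out-of-the-box setup, not meant for production.
EMBEDDED_DERBY = {
    "javax.jdo.option.ConnectionURL":
        "jdbc:derby:;databaseName=metastore_db;create=true",
    "javax.jdo.option.ConnectionDriverName":
        "org.apache.derby.jdbc.EmbeddedDriver",
}

# MySQL-backed metastore; host and database name are hypothetical.
MYSQL_BACKED = {
    "javax.jdo.option.ConnectionURL":
        "jdbc:mysql://db.example.com:3306/metastore",
    "javax.jdo.option.ConnectionDriverName":
        "com.mysql.jdbc.Driver",
}


def to_hive_site(props):
    """Render a dict of properties as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_hive_site(MYSQL_BACKED)
print(xml_text)
```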

    Metastore JVM

    The orange boxes show that you can deploy these services either in the same JVM as the driver (interpreter) or as a remote server. The wiki describes these setups.

    I believe this is a sidecar process that maps HiveServer2 queries to metastore queries. For example, how do you translate HiveQL into a process that reads metadata from MySQL or Postgres?

    It can run on the server side, yes, but this setup is not recommended, for fault-tolerance and performance reasons.
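
    The deployment choices described above can be sketched as a small classifier over a client's configuration. It assumes the convention from the Hive wiki that a non-empty `hive.metastore.uris` means the metastore runs as a remote Thrift service, and that an unset connection URL falls back to embedded Derby. The function name, the mode labels and the host names are my own, for illustration only.

```python
# Fallback used when no connection URL is configured (embedded Derby).
DEFAULT_DERBY_URL = "jdbc:derby:;databaseName=metastore_db;create=true"


def metastore_mode(conf):
    """Classify a Hive configuration dict as 'remote', 'embedded' or 'local'.

    remote:   the client talks Thrift to a separate metastore process.
    embedded: metastore and Derby both run inside the client JVM.
    local:    metastore runs inside the client JVM, database is external.
    """
    if conf.get("hive.metastore.uris", "").strip():
        return "remote"
    url = conf.get("javax.jdo.option.ConnectionURL", DEFAULT_DERBY_URL)
    # "jdbc:derby://host:port/..." is the Derby network client, not embedded.
    if url.startswith("jdbc:derby:") and not url.startswith("jdbc:derby://"):
        return "embedded"
    return "local"


# Hypothetical hosts; 9083 is the conventional metastore Thrift port.
print(metastore_mode({"hive.metastore.uris": "thrift://ms.example.com:9083"}))
print(metastore_mode({}))
print(metastore_mode(
    {"javax.jdo.option.ConnectionURL": "jdbc:mysql://db.example.com:3306/metastore"}
))
```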

    HiveServer1 is deprecated. Feel free to read about it, but don't use it.
