hive

hive

  • Hive是一个SQL解析引擎,将SQL语句转译成MR Job,然后再Hadoop平台上运行。
  • Hive中的表是纯逻辑表,就只是表的定义等,即表的元数据。本质就是Hadoop的目录/文件,达到了元数据与数据存储分离的目的
  • Hive本身不存储数据,完全依赖HDFS和MapReduce

hive server

启动服务

hive --service hiveserver2 –help 查看帮助
hive --service hiveserver2
netstat -anp | grep 10000 或 ss -lnp | grep 10000

配置

/etc/hive/conf/hive-site.xml

使用了远程metastore配置

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://hd1.mingchao.com:9083</value>
</property>

增加配置,配置启动的Host、端口

<property>
    <name>hive.server2.thrift.bind.host</name>
    <value>hd2.mingchao.com</value>
</property>
<property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
</property>

hiveserver2会同时启动一个webui,端口号默认为10002,配置一个端口转发访问

cat hive.mingchao.com.conf
server {
  listen    80;
  server_name hive.mingchao.com;

  location / {
      proxy_pass http://127.0.0.1:10002;
  }
}

beeline连接hive

beeline -u jdbc:hive2://192.168.8.207:10000
beeline
!connect jdbc:hive2://192.168.8.207:10000
!quit

配置hive odbc连接

官网下载驱动

/etc/odbc.ini

[ODBC Data Sources]
Cloudera Impala DSN             = Cloudera Impala ODBC Driver
Cloudera Hive DSN = Cloudera Hive ODBC Driver

[Cloudera Impala DSN]
Driver          = /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
HOST            = 192.168.8.207
PORT            = 21050
AuthMech                = 0

[Cloudera Hive DSN]
Driver=/opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
HOST=192.168.8.207
PORT=10000
HiveServerType=2

[ODBC Drivers]
Impala          = Installed
Hive = Installed

/etc/odbcins.ini

[Impala]
Description     = ODBC for impala
Driver          = /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so

[Hive]
Description     = ODBC for Hive
Driver          = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so

[ODBC Drivers]
Impala          = Installed
Hive = Installed
isql "Cloudera Hive DSN"
SQL> select count(1) FROM t_log_login;
+---------------------+
| EXPR_1              |
+---------------------+
| 122149              |
Query ID = root_20181203235454_c671a6b2-16a8-4678-bd20-1948a52ef51f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1540986756395_0618, Tracking URL = http://hd1.mingchao.com:8088/proxy/application_1540986756395_0618/
Kill Command = /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hadoop/bin/hadoop job  -kill job_1540986756395_0618
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 1
2018-12-03 23:55:26,779 Stage-1 map = 0%,  reduce = 0%
2018-12-03 23:55:48,254 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 11.11 sec
2018-12-03 23:55:51,562 Stage-1 map = 25%,  reduce = 0%, Cumulative CPU 11.94 sec
2018-12-03 23:56:03,778 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 22.75 sec
2018-12-03 23:56:05,982 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 29.57 sec
2018-12-03 23:56:07,078 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 29.73 sec
2018-12-03 23:56:11,353 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 41.54 sec

资料

Hive原理详解

bigdata11