hive
- Hive is a SQL parsing engine: it translates SQL statements into MapReduce jobs, which then run on the Hadoop platform.
- Tables in Hive are purely logical: a table is just its definition, i.e. the table's metadata. Physically it is nothing more than a Hadoop directory/file, which keeps metadata separate from data storage (see the sketch below).
- Hive itself does not store any data; it relies entirely on HDFS and MapReduce.
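To see the "table is only metadata" point in practice, here is a minimal sketch using PyHive (pip install pyhive[hive]), which talks to HiveServer2 over Thrift. The host/port reuse the endpoint shown later in these notes; the table name and HDFS location are made-up placeholders for illustration only.
# Sketch: a Hive EXTERNAL table is pure metadata over an HDFS directory;
# dropping it later removes only the definition, not the files.
# Table name and LOCATION are hypothetical examples.
from pyhive import hive

conn = hive.Connection(host='192.168.8.207', port=10000, username='hive')
cur = conn.cursor()
cur.execute("""
    CREATE EXTERNAL TABLE IF NOT EXISTS t_demo_login (
        user_id    BIGINT,
        login_time STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
    LOCATION '/data/demo/t_demo_login'
""")
cur.close()
conn.close()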
HiveServer2
Start the service
hive --service hiveserver2 --help    # show help
hive --service hiveserver2
netstat -anp | grep 10000    # or: ss -lnp | grep 10000
Configuration
/etc/hive/conf/hive-site.xml
A remote metastore is configured:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hd1.mingchao.com:9083</value>
</property>
Add the following settings for the host and port HiveServer2 binds to:
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>hd2.mingchao.com</value>
</property>
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
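A quick way to verify that the metastore and HiveServer2 Thrift ports configured above are actually reachable is a plain socket check; a small sketch (hostnames and ports taken from the config above):
# Sketch: confirm the Thrift ports from hive-site.xml are reachable.
import socket

for host, port in [('hd1.mingchao.com', 9083),    # hive.metastore.uris
                   ('hd2.mingchao.com', 10000)]:  # hive.server2.thrift.*
    try:
        with socket.create_connection((host, port), timeout=3):
            print(f'{host}:{port} reachable')
    except OSError as e:
        print(f'{host}:{port} NOT reachable: {e}')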
HiveServer2 also starts a web UI, on port 10002 by default; configure a reverse proxy (port forward) to access it:
cat hive.mingchao.com.conf
server {
    listen      80;
    server_name hive.mingchao.com;
    location / {
        proxy_pass http://127.0.0.1:10002;
    }
}
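To confirm the proxy works end to end, a request against the proxied hostname should return the HiveServer2 web UI. A minimal sketch with the requests package, assuming hive.mingchao.com resolves to the host running nginx:
# Sketch: hit the HiveServer2 web UI through the nginx reverse proxy above.
import requests

resp = requests.get('http://hive.mingchao.com/', timeout=5)
print(resp.status_code)   # expect 200 if the proxy and the web UI are up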
Connect to Hive with beeline
beeline -u jdbc:hive2://192.168.8.207:10000    # connect directly
beeline                                        # or start beeline first, then:
!connect jdbc:hive2://192.168.8.207:10000
!quit
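The same HiveServer2 endpoint that beeline uses can also be reached programmatically. A minimal PyHive sketch, equivalent to the JDBC URL above (the table name reuses t_log_login from the ODBC test further down):
# Sketch: programmatic equivalent of `beeline -u jdbc:hive2://192.168.8.207:10000`.
from pyhive import hive

conn = hive.Connection(host='192.168.8.207', port=10000)
cur = conn.cursor()
cur.execute('SELECT count(1) FROM t_log_login')
print(cur.fetchone()[0])
cur.close()
conn.close()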
Configure the Hive ODBC connection
/etc/odbc.ini
[ODBC Data Sources]
Cloudera Impala DSN = Cloudera Impala ODBC Driver
Cloudera Hive DSN = Cloudera Hive ODBC Driver
[Cloudera Impala DSN]
Driver = /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
HOST = 192.168.8.207
PORT = 21050
AuthMech = 0
[Cloudera Hive DSN]
Driver = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
HOST = 192.168.8.207
PORT = 10000
HiveServerType = 2
[ODBC Drivers]
Impala = Installed
Hive = Installed
/etc/odbcinst.ini
[Impala]
Description = ODBC for impala
Driver = /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
[Hive]
Description = ODBC for Hive
Driver = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
[ODBC Drivers]
Impala = Installed
Hive = Installed
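With the DSN in place, anything that speaks ODBC can use it; besides isql below, a minimal Python sketch with pyodbc (autocommit=True is needed because Hive does not support ODBC transactions):
# Sketch: query Hive through the "Cloudera Hive DSN" defined in /etc/odbc.ini.
import pyodbc

conn = pyodbc.connect('DSN=Cloudera Hive DSN', autocommit=True)
cur = conn.cursor()
cur.execute('SELECT count(1) FROM t_log_login')
print(cur.fetchone()[0])
conn.close()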
isql "Cloudera Hive DSN"
SQL> select count(1) FROM t_log_login;
+---------------------+
| EXPR_1              |
+---------------------+
| 122149              |
+---------------------+
Query ID = root_20181203235454_c671a6b2-16a8-4678-bd20-1948a52ef51f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1540986756395_0618, Tracking URL = http://hd1.mingchao.com:8088/proxy/application_1540986756395_0618/
Kill Command = /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hadoop/bin/hadoop job -kill job_1540986756395_0618
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 1
2018-12-03 23:55:26,779 Stage-1 map = 0%, reduce = 0%
2018-12-03 23:55:48,254 Stage-1 map = 7%, reduce = 0%, Cumulative CPU 11.11 sec
2018-12-03 23:55:51,562 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 11.94 sec
2018-12-03 23:56:03,778 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 22.75 sec
2018-12-03 23:56:05,982 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 29.57 sec
2018-12-03 23:56:07,078 Stage-1 map = 75%, reduce = 0%, Cumulative CPU 29.73 sec
2018-12-03 23:56:11,353 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 41.54 sec
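The job log above lists the knobs that control reducer parallelism; they can also be set per session before running the query. A hedged sketch via PyHive (the values are illustrative, not recommendations):
# Sketch: the reducer-related settings printed in the job log, set per session.
from pyhive import hive

conn = hive.Connection(host='192.168.8.207', port=10000)
cur = conn.cursor()
cur.execute('SET hive.exec.reducers.bytes.per.reducer=268435456')  # ~256 MB per reducer
cur.execute('SET mapreduce.job.reduces=1')                         # force a single reducer
cur.execute('SELECT count(1) FROM t_log_login')
print(cur.fetchone()[0])
cur.close()
conn.close()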