Hive Common Functions

Null value substitution

The nvl function
nvl(col1, replace_with): if col1 is NULL, replace_with is returned in its place. replace_with can be a fixed value or another column.

select nvl(name1, name2) 
from table_name 
or
select nvl(name1, 'zhangsan')
from table_name
If name1 is NULL, name2 is used instead; if name2 is also NULL, the result is NULL.


-- Hive does not support this form; nvl takes exactly two arguments
select nvl(name1, name2, name3) from table_name
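When more than two fallbacks are needed, one option is Hive's built-in coalesce, which returns its first non-NULL argument. A minimal sketch, reusing name1 and name2 from above and assuming a third column name3 exists (it is hypothetical, as is the 'unknown' default):

```sql
-- coalesce accepts any number of arguments and returns the first non-NULL one;
-- name3 and the 'unknown' default are hypothetical, for illustration only
select coalesce(name1, name2, name3, 'unknown')
from table_name
```

Nesting the two-argument form, as in nvl(name1, nvl(name2, name3)), should behave the same way for three columns.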

Date and time functions

Formatting dates

select date_format('2019-12-29', 'yyyy-MM')
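When the input is a string, date_format expects it in yyyy-MM-dd form (optionally followed by a time, as in yyyy-MM-dd HH:mm:ss); other layouts such as 2019/12/29 are not parsed.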
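For strings in another layout, one possible approach is to parse them with unix_timestamp and an explicit pattern, then format the result with from_unixtime. A sketch on a literal value (the slash-separated input is only an example):

```sql
-- parse '2019/12/29' using an explicit pattern, then re-format it as yyyy-MM
select from_unixtime(unix_timestamp('2019/12/29', 'yyyy/MM/dd'), 'yyyy-MM')
```

This should produce 2019-12, the same value the date_format call above returns for '2019-12-29'.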

Adding and subtracting dates

select date_add('2019-12-29', 1)
select date_add('2019-12-29', -1)

select date_sub('2019-12-29', 1)
select date_sub('2019-12-29', -1)

select datediff('2019-12-29', '2019-12-20') 
datediff returns the first date minus the second: 9 here, and -9 if the arguments are swapped.
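The calls above can be checked side by side. A small sketch on literal dates (the column aliases are made up for readability); a negative offset simply reverses the direction, so date_sub(d, n) matches date_add(d, -n):

```sql
select date_add('2019-12-29', 1)            as next_day,      -- 2019-12-30
       date_add('2019-12-29', -1)           as previous_day,  -- 2019-12-28
       date_sub('2019-12-29', 1)            as one_day_back,  -- 2019-12-28
       date_sub('2019-12-29', -1)           as one_day_ahead, -- 2019-12-30
       datediff('2019-12-20', '2019-12-29') as diff           -- -9
```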

Replacement function

select regexp_replace('2019/12/29', '/', '-')


select datediff(regexp_replace('2019/12/29', '/', '-'), regexp_replace('2019/12/28', '/', '-'))
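One thing to watch: the second argument of regexp_replace is a Java regular expression, not a plain string, so characters such as . need escaping. A sketch on a hypothetical dot-separated date:

```sql
-- '\\.' matches a literal dot; an unescaped '.' would match every character
select regexp_replace('2019.12.29', '\\.', '-')
```

The slash used in the examples above has no special meaning in a regular expression, which is why it works unescaped.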

case when


-- '男' = male, '女' = female
select dept_id,
		sum(case sex when '男' then 1 else 0 end) male,
		sum(case sex when '女' then 1 else 0 end) female
from table_name
group by dept_id

or
select dept_id,
	   sum(if(sex='男',1,0)) male,
	   sum(if(sex='女',1,0)) female
from table_name
group by dept_id 

Summary:
when there are only a few branches, if can be used instead;
when there are many, use case when
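To illustrate the many-branch case, here is a sketch using the searched form of case when. The salary column and the band thresholds are hypothetical; they are not part of the original table:

```sql
-- salary is a hypothetical numeric column used only for illustration
select salary_band,
       count(*) as cnt
from (
  select case
           when salary is null then 'unknown'
           when salary < 5000  then 'low'
           when salary < 10000 then 'mid'
           else 'high'
         end as salary_band
  from table_name
) t
group by salary_band
```

Branches are evaluated top to bottom and the first match wins, which is why the second branch does not need to repeat salary >= 5000. The same logic written with nested if calls would be noticeably harder to read.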

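The if function

See the SQL above:
if(sex='男',1,0)
if takes three arguments:
	the first is a boolean expression;
	the second is the value returned when the expression is true;
	the third is the value returned when it is false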
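The three arguments can be tried on literals. The sketch below also shows that a NULL condition falls through to the third argument, which matters when a column such as sex may be empty:

```sql
-- a NULL condition is treated like false, so the third argument is returned
select if(1 = 1, 'yes', 'no'),                -- yes
       if(1 = 2, 'yes', 'no'),                -- no
       if(cast(null as string) = '男', 1, 0)  -- 0
```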

Converting between rows and columns

Related functions

concat(): string concatenation function

select concat('1', '323','dsadf') 


select concat(cast(model_id as string), public_type, '222')
from table_name
where pt='2019-12-25'
limit 10

model_id is an int column, so it is explicitly cast to string before being passed to concat.

concat_ws(): a special form of concat that joins strings with a separator;
it concatenates multiple fields of one row

select concat_ws('-', 'a', 'b', 'c')			-- 'a-b-c'


select concat_ws('-', cast(model_id as string), public_type, public_author)
from table_name
where pt='2019-12-25'
limit 10

concat_ws: the first argument is the separator; the remaining arguments are the fields to join
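The two functions also differ in how they treat NULL, which is worth knowing before choosing between them. A sketch on literals, with the NULL cast to string so that both calls receive a string argument:

```sql
select concat('a', cast(null as string), 'c'),         -- NULL: any NULL argument makes the result NULL
       concat_ws('-', 'a', cast(null as string), 'c')  -- a-c: NULL arguments are skipped
```

If NULL fields should show up as an empty string or a placeholder rather than be skipped, wrapping each field in nvl first is one way to get that.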

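collect_set: gathers the values of a column across multiple rows into a single row, as an array with duplicates removed;
collect_set is an aggregate function, so it is usually combined with group by; without one it aggregates over all rows of the query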

id	name 
1	a
2	b
3	c
4	d
5	e
6	f

Applying collect_set to the id column gives a result like [1,2,3,4,5,6] (the element order is not guaranteed).
collect_set accepts exactly one argument;
select collect_set(id), collect_set(name)
from table_name 

The result is:

_col			_col2
[1,2,3,4,5,6]	[a,b,c,d,e,f]

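collect_list is used the same way as collect_set; the difference is that collect_set removes duplicates (a set) while collect_list keeps them (a list)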
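The difference only shows up when a value repeats. A self-contained sketch in which inline rows stand in for a real table:

```sql
-- inline rows stand in for a real table; id 1 appears twice
select collect_set(id)  as ids_distinct,  -- duplicates removed, e.g. [1,2]
       collect_list(id) as ids_all        -- duplicates kept, e.g. [1,1,2]
from (
  select 1 as id
  union all
  select 1 as id
  union all
  select 2 as id
) t
```

The element order inside either array is not guaranteed, so the arrays shown in the comments are only one possible outcome.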

select concat_ws('-', collect_set(cast(public_type as string))), 
       concat_ws('__',collect_set(cast(model_id as string)))
from table_name
where pt='2019-12-25'
limit 10

model_id is first cast to string; collect_set then takes the id from every row and puts them into one set, that is, into a single row; finally concat_ws joins the set into one string.
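Because collect_set does not promise any particular order, the joined string can differ between runs. If a stable order matters, one option is to wrap the array in sort_array before joining; a sketch on the same table:

```sql
-- sort_array puts the collected values in ascending order before joining;
-- the values are compared as strings here because of the cast
select concat_ws('__', sort_array(collect_set(cast(model_id as string))))
from table_name
where pt='2019-12-25'
```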

select t.a, collect_list(t.b)
from 
(
select concat_ws(',', released_time,public_type) a, model_id b
from table_name
where pt='2019-12-25'
limit 10
) t
group by t.a

What group by aggregation does: it finds all entries that share the same key and gathers them into a new column:

a		b		c
12-25	type1	111
12-25	type1	222
12-25	type2	111	
12-25	type2	322

Aggregating on columns a and b
gives:
Col 1			Col 2
12-25,type1 	111,222
12-25,type2	111,322


Aggregating on column c
gives:
Col 1 	Col 2
111 	12-25,type1;12-25,type2
222		12-25,type1
322 	12-25,type2
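The two results above could be produced with queries along these lines. This is only a sketch: t_abc is a hypothetical table holding the four sample rows, and a, b and c are assumed to be string columns (otherwise they would need a cast before concat_ws):

```sql
-- t_abc is a hypothetical table holding the sample rows, with string columns a, b, c

-- aggregate on a and b: one row per (a, b) pair, with all c values joined
select concat_ws(',', a, b)            as col1,
       concat_ws(',', collect_list(c)) as col2
from t_abc
group by a, b;

-- aggregate on c: one row per c value, with all (a, b) pairs joined
select c                                                  as col1,
       concat_ws(';', collect_list(concat_ws(',', a, b))) as col2
from t_abc
group by c;
```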

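Columns to rows

explode(): splits a complex array or map stored in one Hive column into multiple rows
lateral view: joins each row generated by a function such as explode back to the source row it came from

The goal is the following result: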

	Column 1			Column 2
	Person of Interest	suspense
	Person of Interest	action
	Person of Interest	sci-fi
	Person of Interest	drama
	lie to me			suspense
	lie to me			crime
	...

select movie,
	   category_name
from table_name 
lateral view explode(category) table_tmp as category_name

table_tmp is the alias of the generated table
category_name is the name of its column
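explode needs an array or a map as input. If category is stored as a delimited string rather than an array (an assumption, since the original table definition is not shown), split can convert it first. A sketch assuming a comma as the delimiter:

```sql
-- assumes category holds a delimited string such as 'a,b,c';
-- split turns it into array<string>, which explode can then expand
select movie,
       category_name
from table_name
lateral view explode(split(category, ',')) table_tmp as category_name
```

The second argument of split is a regular expression, so a delimiter such as | or . would need escaping.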

Source: CSDN

Author: 唐僧哥哥在这

Link: https://blog.csdn.net/chenzhiqiang1018/article/details/103743722
