Category: Servers and Storage
2010-12-17 11:22:16
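Hive's built-in method for turning a column into rows, explode():
Given a table myTable with a single column myCol:
Array<int> myCol |
[1,2,3] |
[4,5,6] |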
Then running the query:
SELECT explode(myCol) AS myNewCol FROM myTable;
Will produce:
int myNewCol |
1 |
2 |
3 |
4 |
5 |
6 |
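Using the syntax "SELECT udtf(col) AS colAlias..." has a few limitations:
- No other expressions are allowed in the SELECT, so "SELECT pageid, explode(adid_list) AS myCol..." is not supported.
- UDTFs cannot be nested, so "SELECT explode(explode(adid_list)) AS myCol..." is not supported.
- GROUP BY / CLUSTER BY / DISTRIBUTE BY / SORT BY is not supported, so "SELECT explode(adid_list) AS myCol ... GROUP BY myCol" is not supported.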
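LATERAL VIEW, which the promotion-code query further down relies on, is the usual way around the restrictions of the plain "SELECT udtf(col)" form. The following is a minimal sketch against the sample table myTable from the example above; the view alias t is a name chosen only for this illustration.

```sql
-- Assumes the sample table myTable(myCol ARRAY<INT>) from the example above.
-- The view alias "t" is a hypothetical name chosen for this sketch.

-- Other columns can be selected next to the exploded value.
SELECT myCol, myNewCol
FROM myTable
LATERAL VIEW explode(myCol) t AS myNewCol;

-- GROUP BY can be applied to the exploded column.
SELECT myNewCol, count(1) AS cnt
FROM myTable
LATERAL VIEW explode(myCol) t AS myNewCol
GROUP BY myNewCol;
```

Each input row is joined with the rows that explode() produces from it, so the first query returns six rows, each carrying the original array together with one of its elements.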
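# HQL statement for the current promotion codes. Its results look similar to those of the current pvinsight. I am still reading the original code, so some details of the logic may need attention.
It has been run on the hive pvlog at 71.122; the results of this HQL are in /tmp/lky/channel_pvid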
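-- Inner query (tb): extract the pvid parameter from the URL and drop rows without one.
-- Middle query (tp): emit one row per element of channelid, which must be an array column for explode().
-- Outer query: count page views (pv) and distinct users (uv) per pvid and channel.
from (
    from (
        select suv, channelid, regexp_extract(url, '.*pvid=(.*?)(&|#|$)', 1) as pvid
        from pvlog_pre
        where dt = '20101202'
          and regexp_extract(url, '.*pvid=(.*?)(&|#|$)', 1) != ""
    ) tb LATERAL VIEW explode(tb.channelid) adTable AS channel
    select channel, suv, pvid
) tp
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/lky/channel_pvid'
select tp.pvid, tp.channel, count(1) as pv, count(distinct tp.suv) as uv
group by tp.pvid, tp.channel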
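To see what the regexp_extract call in the query picks out, it can help to run it on a single literal. The sketch below uses a made-up URL (not a row from pvlog_pre) and assumes a Hive version that accepts SELECT without FROM.

```sql
-- Hypothetical URL, made up for this sketch; not taken from pvlog_pre.
-- A SELECT without FROM needs Hive 0.13 or later; on older versions,
-- run the same expression against any one-row table instead.
SELECT regexp_extract(
         'http://example.com/page?from=ad&pvid=abc123&x=1#top',
         '.*pvid=(.*?)(&|#|$)',
         1) AS pvid;
-- returns abc123
```

Group 1 is lazy, so it stops at the first "&", "#", or end of string after "pvid=". The leading ".*" is greedy, so if a URL contains "pvid=" more than once, the last occurrence is the one extracted.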
chinaunix user 2010-12-17 15:01:50
Very nice, bookmarked. Recommending a blog that offers many free software programming e-books for download: http://free-ebooks.appspot.com