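Overview
Hive provides the class org.apache.hadoop.hive.ql.tools.LineageInfo, which can be used to analyze table-level lineage in a HiveQL statement, that is, which tables the statement reads from and which tables it writes to.
Usage
Command line
The class's own main method already demonstrates how to use it: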
public static void main(String[] args) throws IOException, ParseException, SemanticException {
    String query = args[0];
    LineageInfo lep = new LineageInfo();
    lep.getLineageInfo(query);
    for (String tab : lep.getInputTableList()) {
        System.out.println("InputTable=" + tab);
    }
    for (String tab : lep.getOutputTableList()) {
        System.out.println("OutputTable=" + tab);
    }
}
It can be invoked directly from the command line, passing the query as a single quoted argument:
[hadoop@dx2 ~]$ hadoop jar /usr/local/hive/lib/hive-exec-1.1.0-cdh5.9.0.jar org.apache.hadoop.hive.ql.tools.LineageInfo "INSERT OVERWRITE TABLE cxy7_dw.tmp_zone_info PARTITION (dt='20171109') SELECT z.zoneid AS zone_id,z.zonename AS zone_name, c.cityid AS city_id, c.cityname AS city_name FROM dict_zoneinfo z LEFT JOIN dict_cityinfo c ON z.cityid = c.cityid AND z.dt='20171109' AND c.dt='20171109' WHERE z.dt='20171109' AND c.dt='20171109'"
InputTable=dict_cityinfo
InputTable=dict_zoneinfo
OutputTable=cxy7_dw.tmp_zone_info
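If the command-line form is driven from a script, the printed lines have to be turned back into table lists. The following is a minimal sketch of one way to do that, assuming the output consists of lines of the form InputTable=<name> and OutputTable=<name> as shown above. LineageOutputParser is a hypothetical helper written for this illustration and is not part of Hive. Any other line (for example, log noise) is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hive): parses the text printed by LineageInfo's main method.
public class LineageOutputParser {

    public final List<String> inputTables = new ArrayList<>();
    public final List<String> outputTables = new ArrayList<>();

    // Collects names from "InputTable=<name>" and "OutputTable=<name>" lines; other lines are skipped.
    public static LineageOutputParser parse(String consoleOutput) {
        LineageOutputParser result = new LineageOutputParser();
        for (String line : consoleOutput.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("InputTable=")) {
                result.inputTables.add(trimmed.substring("InputTable=".length()));
            } else if (trimmed.startsWith("OutputTable=")) {
                result.outputTables.add(trimmed.substring("OutputTable=".length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample text copied from the console output shown above.
        String sample = "InputTable=dict_cityinfo\n"
                + "InputTable=dict_zoneinfo\n"
                + "OutputTable=cxy7_dw.tmp_zone_info\n";
        LineageOutputParser parsed = parse(sample);
        System.out.println("inputs=" + parsed.inputTables);
        System.out.println("outputs=" + parsed.outputTables);
    }
}
```

When the code runs inside a JVM that already has hive-exec on its classpath, reading getInputTableList() and getOutputTableList() directly, as the main method does, avoids this parsing step altogether.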
Calling from code
import org.apache.hadoop.hive.ql.tools.LineageInfo;

public class LineageInfoTest {
    public static void main(String[] args) throws Exception {
        String query = "INSERT OVERWRITE TABLE cxy7_dw.tmp_zone_info PARTITION (dt='20171109') "
                + "SELECT z.zoneid AS zone_id,z.zonename AS zone_name, c.cityid AS city_id, c.cityname AS city_name "
                + "FROM dict_zoneinfo z LEFT JOIN dict_cityinfo c "
                + "ON z.cityid = c.cityid AND z.dt='20171109' AND c.dt='20171109' "
                + "WHERE z.dt='20171109' AND c.dt='20171109'";
        LineageInfo.main(new String[] { query });
    }
}
Console output
InputTable=dict_cityinfo
InputTable=dict_zoneinfo
OutputTable=cxy7_dw.tmp_zone_info