Hive functions that are hard to remember
1. Aggregate function
Avg (col) # returns the average value of the shuffled count(col) # Returns the number of rows retrieved Max (col) # Returns the maximum value of the group min(col) # Returns the minimum value of the group collect_set(col) # Returns a list of objects that eliminate duplicate elements, Collect_set can be used with concat_ws and explode. Collect_set means not to repeat. Select c1,collect_set(c2)[0] from AA group by c1 C1 c2 a [" 1 ", "2", "3"] b [" 4 ", "5", "6"] and concat_ws application example: Select c1,concat_ws(",",collect_set(c2)) from AA group by c1Copy the code
Explode function
Posexplode (array) # explode(array) # explode(map) # explode(map) # explode(array) # explode(array) # Lateral View # lateral view UDTF expression tableAlias AS columnAlias Examples of data with UDTF: Show that aa id col 1 [{" v1 ":" 1 ", "v2" : "2"}, {" v3 ":" 3 ", "v4" : "4"}] 2 [{" v5 ":" 5 ", "v6" : "6"}, {" v7 ":" 7 ", "v8" : "8"}] 3 null Explode: select explode(col) from aa {" v1 ", "1", "v2" : "2"} {" v3 ":" 3 ", "v4" : "4"} {" v5 ":" 5 ", "v6" : "6"} {" v7 ":" 7 ", "v8" : "8"} lateral view ecplode example: Select id,col1 from AA lateral view ecplode(col) col as col1 1 {"v1":"1","v2":"2"} 1 {"v1":"3","v2":"4"} 2 {"v1":"5","v2":"6"} 2 {"v1":"7","v2":"8"} lateral view Select id,col1['v1'] from AA lateral view ecplode(col) col as col1 1 1 1 2 3 2 5 7 lateral view ecplode example of the values in the array and parse map: Select id,ecplode(col1) from AA lateral view ecplode(col) col as col1 1 v1 3 1 1 1 1 v1 v2 2 2 v1 v2 4 5 6 2 v1 v2 7 2 v2 8 lateral VIEW outer explodes retain original null values, would extend the columns of the null set to null data sample: Mhotel POis 90663029 [{"search": 5, "id": 15306667, "POI ":" Wushan County Education Commission Care for the Next Generation working Committee "}, {"search": 2, "ID ": 15381967," POI ": "Style square"}, {" search ": 1," id ", 15433199, "poi" : "wu mountain tsui bus stop"}] [90808282 {9, "search" : "id", 15451636, "poi" : "Wushan County Education Committee "}, {"search": 5," ID ": 15306667, "POI ":" Wushan County Education Committee Care for the Next Generation working Committee "}, {"search": 1, "ID ": 15367082," POI ": }] SQL > select * from 'nanfeng Primary school' SELECT mhotel, json_tuple(poi_json,'search','id','poi') AS (search,id,poi) FROM (SELECT mhotel, regexp_replace(regexp_replace(pois, '\\[|\\]', ''), '\\}, \\{', '\\}&\\{') pois FROM iii) t lateral VIEW explode(split(pois, '\\&')) pois AS poi_jsonCopy the code
Json string parsing
Json_tuple # get_josn_object # get_josn_object # get_josn_object # get_josn_object # get_josn_object # get_josn_object # get_josn_object # get_josn_object # get_josn_object Select get_josn_object(a,'$.type') as type,get_josn_object(a,'$.name') as name from aa json_tuple example: select a.id,b.name,b.type from aa a lateral view json_tuple(first,'name','type') b as name,typeCopy the code
4. Time function
From_timestamp (unix_TIMESTAMP (), 'YYYY-MM-DD' HH:mm:ss ')# current_date # current_date timestamp() # date_add(DATE startdate,int days) # date_sub(DATE) Date_diff (date_diff, date_diff, date_diff) DATE in front of the big return positive add_months (DATE | STRING | TIMESTAMP the start_date, int num) # returns the num of the start_date day, if the DATE for the 2020-08-31, Num = 3 return 2020-11-30 year(stirng date) # Return the month part of the date (stirng date) # return the day part of the date (stirng date) Dayofmonth (string date) # Last_day (string date) # Returns the last dayof the month Months_between (DATE | TIMESTAMP | STRING date1, DATE | TIMESTAMP | STRING date2) # in returns between two dates, not on the same day is calculated according to 31 days, Next_day (STRING starT_date, STRING day_of_week) # example :next_day('2020-03-31','Monday') = 2020-04-06 And return the quarter Monday next week (DATE | TIMESTAMP | STRING a) # for time in a year's quarter weekofyear (STRING DATE) # access to DATE in a year the number of weeks to_date (STRING TIMESTAMP) # return the date portion of the timeCopy the code
5. Mathematical functions
Floor (double a) # return the minimum int value equal to or less than a rand(int seed) Round (double a, int D) # Return the rounded BIGINT value of a or a rounded to d decimal placesCopy the code
6. Window functions
SUM(), MIN(),MAX(),AVG(),COUNT(), SUM(), SUM(), AVG(), SUM(), SUM(), SUM(), SUM(), MAX(),AVG(), AVG(), and COUNT() are used in combination with window functions. Count (distinct XXX) is not allowed in window functions, size(collect_set() over(partition by order by)) can be used instead.
Example: select *, sum(result) over (partition by name order by time rows between 3 preceding and current row) as result FROM aa;Copy the code
2. Analytic aggregate functions RANk(),ROW_NUMBER(),DENSE_RANK() and other common sorting window functions. Ow_number function: generate sequential sequence numbers (same elements have the same sequence number); Rank function: if two elements are in the same order, they have the same number and skip the next number. * rank function: if two elements are in the same order, the sequence number is the same. PERCENT_RANK() : PERCENT_RANK() : PERCENT_RANK() : PERCENT_RANK() : * NTILE() : Number of partitions in which that row belongs if you’re dividing the data into n rows
3. A valued-type window function is LAG, which means that a column goes backwards and backwards. LEAD is the opposite of LAG, the advance of a column by several lines; FIRST_VALUE is the first value to date for this column, and LAST_VALUE is the last value to date for this column. LAG() and LEAD() can take three arguments, the first the returned value, the second the number of rows before or after the LAG, and the third the default.
Conditional function
Assert_true (Boolean condition) # Return null if true, throw exception coalesce(T v1, T v2,...) If (Boolean condation, T valueTrue, T valuefalse) Null (a) # null(b) # null(b) # null(a, b) # null Otherwise return an NVL (T value, T default_value) # return the default if the value is null, otherwise return the valueCopy the code
Reference: juejin. Cn/post / 684490…