Basic data types

The preceding figure lists the basic data types supported by Hive.

Similarities: these data types are implemented in Java. For example, a Hive STRING is a Java String.

Differences:

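1. Other SQL dialects commonly provide “character arrays” with a maximum length, but Hive as described here does not support them (VARCHAR and CHAR were only added in later Hive releases).

Relational databases use fixed-length columns as a performance optimization, because fixed-length records are easier to index. Hive focuses on optimizing disk read and write performance, so fixing the length of column values is relatively unimportant.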

2. A TIMESTAMP value can be an integer (the number of seconds since midnight on January 1, 1970), a floating-point number (seconds since the same epoch, accurate to nanoseconds, that is, 9 decimal places), or a string in the format YYYY-MM-DD hh:mm:ss.fffffffff.

TIMESTAMP values represent UTC time. Hive provides built-in functions such as to_utc_timestamp and from_utc_timestamp to convert between time zones.

3. BINARY is similar to VARBINARY in many relational databases, but it is not like BLOB. A BINARY column is stored in the record and can contain arbitrary bytes, which prevents Hive from trying to parse them as numbers, strings, and so on.

You do not need the BINARY data type if your goal is just to ignore the tail of each row. If a table schema specifies three columns but the data files contain five values per row, Hive ignores the last two.

4. When a query compares values of different numeric types, such as FLOAT to DOUBLE or INT to FLOAT, Hive implicitly casts to the larger type.

5. To convert a string to a numeric value, cast it explicitly: … CAST(s AS INT) …;
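The timestamp conversion functions and the explicit cast from the list above can be tried with a few standalone queries. This is a minimal sketch: it assumes a Hive version that accepts SELECT without a FROM clause, and the time zone name and literal values are only illustrative.

```sql
-- Interpret the given timestamp as UTC and convert it to the named time zone.
SELECT from_utc_timestamp('1970-01-01 00:00:00', 'America/Los_Angeles');

-- Interpret the given timestamp as local time in the named zone and convert it to UTC.
SELECT to_utc_timestamp('1970-01-01 00:00:00', 'America/Los_Angeles');

-- Explicitly convert a string to an integer before using it as a number.
SELECT CAST('42' AS INT) + 1;
```

If the string cannot be parsed as a number, CAST returns NULL rather than raising an error, so it is worth checking the result for NULL when the input is not trusted.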

Collection data types

Hive columns support the STRUCT, MAP, and ARRAY collection data types, as shown in the following figure

There is no concept of keys in Hive, but users can create indexes for tables.

Example table definition

An employee table for a human resources application:

CREATE TABLE employees(
	name STRING,
	salary FLOAT,
	subordinates ARRAY<STRING>,
	deductions MAP<STRING, STRING>,
	address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
);
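Once such a table exists, individual elements of the collection columns can be read directly in a query. The following is a sketch only: it assumes the struct column is named address, and the map key 'Federal Taxes' is a hypothetical example value, not something defined by the table.

```sql
SELECT
	name,
	subordinates[0]            AS first_subordinate,  -- ARRAY: zero-based index
	deductions['Federal Taxes'] AS federal_taxes,     -- MAP: lookup by key (hypothetical key)
	address.city               AS city                -- STRUCT: dot notation
FROM employees;
```

An array index that is out of range or a map key that is not present yields NULL for that row.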

4. Text file data encoding

Default record and field delimiters in Hive

Example use:

CREATE TABLE some_data(
	first FLOAT,
	second FLOAT,
	third FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',' ;
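For comparison, Hive's default delimiters can also be written out explicitly. By default Hive separates fields with Ctrl-A ('\001'), collection items with Ctrl-B ('\002'), map keys from their values with Ctrl-C ('\003'), and records with a newline. The sketch below uses a hypothetical table name, employees_explicit, and should behave the same as omitting the ROW FORMAT clause entirely.

```sql
CREATE TABLE employees_explicit(
	name STRING,
	subordinates ARRAY<STRING>,
	deductions MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'            -- Ctrl-A between columns
COLLECTION ITEMS TERMINATED BY '\002'  -- Ctrl-B between array elements or map entries
MAP KEYS TERMINATED BY '\003'          -- Ctrl-C between a map key and its value
LINES TERMINATED BY '\n'               -- one record per line
STORED AS TEXTFILE;
```

These control characters are used as defaults because they rarely appear inside field values, unlike commas or tabs.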