This is the 20th day of my participation in the August More Text Challenge
Hive data type
1.1 Basic data types
Hive Data Types | Java data types | The length of the | example |
---|---|---|---|
TINYINT | byte | 1byte signed integer | 20 |
SMALINT | short | 2byte Signed integer | 20 |
INT | int | 4byte Signed integer | 20 |
BIGINT | long | 8byte signed integer | 20 |
BOOLEAN | boolean | Boolean type, true or false | TRUE FALSE |
FLOAT | float | Single-precision floating point number | 3.14159 |
DOUBLE | double | A double – precision floating – point number | 3.14159 |
STRING | string | Character series. Character sets can be specified. You can use single or double quotation marks. | ‘Now is the time’ “For all Good Men” |
TIMESTAMP | Time to type | ||
BINARY | An array of bytes |
The Hive String type is equivalent to the database vARCHar type. This type is a variable String, but it cannot state how many characters it can store. It can theoretically store up to 2GB of characters.
1.2 Collection data types
The data type | describe | Syntax examples |
---|---|---|
STRUCT | Like structs in C, you can access element content through “dot” notation. For example, if the data type of a column is STRUCT{first STRING, last STRING}, the first element can be referenced by the field.first. 2 | Struct () e.g. struct<street:string, city:string> |
MAP | A MAP is a collection of key-value pairs of tuples that can be accessed using array notation. For example, if the data type of a column is MAP, where the key -> value pairs are ‘first’ -> ‘John’ and ‘last’ -> ‘Doe’, the last element can be retrieved by the field name [‘ last ‘] | Map () for example map<string, int> |
ARRAY | An array is a collection of variables of the same type and name. These variables are called elements of the array, and each element of the array has a number starting from zero. For example, if the array value is [‘ John ‘, ‘Doe’], the second element can be referenced by the array name [1]. | Array () such as Array |
Hive provides three complex data types: Array, Map, and struct. Array and Map are similar to Java’s array and Map, while Struct and C’s Struct are similar in that they encapsulate collections of named word fields, allowing arbitrary levels of nesting of complex data types
Case practice:
-
1. If a table has the following row, we use JSON format to represent the data result.
{"name": "songsong", "friends": ["bingbing", "lili"], // list Array, "children": {// key values Map, "xiao Song ": Struct, "street": "hui long guan", "city": "Beijing"}}Copy the code
-
2. Based on the above data structure, we created corresponding tables in Hive and imported data
Create a local test file named test.txt
songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijingyangyang,caicai_susu,xiao yang:18_xiaoxiao yang:19,chao yang_beijing Copy the code
Note: Relationships between elements in maps, structs, and arrays can all be represented by the same character, using “_” here.
-
3. Create the test table on Hive
create table test( name string, friends array<string>, children map<string, int>, address struct<street:string, city:string>)row format delimited fields terminated by ','collection items terminated by '_'map keys terminated by ':'lines terminated by '\n'; Copy the code
Field parsing:
- Row format DELIMited fields terminated by ‘,’ — column separator
- Collection items terminated by ‘_’ –MAP STRUCT and ARRAY separators
- Map keys terminated by ‘:’ — The separator of key and value in map
- lines terminated by ‘\n’; — Line separator
-
4. Import text data into the measurement table
load data local inpath '/opt/module/hive/datas/test.txt' into table test; Copy the code
-
5. Access data in three sets of columns. The following are array, map and struct access methods respectively
hive (default)> select friends[1],children['xiao song'],address.city from testwhere name="songsong"; Ok_c0_c1 Citylili 18 beijingTime taken: 0.076 seconds, 1 row(s)Copy the code
1.3 Type Conversion
Hive atomic data types can be converted implicitly, similar to Java type conversion. For example, if an expression uses an INT, TINYINT is automatically converted to an INT, but Hive does not convert an INT to TINYINT.
1.3.1 Implicit Type Conversion rules
- 1. Any integer type can be implicitly converted to a wider range of types, such as tinyint to int and int to bigint
- 2. All integer types, floats, and strings can be implicitly converted to DOUBLE
- 3, TINYINT, smallint, int can be converted to Float
- Boolean types cannot be converted to any other type
1.3.2 Data type conversion can be performed using cast operation display
For example, cast (‘ 1 ‘as INT) converts string 1 to the integer 1. If the cast force fails, such as cast (‘x’ as INT), the expression returns null.
0: jdbc:hive2://hadoop102:10000> select '1'+2, cast('1'as int) + 2; + -- -- -- -- -- - + -- -- -- -- -- - + - + | _c0 | _c1 | + -- -- -- -- -- - + -- -- -- -- -- - + - + 3.0 | | 3 | + -- -- -- -- -- - + -- -- -- -- -- - + - +Copy the code