What is the Phoenix
Simply put, Phoenix is a framework that allows you to operate HBase databases using SQL.
HBase is a NoSQL database. The shell client only supports simple operations and is prone to dizziness.
For example, the following graph returns all the data in a table:
Is it ugly? In addition, if complex HBase query is performed, only the native API of HBase can be used to write query programs, which is quite troublesome.
Using Phoenix, you can use SQL to query data efficiently, add secondary indexes to HBase, improve query performance, and other features.
Phoenix supports writing JDBC code to operate HBase, which is more convenient than the native API.
Usage examples
Build table
CREATE TABLE IF NOT EXISTS us_population (
state CHAR(2) NOT NULL,
city VARCHAR NOT NULL,
population BIGINT
CONSTRAINT my_pk PRIMARY KEY (state, city));
Copy the code
See the table
0: jdbc:phoenix:localhost> ! tablesCopy the code
Go to HBase.
hbase(main):041:0> list
Copy the code
Result information:
TABLE
SYSTEM.CATALOG
SYSTEM.FUNCTION
SYSTEM.SEQUENCE
SYSTEM.STATS
US_POPULATION
...
Copy the code
The US_POPULATION table was successfully generated.
Add data
Create a new test data file us_population.csv with the following contents:
NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
TX,Houston,2016582
PA,Philadelphia,1463281
AZ,Phoenix,1461575
TX,San Antonio,1256509
CA,San Diego,1255540
TX,Dallas,1213825
CA,San Jose,912332
Copy the code
Run the following command to import the data from the file into the database:
./psql.py localhost us_population.csv
Copy the code
Query table data:
0: jdbc:phoenix:localhost> select * from US_POPULATION;
Copy the code
The sample
Phoenix comes with a small example of web statistics. Run the following command to import it:
bin/psql.py localhost examples/WEB_STAT.sql examples/WEB_STAT.csv
Copy the code
The first implementation of the SQL table, and then import data file CSV.
Query table data:
0: jdbc:phoenix:localhost> select * from WEB_STAT;
Copy the code
The CORE and DB fields are CPU and database usage.
Check the average USAGE of CUP and database in each group according to DOMAIN groups:
SELECT DOMAIN, AVG(CORE) Average_CPU_Usage, AVG(DB) Average_DB_Usage
FROM WEB_STAT
GROUP BY DOMAIN
ORDER BY DOMAIN DESC;
Copy the code
View the number of accesses to each domain in descending order:
select domain,count(1) num
from web_stat
group by domain
order by num desc;
Copy the code
summary
The basic function of Phoenix is to add an SQL layer to HBase to facilitate HBase use.
Phoenix has many excellent features, such as: secondary index, namespace mapping, view, multi-tenant, dynamic column, transaction…
And now it has developed very well, can integrate Spark, Hive, Pig, MapReduce, and Flume plug-ins.