Table of Contents

1. Installation environment
2. Install Hadoop
   1. Download Hadoop
   2. Modify environment variables
   3. Modify the configuration file
3. Install Hive
   1. Download Hive
   2. Modify environment variables
   3. Modify the hive-site.xml configuration
   4. Check whether the installation is successful
4. Hive data integration
   1. Hive synchronization configuration
   2. Configure full synchronization
   3. Hook tests
5. Error records
   1. Abnormal characters exist in the configuration file
   2. Guava versions are inconsistent
1. Installation environment
JDK 1.8
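A quick way to confirm the JDK is in place (the exact version string depends on your distribution):

```
java -version   # expect something like: openjdk version "1.8.0_262"
```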
2. Install Hadoop
1. Download Hadoop
Browse mirror.bit.edu.cn/apache/hadoop/ and choose the appropriate version.
Download Hadoop:

```
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
```

Decompress it, then rename the directory with mv for easier use:

```
tar -xzvf hadoop-3.3.0.tar.gz
mv hadoop-3.3.0 hadoop
```
2. Modify environment variables
Write the Hadoop environment information into the environment variables. Edit /etc/profile (`vim /etc/profile`) and add:

```
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
```
Run `source /etc/profile` for it to take effect.
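A quick check that the variables are picked up and Hadoop is on the PATH:

```
hadoop version   # should print Hadoop 3.3.0 and related build info
```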
3. Modify the configuration file
Edit the hadoop-env.sh file (`vim etc/hadoop/hadoop-env.sh`) and modify the JAVA_HOME information:

```
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64
```
Run Hadoop's built-in example to verify that Hadoop is installed successfully:

```
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar grep input output 'dfs[a-z.]+'
```
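The grep example needs an `input` directory to scan. A minimal end-to-end run, following the standard Hadoop quick-start (executed from the Hadoop home directory; `input` and `output` are just example paths):

```
mkdir input
cp etc/hadoop/*.xml input   # sample files to grep over
# run the example command above, then inspect the result
cat output/*
```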
3. Install Hive
1. Download Hive
Download Hive from the same mirror, decompress it, and rename the directory for easier use:

```
wget http://mirror.bit.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
tar -zxvf apache-hive-3.1.2-bin.tar.gz
mv apache-hive-3.1.2-bin hive
```
2. Modify environment variables
Edit /etc/profile again (`vim /etc/profile`) and add the Hive environment information:

```
export HIVE_HOME=/opt/hive
export PATH=$MAVEN_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$PATH
```

Run `source /etc/profile` for it to take effect.
3. Modify the hive-site.xml configuration
```
<!-- WARNING!!! This file is auto generated for documentation purposes ONLY! -->
<!-- WARNING!!! Any changes you make to this file will be ignored by Hive.  -->
<!-- WARNING!!! You must make your changes in hive-site.xml instead.        -->
<!-- Hive Execution Parameters -->
<!-- The following properties already exist in the file. Search for each one and modify it in place, or delete it and add the new value in the same position. -->
<!-- Username -->
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<!-- Password -->
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
</property>
<!-- MySQL connection URL -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive</value>
</property>
<!-- MySQL driver -->
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>hive.exec.script.wrapper</name>
    <value/>
    <description/>
</property>
```
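The connection URL above points at a `hive` database in MySQL. If it does not exist yet, create it first (a minimal sketch, assuming the root credentials configured above):

```
# create the metastore database used by the ConnectionURL above
mysql -uroot -p123456 -e "CREATE DATABASE IF NOT EXISTS hive;"
```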
Copy the MySQL driver JAR into hive/lib, then go to hive/bin and initialize the metastore schema:

```
schematool -dbType mysql -initSchema
```
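To confirm the schema was initialized, schematool can also report the metastore version (an optional check):

```
# print the metastore schema version recorded in MySQL
schematool -dbType mysql -info
```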
4. Check whether the installation is successful
`hive --version` shows the current version.

Run `hive` and check whether the Hive command line opens; if it does, the installation succeeded.
4. Hive data integration
When metadata changes in Hive, the hook sends the change to Kafka as an event; the Ingest module of Atlas then consumes the message from Kafka and writes the corresponding Atlas metadata to the underlying JanusGraph database for storage and management.
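If you want to see the events the hook publishes, you can watch the hook topic directly (a sketch, assuming Kafka's scripts are on the PATH and a broker at localhost:9092; ATLAS_HOOK is Atlas's default notification topic):

```
# tail the metadata events the Hive hook sends to Atlas
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic ATLAS_HOOK --from-beginning
```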
1. Hive synchronization configuration
Edit conf/hive-env.sh and point HIVE_AUX_JARS_PATH at the Atlas hook package directory:

```
export HIVE_AUX_JARS_PATH=/opt/apache-atlas-2.1.0/hook/hive
```
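Before wiring the hook in, you can check that this directory actually contains the hook packages produced by the Atlas build (path assumed from the export above):

```
ls /opt/apache-atlas-2.1.0/hook/hive   # the hook jars produced by the Atlas build
```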
Modify hive-site.xml to specify the hook execution point:
```
<property>
    <name>hive.exec.post.hooks</name>
    <value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
```
Note that this is post-execution monitoring; Hive also supports hooks at other points, such as pre-execution (hive.exec.pre.hooks) and on failure (hive.exec.failure.hooks). Essentially this is a callback mechanism invoked over the query execution lifecycle.
2. Configure full synchronization
Copy the Atlas configuration file atlas-application.properties to the Hive configuration directory, and add two lines of configuration:

```
atlas.hook.hive.synchronous=false
atlas.rest.address=http://doit33:21000
```
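The copy itself, assuming the install locations used in this article:

```
cp /opt/apache-atlas-2.1.0/conf/atlas-application.properties /opt/hive/conf/
```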
Hooks do not automatically detect tables that already existed in Hive before Atlas was installed, so no metadata is generated for them. Atlas provides a tool to import metadata from existing Hive databases and tables; it is included in the hive-hook package generated when Atlas is compiled.
```
bin/import-hive.sh
```
Running it produces output like the following. You need to enter the Atlas username and password before the import proceeds; when it finishes with `Hive Meta Data Imported successfully!!`, the data has been imported.
```
sh import-hive.sh
Using Hive configuration directory [/opt/hive/conf]
Log file for import is /opt/apache-atlas-2.1.0/logs/import-hive.log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-01-15T11:41:01,614 INFO [main] org.apache.atlas.ApplicationProperties - Looking for atlas-application.properties in classpath
2021-01-15T11:41:01,619 INFO [main] org.apache.atlas.ApplicationProperties - Loading atlas-application.properties from file:/opt/hive/conf/atlas-application.properties
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Using graphdb backend 'janus'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Using storage backend 'hbase2'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Using index backend 'solr'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Atlas is running in MODE: PROD.
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Setting solr-wait-searcher property 'true'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Setting index.search.map-name property 'false'
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Setting atlas.graph.index.search.max-result-set-size = 150
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache = true
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache-clean-wait = 20
2021-01-15T11:41:01,660 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.db-cache-size = 0.5
2021-01-15T11:41:01,661 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.tx-cache-size = 15000
2021-01-15T11:41:01,661 INFO [main] org.apache.atlas.ApplicationProperties - Property (set to default) atlas.graph.cache.tx-dirty-size = 120
Enter username for atlas :- admin       # enter the Atlas username
Enter password for atlas :-            # enter the Atlas password
2021-01-15T11:41:05,721 INFO [main] org.apache.atlas.AtlasBaseClient - Trying with address http://127.0.0.1:21000
2021-01-15T11:41:05,831 INFO [main] org.apache.atlas.AtlasBaseClient - method=GET path=api/atlas/admin/status contentType=application/json; charset=UTF-8 accept=application/json status=200
```
3. Hook tests
Once all the hooks are configured, create a test table in Hive and check whether it can be searched in Atlas; if it can, the configuration succeeded.

Before creating the table, note the table information currently shown in Atlas.

Then create a table in Hive:
```
hive> CREATE TABLE teache (
    >   id int,
    >   name string,
    >   age int,
    >   sex string,
    >   peoject string
    > );
OK
Time taken: 0.645 seconds
hive> show tables;
OK
class
student
teache
Time taken: 0.108 seconds, Fetched: 3 row(s)
```
The new table then appears in Atlas automatically.
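Besides the web UI, the same check can be done against the Atlas REST API (a sketch, assuming Atlas at 127.0.0.1:21000 and the default admin account):

```
# basic search for the new table; returns JSON describing matching entities
curl -u admin:admin \
  'http://127.0.0.1:21000/api/atlas/v2/search/basic?typeName=hive_table&query=teache'
```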
5. Error records
1. Abnormal characters exist in the configuration file
The error is as follows:
```
Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at org.apache.hadoop.fs.Path.initialize(Path.java:263)
        at org.apache.hadoop.fs.Path.<init>(Path.java:221)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:710)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:627)
        at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:591)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:747)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at java.net.URI.checkPath(URI.java:1823)
        at java.net.URI.<init>(URI.java:745)
        at org.apache.hadoop.fs.Path.initialize(Path.java:260)
        ... 12 more
```
Solution:
Go to the configuration entries at the lines indicated by the error and replace the `${system:java.io.tmpdir}/${system:user.name}`-style values with fixed paths, as below:
```
<property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive/local</value>
    <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/hive/resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>
```
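If Hive cannot create these directories itself, you can create them up front (a sketch matching the values above; make sure the user running Hive can write to them):

```
# pre-create the local scratch/resource directories from the values above
mkdir -p /tmp/hive/local /tmp/hive/resources
```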
2. Guava versions are inconsistent
```
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3215,96,"file:/opt/hive/conf/hive-site.xml"]
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3051)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3000)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2875)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:1484)
        at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4996)
        at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5069)
        at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5156)
        at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5104)
        at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
        at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3215,96,"file:/opt/hive/conf/hive-site.xml"]
        at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
        at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
        at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
        at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
        at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
        at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
        at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3347)
        at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3141)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3034)
        ... 15 more
```
Solutions:
1. The class com.google.common.base.Preconditions.checkArgument lives in guava.jar.
2. In hadoop-3.2.1 (path: hadoop/share/hadoop/common/lib) the JAR is guava-27.0-jre.jar; in hive-3.1.2 (path: hive/lib) it is guava-19.0.jar.
3. Make the two versions consistent: delete the lower-version guava JAR under hive/lib and copy the higher-version guava JAR from Hadoop into it (see the commands below).
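Concretely, assuming the install paths used earlier (adjust the guava filenames to whatever versions you actually find in the two lib directories):

```
# remove Hive's older guava and use Hadoop's newer one instead
rm /opt/hive/lib/guava-19.0.jar
cp /opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar /opt/hive/lib/
```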
Restart Hive and the problem is solved!