Preface

This Hadoop 3.x series follows the full big data tutorial set from the "dark horse programmers" course. The series will keep covering the new features of Hadoop 3.x that are not available in 2.x.

🍑 Requirements

What if we want to operate HDFS from a machine that has no HDFS client installed?

For that case, we will look at several clients based on the HTTP protocol. HTTP is cross-platform, so the client can operate HDFS directly without having Hadoop installed.

🍑 WebHDFS overview and operations

🐒 Introduction

WebHDFS is an HTTP RESTful API provided by HDFS. It is independent of the Hadoop version and supports the complete FileSystem/FileContext interface of HDFS. It allows clients to operate HDFS by sending HTTP requests, without having to install Hadoop.

The HDFS Web UI that we often use is itself built on WebHDFS.

🐒 About RESTful

🚲 REST

REST (Representational State Transfer) is a Web software architecture style proposed by Dr. Roy Thomas Fielding in his doctoral dissertation in 2000. Its purpose is to make it easier for different software and programs to exchange information over a network such as the Internet. REST is a set of constraints and properties based on the Hypertext Transfer Protocol (HTTP), and is a software construction style designed to provide Web services. Web services that conform to or are compatible with this style (referred to as REST or RESTful) allow clients to issue requests that access and manipulate network resources identified by Uniform Resource Identifiers (URIs), using a predefined set of stateless operations. REST therefore provides interoperability between networked computing systems and their resources. In contrast, other kinds of network services, such as SOAP services, access resources through operation sets that they define themselves.

REST is currently one of the three mainstream approaches to implementing Web services. More and more Web services are designed and implemented in the REST style because it is noticeably simpler than SOAP and XML-RPC. For example, Amazon.com provides a near-REST-style Web service for book queries, and the Web services provided by Yahoo are also REST-style.

Note that REST is a design style rather than a standard. It is typically built on widely used existing protocols and standards such as HTTP, URI, XML, and HTML:

A resource is identified by a URI.
Operations on a resource (obtaining, creating, modifying, and deleting) correspond to the GET, POST, PUT, and DELETE methods of HTTP.
A resource is manipulated by manipulating its representation.
The representation of a resource can be XML or HTML, depending on whether the reader is a machine or a human, that is, client software consuming a Web service or a Web browser. It can also be any other format, such as JSON.

🚲 RESTful API

A Web API that conforms to the REST design style is called a RESTful API. Its resources are defined in three ways:

An intuitive, short resource address: a URI, for example example.com/resources
The transmitted representation: the Internet media types that the Web service accepts and returns, such as JSON, XML, and YAML.
The operations on the resource: the set of request methods (such as POST, GET, PUT, or DELETE) that the Web service supports on the resource.

URI of a group of resources, for example example.com/resources
GET: Lists the URIs and details of each resource in the group.
PUT: Replaces the current group of resources with the given group of resources.
POST: Creates/appends a new resource in this group. This operation usually returns the URL of the new resource.
DELETE: Deletes the entire group of resources.

URI of a single resource, for example Example.com/resources/1…
GET: Obtains the details of the specified resource in an appropriate media type (such as XML or JSON).
PUT: Replaces/creates the specified resource and appends it to the appropriate resource group.
POST: Treats the specified resource as a resource group and creates/appends a new element that belongs to it.
DELETE: Deletes the specified element.

The PUT and DELETE methods are idempotent. The GET method is safe (it does not change anything on the server), and therefore it is also idempotent.

🚲 The difference between the PUT request type and POST request type

Either PUT or POST can be used to create or update a resource (adding a user or adding a file, for example). What essentially decides whether to use PUT or POST is idempotency. PUT is idempotent: putting the same object twice has no additional effect after the first request. POST is not: sending the same POST twice results in two resources being created.
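The difference can be illustrated with a toy in-memory store. This is only a sketch of the semantics, not a real HTTP server; the ResourceStore class and its method names are invented for the illustration.

```python
class ResourceStore:
    """Toy in-memory collection used to contrast PUT and POST semantics."""

    def __init__(self):
        self.items = {}      # resource id -> content
        self._next_id = 1

    def put(self, resource_id, content):
        # PUT targets a URI chosen by the client: repeating the call
        # leaves exactly one resource with the same content.
        self.items[resource_id] = content
        return resource_id

    def post(self, content):
        # POST lets the server choose the URI: every call creates
        # a new resource, even if the content is identical.
        resource_id = str(self._next_id)
        self._next_id += 1
        self.items[resource_id] = content
        return resource_id


store = ResourceStore()
store.put("a", "hello")
store.put("a", "hello")          # repeated PUT: still one resource
count_after_put = len(store.items)

store.post("hello")
store.post("hello")              # repeated POST: two more resources
count_after_post = len(store.items)

print(count_after_put, count_after_post)  # prints "1 3"
```

This is also why a client can safely retry a PUT after a timeout, while retrying a POST may create duplicates.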

🐒 HDFS HTTP RESTful API

The HDFS HTTP RESTful API supports the following operations:

🚲 HTTP GET

OPEN (equivalent to FileSystem.open)
GETFILESTATUS (equivalent to FileSystem.getFileStatus)
LISTSTATUS (equivalent to FileSystem.listStatus)
LISTSTATUS_BATCH (equivalent to FileSystem.listStatusIterator)
GETCONTENTSUMMARY (equivalent to FileSystem.getContentSummary)
GETQUOTAUSAGE (equivalent to FileSystem.getQuotaUsage)
GETFILECHECKSUM (equivalent to FileSystem.getFileChecksum)
GETHOMEDIRECTORY (equivalent to FileSystem.getHomeDirectory)
GETDELEGATIONTOKEN (equivalent to FileSystem.getDelegationToken)
GETTRASHROOT (equivalent to FileSystem.getTrashRoot)
GETXATTRS (equivalent to FileSystem.getXAttr and FileSystem.getXAttrs)
LISTXATTRS (equivalent to FileSystem.listXAttrs)
CHECKACCESS (equivalent to FileSystem.access)
GETALLSTORAGEPOLICY (equivalent to FileSystem.getAllStoragePolicies)
GETSTORAGEPOLICY (equivalent to FileSystem.getStoragePolicy)
GETSNAPSHOTDIFF
GETSNAPSHOTTABLEDIRECTORYLIST
GETECPOLICY (equivalent to HDFSErasureCoding.getErasureCodingPolicy)
GETFILEBLOCKLOCATIONS (equivalent to FileSystem.getFileBlockLocations)

🚲 HTTP PUT

CREATE (equivalent to FileSystem.create)
MKDIRS (equivalent to FileSystem.mkdirs)
CREATESYMLINK (equivalent to FileContext.createSymlink)
RENAME (equivalent to FileSystem.rename)
SETREPLICATION (equivalent to FileSystem.setReplication)
SETOWNER (equivalent to FileSystem.setOwner)
SETPERMISSION (equivalent to FileSystem.setPermission)
SETTIMES (equivalent to FileSystem.setTimes)
RENEWDELEGATIONTOKEN (equivalent to DelegationTokenAuthenticator.renewDelegationToken)
CANCELDELEGATIONTOKEN (equivalent to DelegationTokenAuthenticator.cancelDelegationToken)
CREATESNAPSHOT (equivalent to FileSystem.createSnapshot)
RENAMESNAPSHOT (equivalent to FileSystem.renameSnapshot)
SETXATTR (equivalent to FileSystem.setXAttr)
REMOVEXATTR (equivalent to FileSystem.removeXAttr)
SETSTORAGEPOLICY (equivalent to FileSystem.setStoragePolicy)
ENABLEECPOLICY (equivalent to HDFSErasureCoding.enablePolicy)
DISABLEECPOLICY (equivalent to HDFSErasureCoding.disablePolicy)
SETECPOLICY (equivalent to HDFSErasureCoding.setErasureCodingPolicy)

🚲 HTTP POST

APPEND (equivalent to FileSystem.append)
CONCAT (equivalent to FileSystem.concat)
TRUNCATE (equivalent to FileSystem.truncate)
UNSETSTORAGEPOLICY (equivalent to FileSystem.unsetStoragePolicy)
UNSETECPOLICY (equivalent to HDFSErasureCoding.unsetErasureCodingPolicy)

🚲 HTTP DELETE

DELETE (equivalent to FileSystem.delete)
DELETESNAPSHOT (equivalent to FileSystem.deleteSnapshot)
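Since every operation is tied to one HTTP method, a client needs to pick the method from the operation name. The following is a minimal sketch of such a lookup; it covers only a handful of the operations listed above, and the names WEBHDFS_OP_METHOD and method_for are invented for this example.

```python
# HTTP method required by a few WebHDFS operations (a subset of the lists above).
WEBHDFS_OP_METHOD = {
    "GETFILESTATUS": "GET",
    "LISTSTATUS": "GET",
    "CREATE": "PUT",
    "MKDIRS": "PUT",
    "RENAME": "PUT",
    "APPEND": "POST",
    "CONCAT": "POST",
    "DELETE": "DELETE",
}


def method_for(op):
    """Return the HTTP method for an operation name.

    The lookup upper-cases the name as a convenience of this sketch.
    """
    try:
        return WEBHDFS_OP_METHOD[op.upper()]
    except KeyError:
        raise ValueError(f"operation not in this sketch's table: {op}") from None


print(method_for("mkdirs"))   # prints "PUT"
print(method_for("APPEND"))   # prints "POST"
```

Using the wrong method for an operation is a common mistake when testing by hand, for example sending MKDIRS as a GET from a browser address bar.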

🐒 File system URL and HTTP URL

The file system scheme of WebHDFS is webhdfs://. A WebHDFS file system URI has the following format.

webhdfs://<HOST>:<HTTP_PORT>/<PATH>

The WebHDFS URI above corresponds to the HDFS URI below.

hdfs://<HOST>:<RPC_PORT>/<PATH>

In the REST API, the prefix "/webhdfs/v1" is inserted before the path and a query string is appended at the end. Therefore, the corresponding HTTP URL has the following format.

http://<HOST>:<HTTP_PORT>/webhdfs/v1/<PATH>?op=…
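The mapping from a webhdfs:// URI to the HTTP URL is mechanical, so it can be sketched in a few lines with the Python standard library. The function name webhdfs_to_http is made up for this example; the host and port are the ones used in this article's test cluster.

```python
from urllib.parse import quote, urlencode, urlsplit


def webhdfs_to_http(webhdfs_uri, op, **params):
    """Translate webhdfs://<HOST>:<HTTP_PORT>/<PATH> into the REST URL."""
    parts = urlsplit(webhdfs_uri)
    if parts.scheme != "webhdfs":
        raise ValueError("expected a webhdfs:// URI")
    # The operation always goes first, followed by any optional parameters.
    query = urlencode({"op": op, **params})
    # Insert the /webhdfs/v1 prefix before the path and append the query string.
    return f"http://{parts.netloc}/webhdfs/v1{quote(parts.path)}?{query}"


print(webhdfs_to_http("webhdfs://node1.itcast.cn:9870/", "LISTSTATUS"))
# prints "http://node1.itcast.cn:9870/webhdfs/v1/?op=LISTSTATUS"
```

Optional parameters are passed as keyword arguments, for example webhdfs_to_http(uri, "CREATE", overwrite="true").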

Install Postman to test. Send a GET request to node1.itcast.cn:9870/webhdfs/v1/… This operation lists all files and directories in the root directory, which is equivalent to hdfs dfs -ls /

We can see in Postman that HDFS returns the following message:

{
    "FileStatuses": {
        "FileStatus": [
            {
                "accessTime": 0, "blockSize": 0, "childrenNum": 2, "fileId": 16698,
                "group": "supergroup", "length": 0, "modificationTime": 1601513468046,
                "owner": "root", "pathSuffix": "data", "permission": "755", "replication": 0, "storagePolicy": 0, "type": "DIRECTORY"
            },
            {
                "accessTime": 0, "blockSize": 0, "childrenNum": 2, "fileId": 16386,
                "group": "supergroup", "length": 0, "modificationTime": 1600886915849,
                "owner": "root", "pathSuffix": "mr-history", "permission": "770", "replication": 0, "storagePolicy": 0, "type": "DIRECTORY"
            },
            ...
        ]
    }
}

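A client usually only needs a few fields from this response. The following sketch extracts the name and type of each entry with the standard json module; the function name list_entries is invented here, and the sample text is a shortened version of the response above.

```python
import json


def list_entries(response_text):
    """Return (pathSuffix, type) pairs from a LISTSTATUS JSON response."""
    body = json.loads(response_text)
    statuses = body["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"]) for s in statuses]


# Shortened copy of the response shown above (only some fields kept).
sample = """
{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "data", "type": "DIRECTORY", "owner": "root", "permission": "755"},
  {"pathSuffix": "mr-history", "type": "DIRECTORY", "owner": "root", "permission": "770"}
]}}
"""

print(list_entries(sample))
# prints [('data', 'DIRECTORY'), ('mr-history', 'DIRECTORY')]
```
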
🐒 Create and write to a file using WebHDFS

🚲 Create a file

Submit an HTTP PUT request without automatically following redirects and without sending the file data.

curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE
                [&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>]
                [&permission=<OCTAL>][&buffersize=<INT>][&noredirect=<true|false>]"

Typically, the request is redirected to the DataNode where the file data is to be written.

HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE…
Content-Length: 0

If you do not want to be redirected automatically, set the noredirect flag. The DataNode URL is then returned in the response body instead.

HTTP/1.1 200 OK
Content-Type: application/json
{"Location":"http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE…"}

For example, create a file named webhdfs_api.txt in the /data/hdfs-test directory and write content to it. In Postman, create a request, set the request type to PUT, and set the request URL to:

node1.itcast.cn:9870/webhdfs/v1/…

The HTTP response contains the URL to which the data should be uploaded:

{"Location":"node1.itcast.cn:9864/webhdfs/v1/…"}
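The first step can also be scripted. Below is a minimal sketch using Python's http.client, which never follows redirects on its own, so both the 307 variant and the noredirect variant can be handled. The function names are invented for this example, and the CREATE URL must be supplied by the caller because the full URL is elided in the example above.

```python
import http.client
import json
from urllib.parse import urlsplit


def location_from_response(status, location_header, body):
    """Extract the DataNode upload URL from the response to the first PUT.

    307: the URL is in the Location header.
    200: (noredirect=true) the URL is in the JSON body under "Location".
    """
    if status == 307:
        return location_header
    if status == 200:
        return json.loads(body)["Location"]
    raise RuntimeError(f"unexpected status {status}")


def request_create_location(create_url):
    """Send the first PUT (op=CREATE) without any file data."""
    parts = urlsplit(create_url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=10)
    try:
        conn.request("PUT", f"{parts.path}?{parts.query}")
        resp = conn.getresponse()
        body = resp.read().decode("utf-8")
        return location_from_response(resp.status, resp.getheader("Location"), body)
    finally:
        conn.close()
```

Calling request_create_location with a CREATE URL on the NameNode would return the DataNode URL, which is what Postman shows in the Location field above.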

🚲 Write data

Submit another HTTP PUT request to the URL from the Location header (or from the returned response body if noredirect was specified), this time with the file data to be written.

curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE…"

The client receives a 201 Created response with zero content length. The WebHDFS URI of the file is in the Location header:

HTTP/1.1 201 Created
Location: webhdfs://<HOST>:<PORT>/<PATH>
Content-Length: 0
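The second step can be sketched the same way: a PUT with the file bytes as the body, followed by a check for 201 Created. The function names upload and check_created are invented for this example, and the DataNode URL is assumed to come from the first request.

```python
import http.client
from urllib.parse import urlsplit


def check_created(status, location_header):
    """Return the file's WebHDFS URI if the upload answered 201 Created."""
    if status != 201:
        raise RuntimeError(f"upload failed with status {status}")
    return location_header


def upload(datanode_url, data):
    """PUT the bytes in `data` to the DataNode URL from the first request."""
    parts = urlsplit(datanode_url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=30)
    try:
        conn.request("PUT", f"{parts.path}?{parts.query}", body=data)
        resp = conn.getresponse()
        resp.read()  # the response body is expected to be empty
        return check_created(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

For a local file, the bytes could be read with open(path, "rb").read() before calling upload; for large files a streaming approach would be preferable to loading everything into memory.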

Example: upload a file using Postman, based on the previously returned HTTP response.

When we open the Web UI, we can see that the file was uploaded successfully.

For more operations, please refer to: hadoop.apache.org/docs/r3.1.4/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Cross-Site_Request_Forgery_Prevention

Afterword.

📢 : manor.blog.csdn.net

📢 Likes 👍, bookmarks ⭐ and comments 📝 are welcome. If there is an error, please point it out!

📢 This article is an original work by Manor and first appeared on the CSDN blog 🙉

📢Hadoop series will be updated daily! ✨