It’s Tuesday, which means it is time for another network protocol post. Today’s topic is the URI. I am on a business trip and could not make a new header image, so I am reusing an old one.
I think we all know what a URL is, because we’re all exposed to it every day, but what is a URI?
So let’s look at what would happen if there were no URIs in the world.
Suppose I have uploaded some material to share with you, and there is no such thing as a URI. How would you download it?
First, I have to tell you to use FTP to connect to Naonao.com on port 8090.
Then I tell you that the login user name is Naonao and the password is Handsome.
Once logged in, you need to go to /Naonao/Source and switch to binary mode.
Finally, you download the file "How to avoid the troubles caused by being too handsome.mp4".
If that final file weren’t so tempting, I wouldn’t want to bother with such a tedious procedure.
With a URI, all of the steps above collapse into one line that we type directly into the browser:
ftp://Naonao:Handsome@Naonao.com:8090/Naonao/Source/How to avoid the troubles caused by being too handsome.mp4
and the resource is downloaded straight away.
As a friendly reminder, the URI above is not strictly correct, because the file name at the end (Chinese in the original post) has not been percent-encoded. We will come back to encoding and decoding later.
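To make the mapping between the manual steps and the URI concrete, here is a minimal Python sketch that splits the FTP example back into its pieces with the standard library's urllib.parse. It assumes the user name, password, host, port, and directory from the steps above, and it deliberately leaves out the file name, since that part would need percent-encoding first.

```python
from urllib.parse import urlsplit

# Directory part of the FTP example; the video file name is omitted
# because it is not percent-encoded yet.
uri = "ftp://Naonao:Handsome@Naonao.com:8090/Naonao/Source/"

parts = urlsplit(uri)

print(parts.scheme)    # which protocol to use
print(parts.username)  # login user name
print(parts.password)  # login password
print(parts.hostname)  # host to connect to (urlsplit lowercases it)
print(parts.port)      # port number, as an int
print(parts.path)      # directory on the server
```

Every item I would otherwise have to spell out by hand is recoverable from the single string, which is the whole point of the format.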
What is a URI?
What problem do URIs solve
But before we get to URIs, we need to take a quick look at URLs and URNs
A URL is defined in RFC 1738 (December 1994) as a Uniform Resource Locator. It indicates the location of a resource and is expected to provide a method for locating it
A URN is defined in RFC 2141 (May 1997) as a Uniform Resource Name. URNs are intended to be persistent, location-independent resource identifiers, and are designed to make it easy to map other namespaces into the URN namespace
The concept of a URN on its own may not ring a bell, but here is an example that the guys will definitely recognize: magnet links.
I found a magnet link for the cartoon Calabash Brothers on the Internet. Take a look. Doesn’t it seem familiar?
magnet:?xt=urn:btih:bdab9b6759950fab3c8cbde2669bea6195491034
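As a rough illustration of where the URN sits inside that link, the sketch below pulls out the xt parameter and splits it at the colons. It only looks at the shape "urn:", then a namespace label, then a namespace-specific string. It does not check whether btih is a formally registered namespace, and the variable names are my own.

```python
from urllib.parse import urlsplit, parse_qs

magnet = "magnet:?xt=urn:btih:bdab9b6759950fab3c8cbde2669bea6195491034"

parts = urlsplit(magnet)
# The URN travels as the value of the "xt" query parameter.
xt = parse_qs(parts.query)["xt"][0]

# URN shape: "urn" ":" <namespace label> ":" <namespace-specific string>
prefix, namespace, name = xt.split(":", 2)

print(parts.scheme)  # the scheme of the outer link
print(namespace)     # the namespace label inside the URN
print(name)          # the name itself, with no location information
```

Nothing in the name says where the file lives, which is what makes it a name rather than a locator.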
Well, it’s okay if you’re not familiar with it. That’s not the point of today; it is enough to know roughly what a URN looks like. Now, what is the definition of a URI?
URI stands for Uniform Resource Identifier, and it is used to identify resources. It covers both URLs and URNs and is the general term now used in their place
In other words, every URL or URN is a URI, but a URI is not necessarily a URL or a URN. That is, URIs are a superset of URLs and URNs
The difference between URIs and URLs
We now know that URIs are a superset of URLs, but on the web the two look so much alike that we often confuse them
The difference is between an identifier and a locator. A URI focuses on uniquely identifying a resource, while a URL focuses on where it is located
To make a simple analogy, if we describe ourselves with URIs, the URI is our ID number and the URL is the home address printed on our ID card. The ID number (URI) will certainly identify me, but you cannot necessarily find me through my address (URL)
What does a resource include
The word “resources” covers so many things, from pictures and documents to today’s weather
It can also be an entity that cannot be accessed through the Internet, such as a person or company
It could also be something abstract, like a kinship relation or whether you’re a man who toys with women’s feelings
However, note that URIs and resources are not in one-to-one correspondence. A resource can have many URIs, but one URI corresponds to only one resource, just as we may hold many bank cards, yet each card belongs to exactly one account holder: ourselves
In practice, an identifier is a name that distinguishes the current resource from other resources
From the meanings of "identifier" and "resource", one goal of the URI is clear: to make it easy for resource providers to tell their own resources apart from everyone else’s
For example, for entities that cannot be accessed through the Internet, such as people, we can define Mine, Father, Relationship, and so on through URIs. In this way, we can distinguish the resources we want to express
The composition of the URI
Let’s take a look at the components of a URI
We will analyze them based on the diagram
Let’s start with the three most important parts. Here is an example, followed by a breakdown of each part
https://naonao.com?name=naonao&age=18#page-7
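As a quick preview of the parts discussed next, the following sketch splits this example with Python's urllib.parse. Note that Python calls the authority "netloc"; the dictionary keys are just labels I chose to match the headings in this post.

```python
from urllib.parse import urlsplit, parse_qs

uri = "https://naonao.com?name=naonao&age=18#page-7"
parts = urlsplit(uri)

components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,  # urllib names this "netloc"
    "path": parts.path,         # empty in this example
    "query": parts.query,       # without the leading "?"
    "fragment": parts.fragment, # without the leading "#"
}
params = parse_qs(parts.query)  # key=value pairs as a dict of lists

for label, value in components.items():
    print(f"{label:>9}: {value!r}")
print(params)
```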
Scheme
The scheme names the protocol to use, such as http, https, or ftp. Don’t feel limited to these common ones: you can also define a custom scheme, as long as the software on both ends supports it
A scheme may contain letters, digits, +, -, and ., and it must begin with a letter
Note: the scheme is separated from the rest of the URI by a colon. When an authority follows, as in the examples here, the separator you see is ://
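The character rule for schemes is small enough to check with one regular expression. This is a sketch of the RFC 3986 grammar as I understand it: the listed characters are allowed, and in addition the first character has to be a letter. The function name is hypothetical.

```python
import re

# First character: a letter. Remaining characters: letters, digits, "+", "-", ".".
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Return True if the whole string matches the scheme grammar."""
    return SCHEME_RE.fullmatch(scheme) is not None


for candidate in ["https", "svn+ssh", "1http", "my_scheme"]:
    print(candidate, is_valid_scheme(candidate))
```

A custom scheme such as svn+ssh passes, while one starting with a digit or containing an underscore does not.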
Query
The query is optional. If there is one, it must start with ?
The most common form is key=value, as in name=naonao in the example above
But the query is not limited to this form: according to the RFC it can consist of pchar, /, and ? characters
The ? is easy to understand, since a query has to begin with ?. But what is pchar? It is the RFC’s name for the set of characters allowed in a path segment; the exact definition is in the RFC and is not the focus of today
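A short sketch of both points: building the usual key=value form with urlencode and reading it back, and then a query that is not key=value at all. The second URI is a made-up example, included only to show that / and ? are accepted inside the query.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Build the common key=value form; urlencode joins the pairs with "&".
query = urlencode({"name": "naonao", "age": 18})
uri = "https://naonao.com?" + query

# Read the pairs back; all values come back as strings.
pairs = parse_qsl(urlsplit(uri).query)

# Hypothetical free-form query: everything after the first "?" belongs to it,
# including later "/" and "?" characters.
free_form = urlsplit("https://naonao.com?/a/b?c").query

print(query)
print(pairs)
print(free_form)
```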
Fragment
The fragment is also optional, and if present it must start with #
In the example above, page-7 points to a particular section within the resource
It allows the same characters as the query
Authority
The authority contains the user name and password, the host name, and the port number.
We rarely put user names and passwords here anymore, because transmitting them in plain text inside a URI is not secure
Where this form does survive today is mostly FTP downloads
So we usually just write host:port, that is, a host name plus a port number
The host name cannot be omitted, because without it we cannot find the corresponding server
The port can be omitted, in which case the default for the scheme is used: for example, the default port is 80 for HTTP and 443 for HTTPS
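The default-port behaviour can be sketched in a few lines. The table below contains only the two defaults mentioned here, and effective_port is a hypothetical helper, not a standard-library function.

```python
from urllib.parse import urlsplit

# Only the defaults mentioned in this post; a real table would list more schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(uri):
    """Return the explicit port if present, else the scheme's default (or None)."""
    parts = urlsplit(uri)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://naonao.com"))
print(effective_port("https://naonao.com?name=naonao"))
print(effective_port("https://naonao.com:8443/"))
```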
Path
The authority is immediately followed by the path
When a URI has an authority, a non-empty path must start with a slash, so don’t mistake the slash at the start of the path for the end of the preceding authority
The RFC defines several kinds of path: path-abempty, path-absolute, path-noscheme, path-rootless, and path-empty
- path-abempty: a path that begins with / or is empty
- path-absolute: a path that begins with / but not with //
- path-noscheme: a path whose first segment contains no :
- path-rootless: like path-noscheme, but the first segment may also contain :
- path-empty: the empty path
I list all of these only to stay faithful to the specification. Although there are many kinds, they are simple to use: comparing the five, the only thing that differs is how the path is allowed to begin
In practice, as long as the path does not begin with Chinese or other special characters, it is legal
So we just fill in the path according to the actual situation
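To show that the five kinds really do differ only in how the path begins, here is a rough classifier. It is a sketch under simplifying assumptions: it looks only at the start of the string and at whether an authority precedes it, and it does not validate the characters inside the path. The function name and the "invalid" label are my own.

```python
def classify_path(path: str, has_authority: bool) -> str:
    """Name the RFC 3986 path rule that applies, judging only by how the path begins."""
    if has_authority:
        # After an authority the path must be empty or begin with "/".
        if path == "" or path.startswith("/"):
            return "path-abempty"
        return "invalid"
    if path == "":
        return "path-empty"
    if path.startswith("//"):
        # Without an authority, "//" would be misread as the start of one.
        return "invalid"
    if path.startswith("/"):
        return "path-absolute"
    first_segment = path.split("/", 1)[0]
    # A colon in the first segment is only permitted by path-rootless.
    return "path-rootless" if ":" in first_segment else "path-noscheme"


for path, has_authority in [("/Naonao/Source", True), ("", False),
                            ("/a/b", False), ("a/b", False), ("a:b/c", False)]:
    print(repr(path), has_authority, classify_path(path, has_authority))
```

A path that qualifies as path-noscheme also satisfies path-rootless; the sketch reports the narrower name.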
URI encoding
Finally, it’s time to fill in the gap I left earlier
We started with an example of how to download a resource in a world without URIs. The URI I gave there contained Chinese characters, but in fact only ASCII characters can be used in a URI
If a URI contains anything other than ASCII, or if delimiter characters such as ?, #, /, and & appear in it as ordinary data, the URI will be parsed incorrectly.
To avoid this, URIs introduce an encoding mechanism
The rules are very simple. A special character in the ASCII table is replaced by % followed by its ASCII code as two hexadecimal digits: a space is escaped to %20, and ? to %3F
Anything outside ASCII is first converted to bytes (UTF-8 in this example), and each byte is written as % followed by two hexadecimal digits
For example, 闹闹 is escaped to %e9%97%b9%e9%97%b9, and a single 闹 to %e9%97%b9
This is because the UTF-8 encoding of 闹 is the three bytes E9 97 B9, and prefixing each byte with % gives the result above
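The escaping rule can be tried out with Python's urllib.parse.quote and unquote. In this sketch the character is written as the escape \u95f9, which is the character whose UTF-8 bytes are E9 97 B9; quote emits uppercase hexadecimal digits, while unquote accepts either case.

```python
from urllib.parse import quote, unquote

NAO = "\u95f9"  # the character whose UTF-8 encoding is E9 97 B9

utf8_hex = NAO.encode("utf-8").hex()  # the three bytes as hex digits

space = quote(" ", safe="")     # special ASCII character
question = quote("?", safe="")  # delimiter used as data
single = quote(NAO)             # one non-ASCII character, three bytes
double = quote(NAO * 2)         # two characters, six bytes

decoded = unquote("%e9%97%b9%e9%97%b9")  # lowercase hex decodes the same way

print(utf8_hex, space, question, single, double, decoded)
```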
When we type a URI into the browser address bar, it usually works even if it contains Chinese. Behind the scenes, the browser is doing the tedious encoding and decoding for us
This is a very user-friendly design: the user is never shown something unreadable. It is an idea worth learning from
Final words
The URI is something you have to understand when studying network protocols, but it is not difficult overall. It is a little heavy on concepts, and once those are understood there is not much to it
You may ask what the use of learning this is. My answer is that it has no direct use, only indirect use
For example, in back-end development, when we expose an interface and the URI we hand out is not well formed, callers cannot locate our resources, and we end up searching Google for a long time before solving it
Or in front-end development, if an interface call is malformed, say the query parameters of a GET request are written incorrectly, then naturally the call to the back-end interface will fail
Another example: when you pick up an unfamiliar project, you can use the Network panel of the browser to work out which resources it uses and which pages and interfaces it depends on. But if you cannot read a URI, you have to ask your colleagues, and you may get scolded for asking
These problems can be solved by searching Google or by asking colleagues, but the time spent looking things up or asking is our time and our colleagues’ time wasted