Choosing a technical solution is fascinating: every link in the chain offers several options, and they combine into many possibilities. Sorting out the best of these combinations is something I really enjoy.

I recently finished Klib’s annotation-sharing feature. While it is still fresh in my mind, I want to summarize what I struggled with along the way and what I finally chose.

0) Let’s look at the end result first

Here is how sharing annotations from Klib works: click Share and you instantly get a web page that is accessible worldwide. The operation couldn’t be simpler, but the technical logic behind it is complicated.

In practice the pieces were developed in an interleaved way, but for the sake of presentation I’ll roughly follow the data flow.

1) Klib and the interface server

Functionally, this part is straightforward: Klib sends the annotation content to the interface server, which processes it and returns the result.

What deserves explanation lies outside the functionality:

  • How to prevent the interface from being attacked
  • How to identify users

This part was actually quite complicated, and I ended up with a solution commensurate with the level of security Klib needs.

1.0) Preventing the interface from being attacked

1.0.0) Interface server uses HTTPS

This is the most basic yet very effective measure: using HTTPS encryption end to end already improves security greatly.

1.0.1) Prevent the interface from being used illegally

If the interface is open and everyone can access it at will, the server can be flooded with junk data at will.

A good practice here is asymmetric encryption, which uses a key pair made up of a private key and a public key. Data encrypted with the private key can only be decrypted with the public key. Conversely, data encrypted with the public key can only be decrypted with the private key. The overall process is roughly as follows:

  • The interface server publishes public key A
  • Each Klib client generates a new key pair: private key B and public key B
  • The Klib client encrypts public key B with public key A and sends it to the interface server
  • The interface server decrypts it with private key A and stores the client’s public key B
  • After that, the Klib client encrypts the data it sends with private key B; on receipt, the interface server decrypts it with public key B and encrypts the returned data with public key B

Sounds like a tongue twister?

It is also somewhat troublesome to develop, since the server has to store the public key of every Klib client. With multiple servers, the public keys must be synchronized between them, which is even more troublesome. For a small product and an experimental feature like mine, such a high level of security is unnecessary.

Instead, I used the simpler, but sufficient, AES symmetric encryption. That is, the Klib client and the interface server use the same AES encryption method and the same password to encrypt the request and response data. A request that is not correctly encrypted cannot use the server interface.
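To make the shared-key idea concrete, here is a minimal sketch. It assumes the third-party `cryptography` package and AES in GCM mode; the article does not say which library, mode, or key handling Klib uses, so `SHARED_KEY`, `encrypt_payload`, and `decrypt_payload` are hypothetical names.

```python
import os
from typing import Optional

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Hypothetical shared secret: in this sketch the client and the server hold the
# same 256-bit key. A real key would be provisioned once, not generated per run.
SHARED_KEY = AESGCM.generate_key(bit_length=256)


def encrypt_payload(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt with AES-GCM and prepend the random 12-byte nonce."""
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)


def decrypt_payload(key: bytes, blob: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the blob was not made with this key."""
    nonce, ciphertext = blob[:12], blob[12:]
    try:
        return AESGCM(key).decrypt(nonce, ciphertext, None)
    except (InvalidTag, ValueError):
        # Wrong key, tampered data, or a malformed blob: reject the request.
        return None
```

With this shape, the server can simply drop any request for which `decrypt_payload` returns `None`, which matches the idea that an improperly encrypted request cannot use the interface.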

The main risk is that hackers can decompile Klib to obtain the password. In addition to compiling and signing Klib itself, I also encrypt the password stored in the code. Basically, 99.9999% of people won’t bother to crack it, unless they have made it their life’s mission to pick a fight with me.

1.0.2) Use timestamp + MD5

Even encrypted data ends up as an HTTP request, which can be captured locally and replayed to imitate a normal user’s request.

The corresponding protection is to add a timestamp to the HTTP request, compute an MD5 (or CRC) digest over the content part of the HTTP header, and verify it on the server side to ensure that the HTTP header is not abused.
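The timestamp-plus-digest check can be sketched as follows. This is only an illustration using the standard library: the exact fields Klib signs are not given in the article, so `SECRET`, `MAX_SKEW_SECONDS`, `sign_request`, and `verify_request` are assumptions.

```python
import hashlib
import hmac
import time

SECRET = b"demo-secret"   # hypothetical shared secret, for illustration only
MAX_SKEW_SECONDS = 300    # hypothetical freshness window


def sign_request(body: bytes, timestamp: int, secret: bytes = SECRET) -> str:
    """MD5 digest over the timestamp, the body, and the shared secret."""
    message = str(timestamp).encode() + b"\n" + body + b"\n" + secret
    return hashlib.md5(message).hexdigest()


def verify_request(body, timestamp, signature, now=None, secret=SECRET) -> bool:
    """Accept only fresh requests whose digest matches."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # stale timestamp: most likely a replayed request
    expected = sign_request(body, timestamp, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

The client would send the timestamp and the digest alongside the request, and the server would call `verify_request` before doing anything else. A captured request stops working once the window has passed; blocking replays inside the window would need an extra per-request nonce, which this sketch leaves out.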

Actually, this falls within OAuth territory. Fortunately, while developing iPic I had implemented, one after another and from the client side, the OAuth flows of Qiniu, Upyun, Alibaba Cloud, Imgur, Flickr and Amazon S3. This time I implemented a simple part of the server side, which was not much trouble.

1.1) How to do identification

So much for protection against hackers. Feeling a little dizzy? Now let’s talk about ordinary identification.

For example, if a user wants to stop sharing, how do I determine whether that user has permission?

This is easier to solve with an account system, but Klib does not have one yet. The fancier approach would be blockchain (ahem). My current approach: when a user shares a book’s annotations with Klib, the server returns a random number. Later, when the user stops sharing, the request is considered valid as long as it provides this random number. With all the protections above in place, this effectively prevents malicious stop-sharing requests.
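The random-number scheme can be sketched in a few lines. The in-memory dictionary stands in for the real data store, and `create_share` and `stop_share` are hypothetical names, not Klib’s actual API.

```python
import hmac
import secrets

# In-memory stand-in for the server's data store (hypothetical).
share_tokens = {}


def create_share(share_id: str) -> str:
    """Register a share and return the random token the client must keep."""
    token = secrets.token_urlsafe(32)
    share_tokens[share_id] = token
    return token


def stop_share(share_id: str, token: str) -> bool:
    """Stop sharing only if the caller presents the token issued at creation."""
    expected = share_tokens.get(share_id)
    if expected is None:
        return False
    if not hmac.compare_digest(expected.encode(), token.encode()):
        return False
    del share_tokens[share_id]
    return True
```

The client stores the token returned by `create_share` locally and sends it back when the user stops sharing; anyone who merely knows the share’s URL cannot delete it.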

2) Interface server

The interface server is the most complex part of the whole system, and it has many responsibilities:

  • Validate the request and receive the data
  • Store the data
  • Generate static web pages from the data
  • Transfer the static web pages to the static servers
  • Update the data store and the static servers when shares are updated or deleted

Request validation was covered above, so it is omitted here.

2.0) Implement the functional parts using Python + Flask

An interface server must first of all expose interfaces (open the door to receive guests); concretely, that means an HTTP routing table. For example, when a Klib client sends data to api.klib.me/share, there is code that receives and processes the request.

I covered the Flask framework in an earlier article, the basic tutorial I wrote up after getting started with Python, so I won’t repeat it here.
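For readers who have not seen that tutorial, a route might look roughly like the sketch below. It assumes Flask is installed; the `markdown` and `title` field names and the response shape are hypothetical, since the article does not describe the real request format.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/share", methods=["POST"])
def share():
    """Receive annotation data and answer with a JSON result."""
    data = request.get_json(silent=True)
    if not data or "markdown" not in data:
        return jsonify(error="missing markdown"), 400
    # Validation, storage and page generation would happen here.
    return jsonify(status="ok", title=data.get("title", "")), 200
```

Flask’s built-in test client is enough to try the route without starting a server, for example `app.test_client().post("/share", json={"markdown": "# T"})`.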

2.1) Use Nginx + Gunicorn to build the server

As above, please refer to the basic tutorial I wrote after getting started with Python.

In addition, a container is used to keep the service running reliably.

2.2) Use MySQL + SQLAlchemy to store data

From a storage point of view, a book’s data is neatly structured: title, author, annotations, and so on. So I chose the most common kind of database, a relational one: MySQL.

Operating the database directly with SQL statements is tedious and unsafe, so I use SQLAlchemy, arguably the de facto standard for Object Relational Mapping (ORM) in the Python world, to build the models and operate the database.

I was going to say “there’s nothing to explain”, but in fact MySQL has a lot of pitfalls. For example, to support Emoji you must use the utf8mb4 encoding throughout. There are many other pitfalls, 10,000 words omitted here…
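As a rough sketch of what such a model could look like (assuming SQLAlchemy 1.4 or later): the `Share` table and its columns are hypothetical, and an in-memory SQLite database is used only so the example runs without a MySQL server.

```python
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Share(Base):
    """Hypothetical table holding one shared book's annotations."""
    __tablename__ = "shares"

    id = Column(Integer, primary_key=True)
    title = Column(String(255), nullable=False)
    author = Column(String(255))
    markdown = Column(Text, nullable=False)


# SQLite keeps the sketch self-contained. With MySQL the URL would look like
# "mysql+pymysql://user:password@host/dbname?charset=utf8mb4" (placeholders),
# where charset=utf8mb4 is what lets Emoji survive the round trip.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)


def save_share(title, markdown, author=None) -> int:
    """Insert one share and return its primary key."""
    with Session(engine) as session:
        share = Share(title=title, author=author, markdown=markdown)
        session.add(share)
        session.commit()
        return share.id


def load_markdown(share_id):
    """Return the stored Markdown, or None if the share does not exist."""
    with Session(engine) as session:
        share = session.get(Share, share_id)
        return None if share is None else share.markdown
```

No SQL strings are written by hand here; the ORM builds parameterized statements, which is where the safety benefit comes from.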

2.3) Generate static web pages using Jinja templates

For the annotation part, Klib sends Markdown format, such as:

# Think simple

## Frontispiece

- The essence of business is to "keep delivering what people really want" and nothing else.
- Recruit people with the passion and ability to respond to user needs and create an unfettered environment in which their talents can be maximized, nothing else.

## Chapter 1 Business is Not "War"

- The important thing is to hone the perception of the "real needs of the masses" and the technology to materialize them.
- Unlike sports, you don't have to fight anyone.

This needs to be converted to HTML with a Markdown module, like so:

<h1>Think simple</h1>
<h2>Frontispiece</h2>
<ul>
   <li>
      <p>The essence of business is to "keep delivering what people really want" and nothing else.</p>
   </li>
   <li>
      <p>Recruit people with the passion and ability to respond to user needs and create an unfettered environment in which their talents can be maximized, nothing else.</p>
   </li>
</ul>
<h2>Chapter 1 Business is Not "War"</h2>
<ul>
   <li>
      <p>The important thing is to hone the perception of the "real needs of the masses" and the technology to materialize them.</p>
   </li>
   <li>
      <p>Unlike sports, you don't have to fight anyone.</p>
   </li>
</ul>

Time for some praise: Python has a wheel for everything. Just import the markdown module and you can elegantly and comfortably convert Markdown to HTML.

import markdown

# markdown_str holds the Markdown text received from the Klib client
html_str = markdown.markdown(markdown_str)

In the generated static site, the CSS/JS and similar parts are the same for every page, while the page title, body and other content differ. So a Jinja template holds the common parts, with placeholders such as {{title}} marking the parts that differ for each share. The render_template method fills in the template to generate the corresponding static file.
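The template step can be illustrated with a small sketch. It uses the `jinja2` package directly rather than Flask’s `render_template`, so that it runs outside a Flask application; `PAGE_TEMPLATE` and `render_share` are hypothetical and far simpler than a real page.

```python
import jinja2

# Hypothetical page template: the shared layout with placeholders for each share.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><title>{{ title }}</title></head>
<body>
<h1>{{ title }}</h1>
{{ body_html | safe }}
</body>
</html>"""

# autoescape=True escapes {{ title }}; the body is marked safe because it is
# HTML the server produced itself from the Markdown.
env = jinja2.Environment(autoescape=True)
template = env.from_string(PAGE_TEMPLATE)


def render_share(title: str, body_html: str) -> str:
    """Fill the common template with one share's title and converted body."""
    return template.render(title=title, body_html=body_html)
```

The string returned by `render_share` is what would be written out as the static HTML file for that share.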

A word of admiration: an operation this simple and direct, with no complicated configuration, that still gets you exactly what you want is truly the loveliest part of programming.

3) Static server and CDN

Having static web pages is like having a treasure: you can’t just hide it, you have to show it to everyone. That’s what a static (web) server does.

Of course, the static server and the interface server can physically be the same machine; the distinction here is one of roles.

For serving the static web pages, the technology choice has two main requirements:

  • Web content can be updated in real time
  • Fast user access speed

Of these, content updates correspond to the following three operations:

  • Create a share
    • Create a static HTML file
  • Update shared content
    • Update the content of the HTML file
  • Stop sharing
    • Delete the HTML file
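On a single static server, the three operations above reduce to plain file operations. The sketch below is one possible shape, using only the standard library; `publish_page` and `unpublish_page` are hypothetical helpers, and `share_id` is assumed to be generated by the server, not taken from user input.

```python
import os
import tempfile
from pathlib import Path


def publish_page(root: Path, share_id: str, html: str) -> Path:
    """Create or update a share by atomically replacing its HTML file."""
    root.mkdir(parents=True, exist_ok=True)
    target = root / f"{share_id}.html"
    # Write to a temporary file first so visitors never see a half-written page.
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(html)
    os.chmod(tmp_name, 0o644)  # mkstemp files are private by default
    os.replace(tmp_name, target)
    return target


def unpublish_page(root: Path, share_id: str) -> bool:
    """Stop sharing by deleting the HTML file; report whether it existed."""
    target = root / f"{share_id}.html"
    try:
        target.unlink()
        return True
    except FileNotFoundError:
        return False
```

Creating and updating share one code path because `os.replace` overwrites an existing file in a single step.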

With the requirement to create, update, and delete HTML files “in real time” in mind, let’s look at how to speed up access.

3.0) 😔 Use only a single server

First, the do-nothing option: users all over the world (Klib has to be an international product and consider global users, yeah) all connect to this one server.

Leaving aside the limit on concurrent connections, consider network speed alone: if the server is located in China, access from abroad will be slow, and vice versa. On top of that, China itself has China Telecom, China Netcom, and the magical Great Wall Broadband, and abroad there are N more countries.

If you really want to go this way, the better choice is an Alibaba Cloud server in Hong Kong, which can serve both domestic and overseas users. I’m not using this option for now, which saves $19 per month…

3.1) 😔 CDN

Going a step further, the common practice is to use a CDN.

A CDN can effectively improve access speed across regions and network environments, and greatly reduces the load on the static server. However, a CDN has a fatal limitation: content updates are slow. This slowness can cause business problems, especially when content is updated or deleted.

For example, if a user shares annotations in Klib, then stops sharing, only to find that the page is still accessible, the user will think this is a bug, which puts great pressure on customer service. So I skipped this option.

3.2) 😔 Multiple domestic and foreign servers

The next option is to deploy one or more servers in China and several abroad, and route users to them through DNS, which amounts to a self-built CDN.

Unexpectedly, I hit a pitfall: the international bandwidth of servers in China is generally slow. For example, I tried an Alibaba Cloud Shanghai node, and transferring a 10 KB file from an overseas server with scp or rsync took 4 s, which was jaw-dropping. Besides, I had only bought a thin 1 Mbps pipe on Alibaba Cloud, so access would be slow anyway. This plan was abandoned.

3.3) 😃 Final solution

How to get pages to open within a second nationwide, along with a more detailed analysis, is covered on the “comfortable development” WeChat official account:

Free development


3.4) Search engine optimization

4) Adaptation of web pages

5) Other

5.0) Distinguish between development, test, and online environments

5.1) Service monitoring

5.2) Keep logs

5.3) Make sure you have a backup

6) Hindsight (playing Zhuge Liang after the fact)