Every once in a while you’ll come across an article about GraphQL, but when you open it, it turns out to be little more than a brief introduction to GraphQL’s features. Many of these articles simply find a GraphQL project on GitHub and run its demo. Some obviously boast about GraphQL’s advantages without any experience of taking it through a complete project, which leads readers who are unfamiliar with GraphQL to think it is a magic elixir that they must try in their own projects.

Because of the project background (covered later), I had the chance to take part in rolling out GraphQL in a real project. In this article I will share my understanding of GraphQL. Of course, it is for reference only.

GraphQL’s advantages

You may have run into this scenario: a service exposes dozens or even hundreds of interfaces. To build a single feature, an app or other downstream caller has to call about 10 of them, possibly owned by different teams. The call chain is too long, which is painful for development, debugging, testing, and for the caller. As these features go through many iterations, nobody dares to make large changes to interfaces that mix new and old versions; people can only bolt more code onto what is already there, and that is basically where “ancestral code” comes from. Most developers care about code quality, yet they can only watch this ancestral code slowly rot. The reason is simple: nobody can guarantee that nothing will break after a change.

Is there a way to aggregate these interfaces and return one combined result to the front end? In the microservices architecture that is popular today, a dedicated intermediate layer handles this task. It is called the BFF (Backend For Frontend). A company I once worked for wanted to offer the capabilities of one business line to other business lines in the company, but the teams that were supposed to integrate them refused as soon as they saw how many interfaces were involved. As a result, the business platform had to develop a BFF in a hurry so that the other businesses could integrate.

If you work at a slightly larger company, you may well be merging various interfaces and returning the result to the caller, which is basically what a BFF does. Apart from building a BFF access platform, is there any other way to get the return values of a whole set of interfaces with a single request?

I took a screenshot of a random product page in the Jingdong (JD.com) app.

When a user opens a page like this one, with the REST-style interfaces that are popular today, the app has to issue at least the following requests:

  • Product details interface
  • Product price and discount interfaces
  • Product reviews interface
  • Buyer show (“grass show”) interface
  • Q&A interface

These interfaces are generally heavy and contain many fields that the current page does not need. Is it possible for the app to obtain all the fields the page needs with a single request, and to request only the fields it actually needs?

The answer is yes, it’s GraphQL.

query jdGoodsQuery {
    goods {
        detail {
          id
          pictures(first: 10) {
            pic_id
            thumb
          }
          spec {
            name
            size
            weight
          }
        }
        price {
          price
          origin_price
          market_price
        }
        comment(first: 10) {
          comment_id
          topic_id
          content
          from_uid
        }
        self_show(first: 10) {
          id
          pic_id
        }
    }
}

For the JD product details page in the screenshot above, a query like this fetches all the fields the page needs.
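
To make the idea of “only the fields you ask for” concrete, here is a minimal Python sketch. It is not how any particular GraphQL server is implemented; the project function, the nested-dict selection format, and the sample data are all assumptions made up for illustration.

```python
def project(data, selection):
    """Keep only the fields named in `selection` (a nested dict) from `data`."""
    if isinstance(data, list):
        # Apply the same selection to every item of a list field.
        return [project(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        if isinstance(sub_selection, dict):
            result[field] = project(value, sub_selection)
        else:
            result[field] = value
    return result


# Hypothetical "heavy" record, as a REST interface might return it.
goods = {
    "detail": {"id": 1, "title": "demo", "spec": {"name": "box", "size": "M", "weight": 2}},
    "price": {"price": 99, "origin_price": 129, "market_price": 139, "currency": "CNY"},
}

# The page only asks for a few of those fields.
selection = {"detail": {"id": True, "spec": {"name": True}}, "price": {"price": True}}
page_data = project(goods, selection)
```

In this sketch, fields such as title or currency never leave the server, which is the behavior the query above relies on.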

REST interfaces have another tricky problem. When services are upgraded, interfaces inevitably have to change too. A common case is that a field is no longer needed after a new version ships. How do you gracefully retire the old interface and the old field?

Some people may say you can force downstream callers to upgrade by a deadline, and if they do not, it is no longer your responsibility. That is only fooling yourself: when your interface goes offline and the business side starts reporting errors, you are the one who has to deal with it. The problem gets even harder when some of the interfaces are consumed directly by end users.

In 2016 I worked at an AI product company that launched an intelligent reading product for children. Its basic function was to read picture books page by page. It was a fairly successful home AI product, and it sold reasonably well on Tmall and Jingdong. AI was riding a wave of hype at the time. The product was initially positioned as a book reader, but as the market expanded, its features gradually became more complicated. The most painful part was that the interface version number went from V1 to V12 in just 3 months. The product philosophy at the time held that forced upgrades were ugly and did not fit the product’s design aesthetics, so the product had no forced upgrade feature. As a result, every interface version from V1 to V12 still had users on it. You might suggest sending announcements and SMS messages telling users to upgrade within a certain period, and declaring that the company is not responsible for failures if they do not. That is not feasible either, especially for a product users have paid for: if it stops working, they will complain to 12315, the consumer complaint hotline.

While we were being tortured by API versions that could not be taken offline, we did some research and found that GraphQL has exactly the feature we wanted: “API evolution without versioning.” It was like being handed a pillow just as we were dozing off. So, under the leadership of our technical director (who left for Microsoft before the GraphQL migration project launched), we started working on the GraphQL migration.

Those are GraphQL’s most appealing advantages. Keep in mind that no technology is a “silver bullet.” When someone talks about the benefits of a technology without mentioning its limitations or drawbacks, be careful: you may be the guinea pig.

Below are the problems I ran into in the GraphQL project. My handling of them may not have been correct, so the following points are for reference only.

GraphQL’s problems

Community activity issues

GraphQL was developed by Facebook, and Facebook also set up the GraphQL Foundation. However, Facebook officially provides only the JS open source implementation; implementations in other languages are written by unofficial community projects for each language. This leads to differences in interpretation and implementation between languages. For example, only the official JS implementation has a tool for merging GraphQL schemas.

GraphQL is one of those things that everyone thinks is great, yet after nearly 10 years of development its adoption is still lukewarm.

This is the official GraphQL landscape. If you look closely at the company logos in it, GitHub is the only big company. Facebook released the GraphQL specification and the JS implementation, but did not publicly expose a GraphQL API of its own, which undermines the technology’s credibility.

When you run into a GraphQL problem, the answers you find are either about the JS implementation or nonexistent. The GraphQL community has very thorough documentation, but there is little useful information to be found when you search for a specific problem.

The cache problem

Caching is easy to handle for REST interfaces, but it gets very complicated with GraphQL. By its nature, every GraphQL query may be different, even when the queries operate on the same entity.

Because the client can choose the fields it needs, one request may only need a person’s name, while another may also want that person’s credits. The name may come from the User.userProfile database, while the credits may come from a third-party system. One query may take a single user_id as input, and the next may take a userId_list. To cache these queries you might end up writing many keys, or many key-value pairs, per user into the cache, which is not an elegant solution. The bottom line is that GraphQL is so flexible that the server’s cache design cannot keep up with the client’s queries.
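
The key explosion can be illustrated with a small Python sketch. Everything here is a made-up assumption for illustration (the function names, the key formats, and the two sample queries); it is not taken from our project or from any GraphQL library.

```python
import hashlib
import json


def response_cache_key(query, variables):
    """Key for caching a whole response: depends on the exact query text and variables."""
    normalized = " ".join(query.split())  # collapse whitespace differences only
    payload = json.dumps({"query": normalized, "variables": variables}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def field_cache_key(type_name, entity_id, field):
    """Key for caching one field of one entity, shared by every query that reads it."""
    return f"{type_name}:{entity_id}:{field}"


name_only = "query ($id: ID!) { user(id: $id) { name } }"
name_and_credits = "query ($id: ID!) { user(id: $id) { name credits } }"

# Same user, different field sets: the whole-response keys do not match.
key_a = response_cache_key(name_only, {"id": 42})
key_b = response_cache_key(name_and_credits, {"id": 42})

# Per-field keys are reusable across both queries, but multiply quickly per user.
shared_key = field_cache_key("User", 42, "name")
```

Caching whole responses gives a poor hit rate because almost every field combination produces a new key, while caching per field gives reuse at the price of one cache entry per user per field. Neither option is as simple as caching a REST response by URL.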

Facebook does offer DataLoader for this problem, but again only as a JS implementation; other language communities may not have an equivalent. When I evaluated it at the time, I found it awkward to use, and building our own cache was faster.
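
For readers who have not seen the batching idea, here is a rough Python sketch of it. This is not DataLoader’s API and not what we built in the project; BatchLoader, fetch_names, and the thunk-style load method are hypothetical names and design choices picked only to show the principle of collecting keys first and fetching them in one call.

```python
class BatchLoader:
    """Collect keys, fetch them in one batch call, and cache the results."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # takes a list of keys, returns values in the same order
        self._cache = {}
        self._pending = []

    def load(self, key):
        """Queue a key and return a zero-argument function that yields its value later."""
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)
        return lambda: self._get(key)

    def _get(self, key):
        if key not in self._cache:
            self.dispatch()
        return self._cache[key]

    def dispatch(self):
        """Fetch every queued key with a single call to the batch function."""
        if not self._pending:
            return
        keys, self._pending = self._pending, []
        values = self._batch_fn(keys)
        self._cache.update(zip(keys, values))


calls = []  # records each backend round trip


def fetch_names(user_ids):
    # Stand-in for one database or RPC call that accepts many ids at once.
    calls.append(list(user_ids))
    return [f"user-{uid}" for uid in user_ids]


loader = BatchLoader(fetch_names)
thunks = [loader.load(uid) for uid in (1, 2, 1, 3)]
names = [thunk() for thunk in thunks]
```

Four lookups turn into a single backend call for three distinct ids. The sketch only removes duplicate work within one request; it does not answer the harder question of what to keep in a shared cache across requests.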

GraphQL caching is not only unfriendly to the server but also a challenge for the client. Clients have to implement their own caching, because all GraphQL queries go through a single route and are sent as POST requests.

The gateway problem

GraphQL is strongly typed, so a schema must exist. In our experience the client works with a single schema file, and that creates another tricky problem.

Suppose our service looks something like the following:

For example, server1 is a product service and server2 is a discount service. The client cannot connect to these two services directly, because the client can have only one schema while the servers have two schema files.

How do you deal with this? The JS ecosystem provides a schema merge tool, and it is only a tool. No other language has anything like it.

Another very serious problem with this design is that none of the current API gateways are really usable with it. As your business grows, moving to microservices is only a matter of time, but if the servers are all GraphQL based, what does the gateway do? I recently looked at the latest versions of APISIX and Kong, two influential gateways in the industry, and they only support forwarding GraphQL requests.

In my opinion, in today’s microservices world the only scenario that suits GraphQL is replacing a REST-based BFF with a GraphQL one that acts as both gateway and business layer. Even this is not perfect: the client can only have this one BFF, and the BFF is no longer a pure BFF. Some teams do use GraphQL for BFF work, but instead of exposing GraphQL to clients, they put a layer of HTTP interfaces on top of it.

The complexity problem

The biggest benefit of GraphQL is that the client can query on demand. That is convenient for the client, but it shifts the complexity to the server. The server cannot simply run whatever the client asks for: its resources are limited, so it cannot let clients request without bound.

query deep3 {
  viewer {
    albums {
      songs{
        author {
          company {
            address {
              ...
            }
          }
        }
      }
    }
  }
}

GraphQL lets a query move from one type to any other type that is reachable from it in the schema. A query like the one above can therefore be nested indefinitely, and each type in the query triggers a corresponding lookup on the server, which the server certainly cannot afford.

How do you limit it? GraphQL offers the concepts of complexity and depth, but how these two values are calculated is left for the server developer to estimate. At the start of development we agreed on a complexity limit of 1000; two days later the client developers came asking to raise it to 3000. It was endless wrangling. Whatever value we set, it was never enough for some query, because client queries are so varied. Moreover, complexity and depth are global settings that cannot be configured per query, which ends up making the two values nearly useless.
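
To show what a depth limit actually measures, here is a rough Python sketch that counts brace nesting in the query text. It is an assumption-laden illustration, not a real validator: query_depth and is_within_depth are hypothetical names, the operation’s outer braces count as level 1, and braces inside input-object arguments or block strings are not treated specially. A real server would work on the parsed query instead of raw text.

```python
def query_depth(query):
    """Return the deepest level of { } nesting, ignoring strings and # comments."""
    depth = 0
    max_depth = 0
    in_string = False
    in_comment = False
    escaped = False
    for ch in query:
        if in_comment:
            if ch == "\n":  # a comment runs to the end of the line
                in_comment = False
            continue
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == "#":
            in_comment = True
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def is_within_depth(query, max_depth):
    """Reject a query before execution if it nests deeper than the agreed limit."""
    return query_depth(query) <= max_depth


shallow = "query { viewer { albums { id } } }"
allowed = is_within_depth(shallow, 5)
```

The check itself is cheap. The hard part, as described above, is agreeing on the limit: a single global number is either too strict for legitimate pages or too loose to stop an expensive query.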

The rate limiting problem

Rate limiting is also one of the hardest problems to solve in GraphQL. A server cannot go without rate limiting, otherwise its stability cannot be guaranteed. Because REST endpoints are fixed, limiting traffic per URI is easy. So what makes rate limiting hard in GraphQL?

query maliciousQuery {
  album(id: "some-id") {
    photos(first: 9999) {
      album {
        photos(first: 9999) {
          album {
            photos(first: 9999) {
              album {
                #... Repeat this 10000 times...
              }
            }
          }
        }
      }
    }
  }
}

What problem does this request cause? The client sends maliciousQuery, which looks up the album with ID some-id. The query fetches up to 9999 photos in the album, then for each photo the album it belongs to, and so on, nested indefinitely. The server has no defense against such queries. The complexity and depth limits mentioned above help, but only to a limited extent.

I once ran into exactly this situation in production. The GraphQL project had been deployed, with complexity and depth configured. When a client developer fetched the paginated product list, the query also pulled in the product details and the content cascading from those details, and the server ran out of memory (OOM). The cause was similar to the example above: too many nested queries. The problem does come down to complexity and depth, but those two values are really hard to estimate.

So the rate limiting problem in GraphQL is that the client makes only one request, but that request can be amplified many times over on the server. Finding a value that still lets clients nest queries reasonably takes more than the officially provided complexity and depth settings.
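
The amplification is easy to put into numbers. The Python sketch below is only back-of-the-envelope arithmetic under my own assumptions: worst_case_nodes and CostBudget are hypothetical names, every list is assumed to return its full page size, and the budget has no time window, which a real limiter would need.

```python
def worst_case_nodes(fanouts):
    """Sum the nodes resolved at every level, given each level's maximum fan-out."""
    total = 0
    nodes_at_level = 1
    for fanout in fanouts:
        nodes_at_level *= fanout
        total += nodes_at_level
    return total


class CostBudget:
    """Per-client budget charged by estimated cost instead of by request count."""

    def __init__(self, limit):
        self.limit = limit
        self.spent = 0

    def try_spend(self, cost):
        if self.spent + cost > self.limit:
            return False  # reject before executing the query
        self.spent += cost
        return True


# album -> photos(first: 9999) -> album -> photos(first: 9999)
two_rounds = worst_case_nodes([1, 9999, 1, 9999])

budget = CostBudget(limit=10_000)
cheap_ok = budget.try_spend(worst_case_nodes([1, 20]))  # one album, a page of 20 photos
malicious_ok = budget.try_spend(two_rounds)
```

Just two rounds of album and photos already add up to 100,000,000 nodes in the worst case, from a single HTTP request. Counting requests per client cannot see this; charging an estimated cost can, but the estimate is only as good as the fan-out numbers fed into it.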

The good news is that GraphQL provides a way to evaluate rate limiting, and there is also a Chinese-language write-up of the theory behind solving GraphQL rate limiting problems. The theory is one thing, though; the implementation remains difficult.

Conclusion

This article is mainly about my experience of putting GraphQL into production. Some time has passed since then, and my impression of GraphQL is still stuck on these unsolved problems. Someone once asked me to refactor a service with GraphQL; I got rather worked up and talked them out of the idea. There may be mistakes in this article, and readers are welcome to point them out.