Dev.to, by Mark Michon

When monitoring third-party APIs and web services, what you monitor is just as important as how you monitor it. Data is useful, but actionable data is where the real value lies. Below we list the most common and valuable metrics to track when relying on third-party API integrations and web services. Accurate monitoring and alerting can give your business the data it needs to decide which APIs to use, how to build resilient applications, and where to focus its efforts.

Here are the metrics we recommend tracking when you start monitoring APIs or web services.

Latency

Latency is the amount of time a message spends in transit on the network. Here, the lower the number, the better. Latency can come from the network connection between your server and the API server, from network traffic, or from resource overload, in which case throttling requests may help accommodate heavy loads.

To monitor latency, your service needs to record timestamps for outgoing requests and incoming responses and compare them over a given period. This is still tricky, because a full round trip also includes the server's processing time. Where available, pinging the endpoint or calling a health-check endpoint may be the best way to get an accurate latency estimate.
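A minimal sketch of this approach, assuming the provider exposes a cheap health-check endpoint and you pass the request as a callable (the `ping` argument here is a placeholder for that call, not any particular library's API):

```python
import time

def measure_latency(ping, samples=5):
    """Time several calls to `ping` (e.g. a request to a health-check
    endpoint) and return the minimum, which best approximates pure
    network latency since it contains the least server processing time."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        ping()  # the health-check call; assumed cheap on the server side
        timings.append(time.perf_counter() - start)
    return min(timings)
```

In practice `ping` might wrap an HTTP GET against the provider's health endpoint; taking the minimum over several samples filters out one-off network jitter.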

This assessment can be useful when weighing server geolocation, letting your business choose the vendor with the lowest latency. You can also pick a region-specific provider service if latency turns out to be the real cause of slow responses, or a different provider if the resource's response time is the problem. In practice, latency and response time are usually combined into a single value.

Response time

Response time is the time it takes a service to respond to a request. This can be harder to track with third-party APIs and web services, because the latency of sending and receiving data is baked into the response time. You can estimate response times by comparing the response times of multiple resources on a given API. From there, you can estimate the shared latency between the API server and your server and derive the true response time.

Response time has a direct impact on application performance: delays in API responses slow down user interactions. You can mitigate this by choosing an API provider with a guaranteed response time, or by implementing a solution that falls back to backup APIs or cached resources when spikes are detected.
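The fallback idea can be sketched as follows; this is an illustration under the assumption that a stale cached value is acceptable when the provider misbehaves, and `primary`, `cache`, and the threshold are all hypothetical names:

```python
import time

def resilient_fetch(primary, cache, max_response_time=0.5):
    """Call the primary API; fall back to the last cached value when the
    call fails or exceeds `max_response_time` (treated as a spike)."""
    start = time.perf_counter()
    try:
        result = primary()
        if time.perf_counter() - start <= max_response_time:
            cache["last_good"] = result  # remember the latest healthy response
            return result
    except Exception:
        pass  # network error, provider outage, malformed response, ...
    return cache.get("last_good")  # stale data beats no data
```

A slow-but-successful response is treated like a failure here; a production version might still store it and only switch providers after repeated spikes.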

Availability

The availability of an API can be described as downtime or uptime. Both are based on the same data but may be framed differently depending on context.

Availability is probably the easiest metric to track. Outage errors are recognizable, and API providers sometimes announce scheduled outages. However, even the most reliable APIs can encounter unexpected downtime. Downtime can be expressed as a single event or as an average over a given period. While downtime quotas and guarantees such as “99.999% uptime” are valuable when evaluating API providers, even the smallest amount of downtime can have a big impact on your application.
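Converting observed outages into an uptime figure is simple arithmetic; a small helper like this (names are illustrative) makes provider claims comparable to your own logs:

```python
def uptime_percent(period_seconds, outage_durations):
    """Convert a list of outage durations (in seconds) observed during a
    period into an uptime percentage for that period."""
    downtime = sum(outage_durations)
    return 100.0 * (period_seconds - downtime) / period_seconds
```

For scale: a single 86.4-second outage in a day already drops you from 100% to 99.9% uptime, so “five nines” leaves room for well under a second of downtime per day.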

Many APIs rely on infrastructure providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. As a result, an API provider’s downtime can originate with a third party your app never interacts with directly. Even if the API provider’s own service works as expected, the underlying provider may not. For large outages, you will therefore want a fallback that does not depend on the same underlying provider as the original API.

While measured from the same data as downtime, uptime can inform business decisions of its own. You can use this metric to switch providers if you know one API will deliver better uptime for your customers during critical business hours.

Some stakeholders may react to downtime when deciding which API provider to abandon, while others may weigh uptime when deciding which provider to adopt. The numbers are related, but they tell different data stories.

Consumption

When monitoring an API, it’s easy to forget usage, or consumption. Internal APIs may not need this metric, but projections of third-party API usage can inform business decisions that are hard to make without proper data. Consumption can be assessed in aggregate or as a rate. Some API vendors bill monthly, but others enforce rate limits on their pricing tiers and also measure usage over smaller time windows.

By tracking spend and setting high-usage alerts, you can avoid unnecessary costs. It is also helpful to recognize when an API is not being used. A lack of consumption is a sign that an API is still part of your codebase but may no longer matter to your application, in which case you can reprioritize functionality and gain insight into how your application is actually used.
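A sliding-window counter is one common way to implement such an alert; this sketch (class and parameter names are my own, timestamps are passed in explicitly so it works with any clock) flags when call volume inside the window exceeds a limit:

```python
from collections import deque

class UsageTracker:
    """Count API calls inside a sliding time window so a dashboard can
    raise an alert before a rate limit or billing tier is exceeded."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.calls = deque()  # timestamps of recent calls, oldest first

    def record(self, now):
        """Record one call at timestamp `now`; return the window count."""
        self.calls.append(now)
        # evict calls that have fallen out of the window
        while self.calls and self.calls[0] <= now - self.window:
            self.calls.popleft()
        return len(self.calls)

    def over_limit(self):
        return len(self.calls) > self.limit
```

In real use you would call `record(time.time())` on every outgoing API request and wire `over_limit()` to your alerting system.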

Consumption is best treated as a running total that can be filtered by time window, which lets a dashboard provide both an overview and detail about when the API is used.

Failure rate

Requests fail for a variety of reasons. When a request to a third-party API or web service fails, it may be due to user error, API downtime, rate limiting, or various network-related issues. While API failures can sometimes be caused by your own application, when tracking third-party APIs you want to focus on failure rates that are out of your control.

Tracking failures and determining failure rates can help:

  • Report problems to API vendors
  • Make decisions among multiple API providers
  • Make informed decisions about fallback plans
  • Build resilience around certain resources

Some errors come from invalid requests (status codes in the 400 range), which can tell you that your application needs to adjust its internal validation before making a request. Errors from server-side problems (status codes in the 500 range) indicate the issue likely lies with the API or web service provider.
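The split described above can be computed directly from your request logs; a small sketch (assuming you log the numeric status code of every call):

```python
def failure_rates(status_codes):
    """Split logged HTTP status codes into a client-side (4xx) failure
    rate and a provider-side (5xx) failure rate."""
    total = len(status_codes)
    client = sum(1 for c in status_codes if 400 <= c < 500) / total
    server = sum(1 for c in status_codes if 500 <= c < 600) / total
    return client, server
```

A rising client-side rate points at your own validation; a rising provider-side rate is evidence to bring to the vendor or to justify a fallback.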

Status codes

Tracking HTTP responses can give you fine-grained detail about individual APIs, but tracking specific status codes gives you better insight into the type of problem. For example, some API providers respond with a 200 OK status even when an error occurs. This false signal may lead you to believe everything is working as expected while users are experiencing problems and your internal logging tells a different story.
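To catch these misleading 200s, your logging can inspect the response body as well as the status code. A sketch, where the `"error"` key is purely an assumption about the payload shape and must be adapted to each provider's actual format:

```python
def is_true_error(status_code, body):
    """Treat a response as failed if the status code signals an error
    OR the body carries an error payload despite a 200 OK.
    The "error" key is an assumed convention, not a standard."""
    return status_code >= 400 or body.get("error") is not None
```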

Compare the API provider’s status-code metrics with your internal error log to get a better idea of the true error rate of the third-party web services your application depends on.

Wrapping up

With these metrics in mind, your application can better handle the inevitable problems that come with relying on third-party integration.

Measuring all of this may sound like a daunting task. Fortunately, developer tools such as Bearer can help monitor many of these metrics and react to problems automatically as they arise.