Problem background
To identify user preferences more accurately after a feature is released, the product team needs A/B testing support: set up experiments, collect experiment data, and analyze the results afterwards.
Practical goals
The cache layer in front of the consumer-facing (C-end) application distinguishes A and B traffic, which reduces back-to-origin requests, improves response speed for users, and relieves pressure on the origin.
Technical research
Multi-variant distribution on a distributed node network
AWS CloudFront improves the user experience by delivering content faster through accelerated distribution and regional edge caching.
Dynamic page content from the origin is cached with the path as the first dimension and the client (C-end) device type as the second dimension, forming a multi-page, multi-device cache.
One goal of A/B testing is to leave the user experience unaffected, so the quantitative comparison has to happen within the same page dimension. An “experiment type” is therefore added to the second dimension of the content cache, and each page is cached in multiple variants.
Example:
A user-defined experiment type header, “Cloudfront-ab-experiment”
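To make the cache dimensions concrete, here is a minimal sketch of how a cache key could be modeled from the path, the device type, and the experiment header. It is an illustrative model only, not CloudFront's internal implementation; `CacheKey`, `cache_key`, and the `"none"` default are hypothetical names chosen for this example.

```python
from typing import Mapping, NamedTuple

# Header names are lowercased here, as they are in Lambda@Edge events.
EXPERIMENT_HEADER = "cloudfront-ab-experiment"


class CacheKey(NamedTuple):
    path: str         # first dimension: the page path
    device_type: str  # second dimension: client device type
    experiment: str   # second dimension: experiment type (A, B, ...)


def cache_key(path: str, device_type: str, headers: Mapping[str, str]) -> CacheKey:
    """Build an illustrative cache key; requests outside any experiment share 'none'."""
    experiment = headers.get(EXPERIMENT_HEADER, "none")
    return CacheKey(path, device_type, experiment)
```

Two requests for the same path and device type but different experiment values produce different keys, so each variant is cached separately.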
Applying edge computing
The primary questions in A/B testing are how to split traffic and at which layer the split happens.
(1) Set the traffic allocation ratio. Within an A/B test group, each page is the first dimension; the device type, the user category label, and the functional block form the second dimension. (After the allocation ratio is changed, only data sampled after the change takes effect should be used.)
(2) If the traffic split is computed at the origin, the statistics cover CDN back-to-origin traffic rather than real C-end access traffic.
Sending all page traffic back to the origin is a poor solution when the user experience depends heavily on the edge cache.
Therefore a new technology, Lambda@Edge, needs to be introduced to compute and split traffic on the CDN nodes, which preserves the user experience while still performing the split.
Docs.aws.amazon.com/zh\_cn/Amaz…
PV-level A/B split
Viewer request
Page view (PV): each page request is assigned to A or B with a probability given by the allocation ratio.
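A minimal sketch of a PV-level split as a Lambda@Edge viewer request handler in Python follows. The 25% share for B and the helper name `choose_variant` are assumptions for illustration; the header name comes from the example earlier in this article.

```python
import random

EXPERIMENT_HEADER = "cloudfront-ab-experiment"  # header keys are lowercase in the event
RATIO_B = 0.25  # assumed share of page views sent to variant B


def choose_variant(ratio_b, rand=random.random):
    """Return 'B' with probability ratio_b, otherwise 'A'."""
    return "B" if rand() < ratio_b else "A"


def lambda_handler(event, context):
    """Viewer request trigger: tag every page view with a freshly drawn variant."""
    request = event["Records"][0]["cf"]["request"]
    variant = choose_variant(RATIO_B)
    request["headers"][EXPERIMENT_HEADER] = [
        {"key": "Cloudfront-ab-experiment", "value": variant}
    ]
    return request  # CloudFront continues processing with the tagged request
```

Because the draw is repeated on every request, the same user may see A on one page view and B on the next, which is what distinguishes the PV level from the UV level.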
UV-level A/B split
Viewer request + viewer response
Unique visitor (UV): when a new user visits the page, the user is labeled as a type A or type B user with a probability given by the allocation ratio. Type A and type B users see different content. (User labels support an expiration time.)
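The UV-level flow can be sketched as a pair of handlers: the viewer request handler reuses an existing label or draws a new one, and the viewer response handler persists a new label as a cookie. The cookie name `ab_label`, the 25% share, and the one-day lifetime are assumptions for illustration. The sketch also assumes that the header added in the viewer request is visible on the request object of the viewer response event, which should be verified against your own distribution.

```python
import random
from http.cookies import CookieError, SimpleCookie

EXPERIMENT_HEADER = "cloudfront-ab-experiment"
COOKIE_NAME = "ab_label"   # hypothetical cookie that stores the user label
RATIO_B = 0.25             # assumed share of new users labeled B
MAX_AGE = 86400            # assumed label lifetime: one day, in seconds


def read_label(headers):
    """Return 'A' or 'B' from the Cookie header, or None if the user is unlabeled."""
    raw = "; ".join(h["value"] for h in headers.get("cookie", []))
    jar = SimpleCookie()
    try:
        jar.load(raw)
    except CookieError:
        return None
    morsel = jar.get(COOKIE_NAME)
    return morsel.value if morsel and morsel.value in ("A", "B") else None


def viewer_request(event, context=None, rand=random.random):
    """Reuse the stored label, or draw one for a new user, and tag the request."""
    request = event["Records"][0]["cf"]["request"]
    variant = read_label(request["headers"]) or ("B" if rand() < RATIO_B else "A")
    request["headers"][EXPERIMENT_HEADER] = [
        {"key": "Cloudfront-ab-experiment", "value": variant}
    ]
    return request


def viewer_response(event, context=None):
    """Persist the label with an expiration time if the user did not have one."""
    cf = event["Records"][0]["cf"]
    request, response = cf["request"], cf["response"]
    if read_label(request["headers"]) is None:
        tagged = request["headers"].get(EXPERIMENT_HEADER, [{}])[0].get("value")
        if tagged in ("A", "B"):
            cookie = f"{COOKIE_NAME}={tagged}; Max-Age={MAX_AGE}; Path=/"
            response["headers"].setdefault("set-cookie", []).append(
                {"key": "Set-Cookie", "value": cookie}
            )
    return response
```

Once the cookie expires, the next visit has no label and the user is drawn again.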
UV-level multi-page A/B split
Viewer request + viewer response + path identifier
A user is labeled as a type A or type B user with a probability given by the allocation ratio. Labels are stored per page path, so different pages do not affect each other. (User labels support an expiration time.)
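One way to keep labels independent per page is to derive the label's cookie name from the page's path pattern. The sketch below is only one possible scheme; the `ab_` prefix, the hashing approach, and the helper names are hypothetical and not taken from the original setup.

```python
import hashlib


def label_name(path_pattern: str) -> str:
    """Derive a cookie-safe label name from a path pattern such as '/detail/*'."""
    digest = hashlib.sha256(path_pattern.encode("utf-8")).hexdigest()[:8]
    return f"ab_{digest}"


def label_cookie(path_pattern: str, variant: str, max_age: int = 86400) -> str:
    """Build a Set-Cookie value for the label of one page, with an expiration time."""
    return f"{label_name(path_pattern)}={variant}; Max-Age={max_age}; Path=/"
```

Hashing avoids characters such as `/` and `*`, which are not valid in cookie names, and gives each path pattern its own label so that an experiment on one page does not overwrite the label of another.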
Implementation steps
Functional verification
Lambda@Edge application process
CodeBase:console.aws.amazon.com/codesuite/c…
Creating an application
Functions are automatically created or updated after the code is submitted
Add trigger
After publishing, binding the trigger automatically creates a version
Putting it into practice
A/B test release for the product detail page
Determine the first dimension and the second dimension, then set the traffic ratio.
First dimension, the content detail page: /detail/*
Second dimension, type A/B users: type A users get the old page version, type B users get the new page version
Traffic ratio: 75% for the old page version and 25% for the new page version
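The setup above can be written down as a small configuration plus a weighted pick. This is a sketch under assumptions: the `EXPERIMENTS` table, `find_experiment`, and `pick` are hypothetical names, and `fnmatchcase` is used here only as a stand-in for path pattern matching.

```python
import random
from fnmatch import fnmatchcase

# First dimension: path pattern. Second dimension: user type with its traffic share.
EXPERIMENTS = {
    "/detail/*": {"A": 0.75, "B": 0.25},  # A = old page version, B = new page version
}


def find_experiment(path):
    """Return the weights for the first pattern matching the path, or None."""
    for pattern, weights in EXPERIMENTS.items():
        if fnmatchcase(path, pattern):
            return weights
    return None


def pick(weights, rand=random.random):
    """Pick a variant by walking the cumulative distribution of the weights."""
    r = rand()
    cumulative = 0.0
    variant = None
    for variant, share in weights.items():
        cumulative += share
        if r < cumulative:
            return variant
    return variant  # falls back to the last variant if rounding leaves a gap
```

Changing the ratio then only requires editing the weights and publishing a new version of the function.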
Create a behavior
Set the traffic ratio and publish the application
Add the trigger and deploy
The Lambda@Edge computation logic is decoupled from the application's A/B content assembly. (The two can be deployed without depending on each other.)
Validation testing
- Functional verification (access data analysis)
A/B content verification.
Remove the user label and confirm that A/B content is reassigned by probability.
- Cache validation (stress test)
The stress test plan follows from the requirement, which is to enhance page content caching through Lambda@Edge, so both the cache hit ratio and edge computing performance need to be examined.
- Application stress test: client metrics + hardware resource monitoring
- CDN stress test: client metrics + hardware resource monitoring
Lambda high-availability metrics
- Cache hit ratio
- Response time: high dynamic range (HDR) percentiles P90, P99, P99.99 (response time distributed by minimum)
- Requests per second: HDR percentiles P90, P99, P99.99 (requests per second distributed by maximum)
- Downloads per second: HDR percentiles P90, P99, P99.99 (downloads per second distributed by maximum)
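For reference, the metrics above can be computed from raw stress-test samples roughly as follows. This sketch uses the nearest-rank percentile definition, which is an assumption; a stress-test tool may interpolate or bucket values differently, so its figures can differ slightly.

```python
import math


def cache_hit_ratio(hits: int, misses: int) -> float:
    """Share of requests served from the edge cache."""
    total = hits + misses
    if total == 0:
        raise ValueError("no requests recorded")
    return hits / total


def percentile(samples, p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]
```

For example, `percentile(response_times_ms, 99.99)` on a list of response times (a hypothetical variable) gives the value that all but the slowest 0.01% of requests stay under.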
Lambda performance analysis report:
- RequestId: b37880ae-8356-452e-968c-a6a59c911e67
- Duration: 19.49 ms
- Billed Duration: 20 ms
- Memory Size: 128 MB
- Max Memory Used: 64 MB
- Init Duration: 144.18 ms
Going forward, the newer CloudFront Functions product is recommended as an optimization:
- The maximum execution time is under 1 ms, and the same code uses 45% of the maximum allowed run time
- Closer to the user: code can be deployed at edge locations, whereas Lambda is currently deployed at second-tier POPs
- No cold start
Data validation
1. Tracking data analysis (whether A/B content exposures or requests are reported)
- Data retrieval logic validation
- Check the tracking data against the configured traffic allocation ratio
2. Cache hit data analysis
- Data provided by operations (O&M)
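Checking the tracking data against the configured ratio can be done with a simple tolerance test. The sketch below uses a normal approximation to the binomial distribution with a limit of three standard errors; both the method and the `z_limit` default are assumptions for illustration, not part of the original validation procedure.

```python
import math


def split_is_consistent(count_a: int, count_b: int, expected_b: float,
                        z_limit: float = 3.0) -> bool:
    """Check whether the observed share of B is within z_limit standard errors of the target."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no tracking events recorded")
    observed_b = count_b / total
    std_err = math.sqrt(expected_b * (1 - expected_b) / total)
    return abs(observed_b - expected_b) <= z_limit * std_err
```

With a configured 25% share for B, 7,500 A events and 2,500 B events pass the check, while an even 5,000/5,000 split does not.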
Usage recommendations
- Set the validity period of page-level user labels to one day. When the label expires, the user is assigned again. (Adjustable according to product requirements.)
- Avoid changing the A/B plan of the same page frequently, because the labels of the previous group have to be cleared. (The ratio can be adjusted incrementally instead.)
- Each A/B plan needs a defined cycle: when it ends, clear the user labels for the current page, then start the next group of experiments.
- Avoid creating as many as ten experiment groups on a single page, because that many variants weaken the caching effect.
- When A/B groups are assigned dynamically to different modules on the same page, work backward from the data you want to collect. For example, loading modules dynamically on the client side collects exposure data more accurately.
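The caching cost of many experiment groups can be seen with a quick count of the cached variants per page. The numbers used here (three device types) are assumptions for illustration only.

```python
def cached_variants_per_page(device_types: int, experiment_groups: int) -> int:
    """Each device type and experiment group combination is cached as its own object."""
    return device_types * experiment_groups


# Assuming three device types: a two-group test needs 6 cached objects per page,
# while a ten-group test needs 30, so each object is requested (and hit) less often.
two_groups = cached_variants_per_page(3, 2)
ten_groups = cached_variants_per_page(3, 10)
```

The same traffic is spread over more cached objects as groups are added, which is why a large number of groups on one page lowers the cache hit ratio.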