Notes on PhantomJS optimizations

Preface

This article summarizes some optimization measures for PhantomJS.

PhantomJS is a headless browser with a small built-in web server, and is usually used for automated testing or crawling.

Use pooling to avoid repeated startup

When PhantomJS is driven from another language as a separate process, repeatedly starting the process, switching contexts, and creating objects is time-consuming. Keeping a pool of running instances and reusing them, in the same way a connection pool does, avoids that cost.
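
The pooling idea can be sketched in a few lines of Python. This is a minimal sketch, not a production pool: the class name BrowserPool and the factory argument are assumptions made for illustration, and factory stands for whatever code starts one PhantomJS instance in your setup.

```python
import queue
from contextlib import contextmanager


class BrowserPool:
    """Fixed-size pool of long-lived browser instances (illustrative only)."""

    def __init__(self, factory, size):
        # `factory` is a hypothetical zero-argument callable that starts one
        # PhantomJS instance and returns a handle to it.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def borrow(self, timeout=None):
        # Blocks until an instance is free; raises queue.Empty on timeout.
        browser = self._idle.get(timeout=timeout)
        try:
            yield browser
        finally:
            # Always hand the instance back, even if the caller raised.
            self._idle.put(browser)
```

All instances are started once when the pool is created, so each request only pays for borrowing and returning a handle.
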

Load about:blank to avoid stale state

The problem is similar to using a Java ThreadLocal with Tomcat's thread pool: if the previous request does not clear it, the next request that reuses the thread reads stale data.

PhantomJS does not seem to have a reset interface, so a workaround is to open about:blank every time you take an instance from the pool, and only then request the real page.
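
The workaround might look like the sketch below. It assumes the pooled object exposes a Selenium-style get(url) method and a page_source attribute; fetch_with_reset is a hypothetical helper name, so adapt the calls to whatever wrapper you actually use.

```python
BLANK_URL = "about:blank"


def fetch_with_reset(browser, url):
    """Load about:blank first, then the real URL, on a reused browser.

    `browser` is assumed to expose Selenium-style `get(url)` and
    `page_source`; both are assumptions about your wrapper.
    """
    browser.get(BLANK_URL)  # replace whatever page the previous borrower left loaded
    browser.get(url)
    return browser.page_source
```

Note that this only replaces the loaded page. Cookies and cached resources are not cleared by navigating to about:blank.
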

Enable the disk cache

If the same pages are accessed frequently, enable the disk cache so that static resources are cached and not requested again.
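
PhantomJS takes the cache settings as command-line options, so one way to apply them is when building the command that starts each instance. The sketch below uses the --disk-cache and --max-disk-cache-size options; the helper name phantomjs_command and the script name in the usage note are assumptions made for illustration.

```python
def phantomjs_command(script, *script_args, cache_kb=None):
    """Build a PhantomJS command line with the disk cache switched on."""
    # Options must come before the script name.
    cmd = ["phantomjs", "--disk-cache=true"]
    if cache_kb is not None:
        # --max-disk-cache-size takes a size in KB.
        cmd.append(f"--max-disk-cache-size={cache_kb}")
    cmd.append(script)
    cmd.extend(script_args)
    return cmd
```

For example, phantomjs_command("render.js", "http://example.com/", cache_kb=51200) yields a command that caches up to about 50 MB on disk.
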

Ditch Selenium and use the API directly

If you drive PhantomJS through a Selenium wrapper, consider using PhantomJS's native API instead, which is more direct.
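
As a rough illustration of going without Selenium, the sketch below keeps a small PhantomJS script (written against its webpage and system modules) in a Python string and runs it with subprocess. The names RENDER_JS and render, and the idea of saving the script to a file first, are assumptions for this example. For brevity it starts one process per call, which is exactly the cost that pooling is meant to avoid.

```python
import subprocess

# PhantomJS script built on its own `webpage` and `system` modules.
# It loads the URL given as the first argument and prints the HTML.
RENDER_JS = """
var page = require('webpage').create();
var system = require('system');
page.open(system.args[1], function (status) {
    if (status !== 'success') {
        phantom.exit(1);
    } else {
        console.log(page.content);
        phantom.exit(0);
    }
});
"""


def render(url, script_path, phantomjs="phantomjs", timeout=30):
    """Run `script_path` under PhantomJS for `url` and return its stdout."""
    result = subprocess.run(
        [phantomjs, script_path, url],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"PhantomJS exited with code {result.returncode} for {url}"
        )
    return result.stdout
```

To try it, write RENDER_JS to a file (for instance render.js, a hypothetical name) and call render("http://example.com/", "render.js").
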

Build distributed REST API services

Fetching network resources can be slow and unreliable, so the throughput of a single instance is never high and bottlenecks appear quickly under high concurrency. Wrapping PhantomJS in a REST API service and deploying it across several machines is therefore necessary.

Besides PhantomJS, Chrome and Firefox also offer headless modes, so there are a few more options to try.
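
One node of such a service could be sketched with Python's standard http.server module. Everything specific here is an assumption for illustration: the /render?url=... endpoint, the JSON response shape, the make_server helper, and the render callable, which stands for whatever function returns the HTML for a URL in your setup.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def make_server(render, host="127.0.0.1", port=8080):
    """Create an HTTP server exposing `render` (url -> html) at /render."""

    class RenderHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            parsed = urlparse(self.path)
            target = parse_qs(parsed.query).get("url", [None])[0]
            if parsed.path != "/render" or target is None:
                self._reply(400, {"error": "use /render?url=..."})
                return
            try:
                self._reply(200, {"url": target, "html": render(target)})
            except Exception as exc:
                # Rendering is unreliable; report the failure to the caller
                # instead of letting it take the worker down.
                self._reply(502, {"error": str(exc)})

        def _reply(self, status, payload):
            body = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ThreadingHTTPServer((host, port), RenderHandler)
```

Calling make_server(render).serve_forever() starts one node. Running the same service on several machines behind a load balancer gives the distributed deployment described above.
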