Web crawlers often run into IP problems, whether from per-IP rate limits or from behavior-based restrictions. These situations are troublesome, but they can usually be solved by switching HTTP proxies or by tuning the crawler's anti-blocking strategy.

With the progress of the times and the growth of the Internet, more and more people rely on the network for their work. HTTP proxy IPs help many of these workers operate efficiently, and they are used not only by professionals but also by individuals who want to protect their privacy.

A great deal of work is done online, such as Q&A promotion, Internet marketing, and data collection, but frequent operations can cause the target site to block your IP address, leaving you unable to access it. In that case, you need to use a proxy IP address.

Web crawling places high demands on the quality of proxy IP servers, and both the quantity and the quality of available proxy servers continue to improve.

Collecting data through proxy IPs breaks through per-IP limits and also speeds up collection. Many websites now have anti-crawler mechanisms: visits that look like a normal user's are allowed through, but fast, repeated requests are easily identified, access is restricted, and the IP is blocked. This is where proxy IPs become especially important. The anti-crawler mechanism can only identify the IP address, so by using proxy IPs you can easily change addresses and the crawl can proceed smoothly.
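The rotation described above can be sketched as a simple round-robin proxy pool. This is an illustrative sketch, not a specific provider's API; the `ProxyPool` class and the proxy addresses (drawn from the TEST-NET documentation range) are hypothetical placeholders you would replace with real endpoints.

```php
<?php
// Sketch of a rotating proxy pool: each request is sent through a
// different proxy IP so no single address triggers rate limits.

class ProxyPool
{
    private $proxies;
    private $index = 0;

    public function __construct(array $proxies)
    {
        $this->proxies = $proxies;
    }

    // Return the next proxy in round-robin order, so consecutive
    // requests leave from different IP addresses.
    public function next()
    {
        $proxy = $this->proxies[$this->index];
        $this->index = ($this->index + 1) % count($this->proxies);
        return $proxy;
    }
}

// Placeholder addresses (RFC 5737 TEST-NET range), not real servers.
$pool = new ProxyPool([
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]);

echo $pool->next(), "\n"; // http://203.0.113.10:8080
echo $pool->next(), "\n"; // http://203.0.113.11:8080
echo $pool->next(), "\n"; // http://203.0.113.12:8080
```

With Guzzle, each request would then take a different `"proxy"` option, e.g. `$client->request('GET', $url, ["proxy" => $pool->next()]);`.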

It is necessary for a web crawler to use proxy IPs when collecting data, and those proxies should be high-anonymity ones, such as an enhanced tunnel-forwarding crawler proxy. Transparent proxies and ordinary anonymous proxies will be recognized by the target site, and their IPs will also be blocked.
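Why transparent and ordinary anonymous proxies get caught can be seen from the headers a proxy adds. The following is a simplified sketch of how a target site might classify anonymity levels using the conventional `Via` and `X-Forwarded-For` headers; real sites may inspect more signals, and the `classifyProxy` function is hypothetical.

```php
<?php
// Sketch: classifying the anonymity level of an incoming request
// from the headers that different kinds of proxies typically add.

function classifyProxy(array $headers)
{
    if (isset($headers['X-Forwarded-For'])) {
        // Transparent proxy: forwards the client's real IP address.
        return 'transparent';
    }
    if (isset($headers['Via'])) {
        // Ordinary anonymous proxy: hides the IP but reveals proxy use.
        return 'anonymous';
    }
    // High-anonymity (elite) proxy: indistinguishable from a direct visit.
    return 'elite';
}

echo classifyProxy(['Via' => '1.1 proxy', 'X-Forwarded-For' => '198.51.100.7']), "\n"; // transparent
echo classifyProxy(['Via' => '1.1 proxy']), "\n"; // anonymous
echo classifyProxy([]), "\n"; // elite
```

A high-anonymity proxy sends neither header, which is why it looks like an ordinary visitor to the anti-crawler mechanism.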

<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;

class Test16Proxy extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'test:16proxy';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Command description';

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        $client = new \GuzzleHttp\Client();

        $targetUrl = "http://httpbin.org/ip";

        // Proxy server (www.16yun.cn)
        define("PROXY_SERVER", "t.16yun.cn:31111");

        // Proxy credentials
        define("PROXY_USER", "username");
        define("PROXY_PASS", "password");

        $proxyAuth = base64_encode(PROXY_USER . ":" . PROXY_PASS);

        $options = [
            "proxy" => PROXY_SERVER,
            "headers" => [
                "Proxy-Authorization" => "Basic " . $proxyAuth
            ]
        ];

        $result = $client->request('GET', $targetUrl, $options);

        var_dump($result->getBody()->getContents());
    }
}