This is the seventh day of my participation in the More Text Challenge. For details, see the More Text Challenge.

Recap of the previous article

{"status_code":1,"status_msg":"Url doesn't match"}

Based on the earlier analysis, we can be fairly certain that the problem above is caused by anti-crawling verification on the interface: it checks the User-Agent and other parameters to decide whether the access is simulated. So today we change our approach and try calling the interface through a real browser kernel driver, to see whether that succeeds.
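For context, the failing direct call was conceptually something like the sketch below. This is a minimal reconstruction using a plain HttpURLConnection (the original code used a JwtHttpUtil helper, shown later), and the example.com URL is only a placeholder:

	import java.io.BufferedReader;
	import java.io.InputStreamReader;
	import java.net.HttpURLConnection;
	import java.net.URL;

	public class DirectCallDemo {
		public static void main(String[] args) throws Exception {
			// Placeholder interface address; the real one comes from the earlier analysis
			URL url = new URL("https://example.com/user/info?sec_uid=xxx");
			HttpURLConnection conn = (HttpURLConnection) url.openConnection();
			conn.setRequestMethod("GET");
			// No browser-like User-Agent or other browser headers are sent, so the
			// server's anti-crawling check rejects the request with
			// {"status_code":1,"status_msg":"Url doesn't match"}
			try (BufferedReader in = new BufferedReader(
					new InputStreamReader(conn.getInputStream()))) {
				String line;
				while ((line = in.readLine()) != null) {
					System.out.println(line);
				}
			}
		}
	}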

Introducing Selenium

What is Selenium

Selenium is a test automation tool for web applications. Selenium tests run directly in the browser, just as a real user would. Supported browsers include Internet Explorer (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, Opera, and others. Selenium is also an open source framework released under the Apache License 2.0.

Why use Selenium

Selenium was originally developed for automated testing of websites. It can drive the browser directly: receive instructions, make the browser load a page automatically, extract the required data, and even take screenshots of the page. In other words, it faithfully simulates real operations on the browser side, which lets us get around the anti-crawling check that made our earlier direct calls fail.

Continuing to fix the inaccessible user/info interface

Now that you have a basic understanding of what Selenium is for, let's move on to how to use it.

First, we need to install Google Chrome and check its version number (for example, by opening chrome://version in the address bar).

Then, open the ChromeDriver download page (the address is easy to find with a search):

Find the driver that matches your browser version:

Download and extract it to any location; here is my storage path:

Then we move on to the code changes. First, add the Selenium dependency to pom.xml:

        <dependency>
                <groupId>org.seleniumhq.selenium</groupId>
                <artifactId>selenium-java</artifactId>
                <version>3.141.59</version>
        </dependency>

Note: if the project also references swagger2, this dependency will cause an error.
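If you do hit that error, a common cause for this kind of Maven clash is a transitive-dependency version conflict. Purely as a hedged sketch (assuming the clash is over a shared transitive artifact such as Guava, which is not confirmed here), an exclusion on one side would look like this:

        <dependency>
                <groupId>org.seleniumhq.selenium</groupId>
                <artifactId>selenium-java</artifactId>
                <version>3.141.59</version>
                <exclusions>
                        <!-- Hypothetical: exclude the transitive artifact that clashes with swagger2 -->
                        <exclusion>
                                <groupId>com.google.guava</groupId>
                                <artifactId>guava</artifactId>
                        </exclusion>
                </exclusions>
        </dependency>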

Let's continue with the code that invokes the interface. Instead of using the HttpRequest approach, we switch to a Selenium simulated-browser request.

	/**
	 * @param homeUrl the home page address
	 * @param url     the interface address
	 */
	public void getUserInfo(String homeUrl, String url) {
		String sec_uid = getLocationParam(homeUrl);
		String apiUrl = url + "?sec_uid=" + sec_uid;
		System.out.println(apiUrl);
		// Point Selenium at the local ChromeDriver executable
		System.getProperties().setProperty("webdriver.chrome.driver", "F:\\proj\\drivers\\chromedriver.exe");
		// Start the WebDriver process
		ChromeOptions options = new ChromeOptions();
		// Run headless so no browser window pops up (otherwise a window flashes by)
		options.addArguments("--headless");
		options.addArguments("--disable-gpu");
		WebDriver webDriver = new ChromeDriver(options);
		webDriver.get(apiUrl);
		String h = webDriver.getPageSource();
		System.out.println(h);
		// quit() closes all windows and ends the driver process
		webDriver.quit();
		// String jsonString = JwtHttpUtil.httpRequest(apiUrl, "POST", null);
		// System.out.println(jsonString);
	}
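For completeness, a call site might look like the snippet below; both URLs are placeholders standing in for the real home page and interface addresses from the earlier analysis:

	// Hypothetical invocation; substitute the real addresses from the earlier analysis
	getUserInfo("https://www.example.com/share/user/xxxx",
			"https://www.example.com/web/api/v2/user/info/");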

Running result:

After executing, I finally see what I want: the sec_uid extraction and the user/info interface call have both been resolved. I'll move on to the rest of the content next time.

Stage summary:

1. When analyzing an interface, make full use of Chrome's developer tools to search the code content, and analyze and trace the code through the context in which it appears.

2. When HttpRequest calls are blocked by backend anti-crawling rules, Selenium can be used flexibly alongside it.

3. Selenium is easy to use, but it spawns system processes, which inevitably hurts execution efficiency; this needs optimization (see the sketch below).
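On point 3, one straightforward optimization (a minimal sketch, assuming calls are serialized through a single browser session) is to create the ChromeDriver once and reuse it, instead of starting a new browser process per call:

	import org.openqa.selenium.WebDriver;
	import org.openqa.selenium.chrome.ChromeDriver;
	import org.openqa.selenium.chrome.ChromeOptions;

	public class DriverHolder {
		private static WebDriver driver;

		// Lazily create a single headless ChromeDriver and reuse it,
		// so the process-startup cost is paid only once
		public static synchronized WebDriver get() {
			if (driver == null) {
				System.setProperty("webdriver.chrome.driver", "F:\\proj\\drivers\\chromedriver.exe");
				ChromeOptions options = new ChromeOptions();
				options.addArguments("--headless");
				options.addArguments("--disable-gpu");
				driver = new ChromeDriver(options);
			}
			return driver;
		}

		// Call once on application shutdown to end the browser process
		public static synchronized void shutdown() {
			if (driver != null) {
				driver.quit();
				driver = null;
			}
		}
	}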