Background
Last year I saw a question on Zhihu about how to inflate the cumulative number of songs listened to on NetEase Cloud Music. A highly upvoted answer included a piece of JavaScript that could be run directly in the browser console. I tried it at the time and it added tens of thousands of plays at once. Unfortunately, the count was back to its original value the next day, apparently because NetEase Cloud Music had detected it. NetEase Cloud Music has also since limited how fast the cumulative count can grow: at most 300 songs per day. This article presents a way to play songs automatically with Java + Selenium so that the cumulative count goes up. The demo should also make you more familiar with Selenium and show some of its more interesting uses in crawler applications.
Approach
- Log in: there are two ways to log in. a. Simulate the login flow of the web client. Advantages: more general, and convenient for switching accounts dynamically. Disadvantages: slightly more work than using cookies directly, and there is some chance of hitting a graphical captcha, which has to be handled. b. Set cookies directly. Advantages: no login flow to deal with, so it is simpler, and it works well when the cookies have a long expiry and you do not switch accounts often. Disadvantages: switching accounts is more troublesome and cannot be automated. I chose method b here.
- Play: after logging in successfully in the previous step, open the playlist page directly. As shown in the figure, the playlist page has three places where you can click a play button. My first idea was to use the play button in the bottom player bar and keep that bar visible at all times so that the playback status could be read in real time. However, simulating a click on that button never worked; clicking the play button at the top of the page did start playback.
- Get the playback status: to check that playback is working, you can monitor the cumulative number of songs listened to on your profile page in real time. Because one page is already playing songs, open the profile page in a new tab so that playback on the original page is not disturbed. The new tab is opened with the JavaScript call window.open('about:blank'). If you end up with log output like the following, it is working:
2019-03-26 09:25:10,406 INFO [,main] - [com.github.wycm.Music163] - Yili River bank - 01:00/07:19 -- now playing song 1,
[,main] - [com.github.wycm.Music163] - Yili River bank - 01:06/07:19 -- now playing song 1,
[,main] - [com.github.wycm.Music163] - Yili River bank - 01:13/07:19 -- now playing song 1,
[,main] - [com.github.wycm.Music163] - Yili River bank - 01:19/07:19 -- now playing song 1,
[,main] - [com.github.wycm.Music163] - Yili River bank - 01:25/07:19 -- now playing song 1, total songs listened: 20,572
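The song number in the log comes from a simple rule: the player is sampled at intervals, and each time the sampled song name differs from the previous one, the counter goes up by one. Below is a minimal sketch of that counting rule on its own, with no browser involved. The class name `PlayCounter`, the method `observe`, and the sample song names are hypothetical and used only for illustration.

```java
import java.util.Arrays;

/** Counts songs by watching for changes in the sampled song name (hypothetical helper). */
final class PlayCounter {
    private String lastSongName = "";
    private int count = 0;

    /** Records one sample of the player and returns the number of songs seen so far. */
    public int observe(String songName) {
        // Ignore empty samples; count only when the name actually changes.
        if (songName != null && !songName.isEmpty() && !songName.equals(lastSongName)) {
            count++;
            lastSongName = songName;
        }
        return count;
    }

    public static void main(String[] args) {
        PlayCounter counter = new PlayCounter();
        // Samples as they might be polled from the player; the names are made up.
        for (String sample : Arrays.asList("Song A", "Song A", "Song B", "Song B", "Song C")) {
            System.out.println(sample + " -> song " + counter.observe(sample));
        }
    }
}
```

One consequence of comparing names only: if the same title plays twice in a row, it is counted once. That is acceptable here because the authoritative number is the cumulative count read from the profile page; the local counter is just a progress indicator.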
Complete code
package com.github.wycm;

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.*;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by wycm
 */
public class Music163 {
    private static Logger logger = LoggerFactory.getLogger(Music163.class);
    // Raw cookie string copied from the browser after logging in.
    private final static String RAW_COOKIES = "cookie1=value1; cookie2=value2";
    private final static String CHROME_DRIVER_PATH = "/Users/wangyang/Downloads/chromedriver";
    // ID of the playlist to play.
    private static String startId = "22336453";
    private static String userId = null;
    private static Set<String> playListSet = new HashSet<>();
    // Playback progress in the bottom player: group 1 is the elapsed time, group 2 the total time.
    // NOTE: the HTML tags this pattern originally matched are missing from this listing;
    // see the full project code for the exact expression.
    private static Pattern pattern = Pattern.compile("(.*?) (.*?) ");
    // Name of the song currently shown in the bottom player.
    private static Pattern songName = Pattern.compile("class=\"f-thide name fc1 f-fl\" title=\"(.*?)\"");
    private static ChromeOptions chromeOptions = new ChromeOptions();
    private static WebDriver driver = null;

    static {
        System.setProperty("webdriver.chrome.driver", CHROME_DRIVER_PATH);
        chromeOptions.addArguments("--no-sandbox");
    }

    public static void main(String[] args) throws InterruptedException {
        // Restart the browser and resume playback whenever something fails.
        while (true) {
            try {
                driver = new ChromeDriver(chromeOptions);
                playListSet.add(startId);
                invoke();
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            } finally {
                // driver is null if ChromeDriver failed to start
                if (driver != null) {
                    driver.quit();
                    driver = null;
                }
            }
            Thread.sleep(1000 * 10);
        }
    }

    /**
     * Initialize cookies
     */
    private static void initCookies() {
        Arrays.stream(RAW_COOKIES.split("; ")).forEach(rawCookie -> {
            // Split on the first '=' only, because cookie values may contain '='.
            String[] ss = rawCookie.split("=", 2);
            Cookie cookie = new Cookie.Builder(ss[0], ss[1]).domain(".163.com").build();
            driver.manage().addCookie(cookie);
        });
    }

    private static void invoke() throws InterruptedException {
        driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);
        driver.manage().timeouts().pageLoadTimeout(15, TimeUnit.SECONDS);
        String s = null;
        // Cookies can only be added for the domain that is currently loaded,
        // so open the site once, set the cookies, then reload as a logged-in user.
        driver.get("http://music.163.com/");
        initCookies();
        driver.get("http://music.163.com/");
        s = driver.getPageSource();
        userId = group(s, "userId:(\\d+)", 1);
        driver.get("https://music.163.com/#/playlist?id=" + startId);
        driver.switchTo().frame("contentFrame");
        // The play button at the top of the playlist page.
        WebElement element = driver.findElement(By.cssSelector("[id=content-operation]>a:first-child"));
        element.click();
        // Open a second tab for the profile page so playback in the first tab is not disturbed.
        ((JavascriptExecutor) driver).executeScript("window.open('about:blank')");
        ArrayList<String> tabs = new ArrayList<String>(driver.getWindowHandles());
        driver.switchTo().window(tabs.get(0));
        driver.switchTo().defaultContent();
        int i = 0;
        String lastSongName = "";
        int count = 0;
        while (true) {
            if (i > Integer.MAX_VALUE - 2) {
                break;
            }
            i++;
            // Page source of the playing tab, which contains the bottom player.
            s = driver.getPageSource();
            driver.switchTo().window(tabs.get(1)); //switches to new tab
            String songs = null;
            try {
                driver.get("https://music.163.com/user/home?id=" + userId);
                driver.switchTo().frame("contentFrame");
                // The profile page shows the counter in Chinese, as 累积听歌N首 ("N songs listened in total").
                songs = group(driver.getPageSource(), "累积听歌(\\d+)首", 1);
            } catch (TimeoutException e) {
                logger.error(e.getMessage(), e);
            }
            driver.switchTo().window(tabs.get(0));
            Matcher matcher = pattern.matcher(s);
            Matcher songNameMatcher = songName.matcher(s);
            if (matcher.find() && songNameMatcher.find()) {
                String songNameStr = songNameMatcher.group(1);
                // A changed song name means the next song has started.
                if (!songNameStr.equals(lastSongName)) {
                    count++;
                    lastSongName = songNameStr;
                }
                logger.info(songNameStr + " - " + matcher.group(1) + matcher.group(2)
                        + " -- now playing song " + count + ", total songs listened: " + songs);
            } else {
                logger.info("Failed to parse song playback record or song name");
            }
            Thread.sleep(1000 * 30);
        }
    }

    /**
     * Returns the given capture group of the first match of regex in str, or "" if there is no match.
     */
    public static String group(String str, String regex, int index) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        return matcher.find() ? matcher.group(index) : "";
    }
}
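The cookie-initialization step in the code above turns a raw `name=value; name=value` string into individual cookies. Cookie values copied from a browser can themselves contain `=` (Base64 padding, for instance), so a defensive parse splits each pair on the first `=` only. Here is a minimal sketch of that parsing on its own, without Selenium. `RawCookieParser` and `parse` are hypothetical names, and the cookie values are placeholders rather than real NetEase cookies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Parses a raw Cookie header string into name/value pairs (hypothetical helper). */
final class RawCookieParser {

    /** Splits "a=1; b=2" into pairs, keeping any '=' characters inside the values. */
    public static Map<String, String> parse(String rawCookies) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : rawCookies.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty fragments and entries without a name
            }
            cookies.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Placeholder values, not real cookies.
        Map<String, String> cookies = parse("cookie1=value1; token=abc==; flag=");
        cookies.forEach((name, value) -> System.out.println(name + " -> [" + value + "]"));
    }
}
```

Each resulting pair could then be passed to `new Cookie.Builder(name, value).domain(".163.com").build()` and added to the driver, as the full code does.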
Usage notes
- Change the chromedriver path (CHROME_DRIVER_PATH) to match your own setup.
- Log in to NetEase Cloud Music on the web: music.163.com/
- Copy the raw cookies of your logged-in session into the RAW_COOKIES field in the code.
- Switch playlists: once the default playlist has been played through, search for a playlist you have not played yet, for example https://music.163.com/#/playlist?id=22336453, extract its ID (22336453 here), and replace the value of the startId field in the code with it.
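Extracting the playlist ID from such a URL can be done by hand, but it is also easy to automate. Here is a minimal sketch, assuming the URL has the `playlist?id=<digits>` form shown above. `PlaylistIdExtractor` and `extract` are hypothetical names that do not appear in the original project.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Pulls the numeric playlist ID out of a playlist URL (hypothetical helper). */
final class PlaylistIdExtractor {
    // Matches the "playlist?id=<digits>" part, which sits after the '#' in the URL.
    private static final Pattern PLAYLIST_ID = Pattern.compile("playlist\\?id=(\\d+)");

    /** Returns the playlist ID, or an empty Optional if the URL has no playlist ID. */
    public static Optional<String> extract(String url) {
        Matcher matcher = PLAYLIST_ID.matcher(url);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "https://music.163.com/#/playlist?id=22336453";
        System.out.println(extract(url).orElse("no playlist id found"));
    }
}
```

A regular expression is used rather than a URL parser because the `playlist?id=...` part comes after the `#`, so it is a fragment and not a regular query string.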
Conclusion
- You may want to put this task on your own server and run it in the background. That comes down to setting up a Selenium runtime environment on the server; see my previous article. The lowest-tier servers from Alibaba Cloud and Tencent Cloud are enough to run it.
- Why use Selenium here at all? Is there a simpler way to get the same effect with plain HTTP requests? I tried to find the request that increments the cumulative song count so that I could send it directly, but I could not identify it because NetEase Cloud Music's requests are encrypted. That is why Selenium is used instead.
Finally
- See the complete project code: github.com/wycm/crawle…
Copyright notice. Author: wycm. Source: juejin.cn/post/684490… Your support is the greatest encouragement to the author; thank you for reading carefully. The copyright of this article belongs to the author. Reprints are welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be given in a prominent position on the page; otherwise the author reserves the right to pursue legal liability.