Recently search electronic yes time found an interesting website, a lot of fine school version of e-books, because of curiosity, want to do a crawler name summary. Search for the title yourself and find the download address. Ten minutes, the script is basically finished, a night time can almost run out.
Share code, just for reference (rough).
package com.fun
import com.fun.db.mysql.MySqlTest
import com.fun.frame.httpclient.FanLibrary
import com.fun.utils.Regex
import org.slf4j.Logger
import org.slf4j.LoggerFactory
class T extends FanLibrary {
static Logger logger = LoggerFactory.getLogger(T.class)
public static void main(String[] args) {
// test(322) def list = 1.. 1000 as List list.each { x -> try {test(x)} Catch (Exception e) {∫ logger.error(x.tostring ()) output(e)} logger.warn(x.tostring ()) sleep(2000)}testOver()
}
static def test(int id) {
// def get = getHttpGet("https://****/books/9798.html")
def get = getHttpGet("https://****/books/" + id + ".html")
def response = getHttpResponse(get)
def string = response.getString("content")
if (string.contains("The file you requested does not exist.")|| string.contains("Page not found")) return
output(string)
def all = Regex.regexAll(string, "class=\"bookpic\"> ).get(0)
def all2 = Regex.regexAll(string, "Content =\" Content description.*? \ "").get(0)
def all3 = Regex.regexAll(string, "Title =\" author:.*? \ "").get(0)
def all40 = Regex.regexAll(string, "https://sobooks\\.cc/go\\.html\\? Url = HTTPS {0, 1} : / /. *? \\.ctfile\\.com/.*?\"")
def all4 = all40.size() == 0 ? "" : all40.get(0)
def all50 = Regex.regexAll(string, "https://sobooks\\.cc/go\\.html\\? Url = HTTPS {0, 1} : / / pan\\.baidu\\.com/. *? \ "")
def all5 = all50.size() == 0 ? "" : all50.get(0)
output(all)
output(all2)
output(all3)
output(all4)
output(all5)
def name = all.substring(all.lastIndexOf("=") + 2, all.length() - 1)
def author = all3.substring(all3.lastIndexOf("=") + 2, all3.length() - 1)
def intro = all2.substring(all2.lastIndexOf("=") + 2, all2.length() - 1)
def url1 = all4 == "" ? "" : all4.substring(all4.lastIndexOf("=") + 1, all4.length() - 1)
def url2 = all5 == "" ? "" : all5.substring(all5.lastIndexOf("=") + 1, all5.length() - 1)
output(name, author, intro, url1, url2)
def sql = String.format("INSERT INTO books (name,author,intro,urlc,urlb,bookid) VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",%d)", name, author, intro, url1, url2, id)
MySqlTest.sendWork(sql)
}
}
Copy the code
Personally, I am quite satisfied with it.
Technical articles selected
- Java one line of code to print a heart
- Linux performance monitoring software Netdata Chinese version
- Interface Test Code Coverage (JACOCO) solution sharing
- Performance testing framework
- How to perform performance tests on a Linux command line interface
- Graphic HTTP brain map
- How do I test a probabilistic business interface
- Httpclient handles multiple online users simultaneously
- Automatically turn the Swagger document into test code
- Five lines of code to build a static blog
- How does HttpClient handle 302 redirects
- Probe into the Linear interface testing framework based on Java
- Tcloud cloud measurement platform — the integrator
- How do I test a probabilistic business interface
- Python Plotly encapsulates methods that process interface performance test data
- Single sign-on performance test solution
Selected non-technical articles
- Why software testing as a career path?
- 10 steps to becoming a Great Java developer
- Programming thinking for everyone
- Barriers to automated testing
- The problem with automated testing
- Test the Code Immortal brain map
- 7 Steps to becoming a Good Automation Test Engineer
- Attitude of a good software developer
- How to perform functional API tests correctly
- New trends in software testing in the next 10 years
- New trends in software testing in the next 10 years
- What problems do automated tests solve
- 17 Common Productivity Skills for Software Testers – Up
- 17 Productivity Skills used by Software testers – below
- An important reason for the existence of manual testing
- 17 Tips for writing test Cases
Higher-ups elegant demeanour
- Tcloud cloud measurement platform — the integrator
- Android App test tools and knowledge collection
- 4399AT UI automation CI with CD
- Android App routine test content
- Objects and heaps for the JVM