A Web Crawler for Honor of Kings

Because we need it, we create it. (An open source community)

Anyone who likes mobile games has probably played Honor of Kings (nicknamed "King Pesticide"). I am a poor mobile gamer myself, but I have played a few rounds, usually as Arthur, Angela, or Luban. After a few games I was drawn to the beautiful design of each hero (I mostly play shooting games, though; if you do too, feel free to message me privately), but I knew very little about the heroes' backstories. So, to learn the story behind each hero, I put this open source program together between 10 o'clock yesterday and 2 o'clock the next day (because we need it, we create it).

About the program

Modules and technology stack

The program has two main parts: data capture and processing, and data display. The main technologies used are:

  • Java 8
  • OkHttp (application-layer HTTP requests)
  • jsoup (data parsing)
  • JSP + CSS (an ugly interface, haha)

The actual result (haha, ugly)

The core implementation code

Interfaces

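import java.util.Map;
import java.util.concurrent.ExecutionException;
import okhttp3.Request; // assuming OkHttp 3.x; the post does not state the version

// Parsing
public interface Parser {
    void parser() throws ExecutionException, InterruptedException;
}

// Fetching
public interface Crawler<T, R> {
    String doGet(String uri, Map<T, R> headers);

    // Copies every header entry onto the request builder; a null or empty map is ignored.
    default void setHttpHeaders(Request.Builder builder, Map<T, R> headers) {
        if (headers == null || headers.isEmpty()) {
            return;
        }
        for (Map.Entry<T, R> entry : headers.entrySet()) {
            builder.addHeader(String.valueOf(entry.getKey()), String.valueOf(entry.getValue()));
        }
    }
}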

The common fetching method

// Charsets and http_error are presumably project-specific; the post does not show them.
public class HttpCrawler implements Crawler<String, String> {
    private OkHttpClient httpClient = new OkHttpClient();
    private static HttpCrawler instance = new HttpCrawler();

    @Override
    public String doGet(String uri, Map<String, String> headers) {
        assert uri != null;
        Request.Builder httpBuilder = new Request.Builder();
        // Set the request headers
        setHttpHeaders(httpBuilder, headers);
        Request request = httpBuilder.url(uri).build();
        Response response;
        String page = "";
        try {
            response = httpClient.newCall(request).execute();
            if (!response.isSuccessful()) {
                throw new HttpStatusException(http_error.getMsg(), response.code(), uri);
            }
            ResponseBody responseBody = response.body();
            if (Objects.nonNull(responseBody)) {
                // Decode the raw bytes as GB2312
                byte[] bytes = responseBody.bytes();
                page = new String(bytes, Charsets.GB2312.name());
            }
        } catch (IOException e) {
            // On failure the stack trace is printed and an empty page is returned
            e.printStackTrace();
        }
        return page;
    }

    public static HttpCrawler getInstance() {
        return instance;
    }
}
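
The crawler decodes the response bytes with an explicitly named charset (GB2312) instead of relying on a default. The following is a minimal JDK-only sketch of why that matters. `CharsetDemo` and `decode` are hypothetical names that do not appear in the project, and the sketch assumes the runtime ships the GB2312 charset, as standard JDK distributions usually do.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of the original project.
class CharsetDemo {
    // Decode raw response bytes using an explicitly named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Four Chinese characters, written as Unicode escapes to avoid source-encoding issues.
        String original = "\u738b\u8005\u8363\u8000";
        byte[] gb2312Bytes = original.getBytes(Charset.forName("GB2312"));

        // With the matching charset the text round-trips unchanged.
        System.out.println(decode(gb2312Bytes, "GB2312").equals(original));
        // With a mismatched charset the same bytes decode to different text.
        System.out.println(decode(gb2312Bytes, "UTF-8").equals(original));
    }
}
```

If the target site ever changes its encoding, the charset name passed to the decoder is the one place that would need to change.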

Parsing the heroes

public class KingParser implements Parser {
    private static KingParser kingParser = new KingParser();
    private StoryParser storyParser = StoryParser.getInstance();
    private String page;
    private List<Hero> heros = new ArrayList<>();

    // Cached pool whose threads are named parser-thread-0, parser-thread-1, ...
    private ExecutorService executors = Executors.newCachedThreadPool(new ThreadFactory() {
        AtomicInteger integer = new AtomicInteger();

        @Override
        public Thread newThread(@NotNull Runnable r) {
            return new Thread(r, "parser-thread-" + integer.getAndIncrement());
        }
    });

    @Override
    public void parser() throws ExecutionException, InterruptedException {
        Document document = Jsoup.parse(page);
        if (document == null || StringUtils.isEmpty(document.body().html())) {
            return;
        }
        Elements heroBox = document.getElementsByClass(WebAppConfig.kingClassName);
        Elements heroLists = heroBox.get(0).getElementsByTag(li.name());
        long start;
        System.out.println("Start time ===" + (start = System.currentTimeMillis()));
        AtomicInteger count = new AtomicInteger();
        for (Element element : heroLists) {
            Hero hero = new Hero();
            count.getAndIncrement();
            Future<Hero> submit = executors.submit(() -> {
                Elements aTag = element.getElementsByTag(a.name());
                String uri = WebAppConfig.baseUri + aTag.attr(href.name());
                hero.setDetail(parserStory(uri));
                hero.setHero(aTag.get(0).getElementsByTag(img.name()).get(0).attr(alt.name()));
                hero.setPicture("http:" + aTag.get(0).getElementsByTag(img.name()).get(0).attr(src.name()));
                return hero;
            });
            // get() blocks until this task finishes, so heroes are processed one at a time
            heros.add(submit.get());
        }
        // 4922
        System.out.println("Elapsed time (ms) ===" + (System.currentTimeMillis() - start));
        System.out.println("Total fetched: " + count.get());
    }

    public static KingParser getInstance() {
        return kingParser;
    }

    private String parserStory(String uri) {
        storyParser.setUri(uri);
        storyParser.parser();
        return storyParser.getStory();
    }

    public void setPage(String page) {
        this.page = page;
    }

    // Parses lazily on first access
    public List<Hero> getHeros() {
        if (CollectionUtils.isEmpty(heros)) {
            try {
                parser();
            } catch (ExecutionException | InterruptedException e) {
                e.printStackTrace();
            }
        }
        return heros;
    }
}
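
One thing to note in the listing above: `submit.get()` is called inside the loop, so each task must finish before the next one is submitted, and the thread pool ends up handling one hero at a time. A common alternative is to submit every task first and collect the results afterwards. The sketch below shows that pattern with the JDK only. `ParallelFetchSketch`, `fetchAll`, the `Function` standing in for the per-hero fetch, and the pool size of 8 are all hypothetical. It also assumes the per-item work is thread-safe, which the shared `StoryParser` singleton with its mutable `uri` and `story` fields would not be as written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch for illustration; not the project's code.
class ParallelFetchSketch {
    // Submits one task per URI, then collects the results in submission order.
    static <R> List<R> fetchAll(List<String> uris, Function<String, R> fetch)
            throws ExecutionException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8); // pool size is an arbitrary assumption
        try {
            // Phase 1: submit everything so the tasks can run concurrently.
            List<Future<R>> futures = new ArrayList<>();
            for (String uri : uris) {
                Callable<R> task = () -> fetch.apply(uri);
                futures.add(pool.submit(task));
            }
            // Phase 2: wait for each result; later tasks keep running meanwhile.
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Applied to the parser, the function passed in would need to fetch and parse one hero without touching shared mutable state, for example by returning the story from a method that takes the URI as a parameter.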

Parsing the story

public class StoryParser implements Parser {
    private String uri;
    private String story;
    private HttpCrawler httpCrawler = HttpCrawler.getInstance();
    private static StoryParser storyParser = new StoryParser();

    @Override
    public void parser() {
        // Fetch the hero's detail page and extract the story HTML
        String detailPage = httpCrawler.doGet(uri, null);
        Document parse = Jsoup.parse(detailPage);
        // Note: getElementById returns null when the page has no such element
        Element heroStory = parse.getElementById("hero-story");
        Element element = heroStory.getElementsByClass("pop-bd").get(0);
        story = element.html();
    }

    public String getStory() { return story; }
    public void setUri(String uri) { this.uri = uri; }
    public static StoryParser getInstance() { return storyParser; }
}

The code still needs optimization

  • Caching: every run requests and parses all the pages again; a cache could be used instead.
  • Interface: the interface is not attractive; JavaScript and CSS3 could be used to make the page dynamic.
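
The caching idea in the list above could be sketched as follows: keep fetched pages in a map keyed by URI so that each page is requested only once. This is a minimal in-memory sketch under stated assumptions, not the project's code. `PageCache` and its `loader` function are hypothetical names, and nothing here handles expiry or size limits.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch for illustration; not part of the original project.
class PageCache {
    private final Map<String, String> pages = new ConcurrentHashMap<>();
    private final Function<String, String> loader;

    // The loader does the real fetch; it is called only on a cache miss.
    PageCache(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the cached page for this URI, loading it on first use.
    String get(String uri) {
        return pages.computeIfAbsent(uri, loader);
    }
}
```

In this program the loader could wrap the existing fetch, for example `new PageCache(uri -> HttpCrawler.getInstance().doGet(uri, null))`. Because `doGet` returns an empty string on failure, a failed fetch would be cached as well, so a real version would probably want to skip empty results.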

Issues are welcome on GitHub

GitHub address

Follow me

Personal official account: see cross talk also want to knock code