This is the 21st day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021
Experiment 3
3.1 Topic
Become proficient with Selenium: locating HTML elements, crawling Ajax-loaded web data, waiting for HTML elements, and so on.
Use Selenium together with MySQL storage to crawl stock data for the "Shanghai and Shenzhen A-shares", "Shanghai A-shares", and "Shenzhen A-shares" boards.
Candidate site: Eastmoney: quote.eastmoney.com/center/grid…
3.2 Approach
3.2.1 Sending a Request
- Load the driver

```python
from selenium import webdriver

# Path to the ChromeDriver executable
chrome_path = r"D:\Download\Dirver\chromedriver_win32\chromedriver_win32\chromedriver.exe"
browser = webdriver.Chrome(executable_path=chrome_path)
```
- Save the boards you need to crawl

```python
target = ["hs_a_board", "sh_a_board", "sz_a_board"]
target_name = {"hs_a_board": "Shanghai and Shenzhen A-shares",
               "sh_a_board": "Shanghai A-shares",
               "sz_a_board": "Shenzhen A-shares"}
```
The plan is to crawl two pages of data from each of the three boards.
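As a quick sanity check of that plan, the snippet below (a standalone sketch, no browser needed) builds the per-board URLs the same way the crawl loop does and enumerates the board/page combinations:

```python
target = ["hs_a_board", "sh_a_board", "sz_a_board"]
target_name = {"hs_a_board": "Shanghai and Shenzhen A-shares",
               "sh_a_board": "Shanghai A-shares",
               "sz_a_board": "Shenzhen A-shares"}

# URL for each board, built the same way as in the crawl loop
urls = ['http://quote.eastmoney.com/center/gridlist.html#{}'.format(k) for k in target]

# 3 boards x 2 pages = 6 scraping passes in total
passes = [(target_name[k], page) for k in target for page in range(1, 3)]
print(len(urls), len(passes))  # prints: 3 6
```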
- Send the request

```python
for k in target:
    browser.get('http://quote.eastmoney.com/center/gridlist.html#{}'.format(k))
    for i in range(1, 3):
        print("------------- page {} -------------".format(i))
        if i <= 1:
            get_data(browser, target_name[k])
            # Click the "next page" button
            browser.find_element_by_xpath('//*[@id="main-table_paginate"]/a[2]').click()
            time.sleep(2)
        else:
            get_data(browser, target_name[k])
```
The `time.sleep(2)` after clicking is essential. Without it, the next request fires so quickly that, even though the browser has flipped to the second page, you still crawl the first page's data!
3.2.2 Obtaining a Node
- When parsing the page, use an implicit wait (`implicitly_wait`) so the table has time to load before the rows are located:

```python
browser.implicitly_wait(10)
items = browser.find_elements_by_xpath('//*[@id="table_wrapper-table"]/tbody/tr')
```
Each `item` now holds the full text of one table row; split it into fields and insert them into the database:

```python
for item in items:
    try:
        info = item.text
        infos = info.split(" ")
        db.insertData([infos[0], part, infos[1], infos[2],
                       infos[4], infos[5],
                       infos[6], infos[7],
                       infos[8], infos[9],
                       infos[10], infos[11],
                       infos[12], infos[13]])
    except Exception as e:
        print(e)
```
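A side note on the split call: `split("")` with an empty separator raises `ValueError` in Python, so the separator must be an actual space. Also note the difference between `split(" ")` and `split()` with no argument, since row text sometimes contains runs of whitespace. A minimal illustration (the sample string is made up):

```python
row = "1 600000  SPD Bank 7.01"  # note the double space

print(row.split(" "))  # single-space separator keeps empty strings
print(row.split())     # no argument: runs of whitespace collapse
```

If the scraped rows ever contain consecutive spaces, `split()` is the safer choice; otherwise the field indices shift.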
3.2.3 Saving Data
- A database class that encapsulates initialization and the insert operation:

```python
import pymysql

class database:
    def __init__(self):
        self.HOSTNAME = '127.0.0.1'
        self.PORT = '3306'
        self.DATABASE = 'scrapy_homeword'
        self.USERNAME = 'root'
        self.PASSWORD = 'root'
        # Open a database connection
        self.conn = pymysql.connect(host=self.HOSTNAME, user=self.USERNAME,
                                    password=self.PASSWORD,
                                    database=self.DATABASE, charset='utf8')
        # Create a cursor object using the cursor() method
        self.cursor = self.conn.cursor()

    def insertData(self, lt):
        sql = "INSERT INTO spider_gp(serial_number, block, stock_code, stock_name, " \
              "latest_price, change_percent, change_amount, volume, turnover, " \
              "amplitude, high, low, today_open, prev_close) " \
              "VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
        try:
            # Execute the statement first, then commit the transaction
            self.cursor.execute(sql, lt)
            self.conn.commit()
            print("Insert successful")
        except Exception as err:
            print("Insert failed", err)
```
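The class above targets MySQL through pymysql, which needs a running server. The same parameterized-insert pattern can be sketched with the standard library's `sqlite3` for local experimentation (an illustration only, not the article's setup; note that `sqlite3` uses `?` placeholders where pymysql uses `%s`, and the shortened table schema here is a stand-in):

```python
import sqlite3

# In-memory database stands in for the MySQL server
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE spider_gp (serial_number TEXT, block TEXT, "
               "stock_code TEXT, stock_name TEXT, latest_price TEXT)")

def insert_data(lt):
    # Same execute-then-commit order as the MySQL version
    sql = "INSERT INTO spider_gp VALUES (?, ?, ?, ?, ?)"
    try:
        cursor.execute(sql, lt)
        conn.commit()
        print("Insert successful")
    except Exception as err:
        print("Insert failed", err)

# Sample values, made up for the sketch
insert_data(["1", "Shanghai A-shares", "600000", "SPD Bank", "7.01"])
```

Passing the values as a sequence instead of formatting them into the SQL string lets the driver handle quoting, which also protects against malformed field values.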