Some of our daily tasks inevitably involve a browser, and Python has its own set of browser automation tools. I have used Selenium, Splinter, and Microsoft's Playwright, and ultimately settled on Playwright. The reason is that it installs the browsers for you, with no need to manually download a browser driver such as chromedriver, which makes it easy to write automation tools that can be ported to other systems.

Under the hood, Playwright drives Chromium, Firefox, and WebKit through a single API, supports headless mode, and runs on Linux, macOS, and Windows. The automation it offers is evergreen, capable, reliable, and fast. What you do with it is limited only by your imagination.

Installation:

Official documentation: playwright.dev/python/docs…

pip install playwright
playwright install


Let’s start with some sample code:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("http://playwright.dev")
    print(page.title())
    browser.close()

Running the program launches a browser (headless by default, so no window appears), visits playwright.dev, and prints the title of the page.
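
Because all three engines sit behind the same API, the sample above can be pointed at any of them by name. The sketch below is my own illustration rather than anything from the official docs: get_title and SUPPORTED_BROWSERS are hypothetical names, and it assumes playwright install has already downloaded the browsers. Playwright is imported inside the function so the name check still works on a machine where Playwright is not installed.

```python
SUPPORTED_BROWSERS = ("chromium", "firefox", "webkit")


def get_title(browser_name: str, url: str) -> str:
    """Open `url` in the named browser engine and return the page title."""
    if browser_name not in SUPPORTED_BROWSERS:
        raise ValueError(f"unknown browser: {browser_name!r}")

    # Imported lazily; see the note above the code.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # p.chromium, p.firefox and p.webkit all expose the same launch() call.
        browser_type = getattr(p, browser_name)
        browser = browser_type.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            # Close the browser even if navigation fails.
            browser.close()
```

Calling get_title("firefox", "http://playwright.dev") would then run the same steps in Firefox instead of Chromium.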

Auto-generated code

What appeals to me most about Playwright is that it can record what you do in the browser and generate executable code from it, which makes writing browser automation much more efficient. To generate code, just run:

python -m playwright codegen baidu.com

The following code can be generated:

from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    # Open new page
    page = context.new_page()
    # Go to https://www.baidu.com/
    page.goto("https://www.baidu.com/")
    # Click input[name="wd"]
    page.click("input[name=\"wd\"]")
    # Fill input[name="wd"]
    page.fill("input[name=\"wd\"]", "playwright ")
    # Press CapsLock
    page.press("input[name=\"wd\"]", "CapsLock")
    # Fill input[name="wd"]
    page.fill("input[name=\"wd\"]", "Playwright tutorial")
    # Press Enter
    # with page.expect_navigation(url="https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=playwright%20%E6%95%99%E7%A8%8B&fenlei=256&rsv_pq=880cdb05002fe1ed&rsv_t=19abqiURFrqQT3i6%2F84nvsfVrJlI%2B1T6XbVpQkOap78JGssznOJ4%2FVasRzE&rqlang=cn&rsv_dl=tb&rsv_enter=1&rsv_sug3=23&rsv_sug1=20&rsv_sug7=100&rsv_sug2=0&rsv_btype=i&inputT=6608&rsv_sug4=11435&rsv_jmp=fail"):
    with page.expect_navigation():
        page.press("input[name=\"wd\"]", "Enter")
    # Click text=Playwright-python 教程_天下任我行-CSDN博客
    # with page.expect_navigation(url="https://blog.csdn.net/lb245557472/article/details/111572119"):
    with page.expect_navigation():
        with page.expect_popup() as popup_info:
            page.click("text=Playwright-python 教程_天下任我行-CSDN博客")
        page1 = popup_info.value
    # Click text=x
    page1.click("text=x")
    # ---------------------
    context.close()
    browser.close()


with sync_playwright() as playwright:
    run(playwright)


How do I interact with browser elements

Familiarize yourself with some concepts

Browser

A browser here means a browser instance, be it Chromium, Firefox, or WebKit. A script usually starts by opening a browser and ends by closing it. Playwright can run the browser in headless mode, which means the browser is running but its window is hidden, so you cannot see it start up or operate.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    browser.close()
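
A pattern I find handy is to watch the browser while developing and hide it for unattended runs. The sketch below builds the keyword arguments for launch(). The names launch_options and debug are made up for this example, while headless and slow_mo (a delay in milliseconds added to each operation) are existing launch() parameters. Playwright is imported lazily so the options helper can be tried without it.

```python
def launch_options(debug: bool) -> dict:
    """Return keyword arguments for browser_type.launch()."""
    if debug:
        # Show the window and slow every operation down by 500 ms
        # so the steps are easy to follow by eye.
        return {"headless": False, "slow_mo": 500}
    # Unattended runs: no visible window, no artificial delay.
    return {"headless": True}


def open_and_close(debug: bool = False) -> None:
    """Launch Chromium with the chosen options, then close it again."""
    from playwright.sync_api import sync_playwright  # lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch(**launch_options(debug))
        browser.close()
```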

Browser context

A browser context is an isolated, incognito-like session within a browser instance. Browser contexts are fast and cheap to create. It is recommended to run each test scenario in its own new browser context so that browser state is isolated between tests. Browser contexts can also be used to emulate multi-page scenarios involving mobile devices, permissions, locale, and color scheme.

browser = playwright.chromium.launch()
context = browser.new_context()
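
To make the isolation and emulation points a little more concrete, here is a rough sketch. phone_context_options and demo_isolated_contexts are hypothetical helpers, and the viewport size and locale are arbitrary example values. The keyword arguments viewport, locale, color_scheme and permissions are ones that new_context() accepts.

```python
def phone_context_options(locale: str = "en-US", dark: bool = False) -> dict:
    """Keyword arguments for browser.new_context() emulating a small phone."""
    return {
        "viewport": {"width": 375, "height": 667},  # example size only
        "locale": locale,
        "color_scheme": "dark" if dark else "light",
        "permissions": ["geolocation"],
    }


def demo_isolated_contexts() -> None:
    """Create two contexts in one browser; they share no cookies or storage."""
    from playwright.sync_api import sync_playwright  # lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch()
        desktop = browser.new_context()
        phone = browser.new_context(**phone_context_options("de-DE", dark=True))
        # Each page lives in its own context, so logging in on one
        # does not affect the other.
        desktop.new_page().goto("http://example.com")
        phone.new_page().goto("http://example.com")
        desktop.close()
        phone.close()
        browser.close()
```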

Pages and frames

A browser context can have multiple pages. A page is a single tab or pop-up window within a browser context. It is used to navigate to URLs and interact with the page content.

page = context.new_page()

# Navigate explicitly, similar to entering a URL in the browser.
page.goto('http://example.com')
# Fill in an input.
page.fill('#search', 'query')

# Navigate implicitly by clicking a link.
page.click('#submit')
# Expect a new url.
print(page.url)

# The page can also navigate from script; Playwright will pick that up.
# window.location.href = 'https://example.com'

One or more Frame objects can be attached to a page. Each page has a main frame, and page-level interactions (such as clicks) are assumed to run in the main frame.

A page can have additional frames attached through the iframe HTML tag. These frames can be accessed as follows:

import re

# Get frame using the frame's name attribute
frame = page.frame('frame-login')

# Get frame using a regular expression matched against the frame's URL
# (a plain string would be treated as a glob pattern instead)
frame = page.frame(url=re.compile(r'.*domain.*'))

# Get frame using any other selector
frame_element_handle = page.query_selector('.frame-class')
frame = frame_element_handle.content_frame()

# Interact with the frame
frame.fill('#username-input', 'John')
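
When it is unclear which frame holds the element you want, it can help to list every frame on the page first. A small sketch, assuming page is a Playwright Page: describe_frames and find_frame are hypothetical helpers that rely only on the page.frames, frame.name and frame.url properties.

```python
def describe_frames(page) -> list:
    """Return a (name, url) tuple for every frame attached to the page."""
    return [(frame.name, frame.url) for frame in page.frames]


def find_frame(page, url_part: str):
    """Return the first frame whose URL contains url_part, or None."""
    for frame in page.frames:
        if url_part in frame.url:
            return frame
    return None
```

Printing describe_frames(page) right after page.goto(...) is a quick way to discover the name or URL to pass to page.frame().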

Selectors

A selector is a string that picks out elements within an HTML page.

Playwright can find elements using CSS selectors, XPath selectors, HTML attributes such as id and data-test-id, and even text content.

You can explicitly specify which selector engine to use, or let Playwright detect it.

Playwright's selectors are intuitive and easy to use. You can learn more about selectors and selector engines in the official documentation.
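
A few selector strings side by side may make this clearer. In this sketch the element names (#search, Login, and so on) are invented, and with_engine and click_login are hypothetical helpers. The css=, xpath=, text= and id= prefixes are how a selector engine is named explicitly; without a prefix, Playwright guesses the engine from the shape of the string.

```python
# Selectors without a prefix: Playwright detects the engine.
AUTO_DETECTED = [
    "#search",                   # treated as CSS
    "//button[@type='submit']",  # starts with // so treated as XPath
    "'Login'",                   # wrapped in quotes so treated as text
]


def with_engine(engine: str, body: str) -> str:
    """Build a selector that names its engine explicitly, e.g. css=#search."""
    return f"{engine}={body}"


# The same kinds of selectors with the engine spelled out.
EXPLICIT = [
    with_engine("css", "#search"),
    with_engine("xpath", "//button[@type='submit']"),
    with_engine("text", "Login"),
    with_engine("id", "search"),
]


def click_login(page) -> None:
    """Click the element whose text is Login on a Playwright Page."""
    page.click(with_engine("text", "Login"))
```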

Automating video playback on a video website

The following code opens a video website, plays the course videos, and detects when a video has finished by periodically refreshing the browser.


from playwright.sync_api import sync_playwright
import re, sys
import progressbar
from log import logger
from urllib.parse import urlparse

import time
from config import chromium, browser_path


current_milli_time = lambda: int(round(time.time() * 1000))


class AutoLearning:
    @staticmethod
    def get_total_seconds(time_str):
        """Convert "SS", "MM:SS" or "HH:MM:SS" into a number of seconds."""
        hour, minute, seconds = 0, 0, 0
        # Named `parts` so it does not shadow the imported `time` module.
        parts = [int(i) for i in time_str.split(":")]
        if len(parts) == 2:
            minute, seconds = parts
        elif len(parts) == 1:
            seconds = parts[0]
        elif len(parts) == 3:
            hour, minute, seconds = parts
        return hour * 60 * 60 + minute * 60 + seconds

    def __init__(self, username, passwd, base_url, key=None):
        self.username = username
        self.passwd = passwd
        urlparseObj = urlparse(base_url)
        self.base_url = f"{urlparseObj.scheme}://{urlparseObj.hostname}"
        self.hostname = urlparseObj.hostname
        # Start Playwright manually (instead of a `with` block) so the
        # browser stays open for the lifetime of this object.
        self.sync_playwright = sync_playwright()
        self.playwright = self.sync_playwright.start()
        if chromium:
            self.browser = self.playwright.chromium.launch(executable_path=browser_path, headless=False)
        else:
            self.browser = self.playwright.firefox.launch(executable_path=browser_path, headless=False)
        self.context = self.browser.new_context()
        self.current_page = self.context.new_page()
        self.cookies = {}
        self.corp_code = "default"
        self.map_url = f"{self.base_url}/els/html/index.parser.do?id=0007"

        self.headers = {
            "Host": self.hostname,
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36",
            "Origin": self.base_url,
        }

        self.eln_session_id = ""

    def __del__(self):
        self.context.close()
        self.browser.close()
        # Stop the Playwright instance that was started in __init__.
        self.playwright.stop()

    def login(self):
        # NOTE: the text=... labels below are English renderings; match them
        # to the text your site actually displays.
        logger.info(self.base_url)
        self.current_page.goto(url=self.base_url)
        page = self.current_page
        # self.context.set_default_timeout(6000)
        try:
            # Click the username field
            page.click('[name="loginName"]')

            # Fill in the username
            page.fill('[name="loginName"]', self.username)

            # Click the password field
            page.click('[name="password"]')

            # Fill in the password
            page.fill('[name="password"]', self.passwd)

            # Click the login button
            page.click("input.login_Btn")
            print("If you have a verification code, please log in manually on your browser.")
            # Click text=continue login
            # if page.is_visible("text=continue login", timeout=15000):
            page.click("text=continue login")

        except Exception:
            print("Please log in manually on your browser.")
            time.sleep(3)

        # Poll until one of the post-login menu entries becomes visible.
        while True:
            try:
                if (
                        page.is_visible("text='course center'", timeout=3000)
                        or page.is_visible("text='personal center'", timeout=3000)
                        or page.is_visible("text='learning center'", timeout=3000)
                ):
                    logger.info("Login successful.")
                    self.current_page = page
                    break
            except Exception:
                print("Please log in manually on your browser.")
            time.sleep(5)


    def learn_course_from_learn_map(self, which_one_to_learn=1, skip_num=0):

        logger.info("learn_course_from_learn_map begin.")
        self.current_page.goto(self.map_url)
        self.current_page.wait_for_selector(
            f":nth-match(div.track-list-tit,{which_one_to_learn})"
        )
        item = self.current_page.query_selector(
            f":nth-match(div.track-list-tit,{which_one_to_learn})"
        )
        link = self.current_page.query_selector(
            f":nth-match(a.track-list-linktoName,{which_one_to_learn})"
        )
        item_title = item.inner_text()
        link_title = link.inner_text()
        if "Learning progress: 100%" in item_title:
            logger.info(f"{link_title} is already complete, exiting.")
            return

        logger.info(f"Start learning {item_title}")
        link.click()

        self.current_page.wait_for_selector("a.innercan.goCourseByStudy")
        courses = self.current_page.query_selector_all("a.innercan.goCourseByStudy")

        for course in courses[skip_num:]:
            time.sleep(2)
            # Each course opens in a pop-up window; capture it as a new page.
            with self.current_page.expect_popup() as popup_info:
                course.click()
            new_page = popup_info.value
            new_page.wait_for_load_state(timeout=60000)
            time.sleep(1)
            h3 = new_page.query_selector("h3.cs-test-title")
            if h3:
                logger.info("This course video has already been played, no need to play it again.")
                if h3.inner_text() == "Course Assessment":
                    self.evaluation(new_page)
                new_page.close()
                continue

            course_item = {
                "courseId": course.get_attribute("id"),
                "courseName": course.get_attribute("title"),
            }

            logger.info(
                f"Now playing {course_item['courseName']}, courseId={course_item['courseId']}"
            )

            if new_page.is_visible("iframe.url-course-content"):
                self.play_single_course2(new_page)

            if new_page.is_visible("text='ok'", timeout=3000):
                new_page.click("text='ok'")

            if new_page.is_visible("a:has-text('next')"):
                new_page.click("a:has-text('next')")
                self.evaluation(new_page)
            new_page.close()
        logger.info("Map learning mission completed")

    def play_single_course2(self, page):
        """Play a course shown in the single-pane or split-pane layout."""

        page.wait_for_selector("time.cl-time")
        page.wait_for_selector("id=studiedTime")
        time.sleep(5)
        total_time_ele = page.query_selector("time.cl-time")
        total_minutes = int(0 if total_time_ele.inner_text().strip() == '' else total_time_ele.inner_text())
        alread_time_ele = page.query_selector("id=studiedTime")
        alread_minutes = int(0 if alread_time_ele.inner_text().strip() == '' else alread_time_ele.inner_text())

        chapters = page.query_selector_all("a.scormItem-no.cl-catalog-link.cl-catalog-link-sub.item-no")
        if len(chapters) > 0:
            logger.info(f"{len(chapters)} sections need to be played this time")
            chapters[0].click()
            logger.info(f"Now playing {chapters[0].get_attribute('title')}, total duration {total_minutes} minutes.")

        bar = None
        if sys.platform == "win32":
            bar = progressbar.bar.ProgressBar(max_value=total_minutes)
        else:
            bar = progressbar.ProgressBar(max_value=total_minutes)

        bar.update(alread_minutes)
        wait_count = 0
        while True:
            time.sleep(60)
            wait_count += 1
            # Reload the page every 7 minutes so the progress shown is fresh.
            if wait_count >= 7:
                page.reload()
                wait_count = 0
            # Every 3 minutes (and right after a reload), restart the first unplayed section.
            if wait_count % 3 == 0:
                chapters = page.query_selector_all("a.scormItem-no.cl-catalog-link.cl-catalog-link-sub.item-no")
                if len(chapters) > 0:
                    logger.info(f"{len(chapters)} sections need to be played this time")
                    chapters[0].click()
                    logger.info(f"Now playing {chapters[0].get_attribute('title')}, total duration {total_minutes} minutes.")

            page.wait_for_selector("id=studiedTime")
            alread_time_ele = page.query_selector("id=studiedTime")
            alread_minutes = int(0 if alread_time_ele.inner_text().strip() == '' else alread_time_ele.inner_text())
            # The "next" link appears once the video has finished.
            if page.is_visible("a:has-text('next')"):
                break
            bar.update(alread_minutes)
        time.sleep(1)
        bar.update(total_minutes)
        logger.info("The video on this page has finished playing")



if __name__ == "__main__":
    auto = AutoLearning(username='****', passwd='****', base_url='http://*****.net')
    auto.login()
    auto.learn_course_from_learn_map(which_one_to_learn=1, skip_num=0)
    time.sleep(100)

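
The core of play_single_course2 is a polling loop: sleep, occasionally reload the page, and stop once a "finished" signal shows up. Stripped of the site-specific selectors, a variation of that idea could be sketched as follows. poll_until, its parameters, and the injected sleep function are my own names for illustration and are not part of Playwright.

```python
import time
from typing import Callable


def poll_until(
    is_done: Callable[[], bool],
    refresh: Callable[[], None],
    interval: float = 60.0,
    refresh_every: int = 7,
    max_checks: int = 600,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_done() once every `interval` seconds.

    refresh() is called after every `refresh_every`-th unsuccessful check.
    Returns True as soon as is_done() is true, or False after max_checks
    unsuccessful checks.
    """
    for check in range(1, max_checks + 1):
        sleep(interval)
        if is_done():
            return True
        if check % refresh_every == 0:
            refresh()
    return False
```

With a Playwright page this might be called as poll_until(lambda: page.is_visible("a:has-text('next')"), page.reload). Unlike the original loop, the sketch puts an upper bound on the number of checks, so the script cannot hang forever if the "next" link never appears.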

(End)