0x1. Introduction

Everything happens for a reason. Recently I noticed that my new articles on Juejin are not being read as much as before.

I tell myself I do not care much about this (not true), but once something is written and published, I certainly hope someone reads it and discusses it with me, since that is how progress happens. Otherwise, would it not be just as sweet to keep it in my own cloud notes?

A brief analysis suggests the drop in reads may have the following causes:

  • The highly rewarding essay contest attracted a large number of new writers and produced a large number of articles.
  • The personalized recommendation algorithm: outdated older articles still dominate the feed, while newly published ones get little exposure. For the official explanation, see the platform's note on feedback about the personalized recommendation feed.
  • The quality of my own writing has dropped? How could that be: every article is packed with memes and written from hands-on practice.

All my previous articles were published on Juejin first, then converted from MD into HTML with custom styles using my self-developed converter HZWZ-Markdown-WX, and finally posted to my WeChat official account.

I was too lazy to post to other platforms, since copying and pasting is tiring. I thought of writing an automation script, but it was shelved for various reasons and then forgotten.

Recently I suddenly remembered this: how can good articles stay buried? They should go out to other platforms too. As usual, I first asked whether a wheel already exists, since there is no need to build my own if it does. So I asked in a group chat:

Hmm... it seems there is none, so I will build one myself. It is not too complicated, and conveniently the product manager is on leave, my daily work is just tweaking UI, and there is plenty of slack time. Let's do it!!

Without reverse-engineering the sites' interfaces, the fastest and simplest way to cover several sites is to simulate clicks in a browser.

First, list the sites to publish to. If you have any additions, please leave a message in the comment section:

  • Juejin
  • CSDN blog
  • 51CTO blog
  • Jianshu
  • Zhihu
  • SegmentFault

And the previous two articles:

  • The Van came Python | a simple crawl of a site's courses
  • The Van came Python | a simple crawl of the planet

For fear of a lawyer's letter, I did not dare to publish those scripts. This one is different: the script in this article is open source, and you are welcome to clone it, try it out, and send suggestions ~

0x2. Tactical Analysis

The process of publishing an article can be divided into three stages: before publishing, during publishing, and after publishing. Each stage is then refined into a concrete flow:

A brief interpretation of the main points ~

Before publishing

This stage covers the preparation work. First comes the article content, which has two parts: the body and the additional information. Platforms differ slightly in which editors they support, so to be prepared, the body is kept in the following three forms:

  • MD text → Most platforms support this ~
  • MD file → Some platforms do not support MD text, but support importing MD files, such as Zhihu;
  • Rendered text → Some platforms do not support MD and may need to parse and copy after rendering, such as WuKong editor of the old version of 51CTO blog;

Of course, there is also miscellaneous additional information, like this:

Title, abstract, cover, label, category

Then comes the login information: account and password. Other fields can be added if there are further requirements ~

During publishing

Every platform requires a login before publishing, so the login status has to be checked first. Generally, visiting the article editor page without being logged in redirects to the login page automatically. There are exceptions, though: Juejin still opens the editor page but does not let you publish, so you have to trigger the jump to the login page yourself.

Next comes automatic login, which simulates a human logging in: find the node elements, click, type in the corresponding information, and submit. In addition, some sites detect abnormal logins and trigger various verification challenges (slider, click, text, rotation, and so on). In that case, notify the user to handle the challenge manually, then either poll with a timeout or sleep for a while to wait.
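The "poll with a timeout" idea can be sketched as follows. This is a minimal illustration rather than the project's actual code: `wait_until` and its parameters are hypothetical names, and the predicate stands in for whatever check tells you the user has finished the verification.

```python
import asyncio
import time


async def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll an async predicate until it returns True or the timeout expires.

    Returns True if the predicate succeeded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if await predicate():
            return True
        # Not done yet: sleep without blocking the event loop, then re-check.
        await asyncio.sleep(interval)
    return False
```

A caller would pass a coroutine function such as "is the login button gone yet?" and treat a False result as a login timeout.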

After login, fill in the body. If the body node accepts direct text input, type into it directly. If it does not, use this sequence: click to get focus → write the body content to the clipboard → press Ctrl+A to select all → press Ctrl+V to paste.
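The key-chord part of that sequence might look like the sketch below. The code later in this article calls `cp_utils.hot_key`, but that helper's implementation is not shown, so this is only a guess at its shape. It assumes a Pyppeteer-style `page` whose `keyboard` offers `down`, `press`, and `up`; `paste_over` is a hypothetical name, and writing the text to the system clipboard is assumed to have happened beforehand.

```python
import asyncio


async def hot_key(page, modifier, key):
    """Press modifier+key as one chord, e.g. hot_key(page, "Control", "KeyA")."""
    await page.keyboard.down(modifier)
    try:
        await page.keyboard.press(key)
    finally:
        # Always release the modifier so later typing is not affected.
        await page.keyboard.up(modifier)


async def paste_over(page, node):
    """Focus node, select its current content, and paste the clipboard over it.

    Assumes the text is already on the system clipboard.
    """
    await node.click()
    await hot_key(page, "Control", "KeyA")
    await hot_key(page, "Control", "KeyV")
```

Releasing the modifier in a `finally` block keeps a failed key press from leaving Ctrl held down for the rest of the session.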

Next, fill in the additional information: find each node, then click or type.

Finally, publish the article. Some sites require extra steps before publishing; if there are none, simply perform the publish action.

After publishing

Publishing may not go smoothly, and exceptions can occur occasionally. The exception information needs to be written to a file.
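One way to get exception information into a file is the standard `logging` module with a `FileHandler`. The sketch below is an assumption about how this could be wired up, not the project's code: `build_file_logger`, `run_step`, and the log file path are hypothetical.

```python
import logging


def build_file_logger(name, log_path):
    """Return a logger that appends records, including tracebacks, to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def run_step(logger, step_name, func):
    """Run one publishing step; on failure, log the traceback and return False."""
    try:
        func()
        return True
    except Exception:
        # logger.exception records the message at ERROR level plus the traceback.
        logger.exception("Step failed: %s", step_name)
        return False
```

Because each site's publisher already creates a logger named after the website, a file handler like this could be attached per site so that failures are easy to tell apart.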

0x3. Detailed Design

With the analysis mostly done, on to the code design. Start with the entities. From the above, two are needed: article information and account/password. The latter is generally bound to a website, so it does not need to be independent. First, the article information entity:

class Article:
    def __init__(self, md_file=None, md_content=None, render_content=None, tags=None, avatar=None,
                 summary=None, category=None, column=None, title=None):
        """Initialization method.

        Args:
            md_file: MD file
            md_content: MD text
            render_content: rendered text
            tags: tags
            avatar: cover image
            summary: summary
            category: category
            column: column
            title: title
        """
        self.md_file = md_file
        self.md_content = md_content
        self.render_content = render_content
        self.tags = tags
        self.avatar = avatar
        self.summary = summary
        self.category = category
        self.column = column
        self.title = title
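To fill both `md_file` and `md_content` from one source, a small loader is enough. This is a sketch under assumptions: `load_markdown` is a hypothetical helper, and returning an absolute path is a guess at what file-import dialogs are likely to need.

```python
from pathlib import Path


def load_markdown(md_file):
    """Return (absolute file path, file text) for an MD file."""
    path = Path(md_file).resolve()
    return str(path), path.read_text(encoding="utf-8")
```

The two return values can then be passed straight into the entity, for example `Article(md_file=md_file, md_content=md_content, title="...")`.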

Next, publishing. Every site behaves similarly, so the common attributes and methods are extracted into a parent class, and subclasses override them as needed:

import logging


class Publish:
    def __init__(self, website_name=None, write_page_url=None, login_url=None,
                 account=None, password=None, is_publish=True, page=None, article=None):
        """Initialization method.

        Args:
            website_name: site name
            write_page_url: URL of the article editor page
            login_url: URL of the login page
            account: account
            password: password
            is_publish: whether to actually publish, defaults to True
            page: Pyppeteer page instance, representing a page in the browser
            article: the Article to publish
        """
        self.website_name = website_name
        self.write_page_url = write_page_url
        self.login_url = login_url
        self.account = account
        self.password = password
        self.is_publish = is_publish
        self.page = page
        self.article = article
        self.logger = logging.getLogger(self.website_name)
        self.logger.setLevel(logging.INFO)

    # Pass in the page and the article
    def set_page(self, page, article):
        self.page = page
        self.article = article

    # Load the article editor page
    def load_write_page(self):
        self.logger.info("Load write article page: {}".format(self.write_page_url))

    # Check the login status
    def check_login_status(self):
        self.logger.info("Check login status...")

    # Automatic login
    def auto_login(self):
        self.logger.info("Start automatic login: {}".format(self.login_url))

    # Fill in the body
    def fill_content(self):
        self.logger.info("Start content filling...")

    # Fill in the additional information
    def fill_else(self):
        self.logger.info("Other content to fill...")

    # Publish
    def publish_article(self):
        self.logger.info("Post an article...")

    # Handle the result
    def deal_result(self):
        self.logger.info("The article is published...")
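In the Juejin subclass in the next section, each step awaits the next one, so a single call to `load_write_page()` drives the whole flow. The browser-free stand-in below only illustrates that chaining; `DemoPublish` and its `steps` list are hypothetical and exist purely to make the order of calls visible.

```python
import asyncio


class DemoPublish:
    """Stand-in publisher: same step names as Publish, but no browser involved."""

    def __init__(self, logged_in):
        self.logged_in = logged_in
        self.steps = []  # records the order in which steps ran

    async def load_write_page(self):
        self.steps.append("load_write_page")
        await self.check_login_status()

    async def check_login_status(self):
        self.steps.append("check_login_status")
        if self.logged_in:
            await self.fill_content()
        else:
            await self.auto_login()

    async def auto_login(self):
        self.steps.append("auto_login")
        self.logged_in = True
        await self.load_write_page()  # go back to the editor after logging in

    async def fill_content(self):
        self.steps.append("fill_content")
        await self.fill_else()

    async def fill_else(self):
        self.steps.append("fill_else")
        await self.publish_article()

    async def publish_article(self):
        self.steps.append("publish_article")
        await self.deal_result()

    async def deal_result(self):
        self.steps.append("deal_result")
```

Running `asyncio.run(DemoPublish(logged_in=False).load_write_page())` walks through login first and then re-enters the editor, which mirrors what a real subclass does with a live page.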

Next, taking publishing to Juejin as an example, let's see how it works in practice ~

0x4. Worked Example: Publishing to Juejin

① Login status detection

Override the corresponding parent-class methods as needed, starting with the article editor page: juejin.cn/editor/draf…

Without logging in, the page is still accessible and does not redirect automatically, so we have to check ourselves. Compare the page before and after login:

Haha, it is not hard to see that when logged in, the user's avatar appears in the upper right corner, so it is enough to check whether this node exists. Inspect the node:

It’s not hard to write code like this:

import asyncio

from pyppeteer import errors


class JueJinPublish(Publish):
    async def load_write_page(self):
        super().load_write_page()
        # Load the article editor page
        await self.page.goto(self.write_page_url, options={'timeout': 60000})
        await asyncio.sleep(1)
        await self.check_login_status()

    async def check_login_status(self):
        super().check_login_status()
        try:
            # This navbar node is only present when logged in
            await self.page.waitForXPath("//nav//div[@class='toggle-btn']", {'visible': True, 'timeout': 3000})
            self.logger.info("Already logged in...")
            await self.fill_content()
        except errors.TimeoutError as e:
            self.logger.warning(e)
            self.logger.info("Not logged in, starting automatic login...")
            await self.auto_login()

② Automatic login

Process: go to the home page → click the login button in the upper right corner → choose other login methods → enter account → enter password → click login

And then comes the ever-so-considerate slider verification:

Now we wait for the user to pass verification. How do we know it is done? Wait until the login button is no longer visible, with a 1-minute timeout, then jump to the article editor page ~

    async def auto_login(self):
        super().auto_login()
        try:
            await self.page.goto(self.login_url, options={'timeout': 60000})
            await asyncio.sleep(2)
            login_bt = await self.page.Jx("//button[@class='login-button']")
            await login_bt[0].click()
            prompt_box = await self.page.Jx("//div[@class='prompt-box']/span")
            await prompt_box[0].click()
            account_input = await self.page.Jx("//input[@name='loginPhoneOrEmail']")
            await account_input[0].type(self.account)
            password = await self.page.Jx("//input[@name='loginPassword']")
            await password[0].type(self.password)
            login_btn = await self.page.Jx("//button[@class='btn']")
            await login_btn[0].click()
            self.logger.info("Waiting for user verification...")
            # Wait (with a timeout) for the login button to disappear; the user may need to complete verification
            await self.page.waitForXPath("//button[@class='login-button']", {'hidden': True, 'timeout': 60000})
            self.logger.info("User authentication successful...")
            await self.load_write_page()
        except errors.TimeoutError:
            self.logger.info("User authentication failed...")
            self.logger.error("Login timeout")
            await self.page.close()
        except Exception as e:
            self.logger.error(e)

③ Text filling

Back on the article editor page, the filling process is:

Fill in the title → Fill in the content section → select the Markdown theme → select the code highlighting style

The title is easy: get the input control and type into it. The content area cannot be filled directly, so use the clipboard approach. Selecting the Markdown theme and the code highlighting style is less straightforward:

The option list is only displayed dynamically while the node has focus. Once you switch to the Elements panel, the option list disappears, so the node information cannot be inspected.

The workaround: trigger the list, print the page source, and locate the nodes step by step.

Then check whether the option text matches the preset style and click it directly. It is not difficult to write code like this:

    async def fill_content(self):
        super().fill_content()

        # set the title
        title_input = await self.page.Jx("//input[@class='title-input title-input']")
        await title_input[0].type(self.article.title)

        # The content area is not a plain text input: click to focus, then select all and paste ~
        content_input = await self.page.Jx("//div[@class='CodeMirror-scroll']")
        await content_input[0].click()
        cp_utils.set_copy_text(self.article.md_content)
        await cp_utils.hot_key(self.page, "Control", "KeyA")
        await cp_utils.hot_key(self.page, "Control", "KeyV")

        # Juejin compresses images, so wait a moment before continuing
        await asyncio.sleep(3)

        # Select the Markdown theme and code highlighting style
        md_theme = await self.page.Jx("//div[@bytemd-tippy-path='16']")
        await md_theme[0].hover()

        # Select your favorite theme, such as SmartBlue
        md_theme_choose = await self.page.Jx(
            "//div[@class='bytemd-dropdown-item-title' and text()='{}']".format('smartblue'))
        await md_theme_choose[0].click()

        # Again, select your favorite code style, such as AndroidStudio
        code_theme = await self.page.Jx("//div[@bytemd-tippy-path='17']")
        await code_theme[0].hover()
        code_theme_choose = await self.page.Jx(
            "//div[@class='bytemd-dropdown-item-title' and text()='{}']".format('androidstudio'))
        await code_theme_choose[0].click()

        # Add additional information
        await self.fill_else()

④ Fill in additional information

The procedure for filling in additional information is as follows:

Click the publish button in the upper right corner → Select categories → Add tags → Upload article cover → Select columns (optional) → Enter abstract → Click OK and publish

The diagram below:

To upload the cover, locate the //input[@type='file'] node and call uploadFile() to complete the upload.

The abstract is filled the same way as the body, by pasting from the clipboard. Finally, click OK and publish. It is also not difficult to write the following code:

    async def fill_else(self):
        super().fill_else()

        # Click the Publish button
        publish_bt = await self.page.Jx("//button[@class='xitu-btn']")
        await publish_bt[0].click()

        # select category
        category_check = await self.page.Jx("//div[@class='item' and text()=' {} ']".format(self.article.category))
        await category_check[0].click()

        # add a tag
        for tag in self.article.tags:
            tag_input = await self.page.Jx("//input[@class='byte-select__input']")
            await tag_input[0].type(tag)
            await asyncio.sleep(1)
            # Select the first one by default
            tag_li = await self.page.Jx("//li[@class='byte-select-option byte-select-option--hover']")
            await tag_li[0].click()

        # Add cover
        upload_avatar = await self.page.Jx("//input[@type='file']")
        await upload_avatar[0].uploadFile(self.article.avatar)

        # Fill summary
        summary_textarea = await self.page.Jx("//textarea[@class='byte-input__textarea']")
        await summary_textarea[0].click()
        cp_utils.set_copy_text(self.article.summary)
        await cp_utils.hot_key(self.page, "Control", "KeyA")
        await cp_utils.hot_key(self.page, "Control", "KeyV")
        await self.publish_article()

    async def publish_article(self):
        super().publish_article()
        publish_btn = await self.page.Jx("//div[@class='btn-container']/button")
        await publish_btn[1].click()
        await asyncio.sleep(2)
        await self.deal_result()

⑤ Handling the publish result

After publishing completes, the page jumps and then shows a prompt indicating success or failure. Here we can simply check whether a node related to a successful publish exists.
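A check of this kind could be sketched as below. The XPath is a placeholder invented for illustration, not Juejin's real markup, and `is_published` is a hypothetical helper; the only API assumed is Pyppeteer's `page.Jx`, which returns a list of matching element handles.

```python
import asyncio

# Placeholder XPath: the real success node must be looked up in the page source.
SUCCESS_XPATH = "//div[contains(@class, 'publish-success')]"


async def is_published(page, xpath=SUCCESS_XPATH, wait_seconds=2.0):
    """Return True if a node matching xpath exists after a short wait."""
    await asyncio.sleep(wait_seconds)  # give the page time to jump
    nodes = await page.Jx(xpath)       # list of element handles, empty if none
    return len(nodes) > 0
```

Inside `deal_result`, the boolean could then decide whether to log success or to record the failure for later inspection.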

How the other outcomes are recorded can be refined gradually. Now run it and see the published article:

Being lazy has never felt so good!!

That is the basic prototype. Next steps: write the scripts for the other sites, add configuration files, support publishing to multiple sites at once, improve result handling, and optimize some of the logic ~

0x5. Summary

The project is called ChaoMdPublish. If you are interested in it, feel free to try it out.