I wrote a browser plug-in that downloads Juejin articles with one click

One day, while slacking off on Juejin, something suddenly occurred to me.

After publishing many articles on Juejin, I was too lazy to organize the original local files, and since they were cluttering up my desktop I simply deleted them. In other words, the copies on Juejin are now the only copies of my articles. If one day Juejin shuts down, or its staff drop the database and run, or a hacker hits it with rm -rf, or it cannot pay its OSS bill and gets taken offline, isn't there a good chance that my articles will be lost?

The quality of my articles may be average, but I did spend time writing them. No, this won't do: I have to eliminate this hidden danger and set up a disaster-recovery backup.

Saving these articles is not hard: open each article, go into the editor, copy and paste the content into a local file, then download the pictures in the article into a folder. But the very reason I have no local originals is that I am lazy, and now I'm supposed to copy and paste everything into new files one by one? Isn't that just making life hard for me?

So I thought: I'll just write a browser plug-in and automate it.

First, here is what it looks like:

(The final version may differ.)

~~No making fun of the UI~~

The plug-in has two functions: downloading a single article, and batch-downloading all articles by one author. You can download your own articles as well as other people's. When an article is downloaded, the pictures it references are downloaded too, and the remote picture addresses in the article are automatically replaced with local paths.

Get article content

I clicked into a random article, opened the console, and found that there is already an API that returns the article details, including the original markdown string. The only input parameter the API needs is the id of the current article, and that id is already part of the article's link. Luck was on my side: a single API gives me all the data.
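As a minimal sketch of the "the id is already in the link" step: the function name getArticleId and the /post/<id> path shape are assumptions made for illustration, not taken from the plug-in's source, so adjust the pattern to the real links.

```javascript
// Assumption: article links look like https://<host>/post/<id>.
// The path shape is hypothetical; only the idea (read the id from the URL) is from the article.
function getArticleId(articleUrl) {
  // pathname excludes the query string and the hash fragment
  const { pathname } = new URL(articleUrl)
  const match = pathname.match(/\/post\/([^/]+)\/?$/)
  return match ? match[1] : null
}
```

In a content script this could be called as getArticleId(location.href), and the result passed to the article details API.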

I finished the code in no time and grabbed a few articles to test it, only to run into a problem right away: for some articles, the markdown field returned by the details API is an empty string. At first I thought this was some anti-scraping measure on Juejin's part, but that turned out not to be the case. After a closer look, the likely cause is that the product design went through several iterations: earlier versions did not return the original markdown for the front end to render, but returned the already-converted HTML string directly.

The article content is still returned, just in a different form. Since it is an HTML string converted from markdown, why not convert it back to the original markdown?

There are plenty of open-source libraries for this. I looked around and settled on Turndown, which reduces the reverse conversion to a single call:

var turndownService = new TurndownService()
var markdown = turndownService.turndown(html)

However, a reverse conversion like this can hardly guarantee a 100% faithful restoration of the original. When I download my own articles I want them to be 100% identical to the original; not even a single comma may differ.

If the article I want to download is my own, I can open it in the editor, and the edit API returns the original markdown string. Downloading that content gives me a 100% faithful copy.

The edit API requires a draft_ID, which is available from the article details API mentioned above.

Articles that are not my own obviously cannot be opened in the editor, so they still go through the reverse conversion. In my tests, the reverse-converted content is practically identical to the original.
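The choice between the three sources described above could be sketched roughly like this. The field names markdown and draftId, the function name, and the exact order of the checks are all assumptions for illustration; the real response fields of the details API are not shown in this article.

```javascript
// Hypothetical shape: detail = { markdown, html, draftId }.
// These names are placeholders, not the real API fields.
function pickMarkdownSource(detail, isOwnArticle) {
  if (detail.markdown && detail.markdown.trim() !== '') {
    // The details API already carries the original markdown
    return 'detail-markdown'
  }
  if (isOwnArticle && detail.draftId) {
    // Own article: the edit API can return the exact original
    return 'edit-api'
  }
  // Everything else: convert the HTML back to markdown
  return 'html-to-markdown'
}
```

Returning a label instead of doing the work keeps the decision easy to test; the caller then performs the matching request or the Turndown conversion.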

Get the picture in the article

Since this is a download, the pictures in the article should be downloaded as well. First, only then is the article truly saved locally. Second, if you want to publish on other platforms, having the pictures at hand is convenient. Third, if Juejin ever adds hotlink protection to its images, you might no longer be able to see them.

Extracting the image links from the article string is a job for regular expressions. But because the details API returns markdown for some articles and HTML for others, you need a separate regular expression for each.

Once the image links are matched, the corresponding pictures are downloaded, and each remote link in the article must then be replaced with the path of the downloaded local file. Every local picture has to line up with its position in the article. With a lot of pictures, you can't very well match them up by eye and replace them by hand, can you? So this has to be taken into account while processing.
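A rough sketch of the two steps above, under stated assumptions: the two regular expressions, the function names, and the nameFor callback are illustrative and not the plug-in's actual code. They cover the common ![alt](url) and <img src="url"> forms only.

```javascript
// Markdown images: ![alt](url) or ![alt](url "title")
const MD_IMAGE = /!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g
// HTML images: <img ... src="url" ...> (requires whitespace before src, so data-src is skipped)
const HTML_IMAGE = /<img\b[^>]*?\ssrc\s*=\s*["']([^"']+)["'][^>]*>/gi

// Collect every distinct image URL, whichever format the content is in.
function extractImageUrls(content) {
  const urls = new Set()
  for (const re of [MD_IMAGE, HTML_IMAGE]) {
    for (const m of content.matchAll(re)) urls.add(m[1])
  }
  return [...urls]
}

// Decide every local name up front, then swap each remote URL for it.
// nameFor(url, index) is a caller-supplied (hypothetical) naming function.
function localizeImages(content, nameFor) {
  const mapping = new Map()
  extractImageUrls(content).forEach((url, i) => mapping.set(url, nameFor(url, i)))
  // Replace longer URLs first so a URL that is a prefix of another is not clobbered.
  const urls = [...mapping.keys()].sort((a, b) => b.length - a.length)
  let result = content
  for (const url of urls) result = result.split(url).join(mapping.get(url))
  return { content: result, mapping }
}
```

Keeping the url-to-name mapping around means the same table can drive both the text replacement and the actual downloads, so picture and position cannot drift apart.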

Download the article

Now that we have the article's original markdown string, the next step is to save it locally.

There are two ways to do this. Both first convert the string to binary with a Blob and then save that binary locally.

One is to use an <a> tag and simulate a browser download through the tag's download attribute. The limitation of this method is that you cannot specify the download directory: files go to the default directory or to one the user picks manually.

I think the ideal layout is a folder named after the author, containing one subfolder per article named after the article title, with each subfolder holding a .md text file and the image files. That keeps everything tidy and easy to archive. So I chose the other way: the chrome.downloads.download method exposed to browser plug-ins.

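// Wrap the markdown string in a Blob so it can be downloaded through an object URL
var blob = new Blob([message.data.content], { type: 'text/x-markdown' })
chrome.downloads.download({
  url: URL.createObjectURL(blob),
  saveAs: false // request that no "Save as" dialog is shown
})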
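The author/article folder layout can be expressed through the filename option of chrome.downloads.download, which takes a path relative to the Downloads directory and may contain subdirectories. The sketch below only builds that path; safeName and articlePath are hypothetical helper names, and the set of replaced characters is an assumption about what is unsafe in file names.

```javascript
// Characters that are not allowed in file names on common platforms.
const ILLEGAL_CHARS = /[\\/:*?"<>|]/g

function safeName(name) {
  const cleaned = name.replace(ILLEGAL_CHARS, '_').replace(/\s+/g, ' ').trim()
  return cleaned || 'untitled'
}

// Builds "author/title/title.md", relative to the browser's Downloads directory.
function articlePath(author, title) {
  const folder = safeName(title)
  return safeName(author) + '/' + folder + '/' + folder + '.md'
}

// Inside the extension's background script (not runnable outside Chrome):
// chrome.downloads.download({
//   url: URL.createObjectURL(blob),
//   filename: articlePath(author, title),
//   saveAs: false
// })
```

Sanitizing the title matters because article titles often contain characters such as ? or /, which would otherwise be rejected or read as extra path segments.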

Download the pictures

The pictures are likewise downloaded with chrome.downloads.download. One thing to watch here: because the remote picture paths in the article have to be replaced, the local picture names must be decided before the download starts.

Picture links come with different suffixes, such as .jpg, .png, .gif, and .svg, and the suffix cannot always be read off the link. However, the Content-Type field in the response header tells you the MIME type, from which you can determine the suffix.
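A minimal sketch of mapping a Content-Type value to a file suffix. The function name extensionFor and the fallback parameter are illustrative assumptions; the table covers only the four suffixes mentioned above.

```javascript
// MIME type -> file suffix, limited to the formats mentioned in the text.
const EXTENSIONS = new Map([
  ['image/jpeg', '.jpg'],
  ['image/png', '.png'],
  ['image/gif', '.gif'],
  ['image/svg+xml', '.svg']
])

function extensionFor(contentType, fallback = '') {
  if (!contentType) return fallback
  // Drop parameters such as "; charset=utf-8" and normalize the case.
  const mime = contentType.split(';')[0].trim().toLowerCase()
  return EXTENSIONS.get(mime) || fallback
}
```

With fetch, the header value could be read as response.headers.get('content-type') and passed straight to extensionFor before the local file name is fixed.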

Normally, when a page downloads a file, the browser pops up a dialog that shows the file information and asks you to confirm the download. An article may contain many pictures, plus the article file itself, so confirming each one is far too tedious. In a batch download you might have to click a few hundred times, which is anything but friendly.

I took a look at the documentation: chrome.downloads.download has a saveAs parameter, and according to the documentation, setting it to false suppresses the download confirmation dialog. I tried it and it does not work. Whether or not the parameter is present, and whatever value it has, the dialog still pops up. I searched online and found that this is a very old bug that has never been fixed.

So that road is a dead end, but there are other ways.

Open Chrome's Settings, find the Downloads section (or open chrome://settings/?search=%E4%B8%8B%E8%BD%BD%E5%86%85%E5%AE%B9 directly in the browser), and turn off the "Ask where to save each file before downloading" switch. After that, the download dialog no longer pops up.

If you use this plug-in to batch-download articles, be sure to turn off the switch above, otherwise your browser may pop up dozens or hundreds of confirmation windows in an instant.

Of course, since you may not want to download the pictures at all, the plug-in also offers an option to skip them.

Summary

The logical flow of the plug-in is as follows:

Download Juejin articles to your local disk with this plug-in, then use MdPreview, another plug-in I wrote, to view local markdown files in the browser, and the whole workflow stays smoothly inside the browser.

The plug-in is not technically difficult. Most of the work went into figuring out how to implement it and handling all kinds of edge cases. My original intention was to write a quick plug-in to get my articles downloaded, yet the time it took to finish the plug-in would have been enough to download all my articles by hand ten times over. Talk about putting the cart before the horse (doge).

I tested it fairly thoroughly on my own computer and found basically no problems, but I cannot promise it will behave on yours. Errors are normal for crawler-like tools such as this one, because there are too many edge cases. If you run into a problem, you can open an issue, or submit a PR directly.

One known problem: pictures that are hotlinked from external sites instead of being uploaded to Juejin may fail to download because of cross-origin restrictions. A pure front-end solution cannot work around this. In my tests, however, the share of externally linked pictures was very small, so the impact is minor.

Publishing to the Chrome extension store requires registering an account, uploading a CRX file, and paying a registration fee. I could not be bothered, so the plug-in is not in the store. After downloading the source from GitHub, install it offline (see "Adding a plug-in for Offline Installation of Chrome").

I noticed three interesting things during testing:

Your article titles are each more exaggerated than the last.

A lot of people have no idea how to use markdown; their command of the syntax is even worse than mine.

Some people write very few words but include an awful lot of pictures.

Recruitment

The end of the year is approaching and the big companies are in the middle of layoffs. Right now everyone is waiting for their year-end bonus and unwilling to leave, which is exactly when companies struggle to hire and lower their bar. While others are still hesitating, I hope that you, reading this article, act immediately and send me your resume!

Yes, ByteDance Commercialization is still hiring, and dozens of open headcount is no exaggeration. If you send me your resume, I will follow up on its progress and let you know even if you do not pass the interview, so your resume never sinks into the sea. Will I stand you up? Of course not!

Without further ado, email me at [email protected]
