Yue: I heard you're sharing jq this time.
Me: Yes, that's right.
Yue: What is there to share about it?
Me: Here's the official site: github.com/stedolan/jq
Yue: So it's a JSON pretty-printer for the command line. There doesn't seem to be much to share.
Me: ...
Me: I can use it to analyze nginx logs, for example by writing the nginx logs as JSON.
Yue: Isn't the ELK stack we run online good enough? It's distributed, indexed, and fast, and its queries even auto-complete.
Me: That seems to make sense...
Me: I can also use it to write crawlers!!

This article is a practical use case for jq; if it makes the command line more enjoyable for you, so much the better. For more on jq usage, see my previous article: jq command details and examples

  • Use jq and sed to build a list of Juejin interview articles
  • Series of articles: Personal server operation and maintenance guide

Preview

The following command fetches the front-end interview list directly:

$ curl -s 'https://web-api.juejin.im/query' -H 'Content-Type: application/json' -H 'X-Agent: Juejin/Web' --data-binary '{"operationName":"","query":"","variables":{"tags":["55979fe6e4b08a686ce562fe"],"category":"5562b415e4b00c57d9b94ac8","first":100,"after":"","order":"HOTTEST"},"extensions":{"query":{"id":"653b587c5c7c8a00ddf67fc66f989d42"}}}' --compressed | \
  jq -c '.data.articleFeed.items.edges | .[].node | { likeCount, title, originalUrl } | select(.likeCount > 600)' | \
  jq -cs '. | sort_by(-.likeCount) | .[] | "+ 【👍 \(.likeCount)】 [\(.title)](\(.originalUrl))"' | \
  sed s/\"//g

+ 【👍 5059】 [A competent (excellent) front-end developer should read these articles](https://juejin.cn/post/6844903896637259784)
+ 【👍 4695】 [2018 front-end interview summary: understand it all and earn at least 3K more | Juejin technical essay](https://juejin.cn/post/6844903673009553416)
+ 【👍 4425】 [Big-company interview secrets for mid-level and senior front-end developers: escorting you through the spring hiring season and straight into a big company (Part 1)](https://juejin.cn/post/6844903776512393224)
+ 【👍 3013】 [2018 for the front end of the interview: fine alignment (fine) | Juejin technical essay](https://juejin.cn/post/6844903570001625102)
+ 【👍 2493】 [Lots of front-end interview topics? Just read these articles (updated June 2019)](https://juejin.cn/post/6844903577220349959)

Finding the Juejin list API

First, the request: the URL is https://web-api.juejin.im/query and the method is POST.

Next, the request body:

{
  "operationName": "",
  "query": "",
  "variables": {
    "first": 20,
    "after": "1.0168277174789",
    "order": "POPULAR"
  },
  "extensions": {
    "query": {
      "id": "21207e9ddb1de777adeaca7a2fb38030"
    }
  }
}

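The body above can also be assembled with jq rather than typed by hand, which avoids quoting mistakes. This is only a sketch: the values are the ones shown above (with an empty `after` cursor, as in the later commands), and the curl call is left as a comment because the endpoint may no longer respond.

```shell
# Build the request body with jq -n (no input) and pass values in as variables.
# --argjson keeps `first` a number; --arg passes strings.
body=$(jq -cn \
  --argjson first 20 \
  --arg order "POPULAR" \
  --arg id "21207e9ddb1de777adeaca7a2fb38030" \
  '{operationName: "", query: "", variables: {first: $first, after: "", order: $order}, extensions: {query: {id: $id}}}')

printf '%s\n' "$body"

# The body could then be sent the same way as in the article:
# curl -s 'https://web-api.juejin.im/query' -H 'Content-Type: application/json' \
#   -H 'X-Agent: Juejin/Web' --data-binary "$body" --compressed
```

Changing the page size or the sort order then only means changing a `--argjson` or `--arg` value.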
Finally, let's look at the HTTP response:

What a familiar data structure!!

Last month (2019/10) I wrote a series of articles on GraphQL, "Building Web Applications with GraphQL", and a year ago (2018) I used GraphQL to write the front end and back end of a poetry site: shfshanyue/shici and shfshanyue/shici-server

How can you tell it is GraphQL?

  • /query: this is the single unified entry point
  • extensions.query.id: this is APQ (automatic persisted queries), which caches the gql query; it reduces the payload size and network latency, helps caching, and also reduces some security issues
  • variables: these are GraphQL variables
  • data.items[].edges: this is typical GraphQL pagination (although I don't like it, too much nested data...)
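To make the nesting concrete, here is a small sketch that unwraps the `edges` / `node` layers with jq. The response below is made up and only mirrors the shape used by the commands in this article (`.data.articleFeed.items.edges[].node`); the real response carries more fields.

```shell
# A made-up response with the same nesting as the real one.
response='{"data":{"articleFeed":{"items":{"edges":[
  {"node":{"title":"Article A","likeCount":10}},
  {"node":{"title":"Article B","likeCount":20}}
]}}}}'

# Walk down to edges, then emit each node on its own line.
nodes=$(printf '%s\n' "$response" | jq -c '.data.articleFeed.items.edges[].node')
printf '%s\n' "$nodes"
# {"title":"Article A","likeCount":10}
# {"title":"Article B","likeCount":20}
```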

OK, I seem to have gone off topic.

Anyway, we now have the data: the front-end interview articles.

$ curl -s 'https://web-api.juejin.im/query' -H 'Content-Type: application/json' -H 'X-Agent: Juejin/Web' --data-binary '{"operationName":"","query":"","variables":{"tags":["55979fe6e4b08a686ce562fe"],"category":"5562b415e4b00c57d9b94ac8","first":100,"after":"","order":"HOTTEST"},"extensions":{"query":{"id":"653b587c5c7c8a00ddf67fc66f989d42"}}}' --compressed

ETL

"ETL" is a rather grand word for what follows.

Let's start by using jq to pick out a few fields: the title and the like count. For more usage, see my previous article: jq command details and examples

To make the jq usage easier to read, the jq command is placed on its own line, where:

  • -c: compact output, one JSON value per line
  • .[]: turns a JSON array into JSON Lines, one element per line
  • {}: picks the listed fields, similar to lodash.pick
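The three pieces above can be tried on a tiny inline array before pointing them at the real API. The sample data is invented for illustration.

```shell
# Invented sample data: an array of article objects.
articles='[{"title":"A","likeCount":1,"originalUrl":"https://example.com/a"},{"title":"B","likeCount":2,"originalUrl":"https://example.com/b"}]'

# .[]   emits each array element as a separate result
# {...} keeps only the listed fields (originalUrl is dropped here)
# -c    prints each result on a single line
picked=$(printf '%s\n' "$articles" | jq -c '.[] | {title, likeCount}')
printf '%s\n' "$picked"
# {"title":"A","likeCount":1}
# {"title":"B","likeCount":2}
```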

With this command we now have all the highly liked articles, although at this point they are unordered:

$ curl -s 'https://web-api.juejin.im/query' -H 'Content-Type: application/json' -H 'X-Agent: Juejin/Web' --data-binary '{"operationName":"","query":"","variables":{"tags":["55979fe6e4b08a686ce562fe"],"category":"5562b415e4b00c57d9b94ac8","first":100,"after":"","order":"HOTTEST"},"extensions":{"query":{"id":"653b587c5c7c8a00ddf67fc66f989d42"}}}' --compressed | \
  jq -c '.data.articleFeed.items.edges | .[].node | {title, likeCount}'

{"title":"Big-company interview secrets for mid-level and senior front-end developers: escorting you through the spring hiring season and straight into a big company (Part 1)","likeCount":4423}
{"title":"2018 front-end interview summary: understand it all and earn at least 3K more | Juejin technical essay","likeCount":4690}
{"title":"A competent (excellent) front-end developer should read these articles","likeCount":5052}
{"title":"Lots of front-end interview topics? Just read these articles (updated June 2019)","likeCount":2492}
{"title":"2018 for the front end of the interview: fine alignment (fine) | Juejin technical essay","likeCount":3013}

Data filtering and sorting

Next, keep only the articles with more than 600 likes:

select(.likeCount > 600)
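A quick way to see what `select` does is to feed it a few hand-written lines. The data is invented; note that an article with exactly 600 likes is dropped, because the comparison is strictly greater than.

```shell
# Invented JSON Lines input, one object per line.
lines='{"title":"A","likeCount":5052}
{"title":"B","likeCount":120}
{"title":"C","likeCount":600}'

# select(cond) passes an input through unchanged when cond is true, and emits nothing otherwise.
kept=$(printf '%s\n' "$lines" | jq -c 'select(.likeCount > 600)')
printf '%s\n' "$kept"
# {"title":"A","likeCount":5052}
```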

Then sort in descending order of likes:

jq  -s '. | sort_by(-.likeCount) | .[]'
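Sorting needs the whole list at once, which is why `-s` (`--slurp`) appears here: it reads every input line into a single array before the filter runs. A minimal sketch on invented data:

```shell
# Invented JSON Lines input in arbitrary order.
lines='{"title":"A","likeCount":4423}
{"title":"B","likeCount":5052}
{"title":"C","likeCount":4690}'

# -s collects all inputs into one array; sort_by sorts ascending by the given key,
# so negating the number gives descending order; .[] turns the array back into lines.
sorted=$(printf '%s\n' "$lines" | jq -cs 'sort_by(-.likeCount) | .[]')
printf '%s\n' "$sorted"
# {"title":"B","likeCount":5052}
# {"title":"C","likeCount":4690}
# {"title":"A","likeCount":4423}
```

Negation only works for numeric keys; for other types, `sort_by(.key) | reverse` is a common alternative.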

Done. The list now contains only the articles with more than 600 likes, sorted by like count:

$ curl -s 'https://web-api.juejin.im/query' -H 'Content-Type: application/json' -H 'X-Agent: Juejin/Web' --data-binary '{"operationName":"","query":"","variables":{"tags":["55979fe6e4b08a686ce562fe"],"category":"5562b415e4b00c57d9b94ac8","first":100,"after":"","order":"HOTTEST"},"extensions":{"query":{"id":"653b587c5c7c8a00ddf67fc66f989d42"}}}' --compressed | \
 jq -c '.data.articleFeed.items.edges | .[].node | {title, likeCount, originalUrl } | select(.likeCount > 600) ' | jq -s '. | sort_by(-.likeCount) | .[]'

{
  "title": "A competent (excellent) front-end developer should read these articles",
  "likeCount": 5052,
  "originalUrl": "https://juejin.cn/post/6844903896637259784"
}
{
  "title": "2018 front-end interview summary: understand it all and earn at least 3K more | Juejin technical essay",
  "likeCount": 4690,
  "originalUrl": "https://juejin.cn/post/6844903673009553416"
}
{
  "title": "Big-company interview secrets for mid-level and senior front-end developers: escorting you through the spring hiring season and straight into a big company (Part 1)",
  "likeCount": 4423,
  "originalUrl": "https://juejin.cn/post/6844903776512393224"
}

Use sed to generate Markdown

At this point we have structured data. If we were rendering it with React, JSON would be just fine. But here we need Markdown, so a little more processing is required.

Start by using jq to generate the link format:

"+ 【👍 \(.likeCount)】 [\(.title)](\(.originalUrl))"
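The link step relies on jq string interpolation: inside a string literal, `\(expr)` is replaced by the value of `expr`. The sketch below assumes the template is meant to produce a Markdown list item of the form `+ 【👍 likes】 [title](url)`; the like count and title are sample values.

```shell
# One sample article object.
item='{"likeCount":5052,"title":"A good front end should read these articles","originalUrl":"https://juejin.cn/post/6844903896637259784"}'

# \(...) inside a jq string literal interpolates the value of the expression.
# Without -r, jq prints the result as a JSON string, so it is still wrapped in double quotes.
line=$(printf '%s\n' "$item" | jq -c '"+ 【👍 \(.likeCount)】 [\(.title)](\(.originalUrl))"')
printf '%s\n' "$line"
```

The surrounding double quotes in the printed line are what the next step removes.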

Use sed to delete all double quotes. For more information about sed, see my article: sed command description and examples

sed s/\"//g
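One caveat with stripping quotes: sed removes every double quote, including quotes that belong to a title, and it leaves behind the backslashes jq used to escape them. jq's `-r` (`--raw-output`) flag prints the string itself and is a possible alternative. The comparison below is a sketch on an invented title and URL, not the article's original command.

```shell
# Invented article whose title contains double quotes.
item='{"likeCount":2359,"title":"\"Senior front-end interview\" JavaScript","originalUrl":"https://example.com/post"}'
template='"+ 【👍 \(.likeCount)】 [\(.title)](\(.originalUrl))"'

# Approach from the article: JSON-encoded output, then delete every double quote.
with_sed=$(printf '%s\n' "$item" | jq -c "$template" | sed 's/"//g')

# Alternative: -r prints the raw string, so there is nothing to strip.
with_raw=$(printf '%s\n' "$item" | jq -r "$template")

printf '%s\n' "$with_sed"   # title quotes are gone, escape backslashes remain
printf '%s\n' "$with_raw"   # title quotes are preserved
```

For titles without quotes or backslashes the two approaches give the same line.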

At this point, the Markdown has been generated successfully:

$ curl -s 'https://web-api.juejin.im/query' -H 'Content-Type: application/json' -H 'X-Agent: Juejin/Web' --data-binary '{"operationName":"","query":"","variables":{"tags":["55979fe6e4b08a686ce562fe"],"category":"5562b415e4b00c57d9b94ac8","first":100,"after":"","order":"HOTTEST"},"extensions":{"query":{"id":"653b587c5c7c8a00ddf67fc66f989d42"}}}' --compressed | \
  jq -c '.data.articleFeed.items.edges | .[].node | { likeCount, title, originalUrl } | select(.likeCount > 600)' | \
  jq -cs '. | sort_by(-.likeCount) | .[] | "+ 【👍 \(.likeCount)】 [\(.title)](\(.originalUrl))"' | \
  sed s/\"//g

+ 【👍 5059】 [A competent (excellent) front-end developer should read these articles](https://juejin.cn/post/6844903896637259784)
+ 【👍 4695】 [2018 front-end interview summary: understand it all and earn at least 3K more | Juejin technical essay](https://juejin.cn/post/6844903673009553416)
+ 【👍 4425】 [Big-company interview secrets for mid-level and senior front-end developers: escorting you through the spring hiring season and straight into a big company (Part 1)](https://juejin.cn/post/6844903776512393224)
+ 【👍 3013】 [2018 for the front end of the interview: fine alignment (fine) | Juejin technical essay](https://juejin.cn/post/6844903570001625102)
+ 【👍 2493】 [Lots of front-end interview topics? Just read these articles (updated June 2019)](https://juejin.cn/post/6844903577220349959)
+ 【👍 2359】 [「Senior front-end interview」 JavaScript handwritten code secrets](https://juejin.cn/post/6844903809206976520)
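For comparison, the filtering, sorting, and formatting stages can also be collapsed into a single jq invocation. This is a sketch on an invented response, so it runs without the API; it uses `-r` instead of sed, and the example.com URLs are placeholders.

```shell
# Invented response with the same nesting as the real one.
response='{"data":{"articleFeed":{"items":{"edges":[
  {"node":{"title":"A","likeCount":700,"originalUrl":"https://example.com/a"}},
  {"node":{"title":"B","likeCount":100,"originalUrl":"https://example.com/b"}},
  {"node":{"title":"C","likeCount":900,"originalUrl":"https://example.com/c"}}
]}}}}'

# Collect the filtered nodes into an array, sort it, then format each element.
markdown=$(printf '%s\n' "$response" | jq -r '
  [.data.articleFeed.items.edges[].node | select(.likeCount > 600)]
  | sort_by(-.likeCount)
  | .[]
  | "+ 【👍 \(.likeCount)】 [\(.title)](\(.originalUrl))"')

printf '%s\n' "$markdown"
```

Wrapping the filtered nodes in `[...]` plays the role that `-s` plays in the two-process version: it gives `sort_by` a single array to work on.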

I am Shanyue, a programmer who likes running and mountain climbing. I regularly share full-stack articles on my personal official account. If you are interested in full-stack interviews, front-end engineering, GraphQL, DevOps, personal server operations, and microservices, please follow me.