Wait no longer! Create RSS feeds for all websites you care about and read them from the comfort of your feed reader.

More and more websites no longer offer RSS feeds. As an RSS fan, though, you still want a tool that aggregates the content you follow and pushes the latest news and updates to your phone in real time.

So today let's spend two hours cranking out an RSS generator.

The main character of this article is Laravel.

1. Build the Laravel skeleton

We'll stick with the laravel-admin plugin because we need an admin backend for adding the sites we care about.

// 1. Create the project
composer create-project --prefer-dist laravel/laravel:5.5 lrss
cd lrss
cp .env.example .env
php artisan key:generate

// 2. Install the laravel-admin plugin
composer require encore/laravel-admin "1.5.*"

php artisan vendor:publish --provider="Encore\Admin\AdminServiceProvider"

php artisan admin:install

If the migration fails with the following error:

SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 767 bytes

Solution: set a default string length in app/Providers/AppServiceProvider.php:

use Illuminate\Support\Facades\Schema;

public function boot()
{
    // Keep indexed string columns within the 767-byte key limit.
    Schema::defaultStringLength(191);
}
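Why 191? Here is a short sketch of the arithmetic. It assumes the tables use the utf8mb4 charset (Laravel's default since 5.4), where one character can take up to 4 bytes, and takes the 767-byte limit from the error message. The variable names are illustrative only.

```php
<?php
// Assumption: utf8mb4 stores up to 4 bytes per character.
$maxIndexBytes = 767; // limit quoted in the MySQL error message
$bytesPerChar  = 4;   // worst case for utf8mb4

// Longest string column whose index still fits within the limit.
$maxIndexedLength = intdiv($maxIndexBytes, $bytesPerChar);

echo $maxIndexedLength, PHP_EOL; // prints 191
```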

With laravel-admin in place, we also use Symfony's DomCrawler component (symfony/dom-crawler) to parse website content with XPath.

2. Parsing XPath

We wanted to use Huginn to generate our RSS feeds. See article: Make all Web pages RSS — Huginn git.huginn.cn/docs/%E8%AE…

In practice, however, Huginn kept crashing for no apparent reason, and its background jobs often failed. That's what gave me the idea of rolling my own.

Still, Huginn gave me the idea of parsing XPath to generate RSS feeds.

Creating an Xpath Controller

To verify that the XPath expressions we enter are correct, we can lean on Huginn.

First test the XPath in Huginn by entering the following in the WebsiteAgent interface:

{
  "expected_update_period_in_days": "2",
  "url": "http://www.woshipm.com/",
  "type": "html",
  "mode": "on_change",
  "extract": {
    "title": {
      "xpath": "//div[@class=\"postlist-item u-clearfix\"]/div[2]/h2/a/text()",
      "value": "normalize-space(.)"
    },
    "desc": {
      "xpath": "//div[@class=\"postlist-item u-clearfix\"]/div[2]/p/text()",
      "value": "normalize-space(.)"
    },
    "url": {
      "xpath": "//div[@class=\"postlist-item u-clearfix\"]/div[2]/h2/a",
      "value": "@href"
    }
  }
}

Then click “Dry Run” to test:

Finally, based on the fields we filled in for Huginn, let's create the Xpath model and controller:

// bash
php artisan make:model Xpath -m

// migration
public function up()
{
    Schema::create('xpaths', function (Blueprint $table) {
        $table->increments('id');

        // site url and its one-sentence description
        $table->string('url', 250);
        $table->string('urldesc', 250);

        // title
        $table->string('titlexpath', 250);
        $table->string('titlevalue', 100)
              ->nullable();

        // desc
        $table->string('descxpath', 250);
        $table->string('descvalue', 100)
              ->nullable();

        // article url (with optional prefix)
        $table->string('preurl', 50)->nullable();

        $table->string('urlxpath', 250);
        $table->string('urlvalue', 100)
              ->nullable();

        $table->timestamps();
    });
}

// migrate
php artisan migrate

// create the admin controller
php artisan admin:make XpathController --model=App\\Xpath

// set up the route
$router->resource('xpaths', XpathController::class);

// add to the admin menu
// (omitted)

Note: see the previous article, "Recommend a Laravel Admin backend plugin".

CRUD for XPath

With the laravel-admin plugin, managing the XPath records is easy. Let's go straight to the code:

<?php

namespace App\Admin\Controllers;

use App\Xpath;

use Encore\Admin\Form;
use Encore\Admin\Grid;
use Encore\Admin\Facades\Admin;
use Encore\Admin\Layout\Content;
use App\Http\Controllers\Controller;
use Encore\Admin\Controllers\ModelForm;

class XpathController extends Controller
{
    use ModelForm;

    /**
     * Index interface.
     *
     * @return Content
     */
    public function index()
    {
        return Admin::content(function (Content $content) {

            $content->header('header');
            $content->description('description');

            $content->body($this->grid());
        });
    }

    /**
     * Edit interface.
     *
     * @param $id
     * @return Content
     */
    public function edit($id)
    {
        return Admin::content(function (Content $content) use ($id) {

            $content->header('header');
            $content->description('description');

            $content->body($this->form()->edit($id));
        });
    }

    /**
     * Create interface.
     *
     * @return Content
     */
    public function create()
    {
        return Admin::content(function (Content $content) {

            $content->header('header');
            $content->description('description');

            $content->body($this->form());
        });
    }

    /**
     * Make a grid builder.
     *
     * @return Grid
     */
    protected function grid()
    {
        return Admin::grid(Xpath::class, function (Grid $grid) {

            $grid->id('ID')->sortable();

            $grid->column('url');
            $grid->column('urldesc', 'Description');

            $grid->column('titlexpath');
            $grid->column('titlevalue');

            $grid->column('descxpath');
            $grid->column('descvalue');

            $grid->column('preurl');
            $grid->column('urlxpath');
            $grid->column('urlvalue');

            $grid->created_at();
            $grid->updated_at();
        });
    }

    /**
     * Make a form builder.
     *
     * @return Form
     */
    protected function form()
    {
        return Admin::form(Xpath::class, function (Form $form) {

            $form->display('id', 'ID');

            // site url
            $form->text('url', 'Link')
                ->placeholder('Please enter the url to parse')
                ->rules('required|min:5|max:250');

            $form->text('urldesc', 'One-sentence description')
                ->placeholder('One-sentence description')
                ->rules('required|min:5|max:250');

            // title
            $form->divide();
            $form->text('titlexpath', 'Title xpath')
                ->placeholder('Please enter the title xpath')
                ->rules('required|min:5|max:250');

            $form->text('titlevalue', 'Title value (may be left blank)')
                ->default(' ')
                ->rules('max:100');

            // desc
            $form->divide();
            $form->text('descxpath', 'Desc xpath')
                ->placeholder('Please enter the description xpath')
                ->rules('required|min:5|max:250');

            $form->text('descvalue', 'Desc value (may be left blank)')
                ->default(' ')
                ->rules('max:100');

            // article url
            $form->divide();
            $form->text('preurl', 'Url prefix')
                ->placeholder('Please enter the url prefix of the article')
                ->rules('max:50');

            $form->text('urlxpath', 'Url xpath')
                ->placeholder('Please enter the url xpath for the article')
                ->rules('required|min:5|max:250');

            $form->text('urlvalue', 'Url value (may be left blank)')
                ->default(' ')
                ->rules('max:100');

            $form->divide();
            $form->display('created_at', 'Created At');
            $form->display('updated_at', 'Updated At');
        });
    }
}

Try adding two sites:

Converting XPath to an RSS Feed

1. Parse the page content according to the stored XPath information:

// At the top of the class file (the XpathModel alias is assumed from the type hint below):
use App\Xpath as XpathModel;
use Symfony\Component\DomCrawler\Crawler;

public static function analysis(XpathModel $model) {
    // Download the page and hand the markup to DomCrawler
    $html = file_get_contents($model->url);

    $crawler = new Crawler($html);

    $titlenodes = $crawler->filterXPath($model->titlexpath);
    $titles = self::getValueByNodes($titlenodes, $model->titlevalue);

    $descnodes = $crawler->filterXPath($model->descxpath);
    $desces = self::getValueByNodes($descnodes, $model->descvalue);

    $urlnodes = $crawler->filterXPath($model->urlxpath);
    $urls = self::getValueByNodes($urlnodes, $model->urlvalue);

    return RssFeeds::feeds($model, $titles, $desces, $urls);
}

// Get the value of each node: its text by default, or the attribute named by $key
public static function getValueByNodes(Crawler $crawler, $key = null) {
    return $crawler->each(function (Crawler $node) use ($key) {
        if (empty($key)) {
            return trim($node->text());
        } else {
            return $node->attr($key);
        }
    });
}
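To try the extraction rule without a Laravel project, here is a minimal sketch of the same idea using PHP's built-in DOMDocument and DOMXPath in place of Symfony's DomCrawler. The sample HTML and the extractValues helper are hypothetical and only mimic the page structure targeted above.

```php
<?php
// Hypothetical sample markup, shaped like the list page targeted above.
$html = <<<HTML
<div class="postlist-item u-clearfix">
  <div>thumb</div>
  <div><h2><a href="/pd/1.html"> First post </a></h2><p>Summary one</p></div>
</div>
<div class="postlist-item u-clearfix">
  <div>thumb</div>
  <div><h2><a href="/pd/2.html">Second post</a></h2><p>Summary two</p></div>
</div>
HTML;

/**
 * Return the trimmed text of each matched element, or the named
 * attribute when $attr is given. (Hypothetical helper that mirrors
 * getValueByNodes; expects expressions that select elements.)
 */
function extractValues(DOMXPath $xpath, string $expression, ?string $attr = null): array
{
    $nodes = $xpath->query($expression);
    if ($nodes === false) {
        return []; // invalid XPath expression
    }

    $values = [];
    foreach ($nodes as $node) {
        $values[] = empty($attr)
            ? trim($node->textContent)
            : $node->getAttribute($attr);
    }
    return $values;
}

$doc = new DOMDocument();
// Suppress the warnings that imperfect real-world HTML tends to raise.
@$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$item   = '//div[@class="postlist-item u-clearfix"]/div[2]';
$titles = extractValues($xpath, $item . '/h2/a');
$urls   = extractValues($xpath, $item . '/h2/a', 'href');
$desces = extractValues($xpath, $item . '/p');

print_r($titles);
```

The three arrays line up by position, which is what the feed-building step relies on.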

2. Load the title, desc, and url arrays into feed items to build the RSS:

public static function feeds(Xpath $xpath, $titles = [], $desces = [], $urls = []) {
    if (!empty($xpath->preurl)) {
        $preurl = $xpath->preurl;
        $urlss = collect($urls)->map(function ($url, $key) use ($preurl) {
            return $preurl.trim($url);
        });
    } else {
        $urlss = collect($urls);
    }
    return response()
        ->view('rss', [
            'xpath' => $xpath,
            'titles' => $titles,
            'desces' => $desces,
            'urls' => $urlss->toArray(),
            'pubDate' => Carbon::now()
        ])
        ->header('Content-Type', 'text/xml');
}
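The prefix step can be sketched in plain PHP as follows. Note that the prefixUrls helper is hypothetical and goes slightly beyond the original: it normalizes the slash between prefix and path, and leaves URLs that are already absolute untouched, whereas the method above simply concatenates.

```php
<?php
/**
 * Prepend $prefix to each URL unless the URL is already absolute.
 * (Hypothetical helper; the absolute-URL check and slash handling
 * are additions, not part of the original feeds() method.)
 */
function prefixUrls(array $urls, ?string $prefix): array
{
    return array_map(function ($url) use ($prefix) {
        $url = trim($url);
        if (empty($prefix) || preg_match('#^https?://#i', $url)) {
            return $url;
        }
        // Join with exactly one slash between prefix and path.
        return rtrim($prefix, '/') . '/' . ltrim($url, '/');
    }, $urls);
}

$urls = prefixUrls(
    [' /pd/1.html', 'http://example.com/pd/2.html'],
    'http://example.com/'
);

print_r($urls);
```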

3. Write a Blade template

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
   <channel>
       <title>{{ $xpath->url or 'title' }}</title>
       <description>{{ $xpath->urldesc or 'description' }}</description>
       <link>{{ $xpath->url }}</link>
       <atom:link href="{{ url("/feed/$xpath->id") }}" rel="self" type="application/rss+xml"/>
       <pubDate>{{ $pubDate }}</pubDate>
       <lastBuildDate>{{ $pubDate }}</lastBuildDate>
       <generator>coding01</generator>
       @foreach ($titles as $key => $title)
       <item>
           <title>{{ $title }}</title>
           <link>{{ $urls[$key] }}</link>
           <description>{{ $desces[$key] }}</description>
           <pubDate>{{ $pubDate }}</pubDate>
           <author>coding01</author>
           <guid>{{ $urls[$key] }}</guid>
           <category>{{ $title }}</category>
       </item>
       @endforeach
   </channel>
</rss>
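The template looks up $urls[$key] and $desces[$key] by each title's position, so it relies on the three arrays having the same length. If one XPath matches fewer nodes than the others, that lookup hits an undefined offset. One possible guard, sketched here with a hypothetical zipFeedItems helper that is not part of the original code, is to combine the arrays up front and keep only the positions present in all three:

```php
<?php
/**
 * Combine the parallel arrays into one list of items, keeping only
 * the positions that exist in all three. (Hypothetical helper.)
 */
function zipFeedItems(array $titles, array $urls, array $desces): array
{
    // Re-index so that positions are 0..n-1 regardless of the input keys.
    $titles = array_values($titles);
    $urls   = array_values($urls);
    $desces = array_values($desces);

    $count = min(count($titles), count($urls), count($desces));

    $items = [];
    for ($i = 0; $i < $count; $i++) {
        $items[] = [
            'title' => $titles[$i],
            'link'  => $urls[$i],
            'desc'  => $desces[$i],
        ];
    }
    return $items;
}

// The third description is missing, so only two complete items remain.
$items = zipFeedItems(['A', 'B', 'C'], ['/a', '/b', '/c'], ['first', 'second']);

echo count($items), PHP_EOL; // prints 2
```

A view could then loop over the combined items instead of indexing three separate arrays.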

4. Finally, check the result: each website now has its own RSS feed:

Live RSS feeds

At this point the Laravel code is done, but to reach my goal of having content pushed to my phone promptly, I use two tools:

  1. Tiny Tiny RSS
  2. IFTTT + DingTalk

Add the generated RSS link to Tiny Tiny RSS and have it refresh every half hour to pick up the latest content:

Then use IFTTT to bind the Webhook of a DingTalk group robot:

Finally, the latest updates arrive promptly on your phone or PC:

Conclusion

Today I spent 2 hours building a demo RSS feed generator on my own with the laravel-admin and symfony/dom-crawler packages.

Further improvements are needed, such as generating an RSS feed directly from a web URL the way feed43.com/ does, and letting you set your own update interval.

Finally, the code is on GitHub for reference: github.com/fanly/lrss

“To be continued”