Platypush

Use the RSS and email integrations to create automated newsletters.


I’ve always been a supporter of well-curated newsletters. They give me a good overview of what happened in the fields I follow within the span of a day, a week or a month. However, not all newsletters fit this category. Some don’t think twice before selling email addresses to 3rd-parties, and within the blink of an eye your mailbox can get flooded with messages you never requested. Others may sign your address up for further services or newsletters, and they often don’t offer much granularity to configure which communications you want to receive.

Even in the best-case scenario, the most privacy-savvy user may still think twice before signing up for a newsletter: you’re giving your personal email address to someone you don’t necessarily trust, implying “yes, this is my address and I’m interested in this subject”. Additionally, most newsletters spice up their URLs with tracking parameters so they can easily measure user engagement, something you may not necessarily be happy with.

Moreover, the customization junkie may also have a valid use case for a more finely tuned selection of content in their newsletter: you may want to group some sources together into the same daily/weekly email, keep only the subset of subjects covered by a newsletter that is relevant to you, or customize the style of the digest that gets delivered. Finally, a fully automated way to deliver newsletters through five lines of code and the tuning of a couple of parameters is the nirvana for many companies of every size out there.

Feed up the newsletter

Those who have read my articles in the past may know that I’m an avid consumer of RSS feeds. Despite being a 21-year-old technology, they do their job very well when it comes to delivering the information that matters without all the noise and trackers, and, being simple XML documents, they offer a very high level of integration. However, in spite of all the effort I put into staying up to date with all my sources, a lot of potentially interesting content inevitably slips through. That’s where newsletters step in: they filter and group together all the content that was generated in a given time frame and periodically deliver it to your inbox.

My ideal solution would combine the best aspects of both worlds: the flexibility of an RSS subscription, a configurable way of filtering and aggregating content and sources, and the full package delivered to my door in whichever format I like (HTML, PDF, MOBI…). In this article I’m going to show how to achieve this goal with a few tools:

  • One or more sources that you want to track and that support RSS feeds (in this example I’ll use the MIT Technology Review RSS feed, but the procedure works for any RSS feed).

  • An email address.

  • Platypush to do the heavy-lifting job — monitor the RSS sources at custom intervals, trigger events when a source has some new content, create a digest out of the new content, and deliver the full package to a list of email addresses.

Let’s cover these points step by step.

Installing and configuring Platypush

We’ll be using the http.poll backend configured with one or more RssUpdates objects to poll our RSS sources at regular intervals and create the digests, and either the mail.smtp plugin or the google.mail plugin to send the digests to our email.

You can install Platypush on any device where you want to run your logic — a RaspberryPi, an old laptop, a cloud node, and so on. We will install the base package with the rss module. Optionally, you can also install the pdf module (if you want to export your digests to PDF as well) or the google module (if you want to send the newsletter from a GMail address instead of an SMTP server).

The first option is to install the latest stable version through pip:

[sudo] pip install 'platypush[rss,pdf,google]'

The other option is to install the latest git version:

git clone https://git.platypush.tech/platypush/platypush
cd platypush
[sudo] pip install '.[rss,pdf,google]'
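
Either way, if you’d rather avoid a system-wide [sudo] installation, you can install the same packages inside a Python virtual environment first (the path below is just a suggestion):

python -m venv ~/.venvs/platypush
source ~/.venvs/platypush/bin/activate
pip install 'platypush[rss,pdf,google]'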

Monitoring your RSS feeds

Once the software is installed, create the configuration file ~/.config/platypush/config.yaml if it doesn't exist already and add the configuration for the RSS monitor:

backend.http.poll:
    requests:
        # Add a new RSS feed to the pool
        - type: platypush.backend.http.request.rss.RssUpdates
          # URL to the RSS feed
          url: https://www.technologyreview.com/feed/
          # Title of the feed (shown in the head of the digest)
          title: MIT Technology Review
          # How often we should monitor this source (24*60*60 secs = once a day)
          poll_seconds: 86400
          # Format of the digest (HTML or PDF)
          digest_format: html

You can also add more sources to the http.poll requests list, each with its own configuration. Additionally, you can customize the style of your digest by passing some valid CSS through these configuration attributes (a combined example follows):

# Style of the body element
body_style: 'font-size: 20px; font-family: "Merriweather", Georgia, "Times New Roman", Times, serif'

# Style of the main title
title_style: 'margin-top: 30px'

# Style of the subtitle
subtitle_style: 'margin-top: 10px; page-break-after: always'

# Style of the article titles
article_title_style: 'font-size: 1.6em; margin-top: 1em; padding-top: 1em; border-top: 1px solid #999'

# Style of the article link
article_link_style: 'color: #555; text-decoration: none; border-bottom: 1px dotted; font-size: 0.8em'

# Style of the article content
article_content_style: 'font-size: 0.8em'
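
For instance, here’s a minimal sketch of a configuration that monitors two feeds and applies some of the style attributes above (the second URL is just a placeholder for any other feed you follow):

backend.http.poll:
    requests:
        # Daily digest with custom styles
        - type: platypush.backend.http.request.rss.RssUpdates
          url: https://www.technologyreview.com/feed/
          title: MIT Technology Review
          poll_seconds: 86400
          digest_format: html
          body_style: 'font-size: 20px; font-family: "Merriweather", Georgia, serif'
          article_title_style: 'font-size: 1.6em; margin-top: 1em'

        # Weekly digest with the default styles (placeholder URL)
        - type: platypush.backend.http.request.rss.RssUpdates
          url: https://example.com/feed/
          title: Another Feed
          poll_seconds: 604800
          digest_format: html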

The digest_format attribute determines the output format of your digest: you may want to choose html if you want to deliver a summary of the articles in a newsletter, or pdf if you want instead to deliver the full content of each item as an attachment. Bonus point: since you can send PDFs to a Kindle through the email address associated with the device, this mechanism also lets you deliver the full digest of your RSS feeds straight to your Kindle (see the PDF attachment hook later in this article).

The RssUpdates object also provides native integration with the Mercury Parser API to automatically scrape the content of a web page. I covered some of these concepts in my previous article on how to parse RSS feeds and send the PDF digest to your e-reader, and the same mechanism works well for newsletters too. If you want to parse the full content of the feed items, all you have to do is configure the http.webpage Platypush plugin. Since the Mercury API doesn’t provide a Python binding, this requires a couple of JavaScript dependencies:

# Install Node and NPM, e.g. on Debian:
apt-get install nodejs npm

# Install the Mercury Parser API
npm install [-g] @postlight/mercury-parser

# Make sure that the Platypush PDF module dependencies are
# installed if you plan to do HTML->PDF conversion
pip install 'platypush[pdf]'

Then, if you want to parse the full content of the items and generate a PDF digest out of them, change your http.poll configuration to something like this:

backend.http.poll:
    requests:
        - type: platypush.backend.http.request.rss.RssUpdates
          url: https://www.technologyreview.com/feed/
          title: MIT Technology Review
          poll_seconds: 86400
          # PDF digest format
          digest_format: pdf
          # Extract the full content of the items
          extract_content: True

WARNING: Extracting the full content of the articles in an RSS feed has two limitations — a practical one and a legal one:

  • Some websites may require user login before displaying the full content of an article. Some websites perform such checks client-side — and the parser API can usually circumvent them, especially if the full content of an article is actually just hidden behind a client-side paywall. Some websites, however, implement their user checks server-side too before sending the content to the client — and in those cases the parser API may return only a part of the content or no content at all.

  • Always keep in mind that scraping the full content of an article behind a paywall may represent a violation of intellectual property in some jurisdictions, so make sure to do it only for content that is either free or that you have permission to scrape.

Configuring the mail delivery

When new content is published on a subscribed RSS feed, Platypush will generate a NewFeedEvent and store a copy of the digest under ~/.local/share/platypush/feeds/cache/{date:time}_{feed-title}.[html|pdf]. The NewFeedEvent, in particular, is the link you need to create your custom logic that sends an email to a list of addresses whenever new content is available.
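
To quickly check that the monitor is working, you can list the cache folder after the first poll has completed (the file name below is purely illustrative):

ls ~/.local/share/platypush/feeds/cache/
# 2021-03-06T10:00:00_MIT-Technology-Review.html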

First, configure the Platypush mail plugin you prefer. When it comes to sending emails you primarily have two options:

  • The mail.smtp plugin — if you want to send emails directly through an SMTP server. Platypush configuration:
mail.smtp:
    username: you@gmail.com
    password: your-pass
    server: smtp.gmail.com
    port: 465
    ssl: True
  • The google.mail plugin — if you want to use the native GMail API to send emails. If that is the case then first make sure that you have the dependencies for the Platypush Google module installed:
[sudo] pip install 'platypush[google]'

In this case you’ll also have to create a project on the Google Developers console and download the OAuth credentials:

  • Click on “Credentials” from the menu, then create a new OAuth client ID.

  • Once generated, you can see your new credentials in the “OAuth 2.0 client IDs” section. Click on the “Download” icon to save them to a JSON file.

  • Copy the file to your Platypush device/server under e.g. ~/.credentials/google/client_secret.json.

  • Run the following command on the device to authorize the application:

python -m platypush.plugins.google.credentials \
    "https://www.googleapis.com/auth/gmail.modify" \
    ~/.credentials/google/client_secret.json \
    --noauth_local_webserver

At this point the GMail delivery is ready to be used by your Platypush automation.

Connecting the dots

Now that both the RSS parsing logic and the mail integration are in place, we can glue them together through the NewFeedEvent event. The advised way to configure event hooks in Platypush is now through native Python scripts: the custom YAML-based syntax for events and procedures was becoming too cumbersome to write and maintain (although it’s still supported), and I feel that going back to a clean and simple Python API is a better option.

Create and initialize the Platypush scripts directory, if it doesn’t exist already:

mkdir -p ~/.config/platypush/scripts
cd ~/.config/platypush/scripts

# Make sure that the scripts module is initialized
touch __init__.py

Then, create a new hook on NewFeedEvent:

$EDITOR rss_news.py

import os
from typing import List

from platypush.event.hook import hook
from platypush.message.event.http.rss import NewFeedEvent
from platypush.utils import run

# Path to your mailing list - a text file with one address per line
maillist = os.path.expanduser('~/.mail.list')

def get_addresses() -> List[str]:
    with open(maillist, 'r') as f:
        return [addr.strip() for addr in f.readlines()
                if addr.strip() and not addr.strip().startswith('#')]


# This hook matches:
# - event_type=NewFeedEvent
# - digest_format='html'
# - source_title='MIT Technology Review'
@hook(NewFeedEvent, digest_format='html', source_title='MIT Technology Review')
def send_mit_rss_feed_digest(event: NewFeedEvent, **_):
    # The digest output file is stored in event.args['digest_filename']
    with open(event.args['digest_filename'], 'r') as f:
        run(action='mail.smtp.send',
            from_='you@yourdomain.com',
            to=get_addresses(),
            subject=f'{event.args.get("source_title")} feed digest',
            body=f.read(),
            body_type='html')

# Or, if you opted for the native GMail plugin, use this variant instead:

@hook(NewFeedEvent, digest_format='html', source_title='MIT Technology Review')
def send_mit_rss_feed_digest_gmail(event: NewFeedEvent, **_):
    # The digest output file is stored in event.args['digest_filename']
    with open(event.args['digest_filename'], 'r') as f:
        run(action='google.mail.compose',
            sender='you@gmail.com',
            to=get_addresses(),
            subject=f'{event.args.get("source_title")} feed digest',
            body=f.read())

# If instead you want to send the digest in PDF format as an attachment
# (keep only the action matching the mail plugin you configured):

@hook(NewFeedEvent, digest_format='pdf', source_title='MIT Technology Review')
def send_mit_rss_feed_digest_pdf(event: NewFeedEvent, **_):
    # mail.smtp plugin case
    run(action='mail.smtp.send',
        from_='you@yourdomain.com',
        to=get_addresses(),
        subject=f'{event.args.get("source_title")} feed digest',
        body='',
        attachments=[event.args['digest_filename']])

    # google.mail case
    run(action='google.mail.compose',
        sender='you@gmail.com',
        to=get_addresses(),
        subject=f'{event.args.get("source_title")} feed digest',
        body='',
        files=[event.args['digest_filename']])
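
At this point, your Platypush configuration directory should look like this:

~/.config/platypush/
├── config.yaml
└── scripts/
    ├── __init__.py
    └── rss_news.py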

Finally, create your ~/.mail.list file with one destination email address per line and start Platypush, either from the command line or as a service. You should receive the email with the first batch of articles shortly after startup, and new digests will follow whenever a new batch is available after the configured poll_seconds interval.
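
For reference, the addresses file is just a plain-text list with one address per line; lines starting with # are skipped by the get_addresses helper above (the addresses here are made up):

# ~/.mail.list
alice@example.com
bob@example.com

Then start the service from the command line:

platypush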
