Writing a blog with Pelican - my setup and workflow

What is Pelican?

Pelican [1] is a static site generator. This means that the resulting website does not require server-side executables such as a PHP interpreter. Instead, a set of HTML pages is generated.

  • The content files are written in a lightweight markup language such as Markdown or reStructuredText; I mainly use reStructuredText [2]. A minimal article sketch follows this list.
  • A theme consisting of page templates and CSS files is used to format the site. The templates also add the index, menu and navigation items.
  • A range of plugins also exists, e.g. to automatically resize included images.
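
As an illustration, a new article is just a text file with a short metadata block on top, e.g. a hypothetical content/my-project/first-article.rst (the file name, date and field values are only examples):

  My first article
  ################

  :date: 2015-03-14
  :tags: pelican, workflow
  :summary: A short teaser shown on the index page.

  The article body is plain reStructuredText.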

Why a static site?

The most significant advantage of a static site is security and reliability. The PHP engine and PHP-based sites are known for numerous security holes which require timely patching. With static content only the webserver itself is exposed, and the webserver is usually kept up to date by the hosting provider.

Another point is complexity. With a static site generator all dependencies can be kept locally in one directory. There is no need for a database or for configuration stored on the webserver. The static site runs locally and offline just as it does online.

Workflow

The problem

The default workflow for Pelican is to have a single site online and to use a local webserver for testing. Directories can be configured for different purposes, e.g. directories containing files to upload or directories containing articles.

For my purposes this setup is impractical for the following reasons.

  • I want to organize my content (articles, files for download, and other files such as image source files) in one subdirectory per project.

    • This means that there will be files and directories which Pelican should ignore.
  • I want to start writing draft versions of articles but not publish them.

    • Pelican offers a "draft" status, but this simply hides the article from the index. One cannot see the summary of an article that is not listed in the index.

The solution

  • I keep the source files in a separate directory. In this directory all files can be organized in subdirectories as needed.

  • For each target I create a list of files named filelist.BLOGNAME.txt.

    • The target "everything" includes the whole content directory.

      • It is only served locally and can be used to quickly generate new content without caring for the filelist.
    • The target "staging" includes the articles and files which should be published online.

      • This target is uploaded to a private directory on the webserver. Here, I can test if everything works and all dependencies are included and correctly linked.
    • The target "publish" includes the files actually needed for the public site.

  • A script crawls the source for filelist files and creates the corresponding links in the target directory (a sketch follows this list).

  • Another script compiles each blog and starts a webserver locally for each.

  • Another script is used to compile and upload "staging" and "publish".
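
A simplified sketch of the linking step, assuming one source tree with per-project filelist files and one content directory per target (all names and the layout are examples):

  #!/bin/bash
  # link_filelist.sh TARGET - link all listed files into blog-TARGET/content.
  # A filelist.staging.txt simply contains one path per line, e.g.
  #   article.rst
  #   images/diagram.png
  TARGET=$1
  SRC=source
  DEST=blog-$TARGET/content

  find "$SRC" -name "filelist.$TARGET.txt" | while read -r list; do
      dir=$(dirname "$list")                          # project directory
      while read -r file; do
          [ -z "$file" ] && continue                  # skip empty lines
          mkdir -p "$DEST/$(dirname "$file")"
          ln -sf "$PWD/$dir/$file" "$DEST/$file"      # symlink into the content dir
      done < "$list"
  done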

When creating new content and serving it locally via "everything", this workflow is virtually identical to having the content directly inside the Pelican directory, but all files belonging to an entry can be kept in the same subdirectory. In order to publish, the filenames must be listed, which is extra work; this step is an additional safeguard against publishing the wrong files. A quick test with the "staging" target reveals whether all files are correctly included, and then the filelist can be copy-pasted to the "publish" target and uploaded to the main site.

When regenerating the blog for local serving, my command order is make clean, make html, make regenerate &, make serve. Here, make regenerate is very useful to quickly update the site locally each time a source file is changed. But when a file contains an error, make regenerate crashes, which means the command must be restarted each time - either manually in the console or, better, automatically by a script. I therefore wrote a script which executes these commands for all targets, starts local servers on different ports and restarts make regenerate automatically. Another script handles uploading.
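
The restart logic itself is simple; a minimal sketch (the sleep only avoids a busy loop while the broken file is being fixed):

  while true; do
      make regenerate          # exits when a source file contains an error
      echo "make regenerate stopped - restarting"
      sleep 2
  done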

For file linking, regenerating the blogs automatically and starting the webservers I simply execute the script

./serve_locally.sh

This handles all three targets, and in the case of "everything" new files are automatically picked up even after the script has been started.
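
The rough structure of such a script could look like the following sketch (directory names and ports are examples, and Python 3 is assumed for the local webserver):

  #!/bin/bash
  port=8000
  for target in everything staging publish; do
      (
          ./link_filelist.sh "$target"   # linking step, see the sketch above
          cd "blog-$target" || exit 1
          make clean && make html
          ( cd output && python -m http.server "$port" ) &    # local webserver
          while true; do make regenerate; sleep 2; done       # auto-restart on errors
      ) &
      port=$((port + 1))
  done
  wait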

When everything is OK, I start

./publish_online.sh

which regenerates the blogs and uploads them using rsync over an SSH connection.
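
A condensed sketch of such an upload script (the host name and remote paths are placeholders):

  #!/bin/bash
  set -e
  for target in staging publish; do
      ( cd "blog-$target" && make clean && make html )
  done
  rsync -avz --delete -e ssh blog-staging/output/ user@example.org:www/staging/
  rsync -avz --delete -e ssh blog-publish/output/ user@example.org:www/blog/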

This makes writing new content very efficient, and the management tasks take only a minimal amount of time. The scripts are tested on OS X and Linux.

git

Since the blog articles are all text files, a source code management tool such as git is pretty well suited.

In my setup the master repository includes the scripts, my theme and plugins, and two submodules:

  1. the pelican-plugins from GitHub,
  2. the content including the configuration files.

I have a bare upstream repository on a shared folder for data exchange between different machines.
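
A rough sketch of this repository setup (paths and the commit message are examples):

  git init blog && cd blog
  # the scripts, my theme and my own plugins live directly in this repository
  git submodule add https://github.com/getpelican/pelican-plugins.git pelican-plugins
  git submodule add /path/to/content.git content
  git add . && git commit -m "initial blog setup"

  # bare upstream repository on a shared folder, used to sync between machines
  git init --bare /shared/blog.git
  git remote add origin /shared/blog.git
  git push -u origin master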

Dependencies

One thing that I found out while setting up the blog is that it is established practice to include external dependencies in a website, such as fonts or JavaScript files loaded from external sites such as Google. In my mind this is a bad idea, because then many websites only work properly while those servers are online, and this might change at any time. For me these dependencies already break when I am on the train. Therefore I include all .css and .js files locally.
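
In practice this just means downloading the asset once into the theme and pointing the templates at the local copy instead of the external URL; a trivial example (URLs and paths are placeholders):

  mkdir -p theme/static/css theme/static/js
  curl -L -o theme/static/css/webfont.css https://example.org/webfont.css
  curl -L -o theme/static/js/library.js   https://example.org/library.js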

One exception to this are the blog comments. Since they

  • are independent of the actual static content,
  • really are dynamic content, and
  • do not break the site when they are unavailable,

it makes sense to use an external site to host the comments.

[1] http://getpelican.com
[2] http://docutils.sourceforge.net/rst.html