rustand.tech

Static blog generator using Pandoc

2023-02-19 (edited: 2023-03-04)

In this post I will show you how I build my blog using Pandoc as a static site generator. The blog framework is still a work in progress, but I really wanted to write about it.

About

I wanted to make a blog to share some of my projects, but wasn’t really impressed with any of the blog frameworks I found. Most of them were either very bloated or had other drawbacks.

During my searches I stumbled upon the concept of static site generation, which really clicked with me. I have used Pandoc to turn Markdown and templates into HTML before, so this seemed like the natural way to do it.

I wanted a bit more functionality than plain standalone article pages: a hashtag system, post suggestions at the bottom of every post, a navigation bar, and most important of all, the feed on the front page, all while still serving only static content.

Implementation

The blog framework is based mostly on Pandoc, GNU Make and jq.

Directory overview

The directory structure looks like this:

.
├── src/
│   ├── assets/
│   ├── posts/
├── template/
│   ├── index.html
│   ├── navbar.html
│   ├── page.html
│   ├── post.html
│   ├── preview.html
│   ├── style.css
│   └── tags.html
├── utils/
│   ├── helpers/
│   │   ├── get-creation-date*
│   │   ├── get-modification-date*
│   │   └── update-file*
│   ├── create-metadata*
│   ├── create-tag-searches*
│   ├── create-tags-list*
│   ├── list-all-tags*
│   └── list-latest-posts*
└── Makefile

The src/ dir contains all the site content written in markdown, such as blog posts and other pages, along with images and page metadata.

The template/ dir contains all the HTML templating used to render the markdown documents.

The utils/ dir contains the shell scripts that produce all “dynamic” content, that is, content generated from metadata and other site details. I say “dynamic” because it is not generated on the fly when browsing the site, but pre-generated at build time.

Makefile

First we define some variables:

# Directories
export SOURCE_DIR := src
export OUT_DIR := build
export BUILD_DIR := generated
export TEMPLATE_DIR := template
export UTIL_DIR := utils
export ROOT_DIR :=

# Templates
export POST_TEMPLATE := $(TEMPLATE_DIR)/post.html
export PAGE_TEMPLATE := $(TEMPLATE_DIR)/page.html

# CSS
export STYLE = style.css

SOURCES := $(wildcard $(SOURCE_DIR)/*.md $(SOURCE_DIR)/**/*.md)
OBJECTS := $(patsubst $(SOURCE_DIR)/%.md, $(OUT_DIR)/%.html, $(SOURCES))
POST_METADATA := $(wildcard $(SOURCE_DIR)/posts/*.json)
SUGGESTIONS := $(patsubst $(SOURCE_DIR)/posts/%, $(BUILD_DIR)/suggestions/%, $(POST_METADATA))
TAG_SEARCH_PAGES := $(patsubst %,$(OUT_DIR)/tags/%.html,$(shell $(UTIL_DIR)/list-all-tags))
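
One caveat: GNU Make's wildcard function does not treat ** as a recursive glob. It behaves like a single *, so the SOURCES pattern matches Markdown files directly in src/ and one directory level below it, which is all this layout needs. The path mapping that patsubst performs for OBJECTS can be mirrored in shell to check where a given source file ends up. This is only a sketch; the function name output_path is made up for illustration and is not one of the scripts in utils/.

```shell
#!/usr/bin/env bash
# Sketch: mirror $(patsubst $(SOURCE_DIR)/%.md, $(OUT_DIR)/%.html, ...) in shell.
# output_path is a hypothetical helper, not part of the author's utils/.
output_path() {
    # $1: source file, $2: source dir, $3: output dir
    local rel
    rel=${1#"$2"/}                            # strip the leading "src/"
    printf '%s/%s.html\n' "$3" "${rel%.md}"   # swap the .md suffix for .html
}

output_path src/posts/my-first-blog-post.md src build
output_path src/about.md src build
```

The first call prints build/posts/my-first-blog-post.html and the second build/about.html, which are the targets Make derives for those sources.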

Then we define our top level build targets:

# Default target
all: $(SUGGESTIONS) $(OBJECTS) $(OUT_DIR)/index.html assets $(OUT_DIR)/$(STYLE) $(TAG_SEARCH_PAGES)

# If debug build use file urls
debug: ROOT_DIR := file://$(shell pwd)/$(OUT_DIR)
debug: all

clean:
    rm -rf $(OUT_DIR)/*
    rm -rf $(BUILD_DIR)/*

Set our common compilation flags for all targets:

export COMPILE = pandoc -s --css $(ROOT_DIR)/$(STYLE) --variable rootdir=$(ROOT_DIR) --variable "date=$$($(UTIL_DIR)/helpers/get-creation-date $<)" --variable "changedate=$$($(UTIL_DIR)/helpers/get-modification-date $<)" --metadata lang=en --filter pandoc-include-code

This I don’t want to talk about..

# Hacky shit for creating directory tree
# The $$(@D)/. prerequisites used below rely on secondary expansion
.SECONDEXPANSION:

.PRECIOUS: $(OUT_DIR)/. $(OUT_DIR)%/.

$(OUT_DIR)/.:
    mkdir -p $@

$(OUT_DIR)%/.:
    mkdir -p $@

$(BUILD_DIR)/.:
    mkdir -p $@

$(BUILD_DIR)%/.:
    mkdir -p $@

Generate a list of all tags and collect metadata from all blog posts in a single JSON structure. We will use this later to build the suggestion lists, the tag searches and the main feed.

$(BUILD_DIR)/tags.json: $(POST_METADATA)
    $(UTIL_DIR)/create-tags-list

$(BUILD_DIR)/metadata.json: $(POST_METADATA)
    $(UTIL_DIR)/create-metadata

Generate search results for tags. Each of these pages lists all blog posts tagged with a certain tag, and one is generated for every tag used on the blog.

# Tag searches
$(OUT_DIR)/tags/%.html: $(BUILD_DIR)/metadata.json $(BUILD_DIR)/tags/. $(TEMPLATE_DIR)/navbar.html | $$(@D)/.
    $(UTIL_DIR)/create-tag-searches $(patsubst $(OUT_DIR)/tags/%.html,%,$@)

Copy assets and the CSS stylesheet to the build directory. Running rsync with the --delete flag makes sure that assets deleted from the source are also deleted from the build directory, while only changed files are copied.

# Assets
assets:
    rsync -au --delete $(SOURCE_DIR)/assets/ $(OUT_DIR)/assets

# Style
$(OUT_DIR)/$(STYLE): $(TEMPLATE_DIR)/style.css
    cp $(TEMPLATE_DIR)/style.css $(OUT_DIR)/$(STYLE)

Site pages:

# Pages
$(OUT_DIR)/%.html: $(SOURCE_DIR)/%.md $(PAGE_TEMPLATE) $(TEMPLATE_DIR)/navbar.html | $$(@D)/.
    $(COMPILE) --template $(PAGE_TEMPLATE) $< -o $@ --metadata-file $(patsubst %.md,%.json,$<)

Suggestions and blog posts. The suggestions are generated per blog post and show a list of the three latest blog posts, excluding the one currently being viewed.

# Suggestions
$(BUILD_DIR)/suggestions/%.json: $(SOURCE_DIR)/posts/%.json $(BUILD_DIR)/metadata.json | $$(@D)/.
    jq "{posts: [.posts[] | select(.url != \"$(patsubst $(SOURCE_DIR)/%.json,$(ROOT_DIR)/%.html,$<)\")][:3]}" $(BUILD_DIR)/metadata.json > $@

# Blog posts
$(OUT_DIR)/posts/%.html: $(SOURCE_DIR)/posts/%.md $(BUILD_DIR)/suggestions/%.json $(POST_TEMPLATE) $(BUILD_DIR)/metadata.json $(TEMPLATE_DIR)/navbar.html | $$(@D)/.
    $(COMPILE) $< --toc --template $(POST_TEMPLATE) -o $@ --metadata-file $(patsubst %.md,%.json,$<) --metadata-file $(patsubst $(SOURCE_DIR)/posts/%.md,$(BUILD_DIR)/suggestions/%.json,$<)
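
To see what the suggestions filter does in isolation, here is a minimal sketch that runs the same jq expression outside of Make. It assumes metadata.json has the {posts: [...]} shape with a url field per post, as produced by create-metadata. The function name suggestions_for is hypothetical, and the current URL is passed with jq's --arg option instead of being spliced into the filter string.

```shell
#!/usr/bin/env bash
# Sketch of the suggestions filter; suggestions_for is a made-up name.
suggestions_for() {
    # $1: URL of the post being rendered; metadata JSON is read from stdin.
    # Drop the current post, then keep the first three of the remaining ones.
    jq --arg current "$1" '{posts: [.posts[] | select(.url != $current)][:3]}'
}

# Usage (paths are examples):
#   suggestions_for /posts/my-first-blog-post.html < generated/metadata.json
```

Because metadata.json lists posts newest first, taking the first three entries yields the three latest posts other than the current one.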

The index contains a feed of all blog posts. Currently I do not have a lot of posts so this is OK for now, but eventually I will have to limit how many posts are shown on the front page.

# Index
$(OUT_DIR)/index.html: $(TEMPLATE_DIR)/index.html $(SOURCE_DIR)/index.json $(TEMPLATE_DIR)/navbar.html $(BUILD_DIR)/metadata.json
    $(COMPILE) --metadata-file $(SOURCE_DIR)/index.json /dev/null -f markdown --template $(TEMPLATE_DIR)/index.html -o $@ --metadata-file $(BUILD_DIR)/metadata.json

Scripts

A simple script to avoid updating the modification time of a file if the content is unchanged. This is helpful to avoid rebuilding things that depend on the unchanged file.
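
The script itself is not reproduced here. A minimal sketch of the idea, assuming the convention the other scripts follow (they write fresh output to <file>.new and then call update-file <file>), might look like this. It is written as a function named update_file for illustration and is not the author's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of utils/helpers/update-file, written as a function.
# Assumption: the caller has already written fresh content to "$1.new".
update_file() {
    local target candidate
    target=$1
    candidate="$target.new"

    if [ -f "$target" ] && cmp -s "$candidate" "$target"; then
        # Same content: drop the candidate so the target's mtime is untouched.
        rm -f "$candidate"
    else
        # New or changed content: replace the target.
        mv -f "$candidate" "$target"
    fi
}
```

Since Make decides what to rebuild by comparing modification times, leaving an unchanged file alone means nothing that depends on it is rebuilt.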

Scripts for finding the creation date and modification date of a blog post, based on when it was first and last committed in Git.

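#!/usr/bin/bash

# Creation date: author date of the oldest commit touching the file
git log --follow --format="%ai" --reverse "$1" | head -n1 | cut -d' ' -f 1

#!/usr/bin/bash

# Modification date: only printed if the file has more than one commit
if [[ "$(git log --follow --oneline "$1" | wc -l)" -gt 1 ]]; then
    git log --follow --format="%ai" "$1" | head -n1 | cut -d' ' -f 1
fi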

Collect metadata from all blog posts into a single JSON structure.

#!/bin/bash

# Start from an empty scratch file, even if a previous run was interrupted
: > $BUILD_DIR/metadata.json.new2
for post in $($UTIL_DIR/list-latest-posts -0); do
    date=$($UTIL_DIR/helpers/get-creation-date $post)
    changedate=$($UTIL_DIR/helpers/get-modification-date $post)
    # Merge the post's JSON metadata with its dates and its URL
    jq -n "reduce inputs as \$f (.; . += (\$f + {date: \"$date\", changedate: \"$changedate\", url: ( \"$ROOT_DIR/\" + (input_filename|rtrimstr(\".json\")|ltrimstr(\"$SOURCE_DIR/\")) + \".html\")}))" $(echo $post | sed -E -e 's/\.md$/.json/') >> $BUILD_DIR/metadata.json.new2
done

# Wrap the collected objects in a single {posts: [...]} structure
jq -s '{posts: .}' $BUILD_DIR/metadata.json.new2 > $BUILD_DIR/metadata.json.new
rm $BUILD_DIR/metadata.json.new2

# Only replace metadata.json if the content actually changed
$UTIL_DIR/helpers/update-file $BUILD_DIR/metadata.json
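
For reference, the resulting file has roughly the shape sketched below. The titles, dates and URLs are made up for illustration, and the helper names sample_metadata and list_feed are hypothetical. The second function shows how the structure can be queried with jq, for example to print the feed order.

```shell
#!/usr/bin/env bash
# Illustrative only: the post titles, dates and URLs below are invented.
sample_metadata() {
    cat <<'EOF'
{
  "posts": [
    {"title": "Second post", "keywords": ["blog"],
     "date": "2023-02-19", "changedate": "2023-03-04",
     "url": "/posts/second-post.html"},
    {"title": "First post", "keywords": ["blog", "first"],
     "date": "2023-02-17", "changedate": "",
     "url": "/posts/first-post.html"}
  ]
}
EOF
}

# Print "date<TAB>url" for every post, in the order they appear in the file.
list_feed() {
    jq -r '.posts[] | "\(.date)\t\(.url)"'
}

# Usage: sample_metadata | list_feed
```

A post that has only been committed once gets an empty changedate, because get-modification-date prints nothing in that case.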

Create tag search result pages.

#!/bin/bash

tag=$1

# Exact keyword match; each post is listed at most once
jq --arg tag "$tag" '{posts: [.posts[] | select(any(.keywords[]; . == $tag))]}' $BUILD_DIR/metadata.json > $BUILD_DIR/tags/$tag.json.new

$UTIL_DIR/helpers/update-file $BUILD_DIR/tags/$tag.json

pandoc --css $ROOT_DIR/$STYLE -s --variable rootdir=$ROOT_DIR --metadata "title=Articles tagged with $tag" --template $TEMPLATE_DIR/tags.html -f html -t html /dev/null -o $OUT_DIR/tags/$tag.html.new --metadata-file $BUILD_DIR/tags/$tag.json

$UTIL_DIR/helpers/update-file $OUT_DIR/tags/$tag.html
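
The tag filter can also be tried on its own. The sketch below is a variant that compares whole keywords, so a search for web does not return a post tagged only webdev, and a post is listed once even if several of its keywords would match. The function name posts_with_tag is made up, and the input is assumed to have the same {posts: [...]} shape as metadata.json.

```shell
#!/usr/bin/env bash
# Sketch of an exact-match tag filter; posts_with_tag is a hypothetical name.
posts_with_tag() {
    # $1: tag to search for; metadata JSON is read from stdin.
    jq --arg tag "$1" '{posts: [.posts[] | select(any(.keywords[]; . == $tag))]}'
}

# Usage (path is an example):
#   posts_with_tag pandoc < generated/metadata.json
```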

Create list of all tags used. Currently unused.

#!/usr/bin/bash

jq -s "{tags: [.[].keywords] | add | unique}" $SOURCE_DIR/posts/*.json > $BUILD_DIR/tags.json
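
The Makefile also calls a list-all-tags helper to decide which tag pages to build, but that script is not shown in this post. Since Make splits its output on whitespace, it presumably prints one tag name per word. A minimal sketch under that assumption follows; the function name and the exact jq filter are guesses, not the author's code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of utils/list-all-tags, written as a function.
list_all_tags() {
    # $@: post metadata files, e.g. src/posts/*.json
    # Print every keyword once, one per line, sorted.
    jq -r '.keywords[]' "$@" | sort -u
}

# Usage (path is an example):
#   list_all_tags src/posts/*.json
```

Tags containing spaces would be split into several words by Make, so this approach only works with single-word tags.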

Usage

Writing blog posts

Every blog post consists of a markdown document and a JSON metadata file.

.
└── src/
    └── posts/
        ├── my-first-blog-post.md
        └── my-first-blog-post.json

The metadata file for this blog post might look something like this:

{
  "title": "My first blog post",
  "keywords": [
    "blog",
    "first"
  ],
  "abstract": "This is the abstract of the article."
}
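
Creating the two files by hand gets repetitive, so a small scaffolding helper can be handy. The sketch below is not part of the framework described above. The name new_post and its arguments are invented for illustration, and it assumes the title contains no double quotes or backslashes, since it is written into the JSON as-is.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create an empty post and a metadata skeleton.
# Assumption: the title contains no double quotes or backslashes.
new_post() {
    local slug title dir
    slug=$1
    title=$2
    dir=${3:-src/posts}

    # Never clobber an existing post.
    if [ -e "$dir/$slug.md" ] || [ -e "$dir/$slug.json" ]; then
        echo "refusing to overwrite $dir/$slug.*" >&2
        return 1
    fi

    mkdir -p "$dir"
    : > "$dir/$slug.md"
    cat > "$dir/$slug.json" <<EOF
{
  "title": "$title",
  "keywords": [],
  "abstract": ""
}
EOF
}

# Usage:
#   new_post my-first-blog-post "My first blog post"
```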

Writing pages

Like blog posts, pages consist of a markdown document and a JSON file, but these are located in the root of the src/ directory instead of inside posts/. Pages use a different template than blog posts, the main difference being that pages do not contain a table of contents, a list of hashtags or links to other blog posts.

.
└── src/
    ├── about.md
    └── about.json

Building the site

Building for production is just a simple

make -j

This sets the document root of the site to the $ROOT_DIR specified in the Makefile. In our case it is empty, so every link starts with /. The value is used as a prefix for all links on the site.

When developing locally you are most likely not running a server, but will access the site through file://... instead. Build the debug target in order to set the document root accordingly so your links work.

make -j debug
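
To make the effect of the prefix concrete, here is a small sketch of how a page URL is put together from ROOT_DIR and a path relative to src/. It follows the same "$ROOT_DIR/" + path + ".html" pattern that create-metadata uses. The function name page_url is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: how the ROOT_DIR prefix turns a source path into a link.
# page_url is a hypothetical helper, not one of the scripts in utils/.
page_url() {
    # $1: ROOT_DIR (may be empty), $2: path relative to src/, e.g. posts/foo.md
    printf '%s/%s.html\n' "$1" "${2%.md}"
}

page_url "" posts/foo.md                     # production: /posts/foo.html
page_url "file://$PWD/build" posts/foo.md    # debug: file://<current dir>/build/posts/foo.html
```

With the empty production prefix the links are root-relative and need a web server, while the debug prefix produces absolute file:// links that work when the pages are opened straight from disk.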


Tags

#blogging #pandoc #makefile #web #linux #opensource
