In the beginning - traditional hosting

Since uni I have had a Wix-based site (I know…) which hosted old coursework and was eventually supposed to host blog content. It was easy to set up and easy to customize the main content, but rather awkward to blog on, and it cost quite a lot for what, in my use case, essentially amounted to static content. I also had a Google Workspace mail account, which again was really easy to set up but was overkill for my purposes.

Building it better

Every so often I would have a surge of inspiration to finally get rid of my old website and use the latest and greatest to build a beautiful static website. I’d then spend a few days writing something in Gatsby, Next.js, hell - even pure HTML. Ultimately I’d find a way to overcomplicate it - which these libraries lend themselves to - usually in an effort to make future blogging easier (ironically), then either get stuck, get worn out, or become convinced that X library would make this whole thing way easier. Usually more than one of these. Eventually, my library/framework hopping landed me here - SSG5/SSG6.

SSG6

SSG6 (Static Site Generator 6) is a POSIX compliant shell script written by Roman Zolotarev, which I modified to suit my own needs. The documentation for the original can be found here.

On top of the excellent work by Roman, I’ve added:

pandoc support - adding in-document control of title, description, date & tags for each post.
pagination - which is controlled by max_page_posts in the script.
home page generation - to show my most recent posts first
comments - so that if you are unfamiliar with shell, you can still work out wtf is actually going on.
solarized-dark css

In doing so, I’ve broken the POSIX compatibility. However, Bash 4 is relatively widely used, and I couldn’t think of another way to achieve what I wanted without adding another app anyway.
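
To illustrate why Bash 4 is needed: the home page generation keys each post summary by its date in an associative array (`declare -A`), which POSIX sh does not have, then sorts the keys and splits them into pages. The following is a minimal sketch of that idea only, not the script itself. The function name `paginate`, the sample slugs and titles, and the page size of 2 are all assumptions made for illustration.

```shell
#!/bin/bash
# Minimal sketch: sort posts newest-first and split them into pages.
# Requires Bash 4+ for associative arrays and readarray.

# paginate MAX_PER_PAGE: prints "page N: title" for each sample post.
# Hypothetical helper, not part of ssg5/ssg6.
paginate() {
    local max_per_page="$1"
    # Keys are "date.counter" slugs, values are (made-up) post titles.
    local -A post_by_slug=(
        ["2021-03-01.0"]="Third post"
        ["2021-01-15.1"]="First post"
        ["2021-02-10.2"]="Second post"
    )
    local -a sorted
    # Sort the keys in reverse so that the newest date comes first.
    readarray -t sorted < <(printf '%s\n' "${!post_by_slug[@]}" | sort -r)

    local post_count=0 page_count=0 slug
    for slug in "${sorted[@]}"; do
        printf 'page %d: %s\n' "$page_count" "${post_by_slug[$slug]}"
        post_count=$((post_count + 1))
        # Start a new page once the current one is full.
        if [ "$post_count" -eq "$max_per_page" ]; then
            post_count=0
            page_count=$((page_count + 1))
        fi
    done
}

paginate 2
```

With a page size of 2, the two newest posts land on page 0 and the oldest on page 1. In the real script the page size comes from max_page_posts.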

GitHub Pages

There were a number of places I could’ve chosen for hosting, but again I wanted something simple that I couldn’t rabbit-hole myself with. In terms of ease of use, I don’t think anything compares to GitHub Pages.

free - the standard .github.io site costs nothing, which is great for prototyping
easy to set up a custom domain - once you have your A/AAAA records in place, you only need to click add custom domain, then verify once ready.
free certificate - if you use a custom domain you don’t need to buy a cert separately, GitHub Pages handles it all for you.
prebuilt GitHub Actions for deployment - your site will automatically be updated by the pages action which runs whenever you push to .github.io/master.

Zoho Mail

Zoho Mail makes it incredibly easy to set up custom domains, and they have great documentation and frankly great prices. My new mail service costs me £0.80 per month (£9.60 per year), whereas Google Workspace previously cost me £80 per year - in other words, Zoho is 88% cheaper!
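
For anyone who wants to check the comparison: £0.80 per month is £9.60 per year, so the saving against £80 per year is (80 - 9.60) / 80 = 88%. A small sketch of that arithmetic, using only the two prices quoted above (the helper name `saving_percent` is made up for this example):

```shell
#!/bin/sh
# saving_percent MONTHLY_NEW YEARLY_OLD
# Prints the percentage saved per year, rounded to a whole number,
# when moving from a plan billed yearly to one billed monthly.
# Hypothetical helper for illustration only.
saving_percent() {
    awk -v monthly="$1" -v yearly_old="$2" 'BEGIN {
        yearly_new = monthly * 12
        # adding 0.5 before truncating rounds to the nearest integer
        printf "%d\n", (yearly_old - yearly_new) / yearly_old * 100 + 0.5
    }'
}

saving_percent 0.80 80   # prints 88
```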

They have a really nice web app, their documentation is detailed and easy to follow, and they actually have support. If you want to check them out - here is a link to their store.

Implementation

In this section you might expect me to explain the changes that I made to ssg5/6. However, it has now been several months since I made them, so the best I can suggest is that you compare my version to the source. Hopefully, between the comments I made on my version and git-diff, you can piece together what is happening.

In case it isn’t obvious from the action - I have two repositories: one which stores the content and HTML rendering script, and one which just acts as a staging ground for my GitHub page. Most likely, if you are recreating what I have here, you should just use one repo for both. I originally intended to push my site to Vercel/Zeit (having totally forgotten about GitHub Pages), which is why it is structured this way. Essentially I just added a janky patch to push to my pages repo instead of elsewhere.

SSG6 Extended

#!/bin/bash -e
#
# https://rgz.ee/bin/ssg5
# Copyright 2018-2019 Roman Zolotarev <hi@romanzolotarev.com>
#
# Permission to use, copy, modify, and/or distribute this software for any
# purpose with or without fee is hereby granted, provided that the above
# copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
# ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
# OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#
# modified to add pagination & article list for use on offby0x01.github.io

main() {
    # check if args are set
    test -n "$1" || usage
    test -n "$2" || usage
    test -n "$3" || usage
    test -n "$4" || usage
    # check if src & dst are directories
    test -d "$1" || no_dir "$1"
    test -d "$2" || no_dir "$2"

    # set key variables
    src=$(readlink_f "$1")
    dst=$(readlink_f "$2")
    title="$3"
    base_url="$4"
    max_page_posts="6"

    # create ignore regex - similar to git ignore
    IGNORE=$(
        if ! test -f "$src/.ssgignore"; then
            printf ' ! -path "*/.*"'
            return
        fi
        while read -r x; do
            test -n "$x" || continue
            printf ' ! -path "*/%s*"' "$x"
        done <"$src/.ssgignore"
    )

    # populate HEADER & FOOTER (if assoc files exist)
    h_file="$src/_header.html"
    f_file="$src/_footer.html"
    test -f "$f_file" && FOOTER=$(cat "$f_file") && export FOOTER
    test -f "$h_file" && HEADER=$(cat "$h_file") && export HEADER
    # for each dir in src, create a corresponding (empty) dir in dst
    list_dirs "$src" | (cd "$src" && cpio -pdu "$dst")

    # since I delete dst every time, I don't need the original update functionality
    # if you do need it, I recommend using the original https://www.romanzolotarev.com/bin/ssg6
    fs=$(list_files "$1")

    # if there are any files in src
    if test -n "$fs"; then
        # update dst/.files list
        echo "$fs" | tee "$dst/.files"

        # if there are any markdown files in src
        if echo "$fs" | grep -q '\.md$'; then
            # look for pandoc
            if test -x "$(which pandoc 2> /dev/null)"; then
                # generate home page (recent posts)
                echo "$fs" | grep '\.md$' | generate_home_page "$src"
                # render other pages
                echo "$fs" | grep '\.md$' |
                    render_md_files_pandoc "$src" "$dst" "$title" "$base_url"
            # or lowdown
            elif test -x "$(which lowdown 2>/dev/null)"; then
                echo "$fs" | grep '\.md$' |
                    render_md_files_lowdown "$src" "$dst" "$title"
            # or markdown.pl
            elif test -x "$(which Markdown.pl 2>/dev/null)"; then
                echo "$fs" | grep '\.md$' |
                    render_md_files_Markdown_pl "$src" "$dst" "$title"
            else
                echo "couldn't find pandoc, lowdown, or Markdown.pl"
                exit 3
            fi      
        fi

        # refresh fs list - we may have generated a few new pages
        fs=$(list_files "$1")

        # look in src for any html files, pass them to render fn
        echo "$fs" | grep '\.html$' |
            render_html_files "$src" "$dst" "$title"

        # copy over all other file types i.e. images
        echo "$fs" | grep --extended-regexp --invert-match '\.md$|\.html$' |
            (cd "$src" && cpio -pu "$dst")

        # create sitemap
        date=$(date +%Y-%m-%d)
        urls=$(list_pages "$src")

        # if there are any urls to add, add them to sitemap
        test -n "$urls" &&
            render_sitemap "$urls" "$base_url" "$date" >"$dst/sitemap.xml"

    fi
}

usage() {
    # help message - missing arg or invalid command
    echo "usage: ${0##*/} src dst title base_url" >&2
    exit 1
}

no_dir() {
    # help message - dir does not exist
    echo "${0##*/}: $1: No such directory" >&2
    exit 2
}

readlink_f() {
    # get absolute path to file
    file="$1"
    cd "$(dirname "$file")"
    file=$(basename "$file")
    while test -L "$file"; do
        file=$(readlink "$file")
        cd "$(dirname "$file")"
        file=$(basename "$file")
    done
    dir=$(pwd -P)
    echo "$dir/$file"
}

list_dirs() {
    cd "$1" && eval "find . -type d ! -name '.' ! -path '*/_*' $IGNORE"
}

list_files() {
    # list all files in all src directories
    cd "$1" && eval "find . -type f ! -name '.' ! -path '*/_*' $IGNORE"
}

generate_home_page() {
    date_counter=0
    declare -A date_post; declare -a dates;
    while read -r doc; do
        # paths read from stdin are relative to src (e.g. ./about.md)
        if [ "$doc" = "./index.md" ] || [ "$doc" = "./about.md" ] || [ "$doc" = "./projects.md" ]; then
            continue
        fi

        # borrowed from https://github.com/fmash16/ssg5
        TITLE=$(grep -i title "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')
        DATE=$(grep -i date "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')
        DESC=$(grep -i description "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        TAGS=$(grep -i tags "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')
        IMAGE=$(grep -i image "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')

        # rough slug system in case of duplicate dates
        slug="$DATE.$date_counter"
        date_counter=$(($date_counter+1))

        # requires bash 4
        dates+=("$slug")
        date_post["$slug"]=$(printf "\
<h1 id=\"$TITLE\" style=\"font-size:1.3em;border-bottom: none; padding-bottom: 0em;\">\n\
  <a href=\"/${doc%\.md}.html\">$TITLE</a>\n\
</h1>\n\
<p style=\"text-align:justify;\">$DESC<br/>\n\
<strong><u>tags</u>: </strong>[$TAGS ]</p>")
    done

    post_count=0
    page_count=0
    # sort dates (newest to oldest)
    readarray -t sorted_dates < <(printf '%s\n' "${dates[@]}" | sort -r)
    for d in "${sorted_dates[@]}"; do

        echo "current article date: $d"

        # write article summary to page
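        # print the stored summary verbatim, so a % in a title or description
        # is not treated as a printf format specifier
        printf '%s\n\n' "${date_post[$d]}" >> "$1/index$page_count.html"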
        
        post_count=$(($post_count+1))

        # if we reach max num of posts for this page
        if [ "$post_count" -eq "$max_page_posts" ]; then
            # create a link to next page
            printf "<center><a href=\"index$(($page_count+1)).html\">Next>></a></center>\n" >> "$1/index$page_count.html"
            # update page count and reset post count
            post_count="0"
            page_count=$(($page_count+1))
        fi
    done

    # rename index0.html to index.html
    if [ -f "$src/index0.html" ]; then
        mv "$src/index0.html" "$src/index.html"
    fi


}

render_md_files_pandoc() {
    while read -r doc; do
        # where 1=src, 2=dst, 3=title, 4=url
        TITLE=$(grep -i title "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        DESC=$(grep -i description "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        IMAGE=$(grep -i image "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        URI=$(echo "$2/${doc%\.md}.html" | cut -d . -f 2)
        DATE=$(grep -i date "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        pandoc --highlight-style pygments --toc "$1/$doc" |
            render_html_file "$3" \
            > "$2/${doc%\.md}.html" && \
          sed -i -e "/<title>/a <meta property=\"og:title\" content=\"$TITLE\" />\n\
            <meta property=\"og:description\" content=\"$DESC\" />\n\
            <meta property=\"og:type\" content=\"article\" />\n\
            <meta property=\"og:url\" content=\"$URI\" />\n\
            <meta property=\"og:image\" content=\"$4$IMAGE\" />\n"\
            "$2/${doc%\.md}.html"
    done
}

render_md_files_lowdown() {
    while read -r f; do
        lowdown \
            --html-no-escapehtml \
            --html-no-skiphtml \
            --parse-no-metadata \
            --parse-no-autolink <"$1/$f" |
            render_html_file "$3" \
                >"$2/${f%\.md}.html"
    done
}

render_md_files_Markdown_pl() {
    while read -r f; do
        Markdown.pl <"$1/$f" |
            render_html_file "$3" \
                >"$2/${f%\.md}.html"
    done
}

render_html_files() {
    while read -r f; do
        render_html_file "$3" <"$1/$f" >"$2/$f"
    done
}

render_html_file() {
    # h/t Devin Teske
    awk -v title="$1" '
    { body = body "\n" $0 }
    END {
        body = substr(body, 2)
        if (body ~ /<\/?[Hh][Tt][Mm][Ll]/) {
            print body
            exit
        }
        if (match(body, /<[[:space:]]*[Hh]1(>|[[:space:]][^>]*>)/)) {
            t = substr(body, RSTART + RLENGTH)
            sub("<[[:space:]]*/[[:space:]]*[Hh]1.*", "", t)
            gsub(/^[[:space:]]*|[[:space:]]$/, "", t)
            if (t) title = t " &mdash; " title
        }
        n = split(ENVIRON["HEADER"], header, /\n/)
        for (i = 1; i <= n; i++) {
            if (match(tolower(header[i]), "<title></title>")) {
                head = substr(header[i], 1, RSTART - 1)
                tail = substr(header[i], RSTART + RLENGTH)
                print head "<title>" title "</title>" tail
            } else print header[i]
        }
        print body
        print ENVIRON["FOOTER"]
    }'
}

list_pages() {
    e="\\( -name '*.html' -o -name '*.md' \\)"
    cd "$1" && eval "find . -type f ! -path '*/.*' ! -path '*/_*' $IGNORE $e" |
        sed 's#^./##;s#.md$#.html#;s#/index.html$#/#'
}

render_sitemap() {
    urls="$1"
    base_url="$2"
    date="$3"

    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<urlset'
    echo 'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    echo 'xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9'
    echo 'http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"'
    echo 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    echo "$urls" |
        sed -E 's#^(.*)$#<url><loc>'"$base_url"'/\1</loc><lastmod>'"$date"'</lastmod><priority>1.0</priority></url>#'
    echo '</urlset>'
}

main "$@"
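
For reference, the in-document metadata is read with a plain grep/head/awk pipeline, so each post only needs `key: value` lines near the top. Below is a minimal sketch of that lookup, assuming a made-up sample post. The helper name `front_matter_field` and the sample values are hypothetical; the pipeline itself is the one used in render_md_files_pandoc.

```shell
#!/bin/sh
# front_matter_field FILE KEY
# Prints the value from the first line in FILE that mentions KEY,
# splitting on ": " exactly as the script does.
# Hypothetical helper name, for illustration only.
front_matter_field() {
    grep -i "$2" "$1" | head -n1 | awk -F ": " '{ print $2 }'
}

# A made-up post with the fields the script looks for.
post=$(mktemp)
cat > "$post" <<'EOF'
---
title: My first post
description: A short summary shown on the home page
date: 2021-01-15
tags: blog, shell
---

Body text goes here.
EOF

front_matter_field "$post" title   # prints: My first post
front_matter_field "$post" date    # prints: 2021-01-15
rm -f "$post"
```

Because grep matches the key anywhere in the file and only the first hit is used, the metadata lines should come before any body text that happens to contain the same word.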

GitHub Action

jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    steps:    
      - name: pull existing github page
        run: |
          set -e  # if a command fails exit the script
          set -u  # script fails if trying to access an undefined variable
          git clone "https://${{secrets.API_TOKEN_GITHUB}}@github.com/$GITHUB_USER/$GITHUB_REPO" dst
          cd dst
          git rm -rf .
          git clean -fxd
          echo "$SITE_URL" > CNAME
          cd ..
          
      - name: install dependencies
        run: sudo apt install pandoc vim cpio -y
          
      - name: use ssg5 to build site
        run: |
          chmod +x ssg5
          ./ssg5 src dst "$SITE_TITLE" "$SITE_URL"
      - name: log dst folder structure
        run: find dst/ -print

      - name: log src folder structure
        run: find src/ -print
        
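      - name: commit & push page changes
        env:
          # pass the commit message via the environment so that quotes or
          # shell metacharacters in it cannot break (or inject into) the script
          COMMIT_MESSAGE: ${{ github.event.head_commit.message }}
        run: |
          cd dst
          git config user.name "$GITHUB_USER"
          git config user.email "$GITHUB_MAIL"
          git add .
          git diff-index --quiet HEAD || git commit --message "$COMMIT_MESSAGE"
          git push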

Folder Structure (generation side)

├── LICENSE
├── README.md
├── src
│   ├── about.md
│   ├── css
│   │   └── styles.css
│   ├── doc
│   │   ├── year_2-digital_forensics.pdf
│   │   ├── year_3-exploit_development.pdf
│   │   ├── year_3-group_project_proposal.pdf
│   │   ├── year_3-network_infrastucture_testing.pdf
│   │   ├── year_3-snmp_personal_project.pdf
│   │   ├── year_3-web_application_testing.pdf
│   │   ├── year_4-cv.pdf
│   │   ├── year_4-cybersecurity_for_the_vulnerable.pdf
│   │   ├── year_4-honours_project_dissertation.pdf
│   │   ├── year_4-honours_project_proposal.pdf
│   │   └── year_4-mobile_forensics_whitepaper.pdf
│   ├── favicon.png
│   ├── _footer.html
│   ├── _header.html
│   ├── posts
│   │   └── new_blog.md
│   └── projects.md
└── ssg5

CSS (which was not fun)

:root {
    /* colours based on solarized-dark by Ethan Schoonover
    https://ethanschoonover.com/solarized/#usage-development */
    --yellow: #b58900;
    --orange: #cb4b16;
    --red: #dc322f;
    --magenta: #d33682;
    --violet: #6c71c4;
    --blue: #268bd2;
    --cyan: #2aa198;
    --green: #859900;
    --base0: #839496;
    --base01: #586e75;
    --base02: #073642;
    --base03:   #002b36;
}

html {
  font-family: 'Source Code Pro', monospace;
  color: var(--base0);
  background-color: var(--base03);
  overflow-y: scroll;
  -webkit-text-size-adjust: 100%;
}

pre {
  color: var(--orange);
  white-space: pre;
  padding: 10px 15px;
  border-radius: 0px;
  overflow: auto;
  text-align: center;
  font-family: 'Courier Prime', monospace;
  font-weight: bold;
}

body {
  max-width: 1000px;
  line-height: 1.5em;
  padding: 1em;
  margin-right:auto;
  margin-left: auto;
}

a {
  color: var(--yellow);
}

li {
  display: list-item;
  text-align: -webkit-match-parent;
}

::marker {
  unicode-bidi: isolate;
  font-variant-numeric: tabular-nums;
  text-transform: none;
  text-indent: 0px !important;
  text-align: start !important;
  text-align-last: start !important;
}

ul {
  list-style-type: circle;
}

h1, h2, h3, h4, h5 { 
  padding-top:1.5rem;
  padding-bottom: 0.5rem;
  display: block;
  font-size: 150%;
  font-weight: bold;
  border-bottom: 1px dashed var(--base02);
  color: var(--yellow) 
}

h1::before {
  content: "# "
}

h2::before {
  content: "## "
}

h3::before {
  content: "### "
}

h4::before {
  content: "#### "
}

h5::before {
  content: "##### "
}

blockquote {
  color: var(--orange);
  border-left: 3px solid var(--orange);
  padding-left: 20px;
}

b, strong {
  font-weight: bold;
}

em {
  font-style: italic;
}

p {
  letter-spacing: 0.01em;
}

nav {
  top: 0;
  text-align: center;
  border-bottom: 1px dashed var(--base01);
  padding-bottom: 1em;
  font-size: 90%;
}

nav > a {
  text-decoration: underline;
}

input {
  display: none;
  visibility: hidden;
}

label {
  display: block;
  position: fixed;
  color: var(--base0);
  left: 90%;
  top: 1em;
  padding: 0.1rem 0.5rem 0.1rem 0.5rem;
  border: 1px solid var(--yellow)
}

.toc {
  position: fixed;
  font-size: 90%;
  left: 80%;
  top: 3rem;
  padding: 0.5rem;
  text-align: left;
  border-bottom: none;
  overflow: auto;
}

.toc a {
  text-decoration: none;
}

ul,
ol {
  padding: 0 0 0 2em;
}

.sourceCode {
  background-color: var(--base02);
  color: var(--base0);
}

.sourceCode pre {
  padding: 20px 20px 20px 20px
}

.sourceCode > code {
  padding: 0;
}

code {
  /* default code*/
  font-size: 100%;
  font-family: inherit;
  color: var(--orange);
  background-color: var(--base02);
}

code span.dv {
  /* numbers */
  color: var(--blue);
}

code span.op {
  /* main body */
  color: var(--base0);
}

code span.cf {
  /* statements i.e. if */
  color: var(--green);
}

code span.fu {
  /* keyword */
  color: var(--orange);
}

code span.kw {
  /* symbols */
  color: var(--yellow);
}

code span.co {
  /* comments */
  color: var(--base01);
}

code span.al {
  /* TODO comments */
  color: var(--magenta);
}

code span.st {
  /* string */
  color: var(--green);
}

code span.sc {
  /* string/character */
  color: var(--green);
}

And that is pretty much all there is to it.