_____ ___ ___ _____     ___     ___ ___   
 |     |  _|  _| __  |_ _|   |_ _|   |_  |  
 |  |  |  _|  _| __ -| | | | |_'_| | |_| |_ 
 |_____|_| |_| |_____|_  |___|_,_|___|_____|
                     |___|                  

In the beginning - traditional hosting

Since uni I have had a Wix-based site (I know…) which hosted old coursework and was eventually supposed to host blog content. It was easy to set up and the main content was easy to customize, but it was rather awkward to blog on and cost quite a lot for what, in my use case, essentially amounted to static content. I also had a Google Workspace mail account, which again was really easy to set up but was overkill for my purposes.

Building it better

Every so often I would have a surge of inspiration to finally get rid of my old website and use the latest and greatest to build a beautiful static one. I’d then spend a few days writing something in Gatsby, Next.js, hell - even pure HTML. Ultimately I’d find a way to overcomplicate it (which these libraries lend themselves to), usually in an effort to make future blogging easier (ironically), then either get stuck, get worn out, or become convinced that X library would make this whole thing way easier. Usually more than one of these. Eventually, my library/framework hopping landed me here - SSG5/SSG6.

SSG6

SSG6 (Static Site Generator 6) is a POSIX-compliant shell script written by Roman Zolotarev, which I modified to suit my own needs. The documentation for the original can be found here.

On top of the excellent work by Roman, I’ve added pagination and an article list for the home page.

In doing so, I’ve broken the POSIX compatibility; however, Bash 4 is relatively widely used, and I couldn’t think of another way to achieve what I wanted without adding another app anyway.
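
As a rough illustration of why Bash 4 is needed, here is a minimal sketch of the two features the modified script leans on: associative arrays (`declare -A`) and `readarray`. The helper `add_post`, the variable names, and the dates and titles are made up for the example; only the "date plus counter" slug idea comes from the script.

```shell
#!/bin/bash
# Sketch only: map "date.counter" slugs to post titles, then list newest first.
declare -A post_by_slug   # associative array: needs Bash 4+
declare -a slugs

add_post() {
    # $1 = date (YYYY-MM-DD), $2 = title
    # the counter suffix keeps two posts with the same date from colliding
    local slug="$1.${#slugs[@]}"
    slugs+=("$slug")
    post_by_slug["$slug"]="$2"
}

add_post "2021-03-01" "First post"
add_post "2021-06-15" "Second post"
add_post "2021-06-15" "Third post"

# readarray is also Bash 4+; reverse sort puts the newest slug first
readarray -t sorted < <(printf '%s\n' "${slugs[@]}" | LC_ALL=C sort -r)

newest_first=""
for slug in "${sorted[@]}"; do
    newest_first+="${post_by_slug[$slug]};"
done
printf '%s\n' "$newest_first"
```

A plain POSIX shell has no keyed arrays, so the same lookup would need temporary files or an extra tool, which is the trade-off described above.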

GitHub Pages

There were a number of places I could’ve chosen for hosting, but again I wanted something simple that I couldn’t rabbit-hole myself with. In terms of ease of use, I don’t think anything compares to GitHub Pages.

Zoho Mail

Zoho Mail makes it incredibly easy to set up custom domains, and they have great documentation and frankly great prices. My new mail service costs me £0.80 per month (£9.60 per year); previously Google Workspace cost me £80 per year. In other words, Zoho is 88% cheaper!
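
For anyone who wants to check the yearly comparison, a quick sketch of the arithmetic using the two prices quoted above (the variable names are just for illustration):

```shell
# Prices taken from the paragraph above
zoho_monthly=0.80    # GBP per month
google_yearly=80     # GBP per year

# saving as a percentage of the old yearly cost
saving=$(awk -v z="$zoho_monthly" -v g="$google_yearly" \
    'BEGIN { printf "%.0f", (g - z * 12) / g * 100 }')

echo "Zoho works out ${saving}% cheaper per year"
```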

They have a really nice web app, their documentation is detailed and easy to follow, and they actually have support. If you want to check them out - here is a link to their store.

Implementation

In this section you might expect me to explain the changes that I made to SSG5/6. However, it has been several months since I made them, so the best I can suggest is that you compare my version to the source. Hopefully, between the comments in my version and git diff, you can piece together what is happening.
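
One way to do that comparison is sketched below. The function name `compare_versions` is made up, and the two temporary files are tiny stand-ins so the snippet runs on its own; in practice you would point it at a downloaded copy of the original script and at my version.

```shell
# compare_versions ORIGINAL MODIFIED
# Prints a unified diff; exit status is 0 if identical, 1 if they differ.
compare_versions() {
    diff -u "$1" "$2"
}

# Stand-in files (assumption: real use would pass the two actual scripts)
orig=$(mktemp)
mine=$(mktemp)
printf '%s\n' '#!/bin/sh -e' 'main "$@"' >"$orig"
printf '%s\n' '#!/bin/bash -e' 'main "$@"' >"$mine"

# count added/removed lines, ignoring the "---" and "+++" file headers
changes=$(compare_versions "$orig" "$mine" | grep -c '^[-+][^-+]' || true)
echo "changed lines: $changes"

rm -f "$orig" "$mine"
```

`git diff --no-index ORIGINAL MODIFIED` gives the same kind of output if you prefer git's formatting and colours.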

In case it isn’t obvious from the action: I have two repositories, one which stores the content and the HTML rendering script, and one which just acts as a staging ground for my GitHub Pages site. Most likely, if you are recreating what I have here, you should just use one repo for both. I originally intended to push my site to Vercel/Zeit (having totally forgotten about GitHub Pages), which is why it is structured this way. Essentially I just added a janky patch to push to my pages repo instead of elsewhere.

SSG6 Extended

#!/bin/bash -e
#
# https://rgz.ee/bin/ssg5
# Copyright 2018-2019 Roman Zolotarev <hi@romanzolotarev.com>
#
# Permission to use, copy, modify, and/or distribute this software for any
# purpose with or without fee is hereby granted, provided that the above
# copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
# ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
# OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#
# modified to add pagination & article list for use on offby0x01.github.io

main() {
    # check if args are set
    test -n "$1" || usage
    test -n "$2" || usage
    test -n "$3" || usage
    test -n "$4" || usage
    # check if src & dst are directories
    test -d "$1" || no_dir "$1"
    test -d "$2" || no_dir "$2"

    # set key variables
    src=$(readlink_f "$1")
    dst=$(readlink_f "$2")
    title="$3"
    base_url="$4"
    max_page_posts="6"

    # create ignore regex - similar to git ignore
    IGNORE=$(
        if ! test -f "$src/.ssgignore"; then
            printf ' ! -path "*/.*"'
            return
        fi
        while read -r x; do
            test -n "$x" || continue
            printf ' ! -path "*/%s*"' "$x"
        done <"$src/.ssgignore"
    )

    # populate HEADER & FOOTER (if assoc files exist)
    h_file="$src/_header.html"
    f_file="$src/_footer.html"
    test -f "$f_file" && FOOTER=$(cat "$f_file") && export FOOTER
    test -f "$h_file" && HEADER=$(cat "$h_file") && export HEADER
    # for each dir in src, create a corresponding (empty) dir in dst
    list_dirs "$src" | (cd "$src" && cpio -pdu "$dst")

    # since I delete dst every time, I don't need the original update functionality
    # if you do need it, I recommend using the original https://www.romanzolotarev.com/bin/ssg6
    fs=$(list_files "$1")

    # if there are any files in src
    if test -n "$fs"; then
        # update dst/.files list
        echo "$fs" | tee "$dst/.files"

        # if there are any markdown files in src
        if echo "$fs" | grep -q '\.md$'; then
            # look for pandoc
            if test -x "$(which pandoc 2> /dev/null)"; then
                # generate home page (recent posts)
                echo "$fs" | grep '\.md$' | generate_home_page "$src"
                # render other pages
                echo "$fs" | grep '\.md$' |
                    render_md_files_pandoc "$src" "$dst" "$title" "$base_url"
            # or lowdown
            elif test -x "$(which lowdown 2>/dev/null)"; then
                echo "$fs" | grep '\.md$' |
                    render_md_files_lowdown "$src" "$dst" "$title"
            # or markdown.pl
            elif test -x "$(which Markdown.pl 2>/dev/null)"; then
                echo "$fs" | grep '\.md$' |
                    render_md_files_Markdown_pl "$src" "$dst" "$title"
            else
                echo "couldn't find pandoc, lowdown, or Markdown.pl"
                exit 3
            fi      
        fi

        # refresh fs list - we may have generated a few new pages
        fs=$(list_files "$1")

        # look in src for any html files, pass them to render fn
        echo "$fs" | grep '\.html$' |
            render_html_files "$src" "$dst" "$title"

        # copy over all other file types i.e. images
        echo "$fs" | grep --extended-regexp --invert-match '\.md$|\.html$' |
            (cd "$src" && cpio -pu "$dst")

        # create sitemap
        date=$(date +%Y-%m-%d)
        urls=$(list_pages "$src")

        # if there are any urls to add, add them to sitemap
        test -n "$urls" &&
            render_sitemap "$urls" "$base_url" "$date" >"$dst/sitemap.xml"

    fi
}

usage() {
    # help message - missing arg or invalid command
    echo "usage: ${0##*/} src dst title base_url" >&2
    exit 1
}

no_dir() {
    # help message - dir does not exist
    echo "${0##*/}: $1: No such directory" >&2
    exit 2
}

readlink_f() {
    # get absolute path to file
    file="$1"
    cd "$(dirname "$file")"
    file=$(basename "$file")
    while test -L "$file"; do
        file=$(readlink "$file")
        cd "$(dirname "$file")"
        file=$(basename "$file")
    done
    dir=$(pwd -P)
    echo "$dir/$file"
}

list_dirs() {
    cd "$1" && eval "find . -type d ! -name '.' ! -path '*/_*' $IGNORE"
}

list_files() {
    # list all files in all src directories
    cd "$1" && eval "find . -type f ! -name '.' ! -path '*/_*' $IGNORE"
}

generate_home_page() {
    date_counter=0
    declare -A date_post; declare -a dates;
    while read -r doc; do
        # skip non-post pages; paths coming from list_files are relative ("./...")
        if [ "$doc" = "./index.md" ] || [ "$doc" = "./about.md" ] || [ "$doc" = "./projects.md" ]; then
            continue
        fi

        # borrowed from https://github.com/fmash16/ssg5
        TITLE=$(grep -i title "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')
        DATE=$(grep -i date "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')
        DESC=$(grep -i description "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        TAGS=$(grep -i tags "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')
        IMAGE=$(grep -i image "$1/$doc" | head -n1 | awk -F ":" '{ print $2 }')

        # rough slug system in case of duplicate dates
        slug="$DATE.$date_counter"
        date_counter=$(($date_counter+1))

        # requires bash 4
        dates+=("$slug")
        date_post["$slug"]=$(printf "\
<h1 id=\"$TITLE\" style=\"font-size:1.3em;border-bottom: none; padding-bottom: 0em;\">\n\
  <a href=\"/${doc%\.md}.html\">$TITLE</a>\n\
</h1>\n\
<p style=\"text-align:justify;\">$DESC<br/>\n\
<strong><u>tags</u>: </strong>[$TAGS ]</p>")
    done

    post_count=0
    page_count=0
    # sort dates (newest to oldest)
    readarray -t sorted_dates < <(printf '%s\n' "${dates[@]}" | sort -r)
    for d in "${sorted_dates[@]}"; do

        echo "current article date: $d"

        # write article summary to page
        printf '%s\n\n' "${date_post[$d]}" >> "$1/index$page_count.html"
        
        post_count=$(($post_count+1))

        # if we reach max num of posts for this page
        if [ "$post_count" -eq "$max_page_posts" ]; then
            # create a link to next page
            printf "<center><a href=\"index$(($page_count+1)).html\">Next>></a></center>\n" >> "$1/index$page_count.html"
            # update page count and reset post count
            post_count="0"
            page_count=$(($page_count+1))
        fi
    done

    # rename index0.html to index.html
    if [ -f "$src/index0.html" ]; then
        mv "$src/index0.html" "$src/index.html"
    fi
}

render_md_files_pandoc() {
    while read -r doc; do
        # where 1=src, 2=dst, 3=title, 4=url
        TITLE=$(grep -i title "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        DESC=$(grep -i description "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        IMAGE=$(grep -i image "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        URI=$(echo "$2/${doc%\.md}.html" | cut -d . -f 2)
        DATE=$(grep -i date "$1/$doc" | head -n1 | awk -F ": " '{ print $2 }')
        pandoc --highlight-style pygments --toc "$1/$doc" |
            render_html_file "$3" \
            > "$2/${doc%\.md}.html" && \
          sed -i -e "/<title>/a <meta property=\"og:title\" content=\"$TITLE\" />\n\
            <meta property=\"og:description\" content=\"$DESC\" />\n\
            <meta property=\"og:type\" content=\"article\" />\n\
            <meta property=\"og:url\" content=\"$URI\" />\n\
            <meta property=\"og:image\" content=\"$4$IMAGE\" />\n"\
            "$2/${doc%\.md}.html"
    done
}

render_md_files_lowdown() {
    while read -r f; do
        lowdown \
            --html-no-escapehtml \
            --html-no-skiphtml \
            --parse-no-metadata \
            --parse-no-autolink <"$1/$f" |
            render_html_file "$3" \
                >"$2/${f%\.md}.html"
    done
}

render_md_files_Markdown_pl() {
    while read -r f; do
        Markdown.pl <"$1/$f" |
            render_html_file "$3" \
                >"$2/${f%\.md}.html"
    done
}

render_html_files() {
    while read -r f; do
        render_html_file "$3" <"$1/$f" >"$2/$f"
    done
}

render_html_file() {
    # h/t Devin Teske
    awk -v title="$1" '
    { body = body "\n" $0 }
    END {
        body = substr(body, 2)
        if (body ~ /<\/?[Hh][Tt][Mm][Ll]/) {
            print body
            exit
        }
        if (match(body, /<[[:space:]]*[Hh]1(>|[[:space:]][^>]*>)/)) {
            t = substr(body, RSTART + RLENGTH)
            sub("<[[:space:]]*/[[:space:]]*[Hh]1.*", "", t)
            gsub(/^[[:space:]]*|[[:space:]]$/, "", t)
            if (t) title = t " &mdash; " title
        }
        n = split(ENVIRON["HEADER"], header, /\n/)
        for (i = 1; i <= n; i++) {
            if (match(tolower(header[i]), "<title></title>")) {
                head = substr(header[i], 1, RSTART - 1)
                tail = substr(header[i], RSTART + RLENGTH)
                print head "<title>" title "</title>" tail
            } else print header[i]
        }
        print body
        print ENVIRON["FOOTER"]
    }'
}

list_pages() {
    e="\\( -name '*.html' -o -name '*.md' \\)"
    cd "$1" && eval "find . -type f ! -path '*/.*' ! -path '*/_*' $IGNORE $e" |
        sed 's#^\./##;s#\.md$#.html#;s#/index\.html$#/#'
}

render_sitemap() {
    urls="$1"
    base_url="$2"
    date="$3"

    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<urlset'
    echo 'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    echo 'xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9'
    echo 'http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"'
    echo 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    echo "$urls" |
        sed -E 's#^(.*)$#<url><loc>'"$base_url"'/\1</loc><lastmod>'"$date"'</lastmod><priority>1.0</priority></url>#'
    echo '</urlset>'
}

main "$@"
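
The script pulls each post's title, date, description and tags out of the markdown file with grep and awk. A minimal sketch of that idea follows; the front matter layout of the sample post, its values, and the `front_matter` helper are assumptions for illustration, not copied from my posts or from the script.

```shell
# Hypothetical post: the field names mirror what the script greps for
post=$(mktemp)
cat >"$post" <<'EOF'
---
title: New blog
date: 2021-06-15
description: Swapping Wix for a shell script
tags: meta, ssg
---

Body text goes here.
EOF

# front_matter FIELD FILE
# Prints the value of the first line that starts with "FIELD: ".
# cut -f 2- keeps any further colons that appear inside the value.
front_matter() {
    grep -i "^$1: " "$2" | head -n1 | cut -d ':' -f 2- | sed 's/^ *//'
}

title=$(front_matter title "$post")
post_date=$(front_matter date "$post")
tags=$(front_matter tags "$post")
printf '%s | %s | %s\n' "$post_date" "$title" "$tags"

rm -f "$post"
```

Anchoring the pattern to the start of the line is a bit stricter than the script's plain `grep -i title`, which would also match the word "title" anywhere in the body of a post.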

GitHub Action

jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    steps:    
      - name: pull existing github page
        run: |
          set -e  # if a command fails exit the script
          set -u  # script fails if trying to access an undefined variable
          git clone "https://${{secrets.API_TOKEN_GITHUB}}@github.com/$GITHUB_USER/$GITHUB_REPO" dst
          cd dst
          git rm -rf .
          git clean -fxd
          echo "$SITE_URL" > CNAME
          cd ..
          
      - name: install dependencies
        run: sudo apt-get update && sudo apt-get install -y pandoc vim cpio
          
      - name: use ssg5 to build site
        run: |
          chmod +x ssg5
          ./ssg5 src dst "$SITE_TITLE" "$SITE_URL"
      - name: log dst folder structure
        run: find dst/ -print

      - name: log src folder structure
        run: find src/ -print
        
      - name: commit & push page changes
        run: |
          cd dst
          git config user.name "$GITHUB_USER"
          git config user.email "$GITHUB_MAIL"
          git add .
          git diff-index --quiet HEAD || git commit --message "${{github.event.head_commit.message}}"
          git push
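
If you want to run the build locally rather than in the action, a small pre-flight check along these lines can save a confusing failure halfway through. This is a sketch: `missing_deps` is a hypothetical helper, and the two commands checked are simply the ones the build relies on (pandoc for rendering, cpio for copying files).

```shell
# missing_deps CMD...
# Prints the name of every command that cannot be found on PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

missing=$(missing_deps pandoc cpio)
if [ -n "$missing" ]; then
    echo "install these before building: $missing" >&2
fi
```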

Folder Structure (generation side)

├── LICENSE
├── README.md
├── src
│   ├── about.md
│   ├── css
│   │   └── styles.css
│   ├── doc
│   │   ├── year_2-digital_forensics.pdf
│   │   ├── year_3-exploit_development.pdf
│   │   ├── year_3-group_project_proposal.pdf
│   │   ├── year_3-network_infrastucture_testing.pdf
│   │   ├── year_3-snmp_personal_project.pdf
│   │   ├── year_3-web_application_testing.pdf
│   │   ├── year_4-cv.pdf
│   │   ├── year_4-cybersecurity_for_the_vulnerable.pdf
│   │   ├── year_4-honours_project_dissertation.pdf
│   │   ├── year_4-honours_project_proposal.pdf
│   │   └── year_4-mobile_forensics_whitepaper.pdf
│   ├── favicon.png
│   ├── _footer.html
│   ├── _header.html
│   ├── posts
│   │   └── new_blog.md
│   └── projects.md
└── ssg5
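
Note that `_header.html` and `_footer.html` in the tree above are templates rather than pages: the script's `find` filters skip any path with a component starting with `_`, and hidden files are skipped by default when there is no `.ssgignore`. A minimal sketch of those two filters against a throwaway copy of a similar layout (the temporary directory and file set are made up for the example):

```shell
# Build a small stand-in tree
tmp=$(mktemp -d)
mkdir -p "$tmp/posts" "$tmp/css"
touch "$tmp/about.md" "$tmp/_header.html" "$tmp/_footer.html" \
      "$tmp/posts/new_blog.md" "$tmp/css/styles.css" "$tmp/.hidden"

# same two path filters the script uses by default
pages=$(cd "$tmp" && find . -type f ! -path '*/_*' ! -path '*/.*' | LC_ALL=C sort)
printf '%s\n' "$pages"

rm -rf "$tmp"
```

Only `about.md`, `css/styles.css` and `posts/new_blog.md` survive the filters, so the templates never end up in the generated site as standalone pages.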

CSS (which was not fun)

:root {
    /* colours based on solarized-dark by Ethan Schoonover
    https://ethanschoonover.com/solarized/#usage-development */
    --yellow: #b58900;
    --orange: #cb4b16;
    --red: #dc322f;
    --magenta: #d33682;
    --violet: #6c71c4;
    --blue: #268bd2;
    --cyan: #2aa198;
    --green: #859900;
    --base0: #839496;
    --base01: #586e75;
    --base02: #073642;
    --base03: #002b36;
}

html {
  font-family: 'Source Code Pro', monospace;
  color: var(--base0);
  background-color: var(--base03);
  overflow-y: scroll;
  -webkit-text-size-adjust: 100%;
}

pre {
  color: var(--orange);
  white-space: pre;
  padding: 10px 15px;
  border-radius: 0px;
  overflow: auto;
  text-align: center;
  font-family: 'Courier Prime', monospace;
  font-weight: bold;
}

body {
  max-width: 1000px;
  line-height: 1.5em;
  padding: 1em;
  margin-right:auto;
  margin-left: auto;
}

a {
  color: var(--yellow);
}

li {
  display: list-item;
  text-align: -webkit-match-parent;
}

::marker {
  unicode-bidi: isolate;
  font-variant-numeric: tabular-nums;
  text-transform: none;
  text-indent: 0px !important;
  text-align: start !important;
  text-align-last: start !important;
}

ul {
  list-style-type: circle;
}

h1, h2, h3, h4, h5 { 
  padding-top:1.5rem;
  padding-bottom: 0.5rem;
  display: block;
  font-size: 150%;
  font-weight: bold;
  border-bottom: 1px dashed var(--base02);
  color: var(--yellow) 
}

h1::before {
  content: "# "
}

h2::before {
  content: "## "
}

h3::before {
  content: "### "
}

h4::before {
  content: "#### "
}

h5::before {
  content: "##### "
}

blockquote {
  color: var(--orange);
  border-left: 3px solid var(--orange);
  padding-left: 20px;
}

b, strong {
  font-weight: bold;
}

em {
  font-style: italic;
}

p {
  letter-spacing: 0.01em;
}

nav {
  top: 0;
  text-align: center;
  border-bottom: 1px dashed var(--base01);
  padding-bottom: 1em;
  font-size: 90%;
}

nav > a {
  text-decoration: underline;
}

input {
  display: none;
  visibility: hidden;
}

label {
  display: block;
  position: fixed;
  color: var(--base0);
  left: 90%;
  top: 1em;
  padding: 0.1rem 0.5rem 0.1rem 0.5rem;
  border: 1px solid var(--yellow)
}

.toc {
  position: fixed;
  font-size: 90%;
  left: 80%;
  top: 3rem;
  padding: 0.5rem;
  text-align: left;
  border-bottom: none;
  overflow: auto;
}

.toc a {
  text-decoration: none;
}

ul,
ol {
  padding: 0 0 0 2em;
}

.sourceCode {
  background-color: var(--base02);
  color: var(--base0);
}

.sourceCode pre {
  padding: 20px 20px 20px 20px
}

.sourceCode > code {
  padding: 0;
}

code {
  /* default code*/
  font-size: 100%;
  font-family: inherit;
  color: var(--orange);
  background-color: var(--base02);
}

code span.dv {
  /* numbers */
  color: var(--blue);
}

code span.op {
  /* operators */
  color: var(--base0);
}

code span.cf {
  /* control flow, e.g. if */
  color: var(--green);
}

code span.fu {
  /* functions */
  color: var(--orange);
}

code span.kw {
  /* keywords */
  color: var(--yellow);
}

code span.co {
  /* comments */
  color: var(--base01);
}

code span.al {
  /* TODO comments */
  color: var(--magenta);
}

code span.st {
  /* string */
  color: var(--green);
}

code span.sc {
  /* special characters, e.g. escapes */
  color: var(--green);
}

And that is pretty much all there is to it.