Another blog update
2025-10-24
In my last post, I mentioned that I was thinking of rewriting my web server from Go to Rust and essentially creating my own static site generator. Fortunately, in the meantime, I realized that there was a much simpler solution right under my nose.
The main issue I wanted to solve was that it was a bit annoying to convert the Markdown posts I’ve been writing into HTML to be served on my website. As I described in an earlier post, I’ve been using pandoc and m4 to perform this conversion, along with a couple of scripts that I’ve been running manually. This was better than the even more manual process I was following before, but it still wasn’t good enough. At the time, I was excited to cobble something together with m4 and some shell scripts, showing off snippets like:
    syscmd({{sed -n '/def load_blogs/,/^$/p' scripts/publish.py | sed '$d'}})dnl

as a triumph. Again, this was somehow better than my earlier workflow with org-mode or writing HTML by hand, but the mental overhead of remembering how m4 worked was still keeping me from writing posts. You might reasonably be thinking that I wouldn’t really have to think about m4 in every post because it should only pop up when I want to use funny commands like this. Unfortunately, that is not the case. The default quoting syntax in m4 uses a backtick as its opening quote (and an apostrophe as its closing one), which means I always had to remember to include:
    changequote(`{{', `}}')dnl

at the top of my posts, or every code snippet throughout the file would be mangled. And there was also a weird spacing quirk with newlines around this command that I still don’t fully understand. I don’t want to disparage m4 in general, but it really didn’t feel like the right tool for this job.
So what have I replaced it with? This Python script:
    import subprocess
    from pathlib import Path

    from watchfiles import Change, watch

    from publish import load_blogs, update_blogs

    BLOG_FILE = "json/blogs.json"
    BLOGS = load_blogs(BLOG_FILE)


    def expand_templates(contents: str) -> str:
        """
        This is where I'd put my jinja templates, if I had any
        """
        return contents


    def run_pandoc(contents: str) -> str:
        return subprocess.run(
            ["pandoc", "-f", "markdown", "-t", "html"],
            check=True,
            capture_output=True,
            text=True,
            input=contents,
        ).stdout


    def update(path):
        path = Path(path)
        contents = path.read_text()
        html = run_pandoc(expand_templates(contents))
        out_path = (Path("blogs") / path.name).with_suffix(".html")
        out_path.write_text(html)
        update_blogs(BLOG_FILE, BLOGS, str(out_path))


    if __name__ == "__main__":
        subprocess.Popen(["make", "run"])
        for changes in watch("drafts/"):
            for change, path in changes:
                match change:
                    case Change.modified:
                        update(path)
                    case Change.added if (Change.deleted, path) in changes:
                        update(path)

The main exciting thing here is the use of the watchfiles package
to watch my drafts/ directory for changed files and to fire
off the commands to update the corresponding HTML file when they change.
These commands currently just include shelling out to pandoc, but as the
comment notes, I’ve added an expand_templates function
where I can fill in jinja templates at some point. I plan to use this
for cases like the sed command above where I want to include a whole
code file or a section of a file. This should be a lot nicer than the m4
commands I tried to write because I can write all of the logic in Python
and have something like:
    {{include("/path/to/file.py")}}
in the body of my post. I can also keep reusable functions like this
within the watch.py script above instead of defining them
within each post (or including another m4 file or whatever I was doing
before).
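As a rough illustration of where this could go, here is a minimal sketch of an include helper and an expand_templates that calls it. It uses a plain regular expression from the standard library as a stand-in for jinja, so it is not how jinja itself would be wired up. The optional second argument to include and the INCLUDE_CALL pattern are assumptions made for this example, not part of the script above.

```python
import re
from pathlib import Path
from typing import Optional


def include(path: str, start: Optional[str] = None) -> str:
    """Hypothetical helper: return a whole file, or one section of it.

    With `start`, return the lines from the first one matching that regex
    up to (but not including) the next empty line, which is roughly what
    the sed pipeline shown earlier does.
    """
    lines = Path(path).read_text().splitlines()
    if start is None:
        return "\n".join(lines)
    pattern = re.compile(start)
    for i, line in enumerate(lines):
        if pattern.search(line):
            section = []
            for candidate in lines[i:]:
                if candidate == "":
                    break
                section.append(candidate)
            return "\n".join(section)
    raise ValueError(f"no line matching {start!r} in {path}")


# Matches {{include("path")}} and {{include("path", "regex")}}.
# Assumption: a regex stand-in for a template engine, not real jinja syntax.
INCLUDE_CALL = re.compile(
    r'\{\{\s*include\(\s*"([^"]+)"\s*(?:,\s*"([^"]+)"\s*)?\)\s*\}\}'
)


def expand_templates(contents: str) -> str:
    """Replace each include call in a post with the text it refers to."""
    return INCLUDE_CALL.sub(lambda m: include(m.group(1), m.group(2)), contents)
```

Under these assumptions, writing `{{include("scripts/publish.py", "def load_blogs")}}` in a draft would pull in just that function, and the one-argument form would pull in the whole file.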
Now if I want to write a post, all I have to do is open the repo,
kick off this watch.py script, and start editing a file in
the drafts/ directory. Saving it for the first time will
automatically handle everything else.
The very last pain point is not too visible here, but the old
publish.py file, which I can mostly delete now, looks like
this with the imports and __main__ block hidden for
brevity:
    @dataclass
    class Blog:
        Title: str
        Filename: str
        Date: str


    def load_blogs(filename):
        with open(filename) as f:
            return [Blog(**d) for d in json.load(f)]


    def update_blogs(filename, blogs, name):
        if not any((b.Filename == name for b in blogs)):
            date = datetime.today().strftime("%Y-%m-%d")
            blogs.insert(0, Blog(name, name, date))
        with open(filename, "w") as out:
            json.dump(blogs, out, default=lambda o: asdict(o), indent=4)

I’m just using the path to the HTML file as both the
Title and Filename. If I could just extract a
real title from Markdown metadata at the start of the input, this
setup would essentially be perfect. It looks like this already works
with my basic pandoc command above, so I just need to update my Python
scripts to get the title from a line like this at the start of the
file:
    % My title here
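One way the title lookup might work is sketched below. The function name extract_title and its fallback parameter are hypothetical, and the sketch only handles the single-line form shown above, not multi-line titles or other metadata styles.

```python
def extract_title(contents: str, fallback: str) -> str:
    """Return the title from a leading '% ...' line, or `fallback`.

    Assumption: only the first line of the draft is inspected, so a
    title that continues onto following lines is cut off at the first.
    """
    first_line = contents.split("\n", 1)[0]
    if first_line.startswith("%"):
        title = first_line[1:].strip()
        if title:
            return title
    return fallback


# A draft with a title line yields that title.
print(extract_title("% My title here\n\nBody text", "blogs/post.html"))
# A draft without one keeps the current behavior of using the path.
print(extract_title("Body text only\n", "blogs/post.html"))
```

To use something like this, update_blogs would also need to accept the title separately from the file name, so that the Blog entry is built with the extracted title instead of passing the path twice.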
Now I might start writing these more often.