2 Meg in the Embed, and the little tweet said “Give over”

As the embedded tweet says, I was shocked to discover that a single Twitter embed – inserted into a WordPress blog using the native embed feature – more than doubled the download size of my site, thanks to dozens of Twitter JavaScript resources, all fetched using HTTP/1.1. For a tweet that contains just text from me(!), it is unpalatable that so much – well, crap – is pulled in to render it. I’ve come up with a solution now, but before that, I sent the above tweet in the hope that someone might have a magic fix. My plan sort of worked: it generated a lot of replies from people more knowledgeable about web development than me. The discussion was very interesting and I recommend you read it. But if you don’t have the time for that, my summary is:

  1. This is not a “new” problem.
  2. Lots of people assume my website is built on some kind of Server Side Rendering paradigm where I have a build step before deploying static assets.
  3. There are lots of “clever” techniques that can be applied during the build step in order to avoid the high cost of the Twitter embed code. The meaning of clever in this context seems to vary from person to person.

Points 2) and 3) were quite eye opening for me. I like to play “dummies advocate” with technology and think about how someone on a managed blog platform might respond to such advice. Could they realistically change their entire CMS over from a nice WYSIWYG editor to a bunch of technologies with unfamiliar names that need to be installed and managed via continuous integration? While something like GitHub Pages is nice, elegant and simple for techies like me that use GitHub and Markdown on a daily basis, it doesn’t fly with the family and friends I play tech support for. Imagine how hard it is to simply explain what a repository or a pull request is to someone who has never heard of source control. Some might argue that dynamic sites are too slow in this day and age, but read the Background section for why I don’t agree.

I tried to express my point in the Twitter discussion and, to be fair, some pragmatic answers came out. For instance, one suggestion was to just take a screenshot of the tweet and add a hyperlink to Twitter. That actually seems pretty fine, except it would still gall me to encode short text in a bloated image. And it would probably be more work than I could be bothered to do. I hope Twitter might consider offering a “light” embed but I won’t hold my breath.

Towards the end of the replies I got this one from Steren:

This seemed more up my street. Perhaps I could just manually copy the text out of tweets and stick it in an HTML block in my WordPress editor? Manual work sucks though; surely someone has a script for this, right? I Googled for a bit and started reading into the deep corners of Twitter APIs, the backlash over past changes, and the annoying requirements for OAuth access. That last point in particular seems to affect solutions to the lighter-weight embed challenge: you have to get an API key, and I really can’t be bothered with another thing to worry about.

Solution

During my search, I found Arthur Rump’s page Fallback styling for embedded Tweets. This was awesome because it explains that:

Embedding a Tweet on your website is easy to do. Find the tweet, click on Embed Tweet, copy, paste, done.

Note how all the actual content of the Tweet is just text in a blockquote. That’s great because if the script does not load, the content you wanted to share is still there and readable. This could happen in situations where Twitter is entirely blocked (whether by a company or a nation-state), the user has JavaScript disabled, or because the script is blocked by Content Blocking in Firefox. However, this means that Tweets will be rendered as blockquotes by default.

Arthur Rump

Sure as dammit, right after the </blockquote> is <script async src="https://platform.twitter.com/widgets.js"></script>. And in my case, I explicitly want to block this thing that loads all the crap! And I want to do this blocking myself, so that users don’t have to.

So my solution that removes most of the manual steps is a one-liner:

$ curl -s "https://publish.twitter.com/oembed?url=[url of tweet]" \
    | jq -r '.html' \
    | sed 's/class="twitter-tweet"/class="twitter-tweet" data-dnt="true"/' \
    | sed 's/<script.*<\/script>//g' \
    | tr -d '\n'

Or a simple bash script I like to call embed-tweet-sans-crap.sh:

#!/bin/bash
# Fetch the tweet's oEmbed markup, turn on Do Not Track, strip the
# widgets.js script tag, and flatten to one line for pasting.
curl -s "https://publish.twitter.com/oembed?url=$1" | jq -r '.html' \
    | sed 's/class="twitter-tweet"/class="twitter-tweet" data-dnt="true"/' \
    | sed 's/<script.*<\/script>//g' | tr -d '\n'

For example, to grab the tweet that I embedded at the top of this page:

$ ./embed-tweet-sans-crap.sh https://twitter.com/SimmerVigor/status/1345393494559502337

<blockquote class="twitter-tweet" data-dnt="true"><p lang="en" dir="ltr">Due to a single embedded tweet, it seems my website at <a href="https://t.co/jLqWpZKaB6">https://t.co/jLqWpZKaB6</a> loads about 2.5 megabytes of JS across a dozen HTTP/1.1 requests to <a href="https://t.co/7OXJ9hd5oI">https://t.co/7OXJ9hd5oI</a>. WTF, this doubles the total size of download resources. I think I&#39;ll just delete the embed.</p>&mdash; Lucas Pardue (@SimmerVigor) <a href="https://twitter.com/SimmerVigor/status/1345393494559502337?ref_src=twsrc%5Etfw">January 2, 2021</a></blockquote>

Then, in WordPress, embed the tweet by adding a Custom HTML block and pasting in the blockquote. As Arthur points out, by default it will look quite plain, but that can be fixed using CSS. So I just borrowed Arthur’s CSS and added it as a global custom CSS change. Here’s what it looks like in the editor:

Screenshot of the WordPress WYSIWYG editor with a Custom HTML block.

Performance Results

So did my work change anything? Here’s a before/after example for the main homepage. I think this is a success even if the embedded tweet is functionally and stylistically basic. It’s better than just deleting the embed…

With crap: ~50 requests, 3.2 MB transferred, 4.4 MB resources.

Sans crap: 31 requests, 453 kB transferred, 956 kB resources.

Note that, because of lazy loading, scripts and images don’t appear to have a huge effect on some of the critical events in either case. But it sure doesn’t offend me to see that, due to Cloudflare’s speed features (see Background), DOMContentLoaded and Load times are under 200 ms, with Lighthouse reporting a Performance score of 96 based on First Contentful Paint 0.4 s, Time to Interactive 0.4 s, and Largest Contentful Paint 0.5 s. I lost points on Cumulative Layout Shift but I can fix that some other day.

Page load with Twitter embed code. 50 requests, 3.2 MB transferred, 4.4 MB resources.
Page load with no crap. 31 requests, 453 kB transferred, 956 kB resources.

Background

My annual end-of-year tradition is to log in to my blog and fiddle about. I make grand promises that I’ll do more blogging each year and typically fail. But I also take the opportunity to make some technical change or improvement. This time around I intended to give Cloudflare’s Automatic Platform Optimization (APO) a spin to see how much it could improve my WordPress-powered blog. (Disclaimer: I am a Cloudflare employee but I don’t work on this product. My clever colleagues do, and I was keen to see just how turnkey this solution was and what its impact would be on a typical blog maintained by a lazy owner such as myself.)

The gist of APO is that it makes your slow, dynamically rendered WordPress site super fast by caching everything on Cloudflare’s edge. It does this via a WordPress plugin that magically monitors the site and coordinates with Cloudflare to rapidly purge and re-cache whenever there are changes. Yevgen and Sven’s blog post goes into great detail on the matter: https://blog.cloudflare.com/building-automatic-platform-optimization-for-wordpress-using-cloudflare-workers/.

My speciality is the network protocols side of things, so whenever I’m investigating a website I’ll fire up the Dev Tools Network panel and start looking at what requests are happening, where they are going, how they are performing, etc. For deep inspection I’ll also head over to WebPageTest, but for quick tests local dev tools normally give some good indications. The network trace for my site with the Twitter embed code was shocking. I wasn’t sure at first where it came from and had to hunt through the blog posts to find it. The embed is a minor decoration and the blog would not have lost anything if I had simply removed it. However, I’m happy with the solution I came to, especially because it integrates with my existing workflow. I now benefit from loading more content via APO and other speed features on Cloudflare’s edge.

Dicey Dungeons is a delightful duende

Dicey Dungeons is a roguelike where you roll dice in dungeons. The premise is a game show but it reminds me more of The Running Man than Wheel of Fortune.

Anthropomorphic die uses non-anthropomorphic die to attack charming Honey Monster in a space suit

This game has simple mechanics on the surface but steadily grows in complexity and difficulty; each successful run is met with a failed spin of the not-Wheel-of-Fortune wheel and the unlock of a new character or challenge. Each of these tweaks the core mechanics in a way that makes the next run unique and interesting.

Dicey Dungeons was originally a 7DRL game jam entry. I will admit that although I am a fan of Terry Cavanagh’s other games, I was not initially enamoured with that version.

Terry Cavanagh’s animation of the 7DRL entry shared on itch.io – https://itch.io/jam/7drl-challenge-2018/rate/234586

However, the released version of the game is just pure joy. The design is delectable – the characters are charming and the sound design is spot-on. The music track reminds me of 90s game shows such as Catchphrase, Strike it Lucky and the Krypton Factor. Cutscenes pepper the action just enough without getting in the way, the dialog is sharp and the wit is cutting.

Stereohead is the best character

Recommended? Yes

Investment required? Easy to learn, hard to master

Tips? Overconfidence has consequences. The Kraken is an arse.

2020 gaming in 200 words a week

This blog has been a little quiet in 2019. After a day’s work it can be hard to think of something worth blogging that isn’t tied to my working day, and without inspiration it becomes difficult to muster up the energy to write anything.

So I’ve decided to create a framework for blogging: a single post, once a week, that reviews a different game in a maximum of 200 words.

To kick things off I’ll start with Crusader Kings II.

Poor Princess Sofie, the end of her life was crap

Crusader Kings II is a grand strategy game where you rule over things. This could be a small county, a country, a kingdom or a continent-spanning empire.

Your character in this game is your dynasty. When your current avatar dies, control (and your lands) passes to the heir. This generally upsets everyone. It becomes important to have an heir that rubs along well with others, so much of the game is spent on planning, politicking, plotting and promiscuity in the pursuit of perfection.

I’ve only played this as a family building the Kingdom of Wales, which grew to Ireland and Brittany but all ended in tears. The start of the downfall? The successful assassination of my wife via manure explosion. 

This game has such great depth and I sadly only ever scratched the surface: from scouting the planet to find the perfect council members and betrothed for my children, to putting up with domestic disputes.

My overarching impression was that Crusader Kings II is a mix between searching through electronics mail-order catalogues trying to piece together some weird board, and being in charge of the stories in EastEnders.

Recommended? Yes

Investment required? Lots

Tips? Don’t imprison your heirs

Web protocols ate my hosting

New Year, same old blog. A new style, some broken links and some fixed-broken links. Here’s to yet another WordPress-based blog! Now for some background.

An actual front page of a newspaper.

This website has been through a small number of hosts. It started off being hosted on a wordpress.com subdomain. I then migrated it over to a cheapo shared hosting solution, which worked pretty well for the low volume of traffic that it served. As we enter 2019 I am pleased to present the blog from a cheapo VPS, which is fronted by a free CDN.

Why change?

Coming into 2018 there were two aspects that prompted a reconsideration of my hosting:

  1. I started to take a deeper interest in running my own services over the web protocols I was spending so much time working on.
  2. The march of browsers towards treating http:// URLs as insecure, and in turn restricting powerful features (Web Platform APIs like Service Worker).

My old shared hosting was starting to feel restrictive in the face of these changes, but I was in no rush to change things. Domains are cheap as chips, so I switched focus to a different project: https://quic.stream.

quic.stream

The quic.stream domain was a fun name that was going cheap. For those not familiar with the QUIC protocol, it is a UDP-based, always-secure, multiplexed transport protocol undergoing standardisation in the IETF. It achieves multiplexing within a single QUIC connection by the use of logical streams.

Around the time I purchased the domain, the options for running a QUIC server that could speak to web browsers were pretty limited. The most straightforward way to get things working was to use the Caddy server, a Go-based web server that made use of the excellent quic-go library. If you’re interested in trying it out, the Wiki has some instructions that may, or may not, work for you.

Word of warning: at the time, quic-go was based on Google’s earlier QUIC specification. Google Chrome was the only browser that seemed to interop properly. Google continue to experiment and the interop gets broken pretty often. If you visit https://quic.stream and try the connectivity test you are likely to find that QUIC fails, and there is no way for me to detect in JavaScript that this is because the browser outright blocked it.

Simple-stupid security

Anyway, none of that matters much. What is of more interest is that QUIC’s “always secure” principle matches Caddy’s “Automatic HTTPS on by default” design philosophy. Caddy achieves this by means of Let’s Encrypt, and it does it all behind the scenes without making you waste your time on figuring anything out.

As a technical user I’ve gotten my head around the acronym soup of PKI, CSR, PEM, CRT. However, it all just becomes a PITA for something that is just supposed to be minimal effort or fun on the weekends. In contrast, Caddy and Let’s Encrypt made things so simple-stupid that I was able to do a live demo during a lecture to students at Lancaster University. This took the form of provisioning a new VPS instance (with Caddy pre-installed), creating a new DNS subdomain, and rolling a Hello World config. It took less than 10 minutes.
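The whole demo boiled down to something like the following sketch (written from memory of the Caddy v1 flow; demo.example.com stands in for the real subdomain I created):

$ mkdir -p /var/www/demo
$ echo '<h1>Hello World</h1>' > /var/www/demo/index.html
$ cat > Caddyfile <<'EOF'
demo.example.com
root /var/www/demo
EOF
$ sudo caddy   # binds ports 80/443; the Let's Encrypt certificate is fetched automatically

That really is the whole config: the site label tells Caddy which certificate to provision, and everything else just defaults to HTTPS.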

Tuning web security to the N-th degree

After creating a rough and ready site with Caddy that had great transport security, my attention turned to web security: CSP, CORS, HSTS, SRI etc. I’d never really looked at this before and found it pretty tedious to get right. I appreciate the difficulties in securing the Web Platform in complex User Agents but it sucks to have to rewrite simple button element script calls because of possible injections.

After much effort, quic.stream scores an A+ on Mozilla’s HTTP Observatory tests. You can view the results at: https://observatory.mozilla.org/analyze/quic.stream

What this exercise taught me is that I benefited from having fine-grained control of the server behaviour, for example, explicitly controlling HTTP headers and using scripts on the server to generate SRI hash values.
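To give a flavour of the SRI part: generating the integrity hash on the server amounts to a one-liner. A minimal sketch (script.js is a stand-in filename; sha384 is the commonly recommended digest):

$ openssl dgst -sha384 -binary script.js | openssl base64 -A
# paste the output into the integrity attribute, e.g.
# <script src="script.js" integrity="sha384-<hash>" crossorigin="anonymous"></script>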

Shared hosting rubber gloves

After the success with one site, I took another look at this one. I wanted to secure it using free certificates from Let’s Encrypt, and I wanted to have more control over some of the lower level stuff.

The admin panel of the shared hosting felt like trying to scratch an itch with rubber gloves on. Worse still, they wanted to nickel-and-dime me for certificates.

Migrating the whole site to something else would require effort and time I didn’t have over the summer. So I took a different kind of quick, simple-stupid measure: I signed up for Cloudflare’s free CDN service.

This worked really easily and took about 10 minutes. I signed up, enabled 2FA, and followed the instructions to change my nameservers to Cloudflare’s. I got TLS 1.3 termination immediately, which was cool!

However, since my old shared hosting was insecure, I needed to enable Cloudflare’s Flexible SSL mode (see this explanation). In essence, although the connection between the User Agent and Cloudflare’s edge was secure, the connection to my shared hosting origin was insecure. There was no complete end-to-end security.

Now you might say that the site doesn’t handle much of importance but that doesn’t matter. For a long list of reasons why security is important regardless of content, check out Troy Hunt’s blog post Here’s Why Your Static Website Needs HTTPS.

Although the migration was smooth, I found some issues with mixed-content warnings while using Flexible SSL. In my haste to fix this I got into some weird URL issue that ultimately meant Cloudflare couldn’t load any images from my origin. Rather than waste time pursuing this, I decided the long-term solution would be to migrate.

Rolling my own

So I finally found the time to take a look at rolling my own WordPress hosting. On first inspection, running a LAMP-like stack using Caddy seemed a bit daunting. And since I wanted PHP for some other future project, the thought of changing blogging software was out of the picture.

I decided to go with a vanilla LAMP stack. And I was excited to use Apache HTTP Server because I’d heard a few things about the newish mod_md module. In this case, md means Managed Domains: the module provides the means for automatic Let’s Encrypt certificate management. The other bonus was that I’d shared a glass of wine with the author, Stefan Eissing, in the past. (Stefan also developed the mod_h2 module, which provides HTTP/2 support in Apache.)

Now, unfortunately I somehow got completely sidetracked during the setup phase of all of this. Rather than using mod_md, I ended up using Certbot.

The reason is that I was very excited about Let’s Encrypt wildcard certificates when they were announced in March 2018. The benefit of certificates with wildcarded subdomains is that I can reuse the same one across the various experiments I have in mind this year, without having to go through a Let’s Encrypt dance each time. This is especially helpful for other pieces of software that have no automatic capability built in.

Flicking through the documentation, I found that it states:

If you want to obtain a wildcard certificate using Let’s Encrypt’s new ACMEv2 server, you’ll also need to use one of Certbot’s DNS plugins.

Since I was a Cloudflare customer on this domain, I could use a Certbot plugin – certbot-dns-cloudflare.
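For the curious, the invocation ends up looking something like this (a sketch based on the Certbot docs; the credentials file holds my Cloudflare API details and example.com stands in for the real domain):

$ certbot certonly \
    --dns-cloudflare \
    --dns-cloudflare-credentials ~/.secrets/certbot/cloudflare.ini \
    -d example.com \
    -d '*.example.com'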

All in all, the Certbot process wasn’t too bad. I have a certificate and private key usable in a few contexts, and Certbot is responsible for renewing it before the 90-day expiry.
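Renewal itself is hands-off, since the Certbot packaging typically installs a cron job or systemd timer that runs certbot renew. A dry run is a handy way to check that the DNS plugin credentials still work:

$ certbot renew --dry-run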

I would have liked to get mod_md working. However, I was also a bit lazy and relied on my distro’s packaged Apache and my familiarity with the more conventional Apache config directives. It would be good to find out if the module supports what I ended up doing.

Was it worth it?

At the most superficial level, probably not. I was able to fix broken images but they are pretty pointless anyway.

At a more fundamental level, I’d say the migration was worth it. I have the experimental platform and fine-grained control that I wanted, while at the same time providing more robust end-to-end security. Furthermore, the VPS is more performant and has more flexibility to manage scaling. When combined with the capabilities of a CDN, I think this blog has the potential to be a lot more web performance happy. However, WebPageTest marks me down in a few areas; there is still work to do…

SpotifyStatusApplet v1.3 Beta 1

FYI: SpotifyStatusApplet was broken by a Spotify API change made in Q3 2018. This page is provided for archive purposes and the download has been removed. More information is available on the Project Page.

A small update to SpotifyStatusApplet has been released as a beta.

This version fixes a critical issue with newer versions of Spotify that prevent the applet from working. Thanks JeffreyO.

It also adds the (much requested) feature of playback control using the remaining soft keys: 2 – previous, 3 – play/pause toggle, 4 – next. Thanks JeffreyO.

More information is available on the Project Page.

SpotifyStatusApplet v1.2 Beta

FYI: SpotifyStatusApplet was broken by a Spotify API change made in Q3 2018. This page is provided for archive purposes and the download has been removed. More information is available on the Project Page.

A small update to SpotifyStatusApplet has been released as a beta. This version adds the ability to toggle on/off the Field titles (Track, Album and Artist) by pressing “soft key 1”, the first key underneath the LCD on most models.

More information is available on the Project Page.

Ode to IBM Rational Rhapsody

Ode to Rhapsody

to the tune of “Comme d’habitude” / “My Way”

Source: Wikipedia

And now,  the end is here
And so I face, the final codegen
My friend, I’ll say it clear
I state my Use Case, I’ll draw a Lifeline
I’ve declared a class that’s pure
I created each and every dependency
And more, much more than this, I did it in Rhapsody


Branches, I’ve merged a few
But then again, too few to mention
I did what I had to do and saw it through without testing
I planned each charted course, each careful step along the activity
And more, much more than this, I did it in Rhapsody

Yes there were times, I’m sure you’ll mention
When I lost my rag, adding functions
But through it all, there is still doubt
Should it be In, InOut or Out
I chose them all, I felt small, I did it in Rhapsody

I’ve waited, I’ve waited and cried
I’ve had my fill, my share of generating
And now, as tears subside, I find it all so amusing
To think, I did all that
And may I say, not in an efficient way
Oh no, oh no not me, I did it in Rhapsody

For what is a shared pointer, what has it got
If not newed, then it should be nought
To be the thing it truly feels, must be cast dynamically
The model grows, the model slows, I did it in Rhapsody

Yes, it was Rhapsody

P.S. The etymology of Rhapsody leads us back to Greece; rhaptein ‘to stitch’ + ōidē ‘song, ode’.

Antimatter Poster

During university I produced a poster on the topic of Antimatter as part of a Communicating Science module.

I put this up on the web in 2007 and, years on, I am seeing traffic to the site driven by a link contained in a Rutgers assignment concerning Information Design. Unfortunately, across the years the poster image had become unavailable… until now. The poster was still discoverable via a Google search (finding it hosted without permission on other sites is another story), so in order to help out those eager students I will host it here permanently.

Software Metrics and Craftsmanship

Software Development Metrics are a notoriously difficult area: there is a strong allure to quantitative measurements that can be condensed into a top-level dashboard showing progress, plan alignment, effort, costs and so on. However, the unfortunate reality is that the collection and analysis of Software Development Metrics is complicated by a number of factors, such as complexity, variance, meaning and gaming of the system. Often it is a difficult case of understanding whether we are more interested in the performance of the project (Health) or the performance of the developer (Efficacy).

The effectiveness of the metrics themselves will most certainly affect their usefulness and the strength of decisions based on their analysis, but it is the potential for gaming of the system that I find most worrying. In the simplest sense, I mean the human optimization of the reward/penalty system for which the metric provides an input. This, coupled with Measurement Inversion, begets an environment that can become detrimental, with the focus on low-value metrics and developers preoccupied with maximizing their score rather than with the Health of the project.

I feel some of the difficulty with Software Development Metrics comes from the nature of the development activity. Despite attempts at documentation, formalization and rigidity (with various levels of success), much of the development activity can still be quite organic. To be more explicit, the concept of Software Craftsmanship is one that strikes a similar chord with my experiences. This goes as far as an online manifesto, which has prompted rebuttals of the craftsmanship concept. This seems to be quite a conflicted topic but, the way I read things, it is often a matter of perspective, with the contributors sometimes advocating the same underlying principles. Perspective and trend also come into the job titles; what are the differences between Software Engineer, Software Developer and Software Programmer?

Following the line of argument: although we could say that developers are not factory workers and that there may be some level of craft to their work, does that rule out metrics? I wouldn’t say so completely, but I think the focus should be more towards the Project Health side of things. A craftsman may prefer to be measured by the quality of the product rather than the volume – and if human nature tends toward gaming of the system, the result will ideally be to improve the quality. Now we just need to agree on Software Quality metrics…

Image courtesy of FreeFoto