My adventures with XMPP

Gemini version of this post

This week, I decided to self-host an XMPP instance on Ávalos' Indie Server 2 (my RPi 4). Why not Matrix? Because, to be honest, I don’t like it: the protocol is huge, complex and inefficient. XMPP is small, simple, lightweight and well-established. I neither need nor want the thousand features Matrix provides.

My inspiration to try self-hosting XMPP came in part from this awesome post:

XMPP + the essential XEPs have a very small surface compared to the huge and RESTful mess Matrix is; but there’s a serious lack of funding and interest in XMPP, because it’s not hype. Most Big Tech companies have adopted XMPP at some point, built their own proprietary extensions on top of it, disabled federation and finally got rid of XMPP, rarely contributing back to the community.

I recommend giving a read to these interesting related essays from Seirdy, including case studies of XMPP and Matrix:

Before deciding to host my own server, I used the free XMPP account that Disroot provided me. I didn’t use it that much, but I played around with it for a while. I still have that account (avalos@disroot.org), but in case you want to reach out to me on XMPP for anything, contact me using my self-hosted ID: ivan@avalos.me.

I created accounts for my family, so we can talk using our own hardware, free from surveillance capitalism and garbage. I believe one of the most important features of a modern IM client is push notifications. If there is one main reason why my family has been reluctant to try new chat platforms, it is precisely the lack of fast and reliable push notifications.

Most mainstream chat platforms solve this problem easily by using cloud services such as Firebase Cloud Messaging (FCM, formerly Google Cloud Messaging or GCM) and the Apple Push Notification service (APNs), which are integrated into the operating system; but then there is the problem of privacy and metadata. Privacy-respecting apps will usually avoid those cloud services in favor of custom background processes, but these are never as reliable, because most mobile OSes kill background processes in order to improve battery life.

I made this cute graphic yesterday, btw, using Inkscape.

Graphic (XMPP + OMEMO)

Server

The two main server implementations for XMPP are ejabberd and Prosody. I chose Prosody because two of its goals are to be “easy to setup and configure” and “efficient with system resources”, and both are very important to me. It has excellent support for XEPs (extensions), it is actively maintained and has excellent documentation.

Setup was relatively easy. The only problem I encountered was that most instances reject self-signed TLS certificates, so federation didn’t work. I had to use certbot to get valid certificates issued by Let’s Encrypt. After the initial setup, the rest was basically configuring XEPs, some of which require extra virtual hosts (and therefore extra TLS certs) for different things (MUC, HTTP file upload, proxy65, etc.). I got 76% compliance on the Conversations.im compliance test.

Screenshot of XMPP Compliance Tester results
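
To make the "extra virtual hosts, extra TLS certs" point concrete, here is a minimal sketch in Python that lists the hostnames a setup like this might need and builds a single certbot invocation covering all of them. The domain example.org and the subdomain prefixes (conference, upload, proxy) are placeholders I made up for illustration, not my actual configuration; only `certonly`, `--standalone` and `-d` are real certbot options.

```python
from typing import Dict, List


def component_hosts(domain: str, components: Dict[str, str]) -> List[str]:
    """Return the main domain followed by one hostname per component.

    `components` maps a human-readable feature name to a subdomain prefix.
    The prefixes are hypothetical; use whatever your server config declares.
    """
    hosts = [domain]
    for prefix in components.values():
        hosts.append(f"{prefix}.{domain}")
    return hosts


def certbot_command(hosts: List[str]) -> List[str]:
    """Build a certbot command requesting one certificate for all hosts."""
    command = ["certbot", "certonly", "--standalone"]
    for host in hosts:
        command += ["-d", host]
    return command


if __name__ == "__main__":
    # Placeholder names: replace with your own domain and components.
    components = {
        "MUC": "conference",
        "HTTP file upload": "upload",
        "proxy65": "proxy",
    }
    hosts = component_hosts("example.org", components)
    # Only prints the command; run it yourself once the names look right.
    print(" ".join(certbot_command(hosts)))
```

The script only prints the command instead of running it, so you can check the hostnames first. Every one of them needs a DNS record pointing at the server before Let’s Encrypt can validate it.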

So far, performance has been excellent, federation has been fast, and everything has been smooth in general, with no problems on the server side.

Clients

This is where the lack of hype becomes more noticeable. Once upon a time, XMPP was hype and a lot of awesome clients were available. You could connect to Google Talk and Facebook Messenger using your favourite XMPP client, because Google Talk was built on XMPP and Facebook’s chat exposed an XMPP interface. But once they closed federation and ditched XMPP, the protocol lost its hype and its popularity declined significantly.

Many of the original clients are still maintained, but most of them lack modern features such as OMEMO encryption (double-ratchet-based end-to-end encryption) and mobile support. They also look terrible and don’t meet the UI/UX expectations of the younger generations, introduced by Big Tech.
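
For the curious, here is a rough idea of what "ratchet" means. This is a minimal sketch of only the symmetric half of a double ratchet (a KDF chain), not OMEMO itself and not how any client actually implements it. It assumes HMAC-SHA256 with the constants 0x01 and 0x02, which is the construction the Double Ratchet specification suggests; the function names are my own.

```python
import hashlib
import hmac
from typing import List, Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Advance the chain once.

    Returns (next_chain_key, message_key). Both are derived from the
    current chain key with HMAC-SHA256 but with different constants, so
    knowing a message key does not reveal the chain key it came from.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(chain_key: bytes, count: int) -> List[bytes]:
    """Derive `count` successive message keys, one per message."""
    keys = []
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys


if __name__ == "__main__":
    # Illustrative starting key only; a real one comes from a key agreement.
    initial_chain_key = bytes(32)
    for index, key in enumerate(derive_message_keys(initial_chain_key, 3)):
        print(index, key.hex())
```

Every message gets its own key, and old chain keys are thrown away after each step, so compromising the current state doesn’t expose earlier messages. The real protocol adds a second, Diffie-Hellman ratchet on top, which this sketch leaves out.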

I tried a lot of different clients; the best of them only support GNU/Linux and Android. iOS and macOS clients are all extremely buggy and far from usable. After nearly a week of searching and experimenting, I picked my favourites: Gajim (GNU/Linux and macOS), Dino (GNU/Linux), Conversations (Android), ChatSecure (iOS) and Siskin IM (iOS).

On Android, the obvious choice is Conversations, whose developers are also the ones behind OMEMO. Conversations supports most of the essential XEPs, and has a simple, modern and clean UI/UX. I’ve heard a lot of success stories of people recommending it to their grandparents. It is libre software! It costs 3.49 USD on the Google Play Store, but you can get it for free on F-Droid.

On iOS, there are no good clients. The best ones I tried were ChatSecure and Siskin IM, but ChatSecure crashes a lot, has trouble detecting OMEMO fingerprints, doesn’t support Jingle (VoIP) and has limited features. Siskin IM has none of those problems, but it’s uglier and has poor support for MUCs (multi-user chats), let alone OMEMO in MUCs. I ended up sticking with ChatSecure because of MUCs, but I may end up using both. Both are libre software!

On GNU/Linux, the best ones are probably Gajim and Dino. Gajim has a lot of history, while Dino is relatively new. Both have excellent support for OMEMO. Gajim has a more traditional UI/UX based on GTK+, whereas Dino complies with modern GNOME design guidelines. Gajim is written in Python. Dino is written in Vala. Both are libre software! I wouldn’t ever recommend non-libre software.

macOS is a similar story to iOS: no good clients. Beagle IM was the best I could find, but it lacks some features and crashes. It looks similar to Siskin IM, and shares a lot of code (and problems) with it, because both are developed by Tigase, Inc. Luckily, I managed to get Gajim running on macOS following the official instructions; currently, the only way is to build it from source. As it is written in Python, you only have to install some libraries and Python modules.

Screenshot of Gajim on macOS