Sang's Blog

Matrix and XMPP, messaging you actually own

Every message you send on WhatsApp, Telegram or Discord is routed via someone else’s server. You do not own the conversation. You cannot control the archive. You cannot transfer your chat history to another platform without losing it all. The platform can change its privacy policy, display adverts, sell your data or even shut down entirely, and there is nothing you can do about it. The average person sends thousands of messages a month. Over a lifetime, this amounts to a detailed record of personal relationships, work discussions, ideas and decisions. All of this information belongs to corporations whose business model depends on keeping you locked in.

Two open protocols offer an alternative. XMPP, born in 1999 as Jabber, is the veteran. Matrix, launched in 2014, is the newer contender. Both are federated, meaning that, like email, anyone can run a server and communicate with anyone else on any other server. Both have open specifications and multiple independent implementations. Both let you own your messages, because the data can live on a machine you control. However, they take fundamentally different approaches to the same problem.

XMPP - the XML veteran

XMPP stands for Extensible Messaging and Presence Protocol. Originally an open-source instant messaging project called Jabber, it was first standardised by the IETF in 2004 as RFC 3920 and RFC 3921, which were revised in 2011 as RFC 6120 and RFC 6121. The core concept is straightforward: messages are XML stanzas that are routed between servers via persistent TCP connections. A stanza is a small, self-contained XML fragment: a message, a presence update or an IQ (info/query) request. The core protocol handles three things: messaging, presence (online/away/busy) and contact list management. Additional specifications built on top of the core, known as XMPP Extension Protocols (XEPs), define everything else, including file transfer, voice calls, group chats and end-to-end encryption.
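To make the idea of a stanza concrete, here is a minimal sketch in Python that builds a chat message stanza with the standard library. The function name and the addresses are illustrative assumptions, and a real client would send this inside an established XML stream rather than print it.

```python
import xml.etree.ElementTree as ET


def build_message_stanza(sender: str, recipient: str, body: str) -> str:
    """Return a minimal <message/> stanza as a string (illustrative helper)."""
    message = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    # The human-readable text travels in a <body/> child element.
    ET.SubElement(message, "body").text = body
    return ET.tostring(message, encoding="unicode")


stanza = build_message_stanza("user@example.com", "friend@otherserver.org", "Hello")
print(stanza)
```

Presence and IQ stanzas follow the same pattern with a different root element, which is why the core protocol can stay so small.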

An XMPP address resembles an email address: user@example.com. This is called a JID (Jabber ID). When you send a message to friend@otherserver.org, your server establishes a connection with otherserver.org and delivers the stanza. Federation has been built into the protocol since the beginning. There is no central server, no single point of control and no company that can turn off the network.
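A JID can also carry an optional resource after a slash, which identifies one particular device or session. The sketch below splits an address into its parts. It is a simplification: the helper name is made up for this post, and it skips the Unicode normalisation and validation rules that a real implementation has to apply.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    local: Optional[str]     # the part before "@", absent for server addresses
    domain: str              # the server that routes for this address
    resource: Optional[str]  # the part after "/", one device or session


def parse_jid(text: str) -> JID:
    """Split a JID into local, domain and resource parts (no normalisation)."""
    # The resource starts at the first "/", so split it off before looking for "@".
    bare, slash, resource = text.partition("/")
    local, at, domain = bare.partition("@")
    if not at:
        # No "@" means the whole bare part is a domain.
        local, domain = None, bare
    if not domain or local == "" or (slash and not resource):
        raise ValueError(f"not a valid JID: {text!r}")
    return JID(local, domain, resource if slash else None)


print(parse_jid("friend@otherserver.org/phone"))
print(parse_jid("otherserver.org"))
```

The domain part is what your server uses to decide which remote server to connect to when it delivers a stanza.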

XMPP’s architecture is connection-oriented. Clients maintain a persistent TCP socket to their server and servers maintain persistent connections to each other. Messages flow through XML streams — a stream begins with an opening stream tag and continues indefinitely, carrying stanzas in both directions until a closing stream tag ends it. This design is efficient for real-time communication because the connection stays open, allowing messages to be delivered instantly without polling.
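Because a stream is one long XML document that never quite ends, a receiver has to parse it incrementally and hand over each stanza as soon as its closing tag arrives. The following sketch shows the idea with Python's pull parser. It assumes a simplified stream root called `stream` with no namespaces, which a real XMPP stream would have, and the class name is invented for this example.

```python
import xml.etree.ElementTree as ET


class StreamReader:
    """Collect complete stanzas from an XML stream that arrives in chunks."""

    def __init__(self) -> None:
        self._parser = ET.XMLPullParser(events=("start", "end"))
        self._depth = 0      # 0 = outside the stream, 1 = inside the root
        self.closed = False  # set once the closing stream tag is seen

    def feed(self, chunk: str) -> list:
        self._parser.feed(chunk)
        stanzas = []
        for event, element in self._parser.read_events():
            if event == "start":
                self._depth += 1
            else:
                self._depth -= 1
                if self._depth == 1:
                    # A direct child of the stream root just closed: a stanza.
                    stanzas.append(element)
                elif self._depth == 0:
                    self.closed = True
        return stanzas


reader = StreamReader()
chunks = [
    "<stream><message to='friend@otherserver.org'><bo",  # cut mid-tag on purpose
    "dy>Hello</body></message><presence/>",
    "</stream>",
]
received = []
for chunk in chunks:
    for stanza in reader.feed(chunk):
        received.append(stanza.tag)
        print("stanza:", stanza.tag)
```

The first chunk yields nothing because the message is not yet complete, which mirrors how data can arrive in arbitrary pieces over a TCP socket.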

The protocol has been deployed on a large scale. Google Talk used XMPP from 2005 to 2013. Facebook Chat offered an XMPP interface for several years from 2010. WhatsApp’s backend was originally built on a modified version of ejabberd, an XMPP server. While these platforms eventually abandoned federation and moved to proprietary protocols, the fact that they started with XMPP speaks volumes about its maturity and capability. Today, XMPP clients include Conversations (Android), Siskin (iOS) and Gajim (desktop). Lightweight and well-documented servers like Prosody and ejabberd can handle thousands of concurrent users on modest hardware.

The main criticism of XMPP is its reliance on XML. XML stanzas are more verbose than JSON or binary protocols. The namespace system is also complex. Some XEPs overlap or conflict. Historically, mobile battery life was a problem because maintaining a persistent TCP connection drained power; however, modern extensions have mostly solved this issue with push notifications and optimised keep-alive strategies. Despite these issues, XMPP remains a robust, well-established protocol with a rich ecosystem of extensions covering almost every conceivable messaging feature.

Matrix - the HTTP-native newcomer

Matrix is a new protocol designed from scratch for decentralised, real-time communication. Created by the team at Element (formerly New Vector), it is now governed by the Matrix.org Foundation. The protocol specification is open source, and there are multiple server and client implementations.

While XMPP uses XML streams over persistent TCP, Matrix uses HTTP and JSON. Every interaction — sending a message, joining a room or uploading a file — is an HTTP request. This makes Matrix easier to integrate with web applications and simpler to implement, as any HTTP stack can handle it. However, HTTP is a request-response protocol, not a push protocol. Matrix solves this issue using long polling: clients send a GET request to /sync and the server keeps the connection open until new data is available. At this point, the server responds with a batch of events. The client then immediately makes another /sync request. This creates an effective real-time stream over plain HTTP.
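The long-polling loop described above can be sketched in a few lines of Python. The `/_matrix/client/v3/sync` path, the `since` and `timeout` parameters and the `next_batch` field come from the Matrix client-server API; the function names, the example homeserver and the fake server used for the demonstration are assumptions made for this post.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Optional


def build_sync_url(homeserver: str, since: Optional[str], timeout_ms: int = 30000) -> str:
    """Build the /sync URL; `since` is omitted on the very first request."""
    params = {"timeout": str(timeout_ms)}
    if since is not None:
        params["since"] = since
    return f"{homeserver}/_matrix/client/v3/sync?{urllib.parse.urlencode(params)}"


def http_fetch(url: str, access_token: str, timeout_ms: int = 30000) -> dict:
    """Perform one real long-poll request (not exercised in the demo below)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    # Allow a little more time than the server-side timeout before giving up.
    with urllib.request.urlopen(request, timeout=timeout_ms / 1000 + 10) as response:
        return json.load(response)


def sync_loop(fetch: Callable[[str], dict], homeserver: str,
              on_batch: Callable[[dict], None], rounds: int) -> Optional[str]:
    """Poll repeatedly, feeding each response's next_batch into the next request."""
    since = None
    for _ in range(rounds):
        batch = fetch(build_sync_url(homeserver, since))
        on_batch(batch)
        since = batch["next_batch"]
    return since


def fake_fetch(url: str) -> dict:
    """Stand-in for a homeserver so the loop runs without a network."""
    query = urllib.parse.parse_qs(urllib.parse.urlsplit(url).query)
    token = query.get("since", ["s0"])[0]
    return {"next_batch": f"s{int(token[1:]) + 1}", "rooms": {}}


seen = []
last = sync_loop(fake_fetch, "https://matrix.example.org", seen.append, rounds=3)
print(last, len(seen))
```

Against a real homeserver you would pass something like `functools.partial(http_fetch, access_token=...)` as the `fetch` argument and loop indefinitely instead of for a fixed number of rounds.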

The core concept in Matrix is the room. A room is a shared conversation space, identified by a unique ID such as ‘!abc123:example.com’. Every message, file, reaction and state change in a room constitutes an event. These events are stored in a directed acyclic graph (DAG), which makes the room’s history a cryptographically verifiable chain. Each event references those that came before it, enabling the full state of a room to be reconstructed from its event graph. This differs fundamentally from XMPP, where group chat (MUC) is implemented as a service on a specific server, with messages routed through that server. In Matrix, a room is a shared data structure that is replicated across all participating servers.
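A toy version of the event graph helps to show why a room can be rebuilt from its events alone. In the sketch below each event lists the events it was sent after, and a topological sort produces an order in which every event appears after everything it references. The event IDs and bodies are invented, and real Matrix events carry hashes, signatures and state that this example leaves out.

```python
from graphlib import TopologicalSorter

# Two servers replied to the creation event at the same time, so the graph
# forks at "$a" and "$b" and is joined again by "$merge".
events = {
    "$create": {"prev_events": [], "body": "room created"},
    "$a": {"prev_events": ["$create"], "body": "hello from server A"},
    "$b": {"prev_events": ["$create"], "body": "hello from server B"},
    "$merge": {"prev_events": ["$a", "$b"], "body": "reply that saw both"},
}


def replay_order(room: dict) -> list:
    """Return event IDs so that each event comes after all of its prev_events."""
    graph = {event_id: event["prev_events"] for event_id, event in room.items()}
    return list(TopologicalSorter(graph).static_order())


def forward_extremities(room: dict) -> list:
    """Return the events nothing else references yet: the current 'tips'."""
    referenced = {prev for event in room.values() for prev in event["prev_events"]}
    return sorted(set(room) - referenced)


order = replay_order(events)
print(order)
print(forward_extremities(events))
```

A new event would name the current tips as its `prev_events`, which is how concurrent branches from different servers get stitched back together.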

In Matrix, federation works by having each homeserver replicate room state. When a user on server-a.com sends a message to a room, their homeserver signs the event and forwards it to every other homeserver participating in the room. Each server then validates the event and verifies the signature before appending it to its local copy of the room DAG. This means that every server has a complete record of the room’s history and can make this available to its local users independently of the origin server. It also means that a room survives as long as any one server still has a copy, providing strong resilience.
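The replication step can be illustrated with a small in-memory model. Real homeservers sign events with Ed25519 keys and fetch any history they are missing; this sketch replaces the signature with a plain content hash and simply rejects events whose predecessors are unknown, so treat it as an illustration of the fan-out and validate pattern rather than of the actual federation API. All class and function names are hypothetical.

```python
import hashlib
import json


def event_id_for(event: dict) -> str:
    """Derive an ID from the event content (a stand-in for real signing)."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return "$" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


class ToyHomeserver:
    def __init__(self, name: str) -> None:
        self.name = name
        self.room = {}   # local copy of the room: event_id -> event
        self.peers = []  # other servers participating in the room

    def _tips(self) -> list:
        referenced = {p for e in self.room.values() for p in e["prev_events"]}
        return sorted(set(self.room) - referenced)

    def send(self, sender: str, body: str) -> str:
        """Create an event locally, then forward it to every peer."""
        event = {"sender": sender, "body": body, "prev_events": self._tips()}
        event_id = event_id_for(event)
        self.room[event_id] = event
        for peer in self.peers:
            peer.receive(event_id, event)
        return event_id

    def receive(self, event_id: str, event: dict) -> bool:
        """Validate an incoming event before appending it to the local copy."""
        if event_id_for(event) != event_id:
            return False  # content does not match its ID: tampered
        if any(prev not in self.room for prev in event["prev_events"]):
            return False  # references history this server has never seen
        self.room[event_id] = event
        return True


a = ToyHomeserver("server-a.com")
b = ToyHomeserver("server-b.org")
a.peers, b.peers = [b], [a]
first = a.send("@alice:server-a.com", "hi")
second = b.send("@bob:server-b.org", "hello")
print(a.room == b.room)
```

After both sends, each server holds the same two events, so either one could keep serving the room if the other disappeared.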

Matrix is capable of handling more than just text chat. The protocol is designed as a generic event synchronisation layer. While messaging is the primary use case, Matrix can also carry VoIP signalling (via WebRTC), Internet of Things (IoT) sensor data, and arbitrary real-time events. The specification deliberately keeps the core protocol simple, pushing features such as end-to-end encryption (Olm/Megolm), reactions, threads and spaces into separate modules.

The philosophical difference

Both XMPP and Matrix solve the same problem of federated, open messaging, but they embody different philosophies. XMPP stems from an era of layered protocols. The core is minimal: XML streams, stanzas and routing. Everything else is an extension. You can deploy only the XEPs you need and ignore the rest. This means that an XMPP server can be extremely lightweight — for example, Prosody can run on a Raspberry Pi with just tens of megabytes of RAM. However, this also means that client compatibility can be patchy, as not every client implements every XEP that you want to use.

Matrix comes from the era of monolithic platforms. Its specification is more extensive because it attempts to define a comprehensive set of features out of the box. Rooms, state resolution and federation are all part of the base protocol, and modules such as end-to-end encryption are defined in the same specification rather than in a separate catalogue of optional extensions. While this makes Matrix clients more consistently capable (every Matrix client can join rooms, send messages and read history), it also makes the server heavier. Synapse, the reference Matrix homeserver, requires significantly more resources than Prosody. Although newer implementations such as Dendrite (Go) and Conduit (Rust) are closing this gap, Matrix is still more expensive to operate than XMPP at the lower end of the scale.

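There is also a difference in terms of identity. XMPP uses JIDs that resemble email addresses, such as ‘user@domain’. This mirrors the email model, in which your identity is tied to your server. Matrix uses user IDs such as @user:example.com, which are also bound to a homeserver, but the protocol pushes the server further into the background. In Matrix, your identity is more about the rooms you are in than the server you are on. Because rooms are replicated across every participating server, a new account on a different homeserver can rejoin the same rooms and find their history still there, although migrating an existing account between homeservers is not yet a standard part of the protocol. In XMPP, a group chat and its archive live on the single server that hosts it.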

Neither approach is strictly better. XMPP is better suited to minimalists who want to understand every component and run a server that does exactly what they need. Matrix, on the other hand, rewards pragmatists who want a more modern feature set, such as reactions, threads, read receipts and VoIP, without having to hunt down extensions and hope that clients support them.

Bridging the two

The two protocols are not wire-compatible, but they can be bridged. This is not just a theoretical claim: there are operational bridges that translate between Matrix rooms and XMPP chats. Understanding how these bridges work sheds light on an important aspect of protocol architecture: a protocol designed for interoperability from the outset is much easier to bridge than one that treats external connections as an afterthought.

Matrix treats bridging as a core architectural feature. The specification defines an Application Service API, which bridges use to register and control virtual users. A Matrix bridge to XMPP creates a virtual Matrix user for each XMPP contact in the room. When an XMPP user sends a message, the bridge translates the XMPP stanza into a Matrix event and posts it as the virtual user. When a Matrix user replies, the bridge translates the event back into an XMPP stanza and delivers it via the XMPP server. The bridge effectively acts as a protocol translator, impersonating users on both sides. As Matrix rooms are replicated DAGs, bridged messages become part of the room history, enjoying the same durability guarantees as native Matrix messages. The Bifrost bridge, developed by the Matrix community, implements this pattern specifically for XMPP. Similar bridges exist for IRC, Slack, Discord and Telegram.
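The translation step at the heart of such a bridge might look like the sketch below. The `m.room.message` event type and the `m.text` message type are part of the Matrix specification, but the virtual user naming scheme, the bridge domain and the function names are assumptions for illustration and are not taken from Bifrost.

```python
import xml.etree.ElementTree as ET

BRIDGE_DOMAIN = "bridge.example"  # hypothetical homeserver that hosts the bridge


def virtual_user_id(jid: str) -> str:
    """Map an XMPP JID to a made-up Matrix user ID for its virtual user."""
    bare = jid.split("/", 1)[0]  # drop the resource, one virtual user per account
    return f"@_xmpp_{bare.replace('@', '=40')}:{BRIDGE_DOMAIN}"


def stanza_to_event(stanza_xml: str) -> dict:
    """Translate an incoming XMPP message stanza into a Matrix message event."""
    stanza = ET.fromstring(stanza_xml)
    return {
        "type": "m.room.message",
        "sender": virtual_user_id(stanza.get("from")),
        "content": {"msgtype": "m.text", "body": stanza.findtext("body", default="")},
    }


def event_to_stanza(event: dict, from_jid: str, to_jid: str) -> str:
    """Translate a Matrix message event back into an XMPP message stanza."""
    message = ET.Element(
        "message", {"from": from_jid, "to": to_jid, "type": "groupchat"}
    )
    ET.SubElement(message, "body").text = event["content"]["body"]
    return ET.tostring(message, encoding="unicode")


incoming = (
    "<message from='alice@xmpp.example/phone' to='room@conference.xmpp.example'>"
    "<body>hi from XMPP</body></message>"
)
event = stanza_to_event(incoming)
print(event["sender"], event["content"]["body"])
```

Plain text survives the round trip easily; the harder cases are the features that exist on only one side of the boundary.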

XMPP uses the concept of ‘transports’ to achieve bridging, a term that dates back to the early 2000s, when Jabber servers ran gateways to MSN, AIM, Yahoo and ICQ. A transport is a component of an XMPP server that registers as a service and translates between the XMPP protocol and foreign protocols. To the XMPP client, a user on the foreign network appears as a regular contact. Messages sent to that contact are intercepted by the transport, translated, and delivered to the foreign network. This architecture is simple and proven: the same transport pattern has been used to bridge XMPP with other protocols for over twenty years. Modern transports include Biboumi for IRC and Spectrum-based gateways that reach further networks through plugins, and gateways also exist for Matrix, Telegram and Signal.

While the experience of bridging Matrix and XMPP is functional, it is not seamless. One-to-one messaging works reliably, with both parties able to see text messages from each other. Group chat is more complex, however, because Matrix rooms and XMPP multi-user chats have different membership models, permission systems and expectations regarding message history. Features such as message reactions, threaded replies and read receipts do not have straightforward equivalents on both sides of the boundary. In Matrix, a reaction is a first-class event type with a specific place in the room DAG. In XMPP, reactions are implemented through an XEP which embeds reaction metadata in a message stanza. A bridge must therefore choose one representation and convert the other — typically by either dropping reactions entirely or converting them to plain text messages such as “Alice reacted with 👍”. This is not a fault of the bridge. Rather, it reflects the fact that the two protocols were designed independently with different feature sets, and mapping between them requires trade-offs to be made.
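The plain-text fallback mentioned above is simple to sketch. A Matrix reaction is an `m.reaction` event whose `m.relates_to` content uses the `m.annotation` relation type and carries the emoji in `key`; the function name and the display name lookup are assumptions made for this example.

```python
from typing import Optional


def reaction_to_text(event: dict, display_names: dict) -> Optional[str]:
    """Turn a Matrix reaction event into a fallback sentence, or None if it is not one."""
    if event.get("type") != "m.reaction":
        return None
    relation = event.get("content", {}).get("m.relates_to", {})
    if relation.get("rel_type") != "m.annotation" or "key" not in relation:
        return None
    # Fall back to the raw user ID when no display name is known.
    name = display_names.get(event["sender"], event["sender"])
    return f"{name} reacted with {relation['key']}"


reaction = {
    "type": "m.reaction",
    "sender": "@alice:example.com",
    "content": {
        "m.relates_to": {
            "rel_type": "m.annotation",
            "event_id": "$abc123",
            "key": "👍",
        }
    },
}
fallback = reaction_to_text(reaction, {"@alice:example.com": "Alice"})
print(fallback)
```

The fallback loses the link to the message that was reacted to, which is exactly the kind of fidelity trade-off a bridge has to accept.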

Despite these limitations, the bridge is functional. If you run a Matrix server and a friend runs an XMPP server, you can communicate with each other without either of you having to switch protocols. This is what it means to take open standards seriously: it’s not about everyone using the same protocol, but about protocols exposing enough surface area for bridges to be built. Both Matrix and XMPP pass this test. They are both sufficiently flexible that a dedicated bridge can translate between them with an acceptable level of fidelity, and they are both open enough that there are multiple independent bridge implementations.

Running your own

Prosody is the recommended starting point for XMPP. It is written in Lua, the configuration process is simple, and it uses minimal memory. A basic setup incorporating TLS, user registration and federation can be achieved in under fifty lines of configuration code. Ejabberd, which is written in Erlang, is better suited to larger deployments and has a web admin interface. There are many clients: Conversations and Cheogram are available for Android, Siskin and Monal for iOS, and Gajim and Dino for desktop. There is also a web interface available called Movim. XMPP’s end-to-end encryption uses OMEMO, a modern double-ratchet protocol similar to Signal’s.

For Matrix, Synapse is the reference server. It is written in Python and is the most complete option, but it requires a PostgreSQL database and more RAM than you might expect for a messaging server. Dendrite, a Go rewrite by the Matrix.org team, is lighter and faster, but is still maturing. Conduit and its community forks, such as conduwuit, are Rust implementations that can run on very modest hardware. In terms of clients, Element is the flagship offering, with versions available for desktop, web, Android and iOS. Alternatives include FluffyChat, Nheko and SchildiChat. Matrix’s end-to-end encryption uses Olm for one-to-one sessions and Megolm for group sessions. Olm implements the Double Ratchet algorithm popularised by Signal, while Megolm is a separate ratchet designed for large groups.

If you self-host either protocol, you are responsible for your own availability and backups. However, it also means that your messages are stored on your disk, encrypted with your keys, and accessible only to the people you choose to talk to. No algorithm decides which messages you see. There is no data mining. There are no terms-of-service updates that suddenly allow AI training on your conversations. You trade convenience for sovereignty, and for those who already run their own calendar and contacts on DAV, this trade-off will be familiar.

The bigger picture

The problem of messaging has not been solved. The dominant platforms — WhatsApp, Messenger, Telegram, Discord and Slack — are all silos. They do not interoperate. They each own their users’ data and have no incentive to allow it to be shared elsewhere. This is not a technical limitation, but a business decision. Email has been federated since 1982. Anyone can run a mail server, and messages flow between them regardless of ownership. Messaging could have been the same.

The European Union’s Digital Markets Act is pushing in this direction. It requires large messaging platforms that are designated as gatekeepers to provide interoperability with smaller services that request it. Both Matrix and XMPP have been involved in these discussions as potential standard protocols for cross-platform messaging. If the requirement is enforced in practice, the open protocols could serve as the bridges.

However, the process of regulation is slow and uncertain. A more direct approach would be to simply use these protocols now. Set up a server for yourself and your family. Invite your friends to join you. If enough people host their own messaging services, the silos will lose their most valuable asset: the network effect. Each person who leaves WhatsApp for a federated alternative makes it a little less essential. Switching messengers is harder than switching calendars, because messaging is social infrastructure and you cannot switch alone, but the same logic applies. You cannot outsource your relationships to a platform and expect them to remain yours.

Matrix and XMPP are not perfect replacements for polished, billion-user platforms. However, they are real, working, federated alternatives that return control of your conversations to you. They have been around for years and will continue to exist because they are protocols, not products. No company can kill XMPP. No acquisition can shut down Matrix. The servers you run today will still work tomorrow, and the messages you send will remain yours.