Reading whatsmeow source code

March 18, 2026

i opened the whatsmeow repo because i wanted to understand WhatsApp Web from the client side.

not the marketing version of "multi-device", and not the normal wrapper-level view where you call SendMessage and pretend the rest is magic. i wanted to see the boring parts: the socket, the login flow, the weird ids, the places where the code has to know about old and new WhatsApp behavior at the same time.

whatsmeow is a good project for that because it does not hide much. it is a Go library, but if you read it in the right order, it starts looking less like a library and more like a map of the WhatsApp Web protocol.

so this post is basically my reading notes. we will go through the files that made the most things click for me: the client, the Noise socket, binary nodes, pairing, sending, receiving, app state, and the store.

[image: whatsmeow source code collage]

start with the client

the obvious entry point is client.go

Client is not a tiny wrapper around an HTTP api. it is a stateful protocol machine.

it has a websocket, a device store, event handlers, retry state, app state processors, media connection caches, group caches, device caches, message secret stores, reconnect logic, and a lot of small maps that exist because real messaging systems are messy.

the first useful signal is inside NewClient:

cli.nodeHandlers = map[string]nodeHandler{
  "message":      cli.handleEncryptedMessage,
  "appdata":      cli.handleEncryptedMessage,
  "receipt":      cli.handleReceipt,
  "call":         cli.handleCallEvent,
  "chatstate":    cli.handleChatState,
  "presence":     cli.handlePresence,
  "notification": cli.handleNotification,
  "success":      cli.handleConnectSuccess,
  "failure":      cli.handleConnectFailure,
  "stream:error": cli.handleStreamError,
  "iq":           cli.handleIQ,
  "ib":           cli.handleIB,
}

that map is basically a table of contents for WhatsApp Web

messages are not the only thing flowing through the socket. receipts, presence updates, typing state, app data, notifications, and iq queries all arrive as nodes. a WhatsApp client is less like "call an endpoint to send a message" and more like "stay connected to a stream of typed XML-ish events forever".

that changes how you think about the app immediately.
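to make the "table of contents" idea concrete, here is a minimal sketch of tag-based dispatch. this is not whatsmeow's code: the Node type, the handler signature, and the dispatcher are all made up for illustration, and the real handlers take more arguments and do far more work.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for a decoded protocol node.
type Node struct {
	Tag   string
	Attrs map[string]string
}

// handler processes one node and reports what it did.
type handler func(n Node) string

// dispatcher routes nodes by tag, in the spirit of a nodeHandlers map.
type dispatcher struct {
	handlers map[string]handler
}

func newDispatcher() *dispatcher {
	d := &dispatcher{handlers: map[string]handler{}}
	d.handlers["message"] = func(n Node) string { return "message from " + n.Attrs["from"] }
	d.handlers["receipt"] = func(n Node) string { return "receipt for " + n.Attrs["id"] }
	d.handlers["presence"] = func(n Node) string { return "presence of " + n.Attrs["from"] }
	return d
}

// dispatch looks up the handler for the node's tag. unknown tags are
// reported instead of treated as fatal, because the stream carries many
// node types and a client will not care about all of them.
func (d *dispatcher) dispatch(n Node) string {
	h, ok := d.handlers[n.Tag]
	if !ok {
		return "unhandled tag: " + n.Tag
	}
	return h(n)
}

func main() {
	d := newDispatcher()
	stream := []Node{
		{Tag: "message", Attrs: map[string]string{"from": "alice"}},
		{Tag: "receipt", Attrs: map[string]string{"id": "3EB0ABC"}},
		{Tag: "mystery"},
	}
	for _, n := range stream {
		fmt.Println(d.dispatch(n))
	}
}
```

the point of the shape is that adding support for a new kind of event is one more map entry, not a new endpoint.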

the socket is not just a websocket

ConnectContext eventually calls unlockedConnect, which creates a FrameSocket, opens the websocket, and then calls doHandshake.

fs := socket.NewFrameSocket(cli.Log.Sub("Socket"), client)
if err := fs.Connect(ctx); err != nil {
  fs.Close(0)
  return err
} else if err = cli.doHandshake(ctx, fs, *keys.NewKeyPair()); err != nil {
  fs.Close(0)
  return fmt.Errorf("noise handshake failed: %w", err)
}

the important part is that the websocket is only the transport

the real session starts after the Noise handshake

handshake.go says it directly:

// doHandshake implements the Noise_XX_25519_AESGCM_SHA256 handshake for the WhatsApp web API.
func (cli *Client) doHandshake(ctx context.Context, fs *socket.FrameSocket, ephemeralKP keys.KeyPair) error {
  nh := socket.NewNoiseHandshake()
  nh.Start(socket.NoiseStartPattern, fs.Header)
  ...
}

this tells us a lot

WhatsApp Web is not sending plain websocket JSON around. it establishes an encrypted channel using Noise, verifies the server certificate, mixes in client keys, sends a client payload, and only then turns the connection into a NoiseSocket.

in socket/noisesocket.go, every frame after the handshake is encrypted with an AEAD key and a counter-based IV:

func generateIV(count uint32) []byte {
  iv := make([]byte, 12)
  binary.BigEndian.PutUint32(iv[8:], count)
  return iv
}

func (ns *NoiseSocket) SendFrame(ctx context.Context, plaintext []byte) error {
  ciphertext := ns.writeKey.Seal(nil, generateIV(ns.writeCounter), plaintext, nil)
  ns.writeCounter++
  return ns.fs.SendFrame(ciphertext)
}

that tiny function is one of those places where the abstraction disappears

you are not "connected to WhatsApp". you have a websocket carrying encrypted binary frames, with separate read and write keys, monotonically increasing counters, and a custom framing layer underneath.
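here is a small runnable sketch of that counter-as-nonce idea using Go's standard AES-GCM. it is not the whatsmeow implementation: the direction type and roundTrip helper are hypothetical, and for simplicity both sides share one key here, whereas the real socket has separate read and write keys that come out of the handshake.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// counterIV builds a 12-byte nonce with the frame counter in the last 4 bytes.
func counterIV(count uint32) []byte {
	iv := make([]byte, 12)
	binary.BigEndian.PutUint32(iv[8:], count)
	return iv
}

// direction holds one AEAD key plus the counter for frames in that direction.
type direction struct {
	aead    cipher.AEAD
	counter uint32
}

func newDirection(key []byte) (*direction, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &direction{aead: aead}, nil
}

// seal encrypts one frame and advances the counter so a nonce is never reused.
func (d *direction) seal(plaintext []byte) []byte {
	out := d.aead.Seal(nil, counterIV(d.counter), plaintext, nil)
	d.counter++
	return out
}

// open decrypts one frame. it only succeeds if the reader's counter matches
// the counter the writer used, so dropped or reordered frames fail to open.
func (d *direction) open(ciphertext []byte) ([]byte, error) {
	out, err := d.aead.Open(nil, counterIV(d.counter), ciphertext, nil)
	if err != nil {
		return nil, err
	}
	d.counter++
	return out, nil
}

// roundTrip sends frames from a writer to a reader that each track their own counter.
func roundTrip(key []byte, frames []string) ([]string, error) {
	writer, err := newDirection(key)
	if err != nil {
		return nil, err
	}
	reader, err := newDirection(key)
	if err != nil {
		return nil, err
	}
	got := make([]string, 0, len(frames))
	for _, f := range frames {
		plaintext, err := reader.open(writer.seal([]byte(f)))
		if err != nil {
			return nil, err
		}
		got = append(got, string(plaintext))
	}
	return got, nil
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	got, err := roundTrip(key, []string{"first frame", "second frame"})
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

the counter never travels on the wire in this sketch. both sides just count frames, which is why losing a single frame breaks everything after it.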

WhatsApp speaks nodes

after a frame is decrypted, handleFrame runs.

decompressed, err := waBinary.Unpack(data)
node, err := waBinary.Unmarshal(decompressed)
cli.recvLog.Debugf("%s", node.XMLString())

this is where the protocol becomes readable.

WhatsApp uses a compact binary XML format. whatsmeow represents it with a simple struct:

type Node struct {
  Tag     string
  Attrs   Attrs
  Content interface{}
}

once you see this, the rest of the codebase gets easier to navigate.

a message is a node, a receipt is a node, an iq query is a node, pairing starts from a node, and app state sync comes back as nodes. almost everything important is packed into this shape:

waBinary.Node{
  Tag: "iq",
  Attrs: waBinary.Attrs{
    "to":   types.ServerJID,
    "type": "result",
    "id":   reqID,
  },
}
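to get a feel for why a three-field struct is enough, here is a toy version with an XML-ish printer. it is only a sketch: this Node uses a plain map for attributes, the XMLish method is my own name, and the output format is not guaranteed to match what whatsmeow's XMLString prints.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Node is a simplified copy of the tag/attrs/content shape.
// in this sketch Content is nil, []Node, or []byte.
type Node struct {
	Tag     string
	Attrs   map[string]string
	Content interface{}
}

// XMLish renders the node as readable pseudo-XML for debugging.
func (n Node) XMLish() string {
	var sb strings.Builder
	sb.WriteString("<" + n.Tag)

	// sort attribute names so the output is stable across runs.
	keys := make([]string, 0, len(n.Attrs))
	for k := range n.Attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(&sb, " %s=%q", k, n.Attrs[k])
	}

	switch c := n.Content.(type) {
	case nil:
		sb.WriteString("/>")
	case []Node:
		sb.WriteString(">")
		for _, child := range c {
			sb.WriteString(child.XMLish())
		}
		sb.WriteString("</" + n.Tag + ">")
	case []byte:
		// raw payloads (ciphertext, protobufs) are summarized, not dumped.
		fmt.Fprintf(&sb, "><!-- %d bytes --></%s>", len(c), n.Tag)
	default:
		fmt.Fprintf(&sb, ">%v</%s>", c, n.Tag)
	}
	return sb.String()
}

func main() {
	n := Node{
		Tag:   "iq",
		Attrs: map[string]string{"type": "get", "id": "1"},
		Content: []Node{
			{Tag: "ping"},
			{Tag: "blob", Content: []byte{0xde, 0xad}},
		},
	}
	fmt.Println(n.XMLish())
}
```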

the encoder is also worth reading because it shows that this is not normal XML serialized as text

binary/encoder.go has token dictionaries, packed nibble strings, packed hex strings, JID-specific encodings, list sizes, and raw byte blocks. WhatsApp is optimizing this stream because clients keep it open all the time.

it feels old-school in a good way. very little ceremony, just tags, attributes, bytes, and a lot of protocol knowledge hidden in token tables.
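as an illustration of the "packed nibble" idea, here is a sketch that squeezes two characters of a number-like string into one byte. the alphabet (digits, '-', '.') and the 0xF padding nibble are my assumptions for the example, and the function names are hypothetical, so check binary/encoder.go for the exact table and framing whatsmeow uses.

```go
package main

import (
	"errors"
	"fmt"
)

// nibbleFor maps one character to a 4-bit value (assumed alphabet).
func nibbleFor(c byte) (byte, error) {
	switch {
	case c >= '0' && c <= '9':
		return c - '0', nil
	case c == '-':
		return 10, nil
	case c == '.':
		return 11, nil
	}
	return 0, errors.New("character cannot be nibble-packed")
}

// packNibbles stores two characters per byte. an odd-length input gets a
// 0xF padding nibble in the low half of the last byte.
func packNibbles(s string) ([]byte, error) {
	out := make([]byte, 0, (len(s)+1)/2)
	for i := 0; i < len(s); i += 2 {
		hi, err := nibbleFor(s[i])
		if err != nil {
			return nil, err
		}
		lo := byte(0x0F)
		if i+1 < len(s) {
			lo, err = nibbleFor(s[i+1])
			if err != nil {
				return nil, err
			}
		}
		out = append(out, hi<<4|lo)
	}
	return out, nil
}

// unpackNibbles reverses packNibbles, skipping padding and unknown nibbles.
func unpackNibbles(b []byte) string {
	const alphabet = "0123456789-."
	out := make([]byte, 0, len(b)*2)
	for _, v := range b {
		for _, nib := range []byte{v >> 4, v & 0x0F} {
			if int(nib) >= len(alphabet) {
				continue
			}
			out = append(out, alphabet[nib])
		}
	}
	return string(out)
}

func main() {
	packed, err := packNibbles("15551234567")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d chars -> %d bytes: %x -> %s\n", 11, len(packed), packed, unpackNibbles(packed))
}
```

phone-number-shaped strings are everywhere in this protocol, so halving their size on every node adds up on a connection that never closes.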

pairing is identity exchange

the QR login flow is in pair.go.

when the server sends a pair-device node, whatsmeow extracts references and turns them into QR data:

func (cli *Client) makeQRData(ref string) string {
  noise := base64.StdEncoding.EncodeToString(cli.Store.NoiseKey.Pub[:])
  identity := base64.StdEncoding.EncodeToString(cli.Store.IdentityKey.Pub[:])
  adv := base64.StdEncoding.EncodeToString(cli.Store.AdvSecretKey)
  return strings.Join([]string{ref, noise, identity, adv}, ",")
}

that QR code is not just "login token please".

it contains a server reference, the client's Noise public key, the client's identity public key, and the adv secret. scanning it is how the primary phone learns enough about this companion device to link it.
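to see the four fields for yourself, here is a sketch that builds a QR string in the same comma-joined, base64 layout as makeQRData and then parses it back. the qrPayload type and both function names are hypothetical, the sketch assumes the ref itself contains no commas, and the byte values in main are dummies, not real keys.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// qrPayload is a hypothetical container for the four QR fields.
type qrPayload struct {
	Ref       string
	NoisePub  []byte
	Identity  []byte
	AdvSecret []byte
}

// buildQR joins the ref and the three base64 fields with commas.
func buildQR(p qrPayload) string {
	enc := base64.StdEncoding.EncodeToString
	return strings.Join([]string{p.Ref, enc(p.NoisePub), enc(p.Identity), enc(p.AdvSecret)}, ",")
}

// parseQR splits the string and decodes the three base64 fields.
// standard base64 never emits a comma, so splitting on it is safe for those.
func parseQR(data string) (qrPayload, error) {
	parts := strings.Split(data, ",")
	if len(parts) != 4 {
		return qrPayload{}, errors.New("expected 4 comma-separated fields")
	}
	out := qrPayload{Ref: parts[0]}
	targets := []*[]byte{&out.NoisePub, &out.Identity, &out.AdvSecret}
	for i, t := range targets {
		b, err := base64.StdEncoding.DecodeString(parts[i+1])
		if err != nil {
			return qrPayload{}, fmt.Errorf("field %d: %w", i+1, err)
		}
		*t = b
	}
	return out, nil
}

func main() {
	qr := buildQR(qrPayload{
		Ref:       "example-ref",
		NoisePub:  []byte{1, 2, 3, 4},
		Identity:  []byte{5, 6, 7, 8},
		AdvSecret: []byte{9, 10},
	})
	fmt.Println(qr)
	parsed, err := parseQR(qr)
	fmt.Println(parsed.Ref, len(parsed.NoisePub), len(parsed.Identity), len(parsed.AdvSecret), err)
}
```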

then handlePairSuccess verifies what came back

there is HMAC verification, account signature verification, device signature generation, JID and LID storage, and identity persistence.

if !verifyAccountSignature(&deviceIdentity, cli.Store.IdentityKey, deviceIdentityDetails.GetDeviceType() == waAdv.ADVEncryptionType_HOSTED) {
  cli.sendPairError(ctx, reqID, 401, "signature-mismatch")
  return ErrPairInvalidDeviceSignature
}

deviceIdentity.DeviceSignature = generateDeviceSignature(&deviceIdentity, cli.Store.IdentityKey)[:]
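the HMAC step is the easiest one to show in isolation. this is a generic sketch, assuming HMAC-SHA256 keyed with the adv secret over the serialized identity details. the sign and verify names are mine, and the exact bytes that whatsmeow feeds into the MAC are not reproduced here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// sign computes an HMAC-SHA256 tag over details using the shared secret.
func sign(secret, details []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write(details)
	return mac.Sum(nil)
}

// verify recomputes the tag and compares it in constant time.
func verify(secret, details, tag []byte) bool {
	return hmac.Equal(sign(secret, details), tag)
}

func main() {
	secret := []byte("adv secret shared through the QR code") // dummy value
	details := []byte("serialized identity details")          // dummy value
	tag := sign(secret, details)

	fmt.Println("untouched:", verify(secret, details, tag))
	fmt.Println("tampered: ", verify(secret, []byte("something else"), tag))
}
```

the adv secret only ever reached the phone through the QR code, so a valid tag is evidence that the response comes from whoever scanned it.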

the interesting part is the mental model.

multi-device WhatsApp is not "the browser borrows your phone session". the companion has its own identity, keys, and device record. after linking, it is a real device in the account's device set.

that explains a lot of the complexity later

sending one message means sending many encrypted messages

SendMessage looks like a public api method, but internally it is a long protocol pipeline.

first it builds or accepts a message id

const WebMessageIDPrefix = "3EB0"

func (cli *Client) GenerateMessageID() types.MessageID {
  data := make([]byte, 8, 8+20+16)
  binary.BigEndian.PutUint64(data, uint64(time.Now().Unix()))
  ...
  hash := sha256.Sum256(data)
  return WebMessageIDPrefix + strings.ToUpper(hex.EncodeToString(hash[:9]))
}
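here is a standalone sketch of the same hash-and-truncate shape. it is deliberately simplified: the elided middle of the real function is not reproduced, and instead makeID (a hypothetical name) takes the timestamp and some caller-supplied entropy as explicit arguments so it can be run and tested on its own.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
	"time"
)

const webMessageIDPrefix = "3EB0"

// makeID hashes an 8-byte big-endian timestamp followed by extra entropy,
// then keeps the first 9 bytes of the digest as 18 uppercase hex characters.
func makeID(unixSeconds int64, entropy []byte) string {
	data := make([]byte, 8, 8+len(entropy))
	binary.BigEndian.PutUint64(data, uint64(unixSeconds))
	data = append(data, entropy...)
	hash := sha256.Sum256(data)
	return webMessageIDPrefix + strings.ToUpper(hex.EncodeToString(hash[:9]))
}

func main() {
	entropy := make([]byte, 16)
	if _, err := rand.Read(entropy); err != nil {
		panic(err)
	}
	id := makeID(time.Now().Unix(), entropy)
	fmt.Println(id, len(id))
}
```

the result is always 22 characters: the 4-character prefix plus 18 hex characters.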

then it decides what kind of chat this is

DMs, groups, broadcasts, newsletters, bots, peer messages, and hidden-user LID addressing all branch differently

the LID bit is especially interesting. modern WhatsApp has phone-number JIDs and logical IDs. whatsmeow has to translate between them, cache mappings, and sometimes replace a destination PN with a LID before sending.

toLID, err = cli.Store.LIDs.GetLIDForPN(ctx, to)
...
cli.Log.Debugf("Replacing SendMessage destination with LID as migration timestamp is set %s -> %s", to, toLID)
to = toLID
ownID = cli.getOwnLID()

this is a good example of why source code is useful. protocol migrations rarely show up as a clean paragraph in docs. in source, they show up as branches, caches, fallback paths, and weird comments.

for a normal DM, whatsmeow still has to know every target device

GetUserDevices sends a usync query and gets AD JIDs back:

list, err := cli.usync(ctx, jidsToSync, "query", "message", []waBinary.Node{
  {Tag: "devices", Attrs: waBinary.Attrs{"version": "2"}},
})

then the send path encrypts the message separately for each device.

for _, jid := range allDevices {
  encrypted, err := cli.encryptMessageForDeviceAndWrapV3(ctx, payload, skdm, dsmForDevice, jid, bundles[jid], encAttrs)
  if err != nil {
    cli.Log.Warnf("Failed to encrypt %s for %s: %v", id, jid, err)
    continue
  }
  participantNodes = append(participantNodes, *encrypted)
}

that is the core of multi-device

you do not encrypt a message "to a user". you encrypt it to the user's devices. if a session is missing, whatsmeow fetches prekeys, processes a bundle, creates a Signal session, and then encrypts.
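the fan-out shape is easy to sketch without any real cryptography. everything here is hypothetical: deviceJID, participant, fanOut, and especially newToyEncryptor, which XORs bytes with a per-device key purely so the example runs. it is not Signal, and unlike whatsmeow it does not try to fetch prekeys when a session is missing, it just skips the device the way the loop above skips on error.

```go
package main

import (
	"errors"
	"fmt"
)

// deviceJID identifies one device of one user, e.g. "alice:1".
type deviceJID string

// encryptFunc encrypts a payload for exactly one device.
type encryptFunc func(device deviceJID, plaintext []byte) ([]byte, error)

// participant is one per-device ciphertext in the outgoing message.
type participant struct {
	To         deviceJID
	Ciphertext []byte
}

// fanOut encrypts the same plaintext once per device. a failure for one
// device is recorded and skipped so the other devices still get the message.
func fanOut(devices []deviceJID, plaintext []byte, encrypt encryptFunc) (sent []participant, failed []deviceJID) {
	for _, d := range devices {
		ct, err := encrypt(d, plaintext)
		if err != nil {
			failed = append(failed, d)
			continue
		}
		sent = append(sent, participant{To: d, Ciphertext: ct})
	}
	return sent, failed
}

// newToyEncryptor returns a placeholder "cipher" (single-byte XOR).
// it exists only to give each device a different output. do not use it for anything.
func newToyEncryptor(sessions map[deviceJID]byte) encryptFunc {
	return func(device deviceJID, plaintext []byte) ([]byte, error) {
		key, ok := sessions[device]
		if !ok {
			return nil, errors.New("no session for " + string(device))
		}
		out := make([]byte, len(plaintext))
		for i, b := range plaintext {
			out[i] = b ^ key
		}
		return out, nil
	}
}

func main() {
	encrypt := newToyEncryptor(map[deviceJID]byte{"alice:0": 0x11, "alice:1": 0x22})
	devices := []deviceJID{"alice:0", "alice:1", "alice:2"}
	sent, failed := fanOut(devices, []byte("hello"), encrypt)
	for _, p := range sent {
		fmt.Printf("%s -> %x\n", p.To, p.Ciphertext)
	}
	fmt.Println("skipped:", failed)
}
```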

the public call is simple:

cli.SendMessage(ctx, target, &waE2E.Message{
  Conversation: proto.String("hello"),
})

but behind it is device discovery, session lookup, prekey fallback, message transport protobufs, binary nodes, encrypted frames, server ack waiting, retry handling, and cache invalidation if the server says the participant hash was wrong.

the api is small because the protocol is not

receiving is the same story in reverse

incoming messages enter through handleEncryptedMessage

first whatsmeow parses the node into MessageInfo: chat, sender, participant, timestamp, push name, media type, bot metadata, LID/PN alternates.

then it decrypts based on the child node type

switch encType {
case "pkmsg", "msg":
  decrypted, ciphertextHash, err = cli.decryptDM(ctx, &child, senderEncryptionJID, encType == "pkmsg", info.Timestamp)
case "skmsg":
  decrypted, ciphertextHash, err = cli.decryptGroupMsg(ctx, &child, senderEncryptionJID, info.Chat, info.Timestamp)
}

pkmsg is a prekey message. msg is a normal Signal message. skmsg is a sender-key group message.

again, the tags reveal the architecture

DMs use per-device Signal sessions. groups use sender keys. failed decryptions can produce retry receipts. some messages are buffered so the same ciphertext is not decrypted twice. some unavailable messages can be requested again from the phone.
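two of those ideas fit in a few lines: mapping the enc type to a scheme, and remembering ciphertext hashes so a replayed ciphertext is not processed twice. this is a sketch of the general technique only. seenCiphertexts, firstTime, and schemeFor are hypothetical names, the set is in-memory and unbounded, and whatsmeow's actual buffering is persistent and more careful than this.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// schemeFor names the decryption scheme implied by an enc node type.
func schemeFor(encType string) string {
	switch encType {
	case "pkmsg":
		return "signal session (prekey message)"
	case "msg":
		return "signal session"
	case "skmsg":
		return "sender key"
	default:
		return "unknown"
	}
}

// seenCiphertexts remembers SHA-256 hashes of ciphertexts already handled.
type seenCiphertexts struct {
	seen map[[32]byte]struct{}
}

func newSeen() *seenCiphertexts {
	return &seenCiphertexts{seen: map[[32]byte]struct{}{}}
}

// firstTime reports whether this ciphertext is new, and records it if so.
func (s *seenCiphertexts) firstTime(ciphertext []byte) bool {
	h := sha256.Sum256(ciphertext)
	if _, ok := s.seen[h]; ok {
		return false
	}
	s.seen[h] = struct{}{}
	return true
}

func main() {
	seen := newSeen()
	incoming := []struct {
		encType    string
		ciphertext []byte
	}{
		{"pkmsg", []byte{1, 2, 3}},
		{"skmsg", []byte{4, 5, 6}},
		{"pkmsg", []byte{1, 2, 3}}, // redelivery of the first one
	}
	for _, m := range incoming {
		if !seen.firstTime(m.ciphertext) {
			fmt.Println("duplicate, skipping")
			continue
		}
		fmt.Println("decrypt with:", schemeFor(m.encType))
	}
}
```

the dedupe matters because decrypting a Signal message advances the session state, so a second attempt on the same ciphertext does not simply return the same plaintext again.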

this is why real messaging clients feel simple but are hard to implement. the happy path is tiny. the edge paths are the product.

what about calls?

at some point while reading this you might be thinking:

does whatsmeow support WhatsApp calls?

short answer: no.

whatsmeow has call event handling, and you can see call in the nodeHandlers map, but that does not mean it implements the whole VoIP stack

that made me curious enough to go look at how WhatsApp Web actually does calls.

i ended up reverse engineering a decent chunk of it. so far i can make and receive audio calls just fine from a web-style client. not video yet, and not something i would call polished, but the core audio path works.

that rabbit hole taught me a lot because calls are a very different beast from messages. messages are stored, retried, synced, and decrypted later. calls are live. you suddenly care about signaling, media negotiation, timing, streams, device state, and all the annoying real-time parts that normal message sending lets you ignore.

i will probably write a separate post on that later, maybe a proper "how WhatsApp Web calls work" writeup once i clean up my notes.

update - April 13, 2026

i wrote the calls post here: Reverse engineering WhatsApp calls

app state is a patch stream

one of my favorite parts of whatsmeow is appstate.go.

your contact list, pinned chats, muted chats, archived state, and other local-ish settings are not fetched as one normal object

they are app state patches.

FetchAppState loads the local version and hash, asks the server for patches, applies them, and stores the new version.

version, hash, err := cli.Store.AppState.GetAppStateVersion(ctx, string(name))
state := appstate.HashState{Version: version, Hash: hash}
...
state, err = cli.applyAppStatePatches(ctx, name, state, patches, fullSync, eventsToDispatchPtr)

the shape is closer to syncing a database than calling a REST endpoint.

there is versioning, hashes, snapshots, and mutations. there is recovery when keys are missing. there is a special fast path for mass inserting contacts during a full sync.

this makes sense when you remember the product.

WhatsApp has multiple devices, offline periods, settings changed from different clients, and end-to-end encrypted state. a companion device cannot just ask "give me my chats" and trust a giant JSON response forever. it needs a sync protocol.
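a toy version of "version plus hash plus patches" looks like this. it is a loose sketch of the idea and not the real format: the state, patch, and mutation types are hypothetical, nothing is encrypted, and the hash is a simple SHA-256 chain that i picked for brevity, which is not the hash construction whatsmeow's appstate package uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// mutation sets or removes one key.
type mutation struct {
	Remove bool
	Key    string
	Value  string
}

// patch is a batch of mutations that moves the state to Version.
type patch struct {
	Version   uint64
	Mutations []mutation
}

// state is the local copy: data plus the version and hash that describe it.
type state struct {
	Version uint64
	Hash    [32]byte
	Data    map[string]string
}

func newState() *state { return &state{Data: map[string]string{}} }

// apply accepts only the patch that directly follows the local version,
// applies its mutations, and folds them into the running hash.
func (s *state) apply(p patch) error {
	if p.Version != s.Version+1 {
		return fmt.Errorf("patch %d does not follow local version %d", p.Version, s.Version)
	}
	h := sha256.New()
	h.Write(s.Hash[:])
	for _, m := range p.Mutations {
		if m.Remove {
			delete(s.Data, m.Key)
			fmt.Fprintf(h, "remove\x00%s\x00", m.Key)
		} else {
			s.Data[m.Key] = m.Value
			fmt.Fprintf(h, "set\x00%s\x00%s\x00", m.Key, m.Value)
		}
	}
	copy(s.Hash[:], h.Sum(nil))
	s.Version = p.Version
	return nil
}

func main() {
	s := newState()
	patches := []patch{
		{Version: 1, Mutations: []mutation{{Key: "mute:chat1", Value: "8h"}, {Key: "pin:chat2", Value: "true"}}},
		{Version: 2, Mutations: []mutation{{Remove: true, Key: "mute:chat1"}}},
	}
	for _, p := range patches {
		if err := s.apply(p); err != nil {
			panic(err)
		}
	}
	fmt.Printf("version=%d keys=%d hash=%x\n", s.Version, len(s.Data), s.Hash[:4])
}
```

two devices that apply the same patches in the same order end up with the same version and hash, and a device that missed a patch finds out immediately because the next version does not line up.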

the store is part of the protocol

store/store.go looks boring until you realize it is a list of everything WhatsApp needs to remember locally.

type Device struct {
  NoiseKey       *keys.KeyPair
  IdentityKey    *keys.KeyPair
  SignedPreKey   *keys.PreKey
  RegistrationID uint32
  AdvSecretKey   []byte

  ID  *types.JID
  LID types.JID

  Identities    IdentityStore
  Sessions      SessionStore
  PreKeys       PreKeyStore
  SenderKeys    SenderKeyStore
  AppStateKeys  AppStateSyncKeyStore
  AppState      AppStateStore
  Contacts      ContactStore
  ChatSettings  ChatSettingsStore
  MsgSecrets    MsgSecretStore
  PrivacyTokens PrivacyTokenStore
  LIDs          LIDStore
}

this is not just persistence for convenience.

without this store, the client forgets who it is, loses Signal sessions, cannot trust identities, cannot decrypt group messages, cannot sync app state, cannot map LIDs to phone numbers, and cannot handle retries after restart.

the database is part of the client identity.

that is probably the biggest lesson from the codebase: WhatsApp Web is not a stateless web app. it is a full companion client with durable cryptographic state.

what the source reveals

reading whatsmeow made WhatsApp feel less magical and more like a set of very deliberate layers:

  • websocket transport
  • Noise encrypted frames
  • binary XML nodes
  • protobuf message payloads
  • Signal sessions per device
  • sender keys for groups
  • app state patches for sync
  • local stores for identity and continuity

each layer is small enough to understand by itself. the complexity comes from all of them being alive at once.

the best way to read whatsmeow is to follow this path:

  1. client.go to understand the event loop
  2. handshake.go and socket/ to understand the encrypted transport
  3. binary/node.go and binary/encoder.go to understand the node format
  4. pair.go to understand device linking
  5. send.go to understand outbound messages
  6. message.go to understand inbound messages
  7. appstate.go and appstate/ to understand sync
  8. store/ to understand what has to survive restarts

once you read all this, WhatsApp Web stops looking like a website.

it looks like a small encrypted messaging client that happens to run over a websocket.

and honestly, that is much more interesting.