LN Gossip Visualizer

Seeing how information travels across the Lightning Network

The LN Gossip Visualizer is a tool for observing, analyzing, and visualizing how gossip messages propagate across the Lightning Network. It answers a deceptively simple question: when a node announces something to the network, how does that information spread?

Why This Matters

The Lightning Network relies on a gossip protocol for nodes to discover each other, learn about available channels, and compute payment routes. This protocol is defined in BOLT #7 and operates as a peer-to-peer flood-fill: when a node receives a new gossip message, it forwards it to all its other peers.

But how well does this actually work in practice? How fast does information reach the entire network? Are there bottlenecks? Can an observer figure out who originated a message by watching propagation patterns?

These are the questions the LN Gossip Visualizer helps explore.

Standing on Solid Foundations

The LN Gossip Visualizer is built on top of the gossip_observer project by Jonathan Harvey-Buschel (@jharveyb). Jonathan designed and implemented the core infrastructure that makes this work possible: the Rust-based gossip collector, the archiver pipeline, the data storage layer, and the deployment tooling. His work on passive gossip observation — connecting to hundreds of Lightning peers, recording message timing with nanosecond precision, and exporting structured datasets — is the cornerstone on which this visualizer is built.

Our contribution layers an interactive visualization dashboard on top of Jonathan’s data collection infrastructure, making the propagation patterns visible and explorable.

What You’ll Find in This Book

  • Part 1 explains what LN gossip is and why studying it matters
  • Part 2 describes the passive observer (built by @jharveyb) that connects to hundreds of peers and records every gossip message
  • Part 3 walks through the interactive dashboard — propagation replay, geolocation map, fingerprinting, and leak detection
  • Part 4 shares findings from the dataset
  • Part 5 outlines future directions

You can explore the live prototype at prototype-ln-gossip.vercel.app.

What is Lightning Network Gossip?

Lightning Network nodes need to know about each other to route payments. They learn about the network through a gossip protocol defined in BOLT #7 — a peer-to-peer mechanism where nodes share information about channels and other nodes.

The Three Gossip Message Types

| Message | Purpose | Triggered by |
|---|---|---|
| channel_announcement | Declares a new channel exists between two nodes | Channel opening (confirmed on-chain) |
| channel_update | Updates a channel’s routing policy (fees, HTLC limits, enabled/disabled) | Node operator changes, periodic refresh |
| node_announcement | Advertises a node’s metadata (alias, color, addresses) | Node coming online, config changes |

How Gossip Propagates

Gossip works as a flood-fill: when a node receives a new gossip message it hasn’t seen before, it validates the message and then forwards it to all of its connected peers. This means a single channel_update from one node will eventually reach every other node in the network, hopping peer-to-peer across the graph.

  Node A (origin)
    ├──→ Peer 1 ──→ Peer 4 ──→ ...
    ├──→ Peer 2 ──→ Peer 5 ──→ ...
    └──→ Peer 3 ──→ Peer 6 ──→ ...

The speed at which this happens depends on network topology, peer connectivity, implementation details, and geographic distance between nodes.
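The forwarding rule can be modeled in a few lines of Python. This is a simplified sketch: it ignores signature validation, rate limiting, and timing, and the reachable-set computation stands in for what is really an asynchronous broadcast.

```python
def gossip_flood(origin, peers_of):
    """Flood-fill: a node accepts a message it hasn't seen before
    and forwards it to its peers. Returns every node reached."""
    seen = {origin}
    frontier = [origin]
    while frontier:
        node = frontier.pop()
        for peer in peers_of[node]:
            if peer not in seen:   # first delivery wins; duplicates are dropped
                seen.add(peer)
                frontier.append(peer)
    return seen

# Toy topology mirroring the diagram above: the origin fans out to three peers
topology = {
    "A": ["P1", "P2", "P3"],
    "P1": ["A", "P4"], "P2": ["A", "P5"], "P3": ["A", "P6"],
    "P4": ["P1"], "P5": ["P2"], "P6": ["P3"],
}
reached = gossip_flood("A", topology)
```

In the real network, which peer's copy of a message arrives first at any given node depends on link latency and implementation behavior, which is exactly the timing signal the rest of this book explores.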

Why It’s Interesting

This flood-fill is efficient but not instantaneous. There are measurable propagation delays — some nodes receive messages in milliseconds, others take seconds. These timing differences are the signal our visualizer is built to explore.

Why Observing Gossip Matters

Gossip might seem like a mundane protocol detail, but observing it closely reveals important things about privacy, network health, and the real-world behavior of a decentralized system.

Privacy: Who Originated This Message?

When a node updates its channel policy, it creates a channel_update and sends it to its peers. Those peers forward it to their peers, and so on. But the first peer to deliver a message to an observer is likely to be topologically close to the originator.

If an observer connects to enough peers, the arrival order of a message can act as a fingerprint — potentially revealing who originated it. This has direct privacy implications:

  • Can you tell which node changed its fees?
  • Can you correlate node_announcement timing to identify when a node restarts?
  • Can you detect which nodes are run by the same operator?

Network Health

Propagation timing also reveals the health of the network’s communication layer:

  • Bottlenecks: Are some regions consistently slower?
  • Unreliable peers: Do some nodes fail to forward messages?
  • Implementation differences: Do LND, CLN, Eclair, and LDK propagate at different speeds?

Research Context

This kind of analysis has a strong precedent on the Bitcoin base layer. Projects like TxProbe and research on transaction propagation timing have shown that P2P network observation is a powerful tool for understanding — and sometimes deanonymizing — decentralized networks.

The Lightning Network’s gossip layer has received far less scrutiny. The LN Gossip Visualizer aims to change that.

Collecting Gossip at Scale

To observe gossip propagation, we need a passive collector node that connects to as many Lightning peers as possible and records exactly when each peer delivers each gossip message.

The entire collection infrastructure described in this chapter was designed and built by Jonathan Harvey-Buschel (@jharveyb) as part of the gossip_observer project. The LN Gossip Visualizer builds its visualization layer on top of this foundation.

The Observer Node

Our collector is built on LDK (Lightning Dev Kit) and operates as a silent listener — it never opens channels or routes payments. Its only job is to connect to peers and log incoming gossip.

What Gets Recorded

For every gossip message received, we store:

| Field | Description |
|---|---|
| msg_hash | SHA-256 hash identifying the unique message |
| peer | Public key of the peer that delivered it |
| net_timestamp | Nanosecond-precision arrival time |
| collector | Which collector instance received it |

The same message arrives from many peers at different times — that’s the core data that powers our visualizations.
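As an illustration, one observation can be modeled as a record with these four fields (the hash, pubkey, and timestamp values below are made up, not real data):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimingRecord:
    msg_hash: str        # SHA-256 of the gossip message
    peer: str            # pubkey of the delivering peer
    net_timestamp: int   # nanosecond-precision arrival time
    collector: str       # which collector instance saw it

# Three hypothetical deliveries of the same message:
records = [
    TimingRecord("ab12", "peer_x", 1_000_000_000, "c1"),
    TimingRecord("ab12", "peer_y", 1_000_450_000, "c1"),
    TimingRecord("ab12", "peer_z", 1_003_900_000, "c1"),
]

# Sorting by arrival time recovers the per-message delivery order
order = [r.peer for r in sorted(records, key=lambda r: r.net_timestamp)]
```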

From Raw Data to Insights

The collector generates millions of raw timing records. Turning them into something a browser can visualize requires a multi-stage pipeline.

Stage 1: Export to Parquet

The archiver stores data in DuckDB and exports it as Parquet files:

| File | Size | Contents |
|---|---|---|
| timings.parquet | ~840 MB (22 shards) | One row per (message, peer) — the core timing data |
| messages.parquet | ~10 MB | Message metadata: hash, type, timestamp, payload |
| metadata.parquet | ~5 MB | Peer metadata: pubkey, alias, addresses |

Stage 2: Preprocessing

A Python script (preprocess.py) transforms the raw data into visualization-ready JSON:

  1. Arrival percentiles — For each message, rank peers by arrival time. A peer’s avg_arrival_pct across all messages determines its radial position in the visualization.

  2. First-responder scores — Peers that consistently deliver messages before others get high scores. These are candidates for being topologically close to message originators.

  3. Message selection — From ~416,000 total messages, we select ~181 “interesting” ones: messages received by at least 50 peers, deduplicated, with clear propagation patterns.

  4. Community assignment — Peers are grouped into communities using a combination of:

    • Known hubs: ~15 manually identified pubkeys (major nodes like ACINQ, Bitfinex, River)
    • Alias matching: Nodes with “LNT” in their alias are grouped together
    • Unknown: The remaining ~970 of 978 peers fall into the catch-all “unknown” community
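The three rules amount to a short lookup function. A sketch, with a hypothetical pubkey standing in for the real hub list:

```python
# Hypothetical entry; the real pipeline uses ~15 manually identified pubkeys
KNOWN_HUBS = {"02aaaa": "ACINQ"}

def assign_community(pubkey, alias):
    """Apply the three rules in order: known hub, alias match, catch-all."""
    if pubkey in KNOWN_HUBS:
        return KNOWN_HUBS[pubkey]
    if alias and "LNT" in alias:
        return "LNT"
    return "unknown"
```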

Stage 3: JSON Output

The pipeline produces 7 JSON files that the frontend loads directly:

  • peers.json — Per-peer stats, coordinates, community assignment
  • wavefronts.json — Per-message arrival sequences (the largest file at 14 MB)
  • messages.json — Message metadata for the selector
  • communities.json — Community definitions with colors and labels
  • fingerprints.json — Peer timing fingerprints
  • leaks.json — First-responder and colocation analysis
  • summary.json — Aggregate statistics

Dashboard Overview

The LN Gossip Visualizer presents data across a four-quadrant dashboard where each panel offers a different lens on the same underlying data.

The Four Quadrants

| Quadrant | Name | What it shows |
|---|---|---|
| Q1 (top-left) | Message Propagation Replay | Animated radial view of how a message spreads peer-by-peer |
| Q2 (top-right) | World Map | Geographic distribution of peers, colored by propagation timing |
| Q3 (bottom-left) | Peer Fingerprints | Timing signature patterns across peers |
| Q4 (bottom-right) | Leak Detection | First-responder analysis and colocation suspects |

Linked Interaction

All four panels are linked: clicking a peer in any quadrant highlights the same peer in all others. Selecting a message in the replay panel updates the map markers, fingerprint highlights, and leak scores simultaneously.

At the top, a message selector lets you browse through ~181 curated gossip messages. Each one triggers a full propagation replay showing the order in which peers delivered that specific message.

Playback controls allow you to:

  • Play/pause the propagation animation
  • Scrub through time manually
  • Adjust playback speed

Tech Stack

The dashboard is intentionally lightweight — no frameworks, no build step:

  • Vanilla JavaScript (~72 KB) — all logic in a single app.js
  • HTML5 Canvas — for the propagation replay (Q1) and fingerprints (Q3)
  • Leaflet.js — for the geographic map (Q2)
  • Static JSON — all data is pre-computed and served as flat files

Message Propagation Replay

The propagation replay is the centerpiece of the visualizer. It shows, in real time, how a single gossip message spreads from its origin through the network to our observer’s peers.

The Radial Layout

The visualization uses a radial layout with the observer node at the center:

  • Angular position — determined by the peer’s community (e.g., known hub operators, implementation groups). Each community occupies a proportional angular slice.
  • Radial position — determined by the peer’s average arrival percentile (avg_arrival_pct). Peers that consistently receive messages early sit closer to the center; slower peers sit near the edge.

The formula:

radius = maxR × (0.15 + 0.85 × avg_arrival_pct)

This ensures even the fastest peers have some distance from the center (the 0.15 floor), while slow peers extend to the full radius.
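Transcribed into Python (function and parameter names here are illustrative, not the dashboard's actual identifiers):

```python
def peer_radius(avg_arrival_pct: float, max_r: float) -> float:
    """Map a peer's average arrival percentile to a radial distance.
    The 0.15 floor keeps even the fastest peers off the exact center."""
    return max_r * (0.15 + 0.85 * avg_arrival_pct)
```

With max_r = 100, a peer that always arrives first (percentile 0.0) sits at radius 15, and a peer that always arrives last (percentile 1.0) sits at the full radius of 100.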

The Wavefront Ring

An orange expanding ring sweeps outward from the center during playback. This represents the passage of time — as the ring reaches each peer’s radial position, that peer should be receiving the message around that moment.

The ring’s radius at any point is:

ring_radius = (elapsed_ms / time_spread_ms) × maxR

Note: the ring is a time metaphor, not a geometric boundary. Peers light up based on their actual recorded arrival time, which may not perfectly align with the ring.
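A sketch of the ring position, with a clamp added so the ring stops at the outer edge once playback passes the last arrival (the clamp is our addition, not part of the formula above):

```python
def ring_radius(elapsed_ms: float, time_spread_ms: float, max_r: float) -> float:
    """Wavefront ring position elapsed_ms into a replay whose
    first-to-last arrival gap is time_spread_ms."""
    return min(elapsed_ms / time_spread_ms, 1.0) * max_r  # clamp at the edge
```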

Peer States

Each peer dot transitions through states during playback:

| State | Color | Meaning |
|---|---|---|
| Waiting | Dim gray | Hasn’t received the message yet |
| Received | Bright community color | Just received the message |
| Faded | Muted color | Received earlier, fading to background |

What You Can Learn

By watching multiple messages play out, patterns emerge:

  • Some peers consistently light up first — they’re well-connected or close to common originators
  • Clusters of peers that light up together suggest shared network paths
  • Peers that are always last may have poor connectivity or be behind rate limiters

Geolocation Map

The geolocation map provides a geographic perspective on gossip propagation, showing where in the world peers are located and how the message wavefront correlates with physical distance.

IP-to-Location Mapping

Lightning nodes that advertise clearnet (IPv4/IPv6) addresses can be geolocated using IP geolocation databases. For each peer connected to our observer, we resolve its advertised IP address to approximate latitude/longitude coordinates.

Out of ~978 connected peers, a subset have mappable clearnet addresses. The rest use Tor (.onion) addresses or don’t advertise any address at all.

The Map View

The map uses Leaflet.js with dark CartoDB tiles. Each geolocated peer appears as a circle marker on the map:

  • Color matches the peer’s community assignment (same as the propagation replay)
  • Opacity/brightness reflects propagation state during playback — peers brighten when they receive the current message
  • Clicking a peer on the map highlights it across all four dashboard quadrants

When a single peer is selected, the map pans to center on it. When multiple peers are highlighted, the map adjusts its bounds to fit them all.

What the Map Reveals

  • Geographic clustering: Many peers concentrate in North America and Western Europe, reflecting where Lightning infrastructure is hosted
  • Propagation vs distance: Messages don’t always reach nearby peers first — network topology matters more than physical proximity
  • Regional patterns: Some messages show clear geographic wavefronts; others spread unpredictably

Limitations

Geographic data should be interpreted carefully:

  • Tor nodes (~30-40% of the network) have no mappable location
  • VPNs and cloud hosting place nodes at datacenter locations, not operator locations
  • IP geolocation accuracy varies — city-level at best, sometimes only country-level
  • A node’s advertised address may not match its actual network path

Peer Fingerprinting & Leak Detection

The bottom half of the dashboard focuses on analysis — using propagation timing data to identify patterns that reveal information about peers and the network.

First-Responder Analysis

For each gossip message, the first responder is the peer that delivers it to our observer before any other. Across hundreds of messages, some peers appear as first responders far more often than chance would predict.

A peer with a high first-responder score may be:

  • Running on fast infrastructure (low-latency connections, powerful hardware)
  • Potentially the originator of some messages (a node always delivers its own messages first)
  • Geographically or topologically close to the observer node

Note that a high first-responder score does not necessarily mean a peer is well-connected in the overall network topology — it may simply have a fast, direct link to the observer.

The leak detection panel ranks peers by their first-responder frequency, flagging statistical outliers.

Timing Fingerprints

Each peer has a characteristic timing signature — a pattern of how early or late it tends to deliver messages relative to other peers. The fingerprint view visualizes this as a pattern across all observed messages.

These fingerprints can reveal:

  • Implementation differences — LND, CLN, Eclair, and LDK advertise different feature bits in their node_announcement messages, making it possible to identify which implementation a peer is running
  • Rate limiting — Some implementations batch gossip messages, creating characteristic delivery patterns
  • Network position — A peer’s consistent timing pattern reflects its position in the network graph

Colocation Detection

When two or more peers always receive messages at nearly the same time, it suggests they may be:

  • Running on the same machine or in the same datacenter
  • Connected to each other with a very low-latency link
  • Operated by the same entity running multiple nodes

The colocation panel groups peers with highly correlated arrival times and flags suspicious clusters.
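A naive version of this grouping can be sketched as follows: treat a pair as a colocation candidate when, across their shared messages, their arrival times are consistently within a few milliseconds. The thresholds and data layout here are illustrative, not the dashboard's actual parameters.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

def colocation_candidates(arrivals, max_gap_ms=5.0, min_shared=3):
    """arrivals: {msg: {peer: arrival_ms relative to first delivery}}.
    Returns peer pairs whose arrival gap is consistently tiny."""
    gaps = defaultdict(list)
    for obs in arrivals.values():
        for a, b in combinations(sorted(obs), 2):
            gaps[(a, b)].append(abs(obs[a] - obs[b]))
    return [pair for pair, g in gaps.items()
            if len(g) >= min_shared and mean(g) <= max_gap_ms]

# Made-up data: x and y always arrive together; z lags by hundreds of ms
arrivals = {
    "m1": {"x": 0, "y": 1, "z": 400},
    "m2": {"x": 2, "y": 2, "z": 350},
    "m3": {"x": 0, "y": 1, "z": 500},
}
```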

Privacy Implications

Taken together, these analyses raise important questions:

  • If an observer can fingerprint peers and identify first responders, can they deanonymize gossip origins?
  • Could multiple observers, positioned across the network, triangulate the source of a message?
  • What countermeasures could implementations adopt? (e.g., random delays, batching, decoy messages)

What the Data Reveals

Our dataset captures a snapshot of Lightning Network gossip propagation from a single observer connected to ~978 peers, recording over 416,000 unique gossip messages.

By the Numbers

| Metric | Value |
|---|---|
| Connected peers | ~978 |
| Total unique messages observed | ~416,000 |
| Messages curated for replay | 181 |
| Peers with geolocation data | Varies by clearnet availability |
| Observation period | September 2025 dump |

Propagation Speed

Most gossip messages reach the majority of peers within a few seconds. However, the distribution has a long tail — some peers consistently receive messages many seconds after the first arrival.

The time_spread_ms for a typical message (the time between the first and last peer receiving it) ranges from under 1 second to over 10 seconds.
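Computing this spread from raw nanosecond arrival timestamps is a one-liner (an illustrative helper, not the pipeline's actual code):

```python
def time_spread_ms(arrival_ns):
    """First-to-last delivery gap for one message, in milliseconds."""
    return (max(arrival_ns) - min(arrival_ns)) / 1e6

# Three deliveries of one message, 3.9 ms apart end to end
spread = time_spread_ms([1_000_000_000, 1_000_450_000, 1_003_900_000])
```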

Fast vs Slow Peers

Peers cluster into roughly two groups:

  • Fast peers (low avg_arrival_pct): These tend to be well-known routing nodes with high connectivity — they sit close to the center of the propagation replay
  • Slow peers (high avg_arrival_pct): Often smaller nodes, Tor-only nodes, or peers with rate-limited gossip forwarding

Community Patterns

The manual community assignments (~15 known hub pubkeys plus alias matching) match only 8 of the 978 connected peers; the vast majority (970) fall into the “unknown” community. This highlights how much of the network remains uncharacterized — and how much room there is for better automated community detection.

Key Observations

  • First responders are not random: A small set of peers consistently delivers messages first, suggesting they may be message originators or have particularly fast connections to the observer
  • Geographic proximity ≠ propagation speed: Peers in the same city don’t necessarily receive messages at the same time — network topology dominates physical distance
  • Implementation fingerprints are visible: Feature bits advertised in node_announcement messages differ across LN implementations, making it possible to identify which software a peer is running

🔗 Explore the data yourself at prototype-ln-gossip.vercel.app

Roadmap & Open Questions

The LN Gossip Visualizer is a prototype. Here’s where we see it going.

Multi-Observer Deployment

Currently, we observe from a single vantage point. A single observer can rank peers by arrival time, but it can’t definitively identify message origins — it only sees who delivered the message to it first.

With multiple observers deployed across the network (different geographic regions, different peer sets), we could:

  • Triangulate message origins by correlating arrival times across observers
  • Map propagation paths more accurately
  • Distinguish between “fast because well-connected” and “fast because close to the origin”

Real-Time Mode

The current visualizer works with pre-recorded data. A natural evolution is a live mode that streams gossip events in real time:

  • Watch messages propagate as they happen
  • Alert on anomalous propagation patterns
  • Monitor network health continuously

Better Community Detection

Our current community assignment is largely manual (~15 known pubkeys). Future work should incorporate:

  • Graph-based community detection using the channel graph topology (Louvain, label propagation)
  • Implementation fingerprinting to automatically group peers by software (LND vs CLN vs Eclair vs LDK)
  • Clustering by timing patterns — peers with similar propagation profiles likely share network characteristics

Minisketch & Set Reconciliation

The Lightning Network is exploring Minisketch-based gossip (Erlay-style set reconciliation) as a more bandwidth-efficient alternative to flood-fill gossip. Observing how this changes propagation dynamics would be valuable:

  • Does set reconciliation make propagation more uniform?
  • Does it reduce the information available to passive observers?
  • How does it interact with different implementation strategies?

Open Research Questions

  • How many observers are needed to reliably identify message origins?
  • Can random forwarding delays effectively prevent timing analysis?
  • What is the minimum connectivity an observer needs to get meaningful propagation data?
  • How does gossip propagation change over time — is the network getting faster, slower, or more centralized?

Appendix: Technical Reference

Data Schema

timings.parquet

The core dataset — one row per (message, peer) observation:

| Column | Type | Description |
|---|---|---|
| msg_hash | bytes | SHA-256 hash of the gossip message |
| peer | bytes | Public key of the delivering peer |
| net_timestamp | i64 | Nanosecond arrival timestamp |
| collector | string | Collector instance identifier |

messages.parquet

| Column | Type | Description |
|---|---|---|
| msg_hash | bytes | Message identifier |
| msg_type | string | channel_announcement, channel_update, or node_announcement |
| timestamp | i64 | Message’s internal timestamp (set by originator) |

metadata.parquet

| Column | Type | Description |
|---|---|---|
| peer | bytes | Peer public key |
| alias | string | Node alias (if available) |
| addresses | string | Advertised network addresses |

Key Algorithms

Arrival Percentile

For each message m and peer p:

arrival_pct(m, p) = rank_of_p_in_message_m / total_peers_for_message_m

A peer’s avg_arrival_pct is the mean of arrival_pct across all messages where that peer participated.
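In code, a sketch of both quantities (preprocess.py's actual implementation may differ in tie-breaking and naming):

```python
def arrival_pct(order):
    """order: peers in the order one message arrived.
    Returns {peer: rank / total}, per the formula above."""
    n = len(order)
    return {p: (i + 1) / n for i, p in enumerate(order)}

def avg_arrival_pct(orders, peer):
    """Mean percentile across every message the peer participated in."""
    pcts = [arrival_pct(o)[peer] for o in orders if peer in o]
    return sum(pcts) / len(pcts)

# Two messages: "a" arrives 1st of 3, then 2nd of 2
orders = [["a", "b", "c"], ["b", "a"]]
```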

First-Responder Score

first_responder_score(p) = count(messages where p was first) / total_messages_seen_by_p

Peers with scores significantly above 1/N (where N is the average peer count per message) are flagged as statistical outliers.
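Both the score and the outlier flag can be sketched directly from these definitions (the cutoff factor below is illustrative; the pipeline's actual threshold may differ):

```python
from collections import Counter

def first_responder_scores(arrivals):
    """arrivals: {msg: [peers in arrival order]}.
    score(p) = messages p delivered first / messages p delivered at all."""
    firsts, seen = Counter(), Counter()
    for order in arrivals.values():
        firsts[order[0]] += 1
        for p in order:
            seen[p] += 1
    return {p: firsts[p] / seen[p] for p in seen}

def flag_outliers(scores, avg_peers_per_msg, factor=3.0):
    """Flag peers scoring well above the uniform baseline 1/N."""
    return [p for p, s in scores.items() if s > factor / avg_peers_per_msg]

# Made-up data: "a" is first on all three messages it saw
arrivals = {"m1": ["a", "b", "c"], "m2": ["a", "c"], "m3": ["a", "b"]}
scores = first_responder_scores(arrivals)
```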