Mobile Network Topology GraphDB
Mobile networks are fundamentally relational systems — cell neighbor meshes, transport backhaul hierarchies, and 5G slice service chains are all graphs that relational and time-series stores model poorly at scale. A purpose-built GraphDB layer, operating as a federated topology intelligence plane over existing RAN, Core, and Transport systems, stores relationships and traversal-critical properties in the graph while leaving authoritative attribute detail in source systems. Agents traverse causal chains across domain boundaries in under a second — identifying common transport ancestors behind alarm storms, tracing handover failure paths through neighbor meshes, assessing slice SLA exposure from a single NF fault.
The Structural Problem with Today's Network Data
Mobile networks generate enormous volumes of event, alarm, KPI, and configuration data — but the data is siloed by domain. Your RAN OSS doesn't natively know that a gNodeB backhaul link traverses a specific transport segment that also carries AMF signaling. Your Core NMS doesn't know which cells are co-located neighbors or which UEs are currently handing over between two sites experiencing simultaneous degradation.
The result is the classic correlation gap: Network engineering teams manually stitch together context across four or five different systems, taking 30–90 minutes to build a picture that a graph query can assemble in under a second. This is not a tooling problem — it is a data model problem. The relationships between network entities are first-class information, and they need first-class storage.
Importantly, solving this does not require consolidating all topology data into a single system. The GraphDB can operate as a federated inventory layer — holding identity keys and relationships across domains, while individual source systems retain ownership of their full attribute sets. Traversal-critical properties (band, operational state, role) are replicated into the graph for query performance; everything else is fetched on demand from the authoritative source by reference.
Why the network is natively a graph:
- Cells have overlapping coverage — inherently a graph edge (NEIGHBOR_OF)
- Handover sequences trace paths through adjacency relationships
- Transport paths hop through multiple nodes and logical segments
- Multi-band environments create layered spectrum relationships per site
- 5G slices traverse RAN → Transport → UPF → DN as a graph traversal
- Failure propagation follows topology — downstream nodes relate to upstream faults
- X2/Xn neighbor tables are literal graph edge lists already managed in the RAN
- Federated model means source systems stay authoritative — graph stores relationships, not duplicated inventory
Where non-graph stores fall short:
- TSDB: great for KPI trends, blind to topology relationships
- Elasticsearch: full-text alarm correlation, no causal chain traversal
- Relational DB: join explosions when modeling N:M neighbor meshes
- No native representation of "which cells share the same transport segment"
- Cannot model sector-to-carrier-to-band-to-spectrum-block relationships efficiently
- Multi-hop queries (cell → site → transport ring → AMF) require complex ETL
- Schema migrations are expensive when network topology changes
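The gap is concrete: the "which cells share the same transport segment" question above is a single reverse lookup in a graph model, where a relational store needs an N:M self-join. A minimal Python sketch, with an in-memory adjacency map standing in for the graph store (all IDs are hypothetical):

```python
from collections import defaultdict

# CARRIED_BY edges: cell -> transport segment (all IDs hypothetical)
carried_by = {
    "cell-A1": "seg-09",
    "cell-A2": "seg-09",
    "cell-B1": "seg-17",
}

# Invert the edge set once; the shared-segment question then becomes
# a single lookup instead of a self-join over the whole cell table.
cells_on_segment = defaultdict(set)
for cell, seg in carried_by.items():
    cells_on_segment[seg].add(cell)

def shares_segment_with(cell):
    """All other cells riding the same transport segment as `cell`."""
    return cells_on_segment[carried_by[cell]] - {cell}

print(shares_segment_with("cell-A1"))  # {'cell-A2'}
```

In a real deployment the same question is one Cypher pattern over CARRIED_BY edges; the point is that the relationship itself is the index.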
Should You Build This? — The Honest Assessment
This is a significant platform investment. Before committing, weigh the structural advantages against the real operational and organizational costs. The decision should hinge on your RCA scale problem, your willingness to invest in graph data engineering, and your agentic AI maturity. Critically, the investment profile changes significantly depending on whether you pursue a full consolidation model versus a federated inventory model. The federated approach is almost always the right starting point.
Strong signals to build:
- Network Operations spending >45 min/incident correlating cross-domain context manually
- Operating multi-band (n77+n41+B4/13) with complex inter-frequency neighbor meshes
- Repeated cascading failures — single transport fault triggers 50+ alarms across domains
- Building agentic AI — agents need a structured topology query interface, not free-text alarm logs
- Handover failure RCA requires traversing: cell → neighbor list → X2 interface → transport → shared node
- RAN has 1M+ cells — at this scale, neighbor relationship management is intractable without a purpose-built graph store
- Network slicing (5G SA) maps service chains across 4+ NFs — a natural graph traversal
- Proactively identifying single points of failure in your transport mesh
- Running spectrum refarming — impact radius queries are graph-native
- Federated model: RAN OSS, Transport NMS, and Core NMS remain authoritative — the graph adds relationship intelligence without displacing existing investments
Signals to hold off:
- A small network (<5k cells) — a well-structured relational model may suffice
- No agentic AI roadmap — incremental value may not justify cost
- Source topology systems lack APIs or reliable data quality — graph reflects what you feed it
- No graph-skilled engineering resources — Neo4j requires dedicated expertise
- Existing NMS provides cross-domain correlation at adequate depth
- Topology change velocity is low — pipeline complexity may exceed benefit
- Data governance is immature — incomplete topology data creates false RCA conclusions
- Source systems have poor or no APIs — federated model requires reliable on-demand query capability
- Short-term MTTR improvement may be achieved faster with simpler enrichment pipelines
For agentic RCA at scale across 1M+ cells in multi-band, multi-domain environments, a GraphDB topology plane is not optional — it is the enabling infrastructure. Start with the federated inventory model: the graph owns relationships and identity, source systems own attributes. This limits initial scope, preserves existing system investments, and delivers agent-ready topology traversal without a full consolidation program.
Network Topology as a Graph — Conceptual Model
The critical insight is that every physical and logical relationship in your network maps cleanly to graph primitives. Nodes represent network entities; edges represent relationships with properties describing the nature of that relationship — capacity, state, protocol, neighbor priority, and more.
| Entity / Relationship | Graph Type | Key Properties | Source System |
|---|---|---|---|
| NODE Site / Tower | Site | siteId, address, lat, lon, powerFeed, backhaul type | Network Inventory / PPM |
| NODE gNodeB / eNodeB | RadioNode | nodeId, vendor, swVersion, lat, lon, site, powerClass | RAN OSS / Network Inventory |
| NODE Cell / Sector | Cell / Carrier | cellId, pci, tac, earfcn/nrArfcn, txPower, bandClass, dlBW, ulBW, state, mimoLayers | RAN OSS |
| NODE Spectrum Carrier | Carrier | allowedBandClass, dlBW, ulBW, scs, mimoLayers | RAN OSS |
| NODE Transport Segment | TransportSegment | segmentId, type (fiber/MW/MPLS), capacity, SLA, latency | Transport NMS |
| NODE Transport Node | TransportNode | nodeId, role (CE/PE/P), vendor, swVersion, protectedBy | Transport NMS / IP OSS / Network Inventory |
| NODE 5G CNF | CoreNF | nfType (AMF/SMF/UPF), nfId, plmn, slice, capacity, state | Core NMS / 3GPP O&M |
| NODE Network Slice | Slice | sst, sd, dnn, eMBB/URLLC/mMTC, SLA, subscriberProvision | RAN / NSMF / Core NMS |
| REL NEIGHBOR_OF | Cell → Cell | nrRelType, freq (intra/inter), weight | RAN OSS |
| REL CARRIED_BY | Cell → TransportSegment | interface, vlan, bandwidth, protectionPath | Transport NMS / Network Inventory |
| REL CONNECTED_TO | gNodeB → CoreNF (AMF) | ngInterface, ngState, sctp | Core NMS / RAN OSS |
| REL SERVES_SLICE | Cell → Slice | nssai, admittedUE, configuredCapacity | Core NMS / RAN OSS |
| REL BACKHAULED_VIA | Site → TransportNode | linkType, protection, latencyMsec, bwMbps | Transport NMS(s) |
| REL HAS_CARRIER | Cell → Cell (Carrier) | isPrimary, aggregationType (CA/DC), scgOrMcg | RAN OSS |
| REL COLO_WITH | Cell → Cell | sameAntenna, sameRRH, coTower | Network Inventory |
| REL ANCHORED_BY | NRCell → LTECell | enDCCapable, psCell, scgBearers | RAN OSS |
This schema naturally accommodates multi-band environments: a single site has multiple cells per sector (one per band), each with their own carrier nodes, all connected to the same transport segment and the same core network nodes — creating a rich, queryable topology that is impossible to represent in a star-schema relational model without dozens of joins.
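To make the table concrete, here is a minimal sketch of one multi-band site expressed with the schema's labels and relationship types. Plain Python stands in for the graph store; IDs and property values are hypothetical:

```python
# Nodes keyed by (label, id); edges as (src, relType, dst, props).
# Labels and relationship types follow the schema table above.
nodes = {
    ("Site", "site-001"): {"lat": 40.7, "lon": -74.0},
    ("Cell", "cell-n77-1"): {"bandClass": "n77", "state": "active"},
    ("Cell", "cell-b13-1"): {"bandClass": "B13", "state": "active"},
    ("TransportSegment", "seg-09"): {"type": "fiber", "capacity": 10_000},
}
edges = [
    (("Cell", "cell-n77-1"), "CARRIED_BY", ("TransportSegment", "seg-09"), {"vlan": 120}),
    (("Cell", "cell-b13-1"), "CARRIED_BY", ("TransportSegment", "seg-09"), {"vlan": 121}),
    (("Cell", "cell-n77-1"), "NEIGHBOR_OF", ("Cell", "cell-b13-1"), {"freq": "inter"}),
]

# Both band layers ride the same segment: the layered-spectrum case
# that a star-schema relational model needs many joins to express.
shared = {s for s, rel, d, _ in edges
          if rel == "CARRIED_BY" and d == ("TransportSegment", "seg-09")}
print(len(shared))  # 2
```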
Integration Architecture — From Source Systems to Graph
The architecture positions the GraphDB as a topology intelligence plane — not a replacement for existing OSS/NMS systems, but an integration layer that creates a unified, traversable model of your network. All writes are driven by topology change events; the graph maintains a live representation of network relationships enriched with the minimum node properties required for traversal.
For source systems with mature, well-documented APIs — Transport NMS, Core NMS, Network Inventory — a federated access pattern is preferable to full ingestion. The graph holds only node identity keys and relationship structure. When an agent completes a traversal and needs full attribute context, it calls the owning system directly using the identity key returned by the graph. This keeps the graph lean, avoids dual-maintenance of authoritative data, and means agents interact with source systems as a natural part of their reasoning workflow.
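The federated access pattern can be sketched as follows: the graph holds only the identity key plus traversal-critical properties, and full attribute detail is resolved on demand from the owning system. The resolver functions below are hypothetical stand-ins for real source-system API calls:

```python
from dataclasses import dataclass

@dataclass
class FederatedNode:
    """Graph-resident part of a node: identity key + traversal-critical props."""
    key: str
    label: str
    props: dict  # e.g. {"band": "n77", "opState": "enabled"}

# Hypothetical source-system clients; in practice these would be REST/API
# calls to the owning RAN OSS, Transport NMS, or inventory system.
SOURCE_RESOLVERS = {
    "Cell": lambda key: {"txPower": 43, "tac": 310},
    "TransportSegment": lambda key: {"slaTier": "gold", "latencyMsec": 4.0},
}

def full_attributes(node):
    """Traversal props held in the graph, merged with attribute detail
    fetched on demand from the authoritative source by identity key."""
    return {**node.props, **SOURCE_RESOLVERS[node.label](node.key)}

cell = FederatedNode("cell-n77-1", "Cell", {"band": "n77", "opState": "enabled"})
print(full_attributes(cell)["txPower"])  # 43
```

The design choice is deliberate: the graph stays lean and never becomes a second system of record, while agents still receive a complete attribute set at the end of a traversal.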
[Architecture diagram: an Event Stream and gNMI polling feed a Normalization and Mapping Layer, then Resolution, into the Primary GraphDB (backed by a Change Log and a Registry); RCA Agents, the NOC Dashboard, and Planning consume the graph.]
- gNMI/gRPC — streaming telemetry for real-time state updates (link state, cell state)
- NETCONF/YANG — configuration retrieval for topology bootstrap and change events
- RESTCONF — transport node configuration and topology APIs
- SNMP traps → Kafka — legacy transport equipment state changes
- Vendor OSS APIs — bulk neighbor table export, CM file parsing
- O-RAN O1 — RAN element management, neighbor sync
- Topology truth-source priority — OSS over NMS over inferred relationships
- Change detection must be idempotent — duplicate events cannot corrupt graph
- Soft-delete for removed neighbors — retain historical adjacency for RCA replay
- Classify each node type: graph-resident vs. federated (identity key only)
- Traversal-critical properties (band, opState, role, alarmState) replicated into graph; CM detail and SLA remain in source systems
- Time-stamped relationship properties for handover metric trending
- Anchor cell (LTE) identified explicitly for EN-DC configurations
- Agent tool library must include both graph query tools and source system API tools
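The idempotency and soft-delete rules above can be sketched in a few lines, assuming each change event carries a unique event ID (all names are hypothetical):

```python
import time

graph = {}           # (src, rel, dst) -> properties; stand-in for the graph store
seen_events = set()  # processed event IDs, for idempotency

def apply_neighbor_event(src, dst, op, event_id):
    """Idempotent NEIGHBOR_OF mutation: duplicate deliveries are no-ops,
    removals are soft-deletes so historical adjacency survives for RCA replay."""
    if event_id in seen_events:   # duplicate event cannot corrupt the graph
        return
    seen_events.add(event_id)
    key = (src, "NEIGHBOR_OF", dst)
    if op == "add":
        graph.setdefault(key, {})["deletedAt"] = None
    elif op == "remove" and key in graph:
        graph[key]["deletedAt"] = time.time()  # soft-delete, never hard removal

apply_neighbor_event("cell-A", "cell-B", "add", "evt-1")
apply_neighbor_event("cell-A", "cell-B", "add", "evt-1")     # duplicate: ignored
apply_neighbor_event("cell-A", "cell-B", "remove", "evt-2")
print(graph[("cell-A", "NEIGHBOR_OF", "cell-B")]["deletedAt"] is not None)  # True
```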
Agentic RCA Use Cases — Graph-Powered Intelligence
The following use cases illustrate how an AI agent with access to the GraphDB fundamentally changes the RCA workflow. Each includes the graph traversal pattern that powers it — impossible to replicate efficiently in a non-graph data store.
Use Case 01: Cascading Alarm RCA
When 20+ cell alarms fire simultaneously, the agent queries the graph to find the minimal common ancestor — the upstream transport node or segment shared by the largest subset of alarming cells. If a single transport segment is the common backhaul path for 80% of alarming cells, that segment is the probable root cause, not the cells themselves. The agent suppresses downstream alarms, escalates the transport fault, and auto-populates the incident ticket with the full impact radius.
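A minimal sketch of the common-ancestor logic, assuming each alarming cell's upstream backhaul path has already been read from the graph (the IDs and the 80% threshold are illustrative):

```python
from collections import Counter

# Hypothetical CARRIED_BY / BACKHAULED_VIA chains: cell -> upstream path
backhaul_path = {
    "cell-1": ["seg-09", "node-P3"],
    "cell-2": ["seg-09", "node-P3"],
    "cell-3": ["seg-09", "node-P3"],
    "cell-4": ["seg-17", "node-P3"],
}

def probable_root(alarming_cells, threshold=0.8):
    """Most specific upstream element shared by >= threshold of the
    alarming cells; that element, not the cells, is the RCA candidate."""
    counts = Counter()
    for cell in alarming_cells:
        for depth, elem in enumerate(backhaul_path[cell]):
            counts[(elem, depth)] += 1
    qualifying = [(depth, elem) for (elem, depth), n in counts.items()
                  if n / len(alarming_cells) >= threshold]
    # Smallest depth = closest to the cells = most specific common ancestor.
    return min(qualifying)[1] if qualifying else None

print(probable_root(["cell-1", "cell-2", "cell-3", "cell-4"]))  # node-P3
print(probable_root(["cell-1", "cell-2", "cell-3"]))            # seg-09
```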
Use Case 02: Handover Failure Chain Analysis
The agent identifies cells with degraded HO success rate, then traverses the neighbor graph to determine whether the failure is confined to a single source-target pair (PCI confusion, coverage gap) or is systemic across all neighbors of the target cell. It also checks whether the target and source share the same transport backhaul — a transport fault can masquerade as a handover failure. Multi-band environments require checking both intra-frequency and inter-frequency neighbor relationships separately.
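The confined-vs-systemic decision can be sketched as a simple ratio over inbound neighbor pairs (failure rates and thresholds below are hypothetical; real values would be tuned per network):

```python
# Hypothetical HO failure rates per (source, target) neighbor pair
ho_fail_rate = {
    ("cell-A", "cell-T"): 0.42,
    ("cell-B", "cell-T"): 0.40,
    ("cell-C", "cell-T"): 0.38,
    ("cell-A", "cell-U"): 0.02,
}

def classify_target(target, threshold=0.2):
    """If most inbound neighbor pairs toward `target` are failing,
    the fault is systemic at the target; otherwise pair-specific."""
    inbound = [r for (s, t), r in ho_fail_rate.items() if t == target]
    if not inbound:
        return "no-data"
    failing = sum(r > threshold for r in inbound)
    return "systemic-at-target" if failing / len(inbound) > 0.5 else "pair-specific"

print(classify_target("cell-T"))  # systemic-at-target
print(classify_target("cell-U"))  # pair-specific
```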
Use Case 03: Multi-Band Coverage Hole Detection
In a multi-band environment (n77 for capacity + n41 for peak throughput + B13 for coverage), a coverage hole may appear when the mid-band anchor cell is degraded and fallback to coverage-layer cells fails because the inter-frequency neighbor relationship is misconfigured or missing. The agent traverses the band-layered neighbor graph to identify whether a geographic area's cells have proper inter-frequency and inter-RAT neighbor relationships, flagging gaps where coverage continuity depends on missing or improperly weighted neighbors.
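A sketch of the fallback-continuity check, assuming band-layer tags and NEIGHBOR_OF freq properties have already been read from the graph (all data hypothetical):

```python
# Cells with their band-layer role, plus NEIGHBOR_OF edges with freq type
cells = {
    "cell-n77-1": {"layer": "capacity"},
    "cell-b13-1": {"layer": "coverage"},
}
neighbors = {("cell-n77-1", "cell-b13-1"): {"freq": "inter"}}

def coverage_fallback_ok(capacity_cell, coverage_cells):
    """A capacity-layer cell needs at least one inter-frequency
    neighbor on the coverage layer for fallback continuity."""
    return any((capacity_cell, c) in neighbors
               and neighbors[(capacity_cell, c)]["freq"] == "inter"
               for c in coverage_cells)

print(coverage_fallback_ok("cell-n77-1", ["cell-b13-1"]))  # True
```

A production agent would run this per geographic cluster and weight the result by neighbor priority, not just existence.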
Use Case 04: Slice SLA Impact Assessment
When a UPF instance degrades, the agent traverses the slice graph to determine which network slices are served by that UPF, then traces which cells are configured to offer those slices, then determines the subscriber count and enterprise SLA commitments at risk. The cross-domain traversal (Core NF → Slice → Cell → Site → Subscriber segment) takes under 1 second in the graph but would require 4–5 system queries and manual correlation without it. The output drives automated SLA notification and remediation prioritization.
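Once the slice graph is loaded, the Core NF → Slice → Cell traversal reduces to chained reverse-index lookups. A hedged sketch with hypothetical data:

```python
# Reverse indexes over the slice graph (all IDs and SLA tiers hypothetical)
slices_on_upf = {"upf-3": ["slice-embb-1", "slice-urllc-2"]}
cells_on_slice = {"slice-embb-1": ["cell-1", "cell-2"],
                  "slice-urllc-2": ["cell-2"]}
sla_of_slice = {"slice-embb-1": "silver", "slice-urllc-2": "platinum"}

def impact_of(upf):
    """Core NF -> Slice -> Cell traversal: which slices and cells are
    exposed when this UPF degrades, highest SLA commitment first."""
    ordered = sorted(slices_on_upf.get(upf, []),
                     key=lambda s: sla_of_slice[s] != "platinum")
    return [(s, sla_of_slice[s], sorted(set(cells_on_slice[s])))
            for s in ordered]

print(impact_of("upf-3")[0][0])  # slice-urllc-2
```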
Use Case 05: Transport SPOF Risk Identification
Without a graph, identifying transport single points of failure requires a network architect to manually trace topologies. The agent periodically traverses the transport graph to find segments or nodes that, if removed, would disconnect the largest number of RAN sites from core. It weights by subscriber population and SLA tier to rank remediation priority. This proactive use case — converting graph centrality analysis into a risk report — is a zero-human-effort operation once the graph is live, turning topology intelligence into a continuous reliability program.
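The SPOF analysis can be approximated by removing each transport node in turn and counting the sites that lose their path to core. A brute-force BFS sketch on a toy topology (a production deployment would use graph-native articulation-point or centrality algorithms, e.g. via Neo4j GDS):

```python
from collections import defaultdict, deque

# Undirected transport adjacency; "core" is the core-network side (hypothetical)
adj = defaultdict(set)
for a, b in [("core", "P1"), ("P1", "PE1"), ("P1", "PE2"),
             ("PE1", "site-A"), ("PE2", "site-B"), ("PE2", "site-C")]:
    adj[a].add(b)
    adj[b].add(a)
sites = {"site-A", "site-B", "site-C"}

def stranded_sites(removed):
    """Sites unreachable from core if `removed` fails (BFS excluding it)."""
    seen, queue = {removed, "core"}, deque(["core"])
    while queue:
        for nxt in adj[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return sites - seen

# Rank transport nodes by blast radius; subscriber/SLA weighting omitted here.
spof_rank = sorted(((len(stranded_sites(n)), n)
                    for n in adj if n not in sites | {"core"}), reverse=True)
print(spof_rank[0])  # (3, 'P1')
```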
A Technology Stack
- Neo4j (graph database): preferred for Cypher expressiveness, APOC procedures, and the GDS (Graph Data Science) library — critical for the centrality, community detection, and path analysis used in RCA agents.
- Kafka (event backbone): topology events topic with schema validation via Confluent Schema Registry. Separate topics for RAN, Core, and Transport topology changes. Change Data Capture (CDC) pattern for OSS/NMS integration.
- Flink / Kafka Streams (stream processing): stateful stream processing for topology normalization, deduplication, and graph mutation generation. Flink preferred for windowed join logic across domain events; Kafka Streams for simpler topologies.
- Network interfaces: gNMI streaming for real-time state (cell state, link state). NETCONF/YANG for topology bootstrap. Vendor adapters required for Ericsson ENMIQ, Nokia NetAct, Samsung SON.
- Agent framework: graph-aware RCA agents with Neo4j tool calls as agent capabilities. Each agent exposes Cypher query templates as tools. LLM layer (Claude / GPT-5) for natural-language alarm narrative generation and recommendation synthesis.
- Complementary stores: TSDB (InfluxDB/Prometheus) for KPI time series linked by nodeId. Elasticsearch for alarm text. Apache Iceberg for historical topology snapshots. The GraphDB stores relationships; other stores handle their native data types.
Implementation Roadmap
Phase 1: Foundation & Graph Bootstrap
- Deploy Neo4j cluster
- Define graph schema — nodes, relationships, property contracts
- Federated vs. consolidated evaluation — assess each source system's API maturity and data quality
- RAN Physical topology via Planning Tool
- RAN Element topology bootstrap via NETCONF/YANG bulk export
- Neighbor table import from Network Element Managers
- Transport topology import from NMS APIs
- Core NF topology from 5GC O&M interfaces
- Kafka topology events pipeline — RAN topology changes
- Data quality validation framework
- Basic Cypher query library for Operations Tooling
Phase 2: Real-Time State & Open-Loop Agents
- gNMI streaming integration — cell/link state → graph properties
- FM enrichment — alarm state written to node properties
- Transport state sync from NMS
- Flink normalization layer for cross-domain joins
- Cascading alarm RCA agent (Use Case 01)
- Handover failure chain agent (Use Case 02)
- Agent recommendations surfaced in NOC dashboard
- Human-in-the-loop validation workflow
- KPI → TSDB linkage by nodeId for enriched context
Phase 3: Advanced Intelligence & Closed-Loop
- Multi-band coverage hole agent (Use Case 03)
- Slice SLA impact agent (Use Case 04)
- SPOF risk identification — scheduled graph analysis (Use Case 05)
- Neo4j GDS — centrality and community detection
- Historical topology snapshots (Iceberg) for RCA replay
- Trust-gated autonomous remediation (neighbor add/delete)
- Closed-loop transport rerouting recommendations
- Agent performance measurement — MTTR delta, false positive rate
Graph data quality is the single largest risk to this program. A neighbor list that is 85% complete produces RCA conclusions that are correct 85% of the time — which erodes NOC trust rapidly. Invest heavily in Phase 1 data validation before building agents on top of it. A graph with known, bounded incompleteness is far more valuable than one with unknown gaps.
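A sketch of the kind of bounded-incompleteness check Phase 1 should produce, assuming a planning-derived minimum neighbor degree (the floor value and counts are illustrative assumptions):

```python
# Imported neighbor degree per cell vs. a planning-derived expected floor
neighbor_count = {"cell-1": 8, "cell-2": 0, "cell-3": 6, "cell-4": 7}
expected_min = 4  # assumed floor from RF planning, for illustration only

def completeness_report(counts, floor):
    """Flag cells whose neighbor degree falls below the planning floor,
    so incompleteness is known and bounded before agents rely on the graph."""
    suspect = sorted(c for c, n in counts.items() if n < floor)
    coverage = 1 - len(suspect) / len(counts)
    return {"coverage": coverage, "suspect_cells": suspect}

report = completeness_report(neighbor_count, expected_min)
print(report["suspect_cells"])  # ['cell-2']
```

Publishing this report alongside agent output is what converts "unknown gaps" into "known, bounded incompleteness" for the NOC.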
Decision Framework Summary
Build when:
- 1M+ cells with complex multi-band neighbor meshes
- Active agentic AI / automation roadmap for NOC operations
- Documented cross-domain RCA gaps adding >30 min to MTTR
- 5G SA or NSA with network slicing SLA commitments
- Engineering capacity for graph data pipeline development
- Willingness to invest 6–9 months before agent value is realized
- Executive sponsorship for a multi-year network intelligence program
Hold off when:
- Network is small or topology is relatively static and simple
- No agentic AI roadmap in the next 18 months
- Source system data quality is poor or API access is limited
- No engineering resources with graph/data pipeline experience
- Current NMS provides adequate cross-domain visibility
- Budget constraints require single-system consolidation first
- Simpler enrichment layer on Elasticsearch can close the gap short-term
The GraphDB topology plane is not a nice-to-have for large-scale agentic network operations at 1M+ cells — it is the foundational data structure that makes cross-domain RCA traversal computationally tractable. Every agentic use case described above degrades to a slower, less reliable version of itself without it. The investment is justified precisely because mobile network relationships are the data, not metadata — and graphs are the only model that treats them as such.