Graph Metric Backends

Hiveplotlib computes graph metrics through networkx, which supports dispatchable backends: plugin packages like nx-parallel (CPU parallelism) and nx-cugraph (GPU) that provide accelerated implementations of networkx algorithms. On large graphs, routing metric computation through one of these backends can speed it up considerably.

The graph_metric_backend parameter requests a backend by name. It is available on HivePlot, HivePlotMatrix, and the standalone compute_graph_metrics() function. This notebook demonstrates the parameter, per-metric overrides, surfacing fallback notices with logging, and networkx’s own backend priority configuration as an alternative route.

Note: this notebook requires Hiveplotlib to be installed with its networkx extra, which can be done by running:

pip install hiveplotlib[networkx]

Backend dispatch requires networkx>=3.2. The demonstrations below also use nx-parallel:

pip install nx-parallel

graph_metric_backend accepts the registered name of any installed dispatchable backend, not just nx-parallel.

[1]:
import logging

import matplotlib.pyplot as plt
import networkx as nx
from hiveplotlib import HivePlot
from hiveplotlib.converters import networkx_to_nodes_edges
from hiveplotlib.exceptions import InvalidGraphMetricBackendError
from hiveplotlib.graph_features import compute_graph_metrics
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

We will use Zachary’s Karate Club graph below. It is far too small to need an accelerated backend; we use it only to demonstrate the mechanics.

[2]:
G = nx.karate_club_graph()
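
Whether a backend pays off depends on the graph size and the metric, so it is worth measuring on the graph at hand. The sketch below is a hypothetical helper (`best_time` is not part of Hiveplotlib or networkx) built on the standard-library `timeit` module. It reports the fastest of a few single runs of any zero-argument callable, and uses a stand-in workload so that it runs without extra packages.

```python
import timeit
from typing import Callable


def best_time(func: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time (in seconds) over `repeats` single runs."""
    return min(timeit.repeat(func, number=1, repeat=repeats))


# stand-in workload (an assumption for illustration, not a graph metric)
elapsed = best_time(lambda: sum(i * i for i in range(100_000)))
print(f"fastest of 3 runs: {elapsed:.4f} s")
```

With networkx and nx-parallel installed, comparing `best_time(lambda: nx.betweenness_centrality(G))` against `best_time(lambda: nx.betweenness_centrality(G, backend="parallel"))` on a representative graph gives a rough sense of the speedup. On a graph as small as the Karate Club, the overhead of parallelism may well outweigh any gain.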

The graph_metric_backend Parameter

Passing the graph_metric_backend parameter at HivePlot initialization routes every requested graph metric through the named backend. Backends register under names that differ from their pip package names: nx-parallel registers as "parallel", nx-cugraph as "cugraph".

[3]:
hp = HivePlot(
    graph=G,
    partition_variable="club",
    sorting_variables="betweenness_centrality",
    node_graph_metrics="betweenness_centrality",
    graph_metric_backend="parallel",
)
hp.nodes.data.head()
[3]:
   unique_id    club  betweenness_centrality
0          0  Mr. Hi                0.437635
1          1  Mr. Hi                0.053937
2          2  Mr. Hi                0.143657
3          3  Mr. Hi                0.011909
4          4  Mr. Hi                0.000631

The backend changes which engine does the work, not the result; the computed column matches what default networkx produces. The value passed at construction is also stored, so later HivePlot.compute_graph_metrics() calls on this hive plot reuse "parallel" unless overridden per call.

The degree, in_degree, and out_degree metrics are direct structural reads of the graph, so backend dispatch does not apply to them: a global graph_metric_backend is silently ignored for these metrics, and the result is identical either way.
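
The claim that a backend changes the engine but not the result can be spot-checked. Parallel implementations may sum floating-point values in a different order, so a tolerance-based comparison is safer than strict equality. The helper below is a hypothetical sketch (`metrics_match` is not part of Hiveplotlib), and the two dictionaries are illustrative stand-ins for per-node metric results rather than real backend output.

```python
import math
from typing import Hashable, Mapping


def metrics_match(
    reference: Mapping[Hashable, float],
    candidate: Mapping[Hashable, float],
    rel_tol: float = 1e-9,
    abs_tol: float = 1e-12,
) -> bool:
    """Return True when both results cover the same nodes with near-equal values."""
    if reference.keys() != candidate.keys():
        return False
    return all(
        math.isclose(
            reference[node], candidate[node], rel_tol=rel_tol, abs_tol=abs_tol
        )
        for node in reference
    )


# illustrative values only; the second dict mimics tiny floating-point drift
default_result = {0: 0.437635, 1: 0.053937, 2: 0.143657}
backend_result = {0: 0.437635 + 1e-15, 1: 0.053937, 2: 0.143657}
print(metrics_match(default_result, backend_result))
```

In practice, the two inputs could come from calling the same networkx function with and without its `backend` keyword, for example `nx.betweenness_centrality(G)` and `nx.betweenness_centrality(G, backend="parallel")`.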

Backend Names Are Validated Up Front

Unknown or uninstalled backend names raise an InvalidGraphMetricBackendError before any metric is computed. The common stumble is passing the pip package name where the registry name belongs:

[4]:
import traceback

try:
    HivePlot(
        graph=G,
        partition_variable="club",
        sorting_variables="degree_centrality",
        node_graph_metrics="degree_centrality",
        graph_metric_backend="nx-parallel",  # should be "parallel"
    )
except InvalidGraphMetricBackendError:
    traceback.print_exc()
Traceback (most recent call last):
  File "/tmp/ipykernel_1896915/355322967.py", line 4, in <module>
    HivePlot(
  File "/home/garyk/repos/hiveplotlib/src/hiveplotlib/hiveplot.py", line 2264, in __init__
    self._apply_graph_metrics(
  File "/home/garyk/repos/hiveplotlib/src/hiveplotlib/hiveplot.py", line 2478, in _apply_graph_metrics
    new_nodes, new_edges = compute_graph_metrics(
                           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/garyk/repos/hiveplotlib/src/hiveplotlib/graph_features/__init__.py", line 731, in compute_graph_metrics
    _validate_graph_metric_backends(
  File "/home/garyk/repos/hiveplotlib/src/hiveplotlib/graph_features/__init__.py", line 421, in _validate_graph_metric_backends
    raise InvalidGraphMetricBackendError(msg)
hiveplotlib.exceptions.hive_plot.InvalidGraphMetricBackendError: Unknown graph metric backend(s) ['nx-parallel']. Installed dispatchable backends: ['parallel']. Note that backends register under names that differ from their pip package names (nx-parallel registers as 'parallel', nx-cugraph as 'cugraph').
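
To see which registry names are available before constructing a hive plot, the installed backends can be listed directly. networkx discovers backends through the `networkx.backends` entry-point group, so the standard library is enough. The sketch below is not Hiveplotlib's own validation logic: both helper functions are hypothetical, and the close-match suggestion via `difflib` is an added convenience, not something the library is known to do.

```python
import difflib
from importlib.metadata import entry_points
from typing import List, Optional


def installed_backend_names() -> List[str]:
    """Names registered under the 'networkx.backends' entry-point group."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        group = eps.select(group="networkx.backends")
    else:  # older Pythons return a plain dict of groups
        group = eps.get("networkx.backends", [])
    return sorted({ep.name for ep in group})


def suggest_backend_name(requested: str, installed: List[str]) -> Optional[str]:
    """Return `requested` if installed, else the closest installed name, if any."""
    if requested in installed:
        return requested
    matches = difflib.get_close_matches(requested, installed, n=1)
    return matches[0] if matches else None


# the list depends on what is installed in the current environment
print(installed_backend_names())

# a pip package name passed by mistake resolves to the registry name
print(suggest_backend_name("nx-parallel", ["parallel", "cugraph"]))
```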

Per-Metric Overrides

A per-metric "backend" entry in node_graph_metric_kwargs overrides graph_metric_backend for that one metric, whatever graph_metric_backend is set to. In particular, an explicit "backend": None opts that single metric back to default networkx even when graph_metric_backend requests a backend for everything else. The edge_graph_metric_kwargs parameter (and edge_metric_kwargs on compute_graph_metrics()) carries the same reserved "backend" key for edge metrics.

[5]:
hp_override = HivePlot(
    graph=G,
    partition_variable="club",
    sorting_variables="degree",
    node_graph_metrics=[
        "degree",
        "betweenness_centrality",
        "pagerank",
    ],
    # opt pagerank back to default networkx
    node_graph_metric_kwargs={"pagerank": {"backend": None}},
    edge_graph_metrics="edge_betweenness_centrality",
    # the edge side carries the same reserved "backend" key
    edge_graph_metric_kwargs={
        "edge_betweenness_centrality": {"backend": None},
    },
    graph_metric_backend="parallel",
)
hp_override.nodes.data.head()
[5]:
   unique_id    club  degree  betweenness_centrality  pagerank
0          0  Mr. Hi      16                0.437635  0.088508
1          1  Mr. Hi       9                0.053937  0.057415
2          2  Mr. Hi      10                0.143657  0.062767
3          3  Mr. Hi       6                0.011909  0.037212
4          4  Mr. Hi       3                0.000631  0.020504

In this call:

  • betweenness_centrality dispatches to "parallel".

  • pagerank and edge_betweenness_centrality run on default networkx via their explicit None entries.

  • degree is a direct structural read and never dispatches.

A per-metric "backend" entry can also name a specific backend, not just opt out with None. For example, {"pagerank": {"backend": "cugraph"}} would send that one metric to a GPU backend while the rest follow the call’s graph_metric_backend.

Backends are chosen in this order of precedence (highest first):

  1. A per-metric "backend" entry.

  2. The per-call graph_metric_backend.

  3. The graph_metric_backend value stored at the HivePlot / HivePlotMatrix initialization.
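
The precedence order above can be sketched as a small resolution function. This is an illustration of the rules as described in this notebook, not Hiveplotlib's actual implementation: `resolve_backend`, `UNSET`, and `STRUCTURAL_METRICS` are hypothetical names, and using a sentinel to tell an omitted per-call argument apart from an explicit None is an assumption.

```python
# sentinel (an assumption): distinguishes "no value given" from an explicit None
UNSET = object()

# direct structural reads that never dispatch, per the text above
STRUCTURAL_METRICS = {"degree", "in_degree", "out_degree"}


def resolve_backend(metric, metric_kwargs=None, per_call=UNSET, stored=None):
    """Pick the backend for one metric; None means default networkx."""
    if metric in STRUCTURAL_METRICS:
        return None
    kwargs = (metric_kwargs or {}).get(metric, {})
    if "backend" in kwargs:  # 1. per-metric entry, explicit None included
        return kwargs["backend"]
    if per_call is not UNSET:  # 2. per-call graph_metric_backend
        return per_call
    return stored  # 3. value stored at initialization


overrides = {"pagerank": {"backend": None}}
for name in ["degree", "betweenness_centrality", "pagerank"]:
    print(name, "->", resolve_backend(name, overrides, per_call="parallel"))
```

Checking for the presence of the "backend" key, rather than for a non-None value, is what lets an explicit None act as an opt-out instead of being mistaken for "no override".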

Surfacing Fallback Notices with Logging

Not every backend implements every metric. When the requested backend does not implement a metric, Hiveplotlib falls back to default networkx for that metric. The library is silent by default, but one line of standard logging configuration makes the fallback notices visible:

[6]:
logging.basicConfig(level=logging.INFO)
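
`logging.basicConfig` configures the root logger, so it also surfaces INFO messages from every other library in the session. A narrower alternative is sketched below, using only the standard library: it attaches a handler to the one logger whose name appears in this notebook's log output, `hiveplotlib.graph_features`. Whether other Hiveplotlib modules log under different names is not covered here.

```python
import logging
import sys

# logger name taken from the log lines shown in this notebook
hpl_logger = logging.getLogger("hiveplotlib.graph_features")
hpl_logger.setLevel(logging.INFO)

# print records in the same LEVEL:name:message shape as basicConfig's default
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
hpl_logger.addHandler(handler)

# avoid duplicate lines if the root logger also has a handler
hpl_logger.propagate = False
```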

The standalone compute_graph_metrics() function accepts the same graph_metric_backend parameter. At the time of writing, nx-parallel does not implement pagerank, so requesting it under the "parallel" backend triggers an INFO-level log line:

[7]:
nodes, edges = networkx_to_nodes_edges(G)

new_nodes, new_edges = compute_graph_metrics(
    G,
    node_metrics="pagerank",
    target_nodes=nodes,
    target_edges=edges,
    graph_metric_backend="parallel",
)
new_nodes.data.head()
INFO:hiveplotlib.graph_features:Metric 'pagerank' is not implemented by the 'parallel' backend for these arguments; falling back to default networkx.
[7]:
   unique_id    club  pagerank
0          0  Mr. Hi  0.088508
1          1  Mr. Hi  0.057415
2          2  Mr. Hi  0.062767
3          3  Mr. Hi  0.037212
4          4  Mr. Hi  0.020504

The result is still correct, since the fallback computed pagerank on default networkx; we simply do not get a speedup for that one metric. To also see successful dispatches to the requested backend, we can set the log level to DEBUG:

[8]:
hpl_logger = logging.getLogger("hiveplotlib.graph_features")
hpl_logger.setLevel(logging.DEBUG)

nodes, edges = networkx_to_nodes_edges(G)

new_nodes, new_edges = compute_graph_metrics(
    G,
    node_metrics="betweenness_centrality",
    target_nodes=nodes,
    target_edges=edges,
    graph_metric_backend="parallel",
)

hpl_logger.setLevel(logging.INFO)  # restore so later cells stay quiet
new_nodes.data.head()
DEBUG:hiveplotlib.graph_features:Metric 'betweenness_centrality' dispatched to the 'parallel' backend.
[8]:
   unique_id    club  betweenness_centrality
0          0  Mr. Hi                0.437635
1          1  Mr. Hi                0.053937
2          2  Mr. Hi                0.143657
3          3  Mr. Hi                0.011909
4          4  Mr. Hi                0.000631
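
The cell above raises the log level and then restores it by hand to a hard-coded INFO. A context manager is one way to make that pattern harder to get wrong, since it restores whatever level was set before, even if the metric computation raises. The sketch below uses only the standard library; `temporary_log_level` is a hypothetical helper, not part of Hiveplotlib.

```python
import contextlib
import logging


@contextlib.contextmanager
def temporary_log_level(logger_name: str, level: int):
    """Set a logger's level inside the `with` block, then restore the old one."""
    logger = logging.getLogger(logger_name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(previous)


with temporary_log_level("hiveplotlib.graph_features", logging.DEBUG) as log:
    # metric computation would go here; a stand-in message is logged instead
    log.debug("stand-in message emitted while DEBUG is active")
```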

NetworkX’s Backend Priority Configuration

networkx itself allows us to change the default backend for all dispatchable graph metrics by setting the NETWORKX_BACKEND_PRIORITY environment variable before networkx is imported.

For example, if we wanted to make nx-parallel the default backend, we could run:

export NETWORKX_BACKEND_PRIORITY=parallel

Or, we can equivalently run the following at the start of our Python code:

nx.config.backend_priority.algos = ["parallel"]

networkx will then try this backend first for each dispatchable call, Hiveplotlib’s metric computations included, and silently run its default implementation when the backend does not implement the function. This route trades control for convenience relative to the graph_metric_backend parameter: no up-front validation of backend names, no per-metric control, and silent fallbacks with no log lines.
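
Because this route has no up-front validation, it can help to add a small guard when setting the environment variable from Python. The sketch below sets `NETWORKX_BACKEND_PRIORITY` only if networkx has not been imported yet and the requested name is registered under the `networkx.backends` entry-point group. Both helper functions are hypothetical and use only the standard library.

```python
import os
import sys
from importlib.metadata import entry_points


def backend_is_installed(name: str) -> bool:
    """Check the 'networkx.backends' entry-point group for a registered name."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        group = eps.select(group="networkx.backends")
    else:  # older Pythons return a plain dict of groups
        group = eps.get("networkx.backends", [])
    return any(ep.name == name for ep in group)


def prefer_backend(name: str) -> bool:
    """Set NETWORKX_BACKEND_PRIORITY when safe; report whether it was set."""
    if "networkx" in sys.modules:
        return False  # too late: the variable must be set before import
    if not backend_is_installed(name):
        return False  # unknown name: leave the environment untouched
    os.environ["NETWORKX_BACKEND_PRIORITY"] = name
    return True


if prefer_backend("parallel"):
    print("networkx will try 'parallel' first once imported")
else:
    print("left NETWORKX_BACKEND_PRIORITY untouched")
```

This must run before the first `import networkx` (or any import that pulls networkx in, such as Hiveplotlib's graph features) to have an effect.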

Plotting the Computed Metrics

A backend changes how metrics are computed, not what they are; the resulting columns plot like any other node or edge data. Below, we sort each axis by betweenness_centrality and color edges by edge_betweenness_centrality, both computed via nx-parallel. The layout sets repeat_axes=True so that intra-group edges can be drawn (see Adding and Modifying Repeat Axes).

[9]:
hp_viz = HivePlot(
    graph=G,
    partition_variable="club",
    sorting_variables="betweenness_centrality",
    node_graph_metrics="betweenness_centrality",
    graph_metric_backend="parallel",
    repeat_axes=True,
)

# with two groups and repeat axes, one inter-group edge set is redundant,
# so we drop one
hp_viz.reset_edges(axis_id_1="Mr. Hi_repeat", axis_id_2="Officer")

# the stored "parallel" backend from construction is reused here
hp_viz.compute_graph_metrics(
    edge_graph_metrics="edge_betweenness_centrality",
)
[10]:
edge_coloring_kwargs = {
    "cmap": "cividis",
    "clim": (0, 0.13),
    "alpha": 1,
}

# color edges by edge_betweenness_centrality value
hp_viz.update_edge_plotting_keyword_arguments(
    array="edge_betweenness_centrality",
    **edge_coloring_kwargs,
)

fig, ax = hp_viz.plot()

# add custom colorbar to plot
fig.colorbar(
    ScalarMappable(
        norm=Normalize(*edge_coloring_kwargs["clim"]),
        cmap=edge_coloring_kwargs["cmap"],
    ),
    orientation="horizontal",
    ax=ax,
    shrink=0.7,
    label="Edge Betweenness Centrality",
    pad=-0.2,
)

ax.set_title(
    "Karate Club, axes sorted by node betweenness centrality,\n"
    "edges colored by edge betweenness centrality",
    y=0.8,
    size=18,
)
plt.show()
[Figure: hive plot of the Karate Club graph, axes sorted by node betweenness centrality, edges colored by edge betweenness centrality]

For more on the available node and edge metrics and their per-metric keyword arguments, see the Computing Graph Metrics page.

For working with networks large enough to benefit from an accelerated backend, see the Hive Plots for Large Networks tutorial.