Hive Plot Matrix: Generic Constructor#

This notebook covers the features and options for the generic HivePlotMatrix class, which builds a matrix from independently constructed HivePlot instances when per-cell flexibility is required.

For additional discussion motivating Hive Plot Matrices (HPMs) and the different HPM options, see the Hive Plot Matrices tutorial.

[1]:
import matplotlib.pyplot as plt
from hiveplotlib import HivePlot, HivePlotMatrix
from hiveplotlib.datasets import example_hpm_nodes_and_edges

We will base this discussion on the following toy dataset:

[2]:
nodes, edges = example_hpm_nodes_and_edges(edge_tag_counts={"official": 90})
[3]:
nodes.data.head()
[3]:
   unique_id group    value1    value2    value3
0          0     A  2.579853  7.447622  8.894677
1          1     A  1.462928  9.675097  8.236987
2          2     A  2.861993  3.258254  8.550787
3          3     A  2.324560  3.704597  9.216663
4          4     A  0.313924  4.695558  8.782394
[4]:
edges
[4]:
hiveplotlib.Edges of 90 edges.
[5]:
edges.data.head()
[5]:
   from  to
0     2  23
1    19  13
2    12  25
3     2  20
4     6   2

Building Individual HivePlots#

Each cell’s HivePlot is constructed independently. Let’s create four that each emphasize a different group:

[6]:
# group A on the first axis
#  sorted by value1
hp_a = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["A", "B", "C"],
    all_edge_kwargs={"color": "steelblue", "alpha": 0.4},
)

# group B on the first axis instead
#  change edge color
hp_b = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["B", "C", "A"],
    all_edge_kwargs={"color": "darkorange", "alpha": 0.4},
)

# group C on the first axis instead
#  change edge color
hp_c = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["C", "A", "B"],
    all_edge_kwargs={"color": "royalblue", "alpha": 0.4},
)

# group A on the first axis
#  sorted by value3 instead
#  change edge color
hp_d = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value3",
    axes_order=["A", "B", "C"],
    all_edge_kwargs={"color": "darkgray", "alpha": 0.4},
)

List-of-Lists Input Format#

The simplest input format is a list of lists of HivePlot instances, where each inner list is a row in the resulting matrix.

Let’s arrange our four hive plots into a 2×2 grid:

[7]:
hpm = HivePlotMatrix(
    hive_plots=[[hp_a, hp_b], [hp_c, hp_d]],
    row_labels=["Top", "Bottom"],
    col_labels=["Left", "Right"],
)
fig, axes = hpm.plot()
plt.show()
../_images/notebooks_hpm_generic_10_0.png

Each HivePlotMatrix instance exposes a few properties describing the matrix:

[8]:
print("matrix_type:", hpm.matrix_type)
print("shape:", hpm.shape)
print("backend:", hpm._backend)
matrix_type: generic
shape: (2, 2)
backend: matplotlib

Sparse Layouts with None Cells#

What if we don’t want or need every position filled? We can pass None for any cell to leave it empty.

The grid size is determined by the full list-of-lists shape, and None positions render as blank axes when plotting:

[9]:
hpm_sparse = HivePlotMatrix(
    hive_plots=[[hp_a, hp_b], [hp_c, None]],
    col_labels=["Group A focus", "Group B focus"],
    row_labels=["Row 0", "Row 1"],
)
fig, axes = hpm_sparse.plot()
plt.show()
../_images/notebooks_hpm_generic_14_0.png

Dictionary Input Format#

For programmatically determined layouts, we can instead pass a dictionary mapping (row, col) integer tuples to HivePlot instances. Unspecified positions are automatically None, and the grid size is inferred from the maximum row and column indices:

[10]:
hpm_dict = HivePlotMatrix(
    hive_plots={
        (0, 0): hp_a,
        (0, 2): hp_b,
        (1, 1): hp_c,
        (1, 2): hp_d,
    },
    col_labels=["Left", "Center", "Right"],
    row_labels=["Top", "Bottom"],
)
fig, axes = hpm_dict.plot()
plt.show()
../_images/notebooks_hpm_generic_16_0.png
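
To make the shape inference described above concrete, here is a minimal plain-Python sketch. It is not hiveplotlib's implementation: the helper name dict_to_grid is hypothetical, and strings stand in for HivePlot instances.

```python
def dict_to_grid(cells):
    """Expand a {(row, col): value} mapping into a dense list of lists.

    Hypothetical helper for illustration only, not part of hiveplotlib.
    """
    if not cells:
        return []
    # grid size comes from the largest row / column index present
    n_rows = max(r for r, _ in cells) + 1
    n_cols = max(c for _, c in cells) + 1
    # unspecified positions default to None
    return [[cells.get((r, c)) for c in range(n_cols)] for r in range(n_rows)]


# strings stand in for HivePlot instances
layout = {(0, 0): "hp_a", (0, 2): "hp_b", (1, 1): "hp_c", (1, 2): "hp_d"}
grid = dict_to_grid(layout)
shape = (len(grid), len(grid[0]))

print(shape)  # (2, 3)
for row in grid:
    print(row)
```

The same four entries that produced the 2×3 matrix above yield two rows of three cells here, with None at (0, 1) and (1, 0).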

Cell Access#

Individual cells can be retrieved by index, and the iter_populated_cells() method provides a convenient iterator over all non-None cells. Let’s inspect the matrix we built above from a dictionary that includes None cells:

[11]:
# retrieve a single cell
print("Single Cell:")
cell = hpm_dict[0, 0]
print("hpm[0, 0]:", cell, "\n")

# inspect matrix properties
print("Matrix properties:")
print("shape:", hpm_dict.shape)
print("matrix_type:", hpm_dict.matrix_type)
print()

# Iterate over populated cells
print("Populated cells:")
for r, c, hp in hpm_dict.iter_populated_cells():
    print(f"  ({r}, {c}): axes = {list(hp.axes.keys())}")
Single Cell:
hpm[0, 0]: hiveplotlib.HivePlot: 30 nodes, axes=['A', 'B', 'C'], 90 edges, partition='group', sort='value1', backend='matplotlib'

Matrix properties:
shape: (2, 3)
matrix_type: generic

Populated cells:
  (0, 0): axes = ['A', 'B', 'C']
  (0, 2): axes = ['B', 'C', 'A']
  (1, 1): axes = ['C', 'A', 'B']
  (1, 2): axes = ['A', 'B', 'C']
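
The iteration order in the output above is consistent with a row-major scan that skips None cells. A minimal sketch of that pattern, assuming a list-of-lists grid with strings standing in for HivePlot instances (the function name iter_populated is hypothetical, and hiveplotlib's own iterator may differ in detail):

```python
def iter_populated(grid):
    """Yield (row, col, value) for each non-None cell, in row-major order.

    Stand-in for illustration only, not hiveplotlib's implementation.
    """
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is not None:
                yield r, c, value


# strings stand in for HivePlot instances
grid = [["hp_a", None, "hp_b"], [None, "hp_c", "hp_d"]]
populated = list(iter_populated(grid))

for r, c, value in populated:
    print(f"({r}, {c}): {value}")
```

Writing the iterator as a generator means callers never have to check for None themselves.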

Drilling Down on a Single Hive Plot in an HPM#

We can take a copy of a hive plot cell and explore further changes without disrupting the existing HPM. For example, we can switch to an interactive Hiveplotlib-supported back end like bokeh. Note, however, that the code below will only run if Hiveplotlib is installed with the bokeh dependencies:

pip install hiveplotlib[bokeh]

[12]:
from bokeh.io import output_notebook
from bokeh.plotting import show
from bokeh.resources import INLINE

output_notebook(resources=INLINE)

cell_hp = hpm[0, 0].copy()
cell_hp.set_viz_backend("bokeh")
show(cell_hp.plot())

If we had found anomalous nodes or edges, for example, we could use the hover tool support with the bokeh back end to find the relevant node or edge IDs.

Unified Axis Scale with unify_axes#

The axis scaling choices made for each individual HivePlot instance (i.e. the vmin and vmax for each hive plot axis) are by default preserved when constructing a generic HivePlotMatrix. If these values were never explicitly set, then each hive plot axis is auto-scaled to the data range of the nodes assigned to it.

In other words, the generic HivePlotMatrix instantiation will not change anything about your existing HivePlot instances by default.

If we want to compare node positions across cells on the same scale, however, we can choose to instantiate the HivePlotMatrix with unify_axes=True, which ensures every axis on every HivePlot has the same range.

[13]:
# default: each cell was built with independent per-cell axis scaling
fig, axes = hpm.plot()
fig.suptitle("Default: unify_axes=False", y=1.02, size=16)
plt.show()
../_images/notebooks_hpm_generic_24_0.png
[14]:
# unified: unify_axes=True auto-computes a shared range across all four cells
hpm_unified = HivePlotMatrix(
    hive_plots=[[hp_a.copy(), hp_b.copy()], [hp_c.copy(), hp_d.copy()]],
    row_labels=["Top", "Bottom"],
    col_labels=["Left", "Right"],
    unify_axes=True,
)
fig, axes = hpm_unified.plot()
fig.suptitle("unify_axes=True", y=1.02, size=16)
plt.show()
../_images/notebooks_hpm_generic_25_0.png
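
The difference between per-axis auto-scaling and a unified range can be sketched in plain Python. This is an illustration of the idea only, under the assumption that the unified range is the global minimum and maximum over every axis of every cell. The helper shared_range and the numbers are made up for the example.

```python
def shared_range(cells):
    """Return the (vmin, vmax) spanning every axis of every cell.

    Each cell maps an axis name to the sorting values of the nodes
    placed on that axis. Hypothetical helper, not part of hiveplotlib.
    """
    values = [
        value
        for cell in cells
        for axis_values in cell.values()
        for value in axis_values
    ]
    return min(values), max(values)


# made-up values: two cells sorted by different variables
cell_value1 = {"A": [0.3, 2.9], "B": [4.1, 6.0], "C": [7.2, 9.8]}
cell_value3 = {"A": [8.2, 9.2], "B": [1.0, 3.5], "C": [4.4, 6.1]}

# without unification, each axis spans only its own nodes
per_axis = {name: (min(v), max(v)) for name, v in cell_value1.items()}

# with unification, every axis in every cell shares one range
vmin, vmax = shared_range([cell_value1, cell_value3])

print(per_axis)
print(vmin, vmax)
```

With independent scaling, a node at 2.9 sits at the far end of axis A. On the shared range it sits less than a third of the way along, which is what makes positions comparable across cells.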

Set a Specific Range for Unified Axes#

To force a specific range instead of auto-computing, we can pass a dictionary with vmin and / or vmax. Missing keys are auto-computed to the global min / max of the data.

This can be helpful if there are outliers or if there are important threshold values for a given sorting variable.

[15]:
# pin vmin to -10, auto-compute vmax from the data
hpm_pinned = HivePlotMatrix(
    hive_plots=[[hp_a.copy(), hp_b.copy()], [hp_c.copy(), hp_d.copy()]],
    row_labels=["Top", "Bottom"],
    col_labels=["Left", "Right"],
    unify_axes={"vmin": -10},
)
fig, axes = hpm_pinned.plot()
plt.show()
../_images/notebooks_hpm_generic_27_0.png
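
The three kinds of unify_axes value used so far (False, True, and a partial dictionary) can be summarized with a small sketch. The helper resolve_unified_range is hypothetical and only mirrors the behavior described in the text. It is not hiveplotlib's code, and the data range is made up.

```python
def resolve_unified_range(data_min, data_max, unify_axes):
    """Return (vmin, vmax) for a unify_axes-style setting, or None if disabled.

    Hypothetical helper: True uses the data range, while a dict pins
    whichever of "vmin" / "vmax" it contains and falls back to the
    data range for any missing key.
    """
    if unify_axes is False:
        return None
    if unify_axes is True:
        return data_min, data_max
    return unify_axes.get("vmin", data_min), unify_axes.get("vmax", data_max)


# made-up global data range
data_min, data_max = 0.3, 9.8

disabled = resolve_unified_range(data_min, data_max, False)
auto = resolve_unified_range(data_min, data_max, True)
pinned = resolve_unified_range(data_min, data_max, {"vmin": -10})

print(disabled)  # None
print(auto)      # (0.3, 9.8)
print(pinned)    # (-10, 9.8)
```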

Note that setting unify_axes to any value other than False will update the axis ranges of each HivePlot instance in place. If you want to leave the original hive plots untouched, make sure to pass a copy of each instance to the HPM instantiation, as we did above with the HivePlot.copy() method.

Apply Edge Styling to All HPM Hive Plots#

The HivePlotMatrix.update_all_edge_plotting_keyword_arguments() method applies a style change across every populated cell in a single call.

This lets us alter all hive plot edges in the HPM at once. For example, we can unify the colors of each hive plot’s edges to a single color:

[16]:
hpm_restyled = hpm.copy()

# unify all edge colors to black
hpm_restyled.update_all_edge_plotting_keyword_arguments(
    edge_kwarg_setting="all_edge_kwargs",
    color="black",
)
fig, axes = hpm_restyled.plot()
plt.show()
../_images/notebooks_hpm_generic_30_0.png

The edge_kwarg_setting parameter accepts any level from the edge kwarg hierarchy, including "clockwise_edge_kwargs" and "counterclockwise_edge_kwargs":

[17]:
hpm_directed = hpm.copy()

# remove the old color edge kwargs to avoid clashing edge kwargs
hpm_directed.update_all_edge_plotting_keyword_arguments(
    edge_kwarg_setting="all_edge_kwargs",
    reset_edge_kwarg_setting=True,
)

hpm_directed.update_all_edge_plotting_keyword_arguments(
    edge_kwarg_setting="clockwise_edge_kwargs",
    color="orange",
    linewidth=0.8,
    alpha=0.4,
)

hpm_directed.update_all_edge_plotting_keyword_arguments(
    edge_kwarg_setting="counterclockwise_edge_kwargs",
    color="green",
    linewidth=0.8,
    alpha=0.4,
)

fig, axes = hpm_directed.plot()
plt.show()
../_images/notebooks_hpm_generic_32_0.png

For a full explanation of edge kwarg options and prioritization, see the Changing Edge Keyword Arguments page.

Uniform Node Rendering#

The node_kwargs parameter applies node rendering options uniformly across every cell when building the HivePlotMatrix. Node kwargs can also be passed to .plot(), which will take precedence.

[18]:
hpm_uniform = HivePlotMatrix(
    hive_plots=[[hp_a, hp_b], [hp_c, hp_d]],
    row_labels=["Top", "Bottom"],
    col_labels=["Left", "Right"],
    node_kwargs={"s": 50, "color": "green"},
)
fig, axes = hpm_uniform.plot()
plt.show()
../_images/notebooks_hpm_generic_35_0.png
[19]:
fig, axes = hpm_uniform.plot(node_kwargs={"color": "magenta"})
plt.show()
../_images/notebooks_hpm_generic_36_0.png
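
One way to picture this precedence is as a dictionary merge in which plot-time values win on conflicts. The sketch below assumes a key-by-key merge, so non-conflicting constructor keys such as "s" would persist. Whether hiveplotlib merges this way or replaces the constructor kwargs wholesale is an assumption to check against the rendered figure. The helper name is hypothetical.

```python
def effective_node_kwargs(constructor_kwargs, plot_kwargs=None):
    """Merge node kwargs, letting plot-time values win on conflicts.

    Assumes a key-by-key merge; this is an illustration, not
    hiveplotlib's implementation.
    """
    return {**constructor_kwargs, **(plot_kwargs or {})}


constructor_kwargs = {"s": 50, "color": "green"}

no_override = effective_node_kwargs(constructor_kwargs)
with_override = effective_node_kwargs(constructor_kwargs, {"color": "magenta"})

print(no_override)    # {'s': 50, 'color': 'green'}
print(with_override)  # {'s': 50, 'color': 'magenta'}
```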

Plot Options#

The plot() method accepts several keyword arguments to control figure appearance. For example, we could change the figure size:

[20]:
# figsize: override the default auto-computed size
fig, axes = hpm.plot(figsize=(6, 6))
plt.show()
../_images/notebooks_hpm_generic_38_0.png

Or, if our row labels are long enough to overlap the hive plots, we can rotate them with the row_label_rotation parameter.

[21]:
# row_label_rotation: rotate row labels (useful when labels are long)
fig, axes = hpm.plot(row_label_rotation=90)
plt.show()
../_images/notebooks_hpm_generic_40_0.png

Visualization Back Ends#

Two visualization back ends are supported with HPMs: matplotlib and datashader. The back end can be set at construction time via the backend parameter or changed afterwards with the set_viz_backend() method.

By default for generic HPMs, the back end will be inferred from the first populated cell.

[22]:
print("Current backend:", hpm._backend)
Current backend: matplotlib
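
The "first populated cell" rule can be sketched as a scan that returns the back end of the first non-None cell it meets. The sketch assumes a row-major scan and a matplotlib fallback for an entirely empty grid. Both are assumptions rather than documented behavior, and FakePlot and infer_backend are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class FakePlot:
    """Stand-in for a HivePlot; it only carries a back end name."""

    backend: str


def infer_backend(grid, default="matplotlib"):
    """Return the back end of the first non-None cell, scanning row by row."""
    for row in grid:
        for cell in row:
            if cell is not None:
                return cell.backend
    # assumption: fall back to a default when no cell is populated
    return default


grid = [
    [None, FakePlot("datashader")],
    [FakePlot("matplotlib"), None],
]
inferred = infer_backend(grid)
print(inferred)  # datashader
```

Later cells do not affect the result, which is why the order of the cells matters when hive plots with different back ends are mixed in one matrix.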

Datashader Back End#

Datashader renders rasterized density images with shared colorbars across all cells. This requires that hiveplotlib be installed with the datashader dependencies via:

pip install hiveplotlib[datashader]

For more on constructing hive plots with datashader, see the Hive Plots for Large Networks and Datashader pages.

Note that while the matplotlib back end only returns the figure and axes, here the plot() call also returns the node / edge rasterizations.
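
Code that must handle either back end can normalize the two return shapes. The helper below is a hypothetical convenience, not part of hiveplotlib, and strings stand in for the real figure, axes, and rasterization objects.

```python
def unpack_plot_result(result):
    """Normalize a 2-tuple or 4-tuple plot result to four values.

    Missing rasterizations are returned as None.
    """
    fig, axes, *rasters = result
    im_nodes, im_edges = rasters if rasters else (None, None)
    return fig, axes, im_nodes, im_edges


# strings stand in for the real objects returned by plot()
mpl_result = unpack_plot_result(("fig", "axes"))
ds_result = unpack_plot_result(("fig", "axes", "im_nodes", "im_edges"))

print(mpl_result)  # ('fig', 'axes', None, None)
print(ds_result)   # ('fig', 'axes', 'im_nodes', 'im_edges')
```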

Below, we discuss three ways to use datashader with generic HPMs.

Method 1: Build HivePlot Instances With backend="datashader" Upfront#

With the example below, the HPM instantiation infers the back end as datashader from the first hive plot hp_a_ds.

Note that any other hive plots added to this HPM that use the matplotlib back end (i.e. hp_b_mpl) will be rendered in the final HPM with datashader.

[23]:
hp_a_ds = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["A", "B", "C"],
    all_edge_kwargs={"color": "#006BA4"},
    backend="datashader",
)
hp_b_mpl = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["B", "C", "A"],
    all_edge_kwargs={"color": "#FF800E"},
    backend="matplotlib",  # only first populated cell dictates HPM back end!
)
hpm_ds = HivePlotMatrix(
    hive_plots=[[hp_a_ds, hp_b_mpl]],
    row_labels=["Row 0"],
    col_labels=["Left", "Right"],
)
# datashader plot also returns node / edge rasterizations
fig, axes, im_nodes, im_edges = hpm_ds.plot()
plt.show()
../_images/notebooks_hpm_generic_45_0.png

Method 2: Specify backend="datashader" When Instantiating the HivePlotMatrix#

Here, the instantiation explicitly converts each hive plot’s back end to datashader:

[24]:
hp_a_mpl = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["A", "B", "C"],
    all_edge_kwargs={"color": "#006BA4", "alpha": 0.4},
)
hp_b_mpl = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["B", "C", "A"],
    all_edge_kwargs={"color": "#FF800E", "alpha": 0.4},
)
hpm_ds2 = HivePlotMatrix(
    hive_plots=[[hp_a_mpl, hp_b_mpl]],
    row_labels=["Row 0"],
    col_labels=["Left", "Right"],
    backend="datashader",
)
fig, axes, im_nodes, im_edges = hpm_ds2.plot()
plt.show()
../_images/notebooks_hpm_generic_47_0.png

Method 3: Call set_viz_backend() After Construction#

We can always change the visualization back end on an existing HPM with the set_viz_backend() method:

[25]:
hp_a_mpl = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["A", "B", "C"],
    all_edge_kwargs={"color": "#006BA4", "alpha": 0.4},
)
hp_b_mpl = HivePlot(
    nodes=nodes,
    edges=edges,
    partition_variable="group",
    sorting_variables="value1",
    axes_order=["B", "C", "A"],
    all_edge_kwargs={"color": "#FF800E", "alpha": 0.4},
)
hpm_to_convert = HivePlotMatrix(
    hive_plots=[[hp_a_mpl, hp_b_mpl]],
    row_labels=["Row 0"],
    col_labels=["Left", "Right"],
)
hpm_to_convert.set_viz_backend("datashader")
fig, axes, im_nodes, im_edges = hpm_to_convert.plot()
plt.show()
../_images/notebooks_hpm_generic_49_0.png

Setting Explicit Density Cutoffs with Datashader#

The node and edge density colormaps and color range will be the same for all hive plots in the HPM.

By default, the upper end of each color range is the maximum density value over all of the hive plots.

If preferred, users can set vmax_nodes and vmax_edges to fix the shared density max across all cells to a specific level. This can be useful when one cell is much denser than the others or if users have preferred, more-interpretable cutoffs.

[26]:
fig, axes, im_nodes, im_edges = hpm_ds.plot(vmax_nodes=15, vmax_edges=30)
plt.show()
../_images/notebooks_hpm_generic_51_0.png
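
The effect of a shared maximum, and of overriding it, can be illustrated with plain nested lists standing in for per-cell density rasters. The counts are made up and shared_vmax is a hypothetical helper; the sketch shows the idea, not how hiveplotlib or datashader compute it.

```python
def shared_vmax(rasters, vmax=None):
    """Pick one color-scale maximum for every cell.

    Uses the explicit cutoff when given, else the largest density
    found in any raster.
    """
    if vmax is not None:
        return vmax
    return max(value for raster in rasters for row in raster for value in row)


# made-up density counts for two cells; the right cell has one hot spot
left = [[0, 2], [5, 1]]
right = [[0, 40], [3, 7]]

auto_max = shared_vmax([left, right])
fixed_max = shared_vmax([left, right], vmax=30)

# with a fixed cutoff, densities above it saturate at the top color
saturated = [[min(value, fixed_max) for value in row] for row in right]

print(auto_max)   # 40
print(fixed_max)  # 30
print(saturated)  # [[0, 30], [3, 7]]
```

With the automatic maximum, the single hot spot in the right cell compresses the left cell into the bottom eighth of the color range. A lower cutoff spreads the remaining values over more of the colormap.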

Turn Off Density Colorbars with Datashader#

Both the node and edge colorbars are shown by default. Users can turn off either one, or both, by setting show_node_colorbar or show_edge_colorbar to False.

[27]:
fig, axes, im_nodes, im_edges = hpm_ds.plot(
    show_node_colorbar=False,
    show_edge_colorbar=False,
)
plt.show()
../_images/notebooks_hpm_generic_53_0.png

For a deeper dive into other Hive Plot Matrix convenience methods, see the HivePlotMatrix Gallery Examples.