Hive Plots

Node, Axis, and HivePlot Classes

class hiveplotlib.Node(unique_id: Hashable, data: Dict | None = None)

Node instances hold the data for an individual network node.

Each instance is initialized with a unique_id for identification. These IDs must be Hashable. One can also initialize with a dictionary of data, but data can also be added later with the add_data() method.

Example:
my_node = Node(unique_id="my_unique_node_id", data=my_dataset)

my_second_node = Node(unique_id="my_second_unique_node_id")
my_second_node.add_data(data=my_second_dataset)

add_data(data: Dict, overwrite_old_data: bool = False) None

Add dictionary of data to Node.data.

Parameters:
  • data – dict of data to associate with Node instance.

  • overwrite_old_data – whether to delete existing data dict and overwrite with data. Default False.

Returns:

None.
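The two overwrite_old_data settings can be illustrated with plain dictionaries. This is a sketch only: the helper add_data_sketch is hypothetical, and it assumes that the default (overwrite_old_data=False) merges new keys into the existing dict in the manner of dict.update(), which the description above does not state explicitly.

```python
def add_data_sketch(existing, new, overwrite_old_data=False):
    """Hypothetical stand-in for how Node.data might change; not the library's code."""
    if overwrite_old_data:
        # Discard everything that was stored before.
        return dict(new)
    # Assumed behavior: keep old keys, then add or update with the new ones.
    merged = dict(existing)
    merged.update(new)
    return merged


old = {"degree": 3, "group": "a"}
new = {"group": "b", "weight": 0.5}

merged = add_data_sketch(old, new)
replaced = add_data_sketch(old, new, overwrite_old_data=True)
print(merged)
print(replaced)
```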

class hiveplotlib.Axis(axis_id: Hashable, start: float = 1, end: float = 5, angle: float = 0, long_name: Hashable | None = None)

Axis instance.

Axis instances are initialized based on their intended final position when plotted. Each Axis is also initialized with a unique, hashable axis_id for clarity when building hive plots with multiple axes.

The eventual size and positioning of the Axis instance is dictated in the context of polar coordinates by three parameters:

start dictates the distance from the origin to the beginning of the axis when eventually plotted.

end dictates the distance from the origin to the end of the axis when eventually plotted.

angle sets the angle the Axis is rotated counterclockwise. For example, angle=0 points East, angle=90 points North, and angle=180 points West.
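The way these three parameters determine where an axis lands in Cartesian space can be sketched with the standard library alone. The helper axis_endpoints below is hypothetical and not part of hiveplotlib; it only assumes that start and end are radii and that angle is measured counterclockwise in degrees, as described above.

```python
import math


def axis_endpoints(start, end, angle):
    """Return the Cartesian (x, y) endpoints of an axis given in polar terms."""
    theta = math.radians(angle)
    direction = (math.cos(theta), math.sin(theta))
    inner = (start * direction[0], start * direction[1])
    outer = (end * direction[0], end * direction[1])
    return inner, outer


# angle=0 points East, angle=90 points North, angle=180 points West.
for angle in (0, 90, 180):
    inner, outer = axis_endpoints(start=1, end=5, angle=angle)
    print(angle, [round(v, 6) for v in inner], [round(v, 6) for v in outer])
```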

Node instances placed on each Axis instance will be scaled to fit onto the span of the Axis, but this is discussed further in the HivePlot class, which handles this placement.

Since axis_id values may be shorthand for easy referencing when typing code, if one desires a formal name to plot against each axis when visualizing, one can provide a separate long_name that will show up as the axis label when running hiveplotlib.viz code. (For example, one may choose axis_id="a1" and long_name="Axis 1".)

Note

long_name defaults to axis_id if not specified.

Example:
# 3 axes, spaced out 120 degrees apart, all size 4, starting 1 unit off of origin
axis0 = Axis(axis_id="a0", start=1, end=5, angle=0, long_name="Axis 0")
axis1 = Axis(axis_id="a1", start=1, end=5, angle=120, long_name="Axis 1")
axis2 = Axis(axis_id="a2", start=1, end=5, angle=240, long_name="Axis 2")

class hiveplotlib.HivePlot

Hive Plots built from a combination of Axis and Node instances.

This class is essentially a collection of methods for creating and maintaining the nested dictionary attribute edges, which holds constructed Bézier curves, edge ids, and matplotlib keyword arguments for various sets of edges to be plotted. The nested dictionary structure can be abstracted to the example below.

HivePlot.edges["starting axis"]["ending axis"]["tag"]

The resulting dictionary value holds the edge information relating to an addition of edges that are tagged as “tag,” specifically the edges going FROM the axis named “starting axis” TO the axis named “ending axis.” This value is in fact another dictionary, meant to hold the discretized Bézier curves (curves), the matplotlib keyword arguments for plotting (edge_kwargs), and the abstracted edge ids (an (m, 2) np.ndarray) between which we are drawing Bézier curves (ids).
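The nesting described above can be mocked up with ordinary Python containers. In this sketch plain lists stand in for the numpy arrays that the class stores, and the axis IDs, node IDs, tag, and kwargs are all made-up placeholders.

```python
# Illustrative stand-in for HivePlot.edges; every value here is hypothetical.
edges = {
    "a0": {  # starting axis
        "a1": {  # ending axis
            0: {  # tag
                # (m, 2) FROM / TO node IDs for this set of edges
                "ids": [["n1", "n7"], ["n2", "n9"]],
                # discretized Bézier curves, empty until they are constructed
                "curves": [],
                # matplotlib keyword arguments used when plotting these edges
                "edge_kwargs": {"color": "C0", "alpha": 0.5},
            }
        }
    }
}

entry = edges["a0"]["a1"][0]
for from_id, to_id in entry["ids"]:
    print(from_id, "->", to_id, entry["edge_kwargs"])
```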

add_axes(axes: Axis | List[Axis]) None

Add list of Axis instances to HivePlot.axes.

Note

All resulting Axis IDs must be unique.

Parameters:

axes – Axis object(s) to add to HivePlot instance.

Returns:

None.

add_edge_curves_between_axes(axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0) None

Construct discretized edge curves between two axes of a HivePlot instance.

Note

One must run HivePlot.add_edge_ids() first for the two axes of interest.

Resulting discretized Bézier curves will be stored as an (n, 2) numpy.ndarray of multiple sampled curves where the first column is x position and the second column is y position in Cartesian coordinates.

Note

Although each curve is represented by a (num_steps, 2) array, all the curves are stored in a single collective numpy.ndarray, with a row of [np.nan, np.nan] separating each discretized curve from the next. This allows matplotlib to accept a single array when plotting lines via plt.plot(), which speeds up plotting later.

This output will be stored in HivePlot.edges[axis_id_1][axis_id_2][tag]["curves"].
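The NaN-separated layout can be sketched without numpy. The helper stack_curves below is hypothetical; it assumes a separator row is placed only between consecutive curves, and it uses lists of [x, y] pairs where the library uses an array. matplotlib leaves a gap in a line wherever it meets NaN values, which is why a single stacked array can draw many disjoint curves.

```python
import math


def stack_curves(curves):
    """Concatenate curves into one list, separated by [nan, nan] rows."""
    stacked = []
    for i, curve in enumerate(curves):
        if i > 0:
            # Separator row: a plotted line breaks here instead of joining curves.
            stacked.append([math.nan, math.nan])
        stacked.extend(curve)
    return stacked


curve_a = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
curve_b = [[5.0, 5.0], [6.0, 6.0], [7.0, 5.0]]

stacked = stack_curves([curve_a, curve_b])
print(len(stacked))
```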

Parameters:
  • axis_id_1 – pointer to first of two Axis instances in HivePlot.axes between which we want to find connections.

  • axis_id_2 – pointer to second of two Axis instances in HivePlot.axes between which we want to find connections.

  • tag – unique ID specifying which subset of edges specified by their IDs to construct (e.g. HivePlot.edges[axis_id_1][axis_id_2][tag]["ids"]). Note, if no tag is specified (e.g. tag=None), it is presumed there is only one tag for the specified set of axes to look over, which can be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.

  • a1_to_a2 – whether to build out the edges going FROM axis_id_1 TO axis_id_2.

  • a2_to_a1 – whether to build out the edges going FROM axis_id_2 TO axis_id_1.

  • num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.

  • short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.

  • control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.

  • control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.

Returns:

None.
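The effect of control_rho_scale and control_angle_shift on a single edge can be sketched as follows. The function control_point is hypothetical; it follows the parameter descriptions above (mean rho scaled, mean angle shifted) and deliberately ignores short_arc and any angle wraparound, so it should be read as an illustration rather than the library's calculation.

```python
def control_point(rho_1, phi_1, rho_2, phi_2,
                  control_rho_scale=1, control_angle_shift=0):
    """Polar (rho, phi) of a Bézier control point for one edge, angles in degrees."""
    # Default: mean distance from the origin of the two connected nodes.
    rho = control_rho_scale * (rho_1 + rho_2) / 2
    # Default: mean angle of the two connected nodes.
    phi = (phi_1 + phi_2) / 2 + control_angle_shift
    return rho, phi


default = control_point(2, 0, 4, 120)
pushed_out = control_point(2, 0, 4, 120, control_rho_scale=2)
rotated_cw = control_point(2, 0, 4, 120, control_angle_shift=-10)
print(default, pushed_out, rotated_cw)
```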

add_edge_ids(edges: ndarray, axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True) Hashable

Find and store the edge IDs relevant to the specified pair of axes.

Find the subset of network connections that involve nodes on axis_id_1 and axis_id_2, looking over the specified edges compared to the IDs of the Node instances currently placed on each Axis. Edges discovered between the specified two axes (depending on the values specified by a1_to_a2 and a2_to_a1, more below) will have the relevant edge IDs stored, with other edges disregarded.

Generates (j, 2) and (k, 2) numpy arrays of axis_id_1 to axis_id_2 connections and axis_id_2 to axis_id_1 connections (or only 1 of those arrays depending on parameter choices for a1_to_a2 and a2_to_a1).

The resulting arrays of relevant edge IDs (e.g. each row is a [<FROM ID>, <TO ID>] edge) will be stored automatically in HivePlot.edges, a dictionary of dictionaries of dictionaries of edge information, which can later be converted into discretized edges to be plotted in Cartesian space. They are stored as HivePlot.edges[<source_axis_id>][<sink_axis_id>][<tag>]["ids"].

Note

If no tag is provided (e.g. default None), one will be automatically generated and returned by this method call.

Parameters:
  • edges – (n, 2) array of Hashable values representing unique IDs of specific Node instances. The first column is the IDs for the “from” nodes and the second column is the IDs for the “to” nodes for each connection.

  • axis_id_1 – pointer to first of two Axis instances in HivePlot.axes between which we want to find connections.

  • axis_id_2 – pointer to second of two Axis instances in HivePlot.axes between which we want to find connections.

  • tag – tag corresponding to subset of specified edges. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the available tags under HivePlot.edges[axis_id_1][axis_id_2] and / or HivePlot.edges[axis_id_2][axis_id_1].

  • a1_to_a2 – whether to find the connections going FROM axis_id_1 TO axis_id_2.

  • a2_to_a1 – whether to find the connections going FROM axis_id_2 TO axis_id_1.

Returns:

the resulting unique tag. Note, if both a1_to_a2 and a2_to_a1 are True the resulting unique tag returned will be the same for both directions of edges.
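The two ideas in this method, filtering edges by which axis each endpoint sits on and choosing a default tag, can be sketched in pure Python. Both helpers are hypothetical: split_edges uses lists of tuples in place of the (n, 2) array, and lowest_unused_tag follows the "lowest unused integer starting at 0" rule from the tag description.

```python
def split_edges(edges, ids_on_axis_1, ids_on_axis_2):
    """Return (axis 1 -> axis 2 edges, axis 2 -> axis 1 edges); other edges are dropped."""
    on_1, on_2 = set(ids_on_axis_1), set(ids_on_axis_2)
    a1_to_a2 = [(f, t) for f, t in edges if f in on_1 and t in on_2]
    a2_to_a1 = [(f, t) for f, t in edges if f in on_2 and t in on_1]
    return a1_to_a2, a2_to_a1


def lowest_unused_tag(existing_tags):
    """Smallest non-negative integer that is not already used as a tag."""
    tag = 0
    while tag in existing_tags:
        tag += 1
    return tag


edges = [("a", "x"), ("x", "b"), ("a", "b"), ("y", "q")]
forward, backward = split_edges(edges, ids_on_axis_1=["a", "b"], ids_on_axis_2=["x", "y"])
print(forward, backward, lowest_unused_tag({0, 1, 3}))
```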

add_edge_kwargs(axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, **edge_kwargs) None

Add edge kwargs to the constructed HivePlot.edges between two axes of a HivePlot.

For a given set of edges for which edge kwargs were already set, any redundant edge kwargs specified by this method call will overwrite the previously set kwargs.

Expected to have found edge IDs between the two axes before calling this method, which can be done either by calling HivePlot.connect_axes() method or the lower-level HivePlot.add_edge_ids() method for the two axes of interest.

Resulting kwargs will be stored as a dict. This output will be stored in HivePlot.edges[axis_id_1][axis_id_2][tag]["edge_kwargs"].

Note

There is special handling in here for when the two provided axes have names "<axis_name>" and "<axis_name>_repeat". This is for use with hiveplotlib.hive_plot_n_axes(), which when creating repeat axes always names the repeated one "<axis_name>_repeat". By definition, the edges between an axis and its repeat are the same, and therefore edges between these two axes should only be plotted in one direction. If one is running this method on a HivePlot instance from hiveplotlib.hive_plot_n_axes() though, a warning about a lack of edges in both directions for repeat edges is not productive, so we formally catch this case.

Parameters:
  • axis_id_1 – Hashable pointer to the first Axis instance in HivePlot.axes we want to add plotting kwargs to.

  • axis_id_2 – Hashable pointer to the second Axis instance in HivePlot.axes we want to add plotting kwargs to.

  • tag – which subset of curves to modify kwargs for. Note, if no tag is specified (e.g. tag=None), it is presumed there is only one tag for the specified set of axes to look over and that will be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.

  • a1_to_a2 – whether to add kwargs for connections going FROM axis_id_1 TO axis_id_2.

  • a2_to_a1 – whether to add kwargs for connections going FROM axis_id_2 TO axis_id_1.

  • edge_kwargs – additional matplotlib keyword arguments that will be applied to the specified edges.

Returns:

None.

add_nodes(nodes: List[Node], check_uniqueness: bool = True) None

Add Node instances to HivePlot.nodes.

Parameters:
  • nodes – collection of Node instances, will be added to HivePlot.nodes dict with unique IDs as keys.

  • check_uniqueness – whether to formally check for uniqueness. WARNING: the only reason to turn this off is if the dataset becomes big enough that this operation becomes expensive, and you have already established uniqueness another way (for example, you are pulling data from a database and the key in your table is the unique ID). If you add non-unique IDs with check_uniqueness=False, we make no promises about output.

Returns:

None.

connect_axes(edges: ndarray, axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0, **edge_kwargs) Hashable

Construct all the curves and set all the curve kwargs between axis_id_1 and axis_id_2.

Based on the specified edges parameter, build out the resulting Bézier curves, and set any kwargs for those edges for later visualization.

The curves will be tracked by a unique tag, and the resulting constructions will be stored in HivePlot.edges[axis_id_1][axis_id_2][tag] if a1_to_a2 is True and HivePlot.edges[axis_id_2][axis_id_1][tag] if a2_to_a1 is True.

Note

If trying to draw different subsets of edges with different kwargs, one can run this method multiple times with different subsets of the entire edges array, providing unique tag values with each subset of edges, and specifying different edge_kwargs each time. The resulting HivePlot instance would be plotted showing each set of edges styled with each set of unique kwargs.

Note

You can choose to construct edges in only one of either directions by specifying a1_to_a2 or a2_to_a1 as False (both are True by default).

Parameters:
  • edges – (n, 2) array of Hashable values representing pointers to specific Node instances. The first column is the “from” and the second column is the “to” for each connection.

  • axis_id_1 – Hashable pointer to the first Axis instance in HivePlot.axes we want to find connections between.

  • axis_id_2 – Hashable pointer to the second Axis instance in HivePlot.axes we want to find connections between.

  • tag – tag corresponding to specified edges. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the available tags under HivePlot.edges[from_axis_id][to_axis_id] and / or HivePlot.edges[to_axis_id][from_axis_id].

  • a1_to_a2 – whether to find and build the connections going FROM axis_id_1 TO axis_id_2.

  • a2_to_a1 – whether to find and build the connections going FROM axis_id_2 TO axis_id_1.

  • num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.

  • short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.

  • control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.

  • control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.

  • edge_kwargs – additional matplotlib params that will be applied to the related edges.

Returns:

Hashable tag that identifies the generated curves and kwargs.

construct_curves(num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0) None

Construct Bézier curves for any connections for which we’ve specified the edges to draw.

(e.g. HivePlot.edges[axis_0][axis_1][<tag>]["ids"] is non-empty but HivePlot.edges[axis_0][axis_1][<tag>]["curves"] does not yet exist).

Note

Checks all <tag> values between axes.

Parameters:
  • num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.

  • short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.

  • control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.

  • control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.

Returns:

None.

copy() HivePlot

Return a copy of the HivePlot instance.

Returns:

HivePlot instance.

place_nodes_on_axis(axis_id: Hashable, unique_ids: List[Hashable] | None | ndarray = None, sorting_feature_to_use: Hashable | None = None, vmin: float | None = None, vmax: float | None = None) None

Set node positions on specific Axis.

Cartesian coordinates will be normalized to specified vmin and vmax. Those vmin and vmax values will then be normalized to span the length of the axis when plotted.

Parameters:
  • axis_id – which axis (as specified by the keys from HivePlot.axes) for which to plot nodes.

  • unique_ids – list of node IDs to assign to this axis. If previously set with HivePlot._allocate_nodes_to_axis(), this will overwrite those node assignments. If None, method will check and confirm there are existing node ID assignments.

  • sorting_feature_to_use – which feature in the node data to use to align nodes on an axis. Default None uses the feature previously assigned via HivePlot.axes[axis_id]._set_node_placement_label().

  • vmin – all values less than vmin will be set to vmin. Default None sets as global minimum of feature values for all Node instances on specified Axis.

  • vmax – all values greater than vmax will be set to vmax. Default None sets as global maximum of feature values for all Node instances on specified Axis.

Returns:

None.
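The clip-then-rescale behavior described for vmin and vmax can be sketched as below. The helper position_on_axis is hypothetical and returns only the distance from the origin along the axis; it assumes values below vmin are clipped to vmin, values above vmax are clipped to vmax, and the clipped range is mapped linearly onto the axis span from start to end.

```python
def position_on_axis(values, start, end, vmin=None, vmax=None):
    """Map feature values to distances from the origin along an axis."""
    vmin = min(values) if vmin is None else vmin
    vmax = max(values) if vmax is None else vmax
    positions = []
    for v in values:
        clipped = min(max(v, vmin), vmax)
        # Guard against a zero-width range, where every node lands at the start.
        frac = 0.0 if vmax == vmin else (clipped - vmin) / (vmax - vmin)
        positions.append(start + frac * (end - start))
    return positions


inferred = position_on_axis([0, 5, 10], start=1, end=5)
clipped = position_on_axis([0, 5, 20], start=1, end=5, vmin=0, vmax=10)
print(inferred, clipped)
```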

reset_edges(axis_id_1: Hashable | None = None, axis_id_2: Hashable | None = None, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True) None

Reset HivePlot.edges.

Setting all the parameters to None deletes any stored connections between axes previously computed. If any subset of the parameters is not None, the resulting edges will be deleted:

If axis_id_1, axis_id_2, and tag are all specified as not None, the implied single subset of edges will be deleted. (Note, tags are required to be unique within a specified (axis_id_1, axis_id_2) pair.) In this case, the default is to delete all the edges bidirectionally (e.g. going axis_id_1 -> axis_id_2 and axis_id_2 -> axis_id_1) with the specified tag. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.

If only axis_id_1 and axis_id_2 are provided as not None, then the default is to delete all edge subsets bidirectionally between axis_id_1 and axis_id_2 (e.g. going axis_id_1 -> axis_id_2 and axis_id_2 -> axis_id_1), regardless of tag. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.

If only axis_id_1 is provided as not None, then all edges going TO and FROM axis_id_1 will be deleted. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.

Parameters:
  • axis_id_1 – specifies edges all coming FROM the axis identified by this unique ID.

  • axis_id_2 – specifies edges all coming TO the axis identified by this unique ID.

  • tag – tag corresponding to explicit subset of added edges.

  • a1_to_a2 – whether to remove the connections going FROM axis_id_1 TO axis_id_2. Note, if axis_id_1 is specified but axis_id_2 is None, then this dictates whether to remove all edges going from axis_id_1.

  • a2_to_a1 – whether to remove the connections going FROM axis_id_2 TO axis_id_1. Note, if axis_id_1 is specified but axis_id_2 is None, then this dictates whether to remove all edges going to axis_id_1.

Returns:

None.

to_json() str

Return the information from the axes, nodes, and edges in Cartesian space as a serialized JSON string.

This allows users to visualize hive plots with arbitrary libraries, even outside of python.

The dictionary structure of the resulting JSON will consist of two top-level keys:

“axes” - contains the information for plotting each axis, plus the nodes on each axis in Cartesian space.

“edges” - contains the information for plotting the discretized edges in Cartesian space, plus the corresponding to and from IDs that go with each edge, as well as any kwargs that were set for plotting each set of edges.

Returns:

JSON output of axis, node, and edge information.

Quick Hive Plots

hiveplotlib.hive_plot_n_axes(node_list: List[Node], edges: ndarray | List[ndarray], axes_assignments: List[List[Hashable | None]], sorting_variables: List[Hashable], axes_names: List[Hashable] | None = None, repeat_axes: List[bool] | None = None, vmins: List[float] | None = None, vmaxes: List[float] | None = None, angle_between_repeat_axes: float = 40, orient_angle: float = 0, all_edge_kwargs: Dict | None = None, edge_list_kwargs: List[Dict] | None = None, cw_edge_kwargs: Dict | None = None, ccw_edge_kwargs: Dict | None = None, repeat_edge_kwargs: Dict | None = None) HivePlot

Generate a HivePlot Instance with an arbitrary number of axes, as specified by passing a partition of node IDs.

Repeat axes can be generated for any desired subset of axes, but repeat axes will be sorted by the same variable as the original axis.

Axes will be added in counterclockwise order.

Axes will all be the same length and position from the origin.

Changes to all the edge kwargs can be made with the all_edge_kwargs parameter. If providing multiple sets of edges (e.g. a list input for the edges parameter), one can also provide unique kwargs for each set of edges by specifying a corresponding list of kwargs with the edge_list_kwargs parameter.

Edges directed counterclockwise will be drawn as solid lines by default. Clockwise edges will be drawn as solid lines by default. All CW / CCW lines kwargs can be changed with the cw_edge_kwargs and ccw_edge_kwargs parameters, respectively. Edges between repeat axes will be drawn as solid lines by default. Repeat edges operate under their own set of visual kwargs (repeat_edge_kwargs) as clockwise vs counterclockwise edges don’t have much meaning when looking within a single group.

Specific edge kwargs can also be changed by running the add_edge_kwargs() method on the resulting HivePlot instance, where the specified tag of edges to change will be the index value in the list of arrays in edges (note: a tag is only necessary if the edges input is a list of arrays, otherwise there would only be a single tag of edges, which can be inferred).

There is a hierarchy to these various kwarg arguments. That is, if redundant / overlapping kwargs are provided for different kwarg parameters, a warning will be raised and priority will be given according to the below hierarchy (Note: cw_edge_kwargs, ccw_edge_kwargs, and repeat_edge_kwargs do not interact with each other in practice, and are therefore equal in the hierarchy):

edge_list_kwargs > cw_edge_kwargs / ccw_edge_kwargs / repeat_edge_kwargs > all_edge_kwargs.
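The priority order can be sketched as successive dictionary updates, lowest priority first. The helper resolve_edge_kwargs is hypothetical, and directional_kwargs stands in for whichever one of cw_edge_kwargs, ccw_edge_kwargs, or repeat_edge_kwargs applies to a given edge set. The warning raised on overlapping keys is not modeled here.

```python
def resolve_edge_kwargs(all_edge_kwargs=None, directional_kwargs=None,
                        edge_list_kwargs=None):
    """Merge kwargs so that later (higher priority) layers win on conflicts."""
    resolved = {}
    for layer in (all_edge_kwargs, directional_kwargs, edge_list_kwargs):
        resolved.update(layer or {})
    return resolved


resolved = resolve_edge_kwargs(
    all_edge_kwargs={"color": "black", "lw": 1, "alpha": 0.3},
    directional_kwargs={"color": "blue"},
    edge_list_kwargs={"lw": 2},
)
print(resolved)
```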

Parameters:
  • node_list – List of Node instances to go into output HivePlot instance.

  • edges – (n, 2) array of Hashable values representing pointers to specific Node instances. The first column is the “from” and the second column is the “to” for each connection. Alternatively, one can provide a list of two-column arrays, which will allow for plotting different sets of edges with different kwargs.

  • axes_assignments – list of lists of node unique IDs. Each list of node IDs will be assigned to a separate axis in the resulting HivePlot instance, built out in counterclockwise order. If None is provided as one of the elements instead of a list of node IDs, then all unassigned nodes will be aggregated onto this axis.

  • sorting_variables – list of Hashable variables on which to sort each axis, where the ith index Hashable corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot).

  • axes_names – list of Hashable names for each axis, where the ith index Hashable corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). Default None names the groups as “Group 1,” “Group 2,” etc.

  • repeat_axes – list of bool values of whether to generate a repeat axis, where the ith index bool corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). A True value generates a repeat axis. Default None assumes no repeat axes (e.g. all False).

  • vmins – list of float values (or None values) specifying the vmin for each axis, where the ith index value corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). A None value infers the global min for that axis. Default None uses the global min for all the axes.

  • vmaxes – list of float values (or None values) specifying the vmax for each axis, where the ith index value corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). A None value infers the global max for that axis. Default None uses the global max for all the axes.

  • angle_between_repeat_axes – angle between repeat axes. Default 40 degrees.

  • orient_angle – rotates all axes counterclockwise from their initial angles (default 0 degrees).

  • all_edge_kwargs – kwargs for all edges. Default None specifies no additional kwargs.

  • edge_list_kwargs – list of dictionaries of kwargs for each element of edges when edges is a list. The ith set of kwargs in edge_list_kwargs will only be applied to edges constructed from the ith element of edges. Default None provides no additional kwargs. Note, list must be same length as edges.

  • cw_edge_kwargs – kwargs for edges going clockwise. Default None specifies a solid line.

  • ccw_edge_kwargs – kwargs for edges going counterclockwise. Default None specifies a solid line.

  • repeat_edge_kwargs – kwargs for edges between repeat axes. Default None specifies a solid line.

Returns:

HivePlot instance.

Converters

Converters from various data structures to hiveplotlib-ready structures.

hiveplotlib.converters.networkx_to_nodes_edges(graph: networkx.classes.graph.Graph instance) Tuple[List[Node], ndarray]

Take a networkx graph and return hiveplotlib-friendly data structures.

Specifically, returns a list of hiveplotlib.Node instances and an (n, 2) np.ndarray of edges. These outputs can be fed directly into hive_plot_n_axes().

Parameters:

graph – networkx graph.

Returns:

list of Node instances, (n, 2) np.ndarray of edges.

Utility Functions

Helper static methods for working with node data.

hiveplotlib.node.dataframe_to_node_list(df: DataFrame, unique_id_column: Hashable) List[Node]

Convert a dataframe into Node instances, where each row will be turned into a single instance.

Parameters:
  • df – dataframe to use to generate Node instances.

  • unique_id_column – which column corresponds to unique IDs for the eventual nodes.

Returns:

list of Node instances.

hiveplotlib.node.split_nodes_on_variable(node_list: List[Node], variable_name: Hashable, cutoffs: List[float] | int | None = None, labels: List[Hashable] | None = None) Dict[Hashable, List[Node]]

Split a list of Node instances into a partition of node IDs.

By default, splits will group node IDs on unique values of variable_name.

If variable_name corresponds to numerical data, and a list of cutoffs is provided, node IDs will be separated into bins according to the following binning scheme:

(-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf)

If variable_name corresponds to numerical data, and cutoffs is provided as an int, node IDs will be separated into cutoffs equal-sized quantiles.

Note

This method currently only supports splits where variable_name corresponds to numerical data.

Parameters:
  • node_list – list of Node instances to partition.

  • variable_name – which variable in each Node instance to group by.

  • cutoffs – cutoffs to use in binning nodes according to data under variable_name. Default None will bin nodes by unique values of variable_name. When provided as a list, the specified cutoffs will bin according to (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf). When provided as an int, the exact numerical break points will be determined to create cutoffs equally-sized quantiles.

  • labels – labels assigned to each bin. Only referenced when cutoffs is not None. Default None labels each bin as a string based on its range of values. Note, when cutoffs is a list, len(labels) must be 1 greater than len(cutoffs). When cutoffs is an int, len(labels) must be equal to cutoffs.

Returns:

dict whose values are lists of Node unique IDs. If cutoffs is None, keys will be the unique values for the variable. Otherwise, each key will be the string representation of a bin range.
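The right-closed binning scheme for a list of cutoffs can be sketched with the standard bisect module. The helpers below are hypothetical: they key the result by integer bin index rather than by the string labels described above, and they take a plain mapping of node ID to value in place of Node instances.

```python
from bisect import bisect_left


def bin_index(value, cutoffs):
    """Index of the bin (-inf, c0], (c0, c1], ..., (c_last, inf) containing value."""
    # bisect_left counts the cutoffs strictly below value, so a value equal
    # to a cutoff falls into the bin that the cutoff closes on the right.
    return bisect_left(cutoffs, value)


def split_ids(values_by_id, cutoffs):
    """Partition node IDs by the bin their value falls into."""
    bins = {}
    for node_id, value in values_by_id.items():
        bins.setdefault(bin_index(value, cutoffs), []).append(node_id)
    return bins


partition = split_ids({"n1": 3, "n2": 10, "n3": 12, "n4": 25}, cutoffs=[10, 20])
print(partition)
```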

Utility functions for hive plot curvature and coordinates.

hiveplotlib.utils.bezier(start: float, end: float, control: float, num_steps: int = 100) ndarray

Calculate 1-dimensional Bézier curve values between start and end with curve based on control.

Note, this function is hardcoded for exactly 1 control point.

Parameters:
  • start – starting point.

  • end – ending point.

  • control – “pull” point.

  • num_steps – number of points on Bézier curve.

Returns:

(num_steps, ) sized np.ndarray of 1-dimensional discretized Bézier curve output.
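A Bézier curve with exactly one control point is a quadratic Bézier curve, B(t) = (1 - t)^2 * start + 2 * (1 - t) * t * control + t^2 * end for t in [0, 1]. The sketch below evaluates that formula in pure Python; the name bezier_sketch is hypothetical, it returns a list rather than an array, and it assumes evenly spaced t values with num_steps of at least 2.

```python
def bezier_sketch(start, end, control, num_steps=100):
    """Sample a 1-dimensional quadratic Bézier curve at num_steps points."""
    curve = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # evenly spaced from 0 to 1 inclusive
        curve.append((1 - t) ** 2 * start + 2 * (1 - t) * t * control + t ** 2 * end)
    return curve


straight = bezier_sketch(0, 10, control=5, num_steps=3)
pulled = bezier_sketch(0, 10, control=10, num_steps=3)
print(straight, pulled)
```

The curve always begins at start and finishes at end; the control value only changes how the samples in between are pulled.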

hiveplotlib.utils.bezier_all(start_arr: List[float] | ndarray, end_arr: List[float] | ndarray, control_arr: List[float] | ndarray, num_steps: int = 100) ndarray

Calculate Bézier curve between multiple start and end values.

Note, this function is hardcoded for exactly 1 control point per curve.

Parameters:
  • start_arr – starting point of each curve.

  • end_arr – corresponding ending point of each curve.

  • control_arr – corresponding “pull” points for each curve.

  • num_steps – number of points on each Bézier curve.

Returns:

(len(start_arr) * num_steps, ) sized np.ndarray of 1-dimensional discretized Bézier curve output. Note, every num_steps chunk of the output corresponds to a different Bézier curve.
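
The flattened layout of the output can be sketched as follows: each curve contributes num_steps consecutive values, so slicing the result every num_steps entries recovers the individual curves. This is a pure-Python illustration assuming the standard quadratic Bézier formula; quadratic_bezier_all is a hypothetical name, not the library function.

```python
from typing import List, Sequence


def quadratic_bezier_all(
    start_arr: Sequence[float],
    end_arr: Sequence[float],
    control_arr: Sequence[float],
    num_steps: int = 100,
) -> List[float]:
    """Concatenate one sampled quadratic Bézier curve per (start, end, control) triple."""
    if not (len(start_arr) == len(end_arr) == len(control_arr)):
        raise ValueError("start_arr, end_arr, and control_arr must be the same length")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    flat = []
    for start, end, control in zip(start_arr, end_arr, control_arr):
        for i in range(num_steps):
            t = i / (num_steps - 1)
            flat.append((1 - t) ** 2 * start + 2 * (1 - t) * t * control + t**2 * end)
    return flat


num_steps = 3
flat = quadratic_bezier_all([0.0, 1.0], [4.0, 5.0], [10.0, 10.0], num_steps=num_steps)
# split the flat output back into one chunk per curve
chunks = [flat[k : k + num_steps] for k in range(0, len(flat), num_steps)]
```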

hiveplotlib.utils.cartesian2polar(x: ndarray | float, y: ndarray | float) Tuple[ndarray | float, ndarray | float]

Convert cartesian coordinates e.g. (x, y) to polar coordinates.

(Polar coordinates are (rho, phi), where rho is the distance from the origin and phi is the counterclockwise angle off of the x-axis in degrees.)

Parameters:
  • x – Cartesian x coordinates.

  • y – Cartesian y coordinates.

Returns:

(rho, phi) polar coordinates.

hiveplotlib.utils.polar2cartesian(rho: ndarray | float, phi: ndarray | float) Tuple[ndarray | float, ndarray | float]

Convert polar coordinates to cartesian coordinates e.g. (x, y).

(Polar coordinates are (rho, phi), where rho is the distance from the origin and phi is the counterclockwise angle off of the x-axis in degrees.)

Parameters:
  • rho – distance from origin.

  • phi – counterclockwise angle off of x-axis in degrees (not radians).

Returns:

(x, y) cartesian coordinates.
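
The two conversions are inverses of one another, with the angle expressed in degrees rather than radians. A minimal scalar sketch using only the math module is shown below. The function names are hypothetical, and the range of the returned angle (here (-180, 180], as produced by math.atan2) is an assumption that may differ from the library's behavior.

```python
import math
from typing import Tuple


def polar_to_cartesian(rho: float, phi_degrees: float) -> Tuple[float, float]:
    """Convert (rho, phi in degrees) to (x, y)."""
    phi = math.radians(phi_degrees)
    return rho * math.cos(phi), rho * math.sin(phi)


def cartesian_to_polar(x: float, y: float) -> Tuple[float, float]:
    """Convert (x, y) to (rho, phi in degrees), measured counterclockwise from the x-axis."""
    return math.hypot(x, y), math.degrees(math.atan2(y, x))


# a point 2 units from the origin at 90 degrees sits (up to rounding) at (0, 2)
x, y = polar_to_cartesian(2.0, 90.0)
rho, phi = cartesian_to_polar(x, y)
```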

Polar Parallel Coordinates Plots

P2CP Class

class hiveplotlib.P2CP(data: DataFrame | None = None)

Polar Parallel Coordinates Plots (P2CPs).

Conceptually similar to Hive Plots, P2CPs can be used for any multivariate data as opposed to solely for network visualizations. Features of the data are placed on their own axes in the same polar setup as Hive Plots, resulting in each representation of a complete data point being a loop in the resulting figure. For more on the nuances of P2CPs, see Koplik and Valente, 2021.

add_edge_kwargs(tag: Hashable | None = None, **edge_kwargs) None

Add edge kwargs to a tag of Bézier curves previously constructed with P2CP.build_edges().

For a given tag of curves for which edge kwargs were already set, any redundant edge kwargs specified by this method call will overwrite the previously set kwargs.

Note

P2CP.build_edges() is expected to have been called for the tag of interest before calling this method. However, if no tags were ever set (e.g. there’s only 1 tag of curves), then no tag is necessary here.

Parameters:
  • tag – which subset of curves to modify kwargs for. Note, if no tag is specified (e.g. tag=None), it is presumed there is only one tag to look over and that will be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.

  • edge_kwargs – additional matplotlib keyword arguments that will be applied to edges constructed for the referenced indices.

Returns:

None.

build_edges(indices: List[int] | ndarray | str = 'all', tag: Hashable | None = None, num_steps: int = 100, **edge_kwargs) Hashable

Construct the loops of the P2CP for the specified subset of indices.

These index values correspond to the indices of the pandas dataframe P2CP.data.

Note

Specifying indices="all" draws the curves for the entire dataframe.

Parameters:
  • indices – which indices of the underlying dataframe to draw on the P2CP. Note, “all” draws the entire dataframe.

  • tag – tag corresponding to specified indices. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the tags.

  • num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.

  • edge_kwargs – additional matplotlib keyword arguments that will be applied to edges constructed for the referenced indices.

Returns:

the unique, Hashable tag used for the constructed edges.
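
The default tag rule ("the lowest unused integer starting at 0 amongst the tags") can be sketched in a few lines. The helper lowest_unused_int_tag is a hypothetical name used only to illustrate the rule; it is not a hiveplotlib function.

```python
from typing import Hashable, Iterable


def lowest_unused_int_tag(existing_tags: Iterable[Hashable]) -> int:
    """Return the smallest non-negative integer not already used as a tag."""
    used = set(existing_tags)
    tag = 0
    while tag in used:
        tag += 1
    return tag


first_tag = lowest_unused_int_tag([])
next_tag = lowest_unused_int_tag([0, 1, "train"])
gap_tag = lowest_unused_int_tag([1, 2])
```

Under this rule, non-integer tags such as "train" never block an integer tag, and a gap left at 0 is filled before any higher integer is used.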

copy() P2CP

Return a copy of the P2CP instance.

Returns:

P2CP instance.

reset_edges(tag: Hashable | None = None) None

Drop the constructed edges with the specified tag.

Note

If no tags were ever set (e.g. there’s only 1 tag of curves), then no tag is necessary here.

Parameters:

tag – which subset of curves to delete. Note, if no tag is specified (e.g. tag=None), then all curves will be deleted.

Returns:

None.

set_axes(columns: List[Hashable] | ndarray, angles: List[float] | None = None, vmins: List[float] | None = None, vmaxes: List[float] | None = None, axis_kwargs: List[Dict] | None = None, overwrite_previously_set_axes: bool = True, start_angle: float = 0) None

Set the axes that will be used in the eventual P2CP visualization.

Parameters:
  • columns – column names from P2CP.data to use. Note, these need not be unique, as repeat axes may be desired. By default, repeat column names will be internally renamed to name + "\nRepeat".

  • angles – corresponding angles (in degrees) to set for each desired axis. Default None sets the angles evenly spaced over 360 degrees, starting at start_angle degrees for the first axis and moving counterclockwise.

  • vmins – list of float values (or None values) specifying the vmin for each axis, where the ith index value corresponds to the ith axis set by columns. A None value infers the global min for that axis. Default None uses the global min for all axes.

  • vmaxes – list of float values (or None values) specifying the vmax for each axis, where the ith index value corresponds to the ith axis set by columns. A None value infers the global max for that axis. Default None uses the global max for all axes.

  • axis_kwargs – list of dictionaries of additional kwargs that will be used for the underlying Axis instances that will be created for each column. Only relevant if you want to change the positioning / length of an axis with the start and end parameters. For more on these kwargs, see the documentation for hiveplotlib.Axis. Note, if you want to add these kwargs for only a subset of the desired axes, you can skip adding kwargs for specific columns by putting a None at those indices in your axis_kwargs input.

  • overwrite_previously_set_axes – Whether to overwrite any previously decided axes. Default True overwrites any existing axes.

  • start_angle – if angles is None, sets the starting angle from which we place the axes around the origin counterclockwise.

Returns:

None.
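
When angles is None, the even spacing described above amounts to stepping 360 / len(columns) degrees per axis, beginning at start_angle. The sketch below shows that arithmetic; evenly_spaced_angles is a hypothetical helper, and leaving angles unwrapped (values may exceed 360 for a large start_angle) is an assumption rather than documented behavior.

```python
from typing import List


def evenly_spaced_angles(num_axes: int, start_angle: float = 0) -> List[float]:
    """Angles in degrees for num_axes axes spread evenly over 360 degrees."""
    if num_axes < 1:
        raise ValueError("num_axes must be at least 1")
    step = 360 / num_axes
    # axes are placed counterclockwise, so each successive angle increases by step
    return [start_angle + i * step for i in range(num_axes)]


three_axes = evenly_spaced_angles(3)
four_axes_rotated = evenly_spaced_angles(4, start_angle=30)
```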

set_data(data: DataFrame) None

Add a dataset to the P2CP instance.

All P2CP construction will be based on this dataset, which will be stored as P2CP.data.

Parameters:

data – dataframe to add.

Returns:

None.

to_json() str

Return the information from the axes, point placement on each axis, and edges in Cartesian space as JSON.

This allows users to visualize P2CPs with arbitrary libraries, even outside of python.

The dictionary structure of the resulting JSON will consist of two top-level keys:

“axes” - contains the information for plotting each axis, plus the points on each axis in Cartesian space.

“edges” - contains the information for plotting the discretized edges in Cartesian space broken up by tag values, plus the corresponding unique IDs of points that go with each tag, as well as any kwargs that were set for plotting each set of points in a given tag.

Returns:

JSON output of axis, point, and edge information.

Quick P2CPs

hiveplotlib.p2cp_n_axes(data: DataFrame, indices: List[int] | List[List[int]] | List[ndarray] | str = 'all', split_on: Hashable | List[Hashable] | None = None, axes: List[Hashable] | None = None, vmins: List[float] | None = None, vmaxes: List[float] | None = None, orient_angle: float = 0, all_edge_kwargs: Dict | None = None, indices_list_kwargs: List[Dict] | None = None) P2CP

Generate a P2CP instance with an arbitrary number of axes for an arbitrary dataframe.

Can specify a desired subset of column names, each of which will become an axis in the resulting P2CP. Default grabs all columns in the dataframe, unless split_on is a column name, in which case that specified column will be excluded from the list of axes in the final P2CP instance. Note, repeat axes (e.g. repeated column names) are allowed here.

Axes will be added in counterclockwise order. Axes will all be the same length and position from the origin.

In deciding what edges of data get drawn (and how they get drawn), the user has several options. The default behavior plots all data points in data with the same keyword arguments. If one instead wanted to plot a subset of data points, one can provide a list of a subset of indices from the dataframe to the indices parameter.

If one wants to plot multiple sets of edges in different styles, there are two means of doing this. The more automated means is to split on the unique values of a column in the provided data. By specifying a column name to the split_on parameter, data will be added in chunks according to the unique values of the specified column. If one instead includes a list of values corresponding to the records in data, data will be added according to the unique values of this provided list. Each subset of data corresponding to a unique column value will be given a separate tag, with the tag being the unique column value. Note, however, this only works when indices="all". If one prefers to split indices manually, one can instead provide a list of lists to the indices parameter, allowing for arbitrary splitting of the data. Regardless of how one chooses to split the data, one can then assign different keyword arguments to each subset of data.

Changes to all the edge kwargs can be effected with the all_edge_kwargs parameter. If providing multiple sets of edges in one of the ways discussed above, however, one can also provide unique kwargs for each set of edges by specifying a corresponding list of dictionaries of kwargs with the indices_list_kwargs parameter.

Specific edge kwargs can also be changed later by running the add_edge_kwargs() method on the returned P2CP instance. If one only added a single set of indices (e.g. indices="all" or indices was provided as a flat list of index values), then this method can simply be called with kwargs. However, if multiple subsets of edges were specified, then one will need to be precise about which tag of edge kwargs to change. If multiple sets were provided via the indices parameter, then the resulting tag for each subset will correspond to the index value in the list of lists in indices. If instead split_on was specified as not None, then tags will be the unique values in the specified column / list of values. Regardless of splitting methodology, existing tags can be found under the returned P2CP.tags.

There is a hierarchy to these kwarg arguments. That is, if redundant / overlapping kwargs are provided for different kwarg parameters, a warning will be raised and priority will be given according to the below hierarchy:

indices_list_kwargs > all_edge_kwargs.
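
One way to picture this hierarchy is as a dictionary merge in which the per-subset kwargs are applied last. The sketch below is an illustration under that assumption; resolve_edge_kwargs is a hypothetical helper, and it omits the warning that the library raises on overlapping keys.

```python
from typing import Dict, List, Optional


def resolve_edge_kwargs(
    all_edge_kwargs: Optional[Dict],
    indices_list_kwargs: Optional[List[Optional[Dict]]],
    num_subsets: int,
) -> List[Dict]:
    """Merge shared and per-subset kwargs, letting per-subset values win."""
    base = all_edge_kwargs or {}
    per_subset = indices_list_kwargs or [{} for _ in range(num_subsets)]
    if len(per_subset) != num_subsets:
        raise ValueError("need one dictionary of kwargs per subset of edges")
    # keys unpacked later override earlier ones, so subset kwargs take priority
    return [{**base, **(kwargs or {})} for kwargs in per_subset]


resolved = resolve_edge_kwargs(
    all_edge_kwargs={"color": "black", "alpha": 0.5},
    indices_list_kwargs=[{"color": "red"}, {}],
    num_subsets=2,
)
```

The first subset ends up red because its own color overrides the shared one, while the second subset keeps the shared black; both inherit the shared alpha.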

Parameters:
  • data – dataframe to add.

  • indices – list of index values from the index of the added dataframe data. Default “all” creates edges for every row in data, but a list input creates edges for only the specified subset. Alternatively, one can provide a list of lists of indices, which will allow for plotting different sets of edges with different kwargs. These subsets will be added to the resulting P2CP instance with tags corresponding to the index value in indices.

  • split_on – column name from data or list of values corresponding to the records of data. If specified as not None, the resulting P2CP instance will split data according to unique values with respect to the column of data / the list of provided values, with each subset of data given a tag of the unique value corresponding to each subset. When specifying a column in data, this column will be excluded from consideration if axes is None. Note: this subsetting can only be run when indices="all". Default None plots all the records in data with the same line kwargs.

  • axes – list of Hashable column names in data. Each column name will be assigned to a separate axis in the resulting P2CP instance, built out in counterclockwise order. Default None grabs all columns in the dataframe, unless split_on is a column name, in which case that specified column will be excluded from the list of axes in the final P2CP instance. Note, repeat axes (e.g. repeated column names) are allowed here.

  • vmins – list of float values (or None values) specifying the vmin for each axis, where the ith index value corresponds to the ith index axis in axes (e.g. the ith axis of the resulting P2CP instance). A None value infers the global min for that axis. Default None uses the global min for all the axes.

  • vmaxes – list of float values (or None values) specifying the vmax for each axis, where the ith index value corresponds to the ith index axis in axes (e.g. the ith axis of the resulting P2CP instance). A None value infers the global max for that axis. Default None uses the global max for all the axes.

  • orient_angle – rotates all axes counterclockwise from their initial angles (default 0 degrees).

  • all_edge_kwargs – kwargs for all edges. Default None specifies no additional kwargs.

  • indices_list_kwargs – list of dictionaries of kwargs for each element of indices when indices is a list of lists or split_on is not None. The ith set of kwargs in indices_list_kwargs will only be applied to index values corresponding to the ith list in indices or to index values which have the ith unique value in a sorted list of unique values in split_on. Default None provides no additional kwargs. Note, this list must be same length as indices or the same number of values as the number of unique values in split_on.

Returns:

P2CP instance.

Utility Functions

Helper static methods for generating and working with P2CP instances.

hiveplotlib.p2cp.indices_for_unique_values(df: DataFrame, column: Hashable) Dict[Hashable, ndarray]

Find the indices corresponding to each unique value in a column of a pandas dataframe.

Works when the values contained in column are numerical or categorical.

Parameters:
  • df – dataframe from which to find index values.

  • column – column of the dataframe to use to find indices corresponding to each of the column’s unique values.

Returns:

dict whose keys are the unique values in the column of data and whose values are 1d arrays of index values.
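
The grouping this function performs can be sketched without pandas: walk the index and the column together, collecting index values under each unique column value. The helper indices_by_unique_value and the sample data below are hypothetical, and plain lists stand in for the 1d arrays returned by the library.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def indices_by_unique_value(
    index: Sequence[Hashable], values: Sequence[Hashable]
) -> Dict[Hashable, List[Hashable]]:
    """Map each unique value to the index labels of the records holding it."""
    if len(index) != len(values):
        raise ValueError("index and values must be the same length")
    groups: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for idx, value in zip(index, values):
        groups[value].append(idx)
    return dict(groups)


row_index = [0, 1, 2, 3]
species = ["cat", "dog", "cat", "bird"]
groups = indices_by_unique_value(row_index, species)
# one list of index values per unique value, e.g. for a list-of-lists indices input
subsets = list(groups.values())
```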

hiveplotlib.p2cp.split_df_on_variable(df: DataFrame, column: Hashable, cutoffs: List[float] | int, labels: List[Hashable] | ndarray | None = None) ndarray

Generate value for each record in a dataframe according to a splitting criterion.

Using either specified cutoff values or a specified number of quantiles for cutoffs, return an (n, 1) np.ndarray where the ith value corresponds to the partition assignment of the ith record of df.

If column corresponds to numerical data, and a list of cutoffs is provided, then dataframe records will be assigned according to the following binning scheme:

(-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf)

If column corresponds to numerical data, and cutoffs is provided as an int, then dataframe records will be assigned into cutoffs equal-sized quantiles.

Note

This method currently only supports splits where column corresponds to numerical data. For splits on categorical data values, see indices_for_unique_values().

Parameters:
  • df – dataframe whose records will be assigned to a partition.

  • column – column of the dataframe to use to assign partition of records.

  • cutoffs – cutoffs to use in partitioning records according to the data under column. When provided as a list, the specified cutoffs will partition according to (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf). When provided as an int, the exact numerical break points will be determined to create cutoffs equal-sized quantiles.

  • labels – labels assigned to each bin. Default None labels each bin as a string based on its range of values. Note, when cutoffs is a list, len(labels) must be 1 greater than len(cutoffs). When cutoffs is an int, len(labels) must be equal to cutoffs.

Returns:

(n, 1) np.ndarray whose values are partition assignments corresponding to records in df.
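
For the case where cutoffs is an int, the idea of equal-sized quantiles can be approximated by ranking the records and dividing the ranks into groups of equal size. The sketch below is a rank-based simplification written for illustration only: quantile_assignments is a hypothetical helper, it returns integer bin numbers rather than labels, and the library's exact break points (particularly with tied values) may differ.

```python
from typing import List, Sequence


def quantile_assignments(values: Sequence[float], num_quantiles: int) -> List[int]:
    """Assign each record a quantile number in 0..num_quantiles-1 by rank."""
    n = len(values)
    if num_quantiles < 1 or n == 0:
        raise ValueError("need at least one record and one quantile")
    # positions of the records, ordered from smallest to largest value
    order = sorted(range(n), key=lambda i: values[i])
    assignments = [0] * n
    for rank, original_position in enumerate(order):
        assignments[original_position] = rank * num_quantiles // n
    return assignments


values = [5, 1, 9, 3, 7, 11]
assignments = quantile_assignments(values, num_quantiles=3)
```

The ith entry of the result corresponds to the ith record, mirroring how the returned array lines up with the records of df.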

Visualization

Matplotlib

matplotlib-backend visualizations in hiveplotlib.

hiveplotlib.viz.matplotlib.axes_viz(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, text_kwargs: dict | None = None, **axes_kwargs) Tuple[Figure, Axes]

matplotlib visualization of axes in a HivePlot or P2CP instance.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axes.

  • fig – default None builds new figure. If a figure is specified, axes will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place labels 10% further past the end of the axis moving from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • text_kwargs – additional kwargs passed to plt.text() call.

  • axes_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a plt.plot() call.

Returns:

matplotlib figure, axis.
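
The buffer and axes_labels_buffer parameters both reduce to simple arithmetic on radii. The sketch below works through that arithmetic with the standard library. The helper names are hypothetical, and placing each label at axes_labels_buffer times the axis end radius along the axis angle is an assumption drawn from the parameter description, not a statement of the library's internals.

```python
import math
from typing import Tuple


def plot_bounds(max_radius: float, buffer: float = 0.1) -> Tuple[float, float]:
    """Symmetric x (and y) limits that leave a margin beyond the longest axis."""
    pad = buffer * max_radius
    return (-max_radius - pad, max_radius + pad)


def label_position(
    axis_end: float, angle_degrees: float, axes_labels_buffer: float = 1.1
) -> Tuple[float, float]:
    """Cartesian position of a label pushed radially past the end of an axis."""
    radius = axes_labels_buffer * axis_end
    phi = math.radians(angle_degrees)
    return radius * math.cos(phi), radius * math.sin(phi)


# an axis ending 5 units from the origin and pointing East (angle 0)
bounds = plot_bounds(max_radius=5, buffer=0.1)
label_x, label_y = label_position(axis_end=5, angle_degrees=0)
```

With these inputs the limits extend to roughly ±5.5, and the label sits about 5.5 units out along the axis direction.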

hiveplotlib.viz.matplotlib.edge_viz(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, tags: Hashable | List[Hashable] | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, axes_off: bool = True, fig_kwargs: dict | None = None, **edge_kwargs) Tuple[Figure, Axes]

matplotlib visualization of constructed edges in a HivePlot or P2CP instance.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • fig – default None builds new figure. If a figure is specified, edges will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, edges will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a matplotlib.collections.LineCollection() call.

Returns:

matplotlib figure, axis.
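
The priority rule for edge_kwargs runs in the opposite direction from what one might expect: kwargs already stored on the instance for a tag win over kwargs passed at visualization time. A minimal sketch of that rule, assuming it behaves like a dictionary merge, follows; kwargs_for_tag is a hypothetical helper and not part of hiveplotlib.

```python
from typing import Dict


def kwargs_for_tag(stored_kwargs: Dict, viz_kwargs: Dict) -> Dict:
    """Combine kwargs so that previously stored values take priority."""
    # stored kwargs are unpacked last, so they win any key collision
    return {**viz_kwargs, **stored_kwargs}


final_kwargs = kwargs_for_tag(
    stored_kwargs={"color": "red"},
    viz_kwargs={"color": "blue", "lw": 2},
)
```

Here the stored color is kept and only the new lw key is picked up, which is why changing an already-set kwarg requires add_edge_kwargs() rather than the visualization call.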

hiveplotlib.viz.matplotlib.hive_plot_viz(hive_plot: HivePlot, fig: Figure | None = None, ax: Axes | None = None, tags: Hashable | List[Hashable] | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) Tuple[Figure, Axes]

matplotlib visualization of a HivePlot instance.

Parameters:
  • hive_plot – HivePlot instance we want to visualize.

  • fig – default None builds new figure. If a figure is specified, hive plot will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, hive plot will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place labels 10% further past the end of the axis moving from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plt.scatter() call.

  • axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.

  • text_kwargs – additional kwargs passed to plt.text() call.

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a matplotlib.collections.LineCollection() call.

Returns:

matplotlib figure, axis.

hiveplotlib.viz.matplotlib.label_axes(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, buffer: float = 0.1, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, **text_kwargs) Tuple[Figure, Axes]

matplotlib visualization of axis labels in a HivePlot or P2CP instance.

For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axis labels.

  • fig – default None builds new figure. If a figure is specified, axis labels will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, axis labels will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place labels 10% further past the end of the axis moving from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • text_kwargs – additional kwargs passed to plt.text() call.

Returns:

matplotlib figure, axis.

hiveplotlib.viz.matplotlib.node_viz(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, axes_off: bool = True, fig_kwargs: dict | None = None, **scatter_kwargs) Tuple[Figure, Axes]

matplotlib visualization of nodes in a HivePlot or P2CP instance that have been placed on its axes.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • fig – default None builds new figure. If a figure is specified, nodes will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, nodes will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • scatter_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plt.scatter() call.

Returns:

matplotlib figure, axis.

hiveplotlib.viz.matplotlib.p2cp_legend(p2cp: P2CP, fig: Figure, ax: Axes, tags: Hashable | List[Hashable] | None = None, title: str = 'Tags', line_kwargs: dict | None = None, **legend_kwargs) Tuple[Figure, Axes]

Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – matplotlib figure on which we will draw the legend.

  • ax – matplotlib axis on which we will draw the legend.

  • tags – which tags of data to include in the legend. Default None uses all tags under p2cp.tags. This can be ignored unless explicitly wanting to exclude certain tags from the legend.

  • title – title of the legend. Default “Tags”.

  • line_kwargs – keyword arguments that will add to / overwrite _all_ of the legend line markers from the defaults used in the original P2CP instance plot. For example, if one plots a large number of lines with low alpha and / or a small lw, one will likely want to include line_kwargs=dict(alpha=1, lw=2) so the representative lines in the legend are legible.

  • legend_kwargs – additional params that will be applied to the legend. Note, these are kwargs that affect a plt.legend() call. Default is to plot the legend in the upper right, outside of the bounding box (e.g. loc="upper left", bbox_to_anchor=(1, 1)).

Returns:

matplotlib figure, axis.

hiveplotlib.viz.matplotlib.p2cp_viz(p2cp: P2CP, fig: Figure | None = None, ax: Axes | None = None, tags: Hashable | List[Hashable] | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) Tuple[Figure, Axes]

matplotlib visualization of a P2CP instance.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – default None builds new figure. If a figure is specified, P2CP will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, P2CP will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place labels 10% further past the end of the axis moving from the origin of the plot).

  • axes_labels_fontsize – font size for P2CP axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all points on axes. Note, these are kwargs that affect a plt.scatter() call.

  • axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a matplotlib.collections.LineCollection() call.

Returns:

matplotlib figure, axis.
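
The bounds arithmetic described for the buffer parameter can be sketched in plain Python. This is a minimal illustration of the formula quoted above, not the library's implementation; the function name buffered_bounds and the example axis radii are assumptions made for the example.

```python
from typing import List, Tuple


def buffered_bounds(axis_ends: List[float], buffer: float = 0.1) -> Tuple[float, float]:
    """Return the symmetric (low, high) bound applied to both x and y.

    axis_ends holds the distance from the origin to the end of each axis
    (hypothetical input for this sketch).
    """
    # Largest radius spanned by any axis.
    max_radius = max(abs(end) for end in axis_ends)
    # Extend the bound past the longest axis by the requested fraction.
    extent = max_radius + buffer * max_radius
    return (-extent, extent)


# Three axes, the longest ending at radius 5, with a 10% buffer.
bounds = buffered_bounds([5.0, 5.0, 3.0], buffer=0.1)
print(bounds)
```

With the longest axis ending at radius 5 and buffer=0.1, both the x and y limits extend half a unit past the axis end on each side.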

Bokeh

bokeh-backend visualizations in hiveplotlib.

hiveplotlib.viz.bokeh.axes_viz(instance: HivePlot | P2CP, fig: figure | None = None, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, label_kwargs: dict | None = None, **line_kwargs) figure

bokeh visualization of axes in a HivePlot or P2CP instance.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axes.

  • fig – default None builds new figure. If a figure is specified, axes will be drawn on that figure.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • width – width of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • label_kwargs – additional kwargs passed to bokeh.models.Label() call.

  • line_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a bokeh.models.Line() call.

Returns:

bokeh figure.
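
To make the axes_labels_buffer description concrete, the sketch below computes where a label would sit for an axis given in polar terms. It assumes the label radius is the axis end multiplied by axes_labels_buffer, which is one reading of the description above; label_position is a hypothetical helper, not part of hiveplotlib.

```python
import math
from typing import Tuple


def label_position(
    axis_end: float, angle_degrees: float, axes_labels_buffer: float = 1.1
) -> Tuple[float, float]:
    """Cartesian (x, y) of a label placed radially past the end of an axis."""
    # Scale the end radius of the axis by the buffer fraction.
    radius = axis_end * axes_labels_buffer
    # Angles are in degrees, measured counterclockwise from East.
    theta = math.radians(angle_degrees)
    return (radius * math.cos(theta), radius * math.sin(theta))


# An axis pointing North (angle=90) that ends at radius 5.
x, y = label_position(axis_end=5, angle_degrees=90)
print(round(x, 6), round(y, 6))
```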

hiveplotlib.viz.bokeh.edge_viz(instance: HivePlot | P2CP, fig: figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, fig_kwargs: dict | None = None, **edge_kwargs) figure

bokeh visualization of constructed edges in a HivePlot or P2CP instance.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • fig – default None builds new figure. If a figure is specified, edges will be drawn on that figure.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a bokeh.models.MultiLine() call.

Returns:

bokeh figure.
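
The priority rule stated for edge_kwargs (kwargs stored on the instance win over kwargs passed to the viz call) behaves like a dictionary merge. The sketch below shows that rule in isolation; resolve_edge_kwargs and the example keys are hypothetical and do not reproduce how hiveplotlib stores edge kwargs internally.

```python
from typing import Any, Dict


def resolve_edge_kwargs(
    stored_kwargs: Dict[str, Any], viz_kwargs: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge kwargs so that previously stored values take priority."""
    # Start from the kwargs supplied at plotting time...
    resolved = dict(viz_kwargs)
    # ...then let anything set beforehand on the instance override them.
    resolved.update(stored_kwargs)
    return resolved


stored = {"line_color": "red"}                   # set earlier on the instance
at_call = {"line_color": "blue", "line_alpha": 0.5}  # passed to the viz call
resolved = resolve_edge_kwargs(stored, at_call)
print(resolved)
```

Under this rule, the only way to change line_color at plot time is to overwrite the stored value first, which is why the parameter description points to the add_edge_kwargs() methods.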

hiveplotlib.viz.bokeh.hive_plot_viz(hive_plot: HivePlot, fig: figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) figure

Create default bokeh visualization of a HivePlot instance.

Parameters:
  • hive_plot – HivePlot instance we want to visualize.

  • fig – default None builds new figure. If a figure is specified, hive plot will be drawn on that figure.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a fig.scatter() call.

  • axes_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a bokeh.models.Line() call.

  • label_kwargs – additional kwargs passed to bokeh.models.Label() call.

  • fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a bokeh.models.MultiLine() call.

Returns:

bokeh figure.
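
The note on fig_kwargs says that width and height supplied there are prioritized over the dedicated width and height parameters. A minimal sketch of that precedence, assuming a simple dictionary merge (resolve_figure_kwargs is a hypothetical helper for illustration):

```python
from typing import Any, Dict, Optional


def resolve_figure_kwargs(
    width: int = 600, height: int = 600, fig_kwargs: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Combine the width / height parameters with fig_kwargs."""
    # The dedicated parameters provide the baseline values.
    merged: Dict[str, Any] = {"width": width, "height": height}
    # Anything in fig_kwargs, including width or height, overrides them.
    merged.update(fig_kwargs or {})
    return merged


figure_kwargs = resolve_figure_kwargs(
    width=600, height=600, fig_kwargs={"width": 800, "title": "My hive plot"}
)
print(figure_kwargs)
```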

hiveplotlib.viz.bokeh.label_axes(instance: HivePlot | P2CP, fig: figure | None = None, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', buffer: float = 0.3, width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, **label_kwargs) figure

bokeh visualization of axis labels in a HivePlot or P2CP instance.

For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axis labels.

  • fig – default None builds new figure. If a figure is specified, axis labels will be drawn on that figure.

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • width – width of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • label_kwargs – additional kwargs passed to bokeh.models.Label() call.

Returns:

bokeh figure.

hiveplotlib.viz.bokeh.node_viz(instance: HivePlot | P2CP, fig: figure | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, fig_kwargs: dict | None = None, **scatter_kwargs) figure

bokeh visualization of nodes in a HivePlot or P2CP instance that have been placed on their axes.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • fig – default None builds new figure. If a figure is specified, nodes will be drawn on that figure.

  • width – width of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • scatter_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a fig.scatter() call.

Returns:

bokeh figure.

hiveplotlib.viz.bokeh.p2cp_legend(p2cp: P2CP, fig: figure, tags: Hashable | List[Hashable] | None = None, title: str = 'Tags') figure

Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.

Note

The legend can be further modified by changing its attributes under fig.legend. For more on the flexibility in changing the legend, see the bokeh.models.Legend() docs.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – bokeh figure on which we will draw the legend.

  • tags – which tags of data to include in the legend. Default None uses all tags under p2cp.tags. This can be ignored unless explicitly wanting to exclude certain tags from the legend.

  • title – title of the legend. Default “Tags”.

Returns:

bokeh figure.
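
Several functions in this module accept tags as None, a single tag, or a list of tags. The sketch below shows one way such an argument could be normalized to a list; it is an assumption about the described behavior rather than the library's code, and normalize_tags and the example tag names are hypothetical.

```python
from typing import Hashable, List, Optional, Union


def normalize_tags(
    tags: Optional[Union[Hashable, List[Hashable]]], all_tags: List[Hashable]
) -> List[Hashable]:
    """Turn the flexible tags argument into a plain list of tags."""
    if tags is None:
        # Default: use every tag available on the instance.
        return list(all_tags)
    if isinstance(tags, list):
        return tags
    # Any other value (including a tuple, which is hashable) is one tag.
    return [tags]


available = ["train", "test"]
print(normalize_tags(None, available))
print(normalize_tags("train", available))
print(normalize_tags(["test"], available))
```

Only list is treated as a collection here, so a tuple such as ("group", 1) still counts as a single hashable tag.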

hiveplotlib.viz.bokeh.p2cp_viz(p2cp: P2CP, fig: figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) figure

Create default bokeh visualization of a P2CP instance.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – default None builds new figure. If a figure is specified, P2CP will be drawn on that figure.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure in pixels. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for P2CP axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all P2CP nodes. Note, these are kwargs that affect a fig.scatter() call.

  • axes_kwargs – additional params that will be applied to all P2CP axes. Note, these are kwargs that affect a bokeh.models.Line() call.

  • label_kwargs – additional kwargs passed to bokeh.models.Label() call.

  • fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a bokeh.models.MultiLine() call.

Returns:

bokeh figure.

Holoviews

holoviews visualizations in hiveplotlib.

Currently, hiveplotlib supports the bokeh and matplotlib backends for holoviews.

hiveplotlib.viz.holoviews.axes_viz(instance: HivePlot | P2CP, fig: Overlay | None = None, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, width: float | None = None, height: float | None = None, center_plot: bool = True, axes_off: bool = True, overlay_kwargs: dict | None = None, text_kwargs: dict | None = None, **curve_kwargs) Overlay

holoviews visualization of axes in a HivePlot or P2CP instance.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axes.

  • fig – default None builds new overlay. If an overlay is specified, axes will be drawn on that overlay.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).

  • overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • text_kwargs – additional kwargs passed to holoviews.Text() call.

  • curve_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a holoviews.Curve() call.

Returns:

holoviews.Overlay.
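
The backend-dependent defaults for width and height can be summarized in a small sketch. The numbers come from the parameter descriptions above; resolve_size is a hypothetical helper, and passing the backend name as a string is a simplification made so that the example runs without holoviews installed.

```python
from typing import Optional, Tuple

# Defaults quoted from the parameter descriptions:
# pixels for "bokeh", inches for "matplotlib".
BACKEND_DEFAULTS = {"bokeh": 600, "matplotlib": 10}


def resolve_size(
    backend: str, width: Optional[float] = None, height: Optional[float] = None
) -> Tuple[float, float]:
    """Fill in width / height defaults for the active holoviews backend."""
    if backend not in BACKEND_DEFAULTS:
        raise ValueError(f"Unsupported backend for this sketch: {backend!r}")
    default = BACKEND_DEFAULTS[backend]
    # Only fall back to the default when the caller left a value as None.
    return (
        default if width is None else width,
        default if height is None else height,
    )


bokeh_size = resolve_size("bokeh")              # pixels
mpl_size = resolve_size("matplotlib", width=8)  # inches
print(bokeh_size, mpl_size)
```

Because the units differ between backends, a size chosen for one backend generally needs to be changed when switching to the other.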

hiveplotlib.viz.holoviews.edge_viz(instance: HivePlot | P2CP, fig: Overlay | None = None, tags: Hashable | List[Hashable] | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, overlay_kwargs: dict | None = None, **curve_kwargs) Overlay

holoviews visualization of constructed edges in a HivePlot or P2CP instance.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • fig – default None builds new overlay. If an overlay is specified, edges will be drawn on that overlay.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).

  • overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • curve_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a holoviews.Curve() call.

Returns:

holoviews.Overlay.

hiveplotlib.viz.holoviews.hive_plot_viz(hive_plot: HivePlot, fig: Overlay | None = None, tags: Hashable | List[Hashable] | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, overlay_kwargs: dict | None = None, **edge_kwargs) Overlay

Create default holoviews visualization of a HivePlot instance.

Parameters:
  • hive_plot – HivePlot instance we want to visualize.

  • fig – default None builds new overlay. If an overlay is specified, the hive plot will be drawn on that overlay.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a holoviews.Points() call.

  • axes_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a holoviews.Curve() call.

  • text_kwargs – additional kwargs passed to holoviews.Text() call.

  • overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a holoviews.Curve() call.

Returns:

holoviews.Overlay.

hiveplotlib.viz.holoviews.label_axes(instance: HivePlot | P2CP, fig: Overlay | None = None, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, buffer: float = 0.3, width: float | None = None, height: float | None = None, center_plot: bool = True, axes_off: bool = True, overlay_kwargs: dict | None = None, **text_kwargs) Overlay

holoviews visualization of axis labels in a HivePlot or P2CP instance.

For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axis labels.

  • fig – default None builds new overlay. If an overlay is specified, axis labels will be drawn on that overlay.

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).

  • overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • text_kwargs – additional kwargs passed to holoviews.Text() call.

Returns:

holoviews.Overlay.

hiveplotlib.viz.holoviews.node_viz(instance: HivePlot | P2CP, fig: Overlay | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, overlay_kwargs: dict | None = None, **points_kwargs) Overlay

holoviews visualization of nodes in a HivePlot or P2CP instance that have been placed on their axes.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • fig – default None builds new overlay. If an overlay is specified, nodes will be drawn on that overlay.

  • width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).

  • overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • points_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a holoviews.Points() call.

Returns:

holoviews.Overlay.

hiveplotlib.viz.holoviews.p2cp_legend(fig: Overlay, **legend_kwargs) Overlay

Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – holoviews.Overlay on which we will draw the legend.

  • legend_kwargs – additional values to be called in hv.Overlay().opts() call.

Returns:

holoviews.Overlay.

hiveplotlib.viz.holoviews.p2cp_viz(p2cp: P2CP, fig: Overlay | None = None, tags: Hashable | List[Hashable] | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, overlay_kwargs: dict | None = None, **edge_kwargs) Overlay

Create default holoviews visualization of a P2CP instance.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – default None builds new overlay. If an overlay is specified, the P2CP will be drawn on that overlay.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only applies when instantiating a new figure (i.e. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.3 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for P2CP axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all P2CP nodes. Note, these are kwargs that affect a holoviews.Points() call.

  • axes_kwargs – additional params that will be applied to all P2CP axes. Note, these are kwargs that affect a holoviews.Curve() call.

  • text_kwargs – additional kwargs passed to holoviews.Text() call.

  • overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a holoviews.Curve() call.

Returns:

holoviews.Overlay.
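
The priority rule described for edge_kwargs above can be pictured as a dictionary merge in which kwargs stored on the instance beforehand win over kwargs supplied at plotting time. The sketch below only illustrates that rule and is not hiveplotlib's internal code; the helper name resolve_edge_kwargs and the example keys are hypothetical.

```python
def resolve_edge_kwargs(stored_kwargs, call_kwargs):
    """Merge kwargs so that values stored on the instance take priority.

    Hypothetical helper: keys in ``stored_kwargs`` (set earlier, e.g. when the
    edges were built) override the same keys in ``call_kwargs`` (passed to the
    viz function).
    """
    merged = dict(call_kwargs)    # start from the kwargs given at plotting time
    merged.update(stored_kwargs)  # earlier, edge-specific settings win
    return merged


stored = {"color": "red"}                   # set beforehand on the edges
at_call = {"color": "black", "alpha": 0.5}  # passed when plotting
resolved = resolve_edge_kwargs(stored, at_call)
print(resolved)
```

Under this rule, a viz-call kwarg only takes effect for keys that were never set on the edges, which is why previously set kwargs need to be overwritten on the instance itself.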

Plotly

plotly-backend visualizations in hiveplotlib.

hiveplotlib.viz.plotly.axes_viz(instance: HivePlot | P2CP, fig: Figure | None = None, line_width: float = 1.5, opacity: float = 1.0, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, layout_kwargs: dict | None = None, label_kwargs: dict | None = None, **line_kwargs) Figure

Visualize axes in a HivePlot or P2CP instance with plotly.

Note

The line_width parameter corresponds to the standard width parameter for plotly lines. We are exposing this parameter with a different name because width is already the standard name for figure width throughout hiveplotlib.viz.

plotly out of the box does not support standard opacity for its line plots like it does for scatter plots, but it does support providing an alpha channel in RGBA / HSVA / HSLA strings. The opacity parameter in this function call will behave as opacity behaves for plotly scatter plots, as long as the user-provided colors are either standard named CSS colors (e.g. “blue”, “navy”, “green”) or hex colors.

Users who prefer to provide colors as multi-channel RGBA / HSVA / HSLA strings will override the opacity parameter. For more on how to provide multi-channel color strings, see the plotly docs for the color parameter for lines.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axes.

  • fig – default None builds new figure. If a figure is specified, axes will be drawn on that figure.

  • line_width – width of axes.

  • opacity – opacity of axes. Must be in [0, 1].

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% further past the end of its axis, moving away from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • width – width of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • height – height of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting plotly figure (default True hides the x and y axes).

  • layout_kwargs – additional values for the layout parameter to be called in plotly.graph_objects.Figure() call. Note, if width and height are added here, then they will be prioritized over the width and height parameters.

  • label_kwargs – additional kwargs passed to the textfont parameter of plotly.graph_objects.Scatter(). For examples of parameter options, see the plotly docs.

  • line_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.

Returns:

plotly figure.
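
The note above says that opacity is applied to line colors through an alpha channel. One way to picture this for hex colors is the conversion below. It is a minimal sketch under the assumption that the color is a "#RRGGBB" or "#RGB" string; the helper name hex_to_rgba is hypothetical and this is not hiveplotlib's own conversion code.

```python
def hex_to_rgba(hex_color, opacity):
    """Convert '#RRGGBB' (or '#RGB') plus an opacity in [0, 1] to 'rgba(...)'."""
    if not 0 <= opacity <= 1:
        raise ValueError("opacity must be in [0, 1]")
    digits = hex_color.lstrip("#")
    if len(digits) == 3:  # expand shorthand such as '#0af' to '00aaff'
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"unsupported hex color: {hex_color!r}")
    # Read the red, green, and blue channels as two hex digits each.
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgba({red}, {green}, {blue}, {opacity})"


axis_color = hex_to_rgba("#000080", opacity=0.5)  # navy at half opacity
print(axis_color)
```

A color that already carries its own alpha channel has nothing left to convert, which is consistent with multi-channel color strings overriding the opacity parameter.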

hiveplotlib.viz.plotly.edge_viz(instance: HivePlot | P2CP, fig: Figure | None = None, tags: Hashable | List[Hashable] | None = None, line_width: float = 1.5, opacity: float = 0.5, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, layout_kwargs: dict | None = None, **edge_kwargs) Figure

Visualize constructed edges in a HivePlot or P2CP instance with plotly.

Note

The line_width parameter corresponds to the standard width parameter for plotly lines. We are exposing this parameter with a different name because width is already the standard name for figure width throughout hiveplotlib.viz.

plotly out of the box does not support standard opacity for its line plots like it does for scatter plots, but it does support providing an alpha channel in RGBA / HSVA / HSLA strings. The opacity parameter in this function call will behave as opacity behaves for plotly scatter plots, as long as the user-provided colors are either standard named CSS colors (e.g. “blue”, “navy”, “green”) or hex colors.

Users who prefer to provide colors as multi-channel RGBA / HSVA / HSLA strings will override the opacity parameter. For more on how to provide multi-channel color strings, see the plotly docs for the color parameter for lines.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • fig – default None builds new figure. If a figure is specified, edges will be drawn on that figure.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • line_width – width of edges.

  • opacity – opacity of edges. Must be in [0, 1].

  • width – width of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • height – height of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in resulting plotly figure (default True hides the x and y axes).

  • layout_kwargs – additional values for the layout parameter to be called in plotly.graph_objects.Figure() call. Note, if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.

Returns:

plotly figure.
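
The bounds formula given for the buffer parameter can be checked with a few lines of arithmetic. The sketch below assumes that the "maximum radius spanned by any Axis instance" is the largest axis end value; the helper name buffered_bounds is hypothetical and is not part of hiveplotlib.

```python
def buffered_bounds(axis_ends, buffer=0.3):
    """Return the (lower, upper) limits shared by the x and y dimensions.

    ``axis_ends`` holds the outer radius (the ``end`` value) of each axis,
    which is assumed here to be what determines the maximum radius.
    """
    max_radius = max(axis_ends)
    pad = buffer * max_radius  # extra margin added on each side
    return (-max_radius - pad, max_radius + pad)


# Three axes, the longest ending at radius 5, with the default 0.3 buffer.
bounds = buffered_bounds([5, 5, 4], buffer=0.3)
print(bounds)
```

Because the same limits are used for x and y, the plot stays square around (0, 0) regardless of how the axes are angled.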

hiveplotlib.viz.plotly.hive_plot_viz(hive_plot: HivePlot, fig: Figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, layout_kwargs: dict | None = None, **edge_kwargs) Figure

Create default plotly visualization of a HivePlot instance.

Note

The line width and opacity of axes can be changed by including the line_width and opacity parameters, respectively, in axes_kwargs. See the documentation for hiveplotlib.viz.plotly.axes_viz() for more information.

If the line width and opacity of edges were not set in the original hive plot, then these parameters can be set by including the line_width and opacity parameters, respectively, as additional keyword arguments. See the documentation for hiveplotlib.viz.plotly.edge_viz() for more information.

Parameters:
  • hive_plot – HivePlot instance we want to visualize.

  • fig – default None builds new figure. If a figure is specified, hive plot will be drawn on that figure.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • height – height of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% further past the end of its axis, moving away from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in resulting plotly figure (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Marker() call.

  • axes_kwargs – additional params that will be applied to all hive plot axes. This includes the line_width and opacity parameters in hiveplotlib.viz.plotly.axes_viz(). Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.

  • label_kwargs – additional kwargs passed to the textfont parameter of plotly.graph_objects.Scatter(). For examples of parameter options, see the plotly docs.

  • layout_kwargs – additional values for the layout parameter to be called in plotly.graph_objects.Figure() call. Note, if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). This includes the line_width and opacity parameters in hiveplotlib.viz.plotly.edge_viz(). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.

Returns:

plotly figure.
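
The interaction between the width and height parameters and layout_kwargs can be sketched as follows: the explicit parameters provide defaults, and any width or height inside layout_kwargs replaces them. This is an illustration of the documented precedence only; the helper name resolve_layout is hypothetical and the real function may assemble its layout differently.

```python
def resolve_layout(width=600, height=600, layout_kwargs=None):
    """Build a layout dict in which ``layout_kwargs`` entries win."""
    layout = {"width": width, "height": height}  # explicit parameters first
    layout.update(layout_kwargs or {})           # layout_kwargs take priority
    return layout


layout = resolve_layout(
    width=600,
    height=600,
    layout_kwargs={"width": 900, "title": "demo"},
)
print(layout)
```

Here the figure would end up 900 pixels wide and 600 pixels tall, since only width was overridden.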

hiveplotlib.viz.plotly.label_axes(instance: HivePlot | P2CP, fig: Figure | None = None, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, buffer: float = 0.3, width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, layout_kwargs: dict | None = None, **label_kwargs) Figure

Visualize axis labels in a HivePlot or P2CP instance with plotly.

For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw axis labels.

  • fig – default None builds new figure. If a figure is specified, axis labels will be drawn on that figure.

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% further past the end of its axis, moving away from the origin of the plot).

  • axes_labels_fontsize – font size for axes labels.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • width – width of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • height – height of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting plotly figure (default True hides the x and y axes).

  • layout_kwargs – additional values for the layout parameter to be called in plotly.graph_objects.Figure() call. Note, if width and height are added here, then they will be prioritized over the width and height parameters.

  • label_kwargs – additional kwargs passed to the textfont parameter of plotly.graph_objects.Scatter(). For examples of parameter options, see the plotly docs.

Returns:

plotly figure.
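
To see what axes_labels_buffer does geometrically, recall that each Axis is defined in polar coordinates by its end radius and a counterclockwise angle in degrees (angle=0 points East, angle=90 points North). The sketch below converts a buffered label radius to Cartesian coordinates. The helper name label_position is hypothetical, and the exact anchoring and alignment of labels in hiveplotlib may differ.

```python
import math


def label_position(axis_end, angle_degrees, axes_labels_buffer=1.25):
    """Return the (x, y) point at which an axis label would be centered."""
    radius = axis_end * axes_labels_buffer  # push the label past the axis end
    theta = math.radians(angle_degrees)     # counterclockwise from East
    return (radius * math.cos(theta), radius * math.sin(theta))


# An axis ending at radius 5 and pointing North, with a 10% label buffer.
x, y = label_position(axis_end=5, angle_degrees=90, axes_labels_buffer=1.1)
print(round(x, 6), round(y, 6))
```

A buffer of exactly 1 would place the label at the tip of the axis, so values slightly above 1 are what keep labels from overlapping the axis line.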

hiveplotlib.viz.plotly.node_viz(instance: HivePlot | P2CP, fig: Figure | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, layout_kwargs: dict | None = None, **scatter_kwargs) Figure

Visualize nodes in a HivePlot or P2CP instance that have been placed on their axes with plotly.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • fig – default None builds new figure. If a figure is specified, nodes will be drawn on that figure.

  • width – width of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • height – height of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • axes_off – whether to turn off Cartesian x, y axes in resulting plotly figure (default True hides the x and y axes).

  • layout_kwargs – additional values for the layout parameter to be called in plotly.graph_objects.Figure() call. Note, if width and height are added here, then they will be prioritized over the width and height parameters.

  • scatter_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Marker() call.

Returns:

plotly figure.

hiveplotlib.viz.plotly.p2cp_legend(p2cp: P2CP, fig: Figure, tags: Hashable | List[Hashable] | None = None, title: str = 'Tags', **legend_kwargs) Figure

Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – plotly figure on which we will draw the legend.

  • tags – which tags of data to include in the legend. Default None uses all tags under p2cp.tags. This can be ignored unless explicitly wanting to exclude certain tags from the legend.

  • title – title of the legend. Default “Tags”.

  • legend_kwargs – additional values for the legend parameter in the plotly.graph_objects.Figure.update_layout() call.

Returns:

plotly figure.
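
Several functions in this module accept tags as None, a single tag, or a list of tags. The sketch below shows one way such an argument could be normalized into a list before plotting. The helper name resolve_tags is hypothetical, and treating only list instances as "a list of tags" is an assumption made for this illustration.

```python
def resolve_tags(tags, all_tags):
    """Normalize a ``tags`` argument into a list of tags.

    ``None`` selects every available tag, a list is used as given, and any
    other (hashable) value is treated as one single tag.
    """
    if tags is None:
        return list(all_tags)
    if isinstance(tags, list):
        return tags
    return [tags]


available = ["group_a", "group_b", "group_c"]
print(resolve_tags(None, available))       # every tag
print(resolve_tags("group_a", available))  # one tag, wrapped in a list
```

Passing an explicit list is therefore only needed when some tags should be left out, for example to exclude them from a legend.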

hiveplotlib.viz.plotly.p2cp_viz(p2cp: P2CP, fig: Figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, layout_kwargs: dict | None = None, **edge_kwargs) Figure

Create default plotly visualization of a P2CP instance.

Note

The line width and opacity of axes can be changed by including the line_width and opacity parameters, respectively, in axes_kwargs. See the documentation for hiveplotlib.viz.plotly.axes_viz() for more information.

If the line width and opacity of edges were not set in the original P2CP, then these parameters can be set by including the line_width and opacity parameters, respectively, as additional keyword arguments. See the documentation for hiveplotlib.viz.plotly.edge_viz() for more information.

Parameters:
  • p2cp – P2CP instance we want to visualize.

  • fig – default None builds new figure. If a figure is specified, P2CP will be drawn on that figure.

  • tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.

  • width – width of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • height – height of figure in pixels. Note: only works if instantiating new figure (e.g. fig is None).

  • center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% further past the end of its axis, moving away from the origin of the plot).

  • axes_labels_fontsize – font size for P2CP axes labels.

  • axes_off – whether to turn off Cartesian x, y axes in resulting plotly figure (default True hides the x and y axes).

  • node_kwargs – additional params that will be applied to all P2CP nodes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Marker() call.

  • axes_kwargs – additional params that will be applied to all P2CP axes. This includes the line_width and opacity parameters in hiveplotlib.viz.plotly.axes_viz(). Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.

  • label_kwargs – additional kwargs passed to the textfont parameter of plotly.graph_objects.Scatter(). For examples of parameter options, see the plotly docs.

  • layout_kwargs – additional values for the layout parameter to be called in plotly.graph_objects.Figure() call. Note, if width and height are added here, then they will be prioritized over the width and height parameters.

  • edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). This includes the line_width and opacity parameters in hiveplotlib.viz.plotly.edge_viz(). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.

Returns:

plotly figure.

Datashader in Matplotlib

Datashading capabilities for hiveplotlib.

hiveplotlib.viz.datashader.datashade_edges_mpl(instance: HivePlot | P2CP, tag: Hashable | None = None, cmap: str | ListedColormap = <matplotlib.colors.ListedColormap object>, vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 2, reduction: callable = <datashader.reductions.count object>, buffer: float = 0.1, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 300, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) Tuple[Figure, Axes, AxesImage]

matplotlib visualization of constructed edges in a HivePlot or P2CP instance using datashader.

The main idea of datashader is that, rather than plotting all the lines on top of each other in a figure, one can instead build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller than the full set of lines. By using the default reduction function reduction=ds.count (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A high dpi value is recommended when datashading to allow for more nuance in the rasterization. This is why this visualization function defaults to a dpi value of 300 when fig=None and ax=None.

Experimentation with different (low) values for pixel_spread is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • tag – which tag of data to plot. If None is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.

  • cmap – which colormap to use for the datashaded edges. Default is a seaborn colormap similar to the matplotlib "Blues" colormap.

  • vmin – minimum value used in the colormap for plotting the rasterization of curves. Default 1.

  • vmax – maximum value used in the colormap for plotting the rasterization of curves. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default ds.count() essentially builds a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 2 pixels. For more on spreading, see the datashader documentation.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • fig – default None builds new figure. If a figure is specified, edges will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, edges will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization.

Returns:

matplotlib figure, axis, image.
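
The "2d histogram" idea behind the default count reduction can be illustrated without datashader at all. The sketch below bins a handful of sample points (for example, points sampled along edge curves) into a small square pixel grid and counts how many land in each pixel. It is a pure-Python illustration, not datashader's implementation, and the function name count_raster is hypothetical.

```python
def count_raster(points, bounds, n_pixels):
    """Count how many (x, y) points fall into each cell of a square grid.

    ``bounds`` is the (lower, upper) extent shared by x and y, and the result
    is indexed as ``grid[row][col]`` with row following y and col following x.
    """
    lower, upper = bounds
    grid = [[0] * n_pixels for _ in range(n_pixels)]
    scale = n_pixels / (upper - lower)  # pixels per data unit
    for x, y in points:
        if not (lower <= x <= upper and lower <= y <= upper):
            continue  # ignore points outside the plotted extent
        # Points exactly on the upper bound go into the last pixel.
        col = min(int((x - lower) * scale), n_pixels - 1)
        row = min(int((y - lower) * scale), n_pixels - 1)
        grid[row][col] += 1
    return grid


samples = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9)]
raster = count_raster(samples, bounds=(0, 1), n_pixels=2)
print(raster)
```

The resulting grid has a fixed size no matter how many points or curves are drawn, which is why plotting the rasterization scales better than plotting every edge individually.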

hiveplotlib.viz.datashader.datashade_hive_plot_mpl(instance: HivePlot | P2CP, tag: Hashable | None = None, cmap_edges: str | ListedColormap = <matplotlib.colors.ListedColormap object>, cmap_nodes: str | ListedColormap = 'copper', vmin_nodes: float = 1, vmax_nodes: float | None = None, vmin_edges: float = 1, vmax_edges: float | None = None, log_cmap: bool = True, pixel_spread_nodes: int = 15, pixel_spread_edges: int = 2, reduction: callable = <datashader.reductions.count object>, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 300, axes_off: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, fig_kwargs: dict | None = None, **im_kwargs) Tuple[Figure, Axes, AxesImage, AxesImage]

matplotlib visualization of a HivePlot or P2CP instance using datashader.

Plots both nodes and edges with datashader along with standard hive plot / P2CP axes.

The main idea of datashader is that, rather than plotting all the lines on top of each other in a figure, one can instead build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller than the full set of lines. By using the default reduction function reduction=ds.count (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A high dpi value is recommended when datashading to allow for more nuance in the rasterization. This is why this visualization function defaults to a dpi value of 300 when fig=None and ax=None.

Experimentation with different (low) values for pixel_spread_nodes and pixel_spread_edges is encouraged. As the names suggest, these parameters spread out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.

Parameters:
  • instance – HivePlot or P2CP instance we want to visualize.

  • tag – which tag of data to plot. If None is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.

  • cmap_edges – which colormap to use for the datashaded edges. Default is a seaborn colormap similar to the matplotlib "Blues" colormap.

  • cmap_nodes – which colormap to use for the datashaded nodes. Default “copper”.

  • vmin_nodes – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.

  • vmax_nodes – maximum value used in the colormap for plotting the rasterization of nodes. Default None finds and uses the maximum bin value of the calculated rasterization.

  • vmin_edges – minimum value used in the colormap for plotting the rasterization of edges. Default 1.

  • vmax_edges – maximum value used in the colormap for plotting the rasterization of edges. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default ds.count() essentially builds a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread_nodes – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 15 pixels. For more on spreading, see the datashader documentation.

  • pixel_spread_edges – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 2 pixels. For more on spreading, see the datashader documentation.

  • fig – default None builds new figure. If a figure is specified, the visualization will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, the visualization will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis.)

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% further past the end of its axis, moving away from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.

  • text_kwargs – additional kwargs passed to plt.text() call.

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization.

Returns:

matplotlib figure, axis, the image corresponding to node data, and the image corresponding to edge data.
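
The effect of log_cmap together with the vmin and vmax parameters can be sketched as a base-10 logarithmic rescaling of bin counts onto [0, 1], the range a colormap is evaluated on. The sketch assumes that counts outside [vmin, vmax] are clipped to that range, which this text does not specify; the function name log_normalize is hypothetical and this is not hiveplotlib's own code.

```python
import math


def log_normalize(counts, vmin=1, vmax=None):
    """Map bin counts onto [0, 1] using a base-10 logarithmic scale."""
    if vmax is None:
        vmax = max(counts)  # mirror the documented default: use the max bin
    if vmax <= vmin:
        raise ValueError("vmax must be larger than vmin")
    span = math.log10(vmax) - math.log10(vmin)
    normalized = []
    for count in counts:
        clipped = min(max(count, vmin), vmax)  # assumption: clip to the range
        normalized.append((math.log10(clipped) - math.log10(vmin)) / span)
    return normalized


scaled = log_normalize([1, 10, 100])
print(scaled)
```

On a linear scale the bin with 10 counts would sit at roughly 9% of the colormap, so sparse curves would be nearly invisible next to dense ones; the logarithmic scale moves it to the midpoint.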

hiveplotlib.viz.datashader.datashade_nodes_mpl(instance: HivePlot | P2CP, cmap: str | ListedColormap = 'copper', vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 15, reduction: callable = <datashader.reductions.count object>, buffer: float = 0.1, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 300, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) Tuple[Figure, Axes, AxesImage]

matplotlib visualization of nodes / points in a HivePlot / P2CP instance using datashader.

The main idea of datashader is that, rather than plotting all the points on top of each other in a figure, one can instead build up a single 2d image of the points in 2-space. We can then plot just this rasterization, which is much smaller than the full set of points. By using the default reduction function reduction=ds.count (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A high dpi value is recommended when datashading to allow for more nuance in the rasterization. This is why this visualization function defaults to a dpi value of 300 when fig=None and ax=None. Since we are interested in positions rather than the lines from hiveplotlib.viz.datashader.datashade_edges_mpl(), though, one will likely need a much larger pixel_spread value here, on the order of 10 times larger, to see the node density well in the final visualization.

Experimentation with different values for pixel_spread is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in smaller, harder-to-see points in the final visualization. For more on spreading, see the datashader documentation.
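
To build intuition for why a larger pixel_spread makes isolated nodes easier to see, the sketch below spreads each nonzero pixel of a small grid over a circular neighborhood. It is a pure-Python illustration only: datashader's own spreading operator offers several mask shapes and ways of combining overlapping values, and the circular mask with additive combination used here is an assumption. The function name spread_pixels is hypothetical.

```python
def spread_pixels(grid, px):
    """Spread every nonzero pixel over a circular neighborhood of radius px."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            value = grid[r][c]
            if value == 0:
                continue
            for dr in range(-px, px + 1):
                for dc in range(-px, px + 1):
                    if dr * dr + dc * dc > px * px:
                        continue  # outside the circular mask
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        out[rr][cc] += value  # overlapping spreads add up
    return out


# A single node occupying one pixel in the middle of a 5 x 5 raster.
raster = [[0] * 5 for _ in range(5)]
raster[2][2] = 1
spread = spread_pixels(raster, px=1)
for row in spread:
    print(row)
```

With px=1 the single pixel grows into a five-pixel plus shape; a node raster generally needs a much larger radius than an edge raster because a point covers far fewer pixels than a curve does.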

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • cmap – which colormap to use for the datashaded nodes. Default “copper”.

  • vmin – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.

  • vmax – maximum value used in the colormap for plotting the rasterization of nodes. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default ds.count() essentially builds a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 15 pixels. For more on spreading, see the datashader documentation.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • fig – default None builds new figure. If a figure is specified, the datashaded nodes will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, the datashaded nodes will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional keyword arguments passed to the plt.subplots() call. Note that if figsize is included here, it takes priority over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization.

Returns:

matplotlib figure, axis, image.

Example Datasets

Quick example datasets for use in hiveplotlib.

For Hive Plots, many excellent network datasets are available online, including many graphs that can be generated using networkx and pytorch-geometric. The Stanford Large Network Dataset Collection is also a great general source of network datasets. If working with networkx graphs, users can also take advantage of the hiveplotlib.converters.networkx_to_nodes_edges() method to quickly get those graphs into a hiveplotlib-ready format.

For Polar Parallel Coordinates Plots (P2CPs), many datasets are available through packages including statsmodels and scikit-learn.

hiveplotlib.datasets.example_hive_plot(num_nodes: int = 15, num_edges: int = 30, seed: int = 0, **hive_plot_n_axes_kwargs) HivePlot

Generate example hive plot with "Low", "Medium", and "High" axes (plus repeat axes).

Nodes and edges will be generated and placed randomly.

Parameters:
  • num_nodes – number of nodes to generate.

  • num_edges – number of edges to generate.

  • seed – random seed to use when generating nodes and edges.

  • hive_plot_n_axes_kwargs – additional keyword arguments for the underlying hiveplotlib.hive_plot_n_axes() call.

Returns:

resulting HivePlot instance.

hiveplotlib.datasets.example_nodes_and_edges(num_nodes: int = 100, num_edges: int = 200, num_axes: int = 3, seed: int = 0) Tuple[List[Node], List[List[Hashable]], ndarray]

Generate example nodes, node splits (one list of nodes per intended axis), and edges.

Each node will have a "low", "med", and "high" value, where these values are randomly generated, and as the names suggest, for the resulting values of each node, "low" < "med" < "high".

Parameters:
  • num_nodes – how many nodes to randomly generate. Node unique IDs will be the integers 0, 1, … , num_nodes - 1.

  • num_edges – how many edges to randomly generate.

  • num_axes – how many axes into which to partition the randomly generated nodes.

  • seed – random seed to use when randomly generating node and edge data.

Returns:

list of generated Node instances, a list of num_axes lists that evenly split the node IDs to be allocated to their own axes, and a (num_edges, 2) shaped array of random edges between nodes.

hiveplotlib.datasets.example_p2cp(num_points: int = 50, noise: float = 0.5, random_seed: int = 0, four_colors: Tuple[str, str, str, str] = ('#de8f05', '#029e73', '#cc78bc', '#0173b2'), **p2cp_n_axes_kwargs) P2CP

Generate example P2CP of four Gaussian blobs.

Points will be generated by calling hiveplotlib.datasets.four_gaussian_blobs_3d() and turned into a P2CP via hiveplotlib.p2cp_n_axes().

Parameters:
  • num_points – number of points in each Gaussian blob.

  • noise – noisiness of Gaussian blobs.

  • random_seed – random seed to generate consistent data between calls.

  • four_colors – four colors to use for four Gaussian blobs.

  • p2cp_n_axes_kwargs – additional keyword arguments for the underlying hiveplotlib.p2cp_n_axes() call.

Returns:

resulting P2CP instance.

hiveplotlib.datasets.four_gaussian_blobs_3d(num_points: int = 50, noise: float = 0.5, random_seed: int = 0) DataFrame

Generate a pandas dataframe of four Gaussian blobs in 3d.

This dataset serves as a simple example for showing 3d viz using Polar Parallel Coordinates Plots (P2CPs) instead of 3d plotting.

Parameters:
  • num_points – number of points in each blob.

  • noise – noisiness of Gaussian blobs.

  • random_seed – random seed to generate consistent data between calls.

Returns:

(num_points * 4, 4) pd.DataFrame of X, Y, Z, and blob labels.

hiveplotlib.datasets.international_trade_data(year: int = 2019, hs92_code: int = 8112, path: str | Path | None = None) Tuple[DataFrame, Dict]

Read in international trade data network from the Harvard Growth Lab.

Note

Only a limited number of subsets of the data are shipped with hiveplotlib, as each year of trade data is roughly 300 MB. However, the raw data are available at the Harvard Growth Lab’s website, and the runner to produce the necessary files to use this reader function is available in the repository (make_trade_network_dataset.py).

If you are using the runner to make your own trade datasets that you will read in locally with this function, then you will need to specify the local path accordingly.

Parameters:
  • year – which year of data to pull. If the year of data is not available, an error will be raised.

  • hs92_code – which HS 92 code of export data to pull. If the code requested is not available, an error will be raised. Codes come in different numbers of digits (e.g. 2, 4), where more digits correspond to a more specific trade group. For a reference to what trade groups these codes correspond to, see this resource.

  • path – directory containing both the data and metadata for loading. Default None assumes you are using one of the datasets shipped with hiveplotlib. If you are using the make_trade_network_dataset.py runner discussed above to make your own datasets, then you will need to specify the path to the directory where you saved both the data and metadata files (which must be in the same directory).

Returns:

pandas.DataFrame of trade data, dictionary of metadata explaining meaning of data’s columns, data provenance, citations, etc.

Raises:

AssertionError if the requested files cannot be found.