Low-Level Hive Plot API
- class hiveplotlib.Node(unique_id: Hashable, data: Dict | None = None)
Node instances hold the data for an individual network node.
Each instance is initialized with a unique_id for identification. These IDs must be Hashable. One can also initialize with a dictionary of data, but data can also be added later with the add_data() method.
- Example:
from hiveplotlib import Node

# my_dataset and my_second_dataset are dictionaries of node data
my_node = Node(unique_id="my_unique_node_id", data=my_dataset)
my_second_node = Node(unique_id="my_second_unique_node_id")
my_second_node.add_data(data=my_second_dataset)
- class hiveplotlib.Axis(axis_id: Hashable, start: float = 1, end: float = 5, angle: float = 0, long_name: Hashable | None = None, metadata: dict | None = None)
Axis instance.
Axis instances are initialized based on their intended final position when plotted. Each Axis is also initialized with a unique, hashable axis_id for clarity when building hive plots with multiple axes.
The eventual size and positioning of the Axis instance is dictated in the context of polar coordinates by three parameters:
start dictates the distance from the origin to the beginning of the axis when eventually plotted.
end dictates the distance from the origin to the end of the axis when eventually plotted.
angle sets the angle the Axis is rotated counterclockwise. For example, angle=0 points East, angle=90 points North, and angle=180 points West.
Node instances placed on each Axis instance will be scaled to fit onto the span of the Axis, but this is discussed further in the HivePlot class, which handles this placement.
Since axis_id values may be shorthand for easy referencing when typing code, if one desires a formal name to plot against each axis when visualizing, one can provide a separate long_name that will show up as the axis label when running hiveplotlib.viz code. (For example, one may choose axis_id="a1" and long_name="Axis 1".)
Note
long_name defaults to axis_id if not specified.
- Example:
from hiveplotlib import Axis

# 3 axes, spaced out 120 degrees apart, all size 4, starting 1 unit off of origin
axis0 = Axis(axis_id="a0", start=1, end=5, angle=0, long_name="Axis 0")
axis1 = Axis(axis_id="a1", start=1, end=5, angle=120, long_name="Axis 1")
axis2 = Axis(axis_id="a2", start=1, end=5, angle=240, long_name="Axis 2")
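To make the polar parameterization concrete, the sketch below converts start, end, and angle into Cartesian endpoints for the three axes in the example above. It uses only the standard library and is not hiveplotlib code; the helper name axis_endpoints is hypothetical.

```python
import math


def axis_endpoints(start: float, end: float, angle: float):
    """Return ((x0, y0), (x1, y1)) for an axis given polar radii and an angle in degrees."""
    theta = math.radians(angle)  # measured counterclockwise from East
    unit = (math.cos(theta), math.sin(theta))
    return (
        (start * unit[0], start * unit[1]),
        (end * unit[0], end * unit[1]),
    )


# Same three axes as the example above: 120 degrees apart, from rho=1 to rho=5.
for axis_id, angle in [("a0", 0), ("a1", 120), ("a2", 240)]:
    (x0, y0), (x1, y1) = axis_endpoints(start=1, end=5, angle=angle)
    print(f"{axis_id}: ({x0:.2f}, {y0:.2f}) -> ({x1:.2f}, {y1:.2f})")
```

With angle=90 the axis runs straight up the y-axis, matching the "angle=90 points North" convention described above.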
- add_metadata(metadata: dict) -> None
Add metadata to the axis.
This method will overwrite existing metadata with the same keys.
- Parameters:
metadata – dictionary of metadata to add to the axis.
- Returns:
None.
- set_node_placements(placements_df: DataFrame, unique_id: Hashable) -> None
Set Axis.node_placements to a pandas.DataFrame of node placement information with node metadata.
The dataframe consists of x Cartesian coordinates, y Cartesian coordinates, unique node IDs, and polar rho values (i.e. distance from the origin).
Note
This is an internal setter method to be called downstream by the HivePlot.place_nodes_on_axis() method.
- Parameters:
placements_df – dataframe of placement information and other node metadata.
unique_id – column corresponding to node unique IDs.
- Returns:
None.
- set_node_vmin_and_vmax(vmin: float, vmax: float, inferred_vmin: bool, inferred_vmax: bool) -> None
Set the vmin and vmax values used to place nodes on the axis.
Note
This is an internal setter method to be called downstream by the HivePlot.place_nodes_on_axis() method.
- Parameters:
vmin – all node scalar values less than vmin would have been set to vmin.
vmax – all node scalar values greater than vmax would have been set to vmax.
inferred_vmin – whether the vmin value was inferred in HivePlot.place_nodes_on_axis().
inferred_vmax – whether the vmax value was inferred in HivePlot.place_nodes_on_axis().
- Returns:
None.
- set_sorting_variable(label: Hashable) -> None
Set which scalar variable in each Node instance will be used to place each node on the axis when plotting.
Note
This is an internal setter method to be called downstream by the HivePlot.place_nodes_on_axis() method.
- Parameters:
label – which scalar variable in the node data to reference.
- Returns:
None.
- class hiveplotlib.BaseHivePlot(use_numba: bool = True, n_parallel: int | None = None)
Hive Plots built from a combination of Axis and Node instances.
This class is essentially a set of methods for creating and maintaining the nested dictionary attribute hive_plot_edges, which holds constructed Bézier curves, edge ids, and matplotlib keyword arguments for various sets of edges to be plotted. The nested dictionary structure can be abstracted to the below example.
BaseHivePlot.hive_plot_edges["starting axis"]["ending axis"]["tag"]
The resulting dictionary value holds the edge information relating to an addition of edges that are tagged as “tag,” specifically the edges going FROM the axis named “starting axis” TO the axis named “ending axis.” This value is in fact another dictionary, meant to hold the discretized Bézier curves (curves), the matplotlib keyword arguments for plotting (edge_kwargs), and the abstracted edge ids (an (m, 2) np.ndarray) between which we are drawing Bézier curves (ids).
- add_axes(axes: Axis | List[Axis]) -> None
Add list of Axis instances to axes attribute.
Note
All resulting Axis IDs must be unique.
- Parameters:
axes – Axis object(s) to add to axes attribute.
- Returns:
None.
- add_edge_curves_between_axes(axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0, use_numba_curves: bool | None = None) -> None
Construct discretized edge curves between two axes of a Hive Plot.
Note
One must run the add_edge_ids() method first for the two axes of interest.
Resulting discretized Bézier curves will be stored as an (n, 2) numpy.ndarray of multiple sampled curves where the first column is x position and the second column is y position in Cartesian coordinates.
Note
Although each curve is represented by a (num_steps, 2) array, all the curves are stored in a single collective numpy.ndarray separated by rows of [np.nan, np.nan] between each discretized curve. This allows matplotlib to accept a single array when plotting lines via plt.plot(), which speeds up plotting later.
This output will be stored in hive_plot_edges[axis_id_1][axis_id_2][tag]["curves"].
- Parameters:
axis_id_1 – pointer to first of two Axis instances in the axes attribute between which we want to find connections.
axis_id_2 – pointer to second of two Axis instances in the axes attribute between which we want to find connections.
tag – unique ID specifying which subset of edges specified by their IDs to construct (e.g. hive_plot_edges[axis_id_1][axis_id_2][tag]["ids"]). Note, if no tag is specified (i.e. tag=None), it is presumed there is only one tag for the specified set of axes to look over, which can be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.
a1_to_a2 – whether to build out the edges going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to build out the edges going FROM axis_id_2 TO axis_id_1.
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.
control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.
control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.
use_numba_curves – whether to use a numba-accelerated sampler to construct curves. If None, resolves to the class-level default set in __init__. When enabled and numba is available, a parallel implementation is used. A small-case heuristic may bypass numba when the total sampled points are below the automatic selection policy between serial and parallel numba.
- Returns:
None.
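The NaN-separated storage described above can be illustrated without numpy. The sketch below stacks two short curves into one sequence with a [nan, nan] row between them, then splits them back apart. The helper names stack_curves and split_curves are hypothetical, and plain lists stand in for the numpy.ndarray that the library stores.

```python
import math

NAN_ROW = (math.nan, math.nan)


def stack_curves(curves):
    """Concatenate curves (lists of (x, y) points) with a NaN row between each pair."""
    stacked = []
    for i, curve in enumerate(curves):
        if i > 0:
            stacked.append(NAN_ROW)  # separator so a line plot breaks here
        stacked.extend(curve)
    return stacked


def split_curves(stacked):
    """Recover the individual curves by splitting on NaN rows."""
    curves, current = [], []
    for x, y in stacked:
        if math.isnan(x) and math.isnan(y):
            curves.append(current)
            current = []
        else:
            current.append((x, y))
    if current:
        curves.append(current)
    return curves


curve_a = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)]
curve_b = [(2.0, 0.0), (2.5, 1.0), (3.0, 0.0)]
stacked = stack_curves([curve_a, curve_b])
```

Because line-plotting routines such as plt.plot() skip NaN points, a single call on the stacked data draws the curves as disconnected segments.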
- add_edge_ids(edges: Edges | ndarray, axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True) -> Hashable
Find and store the edge IDs relevant to the specified pair of axes.
Find the subset of network connections that involve nodes on axis_id_1 and axis_id_2, looking over the specified edges compared to the IDs of the Node instances currently placed on each Axis. Edges discovered between the specified two axes (depending on the values specified by a1_to_a2 and a2_to_a1, more below) will have the relevant edge IDs stored, with other edges disregarded.
Generates (j, 2) and (k, 2) numpy arrays of axis_id_1 to axis_id_2 connections and axis_id_2 to axis_id_1 connections (or only 1 of those arrays depending on parameter choices for a1_to_a2 and a2_to_a1).
The resulting arrays of relevant edge IDs (i.e. each row is a [<FROM ID>, <TO ID>] edge) will be stored automatically in the hive_plot_edges attribute, a dictionary of dictionaries of dictionaries of edge information, which can later be converted into discretized edges to be plotted in Cartesian space. They are stored as hive_plot_edges[<source_axis_id>][<sink_axis_id>][<tag>]["ids"].
Note
If no tag is provided (i.e. default None), one will be automatically generated and returned by this method call.
- Parameters:
edges – Edges instance or (n, 2) array of Hashable values representing unique IDs of specific Node instances. The first column is the IDs for the “from” nodes and the second column is the IDs for the “to” nodes for each connection.
axis_id_1 – pointer to first of two Axis instances in the axes attribute between which we want to find connections.
axis_id_2 – pointer to second of two Axis instances in the axes attribute between which we want to find connections.
tag – tag corresponding to subset of specified edges. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the available tags under hive_plot_edges[axis_id_1][axis_id_2] and / or hive_plot_edges[axis_id_2][axis_id_1].
a1_to_a2 – whether to find the connections going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to find the connections going FROM axis_id_2 TO axis_id_1.
- Returns:
the resulting unique tag. Note, if both a1_to_a2 and a2_to_a1 are True, the resulting unique tag returned will be the same for both directions of edges.
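The tag-generation rule and the nested storage layout can be sketched with plain dictionaries. This is an illustration of the documented behavior (lowest unused integer starting at 0, shared across both directions), not the library's implementation. The function names are hypothetical, and lists of tuples stand in for the (j, 2) and (k, 2) arrays.

```python
def lowest_unused_int_tag(existing_tags) -> int:
    """Return the smallest non-negative integer not already used as a tag."""
    tag = 0
    while tag in existing_tags:
        tag += 1
    return tag


def add_ids(hive_plot_edges, axis_id_1, axis_id_2, ids_1_to_2, ids_2_to_1, tag=None):
    """Store edge IDs for both directions under one shared tag and return that tag."""
    # Tags already used in either direction between the two axes.
    used = set(hive_plot_edges.get(axis_id_1, {}).get(axis_id_2, {}))
    used |= set(hive_plot_edges.get(axis_id_2, {}).get(axis_id_1, {}))
    if tag is None:
        tag = lowest_unused_int_tag(used)
    hive_plot_edges.setdefault(axis_id_1, {}).setdefault(axis_id_2, {})[tag] = {"ids": ids_1_to_2}
    hive_plot_edges.setdefault(axis_id_2, {}).setdefault(axis_id_1, {})[tag] = {"ids": ids_2_to_1}
    return tag


hive_plot_edges = {}
first_tag = add_ids(hive_plot_edges, "a0", "a1", [("n1", "n2")], [("n3", "n4")])
second_tag = add_ids(hive_plot_edges, "a0", "a1", [("n5", "n6")], [])
```

Each call without an explicit tag gets the next free integer, so repeated additions between the same pair of axes never collide.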
- add_edge_kwargs(axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, reset_existing_kwargs: bool = False, overwrite_existing_kwargs: bool = True, warn_on_no_edges: bool = True, **edge_kwargs) -> None
Add edge kwargs to the constructed hive_plot_edges attribute between two axes of a Hive Plot.
For a given set of edges for which edge kwargs were already set, any redundant edge kwargs specified by this method call will overwrite the previously set kwargs.
Edge IDs are expected to have been found between the two axes before calling this method, which can be done either by calling the connect_axes() method or the lower-level add_edge_ids() method for the two axes of interest. A warning will be raised if no edges exist between the two axes and warn_on_no_edges=True.
Resulting kwargs will be stored as a dict. This output will be stored in hive_plot_edges[axis_id_1][axis_id_2][tag]["edge_kwargs"].
Note
There is special handling in here for when the two provided axes have names "<axis_name>" and "<axis_name>_repeat". This is for use with hiveplotlib.hive_plot_n_axes(), which when creating repeat axes always names the repeated one "<axis_name>_repeat". By definition, the edges between an axis and its repeat are the same, and therefore edges between these two axes should only be plotted in one direction. If one is running this method on a HivePlot instance from hiveplotlib.hive_plot_n_axes() though, a warning of a lack of edges in both directions for repeat edges is not productive, so we formally catch this case.
- Parameters:
axis_id_1 – Hashable pointer to the first Axis instance in the axes attribute to which we want to add plotting kwargs.
axis_id_2 – Hashable pointer to the second Axis instance in the axes attribute to which we want to add plotting kwargs.
tag – which subset of curves to modify kwargs for. Note, if no tag is specified (i.e. tag=None), it is presumed there is only one tag for the specified set of axes to look over and that will be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.
a1_to_a2 – whether to add kwargs for connections going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to add kwargs for connections going FROM axis_id_2 TO axis_id_1.
reset_existing_kwargs – whether to remove all existing edge kwargs before adding provided edge_kwargs for the edges specified by other parameters. Default False leaves existing edge kwargs unchanged.
overwrite_existing_kwargs – whether to overwrite existing edge kwargs if provided again. Default True overwrites already-provided edge kwargs with the new value(s) in edge_kwargs.
warn_on_no_edges – whether to warn if adding kwargs for edges that don’t exist. Default True.
edge_kwargs – additional matplotlib keyword arguments that will be applied to the specified edges.
- Returns:
None.
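One reading of how reset_existing_kwargs and overwrite_existing_kwargs interact is sketched below with plain dictionaries. The function merge_edge_kwargs is hypothetical and only mirrors the parameter descriptions above; the library's actual merge logic may differ in detail.

```python
def merge_edge_kwargs(existing, new, reset_existing_kwargs=False, overwrite_existing_kwargs=True):
    """Return the kwargs dict that would result from adding `new` on top of `existing`."""
    # Resetting discards everything that was set before this call.
    result = {} if reset_existing_kwargs else dict(existing)
    for key, value in new.items():
        # Without overwriting, a previously set key keeps its old value.
        if overwrite_existing_kwargs or key not in result:
            result[key] = value
    return result


existing = {"color": "C0", "alpha": 0.5}
new = {"color": "C1", "linewidth": 2}

default_merge = merge_edge_kwargs(existing, new)
keep_old = merge_edge_kwargs(existing, new, overwrite_existing_kwargs=False)
fresh = merge_edge_kwargs(existing, new, reset_existing_kwargs=True)
```

Under the defaults, new values win and untouched keys such as alpha survive; with reset_existing_kwargs=True only the newly provided kwargs remain.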
- add_edges(edges: Edges | ndarray) -> None
Add edges to edges attribute.
- Parameters:
edges – Edges instance or 2d array of [from, to] edges, where values correspond to unique node IDs.
- Returns:
None.
- add_nodes(nodes: NodeCollection | List[Node], check_uniqueness: bool = True) -> None
Add NodeCollection or Node instances to nodes attribute.
- Parameters:
nodes – NodeCollection instance or list of Node instances, will be added to nodes attribute.
check_uniqueness – whether to formally check for uniqueness. WARNING: the only reason to turn this off is if the dataset becomes big enough that this operation becomes expensive, and you have already established uniqueness another way (for example, you are pulling data from a database and the key in your table is the unique ID). If you add non-unique IDs with check_uniqueness=False, we make no promises about output.
- Returns:
None.
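If check_uniqueness is turned off for performance, uniqueness has to be established some other way. The sketch below is one inexpensive pre-check on a list of IDs using only the standard library; find_duplicate_ids is a hypothetical helper, not part of hiveplotlib.

```python
from collections import Counter


def find_duplicate_ids(unique_ids):
    """Return the IDs that appear more than once, in first-seen order."""
    counts = Counter(unique_ids)
    return [uid for uid, n in counts.items() if n > 1]


candidate_ids = ["node_a", "node_b", "node_a", "node_c"]
duplicates = find_duplicate_ids(candidate_ids)
if duplicates:
    print(f"Refusing to add nodes; duplicate IDs found: {duplicates}")
```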
- connect_axes(edges: Edges | ndarray, axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0, reset_existing_kwargs: bool = False, overwrite_existing_kwargs: bool = True, warn_on_no_edges: bool = True, **edge_kwargs) -> Hashable
Construct all the curves and set all the curve kwargs between axis_id_1 and axis_id_2.
Based on the specified edges parameter, build out the resulting Bézier curves, and set any kwargs for those edges for later visualization.
The curves will be tracked by a unique tag, and the resulting constructions will be stored in hive_plot_edges[axis_id_1][axis_id_2][tag] if a1_to_a2 is True and hive_plot_edges[axis_id_2][axis_id_1][tag] if a2_to_a1 is True.
Note
If trying to draw different subsets of edges with different kwargs, one can run this method multiple times with different subsets of the entire edges array, providing unique tag values with each subset of edges, and specifying different edge_kwargs each time. The resulting Hive Plot would be plotted showing each set of edges styled with each set of unique kwargs.
Note
You can choose to construct edges in only one direction by specifying a1_to_a2 or a2_to_a1 as False (both are True by default).
- Parameters:
edges – hiveplotlib.Edges instance or (n, 2) array of Hashable values representing pointers to specific Node instances. If providing an array input, the first column is the “from” and the second column is the “to” for each connection.
axis_id_1 – Hashable pointer to the first Axis instance in the axes attribute we want to find connections between.
axis_id_2 – Hashable pointer to the second Axis instance in the axes attribute we want to find connections between.
tag – tag corresponding to specified edges. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the available tags under hive_plot_edges[from_axis_id][to_axis_id] and / or hive_plot_edges[to_axis_id][from_axis_id].
a1_to_a2 – whether to find and build the connections going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to find and build the connections going FROM axis_id_2 TO axis_id_1.
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.
control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.
control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.
edge_kwargs – additional matplotlib params that will be applied to the related edges.
reset_existing_kwargs – whether to remove all existing edge kwargs before adding provided edge_kwargs for the edges specified by other parameters. Default False leaves existing edge kwargs unchanged.
overwrite_existing_kwargs – whether to overwrite existing edge kwargs if provided again. Default True overwrites already-provided edge kwargs with the new value(s) in edge_kwargs.
warn_on_no_edges – whether to warn if adding kwargs for edges that don’t exist. Default True.
- Returns:
Hashable tag that identifies the generated curves and kwargs.
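The short_arc choice comes down to simple angle arithmetic: two axes separated by x degrees one way are separated by 360 - x the other way. The sketch below computes the angular span traversed in each case. The function arc_span is hypothetical and ignores how the library breaks the tie for axes exactly 180 degrees apart.

```python
def arc_span(angle_1: float, angle_2: float, short_arc: bool = True) -> float:
    """Angular distance in degrees traversed between two axis angles."""
    x = abs(angle_1 - angle_2) % 360
    short = min(x, 360 - x)
    return short if short_arc else 360 - short


# Axes at 0 and 120 degrees: the short arc spans 120 degrees, the long arc 240.
short_span = arc_span(0, 120)
long_span = arc_span(0, 120, short_arc=False)

# Axes 180 degrees apart: both choices span the same angle, only the hemisphere differs.
opposite_span = arc_span(0, 180)
```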
- construct_curves(num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0, use_numba_curves: bool | None = None) -> None
Construct Bézier curves for any connections for which we’ve specified the edges to draw
(i.e. hive_plot_edges[axis_0][axis_1][<tag>]["ids"] is non-empty but hive_plot_edges[axis_0][axis_1][<tag>]["curves"] does not yet exist).
Note
Checks all <tag> values between axes.
- Parameters:
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.
control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.
control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.
use_numba_curves – whether to use a numba-accelerated sampler to construct curves. If None, resolves to the class-level default set in __init__. When enabled and numba is available, a parallel implementation is used. A small-case heuristic may bypass numba when the total sampled points are below the automatic selection policy between serial and parallel numba.
- Returns:
None.
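The roles of control_rho_scale and control_angle_shift can be seen in a small standalone sketch. It assumes a quadratic Bézier curve with a single control point, which the text above does not state explicitly, and it uses a plain arithmetic mean of the two node angles, which is only valid for the short arc without wraparound. All function names are hypothetical.

```python
import math


def polar_to_cartesian(rho, phi_deg):
    """Convert polar (rho, angle in degrees) to Cartesian (x, y)."""
    phi = math.radians(phi_deg)
    return rho * math.cos(phi), rho * math.sin(phi)


def control_point(rho_1, phi_1, rho_2, phi_2, control_rho_scale=1.0, control_angle_shift=0.0):
    """Control point at the (scaled) mean rho and (shifted) mean angle of the two nodes."""
    rho = control_rho_scale * (rho_1 + rho_2) / 2
    phi = (phi_1 + phi_2) / 2 + control_angle_shift
    return polar_to_cartesian(rho, phi)


def quadratic_bezier(p0, p1, p2, num_steps=100):
    """Sample num_steps points along the quadratic Bezier curve from p0 to p2 via p1."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    points = []
    for i in range(num_steps):
        t = i / (num_steps - 1)
        x = (1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * p1[0] + t**2 * p2[0]
        y = (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * p1[1] + t**2 * p2[1]
        points.append((x, y))
    return points


# One edge from a node at rho=2 on the 0-degree axis to a node at rho=4 on the 120-degree axis.
start = polar_to_cartesian(2, 0)
end = polar_to_cartesian(4, 120)
ctrl = control_point(2, 0, 4, 120)
curve = quadratic_bezier(start, ctrl, end, num_steps=5)
```

Passing control_rho_scale=2 to control_point doubles the control point's distance from the origin, which bows the sampled curve outward.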
- copy()
Return a copy of the instance.
- Returns:
copy of the instance.
- place_nodes_on_axis(axis_id: Hashable, node_df: DataFrame | None = None, sorting_feature_to_use: Hashable | None = None, vmin: float | None = None, vmax: float | None = None, unique_ids: None = None) -> None
Set node positions on specific Axis.
Cartesian coordinates will be normalized to specified vmin and vmax. Those vmin and vmax values will then be normalized to span the length of the axis when plotted.
Note
unique_ids was removed as a parameter in version 0.26.0. Node data must now be provided as a pandas.DataFrame via the node_df parameter.
- Parameters:
axis_id – which axis (as specified by the keys from the axes attribute) for which to plot nodes.
node_df – dataframe of node information to assign to this axis. If previously set with BaseHivePlot._allocate_nodes_to_axis(), this will overwrite those node assignments. If None, method will check and confirm there are existing node ID assignments.
sorting_feature_to_use – which feature in the node data to use to align nodes on an axis. Default None uses the feature previously assigned via BaseHivePlot.axes[axis_id].set_sorting_variable().
vmin – all values less than vmin will be set to vmin. Default None sets as global minimum of feature values for all Node instances on specified Axis.
vmax – all values greater than vmax will be set to vmax. Default None sets as global maximum of feature values for all Node instances on specified Axis.
unique_ids – REMOVED IN VERSION 0.26.0. See note above.
- Raises:
TypeError – if the no-longer-supported unique_ids parameter is used.
- Returns:
None.
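The clipping and normalization described above can be sketched as a two-step mapping: clip the feature value to [vmin, vmax], then scale it onto the axis span. The sketch assumes a linear mapping with vmin landing at the axis start and vmax at the axis end; the function place_value_on_axis and the handling of a zero-width range are hypothetical.

```python
def place_value_on_axis(value, vmin, vmax, start, end):
    """Clip value to [vmin, vmax], then map it linearly onto [start, end] (polar rho)."""
    if vmax == vmin:
        # Degenerate range: for this sketch only, put everything at the midpoint.
        return (start + end) / 2
    clipped = min(max(value, vmin), vmax)
    fraction = (clipped - vmin) / (vmax - vmin)
    return start + fraction * (end - start)


values = [3.0, 7.5, 12.0, -4.0]
# With vmin / vmax left as None, the docs say they are inferred as the
# minimum / maximum of the feature values on that axis.
vmin, vmax = min(values), max(values)
rhos = [place_value_on_axis(v, vmin, vmax, start=1, end=5) for v in values]
```

Setting explicit bounds narrower than the data range pins outliers to the ends of the axis instead of compressing every other node toward the middle.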
- reset_edges(axis_id_1: Hashable | None = None, axis_id_2: Hashable | None = None, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True) -> None
Reset hive_plot_edges attribute and corresponding edges.relevant_edges (if edges exists).
Setting all the parameters to None deletes any stored connections between axes previously computed. If any subset of the parameters is not None, the following edges will be deleted:
If axis_id_1, axis_id_2, and tag are all specified as not None, the implied single subset of edges will be deleted. (Note, tags are required to be unique within a specified (axis_id_1, axis_id_2) pair.) In this case, the default is to delete all the edges bidirectionally (i.e. going axis_id_1 -> axis_id_2 and axis_id_2 -> axis_id_1) with the specified tag. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.
If only axis_id_1 and axis_id_2 are provided as not None, then the default is to delete all edge subsets bidirectionally between axis_id_1 and axis_id_2 (i.e. going axis_id_1 -> axis_id_2 and axis_id_2 -> axis_id_1), regardless of tag. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.
If only axis_id_1 is provided as not None, then all edges going TO and FROM axis_id_1 will be deleted. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.
- Parameters:
axis_id_1 – specifies edges all coming FROM the axis identified by this unique ID.
axis_id_2 – specifies edges all going TO the axis identified by this unique ID.
tag – tag corresponding to explicit subset of added edges.
a1_to_a2 – whether to remove the connections going FROM axis_id_1 TO axis_id_2. Note, if axis_id_1 is specified but axis_id_2 is None, then this dictates whether to remove all edges going from axis_id_1.
a2_to_a1 – whether to remove the connections going FROM axis_id_2 TO axis_id_1. Note, if axis_id_1 is specified but axis_id_2 is None, then this dictates whether to remove all edges going to axis_id_1.
- Returns:
None.
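The deletion cases above can be traced on a plain nested dictionary. The sketch below mimics the documented behavior on a stand-in for hive_plot_edges and does not touch edges.relevant_edges. The function reset_edges_sketch is hypothetical, and whether the library removes emptied keys or leaves empty dictionaries behind is not specified here.

```python
def reset_edges_sketch(hive_plot_edges, axis_id_1=None, axis_id_2=None, tag=None,
                       a1_to_a2=True, a2_to_a1=True):
    """Delete stored edge information from a nested dict, following the documented cases."""
    if axis_id_1 is None:
        hive_plot_edges.clear()  # no axis given: drop everything
        return
    if axis_id_2 is None:
        if a1_to_a2:
            hive_plot_edges.pop(axis_id_1, None)  # all edges going FROM axis_id_1
        if a2_to_a1:
            for sinks in hive_plot_edges.values():
                sinks.pop(axis_id_1, None)  # all edges going TO axis_id_1
        return
    directions = []
    if a1_to_a2:
        directions.append((axis_id_1, axis_id_2))
    if a2_to_a1:
        directions.append((axis_id_2, axis_id_1))
    for source, sink in directions:
        tags = hive_plot_edges.get(source, {}).get(sink)
        if tags is None:
            continue
        if tag is None:
            tags.clear()  # every tag between the pair
        else:
            tags.pop(tag, None)  # only the named tag


edges = {
    "a0": {"a1": {0: {}, 1: {}}, "a2": {0: {}}},
    "a1": {"a0": {0: {}, 1: {}}},
    "a2": {"a0": {0: {}}},
}
reset_edges_sketch(edges, "a0", "a1", tag=0)  # removes tag 0 in both directions
```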
- to_json() -> str
Return the information from the axes, nodes, and edges in Cartesian space as a serialized JSON string.
This allows users to visualize hive plots with arbitrary libraries, even outside of Python.
The dictionary structure of the resulting JSON will consist of two top-level keys:
“axes” - contains the information for plotting each axis (including angle and long_name), plus the nodes on each axis in Cartesian space.
“edges” - contains the information for plotting the discretized edges in Cartesian space, plus the corresponding to and from IDs that go with each edge, as well as any kwargs that were set for plotting each set of edges.
- Returns:
JSON output of axis, node, and edge information.
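A consumer of this output might start by parsing the string and checking for the two documented top-level keys. In the sketch below, example_json is a hypothetical stand-in built by hand; only the "axes" and "edges" keys come from the description above, and the nested contents are placeholders rather than the real schema.

```python
import json

# Hypothetical stand-in for the string returned by to_json(); the inner structure is a placeholder.
example_json = json.dumps({"axes": {"a0": {}}, "edges": {"a0": {"a1": {}}}})


def split_hive_plot_json(json_string):
    """Parse the JSON string and return its 'axes' and 'edges' sections."""
    payload = json.loads(json_string)
    missing = {"axes", "edges"} - set(payload)
    if missing:
        raise KeyError(f"missing top-level keys: {sorted(missing)}")
    return payload["axes"], payload["edges"]


axes_info, edges_info = split_hive_plot_json(example_json)
print(f"{len(axes_info)} axis entries, {len(edges_info)} source-axis entries in edges")
```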