Hive Plots
Node, Axis, and HivePlot Classes
- class hiveplotlib.Node(unique_id: Hashable, data: Dict | None = None)
Node instances hold the data for an individual network node.
Each instance is initialized with a unique_id for identification. These IDs must be Hashable. One can also initialize with a dictionary of data, but data can also be added later with the add_data() method.
- Example:
my_node = Node(unique_id="my_unique_node_id", data=my_dataset)
my_second_node = Node(unique_id="my_second_unique_node_id")
my_second_node.add_data(data=my_second_dataset)
- add_data(data: Dict, overwrite_old_data: bool = False) None
Add dictionary of data to Node.data.
- Parameters:
data – dict of data to associate with Node instance.
overwrite_old_data – whether to delete existing data dict and overwrite with data. Default False.
- Returns:
None.
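The effect of overwrite_old_data can be sketched with a stand-in class. This is not hiveplotlib's implementation: SketchNode is a hypothetical name, and the assumption that overwrite_old_data=False merges the new keys into the existing dict (with new values winning on shared keys) is inferred from the parameter description above.

```python
from typing import Dict, Hashable, Optional


class SketchNode:
    """Hypothetical stand-in for hiveplotlib.Node (illustration only)."""

    def __init__(self, unique_id: Hashable, data: Optional[Dict] = None):
        self.unique_id = unique_id
        self.data = {} if data is None else dict(data)

    def add_data(self, data: Dict, overwrite_old_data: bool = False) -> None:
        if overwrite_old_data:
            # Drop the existing dict entirely and keep only the new data.
            self.data = dict(data)
        else:
            # Assumed behavior: merge, letting new values win on shared keys.
            self.data.update(data)


node = SketchNode(unique_id="n0", data={"degree": 3})
node.add_data({"age": 41})
merged = dict(node.data)  # old and new keys are both present

node.add_data({"score": 0.5}, overwrite_old_data=True)
replaced = dict(node.data)  # only the newest dict remains
```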
- class hiveplotlib.Axis(axis_id: Hashable, start: float = 1, end: float = 5, angle: float = 0, long_name: Hashable | None = None)
Axis instance.
Axis instances are initialized based on their intended final position when plotted. Each Axis is also initialized with a unique, hashable axis_id for clarity when building hive plots with multiple axes.
The eventual size and positioning of the Axis instance is dictated in the context of polar coordinates by three parameters:
start dictates the distance from the origin to the beginning of the axis when eventually plotted.
end dictates the distance from the origin to the end of the axis when eventually plotted.
angle sets the angle the Axis is rotated counterclockwise. For example, angle=0 points East, angle=90 points North, and angle=180 points West.
Node instances placed on each Axis instance will be scaled to fit onto the span of the Axis, but this is discussed further in the HivePlot class, which handles this placement.
Since axis_id values may be shorthand for easy referencing when typing code, if one desires a formal name to plot against each axis when visualizing, one can provide a separate long_name that will show up as the axis label when running hiveplotlib.viz code. (For example, one may choose axis_id="a1" and long_name="Axis 1".)
Note
long_name defaults to axis_id if not specified.
- Example:
# 3 axes, spaced out 120 degrees apart, all size 4, starting 1 unit off of origin
axis0 = Axis(axis_id="a0", start=1, end=5, angle=0, long_name="Axis 0")
axis1 = Axis(axis_id="a1", start=1, end=5, angle=120, long_name="Axis 1")
axis2 = Axis(axis_id="a2", start=1, end=5, angle=240, long_name="Axis 2")
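To make the polar description concrete, the following minimal sketch converts an axis's start, end, and angle into the Cartesian coordinates of its two endpoints. The helper name axis_endpoints is hypothetical and not part of hiveplotlib; it only applies the standard polar-to-Cartesian conversion with the angle given in degrees.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def axis_endpoints(start: float, end: float, angle: float) -> Tuple[Point, Point]:
    """Return the Cartesian (x, y) of the near and far ends of an axis.

    angle is in degrees, measured counterclockwise from East.
    """
    theta = math.radians(angle)
    near = (start * math.cos(theta), start * math.sin(theta))
    far = (end * math.cos(theta), end * math.sin(theta))
    return near, far


# An axis pointing North that begins 1 unit from the origin and ends 5 units out.
near, far = axis_endpoints(start=1, end=5, angle=90)
```

With angle=90 both endpoints sit on the positive y-axis (up to floating-point error), which matches the "points North" convention described above.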
- class hiveplotlib.HivePlot
Hive Plots built from a combination of Axis and Node instances.
This class is essentially a set of methods for creating and maintaining the nested dictionary attribute edges, which holds constructed Bézier curves, edge ids, and matplotlib keyword arguments for various sets of edges to be plotted. The nested dictionary structure can be abstracted to the below example.
HivePlot.edges["starting axis"]["ending axis"]["tag"]
The resulting dictionary value holds the edge information relating to an addition of edges that are tagged as “tag,” specifically the edges going FROM the axis named “starting axis” TO the axis named “ending axis.” This value is in fact another dictionary, meant to hold the discretized Bézier curves (curves), the matplotlib keyword arguments for plotting (edge_kwargs), and the abstracted edge ids (an (m, 2) np.ndarray) between which we are drawing Bézier curves (ids).
- add_axes(axes: Axis | List[Axis]) None
Add list of Axis instances to HivePlot.axes.
Note
All resulting Axis IDs must be unique.
- Parameters:
axes – Axis object(s) to add to HivePlot instance.
- Returns:
None.
- add_edge_curves_between_axes(axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0) None
Construct discretized edge curves between two axes of a HivePlot instance.
Note
One must run HivePlot.add_edge_ids() first for the two axes of interest.
Resulting discretized Bézier curves will be stored as an (n, 2) numpy.ndarray of multiple sampled curves where the first column is x position and the second column is y position in Cartesian coordinates.
Note
Although each curve is represented by a (num_steps, 2) array, all the curves are stored in a single collective numpy.ndarray separated by rows of [np.nan, np.nan] between each discretized curve. This allows matplotlib to accept a single array when plotting lines via plt.plot(), which speeds up plotting later.
This output will be stored in HivePlot.edges[axis_id_1][axis_id_2][tag]["curves"].
- Parameters:
axis_id_1 – pointer to first of two Axis instances in HivePlot.axes between which we want to find connections.
axis_id_2 – pointer to second of two Axis instances in HivePlot.axes between which we want to find connections.
tag – unique ID specifying which subset of edges specified by their IDs to construct (e.g. HivePlot.edges[axis_id_1][axis_id_2][tag]["ids"]). Note, if no tag is specified (e.g. tag=None), it is presumed there is only one tag for the specified set of axes to look over, which can be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.
a1_to_a2 – whether to build out the edges going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to build out the edges going FROM axis_id_2 TO axis_id_1.
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.
control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.
control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.
- Returns:
None.
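The NaN-separated storage described in the note above can be illustrated without numpy. This is a sketch, not the library's code: stack_curves and split_curves are hypothetical helper names, plain lists of (x, y) tuples stand in for the (n, 2) array, and it assumes a separator row is placed only between consecutive curves.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def stack_curves(curves: List[List[Point]]) -> List[Point]:
    """Join several discretized curves into one sequence, separated by NaN rows."""
    stacked: List[Point] = []
    for i, curve in enumerate(curves):
        if i > 0:
            # A (nan, nan) row tells a line plotter to lift the pen here.
            stacked.append((math.nan, math.nan))
        stacked.extend(curve)
    return stacked


def split_curves(stacked: List[Point]) -> List[List[Point]]:
    """Recover the individual curves by cutting at the NaN rows."""
    curves: List[List[Point]] = []
    current: List[Point] = []
    for x, y in stacked:
        if math.isnan(x) and math.isnan(y):
            curves.append(current)
            current = []
        else:
            current.append((x, y))
    curves.append(current)
    return curves


# Two curves sampled at 3 points each (num_steps = 3).
curve_a = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)]
curve_b = [(2.0, 0.0), (2.5, 1.0), (3.0, 0.0)]
stacked = stack_curves([curve_a, curve_b])
recovered = split_curves(stacked)
```

Because the separator rows break the line, a single plotting call over the stacked sequence draws the curves as disconnected segments rather than one continuous path.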
- add_edge_ids(edges: ndarray, axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True) Hashable
Find and store the edge IDs relevant to the specified pair of axes.
Find the subset of network connections that involve nodes on axis_id_1 and axis_id_2, looking over the specified edges compared to the IDs of the Node instances currently placed on each Axis. Edges discovered between the specified two axes (depending on the values specified by a1_to_a2 and a2_to_a1, more below) will have the relevant edge IDs stored, with other edges disregarded.
Generates (j, 2) and (k, 2) numpy arrays of axis_id_1 to axis_id_2 connections and axis_id_2 to axis_id_1 connections (or only 1 of those arrays depending on parameter choices for a1_to_a2 and a2_to_a1).
The resulting arrays of relevant edge IDs (e.g. each row is a [<FROM ID>, <TO ID>] edge) will be stored automatically in HivePlot.edges, a dictionary of dictionaries of dictionaries of edge information, which can later be converted into discretized edges to be plotted in Cartesian space. They are stored as HivePlot.edges[<source_axis_id>][<sink_axis_id>][<tag>]["ids"].
Note
If no tag is provided (e.g. default None), one will be automatically generated and returned by this method call.
- Parameters:
edges – (n, 2) array of Hashable values representing unique IDs of specific Node instances. The first column is the IDs for the “from” nodes and the second column is the IDs for the “to” nodes for each connection.
axis_id_1 – pointer to first of two Axis instances in HivePlot.axes between which we want to find connections.
axis_id_2 – pointer to second of two Axis instances in HivePlot.axes between which we want to find connections.
tag – tag corresponding to subset of specified edges. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the available tags under HivePlot.edges[axis_id_1][axis_id_2] and / or HivePlot.edges[axis_id_2][axis_id_1].
a1_to_a2 – whether to find the connections going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to find the connections going FROM axis_id_2 TO axis_id_1.
- Returns:
the resulting unique tag. Note, if both a1_to_a2 and a2_to_a1 are True the resulting unique tag returned will be the same for both directions of edges.
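The storage layout and automatic tag selection described for add_edge_ids() can be sketched with plain dictionaries. The names lowest_unused_tag and store_edge_ids are hypothetical, the node memberships of each axis are passed in as sets for simplicity, and lists of tuples stand in for the numpy arrays; the library's internals may differ.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Edge = Tuple[Hashable, Hashable]
EdgeStore = Dict[Hashable, Dict[Hashable, Dict[Hashable, dict]]]


def lowest_unused_tag(edges: EdgeStore, axis_id_1: Hashable, axis_id_2: Hashable) -> int:
    """Lowest integer, starting at 0, not yet used as a tag in either direction."""
    used = set(edges.get(axis_id_1, {}).get(axis_id_2, {}))
    used |= set(edges.get(axis_id_2, {}).get(axis_id_1, {}))
    tag = 0
    while tag in used:
        tag += 1
    return tag


def store_edge_ids(
    edges: EdgeStore,
    pairs: List[Edge],
    nodes_on_axis_1: Set[Hashable],
    nodes_on_axis_2: Set[Hashable],
    axis_id_1: Hashable,
    axis_id_2: Hashable,
    tag: Optional[Hashable] = None,
) -> Hashable:
    """Keep only the pairs that run between the two axes, filed by direction."""
    if tag is None:
        tag = lowest_unused_tag(edges, axis_id_1, axis_id_2)
    forward = [(a, b) for a, b in pairs if a in nodes_on_axis_1 and b in nodes_on_axis_2]
    backward = [(a, b) for a, b in pairs if a in nodes_on_axis_2 and b in nodes_on_axis_1]
    # edges[<source_axis_id>][<sink_axis_id>][<tag>]["ids"]
    edges.setdefault(axis_id_1, {}).setdefault(axis_id_2, {}).setdefault(tag, {})["ids"] = forward
    edges.setdefault(axis_id_2, {}).setdefault(axis_id_1, {}).setdefault(tag, {})["ids"] = backward
    return tag


edges: EdgeStore = {}
pairs = [("a", "x"), ("x", "b"), ("a", "b")]  # ("a", "b") stays on one axis, so it is ignored
first_tag = store_edge_ids(edges, pairs, {"a", "b"}, {"x", "y"}, "A", "X")
second_tag = store_edge_ids(edges, pairs, {"a", "b"}, {"x", "y"}, "A", "X")
```

Calling the helper twice without a tag yields tags 0 and then 1, mirroring the "lowest unused integer" rule, and the same tag is shared by both directions.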
- add_edge_kwargs(axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, **edge_kwargs) None
Add edge kwargs to the constructed HivePlot.edges between two axes of a HivePlot.
For a given set of edges for which edge kwargs were already set, any redundant edge kwargs specified by this method call will overwrite the previously set kwargs.
Expected to have found edge IDs between the two axes before calling this method, which can be done either by calling the HivePlot.connect_axes() method or the lower-level HivePlot.add_edge_ids() method for the two axes of interest.
Resulting kwargs will be stored as a dict. This output will be stored in HivePlot.edges[axis_id_1][axis_id_2][tag]["edge_kwargs"].
Note
There is special handling in here for when the two provided axes have names "<axis_name>" and "<axis_name>_repeat". This is for use with hiveplotlib.hive_plot_n_axes(), which when creating repeat axes always names the repeated one "<axis_name>_repeat". By definition, the edges between an axis and its repeat are the same, and therefore edges between these two axes should only be plotted in one direction. If one is running this method on a HivePlot instance from hiveplotlib.hive_plot_n_axes() though, a warning of a lack of edges in both directions for repeat edges is not productive, so we formally catch this case.
- Parameters:
axis_id_1 – Hashable pointer to the first Axis instance in HivePlot.axes we want to add plotting kwargs to.
axis_id_2 – Hashable pointer to the second Axis instance in HivePlot.axes we want to add plotting kwargs to.
tag – which subset of curves to modify kwargs for. Note, if no tag is specified (e.g. tag=None), it is presumed there is only one tag for the specified set of axes to look over and that will be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.
a1_to_a2 – whether to add kwargs for connections going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to add kwargs for connections going FROM axis_id_2 TO axis_id_1.
edge_kwargs – additional matplotlib keyword arguments that will be applied to the specified edges.
- Returns:
None.
- add_nodes(nodes: List[Node], check_uniqueness: bool = True) None
Add Node instances to HivePlot.nodes.
- Parameters:
nodes – collection of Node instances, will be added to HivePlot.nodes dict with unique IDs as keys.
check_uniqueness – whether to formally check for uniqueness. WARNING: the only reason to turn this off is if the dataset becomes big enough that this operation becomes expensive, and you have already established uniqueness another way (for example, you are pulling data from a database and the key in your table is the unique ID). If you add non-unique IDs with check_uniqueness=False, we make no promises about output.
- Returns:
None.
- connect_axes(edges: ndarray, axis_id_1: Hashable, axis_id_2: Hashable, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True, num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0, **edge_kwargs) Hashable
Construct all the curves and set all the curve kwargs between axis_id_1 and axis_id_2.
Based on the specified edges parameter, build out the resulting Bézier curves, and set any kwargs for those edges for later visualization.
The curves will be tracked by a unique tag, and the resulting constructions will be stored in HivePlot.edges[axis_id_1][axis_id_2][tag] if a1_to_a2 is True and HivePlot.edges[axis_id_2][axis_id_1][tag] if a2_to_a1 is True.
Note
If trying to draw different subsets of edges with different kwargs, one can run this method multiple times with different subsets of the entire edges array, providing unique tag values with each subset of edges, and specifying different edge_kwargs each time. The resulting HivePlot instance would be plotted showing each set of edges styled with each set of unique kwargs.
Note
You can choose to construct edges in only one of either directions by specifying a1_to_a2 or a2_to_a1 as False (both are True by default).
- Parameters:
edges – (n, 2) array of Hashable values representing pointers to specific Node instances. The first column is the “from” and the second column is the “to” for each connection.
axis_id_1 – Hashable pointer to the first Axis instance in HivePlot.axes we want to find connections between.
axis_id_2 – Hashable pointer to the second Axis instance in HivePlot.axes we want to find connections between.
tag – tag corresponding to specified edges. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the available tags under HivePlot.edges[from_axis_id][to_axis_id] and / or HivePlot.edges[to_axis_id][from_axis_id].
a1_to_a2 – whether to find and build the connections going FROM axis_id_1 TO axis_id_2.
a2_to_a1 – whether to find and build the connections going FROM axis_id_2 TO axis_id_1.
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.
control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.
control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.
edge_kwargs – additional matplotlib params that will be applied to the related edges.
- Returns:
Hashable tag that identifies the generated curves and kwargs.
- construct_curves(num_steps: int = 100, short_arc: bool = True, control_rho_scale: float = 1, control_angle_shift: float = 0) None
Construct Bézier curves for any connections for which we’ve specified the edges to draw.
(e.g. HivePlot.edges[axis_0][axis_1][<tag>]["ids"] is non-empty but HivePlot.edges[axis_0][axis_1][<tag>]["curves"] does not yet exist).
Note
Checks all <tag> values between axes.
- Parameters:
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
short_arc – whether to take the shorter angle arc (True) or longer angle arc (False). There are always two ways to traverse between axes: with one angle being x, the other option being 360 - x. For most visualizations, the user should expect to traverse the “short arc,” hence the default True. For full user flexibility, however, we offer the ability to force the arc the other direction, the “long arc” (short_arc=False). Note: in the case of 2 axes 180 degrees apart, there is no “wrong” angle, so in this case an initial decision will be made, but switching this boolean will switch the arc to the other hemisphere.
control_rho_scale – how much to multiply the distance of the control point for each edge to / from the origin. Default 1 sets the control rho for each edge as the mean rho value for each pair of nodes being connected by that edge. A value greater than 1 will pull the resulting edges further away from the origin, making edges more convex, while a value between 0 and 1 will pull the resulting edges closer to the origin, making edges more concave. Note, this affects edges further from the origin by larger magnitudes than edges closer to the origin.
control_angle_shift – how far to rotate the control point for each edge around the origin. Default 0 sets the control angle for each edge as the mean angle for each pair of nodes being connected by that edge. A positive value will pull the resulting edges further counterclockwise, while a negative value will pull the resulting edges further clockwise.
- Returns:
None.
- place_nodes_on_axis(axis_id: Hashable, unique_ids: List[Hashable] | None | ndarray = None, sorting_feature_to_use: Hashable | None = None, vmin: float | None = None, vmax: float | None = None) None
Set node positions on specific Axis.
Cartesian coordinates will be normalized to specified vmin and vmax. Those vmin and vmax values will then be normalized to span the length of the axis when plotted.
- Parameters:
axis_id – which axis (as specified by the keys from HivePlot.axes) for which to plot nodes.
unique_ids – list of node IDs to assign to this axis. If previously set with HivePlot._allocate_nodes_to_axis(), this will overwrite those node assignments. If None, method will check and confirm there are existing node ID assignments.
sorting_feature_to_use – which feature in the node data to use to align nodes on an axis. Default None uses the feature previously assigned via HivePlot.axes[axis_id]._set_node_placement_label().
vmin – all values less than vmin will be set to vmin. Default None sets as global minimum of feature values for all Node instances on specified Axis.
vmax – all values greater than vmax will be set to vmax. Default None sets as global maximum of feature values for all Node instances on specified Axis.
- Returns:
None.
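The clipping and scaling described by vmin and vmax can be sketched as follows. The helper name positions_along_axis is hypothetical, and the linear mapping of the [vmin, vmax] range onto the [start, end] span of the axis is an assumption based on the description above rather than a statement of the library's exact arithmetic.

```python
from typing import List, Optional


def positions_along_axis(
    values: List[float],
    start: float,
    end: float,
    vmin: Optional[float] = None,
    vmax: Optional[float] = None,
) -> List[float]:
    """Map feature values to distances from the origin along one axis."""
    # Default None falls back to the observed minimum / maximum on this axis.
    vmin = min(values) if vmin is None else vmin
    vmax = max(values) if vmax is None else vmax
    positions = []
    for value in values:
        # Values outside [vmin, vmax] are clipped to the nearer bound.
        clipped = min(max(value, vmin), vmax)
        # Guard against a zero-width range (an assumption for this sketch).
        fraction = 0.0 if vmax == vmin else (clipped - vmin) / (vmax - vmin)
        positions.append(start + fraction * (end - start))
    return positions


# Axis spans rho = 1 to rho = 5; the value 20 exceeds vmax=10 and is clipped.
rhos = positions_along_axis([0, 5, 10, 20], start=1, end=5, vmax=10)
```

Here the four values land at distances 1.0, 3.0, 5.0, and 5.0 from the origin, so the outlier shares the far end of the axis with the value equal to vmax.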
- reset_edges(axis_id_1: Hashable | None = None, axis_id_2: Hashable | None = None, tag: Hashable | None = None, a1_to_a2: bool = True, a2_to_a1: bool = True) None
Reset HivePlot.edges.
Setting all the parameters to None deletes any stored connections between axes previously computed. If any subset of the parameters is not None, the corresponding edges will be deleted as follows:
If axis_id_1, axis_id_2, and tag are all specified as not None, the implied single subset of edges will be deleted. (Note, tags are required to be unique within a specified (axis_id_1, axis_id_2) pair.) In this case, the default is to delete all the edges bidirectionally (e.g. going axis_id_1 -> axis_id_2 and axis_id_2 -> axis_id_1) with the specified tag. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.
If only axis_id_1 and axis_id_2 are provided as not None, then the default is to delete all edge subsets bidirectionally between axis_id_1 and axis_id_2 (e.g. going axis_id_1 -> axis_id_2 and axis_id_2 -> axis_id_1), regardless of tag. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.
If only axis_id_1 is provided as not None, then all edges going TO and FROM axis_id_1 will be deleted. To only delete edges in one of these directions, see the description of the bool parameters a1_to_a2 and a2_to_a1 below.
- Parameters:
axis_id_1 – specifies edges all coming FROM the axis identified by this unique ID.
axis_id_2 – specifies edges all coming TO the axis identified by this unique ID.
tag – tag corresponding to explicit subset of added edges.
a1_to_a2 – whether to remove the connections going FROM axis_id_1 TO axis_id_2. Note, if axis_id_1 is specified but axis_id_2 is None, then this dictates whether to remove all edges going from axis_id_1.
a2_to_a1 – whether to remove the connections going FROM axis_id_2 TO axis_id_1. Note, if axis_id_1 is specified but axis_id_2 is None, then this dictates whether to remove all edges going to axis_id_1.
- Returns:
None.
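The three deletion cases above can be traced on a plain nested dictionary. The function reset_edges_sketch is a hypothetical illustration of the documented behavior, not the library's code; parameter combinations that the description does not cover (for example a tag with no axis IDs) simply raise a ValueError here, which is an assumption of this sketch.

```python
from typing import Dict, Hashable, Optional

EdgeStore = Dict[Hashable, Dict[Hashable, Dict[Hashable, dict]]]


def reset_edges_sketch(
    edges: EdgeStore,
    axis_id_1: Optional[Hashable] = None,
    axis_id_2: Optional[Hashable] = None,
    tag: Optional[Hashable] = None,
    a1_to_a2: bool = True,
    a2_to_a1: bool = True,
) -> None:
    if axis_id_1 is None and axis_id_2 is None and tag is None:
        edges.clear()  # everything set to None: drop all stored connections
        return
    if axis_id_1 is None or (axis_id_2 is None and tag is not None):
        raise ValueError("combination not covered by the description above")
    if axis_id_2 is None:
        if a1_to_a2:  # all edges going FROM axis_id_1
            edges.pop(axis_id_1, None)
        if a2_to_a1:  # all edges going TO axis_id_1
            for sinks in edges.values():
                sinks.pop(axis_id_1, None)
        return
    directions = []
    if a1_to_a2:
        directions.append((axis_id_1, axis_id_2))
    if a2_to_a1:
        directions.append((axis_id_2, axis_id_1))
    for source, sink in directions:
        by_tag = edges.get(source, {}).get(sink)
        if by_tag is None:
            continue
        if tag is None:
            del edges[source][sink]  # every tag between the pair
        else:
            by_tag.pop(tag, None)  # only the single tagged subset


edges: EdgeStore = {
    "A": {"B": {0: {"ids": []}, 1: {"ids": []}}, "C": {0: {"ids": []}}},
    "B": {"A": {0: {"ids": []}}},
}
reset_edges_sketch(edges, "A", "B", tag=0)
after_tag_reset = {s: {t: sorted(tags) for t, tags in sinks.items()} for s, sinks in edges.items()}

reset_edges_sketch(edges, "A", a2_to_a1=False)  # only edges going FROM "A"
after_axis_reset = sorted(edges)

reset_edges_sketch(edges)
```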
- to_json() str
Return the information from the axes, nodes, and edges in Cartesian space as a serialized JSON string.
This allows users to visualize hive plots with arbitrary libraries, even outside of Python.
The dictionary structure of the resulting JSON will consist of two top-level keys:
“axes” - contains the information for plotting each axis, plus the nodes on each axis in Cartesian space.
“edges” - contains the information for plotting the discretized edges in Cartesian space, plus the corresponding to and from IDs that go with each edge, as well as any kwargs that were set for plotting each set of edges.
- Returns:
JSON output of axis, node, and edge information.
Quick Hive Plots
- hiveplotlib.hive_plot_n_axes(node_list: List[Node], edges: ndarray | List[ndarray], axes_assignments: List[List[Hashable | None]], sorting_variables: List[Hashable], axes_names: List[Hashable] | None = None, repeat_axes: List[bool] | None = None, vmins: List[float] | None = None, vmaxes: List[float] | None = None, angle_between_repeat_axes: float = 40, orient_angle: float = 0, all_edge_kwargs: Dict | None = None, edge_list_kwargs: List[Dict] | None = None, cw_edge_kwargs: Dict | None = None, ccw_edge_kwargs: Dict | None = None, repeat_edge_kwargs: Dict | None = None) HivePlot
Generate a HivePlot instance with an arbitrary number of axes, as specified by passing a partition of node IDs.
Repeat axes can be generated for any desired subset of axes, but repeat axes will be sorted by the same variable as the original axis.
Axes will be added in counterclockwise order.
Axes will all be the same length and position from the origin.
Changes to all the edge kwargs can be made with the all_edge_kwargs parameter. If providing multiple sets of edges (e.g. a list input for the edges parameter), one can also provide unique kwargs for each set of edges by specifying a corresponding list of kwargs with the edge_list_kwargs parameter.
Edges directed counterclockwise will be drawn as solid lines by default. Clockwise edges will be drawn as solid lines by default. All CW / CCW lines kwargs can be changed with the cw_edge_kwargs and ccw_edge_kwargs parameters, respectively. Edges between repeat axes will be drawn as solid lines by default. Repeat edges operate under their own set of visual kwargs (repeat_edge_kwargs) as clockwise vs counterclockwise edges don’t have much meaning when looking within a single group.
Specific edge kwargs can also be changed by running the add_edge_kwargs() method on the resulting HivePlot instance, where the specified tag of edges to change will be the index of that set of edges in the edges list (note: a tag is only necessary if edges was provided as a list, otherwise there would only be a single tag of edges, which can be inferred).
There is a hierarchy to these various kwarg arguments. That is, if redundant / overlapping kwargs are provided for different kwarg parameters, a warning will be raised and priority will be given according to the below hierarchy (Note: cw_edge_kwargs, ccw_edge_kwargs, and repeat_edge_kwargs do not interact with each other in practice, and are therefore equal in the hierarchy):
edge_list_kwargs > cw_edge_kwargs / ccw_edge_kwargs / repeat_edge_kwargs > all_edge_kwargs.
- Parameters:
node_list – List of Node instances to go into output HivePlot instance.
edges – (n, 2) array of Hashable values representing pointers to specific Node instances. The first column is the “from” and the second column is the “to” for each connection. Alternatively, one can provide a list of two-column arrays, which will allow for plotting different sets of edges with different kwargs.
axes_assignments – list of lists of node unique IDs. Each list of node IDs will be assigned to a separate axis in the resulting HivePlot instance, built out in counterclockwise order. If None is provided as one of the elements instead of a list of node IDs, then all unassigned nodes will be aggregated onto this axis.
sorting_variables – list of Hashable variables on which to sort each axis, where the ith index Hashable corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot).
axes_names – list of Hashable names for each axis, where the ith index Hashable corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). Default None names the groups as “Group 1,” “Group 2,” etc.
repeat_axes – list of bool values of whether to generate a repeat axis, where the ith index bool corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). A True value generates a repeat axis. Default None assumes no repeat axes (e.g. all False).
vmins – list of float values (or None values) specifying the vmin for each axis, where the ith index value corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). A None value infers the global min for that axis. Default None uses the global min for all the axes.
vmaxes – list of float values (or None values) specifying the vmax for each axis, where the ith index value corresponds to the ith index list of nodes in axes_assignments (e.g. the ith axis of the resulting HivePlot). A None value infers the global max for that axis. Default None uses the global max for all the axes.
angle_between_repeat_axes – angle between repeat axes. Default 40 degrees.
orient_angle – rotates all axes counterclockwise from their initial angles (default 0 degrees).
all_edge_kwargs – kwargs for all edges. Default None specifies no additional kwargs.
edge_list_kwargs – list of dictionaries of kwargs for each element of edges when edges is a list. The ith set of kwargs in edge_list_kwargs will only be applied to edges constructed from the ith element of edges. Default None provides no additional kwargs. Note, list must be same length as edges.
cw_edge_kwargs – kwargs for edges going clockwise. Default None specifies a solid line.
ccw_edge_kwargs – kwargs for edges going counterclockwise. Default None specifies a solid line.
repeat_edge_kwargs – kwargs for edges between repeat axes. Default None specifies a solid line.
- Returns:
HivePlot instance.
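The kwarg hierarchy for hive_plot_n_axes() can be pictured as layered dictionary updates, applied from lowest to highest priority. The function resolve_edge_kwargs is a hypothetical illustration, not part of hiveplotlib; it omits the warning that the text says is raised on overlapping keys, and direction_kwargs stands for whichever one of cw_edge_kwargs, ccw_edge_kwargs, or repeat_edge_kwargs applies to a given set of edges.

```python
from typing import Dict, Optional


def resolve_edge_kwargs(
    all_edge_kwargs: Optional[Dict] = None,
    direction_kwargs: Optional[Dict] = None,
    edge_list_kwargs: Optional[Dict] = None,
) -> Dict:
    """Merge kwargs so that higher-priority layers override lower ones."""
    resolved: Dict = {}
    # Lowest priority first; each later update wins on shared keys.
    for layer in (all_edge_kwargs, direction_kwargs, edge_list_kwargs):
        if layer:
            resolved.update(layer)
    return resolved


resolved = resolve_edge_kwargs(
    all_edge_kwargs={"color": "black", "alpha": 0.5},
    direction_kwargs={"color": "C0", "ls": "-"},
    edge_list_kwargs={"color": "red"},
)
```

In this example color is set at all three levels, so the edge_list_kwargs value wins, while alpha and ls survive from the lower layers because nothing above them overrides those keys.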
Converters
Converters from various data structures to hiveplotlib-ready structures.
- hiveplotlib.converters.networkx_to_nodes_edges(graph: networkx.classes.graph.Graph instance) Tuple[List[Node], ndarray]
Take a networkx graph and return hiveplotlib-friendly data structures.
Specifically, returns a list of hiveplotlib.Node instances and an (n, 2) np.ndarray of edges. These outputs can be fed directly into hive_plot_n_axes().
- Parameters:
graph – networkx graph.
- Returns:
list of Node instances, (n, 2) np.ndarray of edges.
Utility Functions
Helper static methods for working with node data.
- hiveplotlib.node.dataframe_to_node_list(df: DataFrame, unique_id_column: Hashable) List[Node]
Convert a dataframe into Node instances, where each row will be turned into a single instance.
- Parameters:
df – dataframe to use to generate Node instances.
unique_id_column – which column corresponds to unique IDs for the eventual nodes.
- Returns:
list of Node instances.
- hiveplotlib.node.split_nodes_on_variable(node_list: List[Node], variable_name: Hashable, cutoffs: List[float] | int | None = None, labels: List[Hashable] | None = None) Dict[Hashable, List[Node]]
Split a list of Node instances into a partition of node IDs.
By default, splits will group node IDs on unique values of variable_name.
If variable_name corresponds to numerical data, and a list of cutoffs is provided, node IDs will be separated into bins according to the following binning scheme:
(-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf)
If variable_name corresponds to numerical data, and cutoffs is provided as an int, node IDs will be separated into cutoffs equal-sized quantiles.
Note
This method currently only supports splits where variable_name corresponds to numerical data.
- Parameters:
node_list – list of Node instances to partition.
variable_name – which variable in each Node instance to group by.
cutoffs – cutoffs to use in binning nodes according to data under variable_name. Default None will bin nodes by unique values of variable_name. When provided as a list, the specified cutoffs will bin according to (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf). When provided as an int, the exact numerical break points will be determined to create cutoffs equally-sized quantiles.
labels – labels assigned to each bin. Only referenced when cutoffs is not None. Default None labels each bin as a string based on its range of values. Note, when cutoffs is a list, len(labels) must be 1 greater than len(cutoffs). When cutoffs is an int, len(labels) must be equal to cutoffs.
- Returns:
dict whose values are lists of Node unique IDs. If cutoffs is None, keys will be the unique values for the variable. Otherwise, each key will be the string representation of a bin range.
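The right-closed binning scheme for a list of cutoffs can be reproduced with the standard library's bisect module. This sketch is not the library's implementation: split_on_cutoffs is a hypothetical name, it takes a plain mapping of node ID to value instead of Node instances, it assumes cutoffs is sorted in increasing order, and it keys the result by integer bin index rather than by the string labels the real function returns.

```python
import bisect
from typing import Dict, Hashable, List


def split_on_cutoffs(
    values_by_id: Dict[Hashable, float], cutoffs: List[float]
) -> Dict[int, List[Hashable]]:
    """Partition IDs into (-inf, c0], (c0, c1], ..., (c_last, inf)."""
    bins: Dict[int, List[Hashable]] = {i: [] for i in range(len(cutoffs) + 1)}
    for node_id, value in values_by_id.items():
        # bisect_left counts the cutoffs strictly below value, so a value
        # equal to a cutoff falls into the bin that cutoff closes on the right.
        bins[bisect.bisect_left(cutoffs, value)].append(node_id)
    return bins


partition = split_on_cutoffs({"a": 1, "b": 5, "c": 5.1, "d": 12}, cutoffs=[5, 10])
```

Two cutoffs produce three bins, which is why len(labels) must be one greater than len(cutoffs) when cutoffs is a list. Node "b" has a value exactly equal to the first cutoff and therefore lands in the first bin.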
Utility functions for hive plot curvature and coordinates.
- hiveplotlib.utils.bezier(start: float, end: float, control: float, num_steps: int = 100) ndarray
Calculate 1-dimensional Bézier curve values between start and end with curve based on control.
Note, this function is hardcoded for exactly 1 control point.
- Parameters:
start – starting point.
end – ending point.
control – “pull” point.
num_steps – number of points on Bézier curve.
- Returns:
(num_steps, ) sized np.ndarray of 1-dimensional discretized Bézier curve output.
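A Bézier curve with exactly one control point is a quadratic Bézier curve, B(t) = (1 - t)^2 * start + 2(1 - t)t * control + t^2 * end for t in [0, 1]. The sketch below evaluates that formula in pure Python. The name quadratic_bezier is hypothetical, and the evenly spaced sampling of t is an assumption about how the discretization is done; it returns a list rather than an np.ndarray.

```python
from typing import List


def quadratic_bezier(start: float, end: float, control: float, num_steps: int = 100) -> List[float]:
    """Sample a 1-dimensional quadratic Bézier curve at num_steps points."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2 to include both endpoints")
    values = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # evenly spaced from 0 to 1 inclusive
        values.append((1 - t) ** 2 * start + 2 * (1 - t) * t * control + t ** 2 * end)
    return values


curve = quadratic_bezier(start=0, end=2, control=3, num_steps=5)
```

The curve starts at start and finishes at end, but it is only pulled toward control without passing through it: at t = 0.5 the value here is 2.0, not 3. Applying the same formula separately to x and y coordinates gives a 2-dimensional curve.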
- hiveplotlib.utils.bezier_all(start_arr: List[float] | ndarray, end_arr: List[float] | ndarray, control_arr: List[float] | ndarray, num_steps: int = 100) ndarray
Calculate Bézier curve between multiple start and end values.
Note, this function is hardcoded for exactly 1 control point per curve.
- Parameters:
start_arr – starting point of each curve.
end_arr – corresponding ending point of each curve.
control_arr – corresponding “pull” points for each curve.
num_steps – number of points on each Bézier curve.
- Returns:
(len(start_arr) * num_steps, ) sized np.ndarray of 1-dimensional discretized Bézier curve output. Note, every num_steps chunk of the output corresponds to a different Bézier curve.
- hiveplotlib.utils.cartesian2polar(x: ndarray | float, y: ndarray | float) Tuple[ndarray | float, ndarray | float]
Convert cartesian coordinates e.g. (x, y) to polar coordinates.
(Polar coordinates e.g. (rho, phi), where rho is distance from origin, and phi is counterclockwise angle off of x-axis in degrees.)
- Parameters:
x – Cartesian x coordinates.
y – Cartesian y coordinates.
- Returns:
(rho, phi) polar coordinates.
- hiveplotlib.utils.polar2cartesian(rho: ndarray | float, phi: ndarray | float) Tuple[ndarray | float, ndarray | float]
Convert polar coordinates (rho, phi) to Cartesian coordinates (x, y).
Here rho is the distance from the origin, and phi is the counterclockwise angle off of the x-axis in degrees.
- Parameters:
rho – distance from origin.
phi – counterclockwise angle off of x-axis in degrees (not radians).
- Returns:
(x, y) Cartesian coordinates.
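Because both conversions work in degrees rather than radians, a round trip is a handy sanity check. The sketch below uses only the standard library and scalar inputs; the function names are hypothetical, and the range of the returned angle (here (-180, 180] from math.atan2) is an assumption that may differ from the library's output.

```python
import math
from typing import Tuple


def cartesian_to_polar_deg(x: float, y: float) -> Tuple[float, float]:
    """Return (rho, phi) with phi as a counterclockwise angle in degrees."""
    rho = math.hypot(x, y)
    phi = math.degrees(math.atan2(y, x))  # assumption: angle reported in (-180, 180]
    return rho, phi


def polar_to_cartesian_deg(rho: float, phi: float) -> Tuple[float, float]:
    """Return (x, y) for a radius rho and an angle phi given in degrees."""
    phi_rad = math.radians(phi)
    return rho * math.cos(phi_rad), rho * math.sin(phi_rad)


# A point straight "North" of the origin sits at 90 degrees.
rho, phi = cartesian_to_polar_deg(0.0, 2.0)
x, y = polar_to_cartesian_deg(rho, phi)
```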
Polar Parallel Coordinates Plots
P2CP Class
- class hiveplotlib.P2CP(data: DataFrame | None = None)
Polar Parallel Coordinates Plots (P2CPs).
Conceptually similar to Hive Plots, P2CPs can be used for any multivariate data as opposed to solely for network visualizations. Features of the data are placed on their own axes in the same polar setup as Hive Plots, resulting in each representation of a complete data point being a loop in the resulting figure. For more on the nuances of P2CPs, see Koplik and Valente, 2021.
- add_edge_kwargs(tag: Hashable | None = None, **edge_kwargs) None
Add edge kwargs to a tag of Bézier curves previously constructed with P2CP.build_edges().
For a given tag of curves for which edge kwargs were already set, any redundant edge kwargs specified by this method call will overwrite the previously set kwargs.
Note
P2CP.build_edges() is expected to have been called for the tag of interest before calling this method. However, if no tags were ever set (e.g. there’s only 1 tag of curves), then no tag is necessary here.
- Parameters:
tag – which subset of curves to modify kwargs for. Note, if no tag is specified (e.g. tag=None), it is presumed there is only one tag to look over and that will be inferred. If no tag is specified and there are multiple tags to choose from, a ValueError will be raised.
edge_kwargs – additional matplotlib keyword arguments that will be applied to edges constructed for the referenced indices.
- Returns:
None.
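The tag inference and overwrite behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not the method's actual code: resolve_tag and merge_edge_kwargs are hypothetical helpers, and raising a ValueError when no tags exist at all is an assumption.

```python
from typing import Dict, Hashable, Optional, Sequence


def resolve_tag(existing_tags: Sequence[Hashable], tag: Optional[Hashable] = None) -> Hashable:
    """Return the tag to modify, inferring it only when exactly one tag exists."""
    if tag is not None:
        return tag
    if len(existing_tags) != 1:
        # Multiple tags (or, by assumption, none) make the choice ambiguous.
        raise ValueError("Must specify a tag when there is not exactly one tag.")
    return existing_tags[0]


def merge_edge_kwargs(previous: Dict, new: Dict) -> Dict:
    """Newly supplied kwargs overwrite previously set kwargs with the same key."""
    merged = dict(previous)
    merged.update(new)
    return merged


only_tag = resolve_tag([0])
merged = merge_edge_kwargs({"color": "black", "alpha": 0.2}, {"alpha": 1.0})
```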
- build_edges(indices: List[int] | ndarray | str = 'all', tag: Hashable | None = None, num_steps: int = 100, **edge_kwargs) Hashable
Construct the loops of the P2CP for the specified subset of indices.
These index values correspond to the indices of the pandas dataframe P2CP.data.
Note
Specifying indices="all" draws the curves for the entire dataframe.
- Parameters:
indices – which indices of the underlying dataframe to draw on the P2CP. Note, “all” draws the entire dataframe.
tag – tag corresponding to specified indices. If None is provided, the tag will be set as the lowest unused integer starting at 0 amongst the tags.
num_steps – number of points sampled along a given Bézier curve. Larger numbers will result in smoother curves when plotting later, but slower rendering.
edge_kwargs – additional matplotlib keyword arguments that will be applied to edges constructed for the referenced indices.
- Returns:
the unique, Hashable tag used for the constructed edges.
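The default tag rule ("lowest unused integer starting at 0") can be sketched as follows. The helper name next_default_tag is hypothetical and the code is an illustration of the rule as documented, not the library's implementation.

```python
from typing import Hashable, Iterable


def next_default_tag(existing_tags: Iterable[Hashable]) -> int:
    """Return the lowest non-negative integer not already used as a tag."""
    used = set(existing_tags)
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate


# Non-integer tags do not affect the search; gaps are filled first.
first = next_default_tag([])
after_mixed = next_default_tag([0, 1, "setosa"])
fills_gap = next_default_tag([1, 2])
```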
- reset_edges(tag: Hashable | None = None) None
Drop the constructed edges with the specified tag.
Note
If no tags were ever set (e.g. there’s only 1 tag of curves), then no tag is necessary here.
- Parameters:
tag – which subset of curves to delete. Note, if no tag is specified (e.g. tag=None), then all curves will be deleted.
- Returns:
None.
- set_axes(columns: List[Hashable] | ndarray, angles: List[float] | None = None, vmins: List[float] | None = None, vmaxes: List[float] | None = None, axis_kwargs: List[Dict] | None = None, overwrite_previously_set_axes: bool = True, start_angle: float = 0) None
Set the axes that will be used in the eventual P2CP visualization.
- Parameters:
columns – column names from P2CP.data to use. Note, these need not be unique, as repeat axes may be desired. By default, repeat column names will be internally renamed to name + "\nRepeat".
angles – corresponding angles (in degrees) to set for each desired axis. Default None sets the angles evenly spaced over 360 degrees, starting at start_angle degrees for the first axis and moving counterclockwise.
vmins – list of float values (or None values) specifying the vmin for each axis, where the ith index value corresponds to the ith axis set by columns. A None value infers the global min for that axis. Default None uses the global min for all axes.
vmaxes – list of float values (or None values) specifying the vmax for each axis, where the ith index value corresponds to the ith axis set by columns. A None value infers the global max for that axis. Default None uses the global max for all axes.
axis_kwargs – list of dictionaries of additional kwargs that will be used for the underlying Axis instances that will be created for each column. Only relevant if you want to change the positioning / length of an axis with the start and end parameters. For more on these kwargs, see the documentation for hiveplotlib.Axis. Note, if you want to add these kwargs for only a subset of the desired axes, you can skip adding kwargs for specific columns by putting a None at those indices in your axis_kwargs input.
overwrite_previously_set_axes – whether to overwrite any previously decided axes. Default True overwrites any existing axes.
start_angle – if angles is None, sets the starting angle from which we place the axes around the origin counterclockwise.
- Returns:
None.
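The default angle placement (evenly spaced over 360 degrees, starting at start_angle and moving counterclockwise) can be sketched as below. The helper evenly_spaced_angles is hypothetical; whether the library wraps angles back into [0, 360) is not stated above, so this sketch leaves them unwrapped.

```python
from typing import List


def evenly_spaced_angles(num_axes: int, start_angle: float = 0.0) -> List[float]:
    """Spread num_axes angles (in degrees) evenly over a full 360-degree turn."""
    if num_axes < 1:
        raise ValueError("need at least one axis")
    step = 360.0 / num_axes
    # Assumption: angles are not wrapped back into [0, 360).
    return [start_angle + i * step for i in range(num_axes)]


three_axes = evenly_spaced_angles(3)
four_axes_rotated = evenly_spaced_angles(4, start_angle=45.0)
```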
- set_data(data: DataFrame) None
Add a dataset to the P2CP instance.
All P2CP construction will be based on this dataset, which will be stored as P2CP.data.
- Parameters:
data – dataframe to add.
- Returns:
None.
- to_json() str
Return the information from the axes, point placement on each axis, and edges in Cartesian space as JSON.
This allows users to visualize P2CPs with arbitrary libraries, even outside of Python.
The dictionary structure of the resulting JSON will consist of two top-level keys:
“axes” - contains the information for plotting each axis, plus the points on each axis in Cartesian space.
“edges” - contains the information for plotting the discretized edges in Cartesian space broken up by tag values, plus the corresponding unique IDs of points that go with each tag, as well as any kwargs that were set for plotting each set of points in a given tag.
- Returns:
JSON output of axis, point, and edge information.
Quick P2CPs
- hiveplotlib.p2cp_n_axes(data: DataFrame, indices: List[int] | List[List[int]] | List[ndarray] | str = 'all', split_on: Hashable | List[Hashable] | None = None, axes: List[Hashable] | None = None, vmins: List[float] | None = None, vmaxes: List[float] | None = None, orient_angle: float = 0, all_edge_kwargs: Dict | None = None, indices_list_kwargs: List[Dict] | None = None) P2CP
Generate a P2CP instance with an arbitrary number of axes for an arbitrary dataframe.
Can specify a desired subset of column names, each of which will become an axis in the resulting P2CP. Default grabs all columns in the dataframe, unless split_on is a column name, in which case that specified column will be excluded from the list of axes in the final P2CP instance. Note, repeat axes (e.g. repeated column names) are allowed here.
Axes will be added in counterclockwise order. Axes will all be the same length and position from the origin.
In deciding what edges of data get drawn (and how they get drawn), the user has several options. The default behavior plots all data points in data with the same keyword arguments. If one instead wanted to plot a subset of data points, one can provide a list of a subset of indices from the dataframe to the indices parameter.
If one wants to plot multiple sets of edges in different styles, there are two means of doing this. The more automated means is to split on the unique values of a column in the provided data. By specifying a column name to the split_on parameter, data will be added in chunks according to the unique values of the specified column. If one instead includes a list of values corresponding to the records in data, data will be added according to the unique values of this provided list. Each subset of data corresponding to a unique column value will be given a separate tag, with the tag being the unique column value. Note, however, this only works when indices="all". If one prefers to split indices manually, one can instead provide a list of lists to the indices parameter, allowing for arbitrary splitting of the data. Regardless of how one chooses to split the data, one can then assign different keyword arguments to each subset of data.
Changes to all the edge kwargs can be effected with the all_edge_kwargs parameter. If providing multiple sets of edges in one of the ways discussed above, one can also provide unique kwargs for each set of edges by specifying a corresponding list of dictionaries of kwargs with the indices_list_kwargs parameter.
Specific edge kwargs can also be changed later by running the add_edge_kwargs() method on the returned P2CP instance. If one only added a single set of indices (e.g. indices="all" or indices was provided as a flat list of index values), then this method can simply be called with kwargs. However, if multiple subsets of edges were specified, then one will need to be precise about which tag of edge kwargs to change. If multiple sets were provided via the indices parameter, then the resulting tag for each subset will correspond to the index value in the list of lists in indices. If instead split_on was specified as not None, then tags will be the unique values in the specified column / list of values. Regardless of splitting methodology, existing tags can be found under the returned P2CP.tags.
There is a hierarchy to these kwarg arguments. That is, if redundant / overlapping kwargs are provided for different kwarg parameters, a warning will be raised and priority will be given according to the below hierarchy: indices_list_kwargs > all_edge_kwargs.
- Parameters:
data – dataframe to add.
indices – list of index values from the index of the added dataframe data. Default “all” creates edges for every row in data, but a list input creates edges for only the specified subset. Alternatively, one can provide a list of lists of indices, which will allow for plotting different sets of edges with different kwargs. These subsets will be added to the resulting P2CP instance with tags corresponding to the index value in indices.
split_on – column name from data or list of values corresponding to the records of data. If specified as not None, the resulting P2CP instance will split data according to unique values with respect to the column of data / the list of provided values, with each subset of data given a tag of the unique value corresponding to each subset. When specifying a column in data, this column will be excluded from consideration if axes is None. Note: this subsetting can only be run when indices="all". Default None plots all the records in data with the same line kwargs.
axes – list of Hashable column names in data. Each column name will be assigned to a separate axis in the resulting P2CP instance, built out in counterclockwise order. Default None grabs all columns in the dataframe, unless split_on is a column name, in which case that specified column will be excluded from the list of axes in the final P2CP instance. Note, repeat axes (e.g. repeated column names) are allowed here.
vmins – list of float values (or None values) specifying the vmin for each axis, where the ith index value corresponds to the ith index axis in axes (e.g. the ith axis of the resulting P2CP instance). A None value infers the global min for that axis. Default None uses the global min for all the axes.
vmaxes – list of float values (or None values) specifying the vmax for each axis, where the ith index value corresponds to the ith index axis in axes (e.g. the ith axis of the resulting P2CP instance). A None value infers the global max for that axis. Default None uses the global max for all the axes.
orient_angle – rotates all axes counterclockwise from their initial angles (default 0 degrees).
all_edge_kwargs – kwargs for all edges. Default None specifies no additional kwargs.
indices_list_kwargs – list of dictionaries of kwargs for each element of indices when indices is a list of lists or split_on is not None. The ith set of kwargs in indices_list_kwargs will only be applied to index values corresponding to the ith list in indices or to index values which have the ith unique value in a sorted list of unique values in split_on. Default None provides no additional kwargs. Note, this list must be the same length as indices or the same number of values as the number of unique values in split_on.
- Returns:
P2CP instance.
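Two of the rules above lend themselves to a short illustration: the kwarg hierarchy (indices_list_kwargs > all_edge_kwargs, with a warning on overlap) and the pairing of the ith kwargs dictionary with the ith sorted unique value of split_on. The sketch below is plain Python under stated assumptions; resolve_edge_kwargs and kwargs_per_tag are hypothetical helpers, and the warning text is invented for illustration.

```python
import warnings
from typing import Dict, Hashable, List, Optional, Sequence


def resolve_edge_kwargs(all_edge_kwargs: Optional[Dict], subset_kwargs: Optional[Dict]) -> Dict:
    """Combine kwargs so that per-subset values win over the all-edge values."""
    resolved = dict(all_edge_kwargs or {})
    specific = dict(subset_kwargs or {})
    overlap = sorted(set(resolved) & set(specific))
    if overlap:
        warnings.warn(f"Overlapping kwargs {overlap}: using the per-subset values.", stacklevel=2)
    resolved.update(specific)
    return resolved


def kwargs_per_tag(split_values: Sequence[Hashable], indices_list_kwargs: List[Dict]) -> Dict:
    """Pair the ith kwargs dict with the ith value in the sorted unique split values."""
    tags = sorted(set(split_values))
    if len(tags) != len(indices_list_kwargs):
        raise ValueError("need exactly one kwargs dict per unique split value")
    return dict(zip(tags, indices_list_kwargs))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = resolve_edge_kwargs({"color": "black", "alpha": 0.5}, {"color": "red"})

per_tag = kwargs_per_tag(["b", "a", "b"], [{"color": "blue"}, {"color": "orange"}])
```

Note that the pairing follows the sorted order of the unique values, not their order of first appearance.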
Utility Functions
Helper static methods for generating and working with P2CP instances.
- hiveplotlib.p2cp.indices_for_unique_values(df: DataFrame, column: Hashable) Dict[Hashable, ndarray]
Find the indices corresponding to each unique value in a column of a pandas dataframe.
Works when the values contained in column are numerical or categorical.
- Parameters:
df – dataframe from which to find index values.
column – column of the dataframe to use to find indices corresponding to each of the column’s unique values.
- Returns:
dict whose keys are the unique values in the column of data and whose values are 1d arrays of index values.
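The shape of the returned mapping can be illustrated without pandas. In this sketch, two parallel sequences stand in for a dataframe's index and one of its columns, and lists stand in for the 1d arrays; indices_by_unique_value is a hypothetical name, not the library function.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def indices_by_unique_value(
    index: Sequence[Hashable], values: Sequence[Hashable]
) -> Dict[Hashable, List[Hashable]]:
    """Group index labels by the value found at each position."""
    if len(index) != len(values):
        raise ValueError("index and values must have equal length")
    groups: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for idx, value in zip(index, values):
        groups[value].append(idx)
    return dict(groups)


groups = indices_by_unique_value([10, 11, 12, 13], ["a", "b", "a", "c"])
```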
- hiveplotlib.p2cp.split_df_on_variable(df: DataFrame, column: Hashable, cutoffs: List[float] | int, labels: List[Hashable] | ndarray | None = None) ndarray
Generate value for each record in a dataframe according to a splitting criterion.
Using either specified cutoff values or a specified number of quantiles for cutoffs, return an (n, 1) np.ndarray where the ith value corresponds to the partition assignment of the ith record of df.
If column corresponds to numerical data, and a list of cutoffs is provided, then dataframe records will be assigned according to the following binning scheme:
(-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf)
If column corresponds to numerical data, and cutoffs is provided as an int, then dataframe records will be assigned into cutoffs equal-sized quantiles.
Note
This method currently only supports splits where column corresponds to numerical data. For splits on categorical data values, see indices_for_unique_values().
- Parameters:
df – dataframe whose records will be assigned to a partition.
column – column of the dataframe to use to assign partition of records.
cutoffs – cutoffs to use in partitioning records according to the data under column. When provided as a list, the specified cutoffs will partition according to (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf). When provided as an int, the exact numerical break points will be determined to create cutoffs equally-sized quantiles.
labels – labels assigned to each bin. Default None labels each bin as a string based on its range of values. Note, when cutoffs is a list, len(labels) must be 1 greater than len(cutoffs). When cutoffs is an int, len(labels) must be equal to cutoffs.
- Returns:
(n, 1) np.ndarray whose values are partition assignments corresponding to records in df.
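The right-closed binning scheme for a list of cutoffs can be sketched with the standard library's bisect module. This is a simplified illustration, not the library's implementation: assign_bins is a hypothetical name, it works on a plain list of numbers, and when labels is None it returns integer bin positions instead of range strings.

```python
import bisect
from typing import Hashable, List, Optional, Sequence


def assign_bins(
    values: Sequence[float],
    cutoffs: Sequence[float],
    labels: Optional[Sequence[Hashable]] = None,
) -> List[Hashable]:
    """Assign each value to (-inf, c0], (c0, c1], ..., (c_last, inf)."""
    ordered = sorted(cutoffs)
    if labels is not None and len(labels) != len(ordered) + 1:
        raise ValueError("len(labels) must be 1 greater than len(cutoffs)")
    # bisect_left puts a value equal to a cutoff into the bin that ends at that cutoff.
    positions = [bisect.bisect_left(ordered, value) for value in values]
    if labels is None:
        return positions  # simplification: integer positions, not range strings
    return [labels[pos] for pos in positions]


assigned = assign_bins([1, 5, 5.1, 10, 11], cutoffs=[5, 10], labels=["low", "mid", "high"])
```

Values exactly equal to a cutoff land in the lower bin, which matches the right-closed intervals in the scheme above.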
Visualization
Matplotlib
matplotlib-backend visualizations in hiveplotlib.
- hiveplotlib.viz.matplotlib.axes_viz(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, text_kwargs: dict | None = None, **axes_kwargs) Tuple[Figure, Axes]
matplotlib visualization of axes in a HivePlot or P2CP instance.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw axes.
fig – default None builds new figure. If a figure is specified, axes will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.
ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis moving from the origin of the plot).
axes_labels_fontsize – font size for axes labels.
figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.
text_kwargs – additional kwargs passed to plt.text() call.
axes_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a plt.plot() call.
- Returns:
matplotlib figure, axis.
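The arithmetic behind buffer and axes_labels_buffer can be made concrete with a small sketch. It only reproduces the formulas stated in the parameter descriptions; plot_bounds and label_position are hypothetical helpers, and placing the label along the axis angle at axis_end * axes_labels_buffer is an assumption about how "10% further past the end of the axis" is realized.

```python
import math
from typing import Tuple


def plot_bounds(max_radius: float, buffer: float = 0.1) -> Tuple[float, float]:
    """Return the (low, high) bound used for both x and y."""
    extent = max_radius + buffer * max_radius
    return -extent, extent


def label_position(
    axis_end: float, angle_deg: float, axes_labels_buffer: float = 1.1
) -> Tuple[float, float]:
    """Return the (x, y) point a fraction past the end of an axis (assumed placement)."""
    rho = axis_end * axes_labels_buffer
    angle_rad = math.radians(angle_deg)
    return rho * math.cos(angle_rad), rho * math.sin(angle_rad)


low, high = plot_bounds(max_radius=5.0, buffer=0.1)
label_x, label_y = label_position(axis_end=5.0, angle_deg=90.0)
```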
- hiveplotlib.viz.matplotlib.edge_viz(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, tags: Hashable | List[Hashable] | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, axes_off: bool = True, fig_kwargs: dict | None = None, **edge_kwargs) Tuple[Figure, Axes]
matplotlib visualization of constructed edges in a HivePlot or P2CP instance.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw edges.
fig – default None builds new figure. If a figure is specified, edges will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.
ax – default None builds new axis. If an axis is specified, edges will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a matplotlib.collections.LineCollection() call.
- Returns:
matplotlib figure, axis.
- hiveplotlib.viz.matplotlib.hive_plot_viz(hive_plot: HivePlot, fig: Figure | None = None, ax: Axes | None = None, tags: Hashable | List[Hashable] | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) Tuple[Figure, Axes]
matplotlib visualization of a HivePlot instance.
- Parameters:
hive_plot – HivePlot instance we want to visualize.
fig – default None builds new figure. If a figure is specified, hive plot will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.
ax – default None builds new axis. If an axis is specified, hive plot will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis moving from the origin of the plot).
axes_labels_fontsize – font size for hive plot axes labels.
axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).
node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plt.scatter() call.
axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.
text_kwargs – additional kwargs passed to plt.text() call.
fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a matplotlib.collections.LineCollection() call.
- Returns:
matplotlib figure, axis.
- hiveplotlib.viz.matplotlib.label_axes(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, buffer: float = 0.1, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, **text_kwargs) Tuple[Figure, Axes]
matplotlib visualization of axis labels in a HivePlot or P2CP instance.
For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw axis labels.
fig – default None builds new figure. If a figure is specified, axis labels will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.
ax – default None builds new axis. If an axis is specified, axis labels will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis moving from the origin of the plot).
axes_labels_fontsize – font size for axes labels.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.
text_kwargs – additional kwargs passed to plt.text() call.
- Returns:
matplotlib figure, axis.
- hiveplotlib.viz.matplotlib.node_viz(instance: HivePlot | P2CP, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, axes_off: bool = True, fig_kwargs: dict | None = None, **scatter_kwargs) Tuple[Figure, Axes]
matplotlib visualization of nodes in a HivePlot or P2CP instance that have been placed on its axes.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw nodes.
fig – default None builds new figure. If a figure is specified, nodes will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.
ax – default None builds new axis. If an axis is specified, nodes will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.
figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.
scatter_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plt.scatter() call.
- Returns:
matplotlib figure, axis.
- hiveplotlib.viz.matplotlib.p2cp_legend(p2cp: P2CP, fig: Figure, ax: Axes, tags: Hashable | List[Hashable] | None = None, title: str = 'Tags', line_kwargs: dict | None = None, **legend_kwargs) Tuple[Figure, Axes]
Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.
- Parameters:
p2cp – P2CP instance we want to visualize.
fig – matplotlib figure on which we will draw the legend.
ax – matplotlib axis on which we will draw the legend.
tags – which tags of data to include in the legend. Default None uses all tags under p2cp.tags. This can be ignored unless explicitly wanting to exclude certain tags from the legend.
title – title of the legend. Default “Tags”.
line_kwargs – keyword arguments that will add to / overwrite _all_ of the legend line markers from the defaults used in the original P2CP instance plot. For example, if one plots a large number of lines with low alpha and / or a small lw, one will likely want to include line_kwargs=dict(alpha=1, lw=2) so the representative lines in the legend are legible.
legend_kwargs – additional params that will be applied to the legend. Note, these are kwargs that affect a plt.legend() call. Default is to plot the legend in the upper right, outside of the bounding box (e.g. loc="upper left", bbox_to_anchor=(1, 1)).
- Returns:
matplotlib figure, axis.
- hiveplotlib.viz.matplotlib.p2cp_viz(p2cp: P2CP, fig: Figure | None = None, ax: Axes | None = None, tags: Hashable | List[Hashable] | None = None, figsize: Tuple[float, float] = (10, 10), center_plot: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) Tuple[Figure, Axes]
matplotlib visualization of a P2CP instance.
- Parameters:
p2cp – P2CP instance we want to visualize.
fig – default None builds new figure. If a figure is specified, P2CP will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.
ax – default None builds new axis. If an axis is specified, P2CP will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
figsize – size of figure. Note: only works if instantiating new figure and axes (e.g. fig and ax are None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis moving from the origin of the plot).
axes_labels_fontsize – font size for P2CP axes labels.
axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).
node_kwargs – additional params that will be applied to all points on axes. Note, these are kwargs that affect a plt.scatter() call.
axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.
fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a matplotlib.collections.LineCollection() call.
- Returns:
matplotlib figure, axis.
Bokeh
bokeh-backend visualizations in hiveplotlib.
- hiveplotlib.viz.bokeh.axes_viz(instance: HivePlot | P2CP, fig: figure | None = None, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, label_kwargs: dict | None = None, **line_kwargs) figure
bokeh visualization of axes in a HivePlot or P2CP instance.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw axes.
fig – default None builds new figure. If a figure is specified, axes will be drawn on that figure.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for axes labels.
width – width of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
label_kwargs – additional kwargs passed to bokeh.models.Label() call.
line_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a bokeh.models.Line() call.
- Returns:
bokeh figure.
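The bounds formula in the buffer description can be checked with a few lines of plain Python. This is only a sketch of the arithmetic as documented; the helper name buffered_bounds is hypothetical and is not part of hiveplotlib.

```python
from typing import Tuple


def buffered_bounds(max_radius: float, buffer: float = 0.3) -> Tuple[float, float]:
    """Return the (low, high) bound used for both x and y.

    Implements the documented formula:
    (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
    """
    pad = buffer * max_radius
    return (-max_radius - pad, max_radius + pad)


# Assumption for illustration: the longest Axis ends at radius 5
# (the default `end` of an Axis) and buffer keeps its default of 0.3.
low, high = buffered_bounds(max_radius=5, buffer=0.3)
print(low, high)
```

A larger buffer leaves more empty space around the axes, which is useful when long axis labels would otherwise be clipped.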
- hiveplotlib.viz.bokeh.edge_viz(instance: HivePlot | P2CP, fig: figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, fig_kwargs: dict | None = None, **edge_kwargs) figure
bokeh visualization of constructed edges in a HivePlot or P2CP instance.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw edges.
fig – default None builds new figure. If a figure is specified, edges will be drawn on that figure.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
width – width of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a bokeh.models.MultiLine() call.
- Returns:
bokeh figure.
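The priority rule for edge_kwargs (kwargs stored on the instance beforehand win over kwargs passed at visualization time) amounts to a dictionary merge. The sketch below only illustrates that rule; resolve_edge_kwargs and the example keys are hypothetical and do not reflect how hiveplotlib implements it internally.

```python
from typing import Any, Dict


def resolve_edge_kwargs(stored: Dict[str, Any], viz_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two kwarg dicts so that previously stored kwargs take priority."""
    # Later entries in a dict display override earlier ones,
    # so `stored` is unpacked last.
    return {**viz_kwargs, **stored}


# Hypothetical kwargs set earlier on the instance (e.g. when building edges).
stored = {"line_color": "red"}
# Hypothetical kwargs passed as **edge_kwargs to the viz call.
viz_kwargs = {"line_color": "black", "line_alpha": 0.5}

merged = resolve_edge_kwargs(stored, viz_kwargs)
print(merged)
```

Under this rule, a kwarg passed at visualization time only acts as a fallback for edges that do not already have that kwarg set.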
- hiveplotlib.viz.bokeh.hive_plot_viz(hive_plot: HivePlot, fig: figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) figure
Create default bokeh visualization of a HivePlot instance.
- Parameters:
hive_plot – HivePlot instance we want to visualize.
fig – default None builds new figure. If a figure is specified, hive plot will be drawn on that figure.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
width – width of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for hive plot axes labels.
axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).
node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a fig.scatter() call.
axes_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a bokeh.models.Line() call.
label_kwargs – additional kwargs passed to bokeh.models.Label() call.
fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a bokeh.models.MultiLine() call.
- Returns:
bokeh figure.
- hiveplotlib.viz.bokeh.label_axes(instance: HivePlot | P2CP, fig: figure | None = None, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', buffer: float = 0.3, width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, fig_kwargs: dict | None = None, **label_kwargs) figure
bokeh visualization of axis labels in a HivePlot or P2CP instance.
For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw axis labels.
fig – default None builds new figure. If a figure is specified, axis labels will be drawn on that figure.
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for axes labels.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
width – width of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
label_kwargs – additional kwargs passed to bokeh.models.Label() call.
- Returns:
bokeh figure.
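To make the effect of axes_labels_buffer concrete, the sketch below converts an axis end radius and angle into a Cartesian label position. It assumes the label sits at axes_labels_buffer times the axis end radius, along the axis angle (counterclockwise, with 0 degrees pointing East); the function label_position is hypothetical and is not a hiveplotlib call.

```python
import math
from typing import Tuple


def label_position(
    axis_end: float, angle_degrees: float, axes_labels_buffer: float = 1.1
) -> Tuple[float, float]:
    """Return the (x, y) position of a label placed radially past an axis end."""
    # A buffer of 1.1 puts the label 10% further from the origin than the axis end.
    radius = axes_labels_buffer * axis_end
    theta = math.radians(angle_degrees)
    return (radius * math.cos(theta), radius * math.sin(theta))


# An axis ending at radius 5 and pointing North (angle=90).
x, y = label_position(axis_end=5, angle_degrees=90)
print(round(x, 6), round(y, 6))
```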
- hiveplotlib.viz.bokeh.node_viz(instance: HivePlot | P2CP, fig: figure | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, fig_kwargs: dict | None = None, **scatter_kwargs) figure
bokeh visualization of nodes in a HivePlot or P2CP instance that have been placed on their axes.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw nodes.
fig – default None builds new figure. If a figure is specified, nodes will be drawn on that figure.
width – width of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).
fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
scatter_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a fig.scatter() call.
- Returns:
bokeh figure.
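The note that width and height given inside fig_kwargs are prioritized over the width and height parameters can be pictured as follows. This is a minimal sketch of the precedence rule only; resolve_figure_size is a hypothetical helper, not library code.

```python
from typing import Any, Dict, Optional


def resolve_figure_size(
    width: int = 600, height: int = 600, fig_kwargs: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Combine the size parameters with fig_kwargs, letting fig_kwargs win."""
    extra = {} if fig_kwargs is None else dict(fig_kwargs)
    # Unpacking `extra` last means any width / height it carries
    # replaces the standalone parameters.
    return {"width": width, "height": height, **extra}


# width=800 is ignored here because fig_kwargs also sets a width.
resolved = resolve_figure_size(width=800, height=400, fig_kwargs={"width": 300, "title": "demo"})
print(resolved)
```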
- hiveplotlib.viz.bokeh.p2cp_legend(p2cp: P2CP, fig: figure, tags: Hashable | List[Hashable] | None = None, title: str = 'Tags') figure
Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.
Note
The legend can be further modified by changing its attributes under fig.legend. For more on the flexibility in changing the legend, see the bokeh.models.Legend() docs.
- Parameters:
p2cp – P2CP instance we want to visualize.
fig – bokeh figure on which we will draw the legend.
tags – which tags of data to include in the legend. Default None uses all tags under p2cp.tags. This can be ignored unless explicitly wanting to exclude certain tags from the legend.
title – title of the legend. Default “Tags”.
- Returns:
bokeh figure.
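The tags parameter accepts None, a single tag, or a list of tags. One way to normalize those three cases into a list is sketched below; normalize_tags and the example tag names are hypothetical and only illustrate the documented behavior.

```python
from typing import Hashable, List, Optional, Union


def normalize_tags(
    tags: Optional[Union[Hashable, List[Hashable]]], all_tags: List[Hashable]
) -> List[Hashable]:
    """Return the list of tags to use given the user-supplied `tags` value."""
    if tags is None:
        # Default: use every tag available on the instance.
        return list(all_tags)
    if isinstance(tags, list):
        return tags
    # A single hashable tag becomes a one-element list.
    return [tags]


available = ["train", "test", "holdout"]  # stand-in for the tags on an instance
print(normalize_tags(None, available))
print(normalize_tags("test", available))
print(normalize_tags(["train", "test"], available))
```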
- hiveplotlib.viz.bokeh.p2cp_viz(p2cp: P2CP, fig: figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: str = '16px', axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) figure
Create default bokeh visualization of a P2CP instance.
- Parameters:
p2cp – P2CP instance we want to visualize.
fig – default None builds new figure. If a figure is specified, P2CP will be drawn on that figure.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
width – width of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure in pixels. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for P2CP axes labels.
axes_off – whether to turn off Cartesian x, y axes in resulting bokeh figure (default True hides the x and y axes).
node_kwargs – additional params that will be applied to all P2CP nodes. Note, these are kwargs that affect a fig.scatter() call.
axes_kwargs – additional params that will be applied to all P2CP axes. Note, these are kwargs that affect a bokeh.models.Line() call.
label_kwargs – additional kwargs passed to bokeh.models.Label() call.
fig_kwargs – additional values to be called in bokeh.plotting.figure() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a bokeh.models.MultiLine() call.
- Returns:
bokeh figure.
Holoviews
holoviews visualizations in hiveplotlib.
Currently, hiveplotlib supports a bokeh and matplotlib backend for holoviews.
- hiveplotlib.viz.holoviews.axes_viz(instance: HivePlot | P2CP, fig: Overlay | None = None, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, width: float | None = None, height: float | None = None, center_plot: bool = True, axes_off: bool = True, overlay_kwargs: dict | None = None, text_kwargs: dict | None = None, **curve_kwargs) Overlay
holoviews visualization of axes in a HivePlot or P2CP instance.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw axes.
fig – default None builds new overlay. If an overlay is specified, axes will be drawn on that overlay.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for axes labels.
width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).
overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
text_kwargs – additional kwargs passed to holoviews.Text() call.
curve_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a holoviews.Curve() call.
- Returns:
holoviews.Overlay.
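Because width and height default to None in the holoviews functions, the effective size depends on the active backend: 600 pixels for "bokeh" and 10 inches for "matplotlib", per the parameter descriptions. The sketch below shows one way such a fallback could be resolved; default_size is a hypothetical helper, and how hiveplotlib detects the backend is not shown here.

```python
from typing import Optional, Tuple

# Documented defaults: pixels for the bokeh backend, inches for matplotlib.
_BACKEND_DEFAULTS = {"bokeh": 600, "matplotlib": 10}


def default_size(
    backend: str, width: Optional[float] = None, height: Optional[float] = None
) -> Tuple[float, float]:
    """Fill in width / height with the backend-specific default when left as None."""
    if backend not in _BACKEND_DEFAULTS:
        raise ValueError(f"unsupported backend: {backend!r}")
    fallback = _BACKEND_DEFAULTS[backend]
    return (
        fallback if width is None else width,
        fallback if height is None else height,
    )


print(default_size("bokeh"))
print(default_size("matplotlib", width=8))
```

Note the unit change between backends: a value such as 600 that is sensible in pixels would be an enormous figure if interpreted in inches.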
- hiveplotlib.viz.holoviews.edge_viz(instance: HivePlot | P2CP, fig: Overlay | None = None, tags: Hashable | List[Hashable] | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, overlay_kwargs: dict | None = None, **curve_kwargs) Overlay
holoviews visualization of constructed edges in a HivePlot or P2CP instance.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw edges.
fig – default None builds new overlay. If an overlay is specified, edges will be drawn on that overlay.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).
overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
curve_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() / hiveplotlib.P2CP.build_edges() or hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() / hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a holoviews.Curve() call.
- Returns:
holoviews.Overlay.
- hiveplotlib.viz.holoviews.hive_plot_viz(hive_plot: HivePlot, fig: Overlay | None = None, tags: Hashable | List[Hashable] | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, overlay_kwargs: dict | None = None, **edge_kwargs) Overlay
Create default holoviews visualization of a HivePlot instance.
- Parameters:
hive_plot – HivePlot instance we want to visualize.
fig – default None builds new overlay. If an overlay is specified, hive plot will be drawn on that overlay.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in hive_plot.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for hive plot axes labels.
axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).
node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a holoviews.Points() call.
axes_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a holoviews.Curve() call.
text_kwargs – additional kwargs passed to holoviews.Text() call.
overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.HivePlot.connect_axes() or hiveplotlib.HivePlot.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.HivePlot.add_edge_kwargs() for more. Note, these are kwargs that affect a holoviews.Curve() call.
- Returns:
holoviews.Overlay.
- hiveplotlib.viz.holoviews.label_axes(instance: HivePlot | P2CP, fig: Overlay | None = None, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, buffer: float = 0.3, width: float | None = None, height: float | None = None, center_plot: bool = True, axes_off: bool = True, overlay_kwargs: dict | None = None, **text_kwargs) Overlay
holoviews visualization of axis labels in a HivePlot or P2CP instance.
For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw axis labels.
fig – default None builds new overlay. If an overlay is specified, axis labels will be drawn on that overlay.
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for axes labels.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).
overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
text_kwargs – additional kwargs passed to holoviews.Text() call.
- Returns:
holoviews.Overlay.
- hiveplotlib.viz.holoviews.node_viz(instance: HivePlot | P2CP, fig: Overlay | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, overlay_kwargs: dict | None = None, **points_kwargs) Overlay
holoviews visualization of nodes in a HivePlot or P2CP instance that have been placed on their axes.
- Parameters:
instance – HivePlot or P2CP instance for which we want to draw nodes.
fig – default None builds new overlay. If an overlay is specified, nodes will be drawn on that overlay.
width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in instance.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).
overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
points_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a holoviews.Points() call.
- Returns:
holoviews.Overlay.
- hiveplotlib.viz.holoviews.p2cp_legend(fig: Overlay, **legend_kwargs) Overlay
Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.
- Parameters:
p2cp – P2CP instance we want to visualize.
fig – holoviews.Overlay on which we will draw the legend.
legend_kwargs – additional values to be called in hv.Overlay().opts() call.
- Returns:
holoviews.Overlay.
- hiveplotlib.viz.holoviews.p2cp_viz(p2cp: P2CP, fig: Overlay | None = None, tags: Hashable | List[Hashable] | None = None, width: float | None = None, height: float | None = None, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, overlay_kwargs: dict | None = None, **edge_kwargs) Overlay
Create default holoviews visualization of a P2CP instance.
- Parameters:
p2cp – P2CP instance we want to visualize.
fig – default None builds new overlay. If an overlay is specified, P2CP will be drawn on that overlay.
tags – which tag(s) of data to plot. Default None plots all tags of data. Can supply either a single tag or list of tags.
width – width of figure. When the holoviews backend is set to "bokeh", width must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", width must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
height – height of figure. When the holoviews backend is set to "bokeh", height must be specified in pixels, defaulting to 600. When the holoviews backend is set to "matplotlib", height must be specified in inches, defaulting to 10. Note: only works if instantiating new figure (i.e. fig is None).
center_plot – whether to center the figure on (0, 0), the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis in p2cp.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).
show_axes_labels – whether to label the P2CP axes in the figure (uses Axis.long_name for each Axis).
axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 places labels 10% further past the end of the axis, moving away from the origin of the plot).
axes_labels_fontsize – font size for P2CP axes labels.
axes_off – whether to turn off Cartesian x, y axes in the hv.Overlay (default True hides the x and y axes).
node_kwargs – additional params that will be applied to all P2CP nodes. Note, these are kwargs that affect a holoviews.Points() call.
axes_kwargs – additional params that will be applied to all P2CP axes. Note, these are kwargs that affect a holoviews.Curve() call.
text_kwargs – additional kwargs passed to holoviews.Text() call.
overlay_kwargs – additional values to be called in hv.Overlay().opts() call. Note if width and height are added here, then they will be prioritized over the width and height parameters.
edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in hiveplotlib.P2CP.build_edges() or hiveplotlib.P2CP.add_edge_kwargs() will take priority). To overwrite previously set kwargs, see hiveplotlib.P2CP.add_edge_kwargs() for more. Note, these are kwargs that affect a holoviews.Curve() call.
- Returns:
holoviews.Overlay.
Plotly
plotly-backend visualizations in hiveplotlib.
- hiveplotlib.viz.plotly.axes_viz(instance: HivePlot | P2CP, fig: Figure | None = None, line_width: float = 1.5, opacity: float = 1.0, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, layout_kwargs: dict | None = None, label_kwargs: dict | None = None, **line_kwargs) Figure
Visualize axes in a
HivePlot
orP2CP
instance withplotly
.Note
The
line_width
parameter corresponds to the standardwidth
parameter for plotly lines. We are exposing this parameter with a different name becausewidth
is already the standard name for figure width throughouthiveplotlib.viz
.plotly
out of the box does not support standardopacity
for its line plots like it does for scatter plots, but it does support providing an alpha channel in RGBA / HSVA / HSLA strings. Theopacity
parameter in this function call will behave asopacity
behaves forplotly
scatter plots, as long as the user-provided colors are either standard named CSS colors (e.g. “blue”, “navy”, “green”) or hex colors.Users who prefer to provide colors as multi-channel RGBA / HSVA / HSLA strings will override the
opacity
parameter. For more on how to provide multi-channel color strings, see theplotly
docs for the color parameter for lines.- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw axes.fig – default
None
builds new figure. If a figure is specified, axes will be drawn on that figure.line_width – width of axes.
opacity – opacity of edges. Must be in [0, 1].
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).show_axes_labels – whether to label the hive plot axes in the figure (uses
Axis.long_name
for eachAxis
.)axes_labels_buffer – fraction which to radially buffer axes labels (e.g. setting
axes_label_buffer
to 1.1 will be 10% further past the end of the axis moving from the origin of the plot).axes_labels_fontsize – font size for axes labels.
width – width of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).height – height of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).center_plot – whether to center the figure on
(0, 0)
, the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis ininstance
.axes_off – whether to turn off Cartesian x, y axes in resulting
plotly
figure (defaultTrue
hides the x and y axes).layout_kwargs – additional values for the
layout
parameter to be called in plotly.graph_objects.Figure() call. Note, ifwidth
andheight
are added here, then they will be prioritized over thewidth
andheight
parameters.label_kwargs – additional kwargs passed to the
textfont
parameter ofplotly.graph_objects.Scatter()
. For examples of parameter options, see the plotly docs.line_kwargs – additional params that will be applied to all hive plot axes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.
- Returns:
plotly
figure.
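The note above says that plotly lines accept an alpha channel in RGBA strings and that opacity works with hex colors. A minimal sketch of how a hex color and an opacity value could be combined into such a string follows. The helper name hex_to_rgba is hypothetical and is not part of hiveplotlib; the library's internal conversion may differ.

```python
def hex_to_rgba(hex_color: str, opacity: float = 1.0) -> str:
    """Combine a hex color and an opacity into a plotly-style RGBA string.

    Hypothetical helper for illustration only (not a hiveplotlib function).
    """
    if not 0 <= opacity <= 1:
        raise ValueError("opacity must be in [0, 1]")
    digits = hex_color.lstrip("#")
    if len(digits) == 3:
        # Expand shorthand such as "#0af" to "00aaff".
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"expected a 3- or 6-digit hex color, got {hex_color!r}")
    # Read the red, green, and blue channels two hex digits at a time.
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgba({red},{green},{blue},{opacity})"


line_color = hex_to_rgba("#0173b2", opacity=0.5)
```

A string in this form already carries its own alpha channel, which is why user-supplied RGBA / HSVA / HSLA strings override the opacity parameter.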
- hiveplotlib.viz.plotly.edge_viz(instance: HivePlot | P2CP, fig: Figure | None = None, tags: Hashable | List[Hashable] | None = None, line_width: float = 1.5, opacity: float = 0.5, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, layout_kwargs: dict | None = None, **edge_kwargs) Figure
Visualize constructed edges in a HivePlot or P2CP instance with plotly.
Note
The
line_width
parameter corresponds to the standardwidth
parameter for plotly lines. We are exposing this parameter with a different name becausewidth
is already the standard name for figure width throughouthiveplotlib.viz
.plotly
out of the box does not support standardopacity
for its line plots like it does for scatter plots, but it does support providing an alpha channel in RGBA / HSVA / HSLA strings. Theopacity
parameter in this function call will behave asopacity
behaves forplotly
scatter plots, as long as the user-provided colors are either standard named CSS colors (e.g. “blue”, “navy”, “green”) or hex colors.Users who prefer to provide colors as multi-channel RGBA / HSVA / HSLA strings will override the
opacity
parameter. For more on how to provide multi-channel color strings, see theplotly
docs for the color parameter for lines.- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw edges.fig – default
None
builds new figure. If a figure is specified, edges will be drawn on that figure.tags – which tag(s) of data to plot. Default
None
plots all tags of data. Can supply either a single tag or list of tags.line_width – width of edges.
opacity – opacity of edges. Must be in [0, 1].
width – width of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).height – height of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).center_plot – whether to center the figure on
(0, 0)
, the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis ininstance
.buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).axes_off – whether to turn off Cartesian x, y axes in resulting
plotly
figure (defaultTrue
hides the x and y axes).layout_kwargs – additional values for the
layout
parameter to be called in plotly.graph_objects.Figure() call. Note, ifwidth
andheight
are added here, then they will be prioritized over thewidth
andheight
parameters.edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in
hiveplotlib.HivePlot.connect_axes()
/hiveplotlib.P2CP.build_edges()
orhiveplotlib.HivePlot.add_edge_kwargs()
/hiveplotlib.P2CP.add_edge_kwargs()
will take priority). To overwrite previously set kwargs, seehiveplotlib.HivePlot.add_edge_kwargs()
/hiveplotlib.P2CP.add_edge_kwargs()
for more. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.
- Returns:
plotly
figure.
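The buffer parameter is described by the formula (-max_radius - buffer * max_radius, max_radius + buffer * max_radius). A small sketch of that arithmetic, assuming the maximum radius is taken from the end radii of the axes; buffered_bounds is a hypothetical name, not a hiveplotlib function.

```python
from typing import Iterable, Tuple


def buffered_bounds(axis_end_radii: Iterable[float], buffer: float = 0.3) -> Tuple[float, float]:
    """Return the (min, max) bound shared by the x and y dimensions.

    Hypothetical helper: applies the bounds formula from the parameter
    description to the largest radius among the supplied axes.
    """
    max_radius = max(axis_end_radii)
    pad = buffer * max_radius
    return (-max_radius - pad, max_radius + pad)


# Three axes ending at radius 5, 5, and 4, with the default 30% buffer.
bounds = buffered_bounds([5, 5, 4], buffer=0.3)
```

With these inputs the largest radius is 5, so both dimensions extend 1.5 units past the longest axis on each side.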
- hiveplotlib.viz.plotly.hive_plot_viz(hive_plot: HivePlot, fig: Figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, layout_kwargs: dict | None = None, **edge_kwargs) Figure
Create default plotly visualization of a HivePlot instance.
Note
The line width and opacity of axes can be changed by including the
line_width
andopacity
parameters, respectively, inaxes_kwargs
. See the documentation forhiveplotlib.viz.plotly.axes_viz()
for more information.
If the line width and opacity of edges were not set in the original hive plot, then these parameters can be set by including the
line_width
andopacity
parameters, respectively, as additional keyword arguments. See the documentation forhiveplotlib.viz.plotly.edge_viz()
for more information.- Parameters:
hive_plot –
HivePlot
instance we want to visualize.
fig – default
None
builds new figure. If a figure is specified, hive plot will be drawn on that figure.tags – which tag(s) of data to plot. Default
None
plots all tags of data. Can supply either a single tag or list of tags.width – width of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).height – height of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).center_plot – whether to center the figure on
(0, 0)
, the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis inhive_plot
.buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).show_axes_labels – whether to label the hive plot axes in the figure (uses
Axis.long_name
for eachAxis
.)axes_labels_buffer – fraction which to radially buffer axes labels (e.g. setting
axes_labels_buffer
to 1.1 will be 10% further past the end of the axis moving from the origin of the plot).axes_labels_fontsize – font size for hive plot axes labels.
axes_off – whether to turn off Cartesian x, y axes in resulting
plotly
figure (defaultTrue
hides the x and y axes).node_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Marker() call.
axes_kwargs – additional params that will be applied to all hive plot axes. This includes the
line_width
andopacity
parameters inhiveplotlib.viz.plotly.axes_viz()
. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.label_kwargs – additional kwargs passed to the
textfont
parameter ofplotly.graph_objects.Scatter()
. For examples of parameter options, see the plotly docs.layout_kwargs – additional values for the
layout
parameter to be called in plotly.graph_objects.Figure() call. Note, ifwidth
andheight
are added here, then they will be prioritized over thewidth
andheight
parameters.edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in
hiveplotlib.HivePlot.connect_axes()
orhiveplotlib.HivePlot.add_edge_kwargs()
will take priority). This includes theline_width
andopacity
parameters inhiveplotlib.viz.plotly.edge_viz()
. To overwrite previously set kwargs, seehiveplotlib.HivePlot.add_edge_kwargs()
for more. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.
- Returns:
plotly
figure.
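The edge_kwargs description states that kwargs specified beforehand take priority over kwargs passed at visualization time. A minimal sketch of that precedence rule using plain dictionaries; resolve_edge_kwargs is a hypothetical name, and the way hiveplotlib stores edge kwargs internally is not shown here.

```python
def resolve_edge_kwargs(stored_kwargs: dict, call_kwargs: dict) -> dict:
    """Merge kwargs so previously stored values win over call-time values.

    Hypothetical helper illustrating the documented priority only.
    """
    # Later entries in a dict display overwrite earlier ones, so the
    # stored kwargs are unpacked last.
    return {**call_kwargs, **stored_kwargs}


stored = {"color": "navy"}  # e.g. set earlier when the edges were constructed
call_time = {"color": "green", "line_width": 2.0}
resolved = resolve_edge_kwargs(stored, call_time)
```

Here the stored color survives and only line_width, which was not set beforehand, comes from the call-time kwargs. Changing the stored value is what the add_edge_kwargs() methods referenced above are for.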
- hiveplotlib.viz.plotly.label_axes(instance: HivePlot | P2CP, fig: Figure | None = None, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, buffer: float = 0.3, width: int = 600, height: int = 600, center_plot: bool = True, axes_off: bool = True, layout_kwargs: dict | None = None, **label_kwargs) Figure
Visualize axis labels in a HivePlot or P2CP instance with plotly.
For HivePlot instances, each axis’ long_name attribute will be used. For P2CP instances, column names in the data attribute will be used.
- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw axis labels.
fig – default
None
builds new figure. If a figure is specified, axis labels will be drawn on that figure.axes_labels_buffer – fraction which to radially buffer axes labels (e.g. setting
axes_labels_buffer
to 1.1 will be 10% further past the end of the axis moving from the origin of the plot).axes_labels_fontsize – font size for axes labels.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).width – width of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).height – height of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).center_plot – whether to center the figure on
(0, 0)
, the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis ininstance
.axes_off – whether to turn off Cartesian x, y axes in resulting
plotly
figure (defaultTrue
hides the x and y axes).layout_kwargs – additional values for the
layout
parameter to be called in plotly.graph_objects.Figure() call. Note, ifwidth
andheight
are added here, then they will be prioritized over thewidth
andheight
parameters.label_kwargs – additional kwargs passed to the
textfont
parameter ofplotly.graph_objects.Scatter()
. For examples of parameter options, see the plotly docs.
- Returns:
plotly
figure.
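The axes_labels_buffer parameter places each label radially past the end of its axis. A sketch of that placement, assuming the label radius is the axis end radius multiplied by axes_labels_buffer and that angles are measured counterclockwise in degrees from East; label_position is a hypothetical name, not a hiveplotlib function.

```python
import math
from typing import Tuple


def label_position(axis_end: float, angle_degrees: float,
                   axes_labels_buffer: float = 1.25) -> Tuple[float, float]:
    """Return Cartesian (x, y) for a label placed radially past an axis end.

    Hypothetical helper: scales the end radius by the buffer fraction,
    then converts from polar to Cartesian coordinates.
    """
    radius = axis_end * axes_labels_buffer
    theta = math.radians(angle_degrees)
    return radius * math.cos(theta), radius * math.sin(theta)


# An axis pointing North that ends at radius 5, labeled 10% past its end.
x, y = label_position(axis_end=5, angle_degrees=90, axes_labels_buffer=1.1)
```

For this axis the label sits directly above the origin at a radius of about 5.5.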
- hiveplotlib.viz.plotly.node_viz(instance: HivePlot | P2CP, fig: Figure | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, axes_off: bool = True, layout_kwargs: dict | None = None, **scatter_kwargs) Figure
Visualize nodes in a HivePlot or P2CP instance that have been placed on their axes in plotly.
- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw nodes.fig – default
None
builds new figure. If a figure is specified, nodes will be drawn on that figure.width – width of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).height – height of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).center_plot – whether to center the figure on
(0, 0)
, the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis ininstance
.buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).axes_off – whether to turn off Cartesian x, y axes in resulting
plotly
figure (defaultTrue
hides the x and y axes).layout_kwargs – additional values for the
layout
parameter to be called in plotly.graph_objects.Figure() call. Note, ifwidth
andheight
are added here, then they will be prioritized over thewidth
andheight
parameters.scatter_kwargs – additional params that will be applied to all hive plot nodes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Marker() call.
- Returns:
plotly
figure.
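Several functions in this module note that width and height supplied through layout_kwargs are prioritized over the width and height parameters. A minimal sketch of that rule with plain dictionaries; resolve_layout is a hypothetical name, and per the parameter notes the sizes only matter when a new figure is being instantiated.

```python
from typing import Optional


def resolve_layout(width: int = 600, height: int = 600,
                   layout_kwargs: Optional[dict] = None) -> dict:
    """Build layout values where layout_kwargs override width and height.

    Hypothetical helper illustrating the documented priority only.
    """
    layout = {"width": width, "height": height}
    # Anything supplied in layout_kwargs replaces the explicit parameters.
    layout.update(layout_kwargs or {})
    return layout


layout = resolve_layout(width=600, height=600,
                        layout_kwargs={"width": 800, "title": "Nodes"})
```

The resulting layout uses the width from layout_kwargs, keeps the height parameter, and passes the extra title entry through unchanged.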
- hiveplotlib.viz.plotly.p2cp_legend(p2cp: P2CP, fig: Figure, tags: Hashable | List[Hashable] | None = None, title: str = 'Tags', **legend_kwargs) Figure
Generate a legend for a P2CP instance, where entries in the legend will be tags of data added to the instance.
- Parameters:
p2cp –
P2CP
instance we want to visualize.fig –
plotly
figure on which we will draw the legend.tags – which tags of data to include in the legend. Default
None
uses all tags underp2cp.tags
. This can be ignored unless explicitly wanting to exclude certain tags from the legend.title – title of the legend. Default “Tags”.
legend_kwargs – additional values for the
legend
parameter in the plotly.graph_objects.update_layout() call.
- Returns:
plotly
figure.
- hiveplotlib.viz.plotly.p2cp_viz(p2cp: P2CP, fig: Figure | None = None, tags: Hashable | List[Hashable] | None = None, width: int = 600, height: int = 600, center_plot: bool = True, buffer: float = 0.3, show_axes_labels: bool = True, axes_labels_buffer: float = 1.25, axes_labels_fontsize: float = 16, axes_off: bool = True, node_kwargs: dict | None = None, axes_kwargs: dict | None = None, label_kwargs: dict | None = None, layout_kwargs: dict | None = None, **edge_kwargs) Figure
Create default plotly visualization of a P2CP instance.
Note
The line width and opacity of axes can be changed by including the
line_width
andopacity
parameters, respectively, inaxes_kwargs
. See the documentation forhiveplotlib.viz.plotly.axes_viz()
for more information.
If the line width and opacity of edges were not set in the original P2CP, then these parameters can be set by including the
line_width
andopacity
parameters, respectively, as additional keyword arguments. See the documentation forhiveplotlib.viz.plotly.edge_viz()
for more information.- Parameters:
p2cp –
P2CP
instance we want to visualize.fig – default
None
builds new figure. If a figure is specified, P2CP will be drawn on that figure.tags – which tag(s) of data to plot. Default
None
plots all tags of data. Can supply either a single tag or list of tags.width – width of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).height – height of figure in pixels. Note: only works if instantiating new figure (e.g.
fig
isNone
).center_plot – whether to center the figure on
(0, 0)
, the currently fixed center that the axes are drawn around by default. Will only run if there is at least one axis inp2cp
.buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).show_axes_labels – whether to label the P2CP axes in the figure (uses
Axis.long_name
for eachAxis
.)axes_labels_buffer – fraction which to radially buffer axes labels (e.g. setting
axes_labels_buffer
to 1.1 will be 10% further past the end of the axis moving from the origin of the plot).axes_labels_fontsize – font size for P2CP axes labels.
axes_off – whether to turn off Cartesian x, y axes in resulting
plotly
figure (defaultTrue
hides the x and y axes).node_kwargs – additional params that will be applied to all P2CP nodes. Note, these are kwargs that affect a plotly.graph_objects.scatter.Marker() call.
axes_kwargs – additional params that will be applied to all P2CP axes. This includes the
line_width
andopacity
parameters inhiveplotlib.viz.plotly.axes_viz()
. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.label_kwargs – additional kwargs passed to the
textfont
parameter ofplotly.graph_objects.Scatter()
. For examples of parameter options, see the plotly docs.layout_kwargs – additional values for the
layout
parameter to be called in plotly.graph_objects.Figure() call. Note, ifwidth
andheight
are added here, then they will be prioritized over thewidth
andheight
parameters.edge_kwargs – additional params that will be applied to all edges on all axes (but kwargs specified beforehand in
hiveplotlib.P2CP.build_edges()
orhiveplotlib.P2CP.add_edge_kwargs()
will take priority). This includes theline_width
andopacity
parameters inhiveplotlib.viz.plotly.edge_viz()
. To overwrite previously set kwargs, seehiveplotlib.P2CP.add_edge_kwargs()
for more. Note, these are kwargs that affect a plotly.graph_objects.scatter.Line() call.
- Returns:
plotly
figure.
Datashader in Matplotlib
Datashading capabilities for hiveplotlib.
- hiveplotlib.viz.datashader.datashade_edges_mpl(instance: ~hiveplotlib.hiveplot.HivePlot | ~hiveplotlib.p2cp.P2CP, tag: ~typing.Hashable | None = None, cmap: str | ~matplotlib.colors.ListedColormap = <matplotlib.colors.ListedColormap object>, vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 2, reduction: callable = <datashader.reductions.count object>, buffer: float = 0.1, fig: ~matplotlib.figure.Figure | None = None, ax: ~matplotlib.axes._axes.Axes | None = None, figsize: ~typing.Tuple[float, float] = (10, 10), dpi: int = 300, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) Tuple[Figure, Axes, AxesImage]
matplotlib visualization of constructed edges in a HivePlot or P2CP instance using datashader.
The main idea of
datashader
is rather than plot all the lines on top of each other in a figure, one can instead essentially build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction functionreduction=ds.count
(counting values in bins), we are essentially building a 2d histogram. For more on reductions indatashader
, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.Note
A high
dpi
value is recommended when datashading to allow for more nuance in the rasterization. This is why this visualization function defaults to adpi
value of 300 whenfig=None
andax=None
.Experimentation with different (low) values for
pixel_spread
is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.
- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw edges.tag – which tag of data to plot. If
None
is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.cmap – which colormap to use for the datashaded edges. Default is a
seaborn
colormap similar to thematplotlib
"Blues"
colormap.vmin – minimum value used in the colormap for plotting the rasterization of curves. Default 1.
vmax – maximum value used in the colormap for plotting the rasterization of curves. Default
None
finds and uses the maximum bin value of the calculated rasterization.log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default
True
.reduction – the means of projecting from data space to pixel space for the rasterization. Default
ds.count()
essentially builds a 2d histogram. For more on reductions indatashader
, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 2 pixels. For more on spreading, see the datashader documentation.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
to 0.1 will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).fig – default
None
builds new figure. If a figure is specified,Axis
instances will be drawn on that figure. Note:fig
andax
must BOTH beNone
to instantiate new figure and axes.ax – default
None
builds new axis. If an axis is specified,Axis
instances will be drawn on that axis. Note:fig
andax
must BOTH beNone
to instantiate new figure and axes.figsize – size of figure. Note: only works if instantiating new figure and axes (e.g.
fig
andax
areNone
).dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.
axes_off – whether to turn off Cartesian x, y axes in resulting
matplotlib
figure (defaultTrue
hides the x and y axes).fig_kwargs – additional values to be called in
plt.subplots()
call. Note iffigsize
is added here, then it will be prioritized over thefigsize
parameter.im_kwargs – additional params that will be applied to the final
plt.imshow()
call on the rasterization.
- Returns:
matplotlib
figure, axis, image.
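The vmin, vmax, and log_cmap parameters together determine how bin counts are mapped onto the colormap. A sketch of a base-10 logarithmic normalization under those parameters follows. The name log_normalize is hypothetical, and the choice to leave bins below vmin uncolored (None) is an assumption made for this illustration rather than confirmed library behavior.

```python
import math
from typing import List, Optional


def log_normalize(counts: List[float], vmin: float = 1,
                  vmax: Optional[float] = None) -> List[Optional[float]]:
    """Map bin counts to [0, 1] on a base-10 log scale.

    Hypothetical helper. Bins below vmin map to None (assumed to be left
    uncolored); bins above vmax are clipped to vmax.
    """
    if vmax is None:
        # Default: use the maximum bin value of the rasterization.
        vmax = max(counts)
    if vmin <= 0 or vmax < vmin:
        raise ValueError("need 0 < vmin <= vmax for a log scale")
    span = math.log10(vmax) - math.log10(vmin)
    normalized: List[Optional[float]] = []
    for count in counts:
        if count < vmin:
            normalized.append(None)
        elif span == 0:
            normalized.append(1.0)
        else:
            clipped = min(count, vmax)
            normalized.append((math.log10(clipped) - math.log10(vmin)) / span)
    return normalized


scaled = log_normalize([0, 1, 10, 100])
```

On a log scale, a bin with 10 overlapping edges lands halfway between a bin with 1 edge and a bin with 100, which keeps sparse curves visible next to dense bundles.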
- hiveplotlib.viz.datashader.datashade_hive_plot_mpl(instance: ~hiveplotlib.hiveplot.HivePlot | ~hiveplotlib.p2cp.P2CP, tag: ~typing.Hashable | None = None, cmap_edges: str | ~matplotlib.colors.ListedColormap = <matplotlib.colors.ListedColormap object>, cmap_nodes: str | ~matplotlib.colors.ListedColormap = 'copper', vmin_nodes: float = 1, vmax_nodes: float | None = None, vmin_edges: float = 1, vmax_edges: float | None = None, log_cmap: bool = True, pixel_spread_nodes: int = 15, pixel_spread_edges: int = 2, reduction: callable = <datashader.reductions.count object>, fig: ~matplotlib.figure.Figure | None = None, ax: ~matplotlib.axes._axes.Axes | None = None, figsize: ~typing.Tuple[float, float] = (10, 10), dpi: int = 300, axes_off: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, fig_kwargs: dict | None = None, **im_kwargs) Tuple[Figure, Axes, AxesImage, AxesImage]
matplotlib visualization of a HivePlot or P2CP instance using datashader.
Plots both nodes and edges with datashader along with standard hive plot / P2CP axes.
The main idea of
datashader
is rather than plot all the lines on top of each other in a figure, one can instead essentially build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction functionreduction=ds.count
(counting values in bins), we are essentially building a 2d histogram. For more on reductions indatashader
, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.Note
A high
dpi
value is recommended when datashading to allow for more nuance in the rasterization. This is why this visualization function defaults to adpi
value of 300 whenfig=None
andax=None
.Experimentation with different (low) values for
pixel_spread_nodes
andpixel_spread_edges
is encouraged. As the names suggest, these parameters spread out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.
- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw edges.tag –
which tag of data to plot. If
None
is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.cmap_edges – which colormap to use for the datashaded edges. Default is a
seaborn
colormap similar to thematplotlib
"Blues"
colormap.cmap_nodes – which colormap to use for the datashaded nodes. Default “copper”.
vmin_nodes – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.
vmax_nodes – maximum value used in the colormap for plotting the rasterization of nodes. Default
None
finds and uses the maximum bin value of the calculated rasterization.vmin_edges – minimum value used in the colormap for plotting the rasterization of edges. Default 1.
vmax_edges – maximum value used in the colormap for plotting the rasterization of edges. Default
None
finds and uses the maximum bin value of the calculated rasterization.log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default
True
.reduction – the means of projecting from data space to pixel space for the rasterization. Default
ds.count()
essentially builds a 2d histogram. For more on reductions indatashader
, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.pixel_spread_nodes – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 15 pixels. For more on spreading, see the datashader documentation.
pixel_spread_edges – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 2 pixels. For more on spreading, see the datashader documentation.
fig – default
None
builds new figure. If a figure is specified,Axis
instances will be drawn on that figure. Note:fig
andax
must BOTH beNone
to instantiate new figure and axes.ax – default
None
builds new axis. If an axis is specified,Axis
instances will be drawn on that axis. Note:fig
andax
must BOTH beNone
to instantiate new figure and axes.figsize – size of figure. Note: only works if instantiating new figure and axes (e.g.
fig
andax
areNone
).dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.
axes_off – whether to turn off Cartesian x, y axes in resulting
matplotlib
figure (defaultTrue
hides the x and y axes).buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
to 0.1 will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).show_axes_labels – whether to label the hive plot axes in the figure (uses
Axis.long_name
for eachAxis
.)axes_labels_buffer – fraction which to radially buffer axes labels (e.g. setting
axes_labels_buffer
to 1.1 will be 10% further past the end of the axis moving from the origin of the plot).axes_labels_fontsize – font size for hive plot axes labels.
axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a
plt.plot()
call.text_kwargs – additional kwargs passed to
plt.text()
call.fig_kwargs – additional values to be called in
plt.subplots()
call. Note iffigsize
is added here, then it will be prioritized over thefigsize
parameter.im_kwargs – additional params that will be applied to the final
plt.imshow()
call on the rasterization.
- Returns:
matplotlib
figure, axis, the image corresponding to node data, and the image corresponding to edge data.
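The pixel_spread_nodes and pixel_spread_edges parameters spread rasterized values radially so that thin features stay visible. The following is a simplified stand-in for that idea, not datashader's actual spreading implementation: each nonzero pixel is copied to its neighbors within a circular footprint, keeping the maximum where footprints overlap. The function name spread and the max-based combination rule are assumptions for illustration.

```python
from typing import List


def spread(grid: List[List[int]], px: int = 1) -> List[List[int]]:
    """Spread each nonzero pixel to neighbors within a circle of radius px.

    Simplified illustration: overlapping footprints keep the maximum value.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            value = grid[r][c]
            if value == 0:
                continue
            for dr in range(-px, px + 1):
                for dc in range(-px, px + 1):
                    if dr * dr + dc * dc > px * px:
                        continue  # outside the circular footprint
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = max(out[rr][cc], value)
    return out


# A single isolated pixel with a count of 3 in a 5x5 rasterization.
thin = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
thick = spread(thin, px=1)
```

With px=1 the single pixel grows into a plus-shaped patch of five pixels. Larger values, such as the default of 15 for nodes, produce correspondingly larger blobs.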
- hiveplotlib.viz.datashader.datashade_nodes_mpl(instance: ~hiveplotlib.hiveplot.HivePlot | ~hiveplotlib.p2cp.P2CP, cmap: str | ~matplotlib.colors.ListedColormap = 'copper', vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 15, reduction: callable = <datashader.reductions.count object>, buffer: float = 0.1, fig: ~matplotlib.figure.Figure | None = None, ax: ~matplotlib.axes._axes.Axes | None = None, figsize: ~typing.Tuple[float, float] = (10, 10), dpi: int = 300, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) Tuple[Figure, Axes, AxesImage]
matplotlib visualization of nodes / points in a HivePlot / P2CP instance using datashader.
The main idea of
datashader
is rather than plot all the points on top of each other in a figure, one can instead essentially build up a single 2d image of the points in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction functionreduction=ds.count
(counting values in bins), we are essentially building a 2d histogram. For more on reductions indatashader
, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.Note
A high
dpi
value is recommended when datashading to allow for more nuance in the rasterization. This is why this visualization function defaults to adpi
value of 300 whenfig=None
andax=None
. Since we are interested in positions rather than the lines fromhiveplotlib.viz.datashader.datashade_edges_mpl()
, though, one will likely need a much largerpixel_spread
value here, on the order of 10 times larger, to see the node density well in the final visualization.Experimentation with different values for
pixel_spread
is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in smaller, harder-to-see points in the final visualization. For more on spreading, see the datashader documentation.
- Parameters:
instance –
HivePlot
orP2CP
instance for which we want to draw nodes.
cmap – which colormap to use for the datashaded nodes. Default “copper”.
vmin – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.
vmax – maximum value used in the colormap for plotting the rasterization of nodes. Default
None
finds and uses the maximum bin value of the calculated rasterization.log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default
True
.reduction – the means of projecting from data space to pixel space for the rasterization. Default
ds.count()
essentially builds a 2d histogram. For more on reductions indatashader
, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 15 pixels. For more on spreading, see the datashader documentation.
buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting
buffer
to 0.1 will find the maximum radius spanned by anyAxis
instance and set the x and y bounds as(-max_radius - buffer * max_radius, max_radius + buffer * max_radius)
).fig – default
None
builds new figure. If a figure is specified,Axis
instances will be drawn on that figure. Note:fig
andax
must BOTH beNone
to instantiate new figure and axes.ax – default
None
builds new axis. If an axis is specified,Axis
instances will be drawn on that axis. Note:fig
andax
must BOTH beNone
to instantiate new figure and axes.figsize – size of figure. Note: only works if instantiating new figure and axes (e.g.
fig
andax
areNone
).dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.
axes_off – whether to turn off Cartesian x, y axes in resulting
matplotlib
figure (defaultTrue
hides the x and y axes).fig_kwargs – additional values to be called in
plt.subplots()
call. Note iffigsize
is added here, then it will be prioritized over thefigsize
parameter.im_kwargs – additional params that will be applied to the final
plt.imshow()
call on the rasterization.
- Returns:
matplotlib
figure, axis, image.
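The description above characterizes the default count reduction as building a 2d histogram of the points. A pure-Python sketch of that idea follows: points are binned into a square pixel grid and each bin stores how many points fell into it. The name count_raster is hypothetical, and this does not call datashader; it only illustrates what counting values in bins means.

```python
from typing import List, Sequence, Tuple


def count_raster(points: Sequence[Tuple[float, float]],
                 bounds: Tuple[float, float], pixels: int) -> List[List[int]]:
    """Count how many points fall in each cell of a pixels x pixels grid.

    Hypothetical helper: bounds is the shared (min, max) extent of the
    x and y dimensions; points outside that extent are skipped.
    """
    low, high = bounds
    scale = pixels / (high - low)
    grid = [[0] * pixels for _ in range(pixels)]
    for x, y in points:
        if not (low <= x <= high and low <= y <= high):
            continue
        # Clamp so points exactly on the upper bound land in the last bin.
        col = min(int((x - low) * scale), pixels - 1)
        row = min(int((y - low) * scale), pixels - 1)
        grid[row][col] += 1
    return grid


points = [(0.1, 0.1), (0.2, 0.15), (3.0, -2.0), (9.0, 9.0)]
grid = count_raster(points, bounds=(-5.5, 5.5), pixels=11)
```

The two nearby points share a bin, so that bin holds a count of 2. This is the density information that the colormap then displays, regardless of how many points overlap.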
Example Datasets
Quick example datasets for use in hiveplotlib.
For Hive Plots, many excellent network datasets are available online, including many graphs that can be generated using networkx and pytorch-geometric. The Stanford Large Network Dataset Collection is also a great general source of network datasets. If working with networkx graphs, users can also take advantage of the hiveplotlib.converters.networkx_to_nodes_edges() method to quickly get those graphs into a hiveplotlib-ready format.
For Polar Parallel Coordinates Plots (P2CPs), many datasets are available through packages including statsmodels and scikit-learn.
- hiveplotlib.datasets.example_hive_plot(num_nodes: int = 15, num_edges: int = 30, seed: int = 0, **hive_plot_n_axes_kwargs) HivePlot
Generate example hive plot with "Low", "Medium", and "High" axes (plus repeat axes).
Nodes and edges will be generated and placed randomly.
- Parameters:
num_nodes – number of nodes to generate.
num_edges – number of edges to generate.
seed – random seed to use when generating nodes and edges.
hive_plot_n_axes_kwargs – additional keyword arguments for the underlying
hiveplotlib.hive_plot_n_axes()
call.
- Returns:
resulting
HivePlot
instance.
- hiveplotlib.datasets.example_nodes_and_edges(num_nodes: int = 100, num_edges: int = 200, num_axes: int = 3, seed: int = 0) Tuple[List[Node], List[List[Hashable]], ndarray]
Generate example nodes, node splits (one list of nodes per intended axis), and edges.
Each node will have a "low", "med", and "high" value. These values are randomly generated and, as the names suggest, satisfy "low" < "med" < "high" for every node.
- Parameters:
num_nodes – how many nodes to randomly generate. Node unique IDs will be the integers 0, 1, … , num_nodes - 1.
num_edges – how many edges to randomly generate.
num_axes – how many axes into which to partition the randomly generated nodes.
seed – random seed to use when randomly generating node and edge data.
- Returns:
list of generated Node instances, a list of num_axes lists that evenly split the node IDs to be allocated to their own axes, and a (num_edges, 2) shaped array of random edges between nodes.
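The shape of the three return values can be sketched with the standard library alone. This is an illustrative stand-in, not hiveplotlib's implementation: the function name sketch_nodes_and_edges is hypothetical, plain dictionaries stand in for Node instances, a list of tuples stands in for the array, and the contiguous splitting of IDs is an assumption about what "evenly split" means.

```python
import random
from typing import Dict, List, Tuple


def sketch_nodes_and_edges(
    num_nodes: int = 100, num_edges: int = 200, num_axes: int = 3, seed: int = 0
) -> Tuple[List[Dict], List[List[int]], List[Tuple[int, int]]]:
    """Hypothetical stand-in mirroring the documented return structure."""
    rng = random.Random(seed)

    # Each node gets three random values, sorted so that low < med < high.
    nodes = []
    for unique_id in range(num_nodes):
        low, med, high = sorted(rng.random() for _ in range(3))
        nodes.append({"unique_id": unique_id, "low": low, "med": med, "high": high})

    # Assumption: split the IDs into contiguous, near-equal chunks, one per axis.
    base, extra = divmod(num_nodes, num_axes)
    splits: List[List[int]] = []
    start = 0
    for i in range(num_axes):
        size = base + (1 if i < extra else 0)
        splits.append(list(range(start, start + size)))
        start += size

    # Random (source, target) ID pairs, analogous to a (num_edges, 2) array.
    # This sketch does not exclude self-loops or duplicate edges.
    edges = [
        (rng.randrange(num_nodes), rng.randrange(num_nodes)) for _ in range(num_edges)
    ]
    return nodes, splits, edges


nodes, splits, edges = sketch_nodes_and_edges(num_nodes=10, num_edges=5, num_axes=3)
print([len(s) for s in splits])  # number of node IDs assigned to each axis
print(len(edges), len(edges[0]))  # rows and columns of the edge list
```

With the real function, the node list and the ID splits are the natural inputs when assigning nodes to axes, and the edge array supplies the connections to draw between them.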
- hiveplotlib.datasets.example_p2cp(num_points: int = 50, noise: float = 0.5, random_seed: int = 0, four_colors: Tuple[str, str, str, str] = ('#de8f05', '#029e73', '#cc78bc', '#0173b2'), **p2cp_n_axes_kwargs) P2CP
Generate example P2CP of four Gaussian blobs.
Points will be generated by calling hiveplotlib.datasets.four_gaussian_blobs_3d() and turned into a P2CP via hiveplotlib.p2cp_n_axes().
- Parameters:
num_points – number of points in each Gaussian blob.
noise – noisiness of Gaussian blobs.
random_seed – random seed to generate consistent data between calls.
four_colors – four colors to use for four Gaussian blobs.
p2cp_n_axes_kwargs – additional keyword arguments for the underlying hiveplotlib.p2cp_n_axes() call.
- Returns:
resulting P2CP instance.
- hiveplotlib.datasets.four_gaussian_blobs_3d(num_points: int = 50, noise: float = 0.5, random_seed: int = 0) DataFrame
Generate a pandas dataframe of four Gaussian blobs in 3d.
This dataset serves as a simple example for showing 3d viz using Polar Parallel Coordinates Plots (P2CPs) instead of 3d plotting.
- Parameters:
num_points – number of points in each blob.
noise – noisiness of Gaussian blobs.
random_seed – random seed to generate consistent data between calls.
- Returns:
(num_points * 4, 4) pd.DataFrame of X, Y, Z, and blob labels.
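The (num_points * 4, 4) layout can be sketched without pandas. This is a rough illustration under stated assumptions, not the library's generator: the function name sketch_four_blobs and the blob centers are hypothetical, noise is treated as the standard deviation of each coordinate, and rows are plain tuples rather than a DataFrame.

```python
import random
from typing import List, Tuple

# Hypothetical blob centers, chosen only for illustration.
CENTERS = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 5.0)]


def sketch_four_blobs(
    num_points: int = 50, noise: float = 0.5, random_seed: int = 0
) -> List[Tuple[float, float, float, int]]:
    """Return num_points * 4 rows of (x, y, z, label)."""
    rng = random.Random(random_seed)
    rows = []
    for label, (cx, cy, cz) in enumerate(CENTERS):
        for _ in range(num_points):
            # Assumption: noise is the standard deviation around each center.
            rows.append(
                (rng.gauss(cx, noise), rng.gauss(cy, noise), rng.gauss(cz, noise), label)
            )
    return rows


rows = sketch_four_blobs(num_points=50)
print(len(rows), len(rows[0]))  # row count and column count
```

Fixing random_seed makes repeated calls return identical rows, which is the behavior the random_seed parameter is documented to provide.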
- hiveplotlib.datasets.international_trade_data(year: int = 2019, hs92_code: int = 8112, path: str | Path | None = None) Tuple[DataFrame, Dict]
Read in international trade data network from the Harvard Growth Lab.
Note
Only a limited number of subsets of the data are shipped with hiveplotlib, as each year of trade data is roughly 300 MB. However, the raw data are available at the Harvard Growth Lab’s website, and the runner to produce the necessary files to use this reader function is available in the repository (make_trade_network_dataset.py).
If you are using the runner to make your own trade datasets that you will read in locally with this function, then you will need to specify the local path accordingly.
- Parameters:
year – which year of data to pull. If the year of data is not available, an error will be raised.
hs92_code – which HS 92 code of export data to pull. If the code requested is not available, an error will be raised. Codes come with different numbers of digits (e.g. 2, 4), where more digits correspond to a more specific trade group. For a reference to what trade groups these codes correspond to, see this resource.
path – directory containing both the data and metadata for loading. Default None assumes you are using one of the datasets shipped with hiveplotlib. If you are using the make_trade_network_dataset.py runner discussed above to make your own datasets, then you will need to specify the path to the directory where you saved both the data and metadata files (which must be in the same directory).
- Returns:
pandas.DataFrame of trade data, dictionary of metadata explaining meaning of data’s columns, data provenance, citations, etc.
- Raises:
AssertionError if the requested files cannot be found.
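When pointing path at a custom directory, it can help to confirm that both files are present before calling the reader. The following is a minimal sketch of such a pre-check. The helper name check_same_directory and the file names used are hypothetical placeholders, not names the library uses; substitute whatever files the runner produced.

```python
import tempfile
from pathlib import Path
from typing import Tuple, Union


def check_same_directory(
    path: Union[str, Path], data_filename: str, metadata_filename: str
) -> Tuple[Path, Path]:
    """Hypothetical pre-check: both files must exist in the same directory."""
    directory = Path(path)
    missing = [
        name
        for name in (data_filename, metadata_filename)
        if not (directory / name).is_file()
    ]
    # Mirrors the documented failure mode: an AssertionError on missing files.
    assert not missing, f"missing files in {directory}: {missing}"
    return directory / data_filename, directory / metadata_filename


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder file names, for illustration only.
    (Path(tmp) / "trade_data.csv").write_text("placeholder")
    try:
        check_same_directory(tmp, "trade_data.csv", "trade_metadata.json")
    except AssertionError as err:
        print("check failed:", err)

    (Path(tmp) / "trade_metadata.json").write_text("{}")
    data_file, metadata_file = check_same_directory(
        tmp, "trade_data.csv", "trade_metadata.json"
    )
    print(data_file.name, metadata_file.name)
```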