Toy Hive Plots
Tools for generating toy hive plots.
- hiveplotlib.datasets.toy_hive_plots.example_base_hive_plot(num_nodes: int = 15, num_edges: int = 30, seed: int = 0, **hive_plot_n_axes_kwargs) → BaseHivePlot
Generate an example hive plot with "Low", "Medium", and "High" axes (plus repeat axes).
Nodes and edges will be generated and placed randomly.
- Parameters:
num_nodes – number of nodes to generate.
num_edges – number of edges to generate.
seed – random seed to use when generating nodes and edges.
hive_plot_n_axes_kwargs – additional keyword arguments for the underlying hiveplotlib.hive_plot_n_axes() call.
- Returns:
resulting BaseHivePlot instance.
- hiveplotlib.datasets.toy_hive_plots.example_edge_data(nodes: NodeCollection, num_edges: int = 100, from_column_name: Hashable = 'from', to_column_name: Hashable = 'to', seed: int = 0) → DataFrame
Generate example edge data from a provided NodeCollection.
- Parameters:
nodes – nodes from which to generate example edges.
num_edges – how many example edges to randomly generate.
from_column_name – name to assign to the edge origin column, whose values correspond to node IDs where a given edge starts.
to_column_name – name to assign to the edge destination column, whose values correspond to node IDs where a given edge ends.
seed – random seed to use when randomly generating edge data.
- Returns:
random edge data as (n, 2) DataFrame of [from, to] edges.
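To make the shape of the returned edge data concrete, here is a minimal standard-library sketch of seeded random edge generation. It is not hiveplotlib's implementation: the helper name `sketch_edge_rows` is hypothetical, rows are plain dictionaries rather than a DataFrame, and because this page does not say whether self-loops or duplicate edges can occur, the sketch simply samples each endpoint independently.

```python
import random
from typing import Dict, Hashable, List, Sequence


def sketch_edge_rows(
    node_ids: Sequence[Hashable],
    num_edges: int = 100,
    from_column_name: Hashable = "from",
    to_column_name: Hashable = "to",
    seed: int = 0,
) -> List[Dict[Hashable, Hashable]]:
    """Hypothetical stand-in: one {from: id, to: id} row per random edge."""
    # A local generator means the seed alone determines the output.
    rng = random.Random(seed)
    return [
        {
            from_column_name: rng.choice(node_ids),
            to_column_name: rng.choice(node_ids),
        }
        for _ in range(num_edges)
    ]


# Five random edges among node IDs 0..9 (illustrative sizes only).
rows = sketch_edge_rows(node_ids=list(range(10)), num_edges=5, seed=0)
```

Calling the helper again with the same seed returns the same rows, which mirrors the role of the `seed` parameter described above.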
- hiveplotlib.datasets.toy_hive_plots.example_edges(nodes: NodeCollection, num_edges: int = 100, from_column_name: Hashable = 'from', to_column_name: Hashable = 'to', seed: int = 0) → Edges
Generate example edges from a provided NodeCollection.
- Parameters:
nodes – nodes from which to generate example edges.
num_edges – how many example edges to randomly generate.
from_column_name – name to assign to the edge origin column, whose values correspond to node IDs where a given edge starts.
to_column_name – name to assign to the edge destination column, whose values correspond to node IDs where a given edge ends.
seed – random seed to use when randomly generating edge data.
- Returns:
random edge data.
- hiveplotlib.datasets.toy_hive_plots.example_hive_plot(num_nodes: int = 100, num_edges: int = 100, partition_data_column: Literal['low', 'med', 'high'] = 'low', labels: List[Hashable] | None = ('A', 'B', 'C'), cutoffs: List[float] | int | None = 3, partition_variable_name: Hashable | None = None, sorting_variables: Hashable | Dict[Hashable, Hashable] = 'low', seed: int = 0, node_unique_id_column: str = 'unique_id', **hive_plot_kwargs) → HivePlot
Generate an example HivePlot instance.
Each node will have a "low", "med", and "high" value, where these values are randomly generated, and as the names suggest, for the resulting values of each node, "low" < "med" < "high".
Each edge will also have a "low", "med", and "high" value, with each value being the average "low" / "med" / "high" level of the two nodes composing the edge.
Note
The num_edges edges will be randomly generated between all possible axes, including repeat axes. Thus, calling this function without requesting all repeat axes (i.e. repeat_axes=True) will result in fewer than num_edges edges visualized in the final hive plot. (All generated edges will be stored in the resulting hive_plot.edges, even though some will not be plotted if repeat axes are excluded from the plot.)
- Parameters:
num_nodes – how many nodes to randomly generate. Node unique IDs will be the integers 0, 1, … , num_nodes - 1.
num_edges – how many example edges to randomly generate.
partition_data_column – which column of data in the underlying data attribute to use to partition the node data. Node data is generated via hiveplotlib.datasets.toy_hive_plots.example_node_data().
labels – labels assigned to each bin. Only referenced when cutoffs is not None. None labels each bin as a string based on its range of values. Note, when cutoffs is a list, len(labels) must be 1 greater than len(cutoffs). When cutoffs is an int, len(labels) must be equal to cutoffs.
cutoffs – cutoffs to use in binning nodes according to data under partition_data_column. None will bin nodes by unique values of partition_data_column. When provided as a list, the specified cutoffs will bin according to (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf). When provided as an int, the exact numerical break points will be determined to create cutoffs equally-sized quantiles.
partition_variable_name – name of the resulting partition variable to add to the nodes.data attribute of the resulting HivePlot instance. Default None will name the partition column as "partition_0".
sorting_variables – which node variable to use to sort / place the nodes on each axis. Providing a single value uses the same variable for each axis. Alternatively, provide a dictionary whose keys are the unique values from the partition_variable_name column in the nodes.data attribute and whose values are the corresponding sorting variable to use for each axis.
seed – random seed to use when randomly generating node and edge data.
node_unique_id_column – name to assign to the column in the nodes.data attribute that corresponds to the unique IDs.
hive_plot_kwargs – additional keyword arguments when creating the returned hiveplotlib.HivePlot() instance.
- Returns:
randomly-generated HivePlot instance.
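The list form of cutoffs bins values into right-closed intervals, which is why len(labels) must be 1 greater than len(cutoffs). The sketch below illustrates that rule using only the standard library. `bin_label` is a hypothetical helper, not part of hiveplotlib, and the cutoff values are made up for illustration.

```python
from bisect import bisect_left
from typing import Hashable, Sequence


def bin_label(
    value: float,
    cutoffs: Sequence[float],
    labels: Sequence[Hashable],
) -> Hashable:
    """Hypothetical helper: label of the right-closed bin containing `value`.

    Bins are (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], ..., (cutoffs[-1], inf).
    Assumes `cutoffs` is sorted in increasing order.
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("len(labels) must be 1 greater than len(cutoffs)")
    # bisect_left counts the cutoffs strictly below `value`, so a value equal
    # to a cutoff lands in the bin that the cutoff closes on the right.
    return labels[bisect_left(cutoffs, value)]


cutoffs = [0.25, 0.75]    # illustrative values only
labels = ("A", "B", "C")  # three labels for two cutoffs

assigned = [bin_label(v, cutoffs, labels) for v in (0.1, 0.25, 0.5, 0.75, 0.9)]
```

Here the values 0.25 and 0.75 fall into the bins they close, so the five inputs map to "A", "A", "B", "B", and "C" respectively.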
- hiveplotlib.datasets.toy_hive_plots.example_node_collection(num_nodes: int = 100, seed: int = 0, unique_id_column: str = 'unique_id') → NodeCollection
Generate an example NodeCollection.
Each node will have a "low", "med", and "high" value, where these values are randomly generated, and as the names suggest, for the resulting values of each node, "low" < "med" < "high".
By default, the unique ID column will be given the name "unique_id".
- Parameters:
num_nodes – how many nodes to randomly generate. Node unique IDs will be the integers 0, 1, … , num_nodes - 1.
seed – random seed to use when randomly generating node data.
unique_id_column – name to assign to the column in the resulting NodeCollection.data attribute that corresponds to the unique IDs.
- Returns:
NodeCollection of node data.
- hiveplotlib.datasets.toy_hive_plots.example_node_data(num_nodes: int = 100, seed: int = 0) → DataFrame
Generate example node dataframe.
Each node will have a "low", "med", and "high" value, where these values are randomly generated, and as the names suggest, for the resulting values of each node, "low" < "med" < "high".
Unique ID column will be given the name "unique_id".
- Parameters:
num_nodes – how many nodes to randomly generate. Node unique IDs will be the integers 0, 1, … , num_nodes - 1.
seed – random seed to use when randomly generating node data.
- Returns:
dataframe of node data.
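One simple way to obtain values that satisfy "low" < "med" < "high" is to draw three random numbers per node and sort them. The sketch below does this with the standard library. It is an assumption made for illustration, not necessarily how example_node_data() generates its values, and `sketch_node_rows` is a hypothetical name.

```python
import random
from typing import Dict, List, Union


def sketch_node_rows(
    num_nodes: int = 100,
    seed: int = 0,
) -> List[Dict[str, Union[int, float]]]:
    """Hypothetical stand-in: one row per node with ordered random values."""
    rng = random.Random(seed)
    rows = []
    for unique_id in range(num_nodes):  # IDs are 0, 1, ..., num_nodes - 1
        # Sorting three independent draws gives low <= med <= high; exact ties
        # between random floats are possible in principle but vanishingly rare.
        low, med, high = sorted(rng.random() for _ in range(3))
        rows.append({"unique_id": unique_id, "low": low, "med": med, "high": high})
    return rows


node_rows = sketch_node_rows(num_nodes=5, seed=0)
```

Each row carries the same four keys that the description above names: "unique_id", "low", "med", and "high".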
- hiveplotlib.datasets.toy_hive_plots.example_nodes_and_edges(num_nodes: int = 100, num_edges: int = 200, num_axes: int = 3, seed: int = 0) → Tuple[List[Node], List[List[Hashable]], ndarray]
Generate example nodes, node splits (one list of nodes per intended axis), and edges.
Each node will have a "low", "med", and "high" value, where these values are randomly generated, and as the names suggest, for the resulting values of each node, "low" < "med" < "high".
- Parameters:
num_nodes – how many nodes to randomly generate. Node unique IDs will be the integers 0, 1, … , num_nodes - 1.
num_edges – how many edges to randomly generate.
num_axes – how many axes into which to partition the randomly generated nodes.
seed – random seed to use when randomly generating node and edge data.
- Returns:
list of generated Node instances, a list of num_axes lists that evenly split the node IDs to be allocated to their own axes, and a (num_edges, 2) shaped array of random edges between nodes.
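The "evenly split" node lists can be pictured as contiguous chunks of the node IDs whose sizes differ by at most one. The following standard-library sketch shows that idea. The chunking strategy and the name `split_evenly` are assumptions for illustration, and the library may order or assign IDs differently.

```python
from typing import Hashable, List, Sequence


def split_evenly(ids: Sequence[Hashable], num_axes: int = 3) -> List[List[Hashable]]:
    """Hypothetical helper: split `ids` into `num_axes` near-equal contiguous lists."""
    base, extra = divmod(len(ids), num_axes)
    splits: List[List[Hashable]] = []
    start = 0
    for axis in range(num_axes):
        # The first `extra` axes each receive one additional ID.
        size = base + (1 if axis < extra else 0)
        splits.append(list(ids[start:start + size]))
        start += size
    return splits


# Ten node IDs over three axes: sizes 4, 3, and 3.
splits = split_evenly(list(range(10)), num_axes=3)
```

Every node ID ends up on exactly one axis, so the lengths of the returned lists always sum to the number of IDs supplied.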