Utility Functions

Helper functions for working with node data.

Node-Specific

hiveplotlib.node.dataframe_to_node_list(df: DataFrame, unique_id_column: Hashable) → List[Node]

Convert a dataframe into Node instances, where each row will be turned into a single instance.

Parameters:
  • df – dataframe to use to generate Node instances.

  • unique_id_column – which column corresponds to unique IDs for the eventual nodes.

Returns:

list of Node instances.

hiveplotlib.node.node_collection_from_node_list(node_list: List[Node], unique_id_name: str = 'unique_id', check_uniqueness: bool = True) → NodeCollection

Create hiveplotlib.NodeCollection from list of hiveplotlib.Node instances.

Parameters:
  • node_list – list of Node instances to convert into a NodeCollection.

  • unique_id_name – name to use for unique IDs.

  • check_uniqueness – whether or not to check that provided Node instances have unique IDs.

Returns:

the resulting NodeCollection.

hiveplotlib.node.subset_node_collection_by_unique_ids(node_collection: NodeCollection, ids: List[Hashable] | Hashable) → DataFrame

Subset NodeCollection dataframe by specific unique IDs.

Parameters:
  • node_collection – node data to subset.

  • ids – unique ID(s) of node data to subset.

Returns:

dataframe of node data subset for only the provided ids.

hiveplotlib.node.split_nodes_on_variable(node_list: list[Node], variable_name: Hashable, cutoffs: List[float] | int | None = None, labels: List[Hashable] | None = None) → Dict[Hashable, List[Hashable]]

Split a list of Node instances into a partition of node IDs.

By default, splits will group node IDs on unique values of variable_name.

If variable_name corresponds to numerical data, and a list of cutoffs is provided, node IDs will be separated into bins according to the following binning scheme:

(-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf)

If variable_name corresponds to numerical data, and cutoffs is provided as an int, node IDs will be separated into cutoffs equal-sized quantiles.

Note

This method currently only supports splits where variable_name corresponds to numerical data.

Parameters:
  • node_list – list of Node instances to partition.

  • variable_name – which variable in each Node instance to group by.

  • cutoffs – cutoffs to use in binning nodes according to data under variable_name. Default None will bin nodes by unique values of variable_name. When provided as a list, the specified cutoffs will bin according to (-inf, cutoffs[0]], (cutoffs[0], cutoffs[1]], … , (cutoffs[-1], inf). When provided as an int, the exact numerical break points will be determined to create cutoffs equal-sized quantiles.

  • labels – labels assigned to each bin. Only referenced when cutoffs is not None. Default None labels each bin as a string based on its range of values. Note, when cutoffs is a list, len(labels) must be 1 greater than len(cutoffs). When cutoffs is an int, len(labels) must be equal to cutoffs.

Returns:

dict whose values are lists of Node unique IDs. If cutoffs is None, keys will be the unique values for the variable. Otherwise, each key will be the string representation of a bin range.
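The list-of-cutoffs binning scheme described above can be sketched in plain Python. This is a minimal illustration of the interval logic only, not hiveplotlib's implementation: the helper name `bin_ids_by_cutoffs`, the use of `(id, value)` pairs in place of Node instances, and the integer bin keys are all assumptions made for the example.

```python
from bisect import bisect_left
from collections import defaultdict


def bin_ids_by_cutoffs(id_value_pairs, cutoffs):
    """Group IDs into bins (-inf, c0], (c0, c1], ..., (c_last, inf).

    Hypothetical helper for illustration. Keys are bin indices, not the
    string range labels that the documented function returns.
    """
    cutoffs = sorted(cutoffs)
    bins = defaultdict(list)
    for unique_id, value in id_value_pairs:
        # bisect_left counts the cutoffs strictly below value, so a value
        # equal to a cutoff lands in the bin that the cutoff closes.
        bins[bisect_left(cutoffs, value)].append(unique_id)
    return dict(bins)


pairs = [("a", 1.0), ("b", 5.0), ("c", 5.1), ("d", 12.0)]
result = bin_ids_by_cutoffs(pairs, cutoffs=[5, 10])
```

Two cutoffs produce three bins here, which matches the requirement that len(labels) be 1 greater than len(cutoffs) when cutoffs is a list.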

Other

Utility functions for hive plot curvature and coordinates.

hiveplotlib.utils.bezier(start: float, end: float, control: float, num_steps: int = 100) → ndarray

Calculate 1-dimensional Bézier curve values between start and end with curve based on control.

Note, this function is hardcoded for exactly 1 control point.

Parameters:
  • start – starting point.

  • end – ending point.

  • control – “pull” point.

  • num_steps – number of points on Bézier curve.

Returns:

(num_steps, ) sized np.ndarray of 1-dimensional discretized Bézier curve output.
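A Bézier curve with exactly one control point is a quadratic Bézier curve, B(t) = (1 - t)² · start + 2(1 - t)t · control + t² · end for t in [0, 1]. The following pure-Python sketch samples that formula; the function name `quadratic_bezier` is hypothetical, and the choice of evenly spaced t values that include both endpoints is an assumption rather than a documented detail of hiveplotlib.utils.bezier.

```python
def quadratic_bezier(start, end, control, num_steps=100):
    """Sample a 1-dimensional quadratic Bezier curve at num_steps values of t."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    values = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # assumed: evenly spaced t, endpoints included
        values.append(
            (1 - t) ** 2 * start + 2 * (1 - t) * t * control + t**2 * end
        )
    return values


curve = quadratic_bezier(start=0.0, end=1.0, control=2.0, num_steps=5)
```

The curve starts at start and finishes at end, while control pulls the interior values toward itself without the curve passing through it.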

hiveplotlib.utils.bezier_all(start_arr: List[float] | ndarray, end_arr: List[float] | ndarray, control_arr: List[float] | ndarray, num_steps: int = 100) → ndarray

Calculate Bézier curve between multiple start and end values.

Note, this function is hardcoded for exactly 1 control point per curve.

Parameters:
  • start_arr – starting point of each curve.

  • end_arr – corresponding ending point of each curve.

  • control_arr – corresponding “pull” points for each curve.

  • num_steps – number of points on each Bézier curve.

Returns:

(len(start_arr) * num_steps, ) sized np.ndarray of 1-dimensional discretized Bézier curve output. Note, every num_steps chunk of the output corresponds to a different Bézier curve.
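The flat output layout can be illustrated with a short pure-Python sketch that concatenates the samples of each curve and then recovers one curve by slicing. The helper names `bezier_1d` and `bezier_all_flat` are hypothetical, and the evenly spaced sampling is an assumption; only the "one num_steps chunk per curve" layout comes from the description above.

```python
def bezier_1d(start, end, control, num_steps):
    """Quadratic Bezier samples at evenly spaced t in [0, 1] (assumed spacing)."""
    ts = [i / (num_steps - 1) for i in range(num_steps)]
    return [(1 - t) ** 2 * start + 2 * (1 - t) * t * control + t**2 * end for t in ts]


def bezier_all_flat(start_arr, end_arr, control_arr, num_steps=100):
    """Concatenate the samples of every curve into one flat list."""
    flat = []
    for start, end, control in zip(start_arr, end_arr, control_arr):
        flat.extend(bezier_1d(start, end, control, num_steps))
    return flat


num_steps = 4
flat = bezier_all_flat([0.0, 10.0], [1.0, 20.0], [0.5, 15.0], num_steps=num_steps)

# Curve k occupies positions k * num_steps up to (k + 1) * num_steps.
second_curve = flat[num_steps : 2 * num_steps]
```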

hiveplotlib.utils.bezier_xy_with_nans(start_arr: List[float] | ndarray, end_arr: List[float] | ndarray, control_xy: Tuple[List[float] | ndarray, List[float] | ndarray], num_steps: int = 100, numba_mode: str | None = None, dtype: dtype | type | str = numpy.float64) → ndarray

Vectorized 2D quadratic Bézier sampling for multiple curves with NaN separators.

start_arr and end_arr hold the per-curve start and end points (each of shape (n, 2)), and control_xy holds the x and y values of the per-curve control points. Returns an array of shape (n * (num_steps + 1), 2), where each curve contributes num_steps rows followed by one NaN separator row.

Parameters:
  • numba_mode

    selects the computation backend.

    • "parallel": use numba-parallel implementation (if available).

    • "serial": use numba single-threaded implementation (if available).

    • "off" or None (default): use pure NumPy implementation.

  • dtype – output array dtype. Default numpy.float64.
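The row layout with NaN separators can be sketched without NumPy or numba. This is an illustration of the output structure only: the function name `bezier_xy_rows`, the use of tuples and lists instead of arrays, and the evenly spaced sampling are assumptions, and none of the numba_mode backends are modeled.

```python
import math


def bezier_xy_rows(starts, ends, control_xy, num_steps=100):
    """Return num_steps (x, y) rows per curve, each followed by a NaN row."""
    control_x, control_y = control_xy
    rows = []
    for (sx, sy), (ex, ey), cx, cy in zip(starts, ends, control_x, control_y):
        for i in range(num_steps):
            t = i / (num_steps - 1)  # assumed: evenly spaced t in [0, 1]
            a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t**2
            rows.append((a * sx + b * cx + c * ex, a * sy + b * cy + c * ey))
        rows.append((math.nan, math.nan))  # separator row after each curve
    return rows


rows = bezier_xy_rows(
    starts=[(0.0, 0.0), (1.0, 1.0)],
    ends=[(2.0, 0.0), (3.0, 1.0)],
    control_xy=([1.0, 2.0], [1.0, 2.0]),
    num_steps=3,
)
```

With n = 2 curves and num_steps = 3, the result has 2 * (3 + 1) = 8 rows. Plotting libraries such as matplotlib break a line wherever they meet NaN values, so a layout like this lets many curves be drawn in a single call.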

hiveplotlib.utils.cartesian2polar(x: ndarray | float, y: ndarray | float) → Tuple[ndarray | float, ndarray | float]

Convert Cartesian coordinates, e.g. (x, y), to polar coordinates.

Polar coordinates are given as (rho, phi), where rho is the distance from the origin and phi is the counterclockwise angle off of the x-axis in degrees.

Parameters:
  • x – Cartesian x coordinates.

  • y – Cartesian y coordinates.

Returns:

(rho, phi) polar coordinates.
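For scalar inputs, the conversion can be sketched with the standard library. The function name `to_polar_degrees` is hypothetical, and the use of atan2, which yields angles in the range [-180, 180], is an assumption about the angle convention rather than something stated above.

```python
import math


def to_polar_degrees(x, y):
    """Convert a Cartesian point to (rho, phi), with phi in degrees."""
    rho = math.hypot(x, y)  # distance from the origin
    phi = math.degrees(math.atan2(y, x))  # counterclockwise from the x-axis
    return rho, phi


rho, phi = to_polar_degrees(0.0, 2.0)  # a point on the positive y-axis
```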

hiveplotlib.utils.polar2cartesian(rho: float, phi: float) → tuple[float, float]
hiveplotlib.utils.polar2cartesian(rho: ndarray, phi: float) → tuple[ndarray, ndarray]
hiveplotlib.utils.polar2cartesian(rho: float, phi: ndarray) → tuple[ndarray, ndarray]
hiveplotlib.utils.polar2cartesian(rho: ndarray, phi: ndarray) → tuple[ndarray, ndarray]

Convert polar coordinates to Cartesian coordinates, e.g. (x, y).

Polar coordinates are given as (rho, phi), where rho is the distance from the origin and phi is the counterclockwise angle off of the x-axis in degrees.

Parameters:
  • rho – distance from origin.

  • phi – counterclockwise angle off of x-axis in degrees (not radians).

Returns:

(x, y) Cartesian coordinates.
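Because phi is expressed in degrees, it has to be converted to radians before any trigonometric function is applied. A minimal scalar sketch, with the hypothetical name `to_cartesian`, might look like this; it does not cover the ndarray overloads listed above.

```python
import math


def to_cartesian(rho, phi_degrees):
    """Convert (rho, phi) with phi in degrees to a Cartesian (x, y) pair."""
    phi = math.radians(phi_degrees)  # trig functions expect radians
    return rho * math.cos(phi), rho * math.sin(phi)


x, y = to_cartesian(2.0, 90.0)  # 90 degrees points along the positive y-axis
```

Floating-point rounding means x comes out as a value very close to, but not exactly, zero, so comparisons on converted coordinates are best done with a tolerance.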