Datashader in Matplotlib

Datashading capabilities for hiveplotlib.

hiveplotlib.viz.datashader.datashade_edges_mpl(instance: BaseHivePlot | HivePlot | P2CP, tag: Hashable | None = None, cmap: str | ListedColormap | None = None, vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 1, reduction: Reduction | None = None, buffer: float = 0.1, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 150, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) → Tuple[Figure, Axes, AxesImage | None]

matplotlib visualization of constructed edges in a HivePlot or P2CP instance using datashader.

The main idea of datashader is that, rather than plotting all the lines on top of each other in a figure, one can instead build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction function reduction=ds.count() (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.
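As a rough illustration of the "2d histogram" idea above, the following pure-Python sketch counts sampled edge points per pixel. The helper name count_raster and the sample edges are hypothetical, and datashader itself rasterizes the line segments between points rather than only binning the samples, so treat this as a simplified picture of a count reduction rather than the library's implementation.

```python
def count_raster(points, x_range, y_range, width, height):
    """Count how many (x, y) samples fall into each pixel of a width x height grid."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # samples outside the canvas are dropped
        # Map data coordinates to pixel indices, clamping the upper edge into the last bin.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        grid[row][col] += 1
    return grid


# Two hypothetical "edges", each sampled as a short list of points that cross in the middle.
edge_a = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9)]
edge_b = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]
raster = count_raster(edge_a + edge_b, (0.0, 1.0), (0.0, 1.0), width=4, height=4)
```

The pixel where the two edges cross holds a count of 2 while every other touched pixel holds 1, which is the kind of density information the colormap then encodes.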

Note

A higher dpi value when datashading will allow for more nuance in the rasterization, but will require adjusting the pixel_spread parameter accordingly (larger spread likely needed when using a higher dpi).

Experimentation with different (low) values for pixel_spread is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.

Any provided edge plotting keyword arguments in HivePlot.edges.edge_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the edges. Inclusion of any edge kwargs here as part of the im_kwargs will likely trigger an error.
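To make the spreading described in this note more concrete, here is a minimal pure-Python sketch that copies each nonzero pixel to its neighbors within a circular radius. The function name spread, the circular neighborhood, and the choice to keep the largest value where neighborhoods overlap are all assumptions made for illustration; datashader's own spreading operator may combine values differently.

```python
def spread(grid, px):
    """Copy each nonzero pixel to all neighbors within a radius of px pixels."""
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            value = grid[r][c]
            if value == 0:
                continue
            for dr in range(-px, px + 1):
                for dc in range(-px, px + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < height and 0 <= cc < width
                    if inside and dr * dr + dc * dc <= px * px:
                        # Keep the largest value when neighborhoods overlap (an assumption).
                        out[rr][cc] = max(out[rr][cc], value)
    return out


thin = [[0] * 5 for _ in range(5)]
thin[2][2] = 3  # a single isolated pixel
wide = spread(thin, px=1)
```

With px=1 the single pixel grows into a plus-shaped patch of five pixels. At higher dpi the same curve covers proportionally thinner pixel runs, which is one way to see why a larger spread is likely needed there.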

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • tag – which tag of data to plot. If None is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.

  • cmap – which colormap to use for the datashaded edges. Default uses a seaborn colormap similar to the matplotlib "Blues" colormap.

  • vmin – minimum value used in the colormap for plotting the rasterization of curves. Default 1.

  • vmax – maximum value used in the colormap for plotting the rasterization of curves. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 1 pixel. For more on spreading, see the datashader documentation.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • fig – default None builds new figure. If a figure is specified, Axis instances will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization. Must not clash with the cmap, vmin, or vmax parameters.
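The bounds formula given for the buffer parameter above can be worked through with a small sketch. The helper name buffered_bounds and the example radii are hypothetical; only the formula itself comes from the parameter description.

```python
def buffered_bounds(axis_radii, buffer=0.1):
    """Return the symmetric (low, high) bounds used for both x and y."""
    max_radius = max(axis_radii)
    extent = max_radius + buffer * max_radius
    return (-extent, extent)


# Hypothetical maximum radii spanned by three Axis instances.
bounds = buffered_bounds([4.0, 5.0, 3.5], buffer=0.1)
```

Here the largest radius is 5.0, so a buffer of 0.1 pads the bounds by 0.5 on each side.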

Raises:

ValueError – If a column-based reduction (e.g. ds.sum('col'), ds.mean('col'), ds.count_cat('col')) references an edge metadata column that does not exist for the selected tag.

Returns:

matplotlib figure, axis, image. If there are no edges to plot, the returned image will be None.
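Because the returned image may be None, downstream code that decorates the figure should guard against that case. The sketch below shows one way to do so with a hypothetical helper, add_colorbar_if_present, which is not part of hiveplotlib; it only assumes that fig behaves like a matplotlib Figure with a colorbar() method.

```python
def add_colorbar_if_present(fig, ax, im, label="count"):
    """Attach a colorbar only when a rasterized image was actually drawn."""
    if im is None:
        return None  # nothing was rasterized, so there is nothing to describe
    return fig.colorbar(im, ax=ax, label=label)
```

A caller would pass the three returned values straight through, for example add_colorbar_if_present(fig, ax, im) after unpacking fig, ax, im from the datashading call.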

hiveplotlib.viz.datashader.datashade_hive_plot_mpl(instance: BaseHivePlot | HivePlot | P2CP, tag: Hashable | None = None, cmap_edges: str | ListedColormap | None = None, cmap_nodes: str | ListedColormap = 'copper', vmin_nodes: float = 1, vmax_nodes: float | None = None, vmin_edges: float = 1, vmax_edges: float | None = None, node_kwargs: dict | None = None, log_cmap_nodes: bool = True, pixel_spread_nodes: int = 7, reduction_nodes: Reduction | None = None, log_cmap_edges: bool = True, pixel_spread_edges: int = 1, reduction_edges: Reduction | None = None, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 150, axes_off: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) → Tuple[Figure, Axes, AxesImage | None, AxesImage | None]

matplotlib visualization of a HivePlot or P2CP instance using datashader.

Plots both nodes and edges with datashader along with standard hive plot / P2CP axes.

The main idea of datashader is that, rather than plotting all the lines on top of each other in a figure, one can instead build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction function ds.count() (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A higher dpi value when datashading will allow for more nuance in the rasterization, but will require adjusting the pixel_spread_nodes and pixel_spread_edges parameters accordingly (larger spreads likely needed when using a higher dpi). Higher dpi values will also increase computation time.

Experimentation with different (low) values for pixel_spread_nodes and pixel_spread_edges is encouraged. As the names suggest, these parameters spread out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.

Any provided node plotting keyword arguments in HivePlot.nodes.node_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the nodes. Inclusion of any node_kwargs here will also raise a warning.

Any provided edge plotting keyword arguments in HivePlot.edges.edge_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the edges. Inclusion of any edge kwargs here as part of the edge_kwargs will likely trigger an error.

Parameters:
  • instance – HivePlot or P2CP instance to visualize.

  • tag – which tag of data to plot. If None is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.

  • cmap_edges – which colormap to use for the datashaded edges. Default uses a seaborn colormap similar to the matplotlib "Blues" colormap.

  • cmap_nodes – which colormap to use for the datashaded nodes. Default “copper”.

  • vmin_nodes – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.

  • vmax_nodes – maximum value used in the colormap for plotting the rasterization of nodes. Default None finds and uses the maximum bin value of the calculated rasterization.

  • vmin_edges – minimum value used in the colormap for plotting the rasterization of edges. Default 1.

  • vmax_edges – maximum value used in the colormap for plotting the rasterization of edges. Default None finds and uses the maximum bin value of the calculated rasterization.

  • node_kwargs – additional params that will be applied to the final plt.imshow() call on the node rasterization. Must not clash with the cmap_nodes, vmin_nodes, or vmax_nodes parameters.

  • log_cmap_nodes – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • pixel_spread_nodes – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 7 pixels. For more on spreading, see the datashader documentation.

  • reduction_nodes – the means of projecting from data space to pixel space for the rasterization of nodes. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • log_cmap_edges – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • pixel_spread_edges – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 1 pixel. For more on spreading, see the datashader documentation.

  • reduction_edges – the means of projecting from data space to pixel space for the rasterization of edges. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • fig – default None builds new figure. If a figure is specified, Axis instances will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.

  • text_kwargs – additional kwargs passed to plt.text() call.

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • edge_kwargs – additional params that will be applied to the final plt.imshow() call on the edge rasterization. Must not clash with the cmap_edges, vmin_edges, or vmax_edges parameters.
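The interaction between the log_cmap_* flags and the vmin_* / vmax_* parameters above can be sketched as a base-10 log normalization of bin counts onto the colormap range [0, 1]. The helper log_normalize is hypothetical, and the clipping of values outside [vmin, vmax] is an assumption made for illustration; how hiveplotlib normalizes internally is not stated in this reference.

```python
import math


def log_normalize(value, vmin, vmax):
    """Map a bin count to [0, 1] on a base-10 log scale, clipping outside [vmin, vmax]."""
    clipped = min(max(value, vmin), vmax)
    span = math.log10(vmax) - math.log10(vmin)
    return (math.log10(clipped) - math.log10(vmin)) / span


counts = [1, 10, 100, 1000]
vmax = max(counts)  # mirrors the documented default of using the maximum bin value
scaled = [log_normalize(c, vmin=1, vmax=vmax) for c in counts]
```

On a log scale each tenfold increase in count moves the same distance along the colormap, so sparse regions stay visible next to very dense ones. With the default vmin of 1, a bin holding a single element sits at the bottom of the colormap.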

Raises:

ValueError – If a column-based reduction_edges (e.g. ds.sum('col'), ds.mean('col'), ds.count_cat('col')) references an edge metadata column that does not exist for the selected tag.

Returns:

matplotlib figure, axis, the image corresponding to node data, and the image corresponding to edge data. If there are no edges / nodes to plot, the returned edges image / nodes image will be None.

hiveplotlib.viz.datashader.datashade_nodes_mpl(instance: BaseHivePlot | HivePlot | P2CP, cmap: str | ListedColormap = 'copper', vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 7, reduction: Reduction | None = None, buffer: float = 0.1, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 150, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) → Tuple[Figure, Axes, AxesImage | None]

matplotlib visualization of nodes / points in a HivePlot / P2CP instance using datashader.

The main idea of datashader is that, rather than plotting all the points on top of each other in a figure, one can instead build up a single 2d image of the points in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction function reduction=ds.count() (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A higher dpi value when datashading will allow for more nuance in the rasterization, but will require adjusting the pixel_spread parameter accordingly (larger spread likely needed when using a higher dpi).

Experimentation with different values for pixel_spread is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in smaller, harder-to-see points in the final visualization. For more on spreading, see the datashader documentation.

Any provided node plotting keyword arguments in HivePlot.nodes.node_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the nodes. Inclusion of any node_kwargs here will also raise a warning.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • cmap – which colormap to use for the datashaded nodes. Default “copper”.

  • vmin – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.

  • vmax – maximum value used in the colormap for plotting the rasterization of nodes. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 7 pixels. For more on spreading, see the datashader documentation.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • fig – default None builds new figure. If a figure is specified, Axis instances will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization. Must not clash with the cmap, vmin, or vmax parameters.
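The rules above for fig, ax, figsize, and fig_kwargs can be summarized in a short sketch. Both helpers below are hypothetical and are not hiveplotlib functions; they only restate the documented behavior that a new figure is built when fig and ax are both None, and that a figsize inside fig_kwargs takes priority over the figsize parameter.

```python
def needs_new_figure(fig=None, ax=None):
    """A new figure and axes are instantiated only when both inputs are None."""
    return fig is None and ax is None


def resolve_subplots_kwargs(figsize=(10, 10), fig_kwargs=None):
    """Merge figsize with fig_kwargs, letting entries in fig_kwargs win."""
    kwargs = {"figsize": figsize}
    kwargs.update(fig_kwargs or {})  # a figsize given here overrides the parameter
    return kwargs


default_kwargs = resolve_subplots_kwargs()
overridden = resolve_subplots_kwargs(figsize=(10, 10), fig_kwargs={"figsize": (6, 6)})
```

In the second call the figure would be 6 by 6 inches, since the figsize entry in fig_kwargs is prioritized.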

Returns:

matplotlib figure, axis, image. If there are no nodes to plot, the returned image will be None.

hiveplotlib.viz.datashader.edge_viz(instance: BaseHivePlot | HivePlot | P2CP, tag: Hashable | None = None, cmap: str | ListedColormap | None = None, vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 1, reduction: Reduction | None = None, buffer: float = 0.1, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 150, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) → Tuple[Figure, Axes, AxesImage | None]

matplotlib visualization of constructed edges in a HivePlot or P2CP instance using datashader.

The main idea of datashader is that, rather than plotting all the lines on top of each other in a figure, one can instead build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction function reduction=ds.count() (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A higher dpi value when datashading will allow for more nuance in the rasterization, but will require adjusting the pixel_spread parameter accordingly (larger spread likely needed when using a higher dpi).

Experimentation with different (low) values for pixel_spread is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.

Any provided edge plotting keyword arguments in HivePlot.edges.edge_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the edges. Inclusion of any edge kwargs here as part of the im_kwargs will likely trigger an error.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw edges.

  • tag – which tag of data to plot. If None is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.

  • cmap – which colormap to use for the datashaded edges. Default uses a seaborn colormap similar to the matplotlib "Blues" colormap.

  • vmin – minimum value used in the colormap for plotting the rasterization of curves. Default 1.

  • vmax – maximum value used in the colormap for plotting the rasterization of curves. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 1 pixel. For more on spreading, see the datashader documentation.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • fig – default None builds new figure. If a figure is specified, Axis instances will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization. Must not clash with the cmap, vmin, or vmax parameters.

Raises:

ValueError – If a column-based reduction (e.g. ds.sum('col'), ds.mean('col'), ds.count_cat('col')) references an edge metadata column that does not exist for the selected tag.

Returns:

matplotlib figure, axis, image. If there are no edges to plot, the returned image will be None.

hiveplotlib.viz.datashader.hive_plot_viz(instance: BaseHivePlot | HivePlot | P2CP, tag: Hashable | None = None, cmap_edges: str | ListedColormap | None = None, cmap_nodes: str | ListedColormap = 'copper', vmin_nodes: float = 1, vmax_nodes: float | None = None, vmin_edges: float = 1, vmax_edges: float | None = None, node_kwargs: dict | None = None, log_cmap_nodes: bool = True, pixel_spread_nodes: int = 7, reduction_nodes: Reduction | None = None, log_cmap_edges: bool = True, pixel_spread_edges: int = 1, reduction_edges: Reduction | None = None, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 150, axes_off: bool = True, buffer: float = 0.1, show_axes_labels: bool = True, axes_labels_buffer: float = 1.1, axes_labels_fontsize: int = 16, axes_kwargs: dict | None = None, text_kwargs: dict | None = None, fig_kwargs: dict | None = None, **edge_kwargs) → Tuple[Figure, Axes, AxesImage | None, AxesImage | None]

matplotlib visualization of a HivePlot or P2CP instance using datashader.

Plots both nodes and edges with datashader along with standard hive plot / P2CP axes.

The main idea of datashader is that, rather than plotting all the lines on top of each other in a figure, one can instead build up a single 2d image of the lines in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction function ds.count() (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A higher dpi value when datashading will allow for more nuance in the rasterization, but will require adjusting the pixel_spread_nodes and pixel_spread_edges parameters accordingly (larger spreads likely needed when using a higher dpi). Higher dpi values will also increase computation time.

Experimentation with different (low) values for pixel_spread_nodes and pixel_spread_edges is encouraged. As the names suggest, these parameters spread out calculated pixel values in the rasterization radially. Values that are too low tend to result in the thinner, more isolated curves “breaking apart” in the final visualization. For more on spreading, see the datashader documentation.

Any provided node plotting keyword arguments in HivePlot.nodes.node_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the nodes. Inclusion of any node_kwargs here will also raise a warning.

Any provided edge plotting keyword arguments in HivePlot.edges.edge_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the edges. Inclusion of any edge kwargs here as part of the edge_kwargs will likely trigger an error.

Parameters:
  • instance – HivePlot or P2CP instance to visualize.

  • tag – which tag of data to plot. If None is provided, then plotting will occur if there is only one tag in the instance. For more on data tags, see further discussion in the Comparing Network Subgroups Notebook.

  • cmap_edges – which colormap to use for the datashaded edges. Default uses a seaborn colormap similar to the matplotlib "Blues" colormap.

  • cmap_nodes – which colormap to use for the datashaded nodes. Default “copper”.

  • vmin_nodes – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.

  • vmax_nodes – maximum value used in the colormap for plotting the rasterization of nodes. Default None finds and uses the maximum bin value of the calculated rasterization.

  • vmin_edges – minimum value used in the colormap for plotting the rasterization of edges. Default 1.

  • vmax_edges – maximum value used in the colormap for plotting the rasterization of edges. Default None finds and uses the maximum bin value of the calculated rasterization.

  • node_kwargs – additional params that will be applied to the final plt.imshow() call on the node rasterization. Must not clash with the cmap_nodes, vmin_nodes, or vmax_nodes parameters.

  • log_cmap_nodes – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • pixel_spread_nodes – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 7 pixels. For more on spreading, see the datashader documentation.

  • reduction_nodes – the means of projecting from data space to pixel space for the rasterization of nodes. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • log_cmap_edges – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • pixel_spread_edges – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 1 pixel. For more on spreading, see the datashader documentation.

  • reduction_edges – the means of projecting from data space to pixel space for the rasterization of edges. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • fig – default None builds new figure. If a figure is specified, Axis instances will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • show_axes_labels – whether to label the hive plot axes in the figure (uses Axis.long_name for each Axis).

  • axes_labels_buffer – fraction by which to radially buffer axes labels (e.g. setting axes_labels_buffer to 1.1 will place each label 10% past the end of its axis, moving outward from the origin of the plot).

  • axes_labels_fontsize – font size for hive plot axes labels.

  • axes_kwargs – additional params that will be applied to all axes. Note, these are kwargs that affect a plt.plot() call.

  • text_kwargs – additional kwargs passed to plt.text() call.

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • edge_kwargs – additional params that will be applied to the final plt.imshow() call on the edge rasterization. Must not clash with the cmap_edges, vmin_edges, or vmax_edges parameters.

Raises:

ValueError – If a column-based reduction_edges (e.g. ds.sum('col'), ds.mean('col'), ds.count_cat('col')) references an edge metadata column that does not exist for the selected tag.

Returns:

matplotlib figure, axis, the image corresponding to node data, and the image corresponding to edge data. If there are no edges / nodes to plot, the returned edges image / nodes image will be None.

hiveplotlib.viz.datashader.node_viz(instance: BaseHivePlot | HivePlot | P2CP, cmap: str | ListedColormap = 'copper', vmin: float = 1, vmax: float | None = None, log_cmap: bool = True, pixel_spread: int = 7, reduction: Reduction | None = None, buffer: float = 0.1, fig: Figure | None = None, ax: Axes | None = None, figsize: Tuple[float, float] = (10, 10), dpi: int = 150, axes_off: bool = True, fig_kwargs: dict | None = None, **im_kwargs) → Tuple[Figure, Axes, AxesImage | None]

matplotlib visualization of nodes / points in a HivePlot / P2CP instance using datashader.

The main idea of datashader is that, rather than plotting all the points on top of each other in a figure, one can instead build up a single 2d image of the points in 2-space. We can then plot just this rasterization, which is much smaller. By using the default reduction function reduction=ds.count() (counting values in bins), we are essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

Note

A higher dpi value when datashading will allow for more nuance in the rasterization, but will require adjusting the pixel_spread parameter accordingly (larger spread likely needed when using a higher dpi).

Experimentation with different values for pixel_spread is encouraged. As the name suggests, this parameter spreads out calculated pixel values in the rasterization radially. Values that are too low tend to result in smaller, harder-to-see points in the final visualization. For more on spreading, see the datashader documentation.

Any provided node plotting keyword arguments in HivePlot.nodes.node_viz_kwargs will be disregarded in this visualization, as this flexibility is reserved for datashading the nodes. Inclusion of any node_kwargs here will also raise a warning.

Parameters:
  • instance – HivePlot or P2CP instance for which we want to draw nodes.

  • cmap – which colormap to use for the datashaded nodes. Default “copper”.

  • vmin – minimum value used in the colormap for plotting the rasterization of nodes. Default 1.

  • vmax – maximum value used in the colormap for plotting the rasterization of nodes. Default None finds and uses the maximum bin value of the calculated rasterization.

  • log_cmap – whether to use a logarithmic (base 10) scale for the colormap. Default True.

  • reduction – the means of projecting from data space to pixel space for the rasterization. Default None uses ds.count(), essentially building a 2d histogram. For more on reductions in datashader, see the datashader documentation, and for a complete list of reduction functions available, see the datashader API docs.

  • pixel_spread – amount of pixel units in which to “spread” pixel values in the resulting rasterization before plotting. Default amount of spreading is 7 pixels. For more on spreading, see the datashader documentation.

  • buffer – fraction of the axes past which to buffer x and y dimensions (e.g. setting buffer to 0.1 will find the maximum radius spanned by any Axis instance and set the x and y bounds as (-max_radius - buffer * max_radius, max_radius + buffer * max_radius)).

  • fig – default None builds new figure. If a figure is specified, Axis instances will be drawn on that figure. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • ax – default None builds new axis. If an axis is specified, Axis instances will be drawn on that axis. Note: fig and ax must BOTH be None to instantiate new figure and axes.

  • figsize – size of figure. Note: only works if instantiating new figure and axes (i.e. fig and ax are both None).

  • dpi – resolution (Dots Per Inch) of resulting figure. A higher-than-usual DPI is recommended to show more pixels in the final rasterization, which will show more nuance.

  • axes_off – whether to turn off Cartesian x, y axes in resulting matplotlib figure (default True hides the x and y axes).

  • fig_kwargs – additional values to be called in plt.subplots() call. Note if figsize is added here, then it will be prioritized over the figsize parameter.

  • im_kwargs – additional params that will be applied to the final plt.imshow() call on the rasterization. Must not clash with the cmap, vmin, or vmax parameters.

Returns:

matplotlib figure, axis, image. If there are no nodes to plot, the returned image will be None.