Customizing Plots

import numpy as np
import holoviews as hv
from holoviews import dim, opts

hv.extension('bokeh', 'matplotlib')

The HoloViews options system allows controlling the various attributes of a plot. While different plotting extensions like bokeh, matplotlib and plotly offer different features and the style options may differ, there is a wide array of options and concepts that are shared across the different extensions. Specifically, this guide provides an overview of how to control the various aspects of a plot, including titles, axes, legends and colorbars.

Plots have an overall hierarchy and here we will break down the different components:

Customizing the plot

Title

A plot's title is usually constructed using a formatter which takes the group and label along with the plot's dimensions into consideration. The default formatter is:

'{label} {group}  {dimensions}'

where {label} and {group} are inherited from the object's group and label parameters and {dimensions} represents the key dimensions in a HoloMap/DynamicMap:

hv.HoloMap({i: hv.Curve([1, 2, 3-i], group='Group', label='Label') for i in range(3)}, 'Value')
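To make the substitution concrete, the following plain-Python sketch fills the default format string by hand. It does not call HoloViews; the `dimensions` text (`'Value: 1'`) is an assumed rendering of the current key-dimension value, and the exact string HoloViews produces may differ.

```python
# Plain-Python illustration of how the title format string is filled in.
# The 'Value: 1' text is an assumption about how the key dimension is rendered.
title_format = '{label} {group}  {dimensions}'

title = title_format.format(label='Label', group='Group', dimensions='Value: 1')
print(title)
```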

The title formatter may however be overridden with an explicit title, which may include any combination of the three formatter variables:

hv.Curve([1, 2, 3]).opts(title="Custom Title")

Background

Another option which can be controlled at the level of a plot is the background color which may be set using the bgcolor option:

hv.Curve([1, 2, 3]).opts(bgcolor='lightgray')

Font sizes

Controlling the font sizes of a plot is very common, so HoloViews provides a convenient fontsize option. The fontsize option accepts a dictionary which allows supplying font sizes for different components of the plot, from the title to the axis labels, ticks and legends. The full list of plot components that can be customized separately includes:

['xlabel', 'ylabel', 'zlabel', 'labels', 'xticks', 'yticks', 'zticks', 'ticks', 'minor_xticks', 'minor_yticks', 'minor_ticks', 'title', 'legend', 'legend_title']

Let's take a simple example customizing the title, the axis labels and the x/y-ticks separately:

hv.Curve([1, 2, 3], label='Title').opts(fontsize={'title': 16, 'labels': 14, 'xticks': 6, 'yticks': 12})

Font scaling

Instead of controlling each property individually it is often useful to scale all fonts by a constant factor, e.g. to produce a more legible plot for presentations and posters. The fontscale option will affect the title, axis labels, tick labels, and legend:

(hv.Curve([1, 2, 3], label='A') * hv.Curve([3, 2, 1], label='B')).opts(fontscale=2, width=500, height=400, title='Title')

Legend customization

When overlaying plots with different labels, a legend automatically appears to differentiate elements in the overlay. This legend can be customized in several ways:

  • by position
    • by adjusting the legend location within the figure using the legend_position option (e.g. legend_position='bottom_right')
    • by adjusting the legend location outside of the figure using the legend_position and legend_offset parameters (which then positions the legend in screen space) (e.g. legend_position='right', legend_offset=(0, 200)). Note: the legend_position option applies to bokeh and matplotlib backends but the legend_offset only applies to bokeh.
  • by style
    • by muting elements with legend_muted=True (applies only to the bokeh backend)
    • by putting the legend elements in a column layout with legend_cols=True (or legend_cols=int in matplotlib)

These customizations are demonstrated by the examples that follow.

Moving the legend to the bottom right:

overlay = (hv.Curve([1, 2, 3], label='A') * hv.Curve([3, 2, 1], label='B')).opts(width=500, height=400)
overlay.opts(legend_position='bottom_right')

Moving the legend outside, to the right of the plot:

overlay.opts(legend_position='right')

Moving the legend outside, to the right of the plot, and offsetting it 200 pixels higher:

overlay.opts(width=500, height=400, legend_position='right', legend_offset=(0, 200))

Muting the legend and laying the labels out as columns:

overlay.opts(legend_muted=True, legend_cols=2)

Plot hooks

HoloViews does not expose every single option a plotting extension like matplotlib or bokeh provides, therefore it is sometimes necessary to dig deeper to achieve precisely the customizations one might need. One convenient way of doing so is to use plot hooks to modify the plot object directly. The hooks are applied after HoloViews is done with the plot, allowing for detailed manipulations of the backend specific plot object.

The signature of a hook has two arguments: the HoloViews plot object that is rendering the plot and the element being rendered. From there the hook can modify the objects in the plot's handles, which provide convenient access to various components of a plot, or simply access plot.state, which corresponds to the plot as a whole. In this case we define colors for the x- and y-labels of the plot:

def hook(plot, element):
    print('plot.state:   ', plot.state)
    print('plot.handles: ', sorted(plot.handles.keys()))
    plot.handles['xaxis'].axis_label_text_color = 'red'
    plot.handles['yaxis'].axis_label_text_color = 'blue'
    
hv.Curve([1, 2, 3]).opts(hooks=[hook])
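Because a hook is an ordinary function of `(plot, element)`, it can be parametrized with a closure and exercised without rendering anything. The sketch below is one possible pattern, not part of HoloViews itself: `make_label_color_hook` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `handles` dictionary of a bokeh plot.

```python
from types import SimpleNamespace


def make_label_color_hook(x_color, y_color):
    """Return a hook that colors the x- and y-axis labels (bokeh handles assumed)."""
    def hook(plot, element):
        plot.handles['xaxis'].axis_label_text_color = x_color
        plot.handles['yaxis'].axis_label_text_color = y_color
    return hook


# Stand-in objects (hypothetical, for illustration only) so the hook can be
# called directly instead of waiting for a plot to be rendered.
fake_plot = SimpleNamespace(handles={'xaxis': SimpleNamespace(),
                                     'yaxis': SimpleNamespace()})
hook = make_label_color_hook('green', 'purple')
hook(fake_plot, element=None)
print(fake_plot.handles['xaxis'].axis_label_text_color)
```

With a real plot the same factory would be passed in the usual way, e.g. `hv.Curve([1, 2, 3]).opts(hooks=[make_label_color_hook('green', 'purple')])`.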

Customizing axes

Controlling the axis scales is one of the most common changes to make to a plot, so we will provide a quick overview of the four main types of axes and then go into some more detail on how to control the axis labels, ranges, ticks and orientation.

Types of axes

There are four main types of axes supported across plotting backends: standard linear axes, log axes, datetime axes and categorical axes. In most cases HoloViews automatically detects the appropriate axis type to use based on the type of the data, e.g. numeric values use linear/log axes, date(time) values use datetime axes and string or other object types use categorical axes.

Linear axes

A linear axis is simply the default: as long as the data is numeric, HoloViews will automatically use a linear axis on the plot.

Log axes

When the data is exponential it is often useful to use log axes, which can be enabled using independent logx and logy options. This way both semi-log and log-log plots can be achieved:

semilogy = hv.Curve(np.logspace(0, 5), label='Semi-log y axes')
loglog = hv.Curve((np.logspace(0, 5), np.logspace(0, 5)), label='Log-log axes')

semilogy.opts(logy=True) + loglog.opts(logx=True, logy=True, shared_axes=False)
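As a rough illustration of why exponential data suits a log axis, the sketch below rebuilds the samples of `np.logspace(0, 5)` in plain Python (assuming NumPy's default of 50 samples) and checks that they are evenly spaced once their base-10 logarithm is taken, which is the quantity a log axis lays out linearly.

```python
import math

# Pure-Python equivalent of np.logspace(0, 5, num=50): exponents are evenly
# spaced between 0 and 5, and each sample is 10**exponent.
num = 50
exponents = [5 * i / (num - 1) for i in range(num)]
samples = [10 ** e for e in exponents]

# On a log axis the plotted position is proportional to log10(value), so
# consecutive samples are separated by a (numerically) constant step.
steps = [math.log10(b) - math.log10(a) for a, b in zip(samples, samples[1:])]
print(min(steps), max(steps))
```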

Datetime axes

All current plotting extensions allow plotting datetime data, if you ensure the dates array is of a valid datetime dtype.

from bokeh.sampledata.stocks import GOOG, AAPL

goog_dates = np.array(GOOG['date'], dtype=np.datetime64)
aapl_dates = np.array(AAPL['date'], dtype=np.datetime64)

goog = hv.Curve((goog_dates, GOOG['adj_close']), 'Date', 'Stock Index', label='Google')
aapl = hv.Curve((aapl_dates, AAPL['adj_close']), 'Date', 'Stock Index', label='Apple')

(goog * aapl).opts(width=600, legend_position='top_left')

Categorical axes

While the handling of categorical data differs significantly between plotting extensions, the same basic concepts apply. If the data is a string type or other object type it is formatted as a string and each unique category is assigned a tick along the axis. When overlaying elements the categories are combined and overlaid appropriately.

Whether an axis is categorical also depends on the Element type, e.g. a HeatMap always has two categorical axes while a Bars element always has a categorical x-axis. As a simple example let us create a set of points with categories along the x- and y-axes and render them on top of a HeatMap of the same data:

points = hv.Points([(chr(i+65), chr(j+65), i*j) for i in range(10) for j in range(10)], vdims='z')

heatmap = hv.HeatMap(points)

(heatmap * points).opts(
    opts.HeatMap(toolbar='above', tools=['hover']),
    opts.Points(tools=['hover'], size=dim('z')*0.3))

As a more complex example which does not implicitly assume categorical axes due to the element type, we will create a set of random samples indexed by categories from 'A' to 'E' using the Scatter Element and overlay them. Secondly, we compute the mean and standard deviation for each category and display them using a set of ErrorBars. Finally, we overlay these two elements with a Curve representing the mean value. All these Elements respect the categorical index, providing us a view of the distribution of values in each category:

overlay = hv.NdOverlay({group: hv.Scatter(([group]*100, np.random.randn(100)*(5-i)-i))
                        for i, group in enumerate(['A', 'B', 'C', 'D', 'E'])})

errorbars = hv.ErrorBars([(k, el.reduce(function=np.mean), el.reduce(function=np.std))
                          for k, el in overlay.items()])

curve = hv.Curve(errorbars)

(errorbars * overlay * curve).opts(
    opts.ErrorBars(line_width=5), opts.Scatter(jitter=0.2, alpha=0.5, size=6, height=400, width=600))

Categorical axes are special in that they support multi-level nesting in some cases. Currently this is only supported for certain element types (BoxWhisker, Violin and Bars) but eventually all chart-like elements will interpret multiple key dimensions as a multi-level categorical hierarchy. To demonstrate this behavior consider the BoxWhisker plot below, which supports two-level nested categories:

groups = [chr(65+g) for g in np.random.randint(0, 3, 200)]
boxes = hv.BoxWhisker((groups, np.random.randint(0, 5, 200), np.random.randn(200)),
                      ['Group', 'Category'], 'Value').sort()

boxes.opts(width=600)

Axis positions

Axes may be hidden or moved to a different location using the xaxis and yaxis options, which accept None, 'right'/'bottom', 'left'/'top' and 'bare' as values.

np.random.seed(42)
ys = np.random.randn(101).cumsum(axis=0)

curve = hv.Curve(ys, ('x', 'x-label'), ('y', 'y-label'))

(curve.relabel('No axis').opts(xaxis=None, yaxis=None) +
 curve.relabel('Bare axis').opts(xaxis='bare') +
 curve.relabel('Moved axis').opts(xaxis='top', yaxis='right'))

Inverting axes

Another option to control axes is to invert the x-/y-axes using the invert_axes option, i.e. turn a vertical plot into a horizontal plot. Secondly, each individual axis can be flipped left to right or upside down respectively using the invert_xaxis and invert_yaxis options.

bars = hv.Bars([('Australia', 10), ('United States', 14), ('United Kingdom', 7)], 'Country')

(bars.relabel('Invert axes').opts(invert_axes=True, width=400) +
 bars.relabel('Invert x-axis').opts(invert_xaxis=True) +
 bars.relabel('Invert y-axis').opts(invert_yaxis=True)).opts(shared_axes=False)

Axis labels

Ordinarily axis labels are controlled using the dimension label; however, explicit xlabel and ylabel options make it possible to override the label at the plot level. Additionally, the labelled option allows specifying which axes should be labelled at all, making it possible to hide axis labels:

(curve.relabel('Dimension labels') +
 curve.relabel("xlabel='Custom x-label'").opts(xlabel='Custom x-label') +
 curve.relabel('Unlabelled').opts(labelled=[]))

Axis ranges

The ranges of a plot are ordinarily controlled by computing the data range and combining it with the dimension range and soft_range but they may also be padded or explicitly overridden using xlim and ylim options.

Dimension ranges

  • data range: The data range is computed by min and max of the dimension values
  • range: Hard override of the data range
  • soft_range: Soft override of the data range

Dimension.range

Setting the range of a Dimension overrides the data ranges, i.e. here we can see that despite the fact the data extends to x=100 the axis is cut off at 90:

curve.redim(x=hv.Dimension('x', range=(-10, 90)))

Dimension.soft_range

Declaring a soft_range on the other hand combines the data range and the supplied range, i.e. it will pick whichever extent is wider. Using the same example as above we can see it uses the -10 value supplied in the soft_range but also extends to 100, which is the upper bound of the actual data:

curve.redim(x=hv.Dimension('x', soft_range=(-10, 90)))
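The difference between the two settings can be summarized with a small model in plain Python. This is a simplified sketch of the behavior described above, not HoloViews' implementation; `hard_range` and `combine_soft_range` are hypothetical helper names, and partially specified ranges are not handled.

```python
def hard_range(data_range, dim_range):
    """Model of Dimension.range: the declared range replaces the data range."""
    return dim_range


def combine_soft_range(data_range, soft_range):
    """Model of Dimension.soft_range: keep whichever bound extends further."""
    (data_lo, data_hi), (soft_lo, soft_hi) = data_range, soft_range
    return (min(data_lo, soft_lo), max(data_hi, soft_hi))


data = (0, 100)  # x-extent of the 101-sample curve used in this section
print(hard_range(data, (-10, 90)))          # (-10, 90)
print(combine_soft_range(data, (-10, 90)))  # (-10, 100)
```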

Padding

Applying padding to the ranges is an easy way to ensure that the data is not obscured by the margins. The padding is specified by the fraction by which to increase auto-ranged extents to make datapoints more visible around borders. The default for most elements is padding=0.1. The padding considers the width and height of the plot to keep the visual extent of the padding equal. The padding values can be specified with three levels of detail:

    1. A single numeric value (e.g. padding=0.1)
    2. A tuple specifying the padding for the x/y(/z) axes respectively (e.g. padding=(0, 0.1))
    3. A tuple of tuples specifying padding for the lower and upper bound respectively (e.g. padding=(0, (0, 0.1)))

(curve.relabel('Pad both axes').opts(padding=0.1) +
 curve.relabel('Pad y-axis').opts(padding=(0, 0.1)) +
 curve.relabel('Pad y-axis upper bound').opts(padding=(0, (0, 0.1)))).opts(shared_axes=False)
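One way to read the three forms is that each is shorthand for a fully spelled-out lower/upper pair per axis. The sketch below expands them and applies a padding fraction to a range. It is a simplified model with hypothetical helper names: it ignores the width/height compensation mentioned above, so the numbers HoloViews actually uses can differ.

```python
def normalize_padding(padding):
    """Expand a padding spec to one (lower, upper) pair per axis."""
    if not isinstance(padding, tuple):
        padding = (padding, padding)  # a single value applies to both axes
    return tuple(p if isinstance(p, tuple) else (p, p) for p in padding)


def pad_range(lower, upper, pad):
    """Widen (lower, upper) by the given fractions of its span."""
    span = upper - lower
    return (lower - span * pad[0], upper + span * pad[1])


print(normalize_padding(0.1))            # ((0.1, 0.1), (0.1, 0.1))
print(normalize_padding((0, 0.1)))       # ((0, 0), (0.1, 0.1))
print(normalize_padding((0, (0, 0.1))))  # ((0, 0), (0, 0.1))
print(pad_range(0, 100, (0, 0.1)))       # (0, 110.0)
```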

xlim/ylim

The data ranges, dimension ranges and padding combine across plots in an overlay to ensure that all the data is contained in the viewport. In some cases it is more convenient to override the ranges with explicit xlim and ylim options which have the highest precedence and will be respected no matter what.

curve.relabel('Explicit xlim/ylim').opts(xlim=(-10, 110), ylim=(-14, 6))

Autoranging

With the autorange keyword, you can ensure the data in the viewport is automatically ranged to maximise the use of the x- or y-axis. To illustrate, here is the same curve autoranging on the y-axis. Note the difference in behavior when zooming into the data:

curve.relabel('Autoranging on y').opts(autorange='y')

To pin the ends of the ranges you can use the xlim and ylim options, using a value of None to allow autoranging to operate. Here the bottom range of the y-axis is pinned to the value of -14:

curve.relabel('Autoranging on y with set lower limit').opts(autorange='y', ylim=(-14,None))
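The role of None in a limit can be pictured as "leave this bound to autoranging". The following plain-Python sketch models that rule; `resolve_limits` is a hypothetical helper and the visible range is an invented value, so this shows the idea rather than the code HoloViews runs.

```python
def resolve_limits(limits, auto_range):
    """Use pinned bounds where given; fall back to the automatic bound for None."""
    return tuple(auto if pinned is None else pinned
                 for pinned, auto in zip(limits, auto_range))


# Hypothetical automatic y-range of the data currently in view.
visible = (-3.2, 4.5)
print(resolve_limits((-14, None), visible))   # (-14, 4.5)
print(resolve_limits((None, None), visible))  # (-3.2, 4.5)
```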

Autoranging works analogously for the x-axis and also respects the padding setting. In addition, autoranging is triggered when the plotted data is updated dynamically, as is common when building interactive visualizations with operations or DynamicMaps.

Axis ticks

Setting tick locations differs a little bit depending on the plotting extension: interactive backends such as bokeh or plotly dynamically update the ticks, which means fixed tick locations may not be appropriate and the formatters have to be applied in JavaScript code. Nevertheless, most options to control the ticking are consistent across extensions.

Tick locations

The number and locations of ticks can be set in three main ways:

  • Number of ticks: Declare the number of desired ticks as an integer
  • List of tick positions: An explicit list defining the list of positions at which to draw a tick
  • List of tick positions and labels: A list of tuples of the form (position, label)

(curve.relabel('N ticks (xticks=10)').opts(xticks=10) +
 curve.relabel('Listed ticks (xticks=[0, 50, 100])').opts(xticks=[0, 50, 100]) +
 curve.relabel("Tick labels (xticks=[(0, 'zero'), ...").opts(xticks=[(0, 'zero'), (50, 'fifty'), (100, 'one hundred')]))
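When there are more than a handful of ticks, the (position, label) form is usually easier to generate than to type out. A minimal sketch, with an arbitrary label text chosen for illustration:

```python
# Build (position, label) pairs for the x-axis; the label text is arbitrary.
positions = range(0, 101, 25)
xticks = [(pos, f'step {pos}') for pos in positions]
print(xticks)
# [(0, 'step 0'), (25, 'step 25'), (50, 'step 50'), (75, 'step 75'), (100, 'step 100')]
```

The resulting list can then be supplied through the option shown above, e.g. `curve.opts(xticks=xticks)`.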

Lastly each extension will accept the custom Ticker objects the library provides, which can be used to achieve layouts not usually available.

Tick formatters

Tick formatting works very differently in different backends; however, the xformatter and yformatter options try to minimize these differences. Tick formatters may be defined in one of the following formats:

  • A classic format string such as '%d' or '%.3f', which may also contain other characters ('$%.2f')
  • A bokeh.models.TickFormatter in bokeh and a matplotlib.ticker.Formatter instance in matplotlib

Here is a small example demonstrating how to use the string approach:

curve.relabel('Tick formatters').opts(xformatter='%.0f days', yformatter='$%.2f', width=500)
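To preview roughly what such templates produce, they can be applied to a few values with Python's own printf-style `%` operator. This is only a stand-in: the backends apply the template with their own formatting code, so rounding and edge cases may differ from what is printed here.

```python
# printf-style templates like the ones passed to xformatter and yformatter.
x_template = '%.0f days'
y_template = '$%.2f'

print(x_template % 40)       # 40 days
print(y_template % 3.14159)  # $3.14
print(y_template % -7.5)     # $-7.50
```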

Tick orientation

Particularly when dealing with categorical axes it is often useful to control the tick rotation. This can be achieved using the xrotation and yrotation options which accept angles in degrees.

bars.opts(xrotation=45)

Twin axes

(Available in HoloViews >= 1.17, requires Bokeh >=3.2)

HoloViews now supports displaying overlays containing two different value dimensions as twin axes for chart elements. To maintain backwards compatibility, this feature is only enabled by setting the multi_y=True option on the overlay.

To illustrate, here is an overlay containing three curves with two value dimensions ('A' and 'B'). Setting multi_y=True then maps these two value dimensions to twin-axes:

overlay = hv.Curve([1, 2, 3], vdims=['A']) * hv.Curve([2, 3, 4], vdims=['A']) * hv.Curve([3, 2, 1], vdims=['B'])
overlay.opts(multi_y=True)

Additional value dimensions do map to additional axes but be aware that support of multi axes beyond twin axes is currently considered experimental.

The first value dimension is mapped to the left-hand axis and the second value dimension maps to the right axis. Note that the two axes are individually zoomable by hovering over them and using the Bokeh wheelzoom tool.
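The mapping from value dimensions to axes can be sketched as "distinct value dimensions, in order of first appearance". The helper below is a hypothetical plain-Python model of that rule, not HoloViews code; the `'extra-N'` names for axes beyond the second are invented for illustration, since support beyond twin axes is experimental.

```python
def assign_axes(vdims):
    """Map each distinct value dimension to an axis, in order of first appearance."""
    distinct = list(dict.fromkeys(vdims))  # ordered, duplicates removed
    sides = ['left', 'right']
    return {vdim: (sides[i] if i < 2 else f'extra-{i - 1}')
            for i, vdim in enumerate(distinct)}


# Value dimensions of the three curves in the overlay above.
print(assign_axes(['A', 'A', 'B']))  # {'A': 'left', 'B': 'right'}
```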

Supported multi_y options

When multi_y is enabled, you can set individual axis options on the elements of the overlay.

In this example, the left axis uses the default options while the right axis is an inverted, autoranged, log axis with a set ylim:

(hv.Curve([1, 2, 3], vdims=['A']) 
 * hv.Curve([2, 3, 4], vdims=['B']).opts(autorange='y', invert_yaxis=True, logy=True, ylim=(1,10), 
                                         ylabel='B custom', fontsize={'ylabel':10})
).opts(multi_y=True)

Supported options for customizing individual axes are apply_ranges, autorange='y', invert_yaxis, logy, ylim and yaxis, as well as the following options for labelling: labelled, ylabel and the 'ylabel' setting in fontsize.

Note that as of HoloViews 1.17.0, multi_y does not have streaming plot support, extra axis labels are not dynamic and only the RangeXY linked stream is aware of additional y-axes.

Subcoordinate y-axis

(Available in HoloViews >= 1.18)

HoloViews enables you to create overlays where each element has its own distinct y-axis subcoordinate system. To activate this feature, set the subcoordinate_y keyword to True for each overlay element; the default is False. When using subcoordinate_y=True, setting a label for each element is required for proper rendering and identification. With these settings the overlay elements are automatically distributed along the y-axis.

For more fine-grained control over y-axis positioning, you can specify a numerical 2-tuple for subcoordinate_y with values ranging from 0 to 1. Additionally, the subcoordinate_scale keyword, which defaults to 1, allows you to adjust the vertical scale of each element. This option is only applicable when subcoordinate_y=True. For example, setting a single Curve's subcoordinate_scale to 2 will result in it overlapping 50% with its adjacent elements.

x = np.linspace(0, 10*np.pi)

curves = [
    hv.Curve((x + i*np.pi/2, np.sin(x)), label=f'Line {i}').opts(subcoordinate_y=True, subcoordinate_scale=1.2)
    for i in range(3)
]

hv.Overlay(curves).opts(show_legend=False)
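The 50% overlap quoted for subcoordinate_scale=2 can be checked with a small geometric model. The sketch assumes each element is given a band centred on 0, 1, 2, ... whose height equals its scale; that layout is an assumption made for illustration and the helper names are hypothetical, so treat it as a way to reason about the option rather than a description of the internals.

```python
def subcoordinate_bands(n_elements, scale=1.0):
    """Return the (low, high) band of each element, centred on 0, 1, 2, ..."""
    half = scale / 2
    return [(i - half, i + half) for i in range(n_elements)]


def overlap(band_a, band_b):
    """Length of the overlap between two bands (0 if they do not overlap)."""
    return max(0.0, min(band_a[1], band_b[1]) - max(band_a[0], band_b[0]))


unit = subcoordinate_bands(3)                 # [(-0.5, 0.5), (0.5, 1.5), (1.5, 2.5)]
wide_middle = subcoordinate_bands(3, 2.0)[1]  # (0.0, 2.0)

# The doubled middle band covers half of each neighbour's unit-height band.
print(overlap(wide_middle, unit[0]), overlap(wide_middle, unit[2]))  # 0.5 0.5
```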