
Style Mapping

import numpy as np
import holoviews as hv
from holoviews import dim, opts

hv.extension('bokeh')

One of the major benefits of HoloViews is that Elements are simple, declarative wrappers around your data, with clearly defined semantics describing how the dimensions of the data map to the screen. Usually the key dimensions (kdims) and value dimensions (vdims) map to coordinates of the plot axes and/or the colormapped intensity. However, there are a huge number of ways to augment the visual representation of an element by mapping dimensions to visual attributes. In this section we will explore how to declare such mappings, including complex transforms specified by so-called dim objects.

To illustrate this point, let us create a set of three points with x/y-coordinates and alpha, color, marker and size values, and then map each of those value dimensions to a visual attribute by name. Note that by default the kdims are named x and y. However, in this example we also show that the dimension names can be changed, and we use 'x values' and 'y values' as the names of the data series.

data = {
    'x values': [0, 1, 0.5],
    'y values': [1, 0, 0.5],
    'alpha': [0.5, 1, 0.3],
    'color': ['red', 'blue', 'green'],
    'marker': ['circle', 'triangle', 'diamond'],
    'size': [15, 25, 40]
}

opts.defaults(opts.Points(size=8, line_color='black'))

hv.Points(data, kdims=['x values','y values'] , vdims=['alpha', 'color', 'marker', 'size']).opts(
    alpha='alpha', color='color', marker='marker', size='size')

This is the simplest approach to style mapping: dimensions can be mapped to visual attributes directly by name. However, columns in the data often will not map directly to a visual property, e.g. we might want to normalize values before mapping them to the alpha, or apply a scaling factor to some values before mapping them to the point size. This is where dim transforms come in. Below are a few examples of using dim transforms to map a dimension in the data to the visual style in the plot:

points = hv.Points(np.random.rand(400, 4))

bins   = [0, .25, 0.5, .75, 1]
labels = ['circle', 'triangle', 'diamond', 'square']

layout = hv.Layout([
    points.relabel('Alpha' ).opts(alpha =dim('x').norm()),
    points.relabel('Angle' ).opts(angle =dim('x').norm()*360, marker='dash'),
    points.relabel('Color' ).opts(color =dim('x')),
    points.relabel('Marker').opts(marker=dim('x').bin(bins, labels)),
    points.relabel('Size'  ).opts(size  =dim('x')*10)
])

layout.opts(opts.Points(width=250, height=250, xaxis=None, yaxis=None)).cols(5)

What are dim transforms?

In the above example we saw how to use a dim to define a transform from a dimension in your data to a visual property on screen. A dim is therefore a simple way to declare a deferred transform of your data. In the simplest case a dim simply returns the data for a dimension without transforming it, e.g. to look up the 'alpha' dimension on a Dataset we can create a dim and use the apply method to evaluate the expression:

from holoviews import dim

ds = hv.Dataset(np.random.rand(10, 4)*10, ['x', 'y'], ['alpha', 'size'])

dim('alpha').apply(ds)

Mathematical operators

A dim declaration allows arbitrary mathematical operations to be performed, e.g. let us declare that we want to subtract 5 from the 'alpha' dimension and then compute the min:

math_op = (dim('alpha')-5).min()
math_op

Printing the repr of math_op, we can see that it builds up a nested expression. To see the transform in action we will once again apply it to the dataset:

math_op.apply(ds)
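To make the idea of a deferred, nested expression more concrete, here is a minimal pure-Python sketch. It is not how HoloViews implements dim: the class name `DeferredColumn` and its methods are hypothetical and exist only for illustration. The sketch records operations when they are declared and evaluates them only when `apply` is called on some data.

```python
class DeferredColumn:
    """Toy stand-in for a deferred transform (hypothetical, not the HoloViews dim class)."""

    def __init__(self, name, ops=()):
        self.name = name        # column to look up at apply time
        self.ops = tuple(ops)   # recorded (label, function) pairs, applied in order

    def _extend(self, label, func):
        # Return a new object so that existing expressions are never mutated
        return DeferredColumn(self.name, self.ops + ((label, func),))

    def __sub__(self, other):
        return self._extend(f"- {other}", lambda values: [v - other for v in values])

    def min(self):
        return self._extend("min()", min)

    def __repr__(self):
        # Rebuild a readable form of the recorded operations
        expr = f"col({self.name!r})"
        for label, _ in self.ops:
            expr = f"({expr} {label})" if label.startswith("-") else f"{expr}.{label}"
        return expr

    def apply(self, data):
        # Nothing is computed until this point
        values = data[self.name]
        for _, func in self.ops:
            values = func(values)
        return values


table = {"alpha": [7.5, 2.0, 9.0]}
expr = (DeferredColumn("alpha") - 5).min()
print(expr)               # the recorded expression, not a result
print(expr.apply(table))  # evaluated against the data
```

The same expression object can be applied to any table that has an 'alpha' column, which is what makes a deferred declaration reusable across elements.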

dim objects implement most of the NumPy API, support all standard mathematical operators, and also support NumPy ufuncs.

Custom functions

In addition to standard mathematical operators it is also possible to declare custom functions which can be applied by name. By default HoloViews ships with three commonly useful functions.

norm

Unity-based normalization (feature scaling), which rescales the values to the range 0-1 (optionally accepting min/max values as limits, which are usually provided by the plotting system) using the expression:

(values - min) / (max-min)

For example, we can rescale the alpha values into a 0-1 range:

dim('alpha').norm().apply(ds)

bin

Bins values using the supplied bins specified as the edges of each bin:

bin_op = dim('alpha').bin([0, 5, 10])

bin_op.apply(ds)

It is also possible to provide explicit labels for each bin which will replace the bin center value:

dim('alpha').bin([0, 5, 10], ['Bin 1', 'Bin 2']).apply(ds)

categorize

Maps a number of discrete values onto the supplied categories, e.g. having binned the data into two discrete bins, we can map the bin centers to two discrete marker types, 'circle' and 'square':

dim(bin_op).categorize({2.5: 'circle', 7.5: 'square'}).apply(ds)

This can be very useful to map discrete categories to markers or colors.
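As a rough illustration of what the three functions compute, the following plain-Python sketch mirrors them on a small list. It is a simplified stand-in rather than the HoloViews implementation: the function names `bin_values` and the handling of bin edges (open on the left, closed on the right) are assumptions made for this example.

```python
def norm(values, vmin=None, vmax=None):
    """Rescale values to 0-1 using (values - min) / (max - min)."""
    vmin = min(values) if vmin is None else vmin
    vmax = max(values) if vmax is None else vmax
    return [(v - vmin) / (vmax - vmin) for v in values]


def bin_values(values, edges, labels=None):
    """Replace each value with the label (default: center) of the bin containing it."""
    if labels is None:
        labels = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    binned = []
    for v in values:
        match = None  # stays None when the value falls outside every bin
        for lo, hi, label in zip(edges[:-1], edges[1:], labels):
            # Assumption: bins are open on the left and closed on the right
            if lo < v <= hi:
                match = label
        binned.append(match)
    return binned


def categorize(values, categories, default=None):
    """Map discrete values onto categories, falling back to a default."""
    return [categories.get(v, default) for v in values]


alpha = [1.0, 4.0, 6.0, 9.0]
normalized = norm(alpha)
centers = bin_values(alpha, [0, 5, 10])
named = bin_values(alpha, [0, 5, 10], ['Bin 1', 'Bin 2'])
markers = categorize(centers, {2.5: 'circle', 7.5: 'square'})
print(normalized, centers, named, markers, sep='\n')
```

Chaining the last two steps is the same pattern as the categorize example above: the bin centers 2.5 and 7.5 act as the discrete keys that the category mapping looks up.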

Style mapping with dim transforms

This allows a huge amount of flexibility to express how the data should be mapped to visual style without directly modifying the data. To demonstrate this we will combine several of the more complex transforms:

points.opts(
    alpha =(dim('x')+0.2).norm(),
    angle =np.sin(dim('y'))*360,
    color =dim('x')**2,
    marker=dim('y').bin(bins, labels),
    size  =dim('x')**dim('y')*20, width=500, height=500)

Let's summarize the style transforms we have applied:

  • alpha=(dim('x')+0.2).norm(): The alpha values are mapped to the x-values offset by 0.2 and then normalized.
  • angle=np.sin(dim('y'))*360: The angle of each marker is the sine of the y-values, multiplied by 360.
  • color=dim('x')**2: The points are colormapped by the square of their x-values.
  • marker=dim('y').bin(bins, labels): The y-values are binned and each bin is assigned a unique marker.
  • size=dim('x')**dim('y')*20: The size of the points is mapped to the x-values raised to the power of the y-values and scaled by 20.

These are simply illustrative examples; transforms can be chained in arbitrarily complex ways to achieve almost any mapping from dimension values to visual style.

Colormapping

Color cycles and styles are useful for categorical plots and when overlaying multiple subsets, but when we want to map data values to a color it is better to use HoloViews' facilities for color mapping. Certain image-like types will apply colormapping automatically; e.g. for Image, QuadMesh or HeatMap types the first value dimension is automatically mapped to the color. In other cases the values to colormap can be declared by providing a color style option that specifies which dimension to map into the color value.

Named colormaps

HoloViews accepts colormaps specified either as an explicit list of hex or HTML colors, as a Matplotlib colormap object, or as the name of a bokeh, matplotlib, or colorcet palette/colormap (available when the respective library is imported). The named colormaps available are listed here (suppressing the _r versions) and illustrated in detail in the separate Colormaps user guide:

def format_list(l):
    print(' '.join(sorted([k for k in l if not k.endswith('_r')])))

format_list(hv.plotting.list_cmaps())

To use one of these colormaps simply refer to it by name with the cmap style option:

ls = np.linspace(0, 10, 400)
xx, yy = np.meshgrid(ls, ls)
bounds=(-1,-1,1,1)   # Coordinate system: (left, bottom, right, top)
img = hv.Image(np.sin(xx)*np.cos(yy), bounds=bounds).opts(colorbar=True, width=400)

img.relabel('PiYG').opts(cmap='PiYG') + img.relabel('PiYG_r').opts(cmap='PiYG_r')

Custom colormaps

You can make your own custom colormaps by providing a list of hex colors:

img.relabel('Listed colors').opts(cmap=['#0000ff', '#8888ff', '#ffffff', '#ff8888', '#ff0000'], colorbar=True, width=400)

Discrete color levels

Lastly, existing colormaps can be made discrete by defining an integer number of color_levels:

img.relabel('5 color levels').opts(cmap='PiYG', color_levels=5) + img.relabel('11 color levels').opts(cmap='PiYG', color_levels=11) 
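Conceptually, an integer number of color levels splits the color range into that many bands, and every value inside a band receives the same color. The sketch below shows this quantization step in plain Python, assuming equal-width bands across the value range; the function name `level_index` is hypothetical and the plotting backend may compute its bands differently.

```python
def level_index(value, vmin, vmax, n_levels):
    """Index of the equal-width color level that `value` falls into."""
    fraction = (value - vmin) / (vmax - vmin)
    # The top of the range belongs to the last level rather than a new one
    return min(int(fraction * n_levels), n_levels - 1)


# Five levels over the range -1 to 1, as for the sine/cosine image above
indices = [level_index(v, -1.0, 1.0, 5) for v in (-1.0, -0.5, 0.0, 0.5, 1.0)]
print(indices)
```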

Explicit color mapping

Some elements work through implicit colormapping, the prime example being the Image type. However, other elements can be colormapped using style mapping instead, by setting the color to an existing dimension.

Continuous values

If we provide a continuous value for the color style option along with a continuous colormap, we can also enable a colorbar:

polygons = hv.Polygons([{('x', 'y'): hv.Ellipse(0, 0, (i, i)).array(), 'z': i} for i in range(1, 10)[::-1]], vdims='z')

polygons.opts(color='z', colorbar=True, width=380)

Categorical values

Conversely, when mapping a categorical value into a set of colors, we automatically get a legend (which can be disabled using the show_legend option):

categorical_points = hv.Points((np.random.rand(100), 
                                np.random.rand(100), 
                                np.random.choice(list('ABCD'), 100)), vdims='Category')

categorical_points.sort('Category').opts(
    color='Category', cmap='Category20', size=8, legend_position='left', width=500)

Explicit color mapping

Instead of using a listed colormap, you can provide an explicit mapping from category to color. Here we will map the categories 'A', 'B', 'C' and 'D' to specific colors:

explicit_mapping = {'A': 'blue', 'B': 'red', 'C': 'green', 'D': 'purple'}

categorical_points.sort('Category').opts(color='Category', cmap=explicit_mapping, size=8)

Custom color intervals

In addition to a simple integer defining the number of discrete levels, the color_levels option also allows defining a set of custom intervals. This can be useful for defining a fixed scale, such as the Saffir-Simpson hurricane wind scale. Below we declare the color levels along with a list of colors, declaring the scale. Note that the levels define the intervals to map each color to, so if there are N colors we have to define N+1 levels.

Having defined the scale we can generate a theoretical hurricane path with wind speed values and use the color_levels and cmap to supply the custom color scale:

levels = [0, 38, 73, 95, 110, 130, 156, 999]  
colors = ['#5ebaff', '#00faf4', '#ffffcc', '#ffe775', '#ffc140', '#ff8f20', '#ff6060']

path = [
    (-75.1, 23.1, 0),   (-76.2, 23.8, 0),   (-76.9, 25.4, 0),   (-78.4, 26.1, 39),  (-79.6, 26.2, 39),
    (-80.3, 25.9, 39),  (-82.0, 25.1, 74),  (-83.3, 24.6, 74),  (-84.7, 24.4, 96),  (-85.9, 24.8, 111),
    (-87.7, 25.7, 111), (-89.2, 27.2, 131), (-89.6, 29.3, 156), (-89.6, 30.2, 156), (-89.1, 32.6, 131),
    (-88.0, 35.6, 111), (-85.3, 38.6, 96)
]

hv.Path([path], vdims='Wind Speed').opts(
    color='Wind Speed', color_levels=levels, cmap=colors, line_width=8, colorbar=True, width=450
)
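The relationship between the N+1 levels and the N colors can be checked with a small lookup in plain Python. This is only a sketch of the interval logic: the helper name `color_for` is hypothetical, and which side of each interval is treated as closed is an assumption of the example rather than documented HoloViews behavior.

```python
from bisect import bisect_right

levels = [0, 38, 73, 95, 110, 130, 156, 999]
colors = ['#5ebaff', '#00faf4', '#ffffcc', '#ffe775', '#ffc140', '#ff8f20', '#ff6060']

assert len(levels) == len(colors) + 1  # N colors need N+1 levels


def color_for(value):
    """Look up the color of the interval containing `value`."""
    # Assumption: each interval includes its lower edge and excludes its upper edge
    index = bisect_right(levels, value) - 1
    # Clamp so that values on or beyond the outer edges reuse the end colors
    index = max(0, min(index, len(colors) - 1))
    return colors[index]


# The distinct wind speeds that occur in the path above
for speed in (0, 39, 74, 96, 111, 131, 156):
    print(speed, color_for(speed))
```

Each distinct wind speed in the example path lands in a different interval, so the path exercises all seven colors of the scale.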

Setting color ranges

For an image-like element, color ranges are determined by the range of the z value dimension, and they can thus be controlled using the .redim.range method with z. As an example, let's set some values in the image array to NaN and then set the range to clip the data at 0 and 0.9. By declaring the clipping_colors option we can control what colors are used for NaN values and for values above and below the defined range:

clipping = {'min': 'red', 'max': 'green', 'NaN': 'gray'}
options = dict(cmap='Blues', colorbar=True, width=300, height=230, axiswise=True)

arr = np.sin(xx)*np.cos(yy)
arr[:190, :127] = np.nan

original = hv.Image(arr, bounds=bounds).opts(**options)
colored  = original.opts(clipping_colors=clipping, clone=True)
clipped  = colored.redim.range(z=(0, 0.9))

original + colored + clipped

By default (left plot above), the min and max values in the array map to the first color (white) and last color (dark blue) in the colormap, and NaNs are 'transparent' (an RGBA tuple of (0, 0, 0, 0)), revealing the underlying plot background. When the specified clipping_colors are supplied (middle plot above), NaN values are now colored gray, but the plot is otherwise the same because the autoranging still ensures that no value is mapped outside the available color range. Finally, when the z range is reduced (right plot above), the color range is mapped from a different range of numerical z values, and some values now fall outside the range and are thus clipped to red or green as specified.
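The decision described above can be summarized as a small function. This is a plain-Python sketch of the logic only, with a hypothetical helper name `classify`; it reuses the same clipping dictionary and the range (0, 0.9) from the example.

```python
import math

clipping = {'min': 'red', 'max': 'green', 'NaN': 'gray'}


def classify(value, low, high):
    """Return the clipping color for `value`, or None if it is colormapped normally."""
    if math.isnan(value):
        return clipping['NaN']
    if value < low:
        return clipping['min']
    if value > high:
        return clipping['max']
    return None


results = [classify(v, 0, 0.9) for v in (float('nan'), -0.4, 0.5, 0.95)]
print(results)
```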

Normalization modes

When using a colormap, there are three available color normalization or cnorm options to determine how numerical values are mapped to the range of colors in the colorbar:

  • linear: Simple linear mapping (used by default)
  • log: Logarithmic mapping
  • eq_hist: Histogram-equalized mapping

The following cell defines an Image containing random samples drawn from a normal distribution (mean of 3) with a square of constant value 100 in the middle, shown with the three cnorm modes:

np.random.seed(42)
data = np.random.normal(loc=3, scale=0.3, size=(100,100))
print("Mean value of random samples is {mean:.3f}, ".format(mean=np.mean(data))
     + "which is much lower\nthan the black square in the center (value 100).")
data[45:55,45:55] = 100

imopts=dict(colorbar=True, xaxis='bare', yaxis='bare', height=160, width=200)
pattern = hv.Image(data)

(  pattern.options(cnorm='linear',  title='linear',  **imopts) 
 + pattern.options(cnorm='log',     title='log',     **imopts)
 + pattern.options(cnorm='eq_hist', title='eq_hist', **imopts))

The 'linear' mode is very easy to interpret numerically, with colors mapped to numerical values linearly as indicated. However, as you can see in this case, high-value outliers like the square here can make it difficult to see any structure in the remaining values. The Gaussian noise values all map to the first few colors at the bottom of the colormap, resulting in a background that is almost uniformly yellow even though we know the data includes a variety of different values in the background area.

In the 'log' mode, the random values are a little easier to see but these samples still use a small portion of the colormap. Logarithmic colormaps are most useful when you know that you are plotting data with an approximately logarithmic distribution.

In the 'eq_hist' mode, colors are nonlinearly mapped according to the actual distribution of values in the plot, such that each color in the colormap represents an approximately equal number of values in the plot (here with few or no colors reserved for the nearly empty range between 10 and 100). In this mode both the outliers and the overall low-amplitude noise can be seen clearly, but the non-linear distortion can make the colors more difficult to interpret as numerical values.

When working with unknown data distributions, it is often a good idea to try all three of these modes, using eq_hist to be sure that you are seeing all of the patterns in the data, then either log or linear (depending on which one is a better match to your distribution) with the values clipped to the range of values you want to show.
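To see numerically why the three modes behave so differently, the following sketch normalizes a handful of background-like values plus one outlier of 100. The function names are hypothetical, and `eq_hist_norm` is a simple rank-based stand-in for histogram equalization, not the binned algorithm a plotting backend would use.

```python
import math


def linear_norm(values):
    """Position of each value between the min and max on a linear scale."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def log_norm(values):
    """Same as linear_norm but on log-transformed values (positive values only)."""
    lo, hi = math.log(min(values)), math.log(max(values))
    return [(math.log(v) - lo) / (hi - lo) for v in values]


def eq_hist_norm(values):
    """Rank-based approximation: the fraction of the other samples lying below each value."""
    n = len(values)
    return [sum(1 for w in values if w < v) / (n - 1) for v in values]


samples = [2.6, 2.8, 3.0, 3.2, 3.4, 100.0]
for name, func in [('linear', linear_norm), ('log', log_norm), ('eq_hist', eq_hist_norm)]:
    print(name, [round(x, 3) for x in func(samples)])
```

On the linear scale all five background values fall within the lowest 1% of the color range, the log scale spreads them over somewhat more of it, and the rank-based mapping spaces them evenly.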

Other colormapping options

  • clim_percentile: Percentile value to compute colorscale robust to outliers. If True, uses 2nd and 98th percentile; otherwise uses the specified percentile value.
  • cnorm: Color normalization to be applied during colormapping. Allows switching between 'linear', 'log', and 'eq_hist'.
  • logz: Enable logarithmic color scale (same as cnorm='log'; to be deprecated at some point)
  • symmetric: Ensures that the color scale is centered on zero (e.g. symmetric=True)
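As a rough sketch of what two of these options mean for the color limits, the helpers below compute percentile-based limits and zero-centered limits in plain Python. The function names and the `lower`/`upper` parameters are hypothetical; the 2nd and 98th percentiles are the defaults mentioned above, and the exact percentile interpolation used by HoloViews may differ.

```python
from statistics import quantiles


def percentile_limits(values, lower=2, upper=98):
    """Color limits taken from two percentiles instead of the raw min/max."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile
    cuts = quantiles(values, n=100, method='inclusive')
    return cuts[lower - 1], cuts[upper - 1]


def symmetric_limits(low, high):
    """Widen (low, high) so that the range is centered on zero."""
    extent = max(abs(low), abs(high))
    return -extent, extent


values = list(range(100)) + [10_000]  # 0..99 plus one extreme outlier
print(percentile_limits(values))      # unaffected by the single outlier
print(symmetric_limits(-0.3, 1.2))
```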

Cycles and Palettes

Frequently we want to plot multiple subsets of data, which is made easy by using Overlay and NdOverlay objects. When overlaying multiple elements of the same type they will need to be distinguished visually, and HoloViews provides two mechanisms for styling the different subsets automatically in those cases:

  • Cycle: A Cycle defines a list of discrete styles
  • Palette: A Palette defines a continuous color space which will be sampled discretely

Cycle

A Cycle can be applied to any of the style options on an element. By default, most elements define a Cycle on the color property. Here we will create an overlay of three Points objects, then display it using the default cycles alongside a copy where we changed the dot color using a custom Cycle:

points = (
    hv.Points(np.random.randn(50, 2)      ) *
    hv.Points(np.random.randn(50, 2) + 1  ) *
    hv.Points(np.random.randn(50, 2) * 0.5)
)

color_cycle = hv.Cycle(['red', 'green', 'blue'])
points + points.opts(opts.Points(color=color_cycle), clone=True)

Here color has been changed to cycle over the three provided colors, while size remains the constant set in the defaults at the start of this guide (though a cycle like hv.Cycle([2,5,10]) could just as easily have been used for the size as well).

Defaults

In addition to defining custom color cycles by explicitly defining a list of colors, Cycle also defines a list of default Cycles generated from bokeh Palettes and matplotlib colormaps:

format_list(hv.Cycle.default_cycles.keys())

(Here some of these Cycles have a reversed variant ending in _r that is not shown.)

To use one of these default Cycles simply construct the Cycle with the corresponding key:

xs = np.linspace(0, np.pi*2)
curves = hv.Overlay([hv.Curve(np.sin(xs+p)) for p in np.linspace(0, np.pi, 10)])

curves.opts(opts.Curve(color=hv.Cycle('Category20'), width=600))

Markers and sizes

The above examples focus on color Cycles, but Cycles may be used to define any style option. Here let's use them to cycle over a number of marker styles and sizes, which will be expanded by cycling over each item independently. In this case we are cycling over three Cycles, resulting in the following style combinations:

  1. {'color': '#30a2da', 'marker': 'x', 'size': 10}
  2. {'color': '#fc4f30', 'marker': '^', 'size': 5}
  3. {'color': '#e5ae38', 'marker': '+', 'size': 10}

color = hv.Cycle(['#30a2da', '#fc4f30', '#e5ae38'])
markers = hv.Cycle(['x', '^', '+'])
sizes = hv.Cycle([10, 5])
points.opts(opts.Points(line_color=color, marker=markers, size=sizes))
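The way the three Cycles combine can be reproduced with the standard library. This sketch uses itertools rather than hv.Cycle, so it only illustrates the independent wrap-around behavior described above, not the HoloViews implementation.

```python
from itertools import cycle, islice

color = ['#30a2da', '#fc4f30', '#e5ae38']
markers = ['x', '^', '+']
sizes = [10, 5]

# Each list restarts on its own when exhausted, so the shorter one wraps first
combined = zip(cycle(color), cycle(markers), cycle(sizes))
styles = [
    {'color': c, 'marker': m, 'size': s}
    for c, m, s in islice(combined, 3)
]
for style in styles:
    print(style)
```

Because the size list has only two entries, the third element wraps back to a size of 10 while color and marker move on to their third entries.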

Palettes

Palettes are similar to cycles, but treat a set of colors as a continuous colorspace to be sampled at regularly spaced intervals. Again they are made automatically available from existing colormaps (with _r versions also available):

format_list(hv.Palette.colormaps.keys())

(Here each colormap X has a corresponding version X_r with the values reversed; the _r variants are suppressed above.)

As a simple example we will create a Palette from the Spectral colormap and apply it to an Overlay of 6 Ellipses. Comparing it to the Spectral Cycle we can immediately see that the Palette covers the entire color space spanned by the Spectral colormap, while the Cycle instead uses the first 6 colors of the Spectral colormap:

ellipses = hv.Overlay([hv.Ellipse(0, 0, s) for s in range(6)])

ellipses.relabel('Palette').opts(opts.Ellipse(color=hv.Palette('Spectral'), line_width=5), clone=True) +\
ellipses.relabel('Cycle'  ).opts(opts.Ellipse(color=hv.Cycle(  'Spectral'), line_width=5), clone=True)

Thus if you want to have a discrete set of distinguishable colors starting from a list of colors that vary slowly and continuously, you should usually supply it as a Palette, not a Cycle. Conversely, you should use a Cycle when you want to iterate through a specific list of colors, in order, without skipping around the list like a Palette will.
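The difference in how the two select colors can be sketched in a few lines of plain Python. The function names are hypothetical and the nearest-entry sampling is an assumption made for the example; an actual Palette may interpolate between colors rather than pick existing entries.

```python
def cycle_colors(colors, n):
    """Take the first n colors in order, wrapping around if n exceeds the list."""
    return [colors[i % len(colors)] for i in range(n)]


def palette_colors(colors, n):
    """Pick n colors at evenly spaced positions across the whole list."""
    if n == 1:
        return [colors[0]]
    last = len(colors) - 1
    return [colors[round(i * last / (n - 1))] for i in range(n)]


ramp = [f'c{i}' for i in range(11)]  # stand-in for an 11-color continuous colormap
print(cycle_colors(ramp, 6))         # stays within the first part of the ramp
print(palette_colors(ramp, 6))       # spans the ramp from end to end
```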