year-vs-climatology / hvplot_docs /04-Style_Mapping.md
ahuang11's picture
Upload 5 files
067c884 verified
|
raw
history blame
20.1 kB
# Style Mapping
```python
import numpy as np
import holoviews as hv
from holoviews import dim, opts
hv.extension('bokeh')
```
One of the major benefits of HoloViews is the fact that Elements are simple, declarative wrappers around your data, with clearly defined semantics describing how the dimensions of the data map to the screen. Usually the key dimensions (kdims) and value dimensions map to coordinates of the plot axes and/or the colormapped intensity. However there are a huge number of ways to augment the visual representation of an element by mapping dimensions to visual attributes. In this section we will explore how we can declare such mappings including complex transforms specified by so called ``dim`` objects.
To illustrate this point let us create a set of three points with x/y-coordinates and alpha, color, marker and size values and then map each of those value dimensions to a visual attribute by name. Note that by default kdims is x,y. However, in this example we also show that the names of the dimensions can be changed and we use 'x values' and 'y values' to represent the data series names.
```python
data = {
'x values': [0, 1, 0.5],
'y values': [1, 0, 0.5],
'alpha': [0.5, 1, 0.3],
'color': ['red', 'blue', 'green'],
'marker': ['circle', 'triangle', 'diamond'],
'size': [15, 25, 40]
}
opts.defaults(opts.Points(size=8, line_color='black'))
hv.Points(data, kdims=['x values','y values'] , vdims=['alpha', 'color', 'marker', 'size']).opts(
alpha='alpha', color='color', marker='marker', size='size')
```
This is the simplest approach to style mapping, dimensions can be mapped to visual attributes directly by name. However often columns in the data will not directly map to a visual property, e.g. we might want to normalize values before mapping them to the alpha, or apply a scaling factor to some values before mapping them to the point size; this is where ``dim`` transforms come in. Below are a few examples of using ``dim`` transforms to map a dimension in the data to the visual style in the plot:
```python
points = hv.Points(np.random.rand(400, 4))
bins = [0, .25, 0.5, .75, 1]
labels = ['circle', 'triangle', 'diamond', 'square']
layout = hv.Layout([
points.relabel('Alpha' ).opts(alpha =dim('x').norm()),
points.relabel('Angle' ).opts(angle =dim('x').norm()*360, marker='dash'),
points.relabel('Color' ).opts(color =dim('x')),
points.relabel('Marker').opts(marker=dim('x').bin(bins, labels)),
points.relabel('Size' ).opts(size =dim('x')*10)
])
layout.opts(opts.Points(width=250, height=250, xaxis=None, yaxis=None)).cols(5)
```
### What are dim transforms?
In the above example we saw how to use an ``dim`` to define a transform from a dimension in your data to the visual property on screen. A ``dim`` therefore is a simple way to declare a deferred transform of your data. In the simplest case an ``dim`` simply returns the data for a dimension without transforming it, e.g. to look up the ``'alpha'`` dimension on the points object we can create an ``dim`` and use the ``apply`` method to evaluate the expression:
```python
from holoviews import dim
ds = hv.Dataset(np.random.rand(10, 4)*10, ['x', 'y'], ['alpha', 'size'])
dim('alpha').apply(ds)
```
#### Mathematical operators
An ``dim`` declaration allow arbitrary mathematical operations to be performed, e.g. let us declare that we want to subtract 5 from the 'alpha' dimension and then compute the ``min``:
```python
math_op = (dim('alpha')-5).min()
math_op
```
Printing the repr of the ``math_op`` we can see that it builds up an nested expression. To see the transform in action we will once again ``apply`` it on the points:
```python
math_op.apply(ds)
```
``dim`` objects implement most of the NumPy API, supports all standard mathematical operators and also support NumPy ufuncs.
#### Custom functions
In addition to standard mathematical operators it is also possible to declare custom functions which can be applied by name. By default HoloViews ships with three commonly useful functions.
##### **``norm``**
Unity based normalization or features scaling normalizing the values to a range between 0-1 (optionally accepts ``min``/``max`` values as ``limits``, which are usually provided by the plotting system) using the expression:
(values - min) / (max-min)
for example we can rescale the alpha values into a 0-1 range:
```python
dim('alpha').norm().apply(ds)
```
##### **``bin``**
Bins values using the supplied bins specified as the edges of each bin:
```python
bin_op = dim('alpha').bin([0, 5, 10])
bin_op.apply(ds)
```
It is also possible to provide explicit labels for each bin which will replace the bin center value:
```python
dim('alpha').bin([0, 5, 10], ['Bin 1', 'Bin 2']).apply(ds)
```
##### **``categorize``**
Maps a number of discrete values onto the supplied list of categories, e.g. having binned the data into 2 discrete bins we can map them to two discrete marker types 'circle' and 'triangle':
```python
dim(bin_op).categorize({2.5: 'circle', 7.5: 'square'}).apply(ds)
```
This can be very useful to map discrete categories to markers or colors.
### Style mapping with ``dim`` transforms
This allows a huge amount of flexibility to express how the data should be mapped to visual style without directly modifying the data. To demonstrate this we will use some of the more complex:
```python
points.opts(
alpha =(dim('x')+0.2).norm(),
angle =np.sin(dim('y'))*360,
color =dim('x')**2,
marker=dim('y').bin(bins, labels),
size =dim('x')**dim('y')*20, width=500, height=500)
```
Let's summarize the style transforms we have applied:
* **alpha**=``(dim('x')+0.2).norm()``: The alpha are mapped to the x-values offset by 0.2 and normalized.
* **angle**=``np.sin(dim('x'))*360``: The angle of each marker is the sine of the y-values, multiplied by 360
* **color**=``'x'``: The points are colormapped by square of their x-values.
* **marker**=``dim('y').bin(bins, labels)``: The y-values are binned and each bin is assignd a unique marker.
* **size**=``dim('x')**dim('y')*20``: The size of the points is mapped to the x-values exponentiated with the y-values and scaled by 20
These are simply illustrative examples, transforms can be chained in arbitrarily complex ways to achieve almost any mapping from dimension values to visual style.
## Colormapping
Color cycles and styles are useful for categorical plots and when overlaying multiple subsets, but when we want to map data values to a color it is better to use HoloViews' facilities for color mapping. Certain image-like types will apply colormapping automatically; e.g. for ``Image``, ``QuadMesh`` or ``HeatMap`` types the first value dimension is automatically mapped to the color. In other cases the values to colormap can be declared by providing a ``color`` style option that specifies which dimension to map into the color value.
#### Named colormaps
HoloViews accepts colormaps specified either as an explicit list of hex or HTML colors, as a Matplotlib colormap object, or as the name of a bokeh, matplotlib, and colorcet palettes/colormap (which are available when the respective library is imported). The named colormaps available are listed here (suppressing the `_r` versions) and illustrated in detail in the separate [Colormaps](Colormaps.ipynb) user guide:
```python
def format_list(l):
print(' '.join(sorted([k for k in l if not k.endswith('_r')])))
format_list(hv.plotting.list_cmaps())
```
To use one of these colormaps simply refer to it by name with the ``cmap`` style option:
```python
ls = np.linspace(0, 10, 400)
xx, yy = np.meshgrid(ls, ls)
bounds=(-1,-1,1,1) # Coordinate system: (left, bottom, right, top)
img = hv.Image(np.sin(xx)*np.cos(yy), bounds=bounds).opts(colorbar=True, width=400)
img.relabel('PiYG').opts(cmap='PiYG') + img.relabel('PiYG_r').opts(cmap='PiYG_r')
```
#### Custom colormaps
You can make your own custom colormaps by providing a list of hex colors:
```python
img.relabel('Listed colors').opts(cmap=['#0000ff', '#8888ff', '#ffffff', '#ff8888', '#ff0000'], colorbar=True, width=400)
```
#### Discrete color levels
Lastly, existing colormaps can be made discrete by defining an integer number of ``color_levels``:
```python
img.relabel('5 color levels').opts(cmap='PiYG', color_levels=5) + img.relabel('11 color levels').opts(cmap='PiYG', color_levels=11)
```
### Explicit color mapping
Some elements work through implicit colormapping, the prime example being the ``Image`` type. However, other elements can be colormapped using style mapping instead, by setting the color to an existing dimension.
#### Continuous values
If we provide a continuous value for the ``color`` style option along with a continuous colormap, we can also enable a ``colorbar``:
```python
polygons = hv.Polygons([{('x', 'y'): hv.Ellipse(0, 0, (i, i)).array(), 'z': i} for i in range(1, 10)[::-1]], vdims='z')
polygons.opts(color='z', colorbar=True, width=380)
```
#### Categorical values
Conversely, when mapping a categorical value into a set of colors, we automatically get a legend (which can be disabled using the ``show_legend`` option):
```python
categorical_points = hv.Points((np.random.rand(100),
np.random.rand(100),
np.random.choice(list('ABCD'), 100)), vdims='Category')
categorical_points.sort('Category').opts(
color='Category', cmap='Category20', size=8, legend_position='left', width=500)
```
#### Explicit color mapping
Instead of using a listed colormap, you can provide an explicit mapping from category to color. Here we will map the categories 'A', 'B', 'C' and 'D' to specific colors:
```python
explicit_mapping = {'A': 'blue', 'B': 'red', 'C': 'green', 'D': 'purple'}
categorical_points.sort('Category').opts(color='Category', cmap=explicit_mapping, size=8)
```
#### Custom color intervals
In addition to a simple integer defining the number of discrete levels, the ``color_levels`` option also allows defining a set of custom intervals. This can be useful for defining a fixed scale, such as the Saffir-Simpson hurricane wind scale. Below we declare the color levels along with a list of colors, declaring the scale. Note that the levels define the intervals to map each color to, so if there are N colors we have to define N+1 levels.
Having defined the scale we can generate a theoretical hurricane path with wind speed values and use the ``color_levels`` and ``cmap`` to supply the custom color scale:
```python
levels = [0, 38, 73, 95, 110, 130, 156, 999]
colors = ['#5ebaff', '#00faf4', '#ffffcc', '#ffe775', '#ffc140', '#ff8f20', '#ff6060']
path = [
(-75.1, 23.1, 0), (-76.2, 23.8, 0), (-76.9, 25.4, 0), (-78.4, 26.1, 39), (-79.6, 26.2, 39),
(-80.3, 25.9, 39), (-82.0, 25.1, 74), (-83.3, 24.6, 74), (-84.7, 24.4, 96), (-85.9, 24.8, 111),
(-87.7, 25.7, 111), (-89.2, 27.2, 131), (-89.6, 29.3, 156), (-89.6, 30.2, 156), (-89.1, 32.6, 131),
(-88.0, 35.6, 111), (-85.3, 38.6, 96)
]
hv.Path([path], vdims='Wind Speed').opts(
color='Wind Speed', color_levels=levels, cmap=colors, line_width=8, colorbar=True, width=450
)
```
#### Setting color ranges
For an image-like element, color ranges are determined by the range of the `z` value dimension, and they can thus be controlled using the ``.redim.range`` method with `z`. As an example, let's set some values in the image array to NaN and then set the range to clip the data at 0 and 0.9. By declaring the ``clipping_colors`` option we can control what colors are used for NaN values and for values above and below the defined range:
```python
clipping = {'min': 'red', 'max': 'green', 'NaN': 'gray'}
options = dict(cmap='Blues', colorbar=True, width=300, height=230, axiswise=True)
arr = np.sin(xx)*np.cos(yy)
arr[:190, :127] = np.nan
original = hv.Image(arr, bounds=bounds).opts(**options)
colored = original.opts(clipping_colors=clipping, clone=True)
clipped = colored.redim.range(z=(0, 0.9))
original + colored + clipped
```
By default (left plot above), the min and max values in the array map to the first color (white) and last color (dark blue) in the colormap, and NaNs are ``'transparent'`` (an RGBA tuple of (0, 0, 0, 0)), revealing the underlying plot background. When the specified `clipping_colors` are supplied (middle plot above), NaN values are now colored gray, but the plot is otherwise the same because the autoranging still ensures that no value is mapped outside the available color range. Finally, when the `z` range is reduced (right plot above), the color range is mapped from a different range of numerical `z` values, and some values now fall outside the range and are thus clipped to red or green as specified.
#### Normalization modes
When using a colormap, there are three available color normalization or `cnorm` options to determine how numerical values are mapped to the range of colors in the colorbar:
* `linear`: Simple linear mapping (used by default)
* `log`: Logarithmic mapping
* `eq_hist`: Histogram-equalized mapping
The following cell defines an `Image` containing random samples drawn from a normal distribution (mean of 3) with a square of constant value 100 in the middle, shown with the three `cnorm` modes:
```python
np.random.seed(42)
data = np.random.normal(loc=3, scale=0.3, size=(100,100))
print("Mean value of random samples is {mean:.3f}, ".format(mean=np.mean(data))
+ "which is much lower\nthan the black square in the center (value 100).")
data[45:55,45:55] = 100
imopts=dict(colorbar=True, xaxis='bare', yaxis='bare', height=160, width=200)
pattern = hv.Image(data)
( pattern.options(cnorm='linear', title='linear', **imopts)
+ pattern.options(cnorm='log', title='log', **imopts)
+ pattern.options(cnorm='eq_hist', title='eq_hist', **imopts))
```
The `'linear'` mode is very easy to interpret numerically, with colors mapped to numerical values linearly as indicated. However, as you can see in this case, high-value outliers like the square here can make it difficult to see any structure in the remaining values. The Gaussian noise values all map to the first few colors at the bottom of the colormap, resulting in a background that is almost uniformly yellow even though we know the data includes a variety of different values in the background area.
In the `'log'` mode, the random values are a little easier to see but these samples still use a small portion of the colormap. Logarithmic colormaps are most useful when you know that you are plotting data with an approximately logarithmic distribution.
In the `'eq_hist'` mode, colors are nonlinearly mapped according to the actual distribution of values in the plot, such that each color in the colormap represents an approximately equal number of values in the plot (here with few or no colors reserved for the nearly empty range between 10 and 100). In this mode both the outliers and the overall low-amplitude noise can be seen clearly, but the non-linear distortion can make the colors more difficult to interpret as numerical values.
When working with unknown data distributions, it is often a good idea to try all three of these modes, using `eq_hist` to be sure that you are seeing all of the patterns in the data, then either `log` or `linear` (depending on which one is a better match to your distribution) with the values clipped to the range of values you want to show.
## Other colormapping options
* ``clim_percentile``: Percentile value to compute colorscale robust to outliers. If `True`, uses 2nd and 98th percentile; otherwise uses the specified percentile value.
* ``cnorm``: Color normalization to be applied during colormapping. Allows switching between 'linear', 'log', and 'eq_hist'.
* ``logz``: Enable logarithmic color scale (same as `cnorm='log'`; to be deprecated at some point)
* ``symmetric``: Ensures that the color scale is centered on zero (e.g. ``symmetric=True``)
## Cycles and Palettes
Frequently we want to plot multiple subsets of data, which is made easy by using ``Overlay`` and ``NdOverlay`` objects. When overlaying multiple elements of the same type they will need to be distinguished visually, and HoloViews provides two mechanisms for styling the different subsets automatically in those cases:
* ``Cycle``: A Cycle defines a list of discrete styles
* ``Palette``: A Palette defines a continuous color space which will be sampled discretely
### Cycle
A ``Cycle`` can be applied to any of the style options on an element. By default, most elements define a ``Cycle`` on the color property. Here we will create an overlay of three ``Points`` objects using the default cycles, then display it using the default cycles along with a copy where we changed the dot color and size using a custom ``Cycle``:
```python
points = (
hv.Points(np.random.randn(50, 2) ) *
hv.Points(np.random.randn(50, 2) + 1 ) *
hv.Points(np.random.randn(50, 2) * 0.5)
)
color_cycle = hv.Cycle(['red', 'green', 'blue'])
points + points.opts(opts.Points(color=color_cycle), clone=True)
```
Here color has been changed to cycle over the three provided colors, while size has been specified as a constant (though a cycle like `hv.Cycle([2,5,10])` could just as easily have been used for the size as well).
#### Defaults
In addition to defining custom color cycles by explicitly defining a list of colors, ``Cycle`` also defines a list of default Cycles generated from bokeh Palettes and matplotlib colormaps:
```python
format_list(hv.Cycle.default_cycles.keys())
```
(Here some of these Cycles have a reversed variant ending in `_r` that is not shown.)
To use one of these default Cycles simply construct the Cycle with the corresponding key:
```python
xs = np.linspace(0, np.pi*2)
curves = hv.Overlay([hv.Curve(np.sin(xs+p)) for p in np.linspace(0, np.pi, 10)])
curves.opts(opts.Curve(color=hv.Cycle('Category20'), width=600))
```
#### Markers and sizes
The above examples focus on color Cycles, but Cycles may be used to define any style option. Here let's use them to cycle over a number of marker styles and sizes, which will be expanded by cycling over each item independently. In this case we are cycling over three Cycles, resulting in the following style combinations:
1. ``{'color': '#30a2da', 'marker': 'x', 'size': 10}``
2. ``{'color': '#fc4f30', 'marker': '^', 'size': 5}``
3. ``{'color': '#e5ae38', 'marker': '+', 'size': 10}``
```python
color = hv.Cycle(['#30a2da', '#fc4f30', '#e5ae38'])
markers = hv.Cycle(['x', '^', '+'])
sizes = hv.Cycle([10, 5])
points.opts(opts.Points(line_color=color, marker=markers, size=sizes))
```
### Palettes
Palettes are similar to cycles, but treat a set of colors as a continuous colorspace to be sampled at regularly spaced intervals. Again they are made automatically available from existing colormaps (with `_r` versions also available):
```python
format_list(hv.Palette.colormaps.keys())
```
(Here each colormap `X` has a corresponding version `X_r` with the values reversed; the `_r` variants are suppressed above.)
As a simple example we will create a Palette from the Spectral colormap and apply it to an Overlay of 6 Ellipses. Comparing it to the Spectral ``Cycle`` we can immediately see that the Palette covers the entire color space spanned by the Spectral colormap, while the Cycle instead uses the first 6 colors of the Spectral colormap:
```python
ellipses = hv.Overlay([hv.Ellipse(0, 0, s) for s in range(6)])
ellipses.relabel('Palette').opts(opts.Ellipse(color=hv.Palette('Spectral'), line_width=5), clone=True) +\
ellipses.relabel('Cycle' ).opts(opts.Ellipse(color=hv.Cycle( 'Spectral'), line_width=5), clone=True)
```
Thus if you want to have have a discrete set of distinguishable colors starting from a list of colors that vary slowly and continuously, you should usually supply it as a Palette, not a Cycle. Conversely, you should use a Cycle when you want to iterate through a specific list of colors, in order, without skipping around the list like a Palette will.