hvPlot provides one API to explore data of many different types. Previous sections have worked exclusively with tabular data stored in pandas (or pandas-like) DataFrames. The other most common type of data is the n-dimensional array. hvPlot aims to eventually support different array libraries, but for now it focuses on [xarray](https://xarray.pydata.org/en/stable/). xarray provides a convenient and very powerful wrapper for labeling the axes and coordinates of multi-dimensional (n-D) arrays. This user guide covers how to use ``xarray`` and ``hvplot`` to visualize and explore data of different dimensionality, ranging from simple 1D data, to 2D image-like data, to multi-dimensional cubes of data.

For these examples we’ll use the North American air temperature dataset:


```python
import xarray as xr
import hvplot.xarray  # noqa

air_ds = xr.tutorial.open_dataset('air_temperature').load()
air = air_ds.air
air_ds
```

## 1D Plots

Selecting the data at a particular lat/lon coordinate, we get a 1D dataset of air temperatures over time:


```python
air1d = air.sel(lat=40, lon=285)
air1d.hvplot()
```
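To see why the selection above yields a 1D plot, it can help to inspect the dimensions before and after ``sel``. The following is a minimal sketch on a small synthetic array; ``toy`` and its coordinate values are hypothetical stand-ins for the tutorial dataset, not part of it.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the tutorial data: random values on a tiny grid.
rng = np.random.default_rng(0)
toy = xr.DataArray(
    rng.normal(280.0, 10.0, size=(4, 3, 2)),
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.date_range("2013-01-01", periods=4, freq="D"),
        "lat": [60.0, 40.0, 20.0],
        "lon": [280.0, 285.0],
    },
    name="air",
)

# Selecting one lat/lon pair drops both dimensions; only time remains.
toy1d = toy.sel(lat=40, lon=285)
print(toy.dims)    # ('time', 'lat', 'lon')
print(toy1d.dims)  # ('time',)
```

The selected latitude and longitude survive as scalar coordinates on the result, which is part of the metadata xarray carries along with the values.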

Notice how the axes are already appropriately labeled, because xarray stores the metadata required. We can also further subselect the data and use `*` to overlay plots:


```python
air1d_sel = air1d.sel(time='2013-01')
air1d_sel.hvplot(color='purple') * air1d_sel.hvplot.scatter(marker='o', color='blue', size=15)
```


```python
air.lat
```

### Selecting multiple coordinates

If we select multiple coordinates along one axis and plot with a chart method such as ``line``, the data will automatically be split by that coordinate:


```python
air.sel(lat=[20, 40, 60], lon=285).hvplot.line()
```
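The split happens because a list selection keeps the selected dimension in the result. A small sketch on synthetic data (the array ``toy`` and its values are assumptions made for illustration) shows the remaining dimensions:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the tutorial data.
rng = np.random.default_rng(1)
toy = xr.DataArray(
    rng.normal(280.0, 10.0, size=(4, 3, 2)),
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.date_range("2013-01-01", periods=4, freq="D"),
        "lat": [60.0, 40.0, 20.0],
        "lon": [280.0, 285.0],
    },
    name="air",
)

# A list keeps the 'lat' dimension (length 3); a scalar drops 'lon'.
toy_multi = toy.sel(lat=[20, 40, 60], lon=285)
print(toy_multi.dims)         # ('time', 'lat')
print(toy_multi.sizes["lat"])  # 3
```

With ``time`` on the x-axis, the leftover ``lat`` dimension is what remains to distinguish one line from another.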

To plot a different relationship we can explicitly request to display the latitude along the y-axis and use the ``by`` keyword to color each longitude (or 'lon') differently (note that this differs from the ``hue`` keyword xarray uses):


```python
air.sel(time='2013-02-01 00:00', lon=[280, 285]).hvplot.line(y='lat', by='lon', legend='top_right')
```

## 2D Plots

By default the ``DataArray.hvplot()`` method generates an image if the data is two-dimensional.


```python
air2d = air.sel(time='2013-06-01 12:00')
air2d.hvplot(width=400)
```

Alternatively we can also plot the same data using the ``contour`` and ``contourf`` methods, which provide a ``levels`` argument to control the number of iso-contours to draw:


```python
air2d.hvplot.contour(width=400, levels=20) + air2d.hvplot.contourf(width=400, levels=8)
```

## n-D Plots

If the data has more than two dimensions and we give no further hints, ``hvplot()`` defaults to a histogram:


```python
air.hvplot()
```

However, we can tell it to apply a ``groupby`` along a particular dimension, allowing us to explore the data as images along that dimension with a slider:


```python
air.hvplot(groupby='time', width=500)
```
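Conceptually, grouping by ``time`` means that each slider position corresponds to one 2D slice of the cube. The sketch below builds those slices by hand on synthetic data; it illustrates the idea only and is not how hvPlot implements ``groupby`` internally. The array ``toy`` is a hypothetical stand-in.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the tutorial data.
rng = np.random.default_rng(2)
toy = xr.DataArray(
    rng.normal(280.0, 10.0, size=(4, 3, 2)),
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.date_range("2013-01-01", periods=4, freq="D"),
        "lat": [60.0, 40.0, 20.0],
        "lon": [280.0, 285.0],
    },
    name="air",
)

# One 2D (lat, lon) frame per time step, analogous to one slider position each.
frames = [toy.isel(time=i) for i in range(toy.sizes["time"])]
print(len(frames))     # 4
print(frames[0].dims)  # ('lat', 'lon')
```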

By default, for numeric types you'll get a slider and for non-numeric types you'll get a selector. Use ``widget_type`` and ``widget_location`` to control the look of the widget. To learn more about customizing widget behavior see [Widgets](Widgets.ipynb).


```python
air.hvplot(groupby='time', width=600, widget_type='scrubber', widget_location='bottom')
```

If we pick a different, lower dimensional plot type (such as a 'line') it will automatically apply a groupby over the remaining dimensions:


```python
air.hvplot.line(width=600)
```

## Statistical plots

Statistical plots such as histograms, kernel-density estimates, or violin and box-whisker plots aggregate the data across one or more of the coordinate dimensions. For instance, plotting a KDE provides a summary of all the air temperature values but we can, once again, use the ``by`` keyword to view each selected latitude (or 'lat') separately:


```python
air.sel(lat=[25, 50, 75]).hvplot.kde('air', by='lat', alpha=0.5)
```
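As a rough illustration of what ``by='lat'`` separates, the sketch below partitions a synthetic array into one flat sample of values per latitude, which is the kind of input a per-latitude density estimate would summarize. The names ``toy`` and ``groups`` are hypothetical, and the code does not reproduce hvPlot's own processing.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the tutorial data.
rng = np.random.default_rng(3)
toy = xr.DataArray(
    rng.normal(280.0, 10.0, size=(4, 3, 2)),
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.date_range("2013-01-01", periods=4, freq="D"),
        "lat": [75.0, 50.0, 25.0],
        "lon": [280.0, 285.0],
    },
    name="air",
)

subset = toy.sel(lat=[25, 50, 75])

# For each latitude, flatten all values over the remaining dimensions
# (time and lon) into a single 1D sample.
groups = {
    float(lat): subset.sel(lat=lat).values.ravel()
    for lat in subset.lat.values
}

for lat, values in groups.items():
    print(lat, values.size, round(float(values.mean()), 2))
```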

Using the ``by`` keyword we can break down the distribution of the air temperature across one or more variables:


```python
air.hvplot.violin('air', by='lat', color='lat', cmap='Category20')
```

## Rasterizing

If you are plotting a large amount of data at once, you can consider using the hvPlot interface to [Datashader](https://datashader.org), which can be enabled simply by setting `rasterize=True`.

Note that by declaring that the data should not be grouped by another coordinate variable, i.e. by setting `groupby=[]`, we can plot all the datapoints, showing us the spread of air temperatures in the dataset:


```python
air.hvplot.scatter('time', groupby=[], rasterize=True) *\
air.mean(['lat', 'lon']).hvplot.line('time', color='indianred')
```

Here we also overlaid a non-datashaded line plot of the average temperature at each time. If you enable the appropriate hover tool, the overlaid line supports hovering and zooming even in a static export, such as on a web server or in an email. The raw-data plot, by contrast, is aggregated spatially before it is sent to the browser, so a static export has only the fixed spatial binning available at that time. If you have a live Python process, the raw data will be re-aggregated each time you pan or zoom, letting you see the entire dataset regardless of size.
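The idea of spatial aggregation can be sketched with plain NumPy: many points are reduced to a fixed grid of per-bin counts, and only that grid needs to reach the browser. This is a conceptual illustration using ``np.histogram2d``, not Datashader's actual implementation; the point data and the grid size are made up for the example.

```python
import numpy as np

# Synthetic points: x stands in for time, y for air temperature.
rng = np.random.default_rng(0)
n_points = 100_000
x = rng.uniform(0.0, 1.0, n_points)
y = rng.normal(280.0, 10.0, n_points)

# Aggregate all points onto a fixed 40 x 30 grid of counts.
width, height = 40, 30
counts, x_edges, y_edges = np.histogram2d(x, y, bins=(width, height))

print(counts.shape)       # (40, 30)
print(int(counts.sum()))  # 100000
```

However many points go in, the result has ``width * height`` cells. Recomputing this grid for the visible range is what a live Python process can do on each pan or zoom, and what a static export cannot.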