What is Project Mosaic?
Mosaic is an extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
Mosaic provides:
-easy conversion between common spatial data encodings (WKT, WKB and GeoJSON);
-constructors to easily generate new geometries from Spark native data types;
-many of the OGC SQL standard ST_ functions implemented as Spark Expressions for transforming, aggregating and joining spatial datasets;
-high performance through implementation of Spark code generation within the core Mosaic functions;
-optimisations for performing point-in-polygon joins using an approach we co-developed with Ordnance Survey (blog post); and
-a choice of Scala, SQL and Python APIs.
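
To illustrate why an index helps with point-in-polygon joins, here is a minimal sketch in plain Python. It is not Mosaic's implementation or API: the square grid, `CELL_SIZE`, and the helper names `cell_of`, `build_index` and `pip_join` are all assumptions made for this example. The idea shown is generic: use a cheap grid lookup to narrow down candidate polygons, then run the exact geometric test only on those candidates.

```python
from collections import defaultdict
from math import floor

CELL_SIZE = 1.0  # hypothetical grid resolution, in coordinate units


def cell_of(x, y, size=CELL_SIZE):
    """Return the (column, row) of the square grid cell containing (x, y)."""
    return (floor(x / size), floor(y / size))


def point_in_polygon(x, y, ring):
    """Exact test by ray casting; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_index(polygons, size=CELL_SIZE):
    """Map each grid cell to the ids of polygons whose bounding box touches it."""
    index = defaultdict(list)
    for poly_id, ring in polygons.items():
        xs = [p[0] for p in ring]
        ys = [p[1] for p in ring]
        c0, r0 = cell_of(min(xs), min(ys), size)
        c1, r1 = cell_of(max(xs), max(ys), size)
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                index[(c, r)].append(poly_id)
    return index


def pip_join(points, polygons, size=CELL_SIZE):
    """Return (point_id, polygon_id) pairs where the point lies in the polygon."""
    index = build_index(polygons, size)
    matches = []
    for point_id, (x, y) in points.items():
        # Cheap step: only polygons registered in the point's cell are candidates.
        for poly_id in index.get(cell_of(x, y, size), []):
            # Expensive step: exact test, run on candidates only.
            if point_in_polygon(x, y, polygons[poly_id]):
                matches.append((point_id, poly_id))
    return matches


polygons = {
    "square": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "triangle": [(3, 0), (5, 0), (4, 2)],
}
points = {"a": (1.0, 1.0), "b": (4.0, 0.5), "c": (6.0, 6.0), "d": (3.1, 1.9)}

result = pip_join(points, polygons)
print(result)  # [('a', 'square'), ('b', 'triangle')]
```

Point `c` falls in a cell with no candidates, so it is discarded without any geometric test. Point `d` shares a cell with the triangle's bounding box but fails the exact test, which is why the index lookup alone is not sufficient.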