GeoTrellis
GeoTrellis is an open-source geographic data processing library designed to work with large geospatial raster data sets. It is written in Scala and released under the Apache 2.0 license.
Developer(s) | LocationTech, Azavea |
---|---|
Initial release | 12 May 2012 |
Stable release | 3.5.1 / 23 November 2020 |
Repository | |
Written in | Scala |
Operating system | Linux |
Type | Big Data, Map algebra |
License | Apache License 2.0 |
Website | geotrellis |
Description
GeoTrellis' core competency is raster data processing: enabling distributed processing of large geospatial raster data sets using the techniques of map algebra. In addition to support for raster data operations, GeoTrellis includes some support for operations using vector and point cloud data.
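As an illustration of the map algebra techniques mentioned above, the following is a minimal single-machine sketch in plain Python, not GeoTrellis code. It shows a local operation, which combines two aligned rasters cell by cell, and a focal operation, which computes each output cell from its 3x3 neighbourhood. The names `Raster`, `local_op`, and `focal_mean` are hypothetical and chosen only for this example.

```python
from typing import Callable, List

# Hypothetical representation: a raster is a list of rows of cell values.
Raster = List[List[float]]


def local_op(a: Raster, b: Raster, f: Callable[[float, float], float]) -> Raster:
    """Local operation: apply f to each pair of corresponding cells."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("rasters must have the same dimensions")
    return [[f(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def focal_mean(r: Raster) -> Raster:
    """Focal operation: mean of each cell's 3x3 neighbourhood.

    The window is clipped at the raster edges, so border cells
    average over fewer neighbours.
    """
    rows, cols = len(r), len(r[0])
    out: Raster = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            window = [
                r[y][x]
                for y in range(max(0, i - 1), min(rows, i + 2))
                for x in range(max(0, j - 1), min(cols, j + 2))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out
```

For example, `local_op(a, b, lambda x, y: x + y)` adds two rasters cell by cell. In a distributed setting the same per-cell logic would be applied to many raster tiles in parallel, with focal operations additionally needing cells from neighbouring tiles.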
GeoTrellis uses Apache Spark for distributed processing. Distributed processing relies on indexing large datasets with a multi-dimensional space-filling curve (SFC). An SFC maps a multi-dimensional index to a one-dimensional index while largely preserving geospatial locality, so that cells close together in space tend to have nearby index values. This allows large datasets to be read and written efficiently in parallel across multiple computers.
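The idea of a space-filling curve index can be sketched with a Z-order (Morton) curve, one common SFC. The example below is an illustrative assumption, not GeoTrellis's own implementation: it interleaves the bits of a tile's column and row to produce a single integer index. The function names `z_index` and `z_decode` and the 16-bit default are hypothetical.

```python
from typing import Tuple


def z_index(col: int, row: int, bits: int = 16) -> int:
    """Interleave the bits of (col, row) into one Z-order index.

    Bit i of col goes to bit 2*i of the result and bit i of row
    goes to bit 2*i + 1.
    """
    limit = 1 << bits
    if not (0 <= col < limit and 0 <= row < limit):
        raise ValueError("col and row must fit in the given number of bits")
    index = 0
    for i in range(bits):
        index |= ((col >> i) & 1) << (2 * i)
        index |= ((row >> i) & 1) << (2 * i + 1)
    return index


def z_decode(index: int, bits: int = 16) -> Tuple[int, int]:
    """Invert z_index: recover (col, row) from a Z-order index."""
    col = row = 0
    for i in range(bits):
        col |= ((index >> (2 * i)) & 1) << i
        row |= ((index >> (2 * i + 1)) & 1) << i
    return col, row


# Sorting tile keys by their Z-order index groups spatially close tiles.
tiles = [(c, r) for r in range(4) for c in range(4)]
ordered = sorted(tiles, key=lambda t: z_index(t[0], t[1]))
```

In `ordered`, the first four keys form the 2x2 block at the grid origin, which shows how nearby tiles receive nearby indices. Contiguous index ranges can then be assigned to different workers or storage partitions.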
Python bindings have been developed for GeoTrellis as a sub-project called GeoPySpark that enables Python developers to access and use the GeoTrellis library.
Project History
GeoTrellis started as a research project at Azavea, a geospatial software company based in Philadelphia. A precursor software component, DecisionTree, was developed beginning in 2006 with support from a Small Business Innovation Research grant from the U.S. Department of Agriculture. In 2009, with financial support from the William Penn Foundation and Stroud Water Research Center, Azavea embarked on early development of GeoTrellis.
GeoTrellis was released as an open-source project in 2011[1] with the goal of supporting fast processing of geospatial raster data at scale.
GeoTrellis initially supported distributed computation through Akka, a Scala framework for building concurrent and distributed applications. The need to support additional use cases and features, such as caching and sharding datasets across a storage cluster, led to a search for a new distribution framework. GeoTrellis moved to Apache Spark as its distribution engine in 2014[2] to take advantage of the management, scheduling, and other features of the Spark framework. One key use case that drove this phase of development was the need to efficiently process large spatiotemporal datasets like those used in many earth science applications, such as climate change research.[3] The move to Apache Spark enabled efficient support for large climate change forecast datasets published by the Intergovernmental Panel on Climate Change (IPCC).
GeoTrellis was submitted to the Eclipse Foundation's LocationTech[4] working group in 2013 and graduated from incubation with a 1.0 release in December 2016.[5]
GeoTrellis has been used in a number of geospatial domains, including satellite and aerial image processing, forest growth simulation, agricultural yield prediction, planning, digital humanities, government infrastructure investment, and machine learning to support crime risk forecasting. It is currently integrated into other open-source software projects, including Raster Foundry,[6] Raster Frames,[7] and GeoPySpark.[8]
References
- "Introducing GeoTrellis". Eclipse Foundation. March 2014. Retrieved 2 August 2017.
- "GeoTrellis: Adding Geospatial Capabilities to Spark". Spark Summit. 2014. Retrieved 2 August 2017.
- "GeoTrellis Adapts to Climate Change and Spark". Eclipse Foundation. December 2014. Retrieved 2 August 2017.
- "LocationTech GeoTrellis". Eclipse Foundation. Retrieved 21 July 2017.
- "GeoTrellis 1.0 Release with LocationTech". Azavea. 9 January 2017. Retrieved 21 July 2017.
- "Raster Foundry source code repository". Azavea. Retrieved 1 August 2019.
- "Raster Frames project home page". Astraea. Retrieved 1 August 2019.
- "Introducing GeoPySpark, a Python Binding of GeoTrellis". Azavea. 19 September 2017. Retrieved 1 August 2019.