ADR-0067 DATA_PIPELINE_ATLAS-7-GeographicDataTranslationModel (v3)
Geographic Data Translation Model
Context
Atlas data pipelines process geographic data. Currently, all data are stored in the WGS84 standard.
In some cases the data need to be transformed using metric units (e.g. applying a buffer with a specific radius in meters to a geometry), which the angular (degree) coordinates of WGS84 do not support directly.
The data transformation model must ensure the highest possible precision.
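To illustrate why a metric operation cannot be applied directly to degree coordinates, the sketch below estimates the ground distance covered by one degree. It assumes a spherical Earth with a mean radius of 6,371,008.8 m, and the function name is illustrative rather than part of the Atlas pipelines.

```python
import math

EARTH_MEAN_RADIUS_M = 6_371_008.8  # assumption: spherical Earth, mean radius


def meters_per_degree(lat_deg):
    """Return approximate (meters per degree of longitude, meters per degree of latitude)."""
    per_degree_lat = math.pi * EARTH_MEAN_RADIUS_M / 180.0
    # East-west distance per degree shrinks with the cosine of the latitude.
    per_degree_lon = per_degree_lat * math.cos(math.radians(lat_deg))
    return per_degree_lon, per_degree_lat


lon_m, lat_m = meters_per_degree(37.59)
# The same 5556 m radius spans a different number of degrees east-west than north-south.
print(5556 / lon_m, 5556 / lat_m)
```

Because the two ratios differ, a buffer expressed as a single radius in degrees would not be a circle on the ground.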
World Geodetic System (WGS)
The World Geodetic System is a standard used in cartography, geodesy, and satellite navigation including GPS.
The current version, WGS 84, defines an Earth-centered, Earth-fixed coordinate system and a geodetic datum.
WGS84
WGS 84 is the standard U.S. Department of Defense definition of a global reference system for geospatial information and is the reference system for the Global Positioning System (GPS).
Universal Transverse Mercator (UTM)
The Universal Transverse Mercator is a map projection system for assigning coordinates to locations on the surface of the Earth.
Like the traditional method of latitude and longitude, it is a horizontal position representation, which means it ignores altitude and treats the Earth as a perfect ellipsoid.
However, it differs from global latitude/longitude in that it divides the Earth into 60 zones and projects each onto a plane as the basis for its coordinates.
Specifying a location means specifying the zone and the x, y coordinate in that plane.
The projection from the spheroid to a UTM zone is a parameterization of the transverse Mercator projection. The parameters vary by nation, region, or mapping system.
Most zones in UTM span 6 degrees of longitude, and each has a designated central meridian.
The scale factor at the central meridian is specified to be 0.9996 of true scale for most UTM systems in use.
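The zone layout described above can be sketched as follows. This is a minimal illustration that assumes only the standard 6° zones (it ignores the irregular zones around Norway and Svalbard) and uses the EPSG codes 326xx and 327xx for WGS 84 / UTM north and south; the function names are hypothetical.

```python
import math


def utm_zone(lon_deg):
    """UTM zone number (1-60) for a longitude in degrees, standard 6-degree zones only."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1


def central_meridian(zone):
    """Longitude in degrees of the central meridian of a standard UTM zone."""
    return zone * 6 - 183


def utm_epsg(lon_deg, lat_deg):
    """EPSG code of the WGS 84 / UTM zone covering the given position."""
    base = 32600 if lat_deg >= 0 else 32700  # northern vs southern hemisphere
    return base + utm_zone(lon_deg)


zone = utm_zone(-77.4200912253496)
print(zone, central_meridian(zone), utm_epsg(-77.4200912253496, 37.5917463932569))
```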
UTM zone
The UTM system divides the Earth into 60 zones, each 6° of longitude in width.
Each of the 60 zones uses a transverse Mercator projection that can map a region of large north-south extent with low distortion.
The amount of distortion is held below 1 part in 1,000 inside each zone.
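As a rough check of the distortion bound, the sketch below evaluates the scale factor of the transverse Mercator projection on a sphere, k = k0 / sqrt(1 - B²) with B = cos(lat) · sin(lon - lon0). The spherical formula is an approximation of the ellipsoidal one that UTM actually uses, and the function name is illustrative.

```python
import math

K0 = 0.9996  # scale factor on the central meridian


def scale_factor_spherical(lat_deg, lon_deg, central_meridian_deg):
    """Point scale factor of the transverse Mercator projection on a sphere."""
    b = math.cos(math.radians(lat_deg)) * math.sin(
        math.radians(lon_deg - central_meridian_deg)
    )
    return K0 / math.sqrt(1.0 - b * b)


# Scale error at the equator, from the central meridian out to the zone edge.
for offset in (0.0, 1.5, 3.0):
    k = scale_factor_spherical(0.0, -75.0 + offset, -75.0)
    print(f"{offset} deg from central meridian: scale error {abs(k - 1.0):.6f}")
```

Under this approximation the scale error stays below 0.001 across the full 3° half-width of a zone, with the largest values at the zone edges on the equator.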
Decision
Atlas decided to use the UTM zone projection system for data transformation.
Input: Geometry in WGS84 standard
Transformation block:
- Select the UTM zone projection that corresponds to the given geometry
- Convert the geometry to that projection
- Transform the geometry (e.g. apply a buffer with a radius in meters)
- Convert the transformed geometry back to the WGS84 standard
Output: Geometry in WGS84 standard
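The transformation block above can be sketched in Python as follows. This is a minimal illustration, not the Atlas implementation: it assumes the `pyproj` and `shapely` packages as the Python bindings to PROJ and GEOS, selects the UTM zone from the geometry's centroid, and uses a buffer as the example transformation. The function names are hypothetical.

```python
from pyproj import CRS, Transformer
from shapely.geometry import Point
from shapely.ops import transform

WGS84 = CRS.from_epsg(4326)


def utm_crs_for(geom):
    """Pick the WGS 84 / UTM zone CRS covering the geometry's centroid (assumption)."""
    centroid = geom.centroid
    zone = int((centroid.x + 180.0) // 6) % 60 + 1
    epsg = (32600 if centroid.y >= 0 else 32700) + zone
    return CRS.from_epsg(epsg)


def buffer_in_meters(geom, radius_m):
    """Buffer a WGS84 geometry by a radius in meters and return a WGS84 geometry."""
    utm = utm_crs_for(geom)
    # always_xy=True keeps the (longitude, latitude) axis order used by WKT.
    to_utm = Transformer.from_crs(WGS84, utm, always_xy=True).transform
    to_wgs84 = Transformer.from_crs(utm, WGS84, always_xy=True).transform
    projected = transform(to_utm, geom)       # WGS84 -> UTM (meters)
    buffered = projected.buffer(radius_m)     # metric transformation
    return transform(to_wgs84, buffered)      # UTM -> WGS84


result = buffer_in_meters(Point(-77.4200912253496, 37.5917463932569), 5556)
print(result.bounds)
```

Selecting the zone from the centroid is one possible rule; geometries that extend across a zone boundary may need a different choice.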
Consequences
Pros:
- Fast and efficient method.
- Can be used directly in Spark (PySpark) by using the GEOS/PROJ libraries.
- No need for external tools like PostGIS or BigQuery.
- UTM zone projection provides acceptably high accuracy for many mapping, surveying, and navigation applications, particularly within the specific zone of interest.
- Minimal Distortion:
UTM minimizes distortion within each zone. The UTM projection is based on the transverse Mercator projection,
which represents areas near the central meridian of each zone with minimal distortion.
The distortion increases with distance from the central meridian, but it is still generally acceptable for most applications within the zone.
- Conformal Projection:
UTM is a conformal projection, which means it accurately preserves angles and shapes locally.
This property makes it suitable for tasks that require accurate representation of shapes, such as land surveys, navigation, and engineering applications.
Cons:
- Distortion at High Latitudes:
UTM is based on the transverse Mercator projection, which introduces increasing distortion with distance from the central meridian of each zone.
This distortion becomes more significant at high latitudes, causing scale exaggeration and positional inaccuracies.
As a result, UTM may not be the most suitable projection for areas near the poles; the UTM system itself is defined only between 80°S and 84°N.
- Irregular Shape Representation:
UTM uses a transverse cylindrical projection, which distorts shapes that extend far from the central meridian.
With increasing distance from the central meridian, shapes become increasingly distorted, particularly in the east-west direction.
This distortion can affect the accuracy and representation of features with elongated or irregular shapes.
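One way to keep these limitations visible in a pipeline is a small pre-check on the bounding box of a geometry. The sketch below is an assumption about how such a guard could look, not an existing Atlas component. It flags bounding boxes that leave the latitude band in which UTM is defined (80°S to 84°N) or that span more than one standard 6° zone.

```python
import math

UTM_MIN_LAT, UTM_MAX_LAT = -80.0, 84.0  # latitude band in which UTM is defined


def _zone(lon_deg):
    """Standard UTM zone number (1-60) for a longitude in degrees."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1


def utm_issues(bounds):
    """Return a list of warnings for bounds = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bounds
    issues = []
    if min_lat < UTM_MIN_LAT or max_lat > UTM_MAX_LAT:
        issues.append("outside UTM latitude range")
    if _zone(min_lon) != _zone(max_lon):
        issues.append("spans multiple UTM zones")
    return issues


print(utm_issues((-73.0, 50.0, -71.0, 51.0)))
```

An empty list means neither condition applies; what to do with a non-empty list (reject, split, or choose another projection) is left open here.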
Comparison to PostGIS and BigQuery
Reference geo object: Richmond Raceway
Reference geometry: POINT (-77.4200912253496 37.5917463932569)
Transformation method: apply buffer with radius 5556 meters
- Output geometry using chosen UTM zone projection system:
POLYGON ((-77.35721070031394 37.593019662976246, -77.35735871407525 37.588109161113096, -77.35811081349127 37.58323384529113, -77.35945963877212 37.57844066005474, -77.3613920902811 37.5737757500903, -77.36388945889645 37.56928401646077, -77.36692760996293 37.565008685132845, -77.37047721891818 37.56099089191509, -77.37450405620314 37.557269287757016, -77.3789693186182 37.55387966815431, -77.3838300038702 37.550854630166135, -77.38903932467151 37.548223260281425, -77.39454715840587 37.54601085607127, -77.40030052806637 37.54423868424086, -77.40624410990249 37.54292377734706, -77.4123207629845 37.542078771081485, -77.41847207570939 37.54171178363575, -77.42463892412938 37.541826338271186, -77.43076203688675 37.54242132980991, -77.43678256148381 37.5434910353538, -77.44264262660792 37.54502516912538, -77.4482858952658 37.54700898091275, -77.45365810356007 37.54942339719349, -77.45870758006369 37.5522452036136, -77.46338574091348 37.55544726711006, -77.46764755595183 37.55899879559249, -77.47145198149505 37.56286563274489, -77.47476235559564 37.56701058517482, -77.47754675199369 37.57139377882768, -77.47977828931617 37.57597304130195, -77.48143539248161 37.58070430644829, -77.48250200369729 37.58554203741554, -77.4829677408944 37.59043966412094, -77.48282800193155 37.59535003097346, -77.48208401340123 37.60022585056838, -77.48074282339745 37.605020159001775, -77.47881723813816 37.609686768423636, -77.47632570287938 37.614180712462066, -77.47329212810365 37.61845868020594, -77.46974566250861 37.62247943453098, -77.46572041485503 37.626204210694155, -77.46125512725291 37.629597091301555, -77.45639280296233 37.632625353975364, -77.45118029225733 37.63525978830372, -77.44566784033972 37.63747497895046, -77.4399086016902 37.639249552128916, -77.43395812560136 37.64056638299941, -77.42787381794558 37.641412761932784, -77.42171438448761 37.641780517986305, -77.41553926125138 37.64166609836139, -77.40940803759217 37.64107060304753, -77.40337987770604 37.639999774302815, 
-77.39751294632616 37.63846394106887, -77.3918638443125 37.63647791886728, -77.386487059735 37.63406086616575, -77.38143443988426 37.631236098634716, -77.37675468941981 37.62803086313124, -77.37249289958628 37.624476073644274, -77.36869011309776 37.62060601180837, -77.36538292891285 37.61645799493863, -77.36260315070417 37.61207201485414, -77.36037748236991 37.60749035103715, -77.35872727344966 37.60275716191894, -77.35766831679467 37.597918058287554, -77.35721070031394 37.593019662976246))
xmin:-77.482967740894409
ymin:37.541711783635712
xmax:-77.357210700313928
ymax:37.641780517986334
- Output geometry using PostGIS:
POLYGON((-77.35721070031394 37.593019662976246,-77.35811081349127 37.58323384529113,-77.3613920902811 37.5737757500903,-77.36692760996293 37.565008685132845,-77.37450405620314 37.557269287757016,-77.3838300038702 37.55085463016613,-77.39454715840587 37.54601085607127,-77.40624410990249 37.54292377734706,-77.41847207570939 37.54171178363575,-77.43076203688675 37.54242132980991,-77.44264262660792 37.54502516912538,-77.45365810356007 37.54942339719349,-77.46338574091348 37.55544726711006,-77.47145198149505 37.56286563274488,-77.47754675199369 37.57139377882768,-77.48143539248161 37.58070430644829,-77.4829677408944 37.59043966412095,-77.48208401340123 37.600225850568386,-77.47881723813816 37.609686768423636,-77.47329212810365 37.61845868020594,-77.46572041485503 37.626204210694155,-77.45639280296233 37.632625353975364,-77.44566784033972 37.63747497895046,-77.43395812560136 37.64056638299942,-77.42171438448761 37.641780517986305,-77.40940803759217 37.64107060304754,-77.39751294632616 37.63846394106887,-77.386487059735 37.634060866165754,-77.37675468941981 37.628030863131244,-77.36869011309776 37.62060601180837,-77.36260315070417 37.61207201485414,-77.35872727344966 37.60275716191894,-77.35721070031394 37.593019662976246))
xmin:-77.482967740894409
ymin:37.541711783635712
xmax:-77.357210700313928
ymax:37.641780517986334
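The two outputs report identical bounding boxes, and the PostGIS ring appears to contain every second vertex of the UTM-based ring, which would be consistent with a coarser buffer segment setting. A comparison of this kind can be sketched in plain Python. The helper names are hypothetical, and the strings below are only the first few vertices copied from each output, not the full rings.

```python
import math
import re

NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def wkt_vertices(wkt):
    """Extract (lon, lat) pairs from the coordinate list of a WKT string."""
    values = [float(v) for v in NUMBER.findall(wkt)]
    return list(zip(values[0::2], values[1::2]))


def max_nearest_vertex_gap(candidate, reference):
    """Largest distance in degrees from a candidate vertex to its nearest reference vertex."""
    return max(
        min(math.hypot(cx - rx, cy - ry) for rx, ry in reference)
        for cx, cy in candidate
    )


# Excerpts: leading vertices of the UTM-based and PostGIS outputs.
utm_excerpt = (
    "-77.35721070031394 37.593019662976246, "
    "-77.35735871407525 37.588109161113096, "
    "-77.35811081349127 37.58323384529113"
)
postgis_excerpt = (
    "-77.35721070031394 37.593019662976246,"
    "-77.35811081349127 37.58323384529113"
)

gap = max_nearest_vertex_gap(wkt_vertices(postgis_excerpt), wkt_vertices(utm_excerpt))
print(gap)
```

Running the same helpers on the full WKT strings would give the largest deviation between the two methods in degrees.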