Geospatial Data Processing

Python & Parallel Data Processing

Project Name: Geospatial Big Data Arrangement and Quality Inspection System Development.

Funding Source

Geomatics Center Of Zhejiang, Hangzhou, China.

My Role

Team Leader
(January 2018 - June 2019)

My Contribution

Data Preprocessing
System Architecture Design
Algorithm Design
Software Development
Testing

Publications

Rao, J.✉, Yu, J., Zhu, X., Du, T. and Ren, F. (2019). An Algorithm For Removing Invalid Pixels In Remote Sensing Images Based On Vector Boundary Extraction. Journal of Geomatics.

Du, T., Rao, J., Peng, R. and Du, Q*✉. (2020). Multi-Source Geographic Data Efficient Quality Inspection System Based on Python. Journal of Geomatics 45 (2), 1-6.

Introduction

The Geomatics Center Of Zhejiang produces and collects massive multi-source, heterogeneous geospatial data, including Digital Orthophoto Maps (DOM), Digital Elevation Models (DEM), Spatio-temporal Thematic Data (STD), and 3D Building Model Data (BMD), and needs an efficient way to automate the quality inspection of that data. In this project, we developed a data quality inspection system consisting of 4 functional modules and 11 task units. The system is now in use and has significantly reduced the workload of the Geomatics Center.

Specifically, for DOM and DEM data, we designed and implemented several efficient algorithms to detect or remove outlier areas (e.g., black or white collars, inner or outer irregular outlier areas) and merging-error areas (e.g., cracked or misaligned areas), and to check their metadata and locations. For STD data, we applied regular expressions, fuzzy matching, and other methods to analyze and process their locations and attributes. For BMD data, we checked location accuracy and textures by automatically parsing the model data and mapping their coordinates into geographic coordinate systems (e.g., WGS84, CGCS2000).
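To make the idea of a "collar" concrete, here is a minimal sketch of one way to separate border-connected invalid pixels from invalid-looking pixels inside the image. It is not the algorithm from the publications above (which is based on vector boundary extraction); it swaps in a plain breadth-first flood fill from the image border. The function name `find_collar_mask`, the nested-list image, and the invalid values 0 and 255 are all assumptions made for illustration.

```python
from collections import deque


def find_collar_mask(image, invalid_values=(0, 255)):
    """Mark invalid pixels that are connected to the image border.

    `image` is a single-band raster given as a list of rows (hypothetical
    input format). Returns a same-sized grid of booleans where True means
    "part of a black or white collar".
    """
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    queue = deque()

    def try_visit(r, c):
        # Accept a pixel only if it is inside the image, not yet marked,
        # and holds one of the assumed invalid values.
        if 0 <= r < rows and 0 <= c < cols:
            if not mask[r][c] and image[r][c] in invalid_values:
                mask[r][c] = True
                queue.append((r, c))

    # Seed the search with every invalid pixel on the four image edges.
    for r in range(rows):
        try_visit(r, 0)
        try_visit(r, cols - 1)
    for c in range(cols):
        try_visit(0, c)
        try_visit(rows - 1, c)

    # Grow the collar through 4-connected invalid neighbours.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            try_visit(r + dr, c + dc)

    return mask


# A 5x5 tile with a black collar and one dark pixel in the interior.
image = [
    [0, 0, 0, 0, 0],
    [0, 90, 80, 70, 0],
    [0, 60, 0, 50, 0],
    [0, 40, 30, 20, 0],
    [0, 0, 0, 0, 0],
]
collar = find_collar_mask(image)
```

In this sketch the interior pixel at row 2, column 2 has value 0 but is not connected to the border, so it is left unmarked; only the outer ring is treated as collar. A production version would more likely operate on NumPy arrays read through GDAL rather than on nested lists.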

The system is written in Python and builds on open source libraries including PySide, GDAL/OGR, NumPy, GeoPandas, Fiona, Shapely, PyGeodesy, and FuzzyWuzzy. Threading and multiprocessing are used to parallelize data processing tasks, and Cython is used to accelerate core algorithms.
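As a rough illustration of the parallelization pattern, the sketch below fans a per-tile check out over a worker pool using the standard library's `concurrent.futures`. It is not the system's actual code: `check_tile`, `inspect_tiles`, the `(name, pixels)` tile format, and the "zero means invalid" rule are hypothetical stand-ins for a real inspection task.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def check_tile(tile):
    """Hypothetical per-tile check: fraction of invalid (zero) pixels."""
    name, pixels = tile
    if not pixels:
        return name, 0.0
    invalid = sum(1 for value in pixels if value == 0)
    return name, invalid / len(pixels)


def inspect_tiles(tiles, use_processes=False, max_workers=4):
    """Run check_tile over all tiles in parallel and collect the results.

    Threads suit I/O-bound work such as reading files; processes suit
    CPU-bound work because they are not limited by the GIL.
    """
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the inputs.
        return dict(executor.map(check_tile, tiles))


tiles = [("tile_a", [0, 0, 10, 20]), ("tile_b", [5, 6, 7, 8])]
report = inspect_tiles(tiles)
```

When calling `inspect_tiles(..., use_processes=True)` from a script, the call should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on platforms that spawn rather than fork.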

