GNSS Radio Occultation Data in the AWS Cloud
Stephen Leroy
Atmospheric and Environmental Research
Poster
The volume, bandwidth, and organization of GNSS radio occultation (RO) data make them difficult to use for research outside the institutions that host them. With the growth of cloud computing and storage environments, however, it is becoming possible for anyone, anywhere, at any time to analyze all available RO data at an order of magnitude lower cost and with far greater ease than is currently possible in local computing environments. We have obtained funding through the NASA ACCESS program to host all GNSS RO data for all missions in the Amazon Web Services (AWS) Open Data program. The data are contributed independently by the COSMIC DAAC at UCAR and the NASA Jet Propulsion Laboratory of the California Institute of Technology. In addition, the EUMETSAT Radio Occultation Meteorology Satellite Application Facility (ROM SAF) has contributed its climate data record product for COSMIC. The data will be available in UCAR native formats: conPhs files will contain level 1b calibrated excess phase; atmPrf files will contain level 2a retrievals of bending angle, refractivity, dry temperature, and geopotential height; and bfrPrf files will contain level 2 data in BUFR (Binary Universal Form for the Representation of meteorological data) format. A live database will be available to ease access to and filtering of the data. In addition, a cloud-streamable format will be created that makes the level 2 data easily accessible through in-line NetCDF-like interfaces.
Hosting RO data in AWS makes processing the data simple and inexpensive, with nearly unlimited on-demand computing resources at hand. The RO data are available through simple NFS-like mounts, and there is no cost to connect to them. To facilitate the manipulation of RO data, we will construct tutorial demonstrations of research projects that are currently common in the RO science community: (1) inter-center comparison of RO products, (2) level 2 retrieval, (3) planetary boundary layer analysis, (4) data assimilation into a numerical weather prediction system, and (5) mapping RO data by Bayesian interpolation through an API. At future IROWG meetings, half-day workshops will be conducted to introduce the RO community to AWS: how to set up and manage an AWS account, storage resources such as S3 buckets, and computing resources such as EC2 instances, Batch, and Lambda. Demonstrations of the above tutorials will also be presented at the workshops. All documentation and tutorial demonstrations will be publicly available through GitHub.
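As a rough illustration of what working with the hosted files might look like, the sketch below lists objects in a public S3 bucket without credentials and groups them by the three UCAR file types named above. It is a minimal sketch under stated assumptions, not the project's actual interface: the bucket name and prefix in the usage note are hypothetical placeholders, and the helper names (`file_type`, `group_by_type`, `list_keys`) and the rule that a file name begins with its file type followed by an underscore are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# File types and their contents, as described in the abstract.
FILE_TYPES = {
    "conPhs": "level 1b calibrated excess phase",
    "atmPrf": "level 2a retrievals (bending angle, refractivity, "
              "dry temperature, geopotential height)",
    "bfrPrf": "level 2 data in BUFR format",
}


def file_type(key):
    """Return the UCAR file type of an object key, or None if unrecognized.

    Assumption: the file name starts with the file type followed by "_".
    """
    name = PurePosixPath(key).name
    prefix = name.split("_", 1)[0]
    return prefix if prefix in FILE_TYPES else None


def group_by_type(keys):
    """Group object keys by file type, skipping unrecognized keys."""
    groups = defaultdict(list)
    for key in keys:
        kind = file_type(key)
        if kind is not None:
            groups[kind].append(key)
    return dict(groups)


def list_keys(bucket, prefix=""):
    """Yield object keys from a public bucket using unsigned requests."""
    # Imported lazily so the helpers above work without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

With a real bucket name substituted for the placeholder, a call such as `group_by_type(list_keys("example-ro-bucket", "some/prefix/"))` would return a dictionary of keys per file type. Unsigned requests are used here on the assumption that the bucket allows anonymous public reads.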