Introduction (10 minutes)
Presentation on FAIR (10 minutes)
Breakout activity to design cluster working group activities:
Introduction to breakout activity (5 minutes)
Breakout group discussions (25 minutes)
Group reports; 3 minutes per group (12 minutes)
Community discussion on group recommendations (20 minutes)
Wrap-up (8 minutes)
There is a growing need among researchers, citizen scientists, and related communities for formal training in low-cost, open-source sensor technologies (e.g., Arduino, LoRa). Yet the COVID-19 pandemic has severely limited opportunities for in-person, hands-on training. In this session, sponsored by the ESIP EnviroSensing Cluster, we will use insights and lessons learned over the past year to discuss and develop best practices for creating a web-based series of short, hands-on instructional videos covering key concepts in low-cost, open-source sensor technologies. The EnviroSensing community of practitioners promotes conversation around, and the development and refinement of, techniques to observe natural Earth system processes over short and long timescales. We always welcome new perspectives and advances in the field, and we invite anyone interested in learning more about these technologies to join us (no experience necessary!).
TALKS
Mike Giordano
AfriqAir; Observatoire de Sciences de l'UNIVERS EFLUVE, LISA/IPSL, UMR CNRS 758; Université Paris Est Créteil et Université Paris
Title: Low-Cost Air Quality Monitoring and Machine Learning: Challenges and Lessons Learned from the AfriqAir Network
Aji John and Nicoleta Cristea
University of Washington
Title: High-resolution snow-covered area mapping in mountain ecosystems using PlanetScope imagery
Kevin Booth
Radiant Earth
Title: Radiant MLHub: An Open Library for Geospatial Training Data
Ryan McGranaghan
ASTRA LLC
Title: The opportunities and challenges of ML: Trends from the space weather perspective
Ziheng Sun
George Mason University
Title: Earth AI: Formulating ESIP ML Community Effort
Introduce the session structure Est. time: 5 min
Landform introduction (Gary Berg-Cross) Est. time: 5 min
Part 1 - Prioritizing SWEET work
Slido poll #1 Which open GitHub issue do you want to work on? Est. time: 5 min
BREAKOUT #1 Vote with your feet. Join the breakout of interest. Est. time: ~10 min
FEEDBACK #1 - Breakout representative briefs larger group on breakout outcomes; allow time for comments from broader group Est. time: 15-20 min
Part 2 - SWEET Community and Project Governance
Slido poll #2 Get feedback on the most pressing community and project governance issues Est. time: 5 min
BREAKOUT #2 Vote with your feet. Join the breakout of interest. Est. time: ~10 min
FEEDBACK #2 - Breakout representative briefs larger group on breakout outcomes; allow time for comments from broader group Est. time: 15-20 min
SURVEY FEEDBACK/CONCLUSION – Brandon provides commentary on the SWEET survey responses so far. Est. time: ~5-10 min
Nicole Kearney: How the biodiversity community are making historic literature discoverable online
The Biodiversity Heritage Library (BHL) is the world’s largest online repository of biodiversity literature and archival materials. It is a global consortium of 500 libraries that have made over 59 million pages from their collections freely accessible online. Yet accessible does not equate to discoverable. Unlike contemporary scientific papers, the historical literature was not “born” with digital object identifiers (DOIs). This means historic articles sit outside the convenient linked infrastructure of modern publications, appearing in today’s reference lists as unlinked citations or not at all. As a result, our historic literature is falling into obscurity. This paper will detail the work the BHL is undertaking to bring the world’s historic literature into the modern linked network of scholarly research. It will also discuss the responsibility that comes with assigning DOIs retrospectively, what we are doing to promote best practice for out-of-copyright and orphaned content, and how DOIs are demonstrating that the historic literature is still hugely relevant today.
Marie-Elise Lecoq: The Living Atlases Community
Nicky Nicholson
Curtis Dyreson & Neil Cobb: Symbiota2: Promoting FAIRness in biodiversity databases and developing “Extended Specimen” pathways
This talk discusses data management standards and practices in Symbiota2, from the values stored in a database to the values in a knowledge graph. Symbiota2 is a biodiversity collection management system and a rewrite of Symbiota, one of the most popular biodiversity database applications. Symbiota2 is primarily used to store and manage specimen collection data, which it holds in a relational database. Some of the data can be exported in a format that follows the Darwin Core standard for integration with other datasets. Symbiota2 also provides web services, documented using the OpenAPI standard, for interacting with the data at a more abstract level. Finally, Symbiota2 produces a knowledge graph using the R2RML standard. An emerging focus will be to accommodate data created as part of the “Extended Specimen” phase of US collection digitization (e.g., genetics, traits, phylogenetics) and to integrate specimen-based data with environmental data.
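As a rough illustration of the path from database values to knowledge graph values described above, the Python sketch below renames the columns of one relational row to Darwin Core term names and then prints one triple per term. It is not Symbiota2 code and does not use R2RML. The column names, the sample row, and the example.org identifier are hypothetical; only the Darwin Core term names and namespace are taken from the published standard.

```python
# Darwin Core term namespace (from the Darwin Core standard).
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical relational column names mapped to Darwin Core term names.
# The real Symbiota2 schema is not described in the abstract.
COLUMN_TO_DWC = {
    "catalog_number": "catalogNumber",
    "sci_name": "scientificName",
    "collector": "recordedBy",
    "event_date": "eventDate",
}


def row_to_darwin_core(row):
    """Rename relational columns to Darwin Core terms, dropping empty values."""
    return {
        term: row[col]
        for col, term in COLUMN_TO_DWC.items()
        if row.get(col) not in (None, "")
    }


def record_to_triples(subject_iri, record):
    """Render a Darwin Core record as N-Triples-style lines, one per term."""
    lines = []
    for term, value in record.items():
        # Escape backslashes and double quotes inside the literal.
        literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_iri}> <{DWC}{term}> "{literal}" .')
    return lines


# Hypothetical specimen row and identifier, for illustration only.
row = {
    "catalog_number": "EX-0001",
    "sci_name": "Pinus ponderosa",
    "collector": "",
    "event_date": "2020-06-15",
}
record = row_to_darwin_core(row)
triples = record_to_triples("https://example.org/occurrence/EX-0001", record)

for line in triples:
    print(line)
```

In an R2RML-based setup, the same column-to-term correspondence would be declared in a mapping document and applied by a mapping engine rather than hand-written as above.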
Megan Cromwell: Video Data Solutions: Standard Biological Data Quantification and Data Accessibility with Machine Learning Techniques
Video data are qualitative data that are time-consuming to analyze, standardize, and archive in a user-friendly manner. Over the last 12 years, the NOAA National Centers for Environmental Data Management has collaborated with the NOAA Office of Ocean Exploration and Research and their partners to systematically work through many of these challenges. We have learned that standard annotations offer a key to finding and reusing these data for scientific analysis. Machine learning techniques are opening the door to annotating these data without the heavy human time commitment typically required. Partners from NOAA, academia, and non-governmental organizations (NGOs) have begun developing annotation methodologies through student crowd-sourced projects that will eventually resolve the many challenges associated with biological imagery annotation, for the benefit of data access and reuse.