Coastal Carbon Research Coordination Network


 

 

 

 

 

Now Accepting Nominations for our Steering Committee

The CCRCN is seeking nominations for three new members on our steering committee. Committee members contribute towards shaping the goals and timelines of the CCRCN and participate in collaborative research. Review will begin on December 1, 2018. See here for more information.

 

CCRCN Town Hall at AGU Fall Meeting, December 13 2018

The CCRCN will be hosting another town hall at the American Geophysical Union Fall Meeting, December 10-14, 2018. The town hall will be held from 18:15 to 19:15 on Thursday, December 13 in the Marriott Marquis. Learn more about the town hall here.

 

Soil Carbon Working Group Up and Running

The CCRCN's inaugural working group began collaborating in July. The group seeks to determine how much variation in carbon stocks and burial rates is attributable to scientific practices versus environmental covariates, and to develop map and model products. Learn more on the Soil Carbon Working Group page.

 

Learn Data Manipulation with the CCRCN R Coding Tutorials

We have just made available a suite of coding tutorials that utilize RStudio and tidyverse packages to teach data manipulation and visualization skills. These exercises, which are geared towards the introductory R user, enable you to explore the CCRCN Soil Carbon Data Release (version 1), but also will improve data skills related to your own projects.

Check out the coding tutorials here.

 

Apply for a Working Group

The application portal for the CCRCN working groups is now open. Five working groups, each culminating in an in-person workshop, will take place over the next five years. The steering committee has decided to announce the titles and timing of the first two working groups, as well as suggestions for future working groups. If you are interested in participating in any of the working groups, please fill out an application. The survey will remain open indefinitely; however, we will begin considering applicants for Working Group 1 on Thursday, July 12.

Apply Now

 

Soil Carbon Data Release

The dataset for a NASA CMS-funded and CCRCN-managed publication in Scientific Reports has now been made public via Smithsonian Libraries. Data from 1534 soil cores are available, including per-depth soil organic matter and carbon metrics, plant species identity, state of human impact, field and lab methodology, and core metadata. The paper, led by CCRCN manager James Holmquist, reveals that simple strategies are the most effective for mapping soil carbon, for now.

Click here for data release

 

Public Comment Open on Three Project Principles Documents

In an effort to be responsive to the community, we would like to announce the beginning of a public comment period for 3 key documents: CCRCN Governance, Data Management Principles, and Controlled Vocabulary for our Tidal Soil Carbon Synthesis Products. Public comment will be open for two weeks starting Monday April 2 and ending Friday April 13th.

View Documents

 

First Data Release

We officially launch starting January 2018, but we are already hosting our first synthetic dataset on the Smithsonian's GitHub page: a literature review supporting the U.S. Coastal Wetland National Greenhouse Gas Inventory. The page includes summary tables, a report, and an open workflow including SAS code used to create the dataset.

Building a Collaborative Network for Coastal Carbon Cycle Synthesis

Tidal marshes, mangrove swamps and seagrass meadows are unique ecosystems found on coastlines worldwide. These wetlands support specialized plant, microbe and animal species that collectively form some of the Earth’s most productive ecosystems, influencing the ecology of estuaries and coastal oceans. Coastal wetlands are also under severe pressure from human activity which threatens to diminish the many benefits they provide to people and aquatic food webs. Among these benefits is the fact that they remove large amounts of the greenhouse gas carbon dioxide from the atmosphere and bury it in soils for centuries to millennia. Indeed, these ecosystems account for nearly 50% of the organic carbon buried in the oceans despite occupying less than 1% of ocean area. This surprising fact suggests an opportunity: that protecting, restoring and managing these ecosystems could help manage greenhouse gas concentrations in addition to the list of other ecological and social benefits they provide. The pace of research on this topic has accelerated and is now too rapid to be synthesized by individual investigators. The goal of this Research Coordination Network is to advance the synthesis of coastal wetland carbon cycle data.

The Coastal Carbon Research Coordination Network will accelerate scientific discovery, advance science-informed policy, and improve coastal ecosystem management by: (1) developing a community dedicated to coastal wetland carbon science for basic research, policy development, and management, (2) exploring the ecological links between coastal wetlands, estuaries, and the atmosphere, and (3) sharing data and analysis tools that support the diverse needs of scientists, policy makers and managers. Activity 1 is a repository for participant-contributed data, and a central portal for downloading data from repositories of interest to the coastal carbon community. Activity 2 is a Coastal Carbon website to attract participation of diverse users by providing a variety of resources that meet their needs. It will provide data analysis tools, a knowledge sharing resource, a video library of training modules in standard methods, a code library to support modeling, links to publications, and a webinar library. Activity 3 is outreach via a series of webinars and 'town hall' gatherings at professional meetings. Activity 4 is a series of workshops on scientific gaps in coastal carbon. Activity 5 is a web-based tool for modeling global warming potentials.

 

Funding


NSF logo

 

 

Partners


 

usgs logo

 

 

CI logo

MBL logo

 

Blue Carbon Initiative

US carbon cycle science


About the CCRCN

The Coastal Carbon Research Coordination Network is a consortium of biogeochemists, ecologists, pedologists, and coastal land managers with the goal of accelerating the pace of discovery in coastal wetland carbon science by providing our community with access to data, analysis tools, and synthesis opportunities. We have accomplished this goal by growing iteratively with community feedback, facilitating the sharing of open data and analysis products, offering training in data management and analytics, and leading topical working groups aimed at quantitatively reducing uncertainty in coastal greenhouse gas emissions and storage. We also curate a Data Clearinghouse that offers infrastructure and tools for accessing, visualizing, and summarizing data. Major achievements include:

 

Outreach (count)
Twitter Followers: 448
Mailing List Subscribers: 200
Past and Present Steering Committee Members: 9
Past and Present Working Group Participants: 14

 

 

 

 

 

 

 

The CCRCN is overseen by three investigators and a rotating steering committee.

Steering Committee Timeline
I = Investigator, SC = Steering Committee, P = Personnel

 

Call for CCRCN Steering Committee Member Nominations

 

The Coastal Carbon Research Coordination Network Is Now Accepting Nominations for our Steering Committee

 

The Coastal Carbon Research Coordination Network (CCRCN) is seeking nominations for three new members on our steering committee. The CCRCN is an NSF-funded 5-year project hosted at Smithsonian Environmental Research Center with the goal of accelerating the pace of discovery in coastal carbon science. We do this by serving a diverse user base of scientists and practitioners with synthetic data, workshops, and open source modeling tools. Learn more about our project here.

Steering committee membership is a volunteer position and is essential to CCRCN governance. Participation has many benefits. Steering committee members advise CCRCN administrators and personnel on critical programs and initiatives and direct and participate in collaborative research; the CCRCN also recognizes steering committee institutions as partner organizations. This is an opportunity to take a substantial leadership role within the coastal carbon community. Committee membership carries a minimum one-year tenure, with the potential to extend or expand into other CCRCN leadership roles.

Duties for all steering committee members include:
  • Attend a 1-hour monthly teleconference
  • Attend an annual Town Hall at the American Geophysical Union Fall Meeting
  • Contribute towards shaping the goals and timelines of CCRCN products
  • Give regular constructive feedback to CCRCN personnel to improve project implementation and governance
  • Network to identify potential new sources of funds or collaboration
  • Vote on major decisions such as changes to the CCRCN governance structure
 
Steering committee members may also take on additional duties such as:
  • Assist in the leadership and implementation of topical working groups
  • Write grants or raise other funds to further CCRCN goals
  • Speak on behalf of the CCRCN at meetings and conferences
 

We hope to address specific project needs with these new additions, but emphasize that we will take all nominees into consideration.

1. We will prioritize a nominee who can represent the U.S. Geological Survey, as our two representatives from that vital partner agency are currently stepping down.
2. We will prioritize a nominee willing to take a leadership role in planning and executing our Methane Working Group (“Improved process modeling and mapping of tidal wetland methane emissions”) starting in Summer 2019, with an in-person workshop in December 2019.
3. We will prioritize a candidate who can advise us on our product development on behalf of our stakeholder community. We strongly encourage students and early career scientists to apply, as well as women and people from underrepresented communities, so that our steering committee can better reflect the diversity of people within our growing field.
 

If you would like to nominate yourself or a colleague, please email CoastalCarbon@si.edu by December 1st. If nominating yourself, please include a C.V. and a brief statement (no more than 300 words) indicating: 1. Your ability to execute the duties of a CCRCN steering committee member, 2. Any additional duties you would be willing to integrate into your CCRCN service, 3. How your participation would meet stated CCRCN needs, 4. Any ideas you have for pushing the CCRCN forward into 2019 and beyond.

 

The steering committee will meet to vote on nominations in December of 2018, and we will announce new members at our AGU Town Hall on Thursday, December 13th.

 

Please email us at CoastalCarbon@si.edu if you have any inquiries regarding the priorities and initiatives of the CCRCN.

 

More about the CCRCN Steering Committee

The CCRCN is hosted at the Smithsonian Environmental Research Center (SERC), staffed by two employees (James Holmquist, Manager; and David Klinges, Data Technician), supervised by two SERC principal investigators (J Patrick Megonigal, CCRCN Director, and Emmett Duffy), and advised remotely by five steering committee members (Lisamarie Windham-Myers, USGS; Kevin Kroeger, USGS; Jim Tang, MBL; Emily Pigeon, Conservation International; and Jorge Ramos, CI). Steering committee members serve for one year, but can be reappointed for one year at the discretion of the CCRCN Director. Candidates for the steering committee can nominate themselves or be nominated by their colleagues. The steering committee will vote on replacements, and members stepping down will help to choose their own replacement. Rotations are staggered so that no more than half of the rotating members on the Steering Committee are replaced in a given year. Learn more about CCRCN’s governance here and about current leadership here.

Coastal Carbon Working Groups

In order to quantitatively improve the state of the science through collaboration and training, one of our proposed activities is to use five topical working groups over the next five years to share data and expertise. 

Our first working group will meet in December 2018 at the Smithsonian Environmental Research Center. Participants for the working group have been selected, but if you are interested in attending to observe, and have the ability to fund your travel, please let us know at CoastalCarbon@si.edu.

Although we are no longer accepting applicants for our first working group, the application portal below is left open for gauging interest in future working groups.

Working Group Overview: Improved measuring, reporting, modeling, and mapping of soil carbon burial rates and carbon stocks in coastal wetlands

Learn about the members and goals of the CCRCN's inaugural working group.

Apply for Working Groups

Interested in joining a future working group? Express your interest here. Although no formal applications are currently being received, entries here will be used to gauge interest for topics that deserve attention.

 

Soil Carbon Working Group

Improved measuring, reporting, modeling, and mapping of soil carbon burial rates and carbon stocks in coastal wetlands

 

Time and Location:

December 8 and 9 (before AGU), hosted at the Smithsonian Environmental Research Center, Edgewater, Maryland

Research Questions: 

1. How much variation in carbon stocks and burial rates is attributable to field and lab techniques, and how much to environmental covariates?

2. What is the potential for machine learning or process-based modeling to map carbon stocks and burial rates?

Final Products:

1. Paper on best practices for field, lab, and data management

2. Papers on modeling and mapping

3. Mapped products at the scale of the contiguous United States if applicable

Working Group Application

**APPLICATIONS FOR WORKING GROUP 1 ARE NOW CLOSED.**

 

A goal of the CCRCN is to quantitatively improve the state of the science. One of our proposed activities is to use five topical working groups over the next five years to share data and expertise. The steering committee has decided to announce the titles and timing of the first two working groups, and suggestions for future working groups. Our decisions have been made based on three insights:

1. The initial results of a sensitivity analysis of U.S. coastal wetland flux which quantitatively rank priorities for reducing uncertainty.
2. Feedback from you at our 2017 AGU Town Hall, our 2018 community priorities survey, and individual outreach with many of you.
3. A recognition that the CCRCN steering committee needs to offer enough detail and leadership so that the workshops have direction, but not at the expense of remaining flexible to the changing nature of research and community priorities over the next five years.

These are the first two working groups we propose, and suggestions for future working groups:

1. Improved measuring, reporting, modeling, and mapping of soil carbon burial rates and carbon stocks in coastal wetlands

Time and Location: December 8 and 9 (before AGU), hosted at the Smithsonian Environmental Research Center, Edgewater, Maryland

Research Questions: 1. How much variation in carbon stocks and burial rates is attributable to field and lab techniques, and how much to environmental covariates? 2. What is the potential for machine learning or process-based modeling to map carbon stocks and burial rates?

Final Products: 1. Paper on best practices for field, lab, and data management; 2. Papers on modeling and mapping; 3. Mapped products at the scale of the contiguous United States if applicable.

2. Improved process modeling and mapping of tidal wetland methane emissions

Timing: December 2019

Potential Future Working Groups

Detecting Carbon Flux Associated with Wetland Loss and Restoration (Timing: 2020 or 2021)

CO2 Vertical Flux and Scaling from the Chamber, Eddy Flux, to the Globe  (Timing: 2020 or 2021)

Quantifying Uncertainty Reduced by CCRCN Products, Scaling Outside the US, Determining New Research Priorities (Timing: 2022)

Each topical working group will have 12-15 participants. Each participant will be expected to agree to a code of conduct, contribute at the level of a coauthor, participate in remote collaboration in the months leading up to a two day workshop, attend all of the workshop, and assist in revising analyses and reviewing paper drafts following the workshop. We strongly encourage students and early career scientists to apply, especially as participation in data synthesis may advance publication and career opportunities. To maximize the diversity and number of participants, applicants should not expect to be selected for more than two working groups over the next five years. Travel funding will be provided for in-person workshops. Unfortunately funding for non-U.S. based collaborators is very limited. If you can provide your own funding, please indicate this on the application as it may free up funds for international participants.

Please indicate your ranked preferences 1 = highest, 5 = lowest.

 

Principles and Governance

Version 1

3 July 2018

The goal of the Coastal Carbon Research Coordination Network (CCRCN) is to accelerate the pace of discovery in coastal wetland carbon science by providing our community with access to data, analysis tools, and synthesis opportunities. Our activities include bringing data libraries online, creating open source analysis and modeling tools, providing training and outreach opportunities, holding town halls, responding to community feedback, and hosting data synthesis workshops targeted at strategically reducing uncertainty in coastal carbon science issues. Our first focal activity is building a public online data library of soil carbon data.

The Coastal Carbon Research Coordination Network (herein Network) builds on work by the Blue Carbon Initiative, the NASA Blue Carbon Monitoring System, and the US Carbon Cycle Science Program. Our data management principles incorporate experience from these efforts, and best practices developed in collaboration with data management specialists across the Smithsonian and our partner institutions.

Core Principles


  1. We are responsive to a global community of scientists and practitioners.
  2. We focus on quantifiable improvements to the state of the science.
  3. We adopt protocols, policies, and communication platforms that facilitate transparency, ease of adoption, program sustainability, and data stability.

Defined Roles and Responsibilities within the CCRCN


  • Principal Investigators - Emmett Duffy, Patrick Megonigal, and James Holmquist are responsible for executing the project as proposed to NSF, reporting to NSF on project progress, and carrying out the fiduciary requirements of the grant.
  • Director - The Director (Patrick Megonigal) is responsible for directing the activities of the Principal Investigators, and overall management of the Network.
  • Manager - The Manager (currently James Holmquist) is hired by the Director with input from the steering committee. The manager is responsible for leading the daily activities of the Network, responding to stakeholders, and interacting with the Steering Committee.
  • Steering Committee Members - Steering committee members are responsible for advising the Director and Manager on Network management, adherence to core principles, workshop topics, and evaluating steering committee nominees.
  • CCRCN Personnel - Principal Investigators, the Steering Committee, and people with official Smithsonian affiliations who work on Network-related tasks. Personnel are responsible for implementing the activities of the Network.
  • Collaborators / Collaborating Organizations - Collaborators include researchers who are not Network personnel, but are otherwise actively contributing to Network products in collaboration with Network personnel, or as part of data synthesis workshops. Collaborators are expected to participate at the level of co-author on synthesis products. Collaborating Organization responsibilities are similar to Collaborators except the organization and the Network have entered into a memorandum of understanding to formalize expectations with respect to the Network. Collaborating organizations are expected to explore financial support of activities that are not supported by the NSF funding that established the Network, and to explore opportunities to secure funding to support Network activities beyond the initial five years of NSF funding.
  • Partners / Partner Organizations - Include anyone engaged in any Network activities, but more informally than Collaborators. These include remote consultations, town halls, Twitter, webinars, online surveys, and participating in public comment periods. The best way to be recognized as a partner is to sign up for regular Network email updates. Partner Organizations include any organizations that interact with the Network in a manner similar to Partners, but have entered into a memorandum of understanding with the Network that formalizes activities such as consultation or other non-monetary support for Network goals.
  • Users - Anyone using data structures or synthesis products created by the Network. There is no obligation to involve Network personnel or collaborators in individual research efforts beyond the workshops funded by the Network NSF grant. Users are responsible for properly citing Network synthesis products, and properly citing original authors when datasets curated or synthesized by the Network are downloaded and reused. See the Data Use Policy.

Steering Committee Membership


The three Principal Investigators are permanent members of the Steering Committee. Five additional members will be chosen to assist the Network with existing or emerging needs as identified by the Steering Committee. In principle, the members should represent a range of stakeholder interests and technical expertise. Members serve for one year, but can be reappointed for one year at the discretion of the Director. Candidates for the steering committee can nominate themselves or be nominated by their colleagues. The steering committee will vote on replacements, and members stepping down will help to choose their own replacement. Rotations are staggered so that no more than half of the rotating members on the Steering Committee are replaced in a given year.

Data Synthesis Workshops


Data synthesis products led by the Network will be developed over the course of five workshops organized and led by Network personnel and collaborators. Potential collaborators can propose a workshop, or apply to participate in workshops. The Steering Committee will vote on topics and participants for each workshop, which will typically have 12-15 participants.

Coauthorship Policy


Those accepted to participate in any of the five synthesis workshops hosted by the Network will be expected to contribute before, during, and following the workshop, and will be granted co-authorship on publications resulting from the effort. We will follow the American Geophysical Union’s 2017 Scientific and Professional Ethics: Guideline B. Ethical Obligations of Authors/Contributors for determining co-authorship. Submitting data to the network alone will not merit co-authorship in data syntheses. If done according to the protocols established herein, it will result in citation.

Co-authorship policies in data sharing exercises often benefit established researchers from western industrialized nations at the expense of those from groups with less institutional power [1]. We commit to adopting policies and technologies that facilitate engagement of students, people with indigenous knowledge, and researchers from low- and middle-income countries as attendees and co-authors in synthesis activities. This policy will be implemented within the limits of NSF CCRCN grant resources.

Data Use Policy


We define users as anyone using either data we curate or synthesis products we create. Data that are curated, but not created, by the Network should not be attributed to the Network. Users should cite all dataset DOIs and credit the datasets’ original authors. All synthesis products created by the Network and associated collaborators will be listed under a Creative Commons Attribution license. The Network should be acknowledged and cited appropriately if users utilize any of the data structures, tools, or scripts developed by the Network and associated collaborators. We will develop additional tools to assist users in generating lists of citations, but users will be ultimately responsible for correctly citing all data used.


  1. Serwadda D, Ndebele P, Grabowski MK, Bajunirwe F, Wanyenze RK (2018). Open data sharing and the Global South: Who benefits? Science, 359(6376), 642-643. https://doi.org/10.1126/science.aap8395

Data Management, Structure, and Products

Coastal Carbon Data Clearinghouse

 

 

Data Management Plan (Version 1)

The CCRCN strives for transparency in its methods of data archival and management. Here we give an overview of the data types and structure associated with the Network, and of our procedures for data storage and quality control.

Database Structure (Version 1)

An exhaustive description of the naming convention for attributes and variables recommended by the CCRCN.

Data Products

The Network recognizes three classes of data: (i) data that we curate, (ii) data that we ingest, and (iii) synthesis products we create. Data that we curate will be hosted on Smithsonian Institution (SI) servers, but the original data submitter and funding sources will be credited as the dataset’s creators. Data that we ingest will include both data we curate and data from any outside sources that meet basic availability, archiving, and metadata standards.

See below for a list of CCRCN data products.

 

Dataset: Accuracy and Precision of Tidal Wetland Soil Carbon Mapping (associated paper here)
Data Components: Per-depth soil organic matter and carbon metrics, plant species identity, state of human impact, field and lab methodology, and metadata of 1534 soil cores
Last Updated: 21 June 2018

Dataset: Coastal National Greenhouse Gas Inventory: Report, Datasets, and Workflow
Data Components: Literature review, data, analysis, and report
Last Updated: 9 December 2017

 

Testimonials to Data Contribution

 

“The Coastal Carbon Research Coordination Network dataset has been invaluable in our recent research identifying global drivers of variability in coastal wetland carbon cycling. The Network’s dataset greatly complemented our own previous data collation efforts, filling important gaps in our record. The availability of a comprehensive and well-curated dataset allowed us to focus on the analysis and interpretation of data, deriving important new insights in global patterns of carbon storage.”

- Jeffrey Kelleway, Department of Environmental Sciences, Macquarie University

"The CCRCN database is a key cornerstone in accelerating the pace of discovery for coastal carbon cycling. I recently downloaded version 1, and have begun analyzing it and intercomparing its features with other national and global sets on soil core characteristics. As it focuses only on soil cores from tidal wetland, it is the single largest and spatially explicit empirical dataset, globally, for populating carbon stock assessments or testing models across space and time. For coastal lands, it is an invaluable asset for scientists and managers alike. The developers of the dataset and platform should be commended, as should the many community contributors who are fueling advances in science and practice by sharing their data."

- Lisamarie Windham-Myers, U.S. Geological Survey

 


Data Management Plan

Version 1

3 July 2018

 

Types of Data


The Network recognizes three classes of data: (i) data that we curate, (ii) data that we ingest, and (iii) synthesis products we create. Data that we curate will be hosted on Smithsonian Institution (SI) servers, but the original data submitter and funding sources will be credited as the dataset’s creators. Data that we ingest will include both data we curate and data from any outside sources that meet basic availability, archiving, and metadata standards. These data will be pulled into intermediate files in a centralized database using R code. The workflow and files will be archived and publicly available on an SI-managed GitHub website. This document refers to soil depth profile data throughout, but it is our intention that these general structures and principles be applied to other types of data as the Network evolves.

 

Digital Object Identifiers


We encourage submitters to use best practices and to assign datasets a citable digital object identifier (DOI), which links to a repository containing downloadable data and associated metadata. We will prioritize ingesting such data into the synthesis. Data submitters can choose to forward DOIs issued outside of the Network for ingestion into the central data structure on the Network’s SI GitHub website. DOI-issuing repository services include Figshare and the Environmental Data Initiative. As a service to the community, Network personnel will be available to assist data submitters in archiving data according to outlined standards.

Submitters also have the option to host data on an SI server and apply for a DOI through SI libraries, with the submitters credited as dataset authors and Network personnel credited for their curatorial role. Landing pages with summaries of projects, sites, and cores will be viewable on DSpace, and the CCRCN website will advertise a link to the data release. While data will be digitally archived long-term in accordance with SI standards, we cannot guarantee new data will be accepted after Network funding ends.

While we recognize that there is no official definition for what constitutes a trusted repository, repositories associated with DOIs should in general have community recognition and trust in their long-term stability. For data curated by the Network we hope that SI’s reputation, DSpace’s status as an approved technology, and the SIL’s commitment to digital object curation, generate this level of community trust.

 

Metadata Standards


For data curated by the Network, we will use the Ecological Metadata Language (EML) standard. This includes an abstract, detailed submitter information, attribute definitions, and data types (e.g. character, factor, numeric, or dateTime). CCRCN personnel will use the R-based EML package in our workflow to create metadata for data that we curate. Code used to create EML will be documented and archived in a Smithsonian GitHub repository.

 

Attribute Names

Attribute names (analogous to column names in a spreadsheet) should follow good management practices [1]. Attribute names should be self-descriptive and machine-readable. They should contain no spaces and must not begin with a number or special character; however, underscores (i.e. pothole_case) are acceptable. We will recommend and adopt controlled vocabulary for attribute naming. Any submitter-defined attributes should follow the same naming principles and documentation.

Units for all attributes need to be defined and in some cases controlled. For variables that are typically reported in a common unit, we recommend that submitters use these controlled units. These include fraction_organic_matter (fraction), dry_bulk_density (g cm⁻³), and latitude and longitude (decimal degrees [World Geodetic System 1984]). For attributes that are applicable to the synthesis but typically have multiple common unit formats, we recommend an accompanying column defining these units. Uncommon data types, or data types not included in synthesis projects, simply need to have units defined in associated metadata.

Good data practices require consistent formatting of no-data values and categorical variables. We have adopted the R-based convention of representing no-data values as NA for all variable types (never blanks). Categorical variables should have descriptive names stored as text, similar to attribute names. For example, one may code the categorical variable treatments as numeric values 0 and 1 standing in for experimental and control; however, best practices would dictate coding these as descriptive characters (experimental and control) rather than numbers.
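As an illustration of the two conventions above, here is a minimal Python sketch (the Network's own workflow is R-based, so this is only an analogy). The helper names, the `treatment` column, and the 0/1 mapping are hypothetical and chosen to mirror the example in the text.

```python
# Illustrative only: helper names and the "treatment" column are hypothetical.
NO_DATA = "NA"

# Assumed numeric coding, mirroring the example in the text (0 and 1).
TREATMENT_LABELS = {"0": "experimental", "1": "control"}


def clean_value(raw):
    """Return 'NA' for blank or missing cells, otherwise the stripped text."""
    if raw is None or str(raw).strip() == "":
        return NO_DATA
    return str(raw).strip()


def recode_treatment(raw):
    """Replace numeric treatment codes with descriptive labels."""
    value = clean_value(raw)
    return TREATMENT_LABELS.get(value, value)


rows = [
    {"core_id": "core_1", "treatment": "0"},
    {"core_id": "core_2", "treatment": "1"},
    {"core_id": "core_3", "treatment": ""},  # blank cell in the submission
]
cleaned = [
    {"core_id": r["core_id"], "treatment": recode_treatment(r["treatment"])}
    for r in rows
]
```

Blank cells become NA and numeric codes become readable text, so the table can be interpreted without a separate lookup key.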

For data we curate, we will use controlled vocabulary, units, and variable types. For data we ingest, we will keep a file of controlled variable names and their corresponding aliases so that data not complying with the controlled vocabulary can still be ingested. We will document, in R code, the transformations made to ingested data to standardize them with the data we curate.
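A rough sketch of how an alias file might be applied during ingestion is shown below. It is written in Python for illustration (the Network documents its transformations in R), and the alias entries and function name are assumptions, not the Network's actual lookup table.

```python
# Hypothetical alias table: submitted name -> controlled attribute name.
ALIASES = {
    "bulk_density": "dry_bulk_density",
    "lat": "latitude",
    "lon": "longitude",
}


def standardize_names(record, aliases=ALIASES):
    """Rename aliased attributes and return (renamed record, log of changes)."""
    renamed = {}
    log = []
    for name, value in record.items():
        target = aliases.get(name, name)
        if target != name:
            log.append((name, target))  # keep a record of each transformation
        renamed[target] = value
    return renamed, log


record = {"core_id": "core_1", "lat": 38.9, "lon": -76.5, "bulk_density": 0.42}
standardized, change_log = standardize_names(record)
```

Returning the change log alongside the record keeps every renaming traceable, which is the point of documenting transformations.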

Proposed Level of Disaggregation

In general we believe that there should be community agreement on the finest level of data disaggregation archived for practical use and reuse. This fundamental unit should be the most detailed unit typically used and reported in the literature. For soils data we will stratify by site, by core, and by depth increment. For calculations such as loss-on-ignition and bulk density, data by depth increment will be the fundamental level of archiving. For age-depth information we will archive radiocarbon (14C) age ± sd of a sample for 14C dates, and counts per unit dry weight ± sd of a sample for 210Pb and 137Cs profiles.

Hierarchical Structure

We will ingest existing data and curate submitted data in a hierarchical framework. Information associated with submitters, projects, sites, cores, and depth profiles will all be hosted in separate tables related by index codes that are unique. A universal dataset index will be composed of the principal investigator’s family name, as well as the second author or ‘et al’ in the case of more than two authors, then the publication year. Sequential letters will be added to the end (a,b,c, etc.) in case of multiple publications per year (Example: Jane Doe, Lee Fakeman and Ben Mademup’s 2009 paper = Doe_et_al_2009).
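The indexing rule above can be sketched as a small Python function. The function name and the `sequence` argument are hypothetical, and appending the letter directly after the year (with no underscore) is an assumption, since the text does not show an example with a suffix.

```python
import string


def make_study_id(family_names, year, sequence=0):
    """Build a dataset index from author family names and publication year.

    family_names: author family names in publication order.
    sequence: 0 for no suffix, 1 for 'a', 2 for 'b', and so on (assumed scheme).
    """
    if not family_names:
        raise ValueError("at least one author family name is required")
    parts = [family_names[0]]
    if len(family_names) == 2:
        parts.append(family_names[1])
    elif len(family_names) > 2:
        parts.append("et_al")
    parts.append(str(year))
    study_id = "_".join(parts)
    if sequence > 0:
        # Assumption: the letter is appended directly after the year.
        study_id += string.ascii_lowercase[sequence - 1]
    return study_id


example = make_study_id(["Doe", "Fakeman", "Mademup"], 2009)
```

With the three authors from the example in the text, `example` evaluates to `Doe_et_al_2009`.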

Project metadata will have an abstract and information about coauthors, associated funding source, or set of funding sources, associated publications, and materials and methods. A project should be a discrete unit of research united by consistent personnel, funding sources, and/or materials and methods. A project can be associated with one or more sites.

Sites refer to discrete geographic or management units and are somewhat nebulous, project specific, and submitter defined. A site code should follow the same best practices for variable naming: not starting with a number, descriptive, brief, and meaningful to project documentation and design. Site metadata refer to data associated with the sites, such as location, notes on dominant vegetation types, salinity, and site condition. Although there are no standards for what constitutes a site, and different projects could have different names for the same site, this coding should be consistent within a project. A site can have one or more data sets, including one or more core, plot, or instrument locations.

Core/Plot/Instrument-Level Data refer to information specifically about the location of a soil core. This could include positional information such as latitude, longitude, and elevation. It could also include notes that are redundant but more detailed than site metadata, such as vegetation and salinity. Each core should have a unique code within a project. A core code should follow the same best practices for variable naming: not starting with a number, descriptive, brief, and meaningful to project documentation and design. For future syntheses this level of hierarchy could also house other types of data such as vegetation plots or instruments.

Depth Series Information: Soil cores have depth-series information which should include minimum and maximum depth increments, as well as measurements presented in their fundamental unit (explained above), with associated methods notes and uncertainties. If replicate samples are taken from the same depth increment, then these can be distinguished with a sample code. Sample codes can be submitter-defined, but should conform to general principles for variable naming. In future syntheses, time-series data could also be archived in this format, with instrument replacing core and time replacing depth.
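To make the hierarchy concrete, the sketch below keeps core-level and depth-series records in separate tables and relates them through shared index codes. It is a plain-Python illustration with made-up values. Because core codes are only unique within a project, the join uses both the study and the core identifier.

```python
# Made-up example records; values are for illustration only.
cores = [
    {"study_id": "Doe_et_al_2009", "site_id": "marsh_a", "core_id": "core_1",
     "latitude": 38.9, "longitude": -76.5},
]
depth_series = [
    {"study_id": "Doe_et_al_2009", "core_id": "core_1",
     "depth_min": 0, "depth_max": 10, "dry_bulk_density": 0.42},
    {"study_id": "Doe_et_al_2009", "core_id": "core_1",
     "depth_min": 10, "depth_max": 20, "dry_bulk_density": 0.55},
]


def join_on(left, right, keys):
    """Attach the matching row of `right` to every row of `left` (inner join)."""
    index = {tuple(row[k] for k in keys): row for row in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            joined.append({**match, **row})
    return joined


# Each depth increment picks up the position of its parent core.
joined = join_on(depth_series, cores, ["study_id", "core_id"])
```

Storing position once per core and joining on demand avoids repeating the same coordinates on every depth increment.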

 

Data Storage

Tabular data will be stored in flat text files, meaning that no data formatting will be included. We will default to using tab-delimited text files (.txt) for simplicity. Although comma-separated values (.csv) files are also common, they can be corrupted if commas are ever used within a value rather than to delimit variables. Submissions that are received in other formats, such as Microsoft Excel files, will be edit-locked and archived as submission documents. However, a working version of each such submission will be formatted according to Network standards in flat text files.
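The following sketch shows why a tab delimiter is the safer default when free-text values may contain commas. It uses Python's standard `csv` module. The `site_notes` column and its content are hypothetical.

```python
import csv
import io

# Hypothetical site table with a free-text note that contains a comma.
rows = [{"site_id": "marsh_a", "site_notes": "high marsh, brackish"}]

# Write tab-delimited text to an in-memory buffer (a file would work the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["site_id", "site_notes"],
                        delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
text = buffer.getvalue()

# Reading the text back recovers the note intact.
read_back = list(csv.DictReader(io.StringIO(text), delimiter="\t"))

# For contrast: naively splitting an unquoted comma-separated line
# breaks the note into two fields.
naive_csv_fields = "marsh_a,high marsh, brackish".split(",")
```

Well-formed CSV can protect embedded commas with quoting, but not every tool that writes or reads CSV does so consistently, which is the failure mode the text warns about.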

Tabular data will be stored in long-form tables, as opposed to wide-form tables. Each column should correspond to one variable and each row to one entry. Each column needs to have a single data type. Extra information such as units, notes, or operator code will not be encoded as an Excel note or cell color, or be included in the same cell as a value.
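As a sketch of the difference, the Python snippet below reshapes a wide-form record, where depth intervals are encoded in column names, into long-form rows with one depth increment per row. The `dbd_<min>_<max>` column naming is a hypothetical example of a wide-form submission, not a format the Network prescribes.

```python
# Hypothetical wide-form submission: depth intervals hidden in column names.
wide = [{"core_id": "core_1", "dbd_0_10": 0.42, "dbd_10_20": 0.55}]


def wide_to_long(rows, prefix="dbd_"):
    """Turn columns named <prefix><min>_<max> into one row per depth increment."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if not name.startswith(prefix):
                continue
            depth_min, depth_max = (int(p) for p in name[len(prefix):].split("_"))
            long_rows.append({
                "core_id": row["core_id"],
                "depth_min": depth_min,
                "depth_max": depth_max,
                "dry_bulk_density": value,
            })
    return long_rows


long_form = wide_to_long(wide)
```

In the long form, depth becomes data in its own columns, so every column holds exactly one variable with one data type.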

 

Quality Control

Network personnel will perform a cursory visual check on all data we curate and relay any concerns to the data submitter during the curation process. We will also write scripts to check data type, to check that values for each attribute are in defined bounds if applicable (such as 0 to 1 for fractions), to check for completeness, and to ensure there are no conflicts or duplicates with previously archived data. For data that we curate, files will be edit-locked following QA/QC. Any updates or corrections will result in a new named version of the file, a change logged by Network personnel in a text file associated with the data, and a note of the change included in the next update sent to Network email list members.
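The scripted checks described above might look roughly like the following Python sketch. It is not the Network's QA/QC code: the required attributes, the bounds table, and the choice of duplicate key are assumptions made for illustration.

```python
# Assumed settings for illustration; the real checks may differ.
REQUIRED = ["study_id", "core_id", "depth_min", "depth_max"]
BOUNDS = {"fraction_organic_matter": (0.0, 1.0)}


def check_rows(rows):
    """Return a list of (row index, attribute, problem) tuples."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        # Completeness: required attributes must be present and not NA.
        for name in REQUIRED:
            if row.get(name) in (None, "", "NA"):
                problems.append((i, name, "missing"))
        # Data type and bounds for attributes with defined limits.
        for name, (low, high) in BOUNDS.items():
            value = row.get(name)
            if value in (None, "NA"):
                continue
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append((i, name, "not numeric"))
            elif not low <= value <= high:
                problems.append((i, name, "out of bounds"))
        # Duplicates: the same depth increment of the same core seen twice.
        key = (row.get("study_id"), row.get("core_id"),
               row.get("depth_min"), row.get("depth_max"))
        if key in seen:
            problems.append((i, "core_id", "duplicate"))
        seen.add(key)
    return problems


rows = [
    {"study_id": "Doe_et_al_2009", "core_id": "core_1", "depth_min": 0,
     "depth_max": 10, "fraction_organic_matter": 0.35},
    {"study_id": "Doe_et_al_2009", "core_id": "core_1", "depth_min": 10,
     "depth_max": 20, "fraction_organic_matter": 1.4},
    {"study_id": "Doe_et_al_2009", "core_id": "core_1", "depth_min": 0,
     "depth_max": 10, "fraction_organic_matter": 0.35},
]
problems = check_rows(rows)
```

Here the second row is flagged as out of bounds and the third as a duplicate of the first.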

 

Submitting Data

If you are interested in submitting data, please email CoastalCarbon@si.edu and CCRCN personnel will assist you in the process. We are working on building a web portal to automate much of this exchange, but until then this will remain a very friendly peer-to-peer handoff system. Data submissions can remain embargoed for a time specified by the submitter. In embargo cases, a data release will be prepared and shared with the submitter via a private Dropbox link until the embargo period ends. The data release is then made public and the dataset is drawn into synthesis products.


  1. Wilson G, Bryan J, Cranston K, Kitzes J, Nederbragt L, Teal TK (2017) Good enough practices in scientific computing. PLoS Comput Biol 13(6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510

Data Products

 

The Network recognizes three classes of data: (i) data that we curate, (ii) data that we ingest, and (iii) synthesis products we create. Data that we curate will be hosted on Smithsonian Institution (SI) servers, but the original data submitter and funding sources will be credited as the dataset’s creators. Data that we ingest will include both data we curate and data from any outside sources that meet basic availability, archiving, and metadata standards.

See below for a list of CCRCN data products.

 

Dataset | Data Components | Last Updated
Accuracy and Precision of Tidal Wetland Soil Carbon Mapping (associated paper here) | Per-depth soil organic matter and carbon metrics, plant species identity, state of human impact, field and lab methodology, and metadata of 1534 soil cores | 21 June 2018
Coastal National Greenhouse Gas Inventory: Report, Datasets, and Workflow | Literature review, data, analysis, and report | 9 December 2017

 

Testimonials to Data Contribution

 

“The Coastal Carbon Research Coordination Network dataset has been invaluable in our recent research identifying global drivers of variability in coastal wetland carbon cycling. The Network’s dataset greatly complemented our own previous data collation efforts, filling important gaps in our record. The availability of a comprehensive and well-curated dataset allowed us to focus on the analysis and interpretation of data, deriving important new insights in global patterns of carbon storage.”

- Jeffrey Kelleway, Department of Environmental Sciences, Macquarie University

"The CCRCN database is a key cornerstone in accelerating the pace of discovery for coastal carbon cycling. I recently downloaded version 1, and have begun analyzing it and intercomparing its features with other national and global sets on soil core characteristics. As it focuses only on soil cores from tidal wetland, it is the single largest and spatially explicit empirical dataset, globally, for populating carbon stock assessments or testing models across space and time. For coastal lands, it is an invaluable asset for scientists and managers alike. The developers of the dataset and platform should be commended, as should the many community contributors who are fueling advances in science and practice by sharing their data."

- Lisamarie Windham-Myers, U.S. Geological Survey

 

Database Structure

Naming Conventions for Attributes and Variables (Version 1)

 

3 July 2018

Overview

This page serves as guidance for the types and scope of data and metadata that will be archived as part of the Network’s developing tidal soil carbon synthesis. We propose the following data structure and standardized attribute names for metadata and data in order to make datasets machine-readable and interoperable. Each subheading lists a level of metadata or data hierarchy from study level metadata to site level to core level to depth series information. Each subheading also represents separate tables which can be joined by common attributes such as study_id, site_id, and core_id. We also include accompanying sets of recommended controlled vocabulary for key categorical variables (also known as factors). Some attributes have controlled units that we wish to keep uniform across datasets. Data that we curate will follow naming conventions outlined herein. Data that we ingest from outside sources will be converted to these conventions when being ingested into the central GitHub database using custom-built R scripts.

At a minimum a submission should have the following for inclusion in soil carbon synthesis products: study_id, author information, core_id, latitude and longitude information associated with either a core or the site, depth_min, depth_max, dry_bulk_density, organic_matter_fraction and/or carbon_fraction. The more auxiliary detail that you provide, the more widely your data can be used. Throughout the tables below mandatory attributes are shown in bold.
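A submission's column names could be screened against this minimum with something like the Python sketch below. The function name and the returned messages are hypothetical. Author information is left out because it lives in a separate table.

```python
# Attributes named in the minimum requirements above.
ALWAYS_REQUIRED = {"study_id", "core_id", "depth_min", "depth_max", "dry_bulk_density"}
ONE_OF = {"organic_matter_fraction", "carbon_fraction"}
POSITION = {"latitude", "longitude"}


def missing_minimum(depth_columns, core_columns=(), site_columns=()):
    """List what a submission still lacks for inclusion in synthesis products."""
    columns = set(depth_columns)
    missing = sorted(ALWAYS_REQUIRED - columns)
    if not (ONE_OF & columns):
        missing.append("organic_matter_fraction or carbon_fraction")
    # Position may be supplied at either the core level or the site level.
    if not (POSITION <= set(core_columns) or POSITION <= set(site_columns)):
        missing.append("latitude and longitude (core or site)")
    return missing


complete = missing_minimum(
    ["study_id", "core_id", "depth_min", "depth_max", "dry_bulk_density", "carbon_fraction"],
    core_columns=["core_id", "latitude", "longitude"],
)
incomplete = missing_minimum(["study_id", "core_id", "depth_min"])
```

An empty list means the minimum is met; otherwise each entry names something the submitter still needs to provide.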

The depth series is the level at which carbon-relevant information is housed. This synthesis will not ingest core-level or site-level averages of variables like dry bulk density, fraction organic matter, or fraction carbon. These averages can be derived from the database, but are not immediately useful to our research questions unless those averages can be traced back to their original data.
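To illustrate how such averages can be derived from depth-series records, here is a minimal Python sketch. Weighting each value by the thickness of its depth increment is an assumption made for this example. The Network does not prescribe an averaging method here.

```python
from collections import defaultdict


def depth_weighted_mean(rows, attribute):
    """Return {core_id: thickness-weighted mean of `attribute`}, skipping NA."""
    totals = defaultdict(lambda: [0.0, 0.0])  # core_id -> [weighted sum, thickness]
    for row in rows:
        value = row.get(attribute)
        if value in (None, "NA"):
            continue
        thickness = row["depth_max"] - row["depth_min"]
        totals[row["core_id"]][0] += value * thickness
        totals[row["core_id"]][1] += thickness
    return {core: s / t for core, (s, t) in totals.items() if t > 0}


depth_series = [
    {"core_id": "core_1", "depth_min": 0, "depth_max": 10, "dry_bulk_density": 0.4},
    {"core_id": "core_1", "depth_min": 10, "depth_max": 30, "dry_bulk_density": 0.7},
    {"core_id": "core_1", "depth_min": 30, "depth_max": 40, "dry_bulk_density": "NA"},
]
core_means = depth_weighted_mean(depth_series, "dry_bulk_density")
```

Going from depth series to an average is straightforward, but the reverse is impossible, which is why only the depth-level data are ingested.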

There are many opportunities to express your data’s individuality. We refer throughout to ‘flags’ and ‘notes’. Flags refer to common methodological choices or data issues that can be coded using categorical variables. The idea behind flags is to allow users the option to query datasets based on methodology. Flags are very machine-readable but not very flexible from the standpoint of a submitter. Notes are available for almost all measured attributes and take the form of free-text allowing submitters to provide context, observations, or concerns about methods, sites, cores, or observations. These are more flexible from the perspective of a submitter but are less machine-readable.
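The practical difference shows up when querying. In the Python sketch below, studies are selected by an exact match on a flag code, whereas a free-text note can only be read or searched loosely. The records are made up, and the flag codes are taken from the Materials and Methods vocabulary on this page.

```python
# Made-up methods records for two hypothetical studies.
methods = [
    {"study_id": "Doe_et_al_2009", "compaction_flag": "compaction quantified",
     "carbon_profile_notes": "Cores were measured in the field before extrusion."},
    {"study_id": "Fakeman_2011", "compaction_flag": "not specified",
     "carbon_profile_notes": "NA"},
]


def studies_with_flag(rows, attribute, code):
    """Return study_ids whose flag attribute exactly matches a controlled code."""
    return [row["study_id"] for row in rows if row.get(attribute) == code]


quantified = studies_with_flag(methods, "compaction_flag", "compaction quantified")
```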

Development Process to Date

This guidance is the culmination of three efforts:

  1. A meeting of 47 experts in Menlo Park, CA in January 2016, hosted by the United States Carbon Cycle Science Program, in order to establish community priorities.

  2. Experience with the initial curation of a dataset of ~1,500 public soil cores as part of the publication Holmquist et al., 2018 Accuracy and Precision of Tidal Wetland Soil Carbon Mapping in the Conterminous United States.

  3. The results of 19 collaborators submitting commentary on an initial draft of these recommendations put up for public comment in April and May 2018.

Ongoing and Future Development

We acknowledge that this is a lot of information to process and do not want to imply >100 attributes are mandatory. They are not. While we will make the entire entry template available for download (LINK PENDING), we are also in the process of designing an application which will generate a custom submission template based on your answers to a questionnaire about your dataset.

Submitters can feel free to add other attributes to data submissions as long as the attributes and any associated categorical variables are defined with the submission. CCRCN personnel will accept and archive related soils data within reason, but will not be able to quality control data falling outside the outlined guidance. If attributes or variables are submitted often and there is community coordination behind their inclusion, they could be integrated into periodic updates to this guidance.

We anticipate that this guidance will evolve as we synthesize new datasets as part of five working groups. Part of each working group’s task will be to revisit this guidance and agree on new needed attribute names, definitions, variables, controlled vocabulary and units. Any further guidance based on the working group’s experience will be made available to the community via post-workshop reports and peer-reviewed publications. Documentation on any changes to the data management plan and submission templates will be issued with version numbers. CCRCN products will reference these documents and version numbers. We will avoid changing attribute or variable names, and will only do so if there is a compelling reason. If in the future there is more than one acceptable, redundant attribute or variable name, the names will be added to a database of synonyms and working synthesis products will be updated to the most current standards.

Study Level Metadata

Study-level information is essential for formatting the Ecological Metadata Language, and is a great way for you to express your project’s history, context, and originality.

Study Information

Please provide some custom text for your study.

Study Information
attribute name | definition | data type | format, unit or codes
study_id | Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. | character |
one_liner | If this is data the CCRCN curates, the submitter should include a one line description of the study. | character |
study_code | If this is data the CCRCN curates, the study will be assigned a 128-bit universal unique identifier. Submitters should only include this if it already exists for the data. Otherwise CCRCN personnel will generate this as part of the data ingestion process. | character |
study_start_date | Study start date. | Date | YYYY-MM-DD
study_end_date | Study end date. | Date | YYYY-MM-DD
title | If this is data the CCRCN curates, the submitter should include a study title. If this is data the CCRCN is ingesting, this can be pulled from the metadata or source text. | character |
abstract | If this is data the CCRCN curates, the submitter should include a one paragraph description of the study. If this is data the CCRCN is ingesting, this can be pulled from the metadata or source text. | character |
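For the study_code attribute, a 128-bit universal unique identifier can be generated with standard tooling. The sketch below uses Python's `uuid` module as an example. Using a random (version 4) UUID and the function name are assumptions, since the guidance does not say which UUID version or tool CCRCN personnel use.

```python
import uuid


def ensure_study_code(existing=None):
    """Keep an existing UUID (validated and normalized) or generate a new one."""
    if existing:
        # uuid.UUID raises ValueError if the text is not a valid UUID.
        return str(uuid.UUID(existing))
    # Assumption: a random (version 4) UUID is acceptable as a study_code.
    return str(uuid.uuid4())


new_code = ensure_study_code()
kept_code = ensure_study_code("12345678-1234-5678-1234-567812345678")
```

Passing an existing identifier through unchanged matches the guidance that submitters only supply a study_code when one already exists.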

Keywords

Keywords are not necessary, but can help make your data more searchable in a database.

Keywords
attribute name | definition | data type | format, unit or codes
study_id | Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. | character |
key_words | If this is data the CCRCN curates, the submitter should include five to fifteen descriptive words or phrases describing the study. If this is data the CCRCN is ingesting, this can be pulled from the metadata or source. Keywords help build some search functionality into the databases. | character |

Authors

For each dataset at least one corresponding author should be specified. Specifying author names will allow users (or you in the future) to query the dataset and see how many cores you’ve submitted.

Authors
attribute name | definition | data type | format, unit or codes
study_id | Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. | character |
last_name | Submitter’s family name. | character |
given_name | Submitter’s first name, middle name, middle initial, or any other names. | character |
institution | Submitter’s current institution. | character |
email | Submitter’s current email address. | character |
address | Submitter’s current mailing address. | character |
phone | Submitter’s current phone number. | character |
corresponding_author | TRUE or FALSE indicating whether the author should be contacted as the corresponding author. | factor | TRUE = The author should be contacted with any further questions. FALSE = The author should not be contacted with any further questions.

Funding Sources

Your funders will love being acknowledged in a data release, and will appreciate being searchable in the database. One dataset can have multiple funding sources.

Funding
attribute name | definition | data type | format, unit or codes
study_id | Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. | character |
funding_agency | Agency name funding the research, spelled out, no acronyms. | character |
funding_id | Code used by the agency to track the project funding. | character |
funding_notes | Any other submitter-generated notes about the project funding. | character |

Associated Publications

One dataset can be affiliated with multiple publications. This allows an original work to be cited as a primary source, as well as any secondary or synthesis papers that added value to the dataset’s archival. Submitters can simply add a BibTeX-style citation, such as one copied over from Google Scholar, or they can fill out all of the relevant attributes for the data release. It’s all the same to us. Much of this guidance came from the Wikipedia page for BibTeX.

Associated Publications
attribute name | definition | data type | format, unit or codes
study_id | Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. | character |
bibtex_citation | Submitters can associate multiple BibTeX style citations with the dataset. They can also include the same information by filling out the following attributes in tabular form if more convenient than BibTeX formatting. | character |
publication_type | Code indicating the type of publication the study originates from. | factor | article = Journal article. book = Book. mastersthesis = Master’s Thesis. misc = Miscellaneous publications such as online datasets. phdthesis = PhD thesis or dissertation. techreport = Technical report. unpublished = Unpublished source.
author | The names of the authors separated by “and”. | character |
year | The year of publication or, if unpublished, the year of creation. | Date | YYYY
title | The title of the work. | character |
journal | The journal or magazine the work was published in. | character |
volume | The volume of a journal or multi-volume book. | character |
number | The “(issue) number” of a journal, magazine, or tech-report, if applicable. (Most publications have a “volume”, but no “number” field.) | character |
pages | Page numbers, separated either by commas or double-hyphens. | character |
url | Permanent web address where the work can be located. | character |
doi | Digital object identifier associated with the work. | character |
address | Publisher’s address (usually just the city, but can be the full address for lesser-known publishers). | character |
annote | An annotation for annotated bibliography styles (not typical). | character |
booktitle | The title of the book, if only part of it is being cited. | character |
chapter | The chapter number. | character |
crossref | The key of the cross-referenced entry. | character |
edition | The edition of a book, long form (such as “First” or “Second”). | character |
editor | The name(s) of the editor(s). | character |
howpublished | How it was published, if the publishing method is nonstandard. | character |
institution | The institution that was involved in the publishing, but not necessarily the publisher. | character |
key | A hidden field used for specifying or overriding the alphabetical order of entries (when the “author” and “editor” fields are missing). Note that this is very different from the key (mentioned just after this list) that is used to cite or cross-reference the entry. | character |
month | The month of publication (or, if unpublished, the month of creation). | Date | MM
note | Miscellaneous extra information. | character |
organization | The conference sponsor. | character |
publisher | The publisher’s name. | character |
school | The school where the thesis was written. | character |
series | The series of books the book was published in (e.g. “The Hardy Boys” or “Lecture Notes in Computer Science”). | character |
type | The field overriding the default type of publication (e.g. “Research Note” for techreport, “{PhD} dissertation” for phdthesis, “Section” for inbook/incollection). | character |
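Since the tabular attributes mirror BibTeX fields, one form can be converted into the other. The Python sketch below assembles a BibTeX-style entry from a row of attributes, using the Wilson et al. (2017) reference cited on this page as the example. The function name and the citation key are hypothetical, and the sketch does no escaping of special characters.

```python
def to_bibtex(key, publication_type, fields):
    """Format tabular publication attributes as a BibTeX-style entry."""
    lines = ["@%s{%s," % (publication_type, key)]
    for name, value in fields.items():
        if value in (None, "", "NA"):
            continue  # skip attributes the submitter left empty
        lines.append("  %s = {%s}," % (name, value))
    lines.append("}")
    return "\n".join(lines)


entry = to_bibtex(
    "Wilson_et_al_2017",  # hypothetical citation key
    "article",
    {
        "author": "Wilson, G and Bryan, J and Cranston, K and Kitzes, J "
                  "and Nederbragt, L and Teal, TK",
        "year": "2017",
        "title": "Good enough practices in scientific computing",
        "journal": "PLoS Comput Biol",
        "volume": "13",
        "number": "6",
        "pages": "NA",
        "doi": "10.1371/journal.pcbi.1005510",
    },
)
```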

Materials and Methods

For each study please fill out key data regarding materials and methods that are important to the soil carbon stocks meta-analysis. Some users may want to include or exclude certain methodologies, or see your commentary on the methods. Let’s make it easy for them.

Materials and Methods
attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. character  
coring_method Code indicating what type of device was used to collect soil depth profiles. factor gouge auger = A half cylinder coring device in which the coring section is open, not sealed off by a fin. hargas corer = A large diameter (>10 cm) coring device consisting of a tube, piston, and a cutting head. mcauley corer = A half cylinder coring device with the coring section sealed off by a fin attached to a rotating pivot point. mccaffrey peat cutter = U-shaped blade that extracts a core by cutting down through peat. none specified = No coring device was specified. other shallow corer = Any other type of coring device typically taking cores shallower than 30 centimeters. piston corer = A device that extrudes core into tube upward with a plunger. push core = Any number of coring types involving driving a tube into the sediment to recover a core. pvc and hammer = PVC pipe was driven into the sediment with a hammer to recover a core. russian corer = A half cylinder coring device with the coring section sealed off by a fin attached to a rotating pivot point. vibracore = A technique involving collecting a core by sinking a continuous pipe into sediment attaching a source of vibration, then recovering using a winch and pulley. surface sample = A technique involving collecting a core shallower than ~5 cm using a circular metal cutter.
roots_flag Code indicating whether live roots were included or excluded from carbon assessments. factor roots and rhizomes included = Roots and rhizomes were included in dry bulk density and or organic matter and carbon measurements. roots and rhizomes separated = Roots and rhizomes were separated from soil before dry bulk density and or organic matter and carbon measurements.
sediment_sieved_flag Code indicating whether or not sediment was sieved prior to carbon measurements. factor sediment sieved = Sediment was sieved prior to analysis for organics. sediment not sieved = Sediment was not sieved prior to analysis for organics.
sediment_sieve_size If sediment was sieved, the size of sieve used. numeric millimeters
compaction_flag Code indicating how the authors qualified or quantified compaction of the core. factor compaction qualified = Compaction was at least qualified and noted by the authors. compaction quantified = Compaction was quantified and corrected for in core based measurements. corer limits compaction = Authors specified that the coring device’s design minimized compaction. no obvious compaction = Authors observed no obvious compaction. not specified = Compaction was not specified.
dry_bulk_density_temperature Temperature at which samples were dried to measure dry bulk density. This can include either samples that were freeze dried or oven dried. numeric celsius
dry_bulk_density_time Time over which samples were dried to measure dry bulk density. numeric hour
dry_bulk_density_sample_volume Sample volume used for bulk density measurements, if held constant. numeric cubicCentimeters
dry_bulk_density_sample_mass Sample mass used for bulk density measurements, if held constant. numeric grams
dry_bulk_density_flag Any notable codes regarding how the authors quantified dry bulk density. factor air dried to constant mass = Methodology specified that samples were air dried to a constant mass. modeled = Bulk density was not measured, but was modeled from loss on ignition and assumptions about the particle densities of organic and inorganic matter. freeze dried = Bulk density was measured on freeze dried samples. not specified = No additional details regarding bulk density methodology were provided. removed non structural water = Bulk density methodology did not specify drying temperature or length, only that non-strucural water was removed. time approximate = Bulk density time recorded herin is an approximate estimate. to constant mass = Bulk density methodology did not specify drying temperature or length, only that samples were dried to a constant mass.
loss_on_ignition_temperature Temperature at which samples were combusted to estimate fraction organic matter. numeric celsius
loss_on_ignition_time Time over which samples were combusted to estimate fraction organic matter. numeric hour
loss_on_ignition_sample_volume Sample volume used for loss on ignition, if held constant. numeric cubicCentimeters
loss_on_ignition_sample_mass Sample mass used for loss on ignition, if held constant. numeric grams
loss_on_ignition_flag Common codes regarding loss on ignition methodology. factor time approximate = Loss on ignition time recorded herein is an approximate estimate. not specified = No additional details regarding loss on ignition methodology or time were provided.
carbon_measured_or_modeled Code indicating whether fraction carbon was measured or estimated as a function of organic matter. factor measured = Fraction carbon was measured as opposed to modeled. modeled = Fraction carbon was modeled as opposed to measured.
carbonates_removed Whether or not carbonates were removed prior to calculating fraction organic carbon. factor FALSE = Carbonates were not removed before measuring organic carbon. TRUE = Carbonates were removed before measuring organic carbon.
carbonate_removal_method The method used to remove carbonates prior to measuring fraction carbon. factor direct acid treatment = Carbonates were removed using direct application of dilute acid. acid fumigation = Carbonates were removed by fumigating with concentrated acid. low carbonate soil = Organic carbon fraction was measured without removing carbonates assuming carbonate content of the soil type was minimal. carbonates not removed = Carbonates were not removed and low carbonate soil was not specified. none specified = Carbonate removal methodology was not specified.
fraction_carbon_method Code indicating the method for which fraction carbon was measured or modeled (Note: regression based models are permitted, but the use of the Bemmelen factor [0.58 gOC gOM-1] is discouraged). factor Craft regression = Used regression model from Craft et al., 1991, Estuaries, to predict fraction carbon as a function of fraction organic matter. EA = Each sample presented was measured using Elemental Analysis. Fourqurean regression = Used regression model from Fourqurean et al., 2012, Nature Geoscience, to predict fraction carbon as a function of fraction organic matter. Holmquist regression = Used regression model from Holmquist et al., 2018, Scientific Reports, to predict fraction carbon as a function of fraction organic matter. kjeldahl digestion = Each sample was measured kjeldahl digestion method. local regression = A regression model fit using a subset of measurements was used to predict fraction carbon as a function of fraction organic matter. not specified = No additional details were provided regarding fraction carbon methodologies. wet oxidation = Each sample was measured using a wet oxidation method.
fraction_carbon_type Code indicating whether fraction_carbon refers to organic or total carbon. factor organic carbon = Author specified that fraction carbon measurements were of organic carbon. total carbon = Author specified that fraction carbon measurements were of total carbon.
carbon_profile_notes Any other submitter defined notes describing methodologies for determining dry bulk density, organic matter fraction, and carbon fraction. character  
cs137_counting_method Code indicating the method used for determining radiocesium activity. factor alpha = Alpha counting method used. gamma = Gamma counting method used.
pb210_counting_method Code indicating the method used for determining lead 210 activity. factor alpha = Alpha counting method used. gamma = Gamma counting method used.
excess_pb210_rate Code indicating the mass or accretion rate used in the excess_pb210_model. factor mass accumulation = Excess 210Pb modeled using mass accumulation rate. accretion = Excess 210Pb modeled using vertical accretion rate.
excess_pb210_model Code indicating the model used to estimate excess lead 210. factor CRS = Constant rate of supply model used. CIC = Constant initial concentration model used. CFCS = Constant flux constant sedimentation model used.
ra226_assumption Code indicating the assumption used to estimate the core’s background 226Ra levels. factor each sample = 226Ra was measured for each sample. total core = 226Ra was measured for the total core, at asymptote = asy
c14_counting_method Code indicating the method used for determining radiocarbon activity. factor AMS = Accelerator mass spectrometry used. beta = Beta counting used.
dating_notes Any submitter defined notes elaborating on the process of dating the core not yet made clear by the coding. character  
age_depth_model_reference Code indicating the reference or 0 year of the age depth model. factor YBP = Year zero is defined as years before present, 1950 CE. CE = Year zero is set according to Common Era and Before Common Era standards. core collection date = Year zero is set as the core’s collection year.
age_depth_model_notes Any submitter defined notes on how the age depth model was created. character  

Site Level

Site information is not required, but could provide important context for your study. You should describe the site and how it fits into your broader study, provide geographic information (although this can be generated automatically from the cores as well), and add any relevant tags and notes regarding site vegetation and inundation. Vegetation and inundation can alternatively be incorporated into the core-level data, whatever makes the most sense for your study design.

Site Information
attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. character  
site_id Site identification code unique to each study. character  
site_description Site description including relevant study details and political geographic units. Some of these descriptions can be automated by the ingestion code. character  
site_latitude_max Maximum latitude defining a bounding box for the site in decimal degree World Geodetic System of 1984 (WGS84). This can also be generated automatically by the ingestion code. numeric degree
site_latitude_min Minimum latitude defining a bounding box for the site in decimal degree WGS84. This can also be generated automatically by the ingestion code. numeric degree
site_longitude_max Maximum longitude defining a bounding box for the site in decimal degree WGS84. This can also be generated automatically by the ingestion code. numeric degree
site_longitude_min Minimum longitude defining a bounding box for the site in decimal degree WGS84. This can also be generated automatically by the ingestion code. numeric degree
site_boundaries As an alternative to submitting or automatically generating a bounding box, submitters can include a shapefile (.shp) or keyhole markup language (.kml) file documenting the geographic boundaries of the site. This can be converted to and stored in well known text (WKT) format. character
salinity_class Code based on submitter field observation or measurement indicating average annual salinity (Note: Palustrine and freshwater should only include tidal wetlands, or wetlands that are potentially/formerly tidal but artificially freshened due to artificial tidal restrictions). factor estuarine C-CAP = 5-35 parts per thousand salinity (ppt) according to the coastal change analysis program. palustrine C-CAP = < 5 ppt according to the coastal change analysis program. estuarine = 0.5-35 ppt according to most other definitions. palustrine = < 0.5 ppt according to most other definitions. brine = >50 ppt. saline = 30-50 ppt. brackish = 0.5-30 ppt. fresh = <0.5 ppt. mixoeuhaline = 30-40 ppt. polyhaline = 18-30 ppt. mesohaline = 5-18 ppt. oligohaline = 0.5-5 ppt.
salinity_method Indicate whether salinity_class was determined using a field observation or a measurement. factor field observation = Salinity inferred by field observation such as vegetation. measurement = Salinity observed from local instrument.
salinity_notes Any relevant submitter generated notes on how salinity_class was determined. character  
vegetation_class Code based on submitter field observations or measurement indicating dominant wetland vegetation type. factor emergent = Describes wetlands dominated by persistent emergent vascular plants. scrub shrub = Describes wetlands dominated by woody vegetation <= 5 meters in height. forested = Describes wetlands dominated by woody vegetation > 5 meters in height. FO/SS = Dominated by forested to scrub/shrub biomass. seagrass = Describes tidal or subtidal communities dominated by rooted vascular plants.
vegetation_method Indicate whether vegetation_class was determined using a field observation or a measurement. factor field observation = Vegetation inferred by field observation. measurement = Vegetation measured by counts or plots.
vegetation_notes Any relevant submitter generated notes on how vegetation_class was determined. character
inundation_class Code based on submitter field observation or measurement indicating how often the coring location is inundated. factor high = Study-specific definition of an elevation relatively high in the tidal frame, typically defined by vegetation type. mid = Study-specific definition of an elevation in the relative middle of the tidal frame, typically defined by vegetation type. low = Study-specific definition of an elevation relatively low in the tidal frame, typically defined by vegetation type. levee = Study-specific definition of a relatively high elevation zone built up on the edge of a river, creek, or channel. back = Study-specific definition of a relatively low elevation zone behind a levee.
inundation_method Indicate whether inundation_class was determined using a field observation or a measurement. factor field observation = Inundation inferred by field observation such as vegetation. measurement = Inundation class assessed from elevation and nearby tide gauge or other similar method.
inundation_notes Any relevant submitter generated notes on how inundation was determined. character
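
The table above notes that the site bounding box can be generated automatically from the cores. The following is a minimal sketch of that idea, not the CCRCN ingestion code itself: the function name and the example coordinates are hypothetical, and it assumes core positions are supplied as (latitude, longitude) pairs in decimal degree WGS84 and that the site does not cross the antimeridian.

```python
from typing import Dict, Iterable, Tuple


def site_bounding_box(core_coords: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Derive the four site bounding-box attributes from core positions.

    core_coords holds (core_latitude, core_longitude) pairs in decimal
    degree WGS84. Hypothetical helper; sites spanning the antimeridian
    are not handled.
    """
    coords = list(core_coords)
    if not coords:
        raise ValueError("at least one core coordinate is required")
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return {
        "site_latitude_max": max(lats),
        "site_latitude_min": min(lats),
        "site_longitude_max": max(lons),
        "site_longitude_min": min(lons),
    }


# Made-up core positions for illustration only.
cores = [(38.8750, -76.5460), (38.8762, -76.5451), (38.8741, -76.5473)]
bbox = site_bounding_box(cores)
```

The returned keys match the site-level attribute names, so the result can be written directly into a site-level row.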

Core Level

Note that positional data can be assigned at the core level, or at the site level. However, it is important that this is specified, that site coordinates are not attributed as core coordinates, and that the method of measurement and precision is noted. Vegetation and inundation can alternatively be incorporated into the site-level data, whatever makes the most sense to your study design. In the future this level of hierarchy will be reassessed as the ‘subsite level’ as this level of hierarchy can handle any sublocation information such as vegetation plot, and instrument location and description.

Core- (or Subsite-) Level Information
attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. factor NA
site_id Site identification code unique to each study. factor NA
core_id Core identification code unique to each site. factor NA
core_date Date of core collection. Date YYYY-MM-DD
core_notes Any other relevant submitter generated notes on how cores were collected. character  
core_latitude Positional latitude of the core in decimal degree WGS84. numeric degree
core_longitude Positional longitude of the core in decimal degree WGS84. numeric degree
core_position_accuracy Accuracy of latitude and longitude measurement, if determined and recorded. numeric meter
core_position_method Code indicating how latitude and longitude were determined. factor RTK = Real-time kinematic global positioning system (GPS). handheld = Conventional commercially available hand-held GPS. other high resolution = Any other technique resulting in positional error < 1 meter. other moderate resolution = Any other technique resulting in positional error < 30 meters. other low resolution = Any other technique resulting in positional error > 30 meters.
core_position_notes Any relevant submitter generated notes on how latitude and longitude were determined. character  
core_elevation Surface elevation of the core relative to defined datum. numeric meters
core_elevation_datum The datum against which the core elevation was measured (For a complete list of datum names and aliases please refer to the ISO Geodetic Registry https://iso.registry.bespire.eu/register/geodetic/VerticalDatum). factor NAVD88 = A gravity-based geodetic datum, North American Vertical Datum of 1988. MSL = A tidal datum, Mean Sea Level as measured against a local tide gauge. MTL = A tidal datum, Mean Tidal Level as measured against a local tide gauge. MHW = A tidal datum, Mean High Water as measured against a local tide gauge. MHHW = A tidal datum, Mean Higher High Water as measured against a local tide gauge. MHHWS = A tidal datum, Mean Higher High Water for Spring Tides as measured against a local tide gauge. MLW = A tidal datum, Mean Low Water as measured against a local tide gauge. MLLW = A tidal datum, Mean Lower Low Water as measured against a local tide gauge.
core_elevation_accuracy Accuracy of elevation measurement, if determined and recorded. numeric meters
core_elevation_method Code indicating how elevation was determined. factor RTK = Real-time kinematic GPS. other high resolution = Any other technique resulting in < 5 cm of random elevation error. LiDAR = Handheld GPS matched to lidar-based digital elevation model. DEM = Handheld GPS matched to another digital elevation model. other low resolution = Any other technique resulting in > 5 cm of random elevation error.
core_elevation_notes Any relevant submitter generated notes on how elevation was determined. character
salinity_class Code based on submitter field observation or measurement indicating average annual salinity (Note: Palustrine and freshwater should only include tidal wetlands, or wetlands that are potentially/formerly tidal but artificially freshened due to artificial tidal restrictions). factor estuarine C-CAP = 5-35 parts per thousand salinity (ppt) according to the coastal change analysis program. palustrine C-CAP = < 5 ppt according to the coastal change analysis program. estuarine = 0.5-35 ppt according to most other definitions. palustrine = < 0.5 ppt according to most other definitions. brine = >50 ppt. saline = 30-50 ppt. brackish = 0.5-30 ppt. fresh = <0.5 ppt. mixoeuhaline = 30-40 ppt. polyhaline = 18-30 ppt. mesohaline = 5-18 ppt. oligohaline = 0.5-5 ppt.
salinity_method Indicate whether salinity_class was determined using a field observation or a measurement. factor field observation = Salinity inferred by field observation such as vegetation. measurement = Salinity observed from local instrument.
salinity_notes Any relevant submitter generated notes on how salinity_class was determined. character
vegetation_class Code based on submitter field observations or measurement indicating dominant wetland vegetation type. factor emergent = Describes wetlands dominated by persistent emergent vascular plants. scrub shrub = Describes wetlands dominated by woody vegetation < 5 meters in height. forested = Describes wetlands dominated by woody vegetation > 5 meters in height. seagrass = Describes tidal or subtidal communities dominated by rooted vascular plants.
vegetation_method Indicate whether vegetation_class was determined using a field observation or a measurement. factor field observation = Vegetation inferred by field observation. measurement = Vegetation measured by counts or plots.
vegetation_notes Any relevant submitter generated notes on how vegetation_class and dominant_species were determined. character
inundation_class Code based on submitter field observation or measurement indicating how often the coring location is inundated. factor high = Study-specific definition of an elevation relatively high in the tidal frame, typically defined by vegetation type. mid = Study-specific definition of an elevation in the relative middle of the tidal frame, typically defined by vegetation type. low = Study-specific definition of an elevation relatively low in the tidal frame, typically defined by vegetation type. levee = Study-specific definition of a relatively high elevation zone built up on the edge of a river, creek, or channel. back = Study-specific definition of a relatively low elevation zone behind a levee.
inundation_method Indicate whether inundation_class was determined using a field observation or a measurement. factor field observation = Inundation inferred by field observation such as vegetation. measurement = Inundation class assessed from elevation and nearby tide gauge or other similar method.
inundation_notes Any relevant submitter generated notes on how inundation was determined. character
core_length_flag Indicates whether or not the coring team believes they recovered a full sediment profile, down to bedrock, or other non-marsh interface. factor core depth limited by length of corer = The total depth of the core was limited by the length of the coring device. core depth represents deposit depth = Authors report that the depth of the core represents the depth of the wetland soil deposit. not specified = Authors did not specify whether or not the depth of the core represents the depth of the wetland soil deposit.
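
When salinity_method is a measurement, the salinity_class codes listed in the table above can be assigned from an average annual salinity in ppt. The sketch below is one possible mapping, not a CCRCN tool: the function name is hypothetical, it uses only the finer-grained codes (fresh, oligohaline, mesohaline, polyhaline, mixoeuhaline, saline, brine), and it assumes that a value falling exactly on a shared boundary belongs to the higher class, except at 40 and 50 ppt, which stay in the lower class so that brine remains strictly > 50 ppt.

```python
def salinity_class_from_ppt(ppt: float) -> str:
    """Map a measured average annual salinity (ppt) to a salinity_class code.

    Hypothetical helper. Boundary handling is an assumption, because the
    published ranges share their end points.
    """
    if ppt < 0:
        raise ValueError("salinity cannot be negative")
    if ppt < 0.5:
        return "fresh"
    if ppt < 5:
        return "oligohaline"
    if ppt < 18:
        return "mesohaline"
    if ppt < 30:
        return "polyhaline"
    if ppt <= 40:
        return "mixoeuhaline"
    if ppt <= 50:
        # 40-50 ppt is covered only by the broader 'saline' code (30-50 ppt).
        return "saline"
    return "brine"


example_class = salinity_class_from_ppt(12.0)
```

The C-CAP and generic estuarine/palustrine codes overlap these ranges, so a study that follows one of those schemes would need its own mapping.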

Soil Depth Series

This level of hierarchy contains the actual depth series information. At minimum a submission needs to specify minimum and maximum depth increments, dry bulk density, and either fraction organic matter or fraction carbon. Sample IDs should be used when there are multiple replicates of a measurement. There is plenty of room for recording raw data from various dating techniques as well as age depth models.
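
The minimum requirements described above can be checked before submission. The following is a minimal sketch, assuming each depth interval is held as a dictionary keyed by the attribute names in the table below; the function names are hypothetical, and the check that depth_min is smaller than depth_max is an added assumption rather than a stated CCRCN rule.

```python
import math
from typing import Any, List, Mapping


def _present(row: Mapping[str, Any], key: str) -> bool:
    """True if the key holds a value that is not None, NaN, or empty."""
    value = row.get(key)
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return value != ""


def depth_series_problems(row: Mapping[str, Any]) -> List[str]:
    """List the minimum-requirement problems found in one depth interval."""
    problems = []
    for key in ("depth_min", "depth_max", "dry_bulk_density"):
        if not _present(row, key):
            problems.append(f"missing {key}")
    if not (_present(row, "fraction_organic_matter") or _present(row, "fraction_carbon")):
        problems.append("missing fraction_organic_matter or fraction_carbon")
    # Assumption: an interval must have positive thickness.
    if _present(row, "depth_min") and _present(row, "depth_max"):
        if float(row["depth_min"]) >= float(row["depth_max"]):
            problems.append("depth_min must be less than depth_max")
    return problems


# Made-up rows for illustration only.
good_row = {"depth_min": 0, "depth_max": 2, "dry_bulk_density": 0.31,
            "fraction_organic_matter": 0.42}
bad_row = {"depth_min": 2, "depth_max": 4, "dry_bulk_density": 0.35}
```

An empty list means the interval meets the stated minimum; anything else names what is missing.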

Soil Depth Series Information
attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. character  
site_id Site identification code unique to each study. character
core_id Core identification code unique to each site. character
depth_min Minimum depth of a sampling increment. numeric centimeter
depth_max Maximum depth of a sampling increment. numeric centimeter
sample_id Sample identification unique to the core. This should be used in the case that there are relevant lab specific sample codes, or in the case that there are multiple replicate samples. character  
dry_bulk_density Dry mass per unit volume of a soil sample. This does not include ash free bulk density. numeric gramsPerCubicCentimeter
fraction_organic_matter Mass of organic matter relative to sample dry mass. Ash free bulk density should not be used here but should be expressed as a loss on ignition fraction. numeric dimensionless
fraction_carbon Mass of carbon relative to sample dry mass. numeric dimensionless
compaction_fraction Fraction of the sample depth interval reduced due to compaction. numeric dimensionless
compaction_notes Any submitter generated notes on compaction. character  
cs137_activity Radioactivity counts per unit dry weight for radiocesium (137Cs). numeric becquerelPerKilogram
cs137_activity_sd 1 standard deviation of uncertainty associated with cs137_activity. numeric becquerelPerKilogram
total_pb210_activity Total radioactivity counts per unit dry weight for total lead 210 (210Pb). numeric becquerelPerKilogram
total_pb210_activity_sd 1 standard deviation of uncertainty associated with total_pb210_activity. numeric becquerelPerKilogram
ra226_activity Total radioactivity counts per unit dry weight for Radium 226 (226Ra) if measured as part of the 210Pb dating process. numeric becquerelPerKilogram
ra226_activity_sd 1 standard deviation of uncertainty associated with ra226_activity. numeric becquerelPerKilogram
excess_pb210_activity Radioactivity counts per unit dry weight for excess lead 210 (210Pb). numeric becquerelPerKilogram
excess_pb210_activity_sd 1 standard deviation of uncertainty associated with excess_pb210_activity. numeric becquerelPerKilogram
c14_age Radiocarbon age as estimated from AMS measurements. numeric radiocarbonYear
c14_age_sd Estimated uncertainty in c14_age. numeric radiocarbonYear
c14_material Description of the material selected for radiocarbon (14C) dating. character  
c14_notes Any relevant submitter generated notes on 14C dating process. character  
delta_c13 The isotopic signature of 13C. This is oftentimes measured along with c14_age and can be useful for analyzing carbon lability and provenance. numeric partsPerMillion
be7_activity Radioactivity counts per unit dry weight for 7Be. numeric becquerelPerKilogram
be7_activity_sd Estimated uncertainty in be7_activity. numeric becquerelPerKilogram
am241_activity Radioactivity counts per unit dry weight for 241Am. numeric becquerelPerKilogram
am241_activity_sd Estimated uncertainty in am241_activity. numeric becquerelPerKilogram
marker_date The age of any other dated depth horizon such as an artificial marker, pollen horizon, pollution horizon, etc. Date YYYY-MM-DD
marker_type Code indicating the type of marker. factor artificial horizon = Horizon was added to the surface artificially by using materials such as feldspar, glitter, or rare earth elements. pollen = Pollen analysis was used to tie horizon to the timing of vegetation change such as the arrival of invasives, or the beginning of local agriculture. pollution = Chemical analysis was used to tie the horizon to the timing of a pollution event. tsunami = Sediment analysis was used to tie the horizon to the timing of a tsunami event.
marker_notes Any other submitter generated notes about the origin of the marker. character  
age Most likely, median, or mean age of the depth interval from submitter generated age depth model. numeric year
age_min Minimum age of the depth interval from submitter generated age depth model. numeric year
age_max Maximum age of the depth interval from submitter generated age depth model. numeric year
age_sd Standard deviation of age estimate from submitter generated age depth model. numeric year
depth_interval_notes Any other submitter generated notes specific to the depth interval. character  

Multiple Special Conditions at the Level of the Site or Core

Because there may be multiple observations or conditions that are part of the study, such as species present, or degradation or restoration activities, that can affect a site or core, these are archived separately.

Dominant Species Present

You can record species codes associated with sites and/or cores. The CCRCN species code system is derived from the USDA PLANTS Database, and for most taxa, the code consists of the first two letters of the genus followed by the first two letters of the species (e.g., "Spartina alterniflora" = "SPAL").

Species Present at Site or Subsite
attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. character  
site_id Site identification code unique to each study. character  
core_id Core identification code unique to each site. character  
species_code Code associated with a species or a vegetation assemblage. character  
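
The default species code rule described above (first two letters of the genus followed by the first two letters of the species) can be sketched as follows. The function name is hypothetical, and the sketch covers only the general case: it does not handle the exceptions implied by "for most taxa", so generated codes should be checked against the USDA PLANTS Database.

```python
def default_species_code(scientific_name: str) -> str:
    """Build a four-letter code from a 'Genus species' scientific name.

    Hypothetical helper implementing only the general two-plus-two rule.
    """
    parts = scientific_name.split()
    if len(parts) < 2:
        raise ValueError("expected a name of the form 'Genus species'")
    genus, species = parts[0], parts[1]
    if len(genus) < 2 or len(species) < 2:
        raise ValueError("genus and species must each have at least two letters")
    return (genus[:2] + species[:2]).upper()


code = default_species_code("Spartina alterniflora")  # "SPAL", as in the text
```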

Anthropogenic Impacts Present

You can record various codes associated with degradation or restoration conditions at sites and/or cores.

Anthropogenic Impacts at Site or Subsite
attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. character  
site_id Site identification code unique to each study. character  
core_id Core identification code unique to each site. character  
impact_class Code indicating any major anthropogenic impacts historically and currently affecting the coring location. factor tidally restricted = Tidal flow is muted or blocked by built structures. impounded = Water level is raised artificially by a tidal restriction, resulting in ponding of water on the wetland and or upland surface. managed impounded = Wetland is impounded seasonally, and other times natural or semi natural hydrology occurs. ditched = Tidal hydrology is altered because artificial ditches have been cut to promote tidal flooding and drainage. diked and drained = The wetland has been diked and drained, with or without flapper gates, pumping, or other means. farmed = Managed impoundment or drainage in which wetland has been converted to agricultural land. tidally restored = Tidal flow has been restored by removing an artificial obstruction. revegetated = Wetland vegetation has been reintroduced by replanting on unvegetated surfaces. invasive plants removed = Natural plant communities have been restored by the active removal of invasive plant species. invasive herbivores removed = Tidal wetland vegetation has been managed by the removal of invasive herbivores. sediment added = Elevation has been managed by artificially adding sediment to the site using techniques such as thin layering or sediment diversion. wetlands built = Constructed wetland using sediments such as dredge spoils or other sediment source.

Submitter Defined Attributes and Definitions

Part of the reason we control these attribute and variable names is so that the dataset does not become unmanageable, and so that we can deliver products that run cleanly and smoothly to you. However, we know that research is complicated, and not all of the data you want to include can be represented here. As long as it fits within this hierarchy, we allow you to submit user defined attributes.

Study Level Species Table

If species codes or common names are used anywhere in the study, there should be a separate table included defining all names using scientific names. The CCRCN species code system is derived from the USDA PLANTS Database, and for most taxa, the code consists of the first two letters of the genus followed by the first two letters of the species (e.g., "Spartina alterniflora" = "SPAL").

attribute name definition data type format, unit or codes
study_id Unique identifier for the study made up of the first author’s family name, as well as the second author’s family name or ‘et al.’ if more than three, then publication year separated by underscores. character  
species_code Code associated with a species or a vegetation assemblage. character  
genus Genus according to the most up to date classification. character  
species Species according to the most up to date classification. character  
sub_species Any nomenclature referring to subspecies special cases. character  
hybrid Any nomenclature referring to special cases of hybridization. character  
common_name Common name associated with the species, especially if it is referred to in any accompanying text. character  
species_notes Any other submitter defined notes regarding the species. character  

Other Attributes and Variables

Any submitter-defined attributes should be included in a separate table indicating the associated level of hierarchy, attribute name, and data type (date, factor, character, or numeric). Attribute names should follow good naming practices: self-descriptive, don’t start with a number or special character, no spaces. Dates should be stored as a character string and should have an accompanying ‘string format’ indicating the position, number of digits, and delimiters for the date time. For example, June twenty-sixth two-thousand eighteen written as 2018-06-26 would be formatted as ‘YYYY-MM-DD’. Here is a handy dateTime reference. Numeric values should have their units defined. Factors (i.e. categorical variables) should be defined in a separate table.

level of hierarchy attribute name description data Type format, unit
ex. site level or core level (your column name here. [use good naming conventions]) (describe your attribute here.) Date, factor, character, or numeric (extra necessary info here)
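
The naming and date conventions above can be checked programmatically. The sketch below is illustrative only: the function names are hypothetical, the name check assumes that only letters, digits, and underscores are acceptable and that a name must start with a letter (a stricter reading of "good naming practices"), and the date helper understands only the YYYY, MM, and DD tokens.

```python
import re
from datetime import date, datetime

# Assumption: letters, digits, and underscores only, starting with a letter.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Only these three 'string format' tokens are supported in this sketch.
_TOKENS = {"YYYY": "%Y", "MM": "%m", "DD": "%d"}


def is_valid_attribute_name(name: str) -> bool:
    """True if the name has no spaces and no leading digit or special character."""
    return _NAME_PATTERN.fullmatch(name) is not None


def parse_submitted_date(value: str, string_format: str) -> date:
    """Parse a date stored as a character string using its 'string format'."""
    strptime_format = string_format
    for token, directive in _TOKENS.items():
        strptime_format = strptime_format.replace(token, directive)
    return datetime.strptime(value, strptime_format).date()


parsed = parse_submitted_date("2018-06-26", "YYYY-MM-DD")
```

Here `parsed` is the date June 26, 2018, matching the example in the text; a string that does not fit its declared format raises a ValueError.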

Variable names, like attribute names, should be self-descriptive, such as ‘experimental’ or ‘control’ as opposed to ‘1’ and ‘2’.

level of hierarchy attribute name categorical variable name description
ex. site level or core level (parent column name here) (your variable name here.) (describe your variable)

That’s It

You now know everything there is to know about soil carbon data management.

Outreach and Training

One of the primary goals of the CCRCN is to provide training tools and tutorials for coastal carbon researchers. Here is a listing of the training products developed by the Network to date.


Join the Network

Interested in contributing data to the CCRCN? If so, please email CoastalCarbon@si.edu and CCRCN personnel will assist you in the process. We are working on building a web portal to automate much of this exchange, but until then this will remain a very friendly peer to peer handoff system. Data submissions can remain embargoed for a time specified by the submitter. In embargo cases, a data release will be prepared and shared with the submitter via a private Dropbox link. When the embargo period ends, the data release is made public and the dataset is drawn into synthesis products.

Frequent Updates

Monthly Updates

Subscribe to the RCN mailing list
