PCM Data Object Design (DOD) Interface

The Process Configuration Manager (PCM) Data Object Design (DOD) interface is the set of forms through which users define the data products produced by an ARM data product algorithm (i.e., a VAP or ingest process). The descriptions of the output products are referred to as Data Object Designs, so the GUI is referred to as the PCM DOD GUI. Anyone can view DODs.

To access the PCM DOD GUI at https://pcm.arm.gov/pcm/, a user must provide their ARM LDAP account user name and password. An ARM LDAP account can be requested at https://adc.arm.gov/armuserreg/#/new.

Layout of the PCM DOD Website

The PCM DOD GUI has two main windows. The left panel contains the catalog of all datastreams and their associated DODs as found in the data system database (DSDB). The main window is the area in which DODs are viewed and edited.

_images/pcmdod_main_page.png

Datastreams can be filtered by regular expressions. In addition, filters to view a subset of datastreams based on type, characteristics, and permissions are provided, as are Advanced Search capabilities. The screenshot below shows a filter for all datastreams that have the substring ‘met’ in their datastream name, apply to one or more ARM locations as defined by their site/facility designation, and include a variable with ‘temp’ as a substring.

_images/pcmdod_main_page_filtering.png
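The effect of a regular expression filter on datastream names can be sketched in Python. This is only an illustration: the datastream list and the function name are hypothetical, and the regular expression dialect used by the PCM itself may differ from Python's.

```python
import re


def filter_datastreams(names, pattern):
    """Return the names that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical catalog entries, written as <datastream name>.<data level>.
catalog = ["met.b1", "metwxt.b1", "aosacsm.b1", "mwrret.c1"]

# A plain substring such as 'met' matches anywhere in the name.
substring_matches = filter_datastreams(catalog, "met")

# Anchoring the pattern narrows the match to names that are exactly 'met'.
anchored_matches = filter_datastreams(catalog, r"^met\.")
```

An unanchored pattern behaves like a substring search, which is why ‘met’ also selects metwxt.b1 in this sketch.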

The datastream icons and color scheme are explained under the Help button located at the top left of the form in the blue banner.

_images/pcmdod_main_page_help.png

Define Content of Output Datastream DODs

Because the PCM is intended to expedite the transfer of a scientific algorithm into the ARM production processing system, it is expected that the desired output product is well defined. In addition, for an ADI VAP to run, the DODs of all of its output datastreams need to be loaded into the DSDB. Therefore it is recommended that a new process be initiated by first defining its output product in the PCM DOD GUI.

A new DOD can be created either by starting from a blank template or by importing an existing netCDF file. If importing a DOD from a file, the file must follow ARM data file naming standards, and the site and facility of the datastream must be loaded into the DSDB. Once loaded into the DSDB, the DOD(s) should be updated to conform to ARM DOD standards (add reference once it exists).

DODs associated with a datastream can be viewed by selecting the drop list icon to the left of the datastream. The contents of each DOD can be viewed in the catalog area by selecting its drop list arrow. Double clicking a DOD will load it into the main viewing/editing area. The screenshot below shows the metwxt.b1 datastream expanded, with its DOD v1.2 expanded and loaded into the main viewing area.

_images/pcmdod_main_page_example.png

The process of completely defining a datastream’s DOD includes creating the DOD, adding variables, adding/editing variable attributes, adding global attributes, and resetting variable attribute and global attribute values that will be populated by the ADI libraries or the VAP algorithm. All of these steps will be required for VAPs with DODs created through the use of a template. Whether a DOD loaded by importing an existing netCDF file needs editing, and how much, depends on the contents of the imported file and on whether the DOD meets ARM’s standards.

Creating a DOD

To create a DOD you can either start from a blank template or import one from a netCDF file. We discuss both techniques in the following steps.

Creating a New DOD from a Blank Template

To create a new datastream with an initial DOD from a blank template, select the DOD from the ‘New’ drop list icon in the middle of the main form’s blue banner. In the pop up form ‘Create New DOD’:

  • Enter the name and data level of the new datastream. If the datastream name does not conform to ARM standards, this will be noted by the text box border remaining red in color.

  • Enter a comment.

  • Select the ‘Create’ radio button at the bottom left of the pop up form.

_images/pcmdod_create_new_DOD_template.png

Creating a New DOD by Importing NetCDF File or NCDump of a NetCDF File

New datastream/DODs can be created by importing either a netCDF file or a dump of a netCDF file created via the ncdump netCDF command.

To import from a NCDump file ……

To import from a netCDF file on the development server, update the ‘DMF NetCDF Import’ pop up window with the full directory path and filename to import, and with the datastream name and data level. Below is an example of defining a new DOD by importing an existing file on the ARM development server. Note that the datastream name excludes the site and facility designation. DODs can be designated to apply to a particular ARM location (i.e., site/facility pairing), but they are not defined as such; rather, that constraint is applied after creation. Therefore omit the site and facility designations from the datastream name.

_images/pcmdod_import_DOD_DMF.png

Importing from a file on your local computer requires selecting the ‘DOD NetCDF’ option from the import drop box and selecting the NetCDF file to use to create the new DOD. The datastream name and data level will be autopopulated with the base datastream name (i.e., the name without a site and facility designation) and data level of the imported file. If that datastream name is already defined in the DSDB, the name and/or level will need to be modified to create a unique new entry in the database. In the example below the aosacsm.b1 datastream is already defined, so a new name of klgaosacsm is entered.

_images/pcmdod_import_DOD_local.png
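How a base datastream name and data level relate to a file name can be sketched in Python. The pattern below assumes a file name laid out as site, datastream class, facility, data level, date, time, and extension; the pattern, the function names, and the sample file name are illustrative assumptions and are not taken from the PCM.

```python
import re

# Assumed layout: <site><class><facility>.<level>.<yyyymmdd>.<hhmmss>.<ext>
# The class is lower case, and the facility starts with one upper-case letter.
ARM_FILE_PATTERN = re.compile(
    r"^(?P<site>[a-z]{3})"
    r"(?P<dsclass>[a-z0-9]+)"
    r"(?P<facility>[A-Z]\d+)"
    r"\.(?P<level>[a-z0-9]{2})"
    r"\.(?P<date>\d{8})"
    r"\.(?P<time>\d{6})"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def split_arm_file_name(file_name):
    """Split a file name into its named parts, or raise ValueError."""
    match = ARM_FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"not in the assumed ARM layout: {file_name!r}")
    return match.groupdict()


def base_datastream(file_name):
    """Return '<class>.<level>', i.e. the name without site and facility."""
    parts = split_arm_file_name(file_name)
    return f"{parts['dsclass']}.{parts['level']}"


# Hypothetical file name used only to exercise the sketch.
example = base_datastream("sgpmetwxtC1.b1.20200101.000000.nc")
```

In this sketch the site (sgp) and facility (C1) are dropped, leaving metwxt.b1 as the name that would be checked against the DSDB for uniqueness.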

The last import option is to import a file created via the Unidata ‘ncdump’ command, which, given a NetCDF file, creates an ASCII formatted version of the file. These files are formally referred to as Common Data Language (CDL) files, and informally as NCDump files. As with the import from a local NetCDF file, the provided datastream name and level must not already be defined in the database. To import from an NCDump file, select the ‘DOD CDL/NCDump’ option from the import drop box and select the NCDump file to use to create the new DOD.
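A CDL file of this kind can be produced from Python by calling the ncdump utility, as in the sketch below. The `-h` flag asks ncdump for header information only (dimensions, variables, and attributes, without data values); whether the PCM import expects the header-only form or a full dump is not stated here, so `header_only` is left as a parameter. The function names and file names are hypothetical, and ncdump must be installed and on the PATH for `write_cdl` to work.

```python
import subprocess
from pathlib import Path


def ncdump_command(netcdf_path, header_only=True):
    """Build the ncdump argument list without running it."""
    command = ["ncdump"]
    if header_only:
        command.append("-h")  # header only: no data values in the output
    command.append(str(netcdf_path))
    return command


def write_cdl(netcdf_path, cdl_path, header_only=True):
    """Run ncdump and save its text output to cdl_path."""
    result = subprocess.run(
        ncdump_command(netcdf_path, header_only),
        check=True,           # raise if ncdump reports an error
        capture_output=True,
        text=True,
    )
    Path(cdl_path).write_text(result.stdout)
    return Path(cdl_path)
```

For example, `write_cdl("input.nc", "input.cdl")` would write the header of a local file named input.nc to input.cdl, ready to be selected in the import form.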

Saving a DOD

DODs should be saved regularly to ensure no updates are lost. To save a DOD:

  • Select the floppy icon located below the DOD’s header tab (it is the fourth icon from the left). Hovering the mouse over each icon displays its functionality.

  • Select ‘Save’, ‘Save As New DOD’, or ‘Save As Review Copy’ as appropriate.

The save options behave as follows:

  • ‘Save’ is used to store recent changes for the DOD being edited. By default the save will be a ‘Revision’ save, meaning neither the DOD major nor minor version number is changed and the change is documented in an internally managed revision number. If the ‘Major’ or ‘Minor’ radio button is selected, the corresponding DOD version number is increased by 1.

_images/pcmdod_save_DOD.png

KLG-> Need to add a reference to a section of documentation that discusses DOD versions and how they are used and managed.

  • ‘Save As New DOD’ allows a user to save the DOD under a new datastream name by providing a pop up box into which the new datastream name and level are entered.

  • ‘Save As Review Copy’ creates a DOD designated as a ‘Review’ copy. Review copies are created by individuals responsible for reviewing the DOD to ensure it adheres to ARM’s standards. The newly created review copy is identical to the DOD, but the user can make changes to it without affecting the parent DOD’s revision or DOD level. This provides a mechanism to efficiently ‘fix’ problems with the DOD while allowing the developer to review the fixes and incorporate them into their ‘official’ DOD selectively.

KLG-> Need to add a reference to a section of documentation that discusses Review copies and how they are used and managed.

The screenshot below shows a sample of a review copy on the klgaosacsm.b1 datastream. It is located below the DOD used to create it (DOD v1.0) and is designated with an eye icon.

Note: More than one DOD version can be defined for any given datastream.

_images/pcmdod_DOD_review_copy.png

Dynamic DODs

Users do not have to define a DOD for an output datastream if they specify a retrieval and set a datastream in the process portion of PCM; a DOD will be created on the fly. The resulting DOD will not exist in the database, but it will exist in the output data file.

However, a datastream name must be included in the Process Output Datastream Classes in the PCM process definition. That datastream should not have a DOD associated with it.

In addition, in the PCM Retriever Editor, variable names must be provided in the Output(s) column. Thus, the user sets up the mapping to the output as if a DOD for the target datastream existed.

If a DOD exists in the database, the dynamic DOD option does not replace it. Current functionality is such that if a DOD exists and the user selects the dynamic DOD option, the existing DOD is supplemented with any additional information in the retriever that was not defined in the DOD. The DOD wins and nothing that already exists is overwritten.
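The ‘DOD wins’ rule can be pictured as a merge in which retriever entries only fill gaps. The Python sketch below uses plain dictionaries as stand-ins for the DOD and the retriever outputs; the function name and data layout are hypothetical, and merging at the level of individual attributes is an assumption, since the granularity used by ADI is not described here.

```python
def merge_dynamic(existing_dod, retriever_outputs):
    """Return a new mapping in which retriever entries fill gaps only.

    Both arguments map variable names to attribute dictionaries.
    Anything already present in existing_dod is kept unchanged.
    """
    merged = {name: dict(attrs) for name, attrs in existing_dod.items()}
    for name, attrs in retriever_outputs.items():
        target = merged.setdefault(name, {})
        for key, value in attrs.items():
            # setdefault never replaces a value the DOD already defines.
            target.setdefault(key, value)
    return merged


# Hypothetical content used only to exercise the sketch.
dod = {"temp": {"units": "degC"}}
retriever = {
    "temp": {"units": "K", "long_name": "Temperature"},
    "rh": {"units": "%"},
}
merged = merge_dynamic(dod, retriever)
```

Here the existing units of ‘temp’ survive, its missing long_name is filled in from the retriever, and the ‘rh’ variable is added because the DOD did not define it.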

Dynamic DODs are generated using the --dynamic-dods command line option.

DOD Elements

Dimensions:

All dimensions that comprise a variable’s coordinate system must be defined in the DOD. A full definition not only meets the netCDF requirement of documenting the dimension name and length, but also, when possible, includes a variable whose name matches that of the dimension, whose only dimension is the dimension itself, and whose values document the possible values of that element of another variable’s coordinate system. For example, if a temperature measurement in a file is dimensioned by time and height, where time is the unlimited dimension and height has a length of 10, a variable named ‘height’ should be created that records the 10 heights of the temperature variable’s coordinate system.
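The coordinate variable rule can be expressed as a small check, sketched below in Python. Dictionaries stand in for a DOD: dimensions map to their length (None for the unlimited dimension) and variables map to a tuple of dimension names. The layout and function name are hypothetical and do not reflect how the PCM stores DODs.

```python
def missing_coordinate_variables(dimensions, variables):
    """Return dimensions that lack a same-named variable dimensioned by itself."""
    missing = []
    for dim_name in dimensions:
        # A coordinate variable has the dimension's name and only that dimension.
        if variables.get(dim_name) != (dim_name,):
            missing.append(dim_name)
    return missing


# The time/height example: height has length 10, time is unlimited (None).
dimensions = {"time": None, "height": 10}
variables = {
    "time": ("time",),
    "temperature": ("time", "height"),
}
before = missing_coordinate_variables(dimensions, variables)

# Adding a 'height' variable dimensioned by 'height' completes the definition.
variables["height"] = ("height",)
after = missing_coordinate_variables(dimensions, variables)
```

In this sketch `before` lists ‘height’ as missing and `after` is empty. Because the rule applies only ‘when possible’, a result like `before` is a prompt for review rather than necessarily an error.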

Variables:

Output variables include not only variables representing the primary measurements produced by the VAP analysis, but also all input variables whose values are used in determining the primary measurements. These input variables are referred to as ‘passthrough variables’.

Variable attributes that must be defined for all variables include the variable name, dimensions, data type, long_name, and units. Attributes that should exist if they are applicable to the variable include valid_min, valid_max, valid_delta, resolution, and missing_value. Optional attributes that can be added to provide additional detail include comment(s), precision, accuracy, and uncertainty. In addition to these attributes, an indication of whether the variable has a companion QC variable and whether it is a primary measurement can also be made.
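A review of these requirements can be sketched in Python as follows. The dictionary layout for a variable and the function name are hypothetical; the attribute names are the ones listed above. The ‘applicable’ attributes are reported only as candidates to consider, since whether they apply depends on the variable.

```python
REQUIRED_FIELDS = ("name", "dimensions", "data_type")
REQUIRED_ATTRIBUTES = ("long_name", "units")
IF_APPLICABLE = ("valid_min", "valid_max", "valid_delta",
                 "resolution", "missing_value")


def review_variable(variable):
    """Return (missing required items, applicable attributes not yet set)."""
    attributes = variable.get("attributes", {})
    # Test key presence, not truthiness: a scalar has empty dimensions.
    missing = [field for field in REQUIRED_FIELDS if field not in variable]
    missing += [name for name in REQUIRED_ATTRIBUTES if name not in attributes]
    to_consider = [name for name in IF_APPLICABLE if name not in attributes]
    return missing, to_consider


# Hypothetical variable definition used only to exercise the sketch.
temperature = {
    "name": "temperature",
    "dimensions": ("time", "height"),
    "data_type": "float",
    "attributes": {"long_name": "Temperature", "missing_value": -9999.0},
}
missing, to_consider = review_variable(temperature)
```

For this variable the sketch reports ‘units’ as missing and leaves valid_min, valid_max, valid_delta, and resolution for the reviewer to consider.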

Global Attributes:

Global attributes are metadata that apply to all variables in the file and over all time samples. Typically metadata should not itself carry attributes, because it is an attribute. As a result, if a global attribute has units, it should probably be defined as a static variable rather than a global attribute so that its units can be properly defined in a variable attribute.

Global attributes whose values need to be assigned at run time should not be assigned a value in the PCM DOD Definition form. If a global attribute has a value in the PCM, that value cannot be overwritten during processing. If the global attribute has a value equal to the value being assigned, the process will run successfully. The ADI library sets some ARM standard global attribute values at run time. To ensure these attributes can be properly assigned values during run time, the ADI libraries verify that they do in fact have NULL values in the PCM. If they do not, the process will fail with the error “Invalid Global Attribute Value”. The following global attributes are set by the ADI libraries during run time:

command_line, process_version, dod_version, site_id, facility_id, data_level, datastream
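A pre-check for this condition can be sketched in Python. The attribute names are those listed above; the dictionary layout, the function name, and the use of None to stand in for a NULL value are assumptions made for illustration and are not part of ADI.

```python
RUNTIME_GLOBAL_ATTRIBUTES = (
    "command_line", "process_version", "dod_version",
    "site_id", "facility_id", "data_level", "datastream",
)


def preset_runtime_attributes(global_attributes):
    """Return run-time attributes that already carry a value in the DOD."""
    return [
        name for name in RUNTIME_GLOBAL_ATTRIBUTES
        if global_attributes.get(name) is not None  # None stands in for NULL
    ]


# Hypothetical global attributes used only to exercise the sketch.
global_attributes = {
    "site_id": "sgp",      # preset, so it would need to be cleared
    "datastream": None,    # left NULL for the libraries to fill in
    "title": "Example",    # not set at run time, so it may keep a value
}
preset = preset_runtime_attributes(global_attributes)
```

Any name returned by the sketch is an attribute whose value should be cleared in the DOD before the process is run.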

Adding DOD Elements to a DOD **THIS SECTION IS OUTDATED

Variables, dimensions, variable attributes, and global attributes can be added to a datastream’s DOD either by adding a new DOD element or by dragging and dropping DOD elements from an existing datastream’s DOD. Some ancillary variables (such as companion QC and status variables) can be added via a button available in each measurement variable’s details subform. Frequently it is more efficient to add passthrough variables, and the variables corresponding to their coordinate dimensions, to a DOD by dragging and dropping them from their input datastream. The drag and drop method is helpful for any DOD element that closely resembles an element in an existing datastream. The following tools are useful for editing the DOD variable attributes:

  • Green plus - insert new DOD element after selected element.

Note: This can be used to insert a dimension, variable, or global attribute.

  • Duplicate sheets - copies the highlighted element and inserts the copy immediately after the existing element.

  • Pencil - edits selected element.

  • Red ‘X’ - deletes selected element.

  • Arrow rotating counter clockwise - undo last edit.

  • Arrow rotating clockwise - redo last edit.

  • Single sheet - brings up DOD in text editor.

  • Green check symbol - attempts to automatically fix simple problems in a DOD.

  • Disk - save DOD to DSDB.

  • Green arrow pointing up - exports DOD to text file.

  • Drag and drop functionality - DOD elements can be dragged and dropped across and within DODs. Multiple elements can be selected using the shift and control keys and moved all at once.

  • If a DOD element is selected (dimension, variable, attribute), a mouse right click will bring up a menu of editing commands which if selected will be performed against the selected DOD element.

Adding a New DOD Element to a DOD

By inserting a New DOD Element.

  1. Select the element (such that it is highlighted) after which the new element will be inserted.

  2. Select the green plus icon located just above the dimensions to create the new element and open up the element’s edit form in a third frame on the left of the PCM (Figure 5.4 shows the DOD Variable form).

By using Ancillary Variable Button.

By Dragging and Dropping from an Existing Datastream.

  1. Locate a similar element in the PCM’s Datastreams tab in the left frame. For a passthrough variable, the value to copy is the retrieved variable in the most preferred input datastream.

Note: the Retrieval Definition form can be accessed from the VAP Process window tab located at the top of the PCM page. Return to the DOD window by selecting the DOD tab.

  2. Expand the datastream, the desired DOD version, and variables of the input datastream (see Figure 5.5).

  3. Select the element(s) to retrieve by highlighting them (either with a single click for one variable, shift-mouse click for a range of variables, or control-mouse click for more than one non-adjacent variable) and drag the element from the left frame and drop it into the desired location in the Retrieval Definition frame on the right.

_images/pcm_adding_new_variable_newDOD_and_plus.png

_images/pcm_locate_passthrough_variable.png

Review Your DOD Elements

All elements and their attributes, regardless of how they were created (imported, added as a new element, or copied from an existing datastream), should be reviewed to ensure they meet ARM standards and have been properly defined. The primary measurement variable check box should be selected for all primary measurements so that this characteristic will be stored in the DSDB.

Add/Edit DOD Elements

  1. Access the element’s edit form (by either double clicking on the element’s name in the new DOD, or highlighting the element and selecting the pencil icon located at the top of the page), or the element’s DOD Attribute form (select the attribute and either double click or select the pencil icon). Alternatively, select the variable or attribute, right click, and select edit.

  2. Create and set the values of the element.

Note: When editing the DOD Variable elements:

  • If you specify additional dimensions for a variable in the DOD Variable form, those dimensions will be automatically added to the DOD.

  • If the QC box is checked, a QC variable will be created when the DOD Variable form is exited (i.e., the ‘Done’ button is selected).

  • Select the primary measurement box if the variable is a primary measurement.

  3. Select the disk icon located at the top of the DOD (Figure 5.2) to save the updated DOD to the DSDB.

For the example_vap process being developed, the passthrough variables were populated in the examplevap.c1 DOD v0.0 through a drag and drop process. Figure 5.7 shows these variables expanded with their original values. Figure 5.8 shows the same variables after they have been edited for the change in name and units of the cloud base height variable, and to meet VAP QC standards. To expedite the process, the attributes of the qc_first_cbh were copied from another *c1 level datastream. Note that in the following figure, the variables added to the basic template are noted in green, and in Figure 5.8, existing attributes that have been updated are indicated in red and new attributes are highlighted with a green bar.

_images/pcm_passthrough_variable_attr_before_update.png

Passthrough variables before editing.

_images/pcm_passthrough_variable_after_edit.png

Passthrough variables after editing.

For the example VAP we will add one new variable ‘new_cloud_base_height’ that is equal to the ‘cloud_base_height’ variable divided by 10 to convert its units to meters. The ‘new_cloud_base_height’ and ‘qc_new_cloud_base_height’ variables were created by duplicating the ‘cloud_base_height’ variable (select the cloud_base_height variable and then the paper duplicate icon). They were then edited to update their names and attribute values. The new variables and their attributes are shown in the following figure.

_images/pcm_completed_DOD_exVAP.png

The figure above shows the completed example_vap.process DOD.

Fixing Problems in the DOD

Complete the definition of the datastream by addressing the error, warning, and notice indicators in the DOD that appear to the left of the variables as red, yellow, and grey circles respectively. The colors indicate the confidence that a fix is needed and the severity of leaving something unfixed. Red indicates it is highly likely a fix is needed and that not fixing it could significantly impact a user’s or automated processing application’s ability to understand and use the information. Yellow indicates a fix is likely needed, but with known exceptions, so it is never auto fixed. Grey indicates a low impact, possibly subjective, recommended change and not a problem. Mousing over the indicator will reveal a pop up describing the issue. If the pop up is ‘+ 1 child notice’, expand the variable and the indicator will move to the appropriate attribute of the variable and the pop up will be updated with the specific issue.

In Figure 5.10 the error ‘Missing or invalid coordinate dimension for this dimension’ occurs because the ‘backscatter’ variable is dimensioned by time and range. While the range dimension was automatically added to the DOD, ARM standards require that a variable ‘range’ also be included in the DOD if it can be defined. This issue can be resolved by returning to the input datastream and dragging and dropping the ‘range’ variable into the DOD.

_images/pcm_DOD_error_messages.png

It is necessary to clear all variable attributes and global attributes whose values are populated by the VAP process itself; otherwise changes in these values will be associated with changes in DOD version. More importantly, the ADI shared libraries will overwrite the values populated by the VAP process with the values indicated in the output DOD. For example, in a DOD populated by importing from an existing netCDF file, if a user does not delete the value of the ‘history’ global attribute, the history value from the imported file will be propagated to all output files. The error “Attribute should not be NULL (set at run-time)”, indicated by a red circle, flags such problems. Attributes that are known to be set at run time are automatically deleted by the auto-fix function executed via the green check icon. The DOD should also be reviewed for non-standard attributes whose values are defined by the VAP process, and their values manually set to NULL.

CAUTION: Periodically save the DOD. Exiting the Process Configuration Manager web page without saving will result in the loss of unsaved changes.

Submitting DODs to Metadata Service

To facilitate the storage, exploration, and retrieval of data associated with a datastream, all ARM datastreams must have an instrument code and the datastream’s DOD must be stored in the ARM MetaData database. DODs can be submitted to the ARM MetaData database via the PCM DOD GUI through a link to the ARM MetaData Service (MST) submission form. The link is located in the upper right corner and is labeled ‘Submit for Review’ in green text. If the DOD has already been submitted and approved, the link is labeled ‘View Review[APPROVED]’. If it has been submitted but not approved, the link reads ‘Under Review’. For VAPs that are to be released to the ARM production processing system, the responsible developer should work with their translator on populating this form. Figure 4.1 shows the Datastream Information page for the examplevap.c1 output datastream of the example_vap process.

CAUTION: Do not perform this step unless you are working with an ARM science translator on a production ARM VAP.

_images/pcmdod_MST_link.png