The Data Pool is a concept proposed by the Science Working Group on Data (SWGD). The SWGD viewed the lack of adequate data distribution solutions at the Distributed Active Archive Centers (DAACs) as the most critical problem facing the Earth Observing System (EOS), and proposed the Data Pool as an answer. The Data Pool uses a large (i.e., many TB) disk cache at each DAAC to hold EOS data for an extended period (e.g., a month or more) after initial insertion. The contents of the cache are tuned to the needs of the user community. The goal is to increase the distribution capacity of the EOS Data and Information System (EOSDIS) Core System (ECS) by a factor of five by significantly reducing the need to access the tape archive. In addition, user access to this data should become faster and easier.
This delivery of the Data Pool includes functionality to maintain the Data Pool disks and inventory (define insert rules, insert science and browse granules and metadata, clean up expired granules, report on Data Pool access, and tune the Data Pool content), as well as to access data in the Data Pool by subscription, FTP, or the web. It also includes the ability to store non-ECS data and to convert files, on demand, from the standard ECS data format (HDF-EOS) to GeoTIFF.
Data is stored in the Data Pool in a predefined directory structure, as follows:
<BMGTmetadatagroup>/<shortname.versionid>/<acquisitiondate>
where <BMGTmetadatagroup> is a grouping of ECS collections by instrument and mission, except in the case of MODIS data, where the grouping is by instrument, mission, and major discipline. For example, MOAA is MODIS Atmosphere data from the Aqua mission, and ASTT is ASTER data from the Terra mission.
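As an illustration of the directory layout, the following is a minimal Python sketch that assembles a path of the form <BMGTmetadatagroup>/<shortname.versionid>/<acquisitiondate>. The function name, the example collection (MYD04_L2, version 004), and the YYYY.MM.DD date format are assumptions made for this example only; this text does not specify how the acquisition date or version ID is formatted, so check an actual Data Pool listing before relying on them.

```python
import posixpath
from datetime import date


def data_pool_directory(metadata_group: str,
                        short_name: str,
                        version_id: str,
                        acquisition_date: date,
                        date_format: str = "%Y.%m.%d") -> str:
    """Build <group>/<shortname.versionid>/<acquisitiondate> (hypothetical helper).

    date_format is an ASSUMPTION: the document does not state how the
    acquisition date directory is written.
    """
    # Collection directory joins short name and version ID with a dot.
    collection = f"{short_name}.{version_id}"
    # posixpath keeps forward slashes regardless of the local platform.
    return posixpath.join(metadata_group,
                          collection,
                          acquisition_date.strftime(date_format))


if __name__ == "__main__":
    # Illustrative values only: MOAA is the MODIS Atmosphere / Aqua group.
    print(data_pool_directory("MOAA", "MYD04_L2", "004", date(2003, 7, 15)))
```

Passing the version ID as a string keeps any leading zeros intact, and the date format is a parameter so it can be changed if the real directories use a different convention.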
Science granules (data files) are stored in the appropriate directory based on collection and acquisition date. A corresponding metadata file is stored with each science granule in the science granule directory, in XML format with a .xml extension. Links to all associated Browse files in the Data Pool are also stored in the science granule directory. Browse files are created by extracting the JPEG image embedded in each browse HDF file and are stored in a Browse directory in the Data Pool. Note that browse files that do not have embedded JPEG images will not be visible via the Data Pool.
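To show how the contents of a science granule directory might be matched up, here is a small Python sketch that pairs each file with its XML metadata file. It assumes the metadata file name is the granule file name with ".xml" appended; this text only states that the metadata file has a .xml extension, so the naming rule, the function name, and the sample file names are all hypothetical.

```python
from typing import Dict, Iterable, Optional


def pair_with_metadata(filenames: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each non-XML entry to its metadata file name, or None if absent.

    ASSUMPTION: metadata is named "<granule file name>.xml".
    """
    names = set(filenames)
    pairs: Dict[str, Optional[str]] = {}
    for name in sorted(names):
        if name.endswith(".xml"):
            continue  # metadata files are values, not keys
        candidate = name + ".xml"
        pairs[name] = candidate if candidate in names else None
    return pairs


if __name__ == "__main__":
    # Made-up directory listing for illustration.
    listing = ["granule_a.hdf", "granule_a.hdf.xml", "browse_link.jpg"]
    for granule, metadata in pair_with_metadata(listing).items():
        print(granule, "->", metadata)
```

Entries with no matching metadata file, such as links to Browse files, map to None under this assumption.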
Check the Release Notes for known problems and potential workarounds.