CEIDARS 2.5 Database Structure
This page last reviewed March 13, 2013
This document specifies the logical design of the CEIDARS database. It consists of
two parts: 1) a description of the basic data tables
and their contents and 2) a specification of the relationships between the tables.
Together they constitute the logical schema for CEIDARS.
DATABASE TABLES
The CEIDARS database consists of two categories of information: source information and utility information. Source information includes the
basic inventory information generated and collected on all point and area sources. Utility information generally
includes auxiliary data that helps categorize and further define the source information. Used together, the two
categories allow CEIDARS to generate complex reports based on a multitude of category and source selection criteria.
Source Information
Source information contains the basic data on the facilities, stacks, devices, and
processes that emit criteria pollutants into the air. There are two types of sources: point and area. Point sources
are generally large sources that are individually identified in the database and have fixed locations, such as
power plants or steel mills. Area sources are generally small sources that individually emit small quantities
of pollutants but collectively result in significant emissions. Examples of area sources are smaller plants
not accounted for in the point source inventory, and sources of emissions occurring over broad geographic locations,
such as pesticide usage, applications of architectural coatings, and motor vehicle activity.
Information generated and collected for point and area sources is stored in the tables
listed below.
- FACILITY - This table contains the name, address, and UTM coordinates of each emitting facility in CEIDARS. A combination of the county ID, the facility ID, the airbasin code and the district code uniquely identifies a facility. These four fields together form the primary key for the table.
- STACK - This table contains the pertinent stack parameters for all the facilities which have stacks. These parameters include stack height, flow rate, diameter, temperature and UTM coordinates of each stack. Not all facilities have stacks. The primary key for the stack table consists of the county ID, facility ID, airbasin code, district code, and the stack ID.
- DEVICE - This table contains the information identifying each device in a facility which has emitting processes. Each facility identified in the database should have one or more devices. Data stored in this table includes local device name, permit ID, and number of devices represented. The primary key to the device table is the county ID, facility ID, airbasin code, district code and the device ID.
- PROCESS - This table identifies all processes that emit pollutants. For point sources, each device identified in the database has one or more emitting processes. For area sources, each category of emissions is identified as a process. This table includes such process information as monthly throughput, process rate, process descriptions, operating cycles, and the stack ID if the emissions from the process are vented to a stack. Processes and devices may emit pollutants directly to the ambient environment or they may be vented to a stack. Several devices and many processes may be vented to a single stack. The primary key to this table is the county ID, facility ID, air basin code, district code, device ID and the process ID.
- EMISSION - This table contains the actual emissions for each emitting process. Each process emits one or more pollutants. For each pollutant emitted, the table carries information on the emission factors used, amounts emitted, methods of calculation and types and efficiency of control equipment used. The primary key to this table is the county ID, facility ID, air basin code, district code, device ID, process ID and the pollutant ID.
- EXCESS - This table records the unplanned excess emissions, which may result from breakdowns, variances, or unusual occurrences. The primary key to this table is the county ID, facility ID, air basin code, district code, device ID, process ID, the pollutant ID, along with the type, year and quarter of the excess emissions.
Detailed listings and descriptions of data fields for each of these tables are contained
in the CEIDARS Data Dictionary.
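The primary keys listed above nest inside one another: the DEVICE key extends the FACILITY key, the PROCESS key extends the DEVICE key, and so on down to EXCESS. The following is a minimal Python sketch of that nesting, not the actual CEIDARS schema. All column names (`county_id`, `air_basin`, `excess_type`, and so on), the helper `key_of`, and the sample values are hypothetical; the real field names are defined in the CEIDARS Data Dictionary.

```python
# Hypothetical column names for illustration only; the actual field
# names are defined in the CEIDARS Data Dictionary.
FACILITY_KEY = ("county_id", "facility_id", "air_basin", "district")
DEVICE_KEY = FACILITY_KEY + ("device_id",)
PROCESS_KEY = DEVICE_KEY + ("process_id",)
EMISSION_KEY = PROCESS_KEY + ("pollutant_id",)

# Primary key of each source information table, as described in the list above.
PRIMARY_KEYS = {
    "FACILITY": FACILITY_KEY,
    "STACK": FACILITY_KEY + ("stack_id",),
    "DEVICE": DEVICE_KEY,
    "PROCESS": PROCESS_KEY,
    "EMISSION": EMISSION_KEY,
    "EXCESS": EMISSION_KEY + ("excess_type", "year", "quarter"),
}


def key_of(row, table):
    """Return the values in `row` that make up the primary key of `table`."""
    return tuple(row[column] for column in PRIMARY_KEYS[table])


# A made-up EMISSION record; none of these values are real CEIDARS codes.
emission_row = {
    "county_id": 1,
    "facility_id": 100,
    "air_basin": "XX",
    "district": "YY",
    "device_id": 5,
    "process_id": 2,
    "pollutant_id": 99999,
}

# Because the keys nest, the same record identifies its process and facility.
process_key = key_of(emission_row, "PROCESS")
facility_key = key_of(emission_row, "FACILITY")
```

Because each key is a prefix of the next, any EMISSION or EXCESS record carries enough information to locate its process, device, and facility without a separate lookup. The STACK key is the exception: it branches off the FACILITY key, and a process refers to its stack through the stack ID field.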
Utility Information
Beyond the six tables above, which carry the basic inventory data from the various
sources in the state, a host of other tables supplement the basic inventory data. These are used
to help categorize the sources, define the codes used, forecast future scenarios, and assist in quality assurance.
Listed below are brief descriptions of some of the more significant tables:
- COABDIS - This table contains the county ID and names of all counties in California, the airbasin codes and names, and all district codes and names.
- POLLUTANT - This table lists all the pollutants maintained in CEIDARS. It shows the pollutant long and short names by pollutant ID. For criteria pollutants, this pollutant ID is the EPA SAROAD code. For toxics, it is generally the Chemical Abstract Service (CAS) identification number, unless one has not yet been assigned -- in which case, a four-digit ARB pollutant ID is used.
- SCC - This table contains the Source Classification Codes and the descriptions of what they mean. It also contains codes to characterize process, entrainment and materials, and dimensions.
- SIC - This table contains the Standard Industrial Classification codes along with their descriptions. Additionally, it contains codes characterizing the activity and subactivity.
- CATEGORY - This table contains all combinations of SCC and SIC used in CEIDARS. Each combination provides a grouping that is useful for categorizing the sources and provides flexibility in reporting. For example, forecasting is performed based on the growth and control categories defined by the combination of SCC and SIC. Other groupings are used for producing summary reports. It also contains the fractions for reactive organics and profiles for particulates and organic categories. The CATEGORY table also provides a pointer to ARB's new Emission Inventory Code (EIC) categorization scheme.
- EIC - This table lists ARB's new Emission Inventory Codes (EIC). Each fourteen-digit EIC is composed of four parts. The first part is a three-digit code pointing to the summary category. The second part is a three-digit code pointing to the source category. The third part is a four-digit code pointing to the materials description. The last part is another four-digit code pointing to a subcategory.
In addition to these utility information tables, there are numerous others that
serve primarily to provide additional information and to describe the codes used in the other tables. Descriptions
of these tables and their contents are listed in the CEIDARS Data Dictionary.
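The four-part layout of an EIC described above (3 + 3 + 4 + 4 digits) can be illustrated with a short sketch. It assumes the code is held as a 14-character string of digits with no separators, which the description does not state. The names `EicParts` and `split_eic` and the sample value are hypothetical.

```python
from typing import NamedTuple


class EicParts(NamedTuple):
    """The four parts of a fourteen-digit EIC (field names are hypothetical)."""
    summary_category: str   # first three digits
    source_category: str    # next three digits
    material: str           # next four digits
    subcategory: str        # last four digits


def split_eic(eic: str) -> EicParts:
    """Split a 14-digit EIC string into its 3 + 3 + 4 + 4 digit parts."""
    if len(eic) != 14 or not (eic.isascii() and eic.isdigit()):
        raise ValueError(f"expected 14 digits, got {eic!r}")
    return EicParts(eic[0:3], eic[3:6], eic[6:10], eic[10:14])


# A made-up code used only to show the split; it is not a real EIC.
parts = split_eic("12345678901234")
```

Keeping the parts as strings rather than integers preserves any leading zeros in a code.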
RELATIONSHIP BETWEEN TABLES
To be useful, records in different tables must be joined together in a meaningful
way. That is why the logical schema must also specify the relationships between tables and the paths for joining
them.
The figure below is a data structure diagram that illustrates the relationships between
the various CEIDARS tables. (Only the primary tables are shown -- otherwise the diagram would be too busy
and confusing.) The clear boxes are the source information tables, the shaded boxes are the utility tables,
and the lines between them show their relationships with one another.
The arrowheads on the relationship lines show whether the relationships between records
are one-to-one, one-to-many or many-to-many. For example, the line between the FACILITY table and the
DEVICE table has a single arrowhead at the FACILITY table and a double arrowhead at the DEVICE table: this means
that a single facility may have multiple devices, and conversely, a specific device can be located in only
one facility.
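The one-to-many relationship between FACILITY and DEVICE can be sketched with Python's built-in sqlite3 module. This illustrates the relationship only and is not the actual CEIDARS implementation. The column names, data types, and sample rows are assumptions, and CEIDARS itself may enforce the relationship differently.

```python
import sqlite3

# In-memory database; column names and types are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE facility (
    county_id INTEGER, facility_id INTEGER, air_basin TEXT, district TEXT,
    name TEXT,
    PRIMARY KEY (county_id, facility_id, air_basin, district)
);
CREATE TABLE device (
    county_id INTEGER, facility_id INTEGER, air_basin TEXT, district TEXT,
    device_id INTEGER,
    device_name TEXT,
    PRIMARY KEY (county_id, facility_id, air_basin, district, device_id),
    -- Each device row must point at exactly one existing facility row.
    FOREIGN KEY (county_id, facility_id, air_basin, district)
        REFERENCES facility (county_id, facility_id, air_basin, district)
);
""")

# Made-up sample data: one facility with two devices.
conn.execute("INSERT INTO facility VALUES (1, 100, 'XX', 'YY', 'Example Plant')")
conn.executemany(
    "INSERT INTO device VALUES (1, 100, 'XX', 'YY', ?, ?)",
    [(1, "Boiler"), (2, "Turbine")],
)

# Join along the shared key columns: one facility row matches many device rows.
rows = conn.execute("""
    SELECT f.name, d.device_name
    FROM facility AS f
    JOIN device AS d USING (county_id, facility_id, air_basin, district)
    ORDER BY d.device_id
""").fetchall()

# A device that refers to a facility not in the table is rejected.
try:
    conn.execute("INSERT INTO device VALUES (1, 999, 'XX', 'YY', 1, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The foreign key is what makes the "many" side well defined: every device row names exactly one facility, while nothing limits how many device rows share the same facility key.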
Relationships between tables are also implied in the CEIDARS Data Dictionary
where the domain for a field in a table is another table.