
Maps and Data Frequently Asked Questions

How can I see a full list of included data?
What is metadata?
The map displays the data as a range. How can I get the actual numbers?
Why did you use these particular data layers?
What data did you want to include that you couldn't?
Why do you include only Massachusetts?
What is statistical significance and why is it so important?
How accurate is the data?
What are SIRs and why are they used rather than rates?
What are limitations of the data?
What does -999 mean?
Why is some data displayed as points and some as shapes?
How can I find out where the data came from, how it was collected, quality control methods, and other information?
Why do the time frames for the data sets vary?
Why is it difficult to compare data?
How can I/should I interpret what I see?
Should I cite a map and how should I do it?
What data will you include in the future?
   

How can I see a full list of included data?

Follow the link from Build Your Own Map and scroll through the layers.

What is metadata?

Metadata is “data about the data.” It is the information that helps you better understand the data, including details about the quality, source, and content of the data. The use of metadata is standard in the GIS industry, and the methods for reporting metadata used here follow the standard developed by the Federal Geographic Data Committee (FGDC). To access metadata, click on the data layer you are interested in and the metadata will open in a new window.

The map displays the data as a range. How can I get the actual numbers?

You can use the Identify tool and click on a specific town or other map item to reveal detailed information.

Why did you use these particular data layers?

The primary objective of the pilot version of this application was to make the Institute's original research on breast cancer and the environment on Cape Cod publicly available. The secondary objective was to enhance these data with other publicly available environmental and public health data sets for MA. The data included in this version of the application are primarily publicly available data sets from the Environmental Protection Agency (EPA), Massachusetts Geographic Information System (MASS GIS), and the Massachusetts Department of Public Health (DPH).

What data did you want to include that you couldn't?

The purpose of MassHEIS is to bring together geographic data on health effects and environmental pollutants so that potential relationships can be explored by researchers and the public. In the context of this project, environmental data are useful if they can be used to estimate pollutant exposures or evaluate relationships between human activities and environmental change (e.g., how does drinking water quality relate to the land uses within drinking water well recharge areas?). Health data are useful if they are systematically collected, can be expressed with reference to an expected frequency in the underlying population, and can be shared on a scale that is relevant to variations in pollutant exposures or demographic factors of interest. The data currently included in MassHEIS are the most relevant we were able to identify, but many of the included data sets have limitations. Data on many environmental and health features of interest are simply not available. For example, only limited data are available on prevalence of learning disabilities and autism; rates of many common cancers and health effects would be more informative if mapped by census tract rather than by town; and pollution monitoring data are extremely scarce. With a greater state and federal investment in public health tracking, more could be learned.

A few examples of data we were not able to obtain are described below:
(1) Established risk factors for breast and other cancers, including body weight, alcohol use, and tobacco use: These data are only available for major metropolitan statistical areas in MA through the Behavioral Risk Factor Surveillance System (BRFSS).
(2) Brownfields: A compilation of all brownfields in the state does not currently exist. The MassDevelopment agency defines brownfields as "vacant, abandoned, or underutilized industrial or commercial properties where expansion, redevelopment, or improvement is complicated by real or perceived environmental contamination and liability." MassHEIS does, however, display a data set, the MassDEP Tier Classified Oil and/or Hazardous Material Sites, pinpointing all the sites currently going through the Mass. 21E regulatory cleanup process.
(3) Pesticide application: Silent Spring Institute compiled detailed historical data on wide-area pesticide applications across Cape Cod from the 1940s to 1990. Comparable historical information for the rest of the state has not been compiled, and no electronic, geographically-oriented data on current pesticide applications are available in MA to map (although certain land use categories may be assumed to involve pesticide use). Other states, such as California, have pesticide application reporting requirements and tracking systems that, if implemented in Massachusetts, would allow mapping of all pesticide applications by registered applicators for the entire state.
(4) Current data on modeled hazardous air pollutants: The EPA published a national hazardous air pollutant model using 1996 data, but has not published an updated version. New data would allow trend analysis and comparisons with current patterns of respiratory health effects.
(5) Cases of Asthma: The BRFSS collects self-reported data on cases of asthma but these data are only available for limited geographical areas. Instead, we chose to map incidence of asthma or asthma-related hospitalizations, because these data are available by town for each town in MA.

Why do you include only Massachusetts?

Prompted by breast cancer incidence rates that are 20% higher on Cape Cod than in the rest of MA, Silent Spring Institute began a long-term research program in 1994 to investigate the possible role of environmental factors in breast cancer incidence on Cape Cod. Because Cape Cod is a fragile ecosystem, with water resources easily affected by contaminants deposited on the land surface or leached from wastewater, breast cancer activists called for an investigation of the role that environmental pollutants played in the long-term health of Cape residents. As part of that study, Silent Spring Institute created a Geographic Information System (GIS). A GIS is a computerized database that can be used to store, analyze, and display data, particularly data associated with locations on a map.
HEIS is intended to provide public access to the integrated health and environmental information gathered during the course of the Cape Cod Breast Cancer and Environment Study and thus most detailed data are available for Cape Cod. The secondary objective was to enhance these data with other publicly available environmental and public health data sets available for MA. The data included in this version of the application are primarily publicly available data sets from the Environmental Protection Agency (EPA), Massachusetts Geographic Information System (MASS GIS), and the Massachusetts Department of Public Health (DPH).

What is statistical significance and why is it so important?

Statistical significance is a useful guideline in interpreting research findings because it assesses the likelihood that results are due to chance alone. Statistical significance is based on a number called a "p-value." By convention, results achieving a p-value less than 0.05 are called statistically significant, which means that, if chance alone were at work, a result at least this extreme would be expected less than one time in 20. The traditional cutoff of p less than 0.05 is arbitrary, and a Standardized Incidence Ratio (SIR) that approaches but does not attain statistical significance may still be of interest.
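
To make the idea of a p-value concrete, here is a minimal sketch that treats a case count as a Poisson variable and asks how likely a count at least as large as the observed one would be by chance. The Poisson model, the function name, and the counts are assumptions for illustration; this FAQ does not state which test was used for the MassHEIS data.

```python
import math


def poisson_upper_tail_p(observed_cases, expected_cases):
    """P(X >= observed_cases) when X ~ Poisson(expected_cases)."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    # Sum P(X = k) for k = 0 .. observed_cases - 1, then take the complement.
    term = math.exp(-expected_cases)  # P(X = 0)
    below = 0.0
    for k in range(observed_cases):
        below += term
        term *= expected_cases / (k + 1)  # P(X = k + 1) from P(X = k)
    return 1.0 - below


# Hypothetical counts for illustration only (not MassHEIS data).
p_value = poisson_upper_tail_p(observed_cases=50, expected_cases=40.0)
print(f"one-sided p-value: {p_value:.3f}")
if p_value < 0.05:
    print("statistically significant at the 0.05 level")
else:
    print("not statistically significant at the 0.05 level")
```

The 0.05 threshold in the last lines is the conventional cutoff described above, not a property of the data.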

How accurate is the data?

Because the data come from a number of different sources, the quality of the data can vary. Generally there are two components to the quality of information in a GIS. The first is the accuracy of the geographic coordinates used to locate information relative to its actual location. The second is the accuracy of the attributes associated with the geographic point. The first component is unique to GISs, while the second would apply to any database. Different sources have different quality control mechanisms. Most of the data included in MASS HEIS come from organizations with established quality control procedures, and it is important to refer to the original source when evaluating data quality. As a quality control measure, all Silent Spring Institute data were verified by a staff member who did not participate in the data entry. By clicking on the data layer you can learn about the sources of other data sets included in HEIS and can follow the links back to the original sources.

What are SIRs and why are they used rather than rates?

Standardized Incidence Ratio, or SIR, is a common tool for monitoring disease rates. Incidence is the number of newly diagnosed cases in a given location during a given time period. An SIR compares the actual number of cases for a given place and time to the number that would be expected based on cancer rates in some comparison area. SIRs are usually written as 100 or 125 instead of 1.00 or 1.25. An SIR of 100 means that the actual number of cases equals the expected number. An SIR of 125 means that the actual number of cases was 25% higher than expected. An SIR of 75 means that the actual number of cases was only 75% of the expected number. For more information, see our glossary.
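
The arithmetic behind an SIR can be sketched in a few lines. The function name and the counts below are hypothetical and chosen only to reproduce the 125 and 75 examples above; they are not MassHEIS data.

```python
def standardized_incidence_ratio(observed_cases, expected_cases):
    """Return the SIR on the 100-based scale (100 = observed equals expected)."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    return 100.0 * observed_cases / expected_cases


# Hypothetical counts for illustration only.
print(standardized_incidence_ratio(50, 40))  # 125.0: 25% more cases than expected
print(standardized_incidence_ratio(30, 40))  # 75.0: 75% of the expected number
print(standardized_incidence_ratio(40, 40))  # 100.0: observed equals expected
```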

What are limitations of the data?

The quality of inferences that can be drawn from the data depends on the quality of the original data. Information in HEIS depends on varying sources of data, methods of collecting and organizing data, scales of the data, and completeness and accuracy of historical records. In addition, to protect the privacy of individuals, Silent Spring Institute has followed standard practice in not reporting cancer information where fewer than 5 cases appeared in a particular town or census tract in a particular time period. We also included only data occurring with a high enough frequency to be meaningful when mapped. For example, Institute researchers determined that birth defects are not prevalent enough to map at the municipal level, given that such data have only been reported to the public for three years. In the future, it may be possible to map birth defects by town by aggregating data over a longer time period. To learn more about any particular data layer, click on the data layer.

What does -999 mean?

Where data were not available or were censored, the value -999 was substituted to clearly differentiate the record from those with real data. Silent Spring Institute is committed to protecting the privacy of individuals and has followed standard practice in censoring town-level cancer rates where fewer than five cases appeared in a particular time period.
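
If you work with values outside the map, the -999 placeholder should be set aside before any summary is computed; otherwise it would pull averages far below any real value. The sketch below shows one way to do that. The function name and the town values are hypothetical, and how you obtain the values is not something this FAQ specifies.

```python
MISSING_SENTINEL = -999  # placeholder described above for unavailable or censored data


def usable_values(values, sentinel=MISSING_SENTINEL):
    """Return only real observations, dropping the sentinel placeholder."""
    return [v for v in values if v != sentinel]


# Hypothetical town-level values for illustration only.
town_sirs = [112, -999, 87, -999, 130]
valid = usable_values(town_sirs)
print(valid)                    # [112, 87, 130]
print(sum(valid) / len(valid))  # mean of the real values only
```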

Why is some data displayed as points and some as shapes?

Data for an area (e.g., town, census tract, county, or school district) are represented by a color shade for that area. Data related to a specific location (e.g., a factory) are represented as points. Most of the health data are represented as shapes and most of the environmental data as points. Although the reported observation (e.g., cancer rate) for a municipality appears evenly distributed throughout the town, it may not be. Further, though much of the environmental data are represented by a single point or location on the map, the impact of a facility may reach beyond its immediate location. To protect privacy, health data that pinpoint individual cases are generally not released. Instead, health data are aggregated and displayed at the census tract, block group, or town level.

How can I find out where the data came from, how it was collected, quality control methods, and other information?

Metadata, or "data about the data," is information that will help you better understand the data, and includes information about the quality, source, and content of the data. The use of metadata is standard in the GIS industry, and the methods for reporting metadata used here follow the standard developed by the Federal Geographic Data Committee (FGDC). Most data sets include links back to their originating sources. To access metadata, click on the data layer you are interested in and the metadata will open in a new window.

Why do the time frames for the data sets vary?

Time frames for data sets we included are typically limited by the source of the data. For example, cancer rates are not available prior to 1982, when the MA Cancer Registry was initiated. In some cases, we were able to provide more extensive historical environmental data for Cape Cod as a result of the Institute's original research, but good quality historical data are largely unavailable electronically for the rest of the state.

Why is it difficult to compare data?

It can be difficult to compare data over time and over geographic areas because of varying sources of data, methods of collecting and organizing data, scales of the data, and completeness of historical records. For example, different sources of data you may see on the maps include: breast cancer statistics for your town from the Massachusetts Cancer Registry; pesticide spraying data compiled by Silent Spring Institute researchers; and toxic release data supplied by the EPA’s Toxics Release Inventory (TRI). You can learn more about the datasets by clicking on the data layers.

How can I/should I interpret what I see?

Very carefully. Maps are a powerful tool for conveying information, but they can be easily misinterpreted. When interpreting a map that you have created, look carefully at the legend to confirm the range of values each color represents. When creating a map with multiple layers, pay close attention to the time frames. While they do not have to be concurrent, you should consider their time frames in making inferences about relationships between factors. In addition, be aware of how the different data sets we have provided were generated. Some, like hazardous air pollutants, are models, or projections. Others, such as asthma, are indicators of a phenomenon (e.g., asthma-related hospitalizations), rather than an absolute measure (e.g., people in the town with asthma). Use the metadata feature to learn the source of the data you are mapping. Finally, some variation is expected due to chance. Look for indications of statistical significance, consistent trends over time, and large sample sizes to determine if a phenomenon is significant. We have gone to great lengths to obtain the best possible data sets for this project; however, they do not communicate seamlessly. Making accurate inferences with this mapping tool requires awareness on your part of the data sets with which you are working.

Should I cite a map and how should I do it?

We ask users of MASS HEIS to please cite the maps in their work. Citing data files and maps retrieved from online sources is important for the following reasons:
• It is critical to acknowledge the authors and developers of a dataset, whether published or unpublished;
• It is valuable for us, our funders, and our data sources to know that the data and information we make available are useful to users;
• The inclusion of data citations provides the information users need to confirm the accuracy and credibility of the data for further analysis.
The example below represents one method of citing the information; however, publications often have their own style manuals, and we suggest you check their reference formatting instructions.
Maps (Dynamically Generated)
Identify the name of the mapping service as well as the name of the person generating the map.
GENERAL FORMAT:
Author [if there is one]. "Map title" [format]. Scale. Computer database title [format]. Edition. Place of production: Producer, Date of copyright or production. Using: Author. Computer software title [format]. Edition. Place of production: Producer, Date of copyright or production.

EXAMPLE:
"Massachusetts Breast Cancer Incidence 1995-2002" [map]. Scale 1" = 40 miles. Silent Spring Institute MassHEIS [computer files]. Anytown, MA: Jane Doe, 2006. Using Silent Spring Institute MassHEIS [online application]. Newton, MA: Silent Spring Institute. 2006.


What data will you include in the future?

Silent Spring Institute is currently working on adding data on drinking water and surface water quality. We are also adding census data that describe demographic characteristics (e.g., age of home, education, and income), births to smokers, low birth weight, autism, and blood lead levels in children. Please use the Feedback box on the HEIS entry page to suggest other data sets you would like to be included in the future.


The e-resources website was made possible by
a grant from the National Library of Medicine