Characteristics of the North America 1-km AVHRR Data Set

ZHI-LIANG ZHU, Hughes STX Corporation, U.S. Geological Survey, EROS Data Center, Sioux Falls, South Dakota 57198, U.S.A.

LIMIN YANG, Center for Advanced Land Management Information Technologies, Conservation and Survey Division, University of Nebraska-Lincoln, Lincoln, Nebraska 68588-0517, U.S.A.

Abstract. The North America portion of a new global 1-km AVHRR time-series data set was produced recently by the U.S. Geological Survey, EROS Data Center. Characteristics of the data set were evaluated for scan-angle distribution, image area distortion as the result of map projection, distribution of high solar zenith angle, and cloud presence in image composites produced using maximum values of normalized difference vegetation index (NDVI). The evaluation showed that the compositing procedure exhibits a bias favoring off-nadir pixels, particularly at post-nadir (forward scanning) positions in the winter months. Results for scan angle distribution and image area distortion provide a basis for calculating the data's effective minimum mapping area for various geographic locations. The amount of missing data due to large solar zenith angle effect varies from 42 percent in January to 1 percent in July. Cloud contaminated pixels estimated for the thirty-six 10-day composites range from 7.5 percent in May to 1.6 percent in November. Recompositing the North America data set from 10-day cycles to monthly cycles can effectively reduce the amount of cloudy pixels in the data.

1. Introduction

A new Advanced Very High Resolution Radiometer (AVHRR) data set covering the land areas of the Earth is being produced by the Earth Resources Observation Systems (EROS) Data Center (EDC) of the U.S. Geological Survey (USGS) in Sioux Falls, South Dakota. Production of this data set is based on specifications developed through the International Geosphere Biosphere Programme (IGBP) (Townshend 1992) with the USGS, the National Aeronautics and Space Administration, the National Oceanic and Atmospheric Administration, and the European Space Agency as the primary sponsors. A network of 30 AVHRR ground receiving stations provides the daily AVHRR data that make the project viable (Eidenshink and Faundeen 1994).

Because data sets of this magnitude for Earth science are unprecedented, an evaluation of the initial data set was undertaken to document and understand its salient qualities. The evaluation was based on the North America subset of the new global data. It focused on (1) determining the distribution of the sensor's scan angle in the composite data, (2) evaluating image area distortions due to the use of the Goode projection and the resampling technique, (3) documenting the high solar zenith angle effect, and (4) assessing the presence of residual clouds in the data.

The processing flow for the new global data set, as recommended by IGBP, is described in Eidenshink and Faundeen (1994). The steps in the process are (1) orbital stitching of daily paths, (2) radiometric calibration of the five original AVHRR channels, (3) geometric registration to the Interrupted Goode Homolosine map projection (Steinwand 1994), (4) calculation of normalized difference vegetation index (NDVI), and (5) multitemporal (10-day) maximum value compositing (MVC) based on NDVI. Next, atmospheric correction for ozone and Rayleigh scattering is performed for AVHRR channels 1 and 2 of the resultant NDVI composite; in the last step, the NDVI band is recalculated using the two corrected AVHRR channels.
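Steps (4) and (5) can be illustrated for a single pixel. The following is a minimal sketch, not the EDC production code: it assumes NDVI is computed as (channel 2 - channel 1) / (channel 2 + channel 1), and the dictionary keys, function names, and sample values are hypothetical.

```python
def ndvi(ch1, ch2):
    """NDVI from channel 1 (visible) and channel 2 (near-infrared) values."""
    total = ch2 + ch1
    if total == 0:
        return None  # undefined when both channels are zero
    return (ch2 - ch1) / total


def max_value_composite(daily_obs):
    """Return the daily observation with the highest NDVI for one pixel.

    daily_obs is a list of dicts with the hypothetical keys
    "day", "ch1", and "ch2".
    """
    best = None
    for obs in daily_obs:
        value = ndvi(obs["ch1"], obs["ch2"])
        if value is None:
            continue
        if best is None or value > best["ndvi"]:
            # Keep the whole observation so the other bands travel with it.
            best = dict(obs, ndvi=value)
    return best


# Invented observations for one pixel within a 10-day period.
observations = [
    {"day": 1, "ch1": 0.30, "ch2": 0.32},  # bright in both channels (cloud-like)
    {"day": 4, "ch1": 0.08, "ch2": 0.40},  # clear vegetated view
    {"day": 9, "ch1": 0.12, "ch2": 0.36},  # hazier view
]
composite = max_value_composite(observations)
print(composite["day"], round(composite["ndvi"], 3))
```

Because the whole observation is carried along with the winning NDVI value, the viewing geometry of the selected day is retained, which is what makes the scan-angle analysis in the next section possible.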

The resulting composites represent a 10-day (3 per month) time series and contain the following 10 bands: AVHRR channels 1-5, NDVI, satellite zenith, solar zenith, relative azimuth, and date index. The North American data consist of two Goode projection regions (figure 1) and thirty-six 10-day composites from April 1-10, 1992 (first period) to March 21-31, 1993 (last period).

2. Scan angle distribution

Previous work has shown that NDVI, and subsequently the MVC process, are a function of AVHRR scan angle (Holben and Fraser 1984, Deering and Eck 1987, Gutman 1991, Moody and Strahler 1994), and using atmospherically corrected data in MVC may shift MVC selection toward high off-nadir positions (Cihlar and Huang 1994). In this study, MVC was computed prior to correcting for atmospheric effects (Eidenshink and Faundeen 1994). The per pixel scan angle values were derived from the satellite zenith band in or near the solar principal plane (95 percent of the pixels are within ±40° relative azimuth angle). Angle values range from 0 to 48 degrees for both pre-nadir (backscatter) and post-nadir (forescatter) scanning directions.

Results show that, on average, there is an MVC bias toward the post-nadir direction (figure 2). This bias is strongest for pixels in the northern half of the North America data, particularly at the extreme 41 to 48 degrees post-nadir. Most pixels are distributed in high off-nadir positions; chances of selecting near-nadir pixels through MVC are low in the North American data set.

Most of the bias may be attributed to preferential selection of post-nadir pixels in the winter months, as indicated in figure 3(a) where a temporal variation of the mean scan angle is evident. In this figure, the weighted mean scan angle was calculated with pre-nadir scan angles as negative and post-nadir scan angles as positive. In general, for the months between late September 1992 and February 1993, MVC strongly favored pixels from the post-nadir direction over pixels from the pre-nadir direction. Between April and middle of September 1992, and March 1993, the average scan angle fluctuated around the nadir. Figure 3(b) shows mean and standard deviation (STD) calculated based on absolute scan angle values (pre- and post-nadir combined).
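The two summaries plotted in figure 3 can be sketched from a histogram of signed scan angles. This is an illustrative sketch only: it assumes the weights are pixel counts per angle bin, and the function name and the histogram values are invented rather than taken from the data set.

```python
import math


def scan_angle_stats(histogram):
    """Summarize a scan-angle histogram.

    histogram maps a signed scan angle in degrees (negative = pre-nadir,
    positive = post-nadir) to the number of pixels at that angle.
    Returns (signed weighted mean, mean of absolute angle, STD of absolute angle).
    """
    total = sum(histogram.values())
    signed_mean = sum(angle * n for angle, n in histogram.items()) / total
    abs_mean = sum(abs(angle) * n for angle, n in histogram.items()) / total
    variance = sum(
        n * (abs(angle) - abs_mean) ** 2 for angle, n in histogram.items()
    ) / total
    return signed_mean, abs_mean, math.sqrt(variance)


# Hypothetical composite in which post-nadir pixels dominate.
example = {-40: 10, -20: 20, 0: 10, 20: 20, 40: 40}
signed_mean, abs_mean, abs_std = scan_angle_stats(example)
print(signed_mean, abs_mean, round(abs_std, 2))
```

A positive signed mean indicates a post-nadir preference, while the absolute statistics describe how far from nadir the selected pixels lie regardless of direction.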

Given that AVHRR pixel size is a function of scan angle (Goward et al. 1991), the majority of the North America AVHRR pixels in the composites have ground dimensions greater than the 1.1-km nominal scale. Mean pixel area, for example, can be calculated in the range from 1.59 to 1.85 km² based on the statistics given in figure 3(b). Actual ground dimensions vary with pixel locations and can be similarly computed using the sensor-target-earth geometry.
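One common way to approximate the sensor-target-earth geometry is sketched below. It is not the authors' calculation: it assumes a spherical Earth of radius 6371 km, a satellite altitude of 833 km, and a square 1.1-km footprint at nadir, and it treats the input as the scan angle measured at the sensor. Under different assumptions the results will differ, so the output should not be expected to reproduce the areas quoted above exactly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
ALTITUDE_KM = 833.0       # assumed nominal satellite altitude
NADIR_PIXEL_KM = 1.1      # nominal pixel size at nadir (from the text)


def pixel_dimensions(scan_angle_deg):
    """Approximate (along-track, along-scan) pixel size in km."""
    theta = math.radians(abs(scan_angle_deg))
    orbit_radius = EARTH_RADIUS_KM + ALTITUDE_KM
    # View zenith angle at the ground, from the law of sines.
    zenith = math.asin(orbit_radius / EARTH_RADIUS_KM * math.sin(theta))
    # Earth-centered angle between the sub-satellite point and the pixel.
    central = zenith - theta
    # Slant range from the law of cosines (equals the altitude at nadir).
    slant = math.sqrt(
        EARTH_RADIUS_KM ** 2
        + orbit_radius ** 2
        - 2 * EARTH_RADIUS_KM * orbit_radius * math.cos(central)
    )
    along_track = NADIR_PIXEL_KM * slant / ALTITUDE_KM
    # The footprint stretches further along the scan because of the oblique view.
    along_scan = along_track / math.cos(zenith)
    return along_track, along_scan


def pixel_area(scan_angle_deg):
    """Approximate pixel ground area in square kilometers."""
    along_track, along_scan = pixel_dimensions(scan_angle_deg)
    return along_track * along_scan


for angle in (0, 23, 28, 48):
    print(angle, round(pixel_area(angle), 2))
```

The sketch shows the qualitative behavior the text relies on: pixel area grows with scan angle, and it grows faster along the scan direction than along the track.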

The observed NDVI bias toward off-nadir positions may be mitigated to a certain extent by extending the compositing period from 10 days to 1 month. Figure 4 shows examples of re-compositing three 10-day composites into one monthly composite for 2 months (August and October 1992). Qualitatively, the re-compositing shifts the scan angle distribution from off-nadir toward near-nadir positions (0 +/- 10 degrees). This potential improvement of image quality provides a rationale for using an extended compositing period (up to 1 month) in land characterization applications for areas where good image data are difficult to obtain within the shorter 10-day compositing period.

3. Image area distortions due to map projection

Geometric registration transforms satellite images into a precise map projection. The process of geometric registration for the global 1-km AVHRR data set involves application of a satellite model to derive a systematic correction and a ground control point matching technique to update the model (Eidenshink and Faundeen 1994). Based on the updated model, raw AVHRR data are mapped into the Goode projection using the nearest neighbor resampling method.

Although the use of the Goode projection reduces image distortions in large land areas (Steinwand 1994), changes in local scale and resolution of the original image still occur as the result of data being compressed and expanded to fit the map projection. To examine these image distortion effects in the North American data, a technique reported by Steinwand et al. (in press) was adopted. A checkerboard image consisting of regularly spaced 10-by-10 pixel squares was created; each square had pixel values ranging from 1 to 100. This checkerboard image was reprojected to various regions of the North America Goode projection, using georegistration grids and the nearest neighbor resampling method. Twenty-four sample windows were taken from representative geographic locations; pixel values of the squares (now skewed) were counted in the new map projection space. Changes in these values represent image area distortions caused by both the use of map projection and the result of the resampling method (Steinwand et al. in press).

The samples are listed (table 1) in ascending order of latitude from 20.85°N to 84.80°N. All samples except one are located in the western hemisphere. If all 100 pixel values (ranging from 1 to 100) from an original square are present in the new block, it means 100 percent of the original data area is transferred and no loss of information occurs. In table 1, samples taken from low latitudes have no reduction in the original data area except two samples, which are located near the central meridian (100°W). In the middle and high latitudes, area compression becomes more noticeable. At the highest latitude sample point, only 69 percent of the original data area is preserved. Thirty-one percent of original information is lost at this point, resulting in a change in the spatial resolution.

As the result of assigning pixels from the original orbital projection to the new map projection, some pixels are used more than once. Pixel duplications in a sample represent a local area enlargement; pixels are duplicated to represent the same ground feature. This does not add to the information content of the data set, and the effect can also make measuring scale in the data misleading. As indicated in table 1, in most of the sample windows the pixels are, on average, mapped more than once, and a single pixel is duplicated up to four times. Note that a portion of this error may be related to the use of nearest neighbor resampling. The error may be reduced if cubic convolution or other interpolation resampling techniques are used.
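The bookkeeping behind the checkerboard approach can be sketched with a toy example. This is an illustration under stated assumptions, not the procedure of Steinwand et al.: a simple change of grid size stands in for the real reprojection, the function names are hypothetical, and average duplication is taken here to mean the mean number of times each surviving value appears.

```python
from collections import Counter


def make_block(size=10):
    """A size-by-size square holding the values 1..size*size."""
    return [[r * size + c + 1 for c in range(size)] for r in range(size)]


def resample_nearest(block, out_rows, out_cols):
    """Nearest neighbor resampling of a block onto an out_rows x out_cols grid."""
    in_rows, in_cols = len(block), len(block[0])
    out = []
    for i in range(out_rows):
        # Source row whose cell contains the center of output row i.
        src_r = ((2 * i + 1) * in_rows) // (2 * out_rows)
        row = []
        for j in range(out_cols):
            src_c = ((2 * j + 1) * in_cols) // (2 * out_cols)
            row.append(block[src_r][src_c])
        out.append(row)
    return out


def distortion_stats(original, resampled):
    """Return (percent of original values kept, average and maximum duplication)."""
    total_values = len(original) * len(original[0])
    counts = Counter(value for row in resampled for value in row)
    area_used = 100.0 * len(counts) / total_values
    average_duplication = sum(counts.values()) / len(counts)
    return area_used, average_duplication, max(counts.values())


block = make_block()
compressed = distortion_stats(block, resample_nearest(block, 10, 8))
expanded = distortion_stats(block, resample_nearest(block, 10, 15))
print(compressed)  # (80.0, 1.0, 1)
print(expanded)    # (100.0, 1.5, 2)
```

Compression drops source values without duplicating any, while expansion keeps every value but repeats some, which mirrors the two kinds of rows seen in table 1.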

4. Solar zenith angle effects

Missing data exist in the North America data as the result of masking for large solar zenith angle (SZA) that occurs at the high latitudes during the winter season. Under such conditions (low solar illumination and long atmospheric path length), data recorded by AVHRR can be erroneous in representing land cover features and are difficult to calibrate. Due to this limitation, all pixels that have SZA greater than 80 degrees are flagged in MVC processing; no NDVI is computed for these pixels.

For North America, the areal extent of missing data due to cut-off SZA is a function of time of the year: large in winter months and small in summer months. Table 2 lists statistics of missing data for all 12 monthly composites from April 1992 to March 1993. Note that the percentage of missing data increases from 1 percent of total pixels in July to 42 percent in January. The missing data can present problems for deriving land characteristics in high latitudes using the multitemporal images.
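The masking rule and the resulting missing-data percentage can be sketched as follows. The 80-degree cut-off comes from the text; the function names, the use of None as the missing-data flag, and the sample values are assumptions made for illustration.

```python
SZA_CUTOFF_DEG = 80.0  # cut-off stated in the text


def mask_high_sza(ndvi_values, sza_values, cutoff=SZA_CUTOFF_DEG):
    """Replace NDVI with None wherever the solar zenith angle exceeds the cutoff."""
    return [
        None if sza > cutoff else ndvi
        for ndvi, sza in zip(ndvi_values, sza_values)
    ]


def percent_missing(masked):
    """Percentage of pixels flagged as missing."""
    return 100.0 * sum(value is None for value in masked) / len(masked)


# Invented pixels: the last two are poorly illuminated winter views.
ndvi_values = [0.50, 0.40, 0.10, 0.05]
sza_values = [35.0, 62.0, 81.0, 88.0]
masked = mask_high_sza(ndvi_values, sza_values)
print(masked, percent_missing(masked))
```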

5. Assessing cloud extent in the composites

The MVC technique used to generate this data set minimizes cloud effect by selecting the "clearest" pixels over the composite period. However, when the period of cloudiness is equal to or longer than the compositing period, cloud pixels cannot be removed by the compositing technique. In addition, subpixel sized clouds are not uncommon in AVHRR data. The presence of cloudy and mixed cloud-land pixels contaminates the data set (Goward et al. 1991, Moody and Strahler 1994).

The assessment for residual clouds was based on a 1-percent systematic sample of the original North America data set; the method was adopted from several previous studies for assessing global and continental distribution of cloud cover using AVHRR data. (See Baglio and Holroyd 1989, Stowe et al. 1991 for more details.) The method uses AVHRR channels 1 through 5, along with solar illumination and viewing geometry for separating clouds from other land features. In this study, 3 cloud screening tests were applied to each of the 10-day composites. The first test was based on the fact that most land surfaces appear much darker than clouds in AVHRR channel 1. The second test took into account the emissivity difference between channel 4 and channel 5 for water and ice, and used channel 4 minus channel 5 radiative temperature difference to detect thin cirrus clouds and clouds in polar latitude (Stowe et al. 1991). The third test used temperature difference between channel 3 and channel 4 to separate cloud from cold snow-covered land (Baglio and Holroyd 1989).

Thresholds used for the 3 tests are similar to those suggested by Stowe et al. (1991) and Baglio and Holroyd (1989) (channel 1 reflectance > 0.4, channel 4 minus channel 5 temperature dependent, and channel 3 minus channel 4 > 30 K). The threshold for the third test (channel 3 minus channel 4) was set relatively high to minimize inclusion of snow-covered land at high latitude. All 3 tests were applied to areas north of 50°N, and areas south of 50°N where snow is present in the winter months; pixels with values exceeding all three thresholds were labeled as cloud. For areas south of 50°N in the summer, cloud pixels were determined based on the agreement of first and second tests.
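The decision rule described above can be sketched per pixel. The channel 1 and channel 3 minus channel 4 thresholds are those given in the text; the temperature-dependent channel 4 minus channel 5 threshold is not specified here, so the placeholder function below uses invented values and should be replaced with a published lookup. Function and parameter names are hypothetical, and the caller is assumed to decide whether snow may be present.

```python
CH1_REFLECTANCE_MIN = 0.4  # threshold from the text
T3_MINUS_T4_MIN_K = 30.0   # threshold from the text


def example_t45_threshold(t4_kelvin):
    """Placeholder for the temperature-dependent channel 4 minus 5 threshold.

    The values are invented for illustration only.
    """
    return 1.5 if t4_kelvin < 260.0 else 3.0


def is_cloud(ch1, t3, t4, t5, latitude_deg, snow_present,
             t45_threshold=example_t45_threshold):
    """Label one pixel as cloud using the three screening tests.

    ch1 is channel 1 reflectance; t3, t4, t5 are temperatures in kelvins.
    """
    test1 = ch1 > CH1_REFLECTANCE_MIN            # brighter than most land
    test2 = (t4 - t5) > t45_threshold(t4)        # thin cirrus and polar cloud
    test3 = (t3 - t4) > T3_MINUS_T4_MIN_K        # cloud versus cold snow
    if latitude_deg > 50.0 or snow_present:
        return test1 and test2 and test3
    return test1 and test2


print(is_cloud(0.6, 300.0, 250.0, 247.0, 60.0, False))  # all three tests pass
print(is_cloud(0.6, 265.0, 250.0, 247.0, 60.0, False))  # fails the third test
print(is_cloud(0.6, 290.0, 280.0, 276.0, 30.0, False))  # two-test rule applies
```

Requiring the third test only where snow is likely keeps bright, cold snow from being labeled as cloud without making the screening needlessly strict elsewhere.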

Table 3 summarizes estimated cloud residuals for the thirty-six 10-day composites. The amount of cloud pixels varies from less than 2 percent in November to almost 8 percent in May. The 10-day composites of spring (April and May), midsummer (August), and fall (late September to early October) exhibit higher percentage of cloud cover than those of winter (November through January). Because no attempt was made in this study to identify subpixel sized clouds, the percentages are regarded as conservative estimates. Geographically, the most cloud-prone regions are found in the tropical latitudes and in Labrador east of Hudson Bay, Canada. Figure 5 illustrates, as an example, the spatial distribution of clouds identified from the April 21-30 composite and corresponding AVHRR channels 1, 4, and NDVI.

A further comparison between 10- and 30-day composites of August and October was made to assess potential reduction of clouds through recompositing. The amount of clouds ranges from 3.7 to 6.2 percent for the 3 August 10-day composites and 3.2 to 6.4 percent for the 3 October composites (table 3). The August and October monthly composites contain 1.3 and 3.1 percent estimated cloud pixels, respectively. This comparison indicates that recompositing over longer periods (for example, monthly) can effectively reduce cloud contaminated pixels.

6. Summary and Conclusions

The key characteristics evaluated for the new global 1-km AVHRR data in North America are:

The MVC procedure selects off-nadir pixels in the North America data set, despite the absence of atmospheric correction. This is consistent with work previously reported (Gutman 1991, Moody and Strahler 1994).
The distribution of scan angle is such that in fall and winter, MVC has a strong bias towards post-nadir pixels especially in the high latitudes. This bias is reduced in the lower latitudes in North America. In spring and summer, scan angles are distributed more evenly between pre- and post-nadir directions than in fall and winter.
The global Goode projection used in the data set has data compression and duplication effects, which cause changes in the spatial resolution. The effects vary with locations and can be measured using the simple checkerboard approach.
The 1.1-km nominal pixel size (at nadir) is modified by variables such as the high scan angle and image area distortion described. An effective minimum mapping area can be determined for a given geographic area based on these variables. Calculation of the effective minimum mapping area has implications for land cover mapping applications using the data set.
Techniques can be adopted by users for handling missing data due to high SZA in the North America data. For deriving land cover information, a best estimate for winter missing data is adequate considering the area is relatively homogeneous in ground cover with low vegetation but lasting snow and ice. However, for deriving surface biophysical parameters from NDVI (for example, albedo and leaf area index), temporal or spatial interpolation, or both methods are preferable especially when additional information on surface condition is available (for example, Landsat data or ground measurements). Caution must be taken, however, to minimize potential distortion of the original data due to limitations of the interpolation algorithms.
The amount of clouds in the data set is relatively high in warm seasons and low in the winter. Comparison of 10- versus 30-day composites suggests that for seasonal land cover characterization over all of North America, recompositing over a longer period (for example, monthly) is preferable. The trade-off between extending the NDVI compositing period and lowering temporal resolution, as well as potential propagation of spatial registration errors, needs to be assessed based on research objectives and geographic area. For regional or local applications, it is recommended that users examine each 10-day composite for their specific study area, as the quality of 10-day composites varies with geographic location and time.

Acknowledgments

The authors would like to thank the following individuals at EDC:

Jeffrey C. Eidenshink for the global 1-km AVHRR data processing, David J. Meyer for cloud screening, Daniel R. Steinwand for assessing projection distortion, and Charles E. Wivell for the satellite-earth geometry. Zhi-Liang Zhu's work was under USGS contract 1434-92-C-40004. Limin Yang's work in this research was supported through U.S. National Aeronautics and Space Administration Grant (NAGW-3940).

References

BAGLIO, J.V., and HOLROYD, E.W. III, 1989, Methods for operational snow cover area mapping using the Advanced Very High Resolution Radiometer: San Juan Mountains test study. USGS Research Technical Report, USGS/EROS Data Center, Sioux Falls, SD.
CIHLAR, J., and HUANG, F., 1994, Effects of atmospheric correction and viewing angle restriction on AVHRR data composites. Canadian Journal of Remote Sensing, 20, 132-137.
DEERING, D.W., and ECK, T.F., 1987, Atmospheric optical depth effects on angular anisotropy of plant canopy reflectance. International Journal of Remote Sensing, 8, 893-916.
EIDENSHINK, J.C., and FAUNDEEN, J.L., 1994, The 1-km AVHRR global land data set: first stages in implementation. International Journal of Remote Sensing, 15, 3443-3462.
GOWARD, S.N., MARKHAM, B., DYE, D.G., DULANEY, W., and YANG, J., 1991, Normalized difference vegetation index measurements from the Advanced Very High Resolution Radiometer. Remote Sensing of Environment, 35, 257-277.
GUTMAN, G.G., 1991, Vegetation indices from AVHRR: an update and future prospects. Remote Sensing of Environment, 35, 121-136.
HOLBEN, B.N., and FRASER, R.S., 1984, Red and near-infrared sensor response to off-nadir viewing. International Journal of Remote Sensing, 5, 145-160.
MOODY, A., and STRAHLER, A.H., 1994, Characteristics of composited AVHRR data and problems in their classification. International Journal of Remote Sensing, 15, 3473-3491.
STEINWAND, D.R., 1994, Mapping raster imagery to the Interrupted Goode Homolosine projection. International Journal of Remote Sensing, 15, 3463-3471.
STEINWAND, D.R., HUTCHINSON, J.A., and SNYDER, J.P., [In press], Map projections for global and continental data sets and an analysis of pixel distortion caused by reprojection. Photogrammetric Engineering & Remote Sensing.
STOWE, L.L., McCLAIN, E.P., CAREY, R., GUTMAN, G.G., DAVIS, P., LONG, C., and HART, S., 1991, Global distribution of cloud cover derived from NOAA/AVHRR operational satellite data. Advances in Space Research, 11, 351-354.
TOWNSHEND, J.R.G. (editor), 1992, The global 1 km AVHRR data set: further recommendations. IGBP-DIS Working Paper No. 3. The International Geosphere Biosphere Programme Data and Information System, University of Maryland, College Park, MD, U.S.A.

Figure 1. Map of North America showing region 1, 90°N-40°44'N, and region 3, 40°44'N-the Equator, of Interrupted Goode Homolosine projection. See Steinwand (1994) for a complete description of the Goode projection.

Figure 2. Distribution of scan angles summarized for North America (NA), Goode region 1 (S01), and Goode region 3 (S03). Data are averaged for all 36 composite periods. Scan angle is limited to 48 degrees by EDC for the data set.

Figure 3. Mean scan angle summarized for the North America data: (a) Weighted mean using pre-nadir as negative value and post-nadir as positive value; (b) Mean and standard deviation (STD) based on absolute scan angle. Values range from 23 to 28 degrees, and from 13 to 14 degrees, respectively, for the 36 composites.

Figure 4. Scan angle distributions showing effect of extending composite period from 10 days to one month. Negative angle values represent pre-nadir, and positive angle values represent post-nadir positions.

Figure 5. Estimated cloud distribution in the North America data set using AVHRR spectral channels and illumination geometry, April 21-30, 1992.

Table 1. Image area distortions (changes in number of pixels) due to the use of Interrupted Goode Homolosine projection and resampling method.

     Sample center       Original data     Pixel duplication
 latitude   longitude    area used (%)     average   maximum

  20.85N     88.08W          100            1.93        3
  24.45N    101.53W           99            1.16        2
  31.01N     98.99W           81            1.11        2
  39.74N     91.20W          100            2.10        3
  40.01N    112.98W          100            2.18        3
  41.76N     81.69W          100            2.24        4
  41.88N     61.13W          100            1.65        2
  42.62N     63.27W          100            2.05        4
  44.85N     84.35W          100            1.64        3
  45.74N    163.35W          100            2.01        3
  48.13N    140.79W          100            2.09        3
  52.93N     97.31W           88            1.00        1
  59.13N     82.00W           88            1.00        1
  59.86N    159.54W           89            1.00        1
  59.96N     61.48W           89            1.00        1
  60.71N    179.06W          100            2.12        3
  63.78N    125.76W          100            1.96        3
  64.73N    108.24W          100            2.02        3
  68.97N     86.81W           88            2.60        3
  71.15N    149.83E          100            2.17        4
  71.32N     69.69W           85            2.48        3
  74.68N    129.55W          100            1.60        2
  82.44N    157.88W           71            2.68        4
  84.80N     85.16W           69            2.17        4

Table 2. Amount of missing data due to solar zenith angle greater than 80 degrees for 12 NDVI monthly composites.

Month  Year    Number of pixels   Percent

Apr    1992       1,333,346         5.2
May    1992       1,212,581         4.7
Jun    1992         947,088         3.7
Jul    1992         254,128         1.0
Aug    1992       1,447,545         5.7
Sep    1992         985,699         3.8
Oct    1992       2,448,994         9.6
Nov    1992       7,616,120        30.0
Dec    1992      10,732,749        42.3
Jan    1993       8,027,949        31.7
Feb    1993       3,949,743        15.5
Mar    1993       1,952,302         7.7

Table 3. Amount of cloud pixels estimated for the thirty-six 10-day composites, April 1992 to March 1993. The estimates are based on a 1-percent sample of the original composite data.

  Composite Period   Percent   Composite Period    Percent

  Apr    01-10 1992     6.7     Oct    01-10 1992    6.4
  Apr    11-20 1992     5.3     Oct    11-20 1992    3.2
  Apr    21-30 1992     6.6     Oct    21-30 1992    4.0
  May    01-10 1992     6.3     Nov    01-10 1992    2.9
  May    11-20 1992     3.7     Nov    11-20 1992    2.4
  May    21-30 1992     7.5     Nov    21-30 1992    1.6
  Jun    01-10 1992     4.2     Dec    01-10 1992    1.9
  Jun    11-20 1992     4.6     Dec    11-20 1992    3.2
  Jun    21-30 1992     3.0     Dec    21-30 1992    3.1
  Jul    01-10 1992     4.7     Jan    01-10 1993    3.6
  Jul    11-20 1992     3.5     Jan    11-20 1993    3.1
  Jul    21-30 1992     3.9     Jan    21-30 1993    2.4
  Aug    01-10 1992     3.7     Feb    01-10 1993    2.3
  Aug    11-20 1992     6.2     Feb    11-20 1993    4.4
  Aug    21-30 1992     6.0     Feb    21-28 1993    3.7
  Sep    01-10 1992     3.0     Mar    01-10 1993    4.5
  Sep    11-20 1992     3.7     Mar    11-20 1993    2.7
  Sep    21-30 1992     5.7     Mar    21-30 1993    4.1

URL: http://edcdaac.usgs.gov/1KM/zhu.html
Maintainer: edc@eos.nasa.gov
Last Update: Thursday, July 12, 2001.