Characteristics of the North America 1-km AVHRR Data Set

ZHI-LIANG ZHU, Hughes STX Corporation, U.S. Geological Survey, EROS Data Center, Sioux Falls, South Dakota 57198, U.S.A.

LIMIN YANG, Center for Advanced Land Management Information Technologies, Conservation and Survey Division, University of Nebraska-Lincoln, Lincoln, Nebraska 68588-0517, U.S.A.

Abstract. The North America portion of a new global 1-km AVHRR time-series data set was produced recently by the U.S. Geological Survey, EROS Data Center. Characteristics of the data set were evaluated for scan-angle distribution, image area distortion as the result of map projection, distribution of high solar zenith angle, and cloud presence in image composites produced using maximum values of normalized difference vegetation index (NDVI). The evaluation showed that the compositing procedure exhibits a bias favoring off-nadir pixels, particularly at post-nadir (forward scanning) positions in the winter months. Results for scan angle distribution and image area distortion provide a basis for calculating the data's effective minimum mapping area for various geographic locations. The amount of missing data due to large solar zenith angle effect varies from 42 percent in January to 1 percent in July. Cloud contaminated pixels estimated for the thirty-six 10-day composites range from 7.5 percent in May to 1.6 percent in November. Recompositing the North America data set from 10-day cycles to monthly cycles can effectively reduce the amount of cloudy pixels in the data.

1. Introduction

A new Advanced Very High Resolution Radiometer (AVHRR) data set covering the land areas of the Earth is being produced by the Earth Resources Observation Systems (EROS) Data Center (EDC) of the U.S. Geological Survey (USGS) in Sioux Falls, South Dakota.
Production of this data set is based on specifications developed through the International Geosphere Biosphere Programme (IGBP) (Townshend 1992) with the USGS, the National Aeronautics and Space Administration, the National Oceanic and Atmospheric Administration, and the European Space Agency as the primary sponsors. A network of 30 AVHRR ground receiving stations provides the daily AVHRR data that make the project viable (Eidenshink and Faundeen 1994). Because data sets of this magnitude for Earth science are unprecedented, an evaluation of the initial data set was undertaken to document and understand its salient qualities. The evaluation was based on the North America subset of the new global data. It focused on (1) determining the distribution of the sensor's scan angle in the composite data, (2) evaluating image area distortions due to the use of the Goode projection and the resampling technique, (3) documenting the high solar zenith angle effect, and (4) assessing the presence of residual clouds in the data.

The processing flow for the new global data set, as recommended by IGBP, is described in Eidenshink and Faundeen (1994). The steps in the process are (1) orbital stitching of daily paths, (2) radiometric calibration of the five original AVHRR channels, (3) geometric registration to the Interrupted Goode Homolosine map projection (Steinwand 1994), (4) calculation of the normalized difference vegetation index (NDVI), and (5) multitemporal (10-day) maximum value compositing (MVC) based on NDVI. Next, atmospheric correction for ozone and Rayleigh scattering is performed for AVHRR channels 1 and 2 of the resultant NDVI composite; in the last step, the NDVI band is recalculated using the two corrected AVHRR channels. The resulting composites represent a 10-day (3 per month) time series and contain the following 10 bands: AVHRR channels 1-5, NDVI, satellite zenith, solar zenith, relative azimuth, and date index.
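Steps (4) and (5) of the processing flow can be illustrated with a minimal sketch. It is not the EDC production code: the function names, the array layout (days, rows, columns), and the dictionary of band stacks are assumptions made for illustration. NDVI is taken in its standard form, (channel 2 - channel 1) / (channel 2 + channel 1), and the composite keeps, for each pixel, every band from the day on which NDVI was highest.

```python
import numpy as np

def ndvi(ch1, ch2):
    """Standard NDVI from AVHRR channel 1 (visible) and channel 2 (near-infrared).

    Returns NaN where the denominator is zero.
    """
    ch1 = np.asarray(ch1, dtype=float)
    ch2 = np.asarray(ch2, dtype=float)
    denom = ch2 + ch1
    out = np.full(denom.shape, np.nan)
    np.divide(ch2 - ch1, denom, out=out, where=denom != 0)
    return out

def max_value_composite(ndvi_stack, band_stacks):
    """Maximum value compositing over the first (time) axis.

    ndvi_stack  : array of shape (days, rows, cols)
    band_stacks : dict of name -> array of the same shape (hypothetical layout)

    For each pixel, the day with the highest NDVI is selected and that day's
    value is carried over for every band. Pixels with no valid NDVI on any
    day fall back to day 0.
    """
    ndvi_stack = np.asarray(ndvi_stack, dtype=float)
    # Treat missing NDVI as the lowest possible value so it is never selected.
    filled = np.where(np.isnan(ndvi_stack), -np.inf, ndvi_stack)
    date_index = np.argmax(filled, axis=0)

    def pick(stack):
        stack = np.asarray(stack)
        return np.take_along_axis(stack, date_index[None, ...], axis=0)[0]

    composite = {name: pick(stack) for name, stack in band_stacks.items()}
    composite["ndvi"] = pick(ndvi_stack)
    composite["date_index"] = date_index
    return composite
```

The same function could, under the same assumptions, be applied to three 10-day composites to form a monthly composite, since selecting the maximum of the maxima is equivalent to compositing the full period.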
The North American data consist of two Goode projection regions (figure 1) and thirty-six 10-day composites from April 1-10, 1992 (first period) to March 21-31, 1993 (last period).

2. Scan angle distribution

Previous work has shown that NDVI, and subsequently the MVC process, are a function of AVHRR scan angle (Holben and Fraser 1984, Deering and Eck 1987, Gutman 1991, Moody and Strahler 1994), and that using atmospherically corrected data in MVC may shift MVC selection toward high off-nadir positions (Cihlar and Huang 1994). In this study, MVC was computed prior to correcting for atmospheric effects (Eidenshink and Faundeen 1994). The per-pixel scan angle values were derived from the satellite zenith band in or near the solar principal plane (95 percent of the pixels are within ±40° relative azimuth angle). Angle values range from 0 to 48 degrees for both pre-nadir (backscatter) and post-nadir (forescatter) scanning directions.

Results show that, on average, there is an MVC bias toward the post-nadir direction (figure 2). This bias is strongest for pixels in the northern half of the North America data, particularly at the extreme 41 to 48 degrees post-nadir. Most pixels are distributed in high off-nadir positions; chances of selecting near-nadir pixels through MVC are low in the North American data set.

Most of the bias may be attributed to preferential selection of post-nadir pixels in the winter months, as indicated in figure 3(a), where a temporal variation of the mean scan angle is evident. In this figure, the weighted mean scan angle was calculated with pre-nadir scan angles as negative and post-nadir scan angles as positive. In general, for the months between late September 1992 and February 1993, MVC strongly favored pixels from the post-nadir direction over pixels from the pre-nadir direction. Between April and the middle of September 1992, and in March 1993, the average scan angle fluctuated around nadir.
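The sign convention used for figure 3(a) can be made concrete with a short sketch. It assumes, purely for illustration, that the scan angles of a composite have been summarized as a histogram of pixel counts per signed angle bin; the function name and input layout are hypothetical and do not come from the original processing. The same routine also returns the mean and standard deviation of the absolute angles, which ignore scan direction.

```python
import numpy as np

def scan_angle_stats(signed_angles_deg, pixel_counts):
    """Pixel-count-weighted scan angle statistics for one composite.

    signed_angles_deg : bin centres, pre-nadir negative, post-nadir positive
    pixel_counts      : number of pixels in each bin (hypothetical histogram)

    Returns (signed_mean, abs_mean, abs_std) in degrees.
    """
    angles = np.asarray(signed_angles_deg, dtype=float)
    counts = np.asarray(pixel_counts, dtype=float)

    # Signed mean: a positive value indicates a bias toward post-nadir pixels.
    signed_mean = np.average(angles, weights=counts)

    # Absolute statistics: how far off nadir the pixels are, in either direction.
    abs_angles = np.abs(angles)
    abs_mean = np.average(abs_angles, weights=counts)
    abs_std = np.sqrt(np.average((abs_angles - abs_mean) ** 2, weights=counts))
    return signed_mean, abs_mean, abs_std
```

A signed mean near zero together with a large absolute mean would indicate that pixels are far off nadir but evenly split between the two scanning directions.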
Figure 3(b) shows the mean and standard deviation (STD) calculated from absolute scan angle values (pre- and post-nadir combined). Given that AVHRR pixel size is a function of scan angle (Goward et al. 1991), the majority of the North America AVHRR pixels in the composites have ground dimensions greater than the 1.1-km nominal scale. Mean pixel area, for example, can be calculated to range from 1.59 to 1.85 km² based on the statistics given in figure 3(b). Actual ground dimensions vary with pixel location and can be similarly computed using the sensor-target-earth geometry.

The observed NDVI bias toward off-nadir positions may be mitigated, to a certain extent, by extending the compositing period from 10 days to 1 month. Figure 4 shows examples of recompositing three 10-day composites into one monthly composite for 2 months (August and October 1992). Qualitatively, recompositing shifts the scan angle distribution toward near-nadir positions (0 ± 10 degrees). This potential improvement in image quality provides a rationale for using an extended compositing period (up to 1 month) in land characterization applications for areas where good image data are difficult to obtain within the shorter 10-day compositing period.

3. Image area distortions due to map projection

Geometric registration transforms satellite images into a precise map projection. The process of geometric registration for the global 1-km AVHRR data set involves application of a satellite model to derive a systematic correction and a ground control point matching technique to update the model (Eidenshink and Faundeen 1994). Based on the updated model, raw AVHRR data are mapped into the Goode projection using the nearest neighbor resampling method. The samples are listed (table 1) in ascending order of latitude from 20.85°N to 84.80°N. All samples except one are located in the western hemisphere.
If all 100 pixel values (ranging from 1 to 100) from an original square are present in the new block, 100 percent of the original data area is transferred and no loss of information occurs. In table 1, samples taken from low latitudes show no reduction in the original data area except for two samples, which are located near the central meridian (100°W). In the middle and high latitudes, area compression becomes more noticeable. At the highest latitude sample point, only 69 percent of the original data area is preserved. Thirty-one percent of the original information is lost at this point, resulting in a change in the spatial resolution.

As a result of assigning pixels from the original orbital projection to the new map projection, some pixels are used more than once. Pixel duplications in a sample represent a local area enlargement; pixels are duplicated to represent the same ground feature. This does not add to the information content of the data set, and the effect can also make measurements of scale in the data misleading. As indicated in table 1, on average, most of the sample windows' pixels are mapped more than once, and a single pixel may be duplicated up to four times. Note that a portion of this error may be related to the use of nearest neighbor resampling. The error may be reduced if cubic convolution or other interpolation resampling techniques are used.

4. Solar zenith angle effects

Missing data exist in the North America data as a result of masking for large solar zenith angle (SZA), which occurs at high latitudes during the winter season. Under such conditions (low solar illumination and long atmospheric path length), data recorded by AVHRR can be erroneous in representing land cover features and are difficult to calibrate. Because of this limitation, all pixels that have SZA greater than 80 degrees are flagged in MVC processing; no NDVI is computed for these pixels.
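The masking rule can be sketched as follows. Only the 80-degree cut-off comes from the text; the function name, the use of NaN as the missing-data flag, and the array inputs are assumptions for illustration and may differ from how the production data set encodes flagged pixels.

```python
import numpy as np

SZA_CUTOFF_DEG = 80.0  # cut-off stated in the text

def mask_high_sza(ndvi, sza_deg, cutoff=SZA_CUTOFF_DEG):
    """Flag pixels whose solar zenith angle exceeds the cut-off.

    Returns the NDVI array with flagged pixels set to NaN (an assumed
    missing-data marker) and the percentage of pixels that were flagged.
    """
    ndvi = np.asarray(ndvi, dtype=float)
    flagged = np.asarray(sza_deg, dtype=float) > cutoff
    masked = np.where(flagged, np.nan, ndvi)
    percent_missing = 100.0 * flagged.mean()
    return masked, percent_missing
```

Applied to each composite in turn, the second return value gives the kind of per-period missing-data percentage that the section goes on to discuss.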
For North America, the areal extent of missing data due to the SZA cut-off is a function of the time of year: large in winter months and small in summer months. Table 2 lists statistics of missing data for all 12 monthly composites from April 1992 to March 1993. Note that the percentage of missing data increases from 1 percent of total pixels in July to 42 percent in January. The missing data can present problems for deriving land characteristics in high latitudes using the multitemporal images.

5. Assessing cloud extent in the composites

The MVC technique used to generate this data set minimizes cloud effects by selecting the "clearest" pixels over the composite period. However, when the period of cloudiness is equal to or longer than the compositing period, cloud pixels cannot be removed by the compositing technique. In addition, subpixel-sized clouds are not uncommon in AVHRR data. The presence of cloudy and mixed cloud-land pixels contaminates the data set (Goward et al. 1991, Moody and Strahler 1994).

The assessment of residual clouds was based on a 1-percent systematic sample of the original North America data set; the method was adopted from several previous studies that assessed global and continental distribution of cloud cover using AVHRR data (see Baglio and Holroyd 1989 and Stowe et al. 1991 for more details). The method uses AVHRR channels 1 through 5, along with solar illumination and viewing geometry, to separate clouds from other land features. In this study, three cloud screening tests were applied to each of the 10-day composites. The first test was based on the fact that most land surfaces appear much darker than clouds in AVHRR channel 1. The second test took into account the emissivity difference between channel 4 and channel 5 for water and ice, and used the channel 4 minus channel 5 radiative temperature difference to detect thin cirrus clouds and clouds in polar latitudes (Stowe et al. 1991).
The third test used the temperature difference between channel 3 and channel 4 to separate cloud from cold snow-covered land (Baglio and Holroyd 1989). Thresholds used for the three tests are similar to those suggested by Stowe et al. (1991) and Baglio and Holroyd (1989) (channel 1 reflectance > 0.4, channel 4 minus channel 5 temperature dependent, and channel 3 minus channel 4 > 30 K). The threshold for the third test (channel 3 minus channel 4) was set relatively high to minimize inclusion of snow-covered land at high latitudes. All three tests were applied to areas north of 50°N, and to areas south of 50°N where snow is present in the winter months; pixels with values exceeding all three thresholds were labeled as cloud. For areas south of 50°N in the summer, cloud pixels were determined based on the agreement of the first and second tests.

Table 3 summarizes estimated cloud residuals for the thirty-six 10-day composites. The amount of cloud pixels varies from less than 2 percent in November to almost 8 percent in May. The 10-day composites of spring (April and May), midsummer (August), and fall (late September to early October) exhibit a higher percentage of cloud cover than those of winter (November through January). Because no attempt was made in this study to identify subpixel-sized clouds, the percentages are regarded as conservative estimates. Geographically, the most cloud-prone regions are found in the tropical latitudes and in Labrador east of Hudson Bay, Canada. Figure 5 illustrates, as an example, the spatial distribution of clouds identified from the April 21-30 composite and the corresponding AVHRR channels 1 and 4 and NDVI.

A further comparison between 10- and 30-day composites of August and October was made to assess the potential reduction of clouds through recompositing. The amount of clouds ranges from 3.7 to 6.2 percent for the three August 10-day composites and from 3.2 to 6.4 percent for the three October composites (table 3).
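The screening logic behind these percentages can be sketched as below. This is an illustrative reading of the three tests, not the code used in the study. The channel 1 threshold (0.4) and the channel 3 minus channel 4 threshold (30 K) are taken from the text; the channel 4 minus channel 5 threshold is described only as temperature dependent, so it is passed in as a hypothetical callable `t45_threshold`, and the `snow_season` flag is likewise an assumed input marking pixels south of 50°N where snow is present in winter.

```python
import numpy as np

def screen_clouds(ch1_refl, t3, t4, t5, lat_deg, snow_season, t45_threshold):
    """Label cloud pixels using the three tests described in the text.

    ch1_refl      : channel 1 reflectance (0 to 1)
    t3, t4, t5    : channel 3, 4, 5 temperatures in kelvin
    lat_deg       : pixel latitude in degrees north
    snow_season   : True where snow is present in winter (assumed input)
    t45_threshold : callable mapping t4 to a threshold in kelvin
                    (hypothetical; the text gives no values)
    """
    ch1_refl, t3, t4, t5 = (np.asarray(a, dtype=float)
                            for a in (ch1_refl, t3, t4, t5))

    test1 = ch1_refl > 0.4                  # clouds are brighter than most land
    test2 = (t4 - t5) > t45_threshold(t4)   # thin cirrus and polar clouds
    test3 = (t3 - t4) > 30.0                # separates cloud from cold snow

    # North of 50N, or where snow is present, all three tests must agree.
    use_all_three = (np.asarray(lat_deg, dtype=float) >= 50.0) | \
        np.asarray(snow_season, dtype=bool)

    # Elsewhere (snow-free areas south of 50N), tests 1 and 2 must agree.
    return np.where(use_all_three, test1 & test2 & test3, test1 & test2)

def percent_cloud(cloud_mask):
    """Percentage of sampled pixels labeled as cloud."""
    return 100.0 * np.asarray(cloud_mask, dtype=bool).mean()
```

Running `percent_cloud` on the mask from a 10-day composite and on the mask from the corresponding monthly composite would give the two figures being compared in this paragraph.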
The August and October monthly composites contain 1.3 and 3.1 percent estimated cloud pixels, respectively. This comparison indicates that recompositing over longer periods (for example, monthly) can effectively reduce cloud contaminated pixels.

6. Summary and Conclusions

The key characteristics evaluated for the new global 1-km AVHRR data in North America are:
Acknowledgments

The authors would like to thank the following individuals at EDC: Jeffrey C. Eidenshink for the global 1-km AVHRR data processing, David J. Meyer for cloud screening, Daniel R. Steinwand for assessing projection distortion, and Charles E. Wivell for the satellite-earth geometry. Zhi-Liang Zhu's work was under USGS contract 1434-92-C-40004. Limin Yang's work in this research was supported through U.S. National Aeronautics and Space Administration Grant (NAGW-3940).

References
Figure 1. Map of North America showing region 1, 90°N-40°44'N, and region 3, 40°44'N-the Equator, of the Interrupted Goode Homolosine projection. See Steinwand (1994) for a complete description of the Goode projection.

Figure 2. Distribution of scan angles summarized for North America (NA), Goode region 1 (S01), and Goode region 3 (S03). Data are averaged for all 36 composite periods. Scan angle is limited to 48 degrees by EDC for the data set.

Figure 3. Mean scan angle summarized for the North America data: (a) Weighted mean using pre-nadir as negative value and post-nadir as positive value; (b) Mean and standard deviation (STD) based on absolute scan angle. Values range from 23 to 28 degrees, and from 13 to 14 degrees, respectively, for the 36 composites.

Figure 4. Scan angle distributions showing the effect of extending the composite period from 10 days to one month. Negative angle values represent pre-nadir, and positive angle values represent post-nadir positions.

Figure 5. Estimated cloud distribution in the North America data set using AVHRR spectral channels and illumination geometry, April 21-30, 1992.

Table 1. Image area distortions (changes in number of pixels) due to the use of the Interrupted Goode Homolosine projection and resampling method.

| Sample center latitude | Sample center longitude | Original data area used (%) | Pixel duplication, average | Pixel duplication, maximum |
|---|---|---|---|---|
| 20.85N | 88.08W | 100 | 1.93 | 3 |
| 24.45N | 101.53W | 99 | 1.16 | 2 |
| 31.01N | 98.99W | 81 | 1.11 | 2 |
| 39.74N | 91.20W | 100 | 2.10 | 3 |
| 40.01N | 112.98W | 100 | 2.18 | 3 |
| 41.76N | 81.69W | 100 | 2.24 | 4 |
| 41.88N | 61.13W | 100 | 1.65 | 2 |
| 42.62N | 63.27W | 100 | 2.05 | 4 |
| 44.85N | 84.35W | 100 | 1.64 | 3 |
| 45.74N | 163.35W | 100 | 2.01 | 3 |
| 48.13N | 140.79W | 100 | 2.09 | 3 |
| 52.93N | 97.31W | 88 | 1.00 | 1 |
| 59.13N | 82.00W | 88 | 1.00 | 1 |
| 59.86N | 159.54W | 89 | 1.00 | 1 |
| 59.96N | 61.48W | 89 | 1.00 | 1 |
| 60.71N | 179.06W | 100 | 2.12 | 3 |
| 63.78N | 125.76W | 100 | 1.96 | 3 |
| 64.73N | 108.24W | 100 | 2.02 | 3 |
| 68.97N | 86.81W | 88 | 2.60 | 3 |
| 71.15N | 149.83E | 100 | 2.17 | 4 |
| 71.32N | 69.69W | 85 | 2.48 | 3 |
| 74.68N | 129.55W | 100 | 1.60 | 2 |
| 82.44N | 157.88W | 71 | 2.68 | 4 |
| 84.80N | 85.16W | 69 | 2.17 | 4 |

Table 2. Amount of missing data due to solar zenith angle greater than 80 degrees for 12 NDVI monthly composites.

| Month | Year | Number of pixels | Percent |
|---|---|---|---|
| Apr | 1992 | 1,333,346 | 5.2 |
| May | 1992 | 1,212,581 | 4.7 |
| Jun | 1992 | 947,088 | 3.7 |
| Jul | 1992 | 254,128 | 1.0 |
| Aug | 1992 | 1,447,545 | 5.7 |
| Sep | 1992 | 985,699 | 3.8 |
| Oct | 1992 | 2,448,994 | 9.6 |
| Nov | 1992 | 7,616,120 | 30.0 |
| Dec | 1992 | 10,732,749 | 42.3 |
| Jan | 1993 | 8,027,949 | 31.7 |
| Feb | 1993 | 3,949,743 | 15.5 |
| Mar | 1993 | 1,952,302 | 7.7 |

Table 3. Amount of cloud pixels estimated for the thirty-six 10-day composites, April 1992 to March 1993. The estimates are based on a 1-percent sample of the original composite data.

| Composite period | Percent | Composite period | Percent |
|---|---|---|---|
| Apr 01-10 1992 | 6.7 | Oct 01-10 1992 | 6.4 |
| Apr 11-20 1992 | 5.3 | Oct 11-20 1992 | 3.2 |
| Apr 21-30 1992 | 6.6 | Oct 21-30 1992 | 4.0 |
| May 01-10 1992 | 6.3 | Nov 01-10 1992 | 2.9 |
| May 11-20 1992 | 3.7 | Nov 11-20 1992 | 2.4 |
| May 21-30 1992 | 7.5 | Nov 21-30 1992 | 1.6 |
| Jun 01-10 1992 | 4.2 | Dec 01-10 1992 | 1.9 |
| Jun 11-20 1992 | 4.6 | Dec 11-20 1992 | 3.2 |
| Jun 21-30 1992 | 3.0 | Dec 21-30 1992 | 3.1 |
| Jul 01-10 1992 | 4.7 | Jan 01-10 1993 | 3.6 |
| Jul 11-20 1992 | 3.5 | Jan 11-20 1993 | 3.1 |
| Jul 21-30 1992 | 3.9 | Jan 21-30 1993 | 2.4 |
| Aug 01-10 1992 | 3.7 | Feb 01-10 1993 | 2.3 |
| Aug 11-20 1992 | 6.2 | Feb 11-20 1993 | 4.4 |
| Aug 21-30 1992 | 6.0 | Feb 21-28 1993 | 3.7 |
| Sep 01-10 1992 | 3.0 | Mar 01-10 1993 | 4.5 |
| Sep 11-20 1992 | 3.7 | Mar 11-20 1993 | 2.7 |
| Sep 21-30 1992 | 5.7 | Mar 21-30 1993 | 4.1 |