Development of an unbiased cloud detection algorithm for a spaceborne multispectral imager - Ishida - 2009 - Journal of Geophysical Research: Atmospheres - Wiley Online Library
[1] A new concept for cloud detection from observations by multispectral spaceborne imagers is proposed, and an algorithm comprising many pixel-by-pixel threshold tests is developed. Since in nature the thickness of clouds tends to vary continuously and the border between cloud and clear sky is thus vague, it is unrealistic to label pixels as either cloudy or clear sky. Instead, the extraction of ambiguous areas is considered to be useful and informative. We refer to the multiple threshold method employed in the MOD35 algorithm that is used for Moderate Resolution Imaging Spectroradiometer (MODIS) standard data analysis, but drastically reconstruct the structure of the algorithm to meet our aim of sustaining the neutral position. The concept of a clear confidence level, which represents certainty of the clear or cloud condition, is applied to design a neutral cloud detection algorithm that is not biased to either clear or cloudy. The use of the clear confidence level with neutral position also makes our algorithm structure very simple. Several examples of cloud detection from satellite data are tested using our algorithm and are validated by visual inspection and comparison to previous cloud mask data. The results indicate that our algorithm is capable of reasonable discrimination between cloudy and clear-sky areas over ocean with and without Sun glint, forest, and desert, and is able to extract areas with ambiguous cloudiness condition.
1. Introduction
[2] Discrimination between cloudy and clear-sky areas is an important issue for satellite remote sensing, because clouds have a large impact on radiance. For example, retrieval of aerosol properties and the amounts of Rayleigh scattering from satellite measurement requires the rigorous exclusion of cloud-covered areas, because contamination of even thin clouds in a pixel causes fatal errors in these retrievals [e.g., ; ].
Mapping of land surface condition and vegetation changes also needs to reject pixels that contain opaque clouds [e.g., ]. On the other hand, satellite observations are capable of globally estimating cloudiness [e.g., ], which is a fundamental variable for meteorology and climatology [e.g., ]. Cloud properties such as optical thickness can be retrieved from satellite remote sensing, but correct extraction of cloudy areas must first be carried out [e.g., ; ]. It is, therefore, necessary to provide methods for distinguishing between cloudy (including partly cloudy) and clear areas (referred to as cloud detection or cloud mask) in satellite observation scenes.
[3] When performing cloud detection on satellite images, it is desirable to use the data acquired by the same satellite, because the simultaneous and synchronized observation of an area is assured. Methods for cloud detection by satellite are, in general, based on the radiative properties and geometrical textures of cloud. The reflectance of clouds for the solar radiation is often larger than that of land or ocean, and is almost constant with respect to wavelength, while cloud temperature observed by satellite is often colder than that of the underlying surface. Clouds also tend to produce more spatial variability of the radiance field than surfaces, especially over ocean []. These cloud properties can be applied as a threshold test, which is a simple and versatile method for cloud detection. A value measured by satellite is compared pixel by pixel to the threshold value, fixing the boundary of clear and cloudy areas. The threshold value is usually determined from satellite and/or in situ observations, referring to other cloud identification such as visual inspection, or theoretical simulations of radiative transfer.
[4] However, two main problems make cloud detection in satellite observations difficult. One is the variety of cloud types.
Since radiative properties differ among cloud types, a threshold test that is appropriate to a certain cloud type may not be applicable to another. For example, reflectance of solar radiation is an appropriate index for finding relatively thick clouds such as cumulus but is not sensitive to thin clouds, resulting in a false identification as clear sky. The second problem is the existence of land surfaces whose radiative characteristics at measured wavelengths are similar to cloud. For example, the reflectance test is likely to categorize clear-sky areas over snow and desert, which have large reflectance, as being cloudy. Besides, cloud detection algorithms are required to run fast and in near-real time, because cloud detection is usually the first part of the analysis of satellite data.
[5] In order to avoid incorrect identification and to improve the usefulness of cloud detection, many algorithms have been proposed, depending on the purposes of the observation and specifications of the imager, such as the spectral and spatial resolution of the sensor. For multispectral imager satellite observations such as those from the Moderate Resolution Imaging Spectroradiometer (MODIS), it is usual to carry out a series of threshold tests for a pixel and comprehensively examine whether it represents a cloudy or clear area. The composition of the threshold tests must be arranged to deal with all types of cloud and surface, using many wavelengths.
has developed such an algorithm, referred to as "MOD35," for operational cloud mask data of MODIS observation. MOD35 consists of many threshold tests with static threshold values. A similar algorithm was used for the Advanced Earth Observing Satellite (ADEOS)-II Global Imager (GLI) science mission in Japan [].
has assembled many threshold tests into a sequential algorithm ("CLAVR-1") for cloud screening of Advanced Very High Resolution Radiometer (AVHRR) data.
developed the AVHRR Processing scheme Over cLouds, Land and Ocean (APOLLO), which comprises dynamic threshold tests and is applied to separate pixels into cloud-free, fully cloudy, and partially cloudy.
also presented a method that dynamically and automatically determines the thresholds for each image: it is based on the assumption that the histogram of the radiance values in an image (containing 512 by 512 pixels) must have the peak attributed to land. This method makes the thresholds more locally appropriate, but cannot deal with an image that contains no pixels of land (i.e., no clear sky).
applied cluster analysis to group pixels of the similar radiative properties, and only representative values of a cluster, such as the mean, are compared to threshold values for cloud identification. This method results in more distinct discrimination than a simple pixel-by-pixel threshold test, but a large area can be falsely identified as cloudy or clear if the threshold value is not correct.
[6] Although these improvements have enabled more accurate cloud detection for satellite data than the simple threshold test, it is still difficult to make the discrimination between cloudy and clear completely correct. In particular, the discrimination for a pixel with a value around the threshold tends to result in ambiguity. In order to eliminate the ambiguous pixels, several cloud detection methods are designed to bias the discrimination by identifying the ambiguous pixels as cloudy (referred to as "clear conservative") or clear ("cloud conservative"), depending on the purpose of remote sensing. For example, MOD35 [] is designed to be clear conservative, because it is necessary to exclude pixels that contain even a little bit of cloud. However, it can be argued that the extraction of ambiguous pixels also provides valuable information on the condition of the atmosphere and cloudiness. Actually, it is unrealistic to distinctly divide pixels into cloudy and clear sky with only a certain threshold value, because in nature the (optical) thickness of clouds sometimes continuously varies and the border between cloud and clear sky is thus vague. The definition of "cloud" should depend on the purposes of the observation. Mixed pixels, which include both clouds and surfaces, also require some methods for estimating the degree of mixing.
Therefore, it is reasonable for ambiguous pixels to remain unidentified as cloudy or clear and to provide a quantitative estimate of the level of ambiguity in the clear/cloudy condition instead of a dichotomy.
[7] In this paper, we propose a new cloud detection algorithm based on a neutral standpoint. In other words, it is neither clear-conservative nor cloud-conservative. We refer to the multiple threshold method employed in the MOD35 [] algorithm, but substantially reconstruct the structure of the algorithm to meet our aim of sustaining the neutral position. The set of channels used in our algorithm is almost the same as that of MOD35, but some threshold tests in MOD35 are removed and the reflectance ratio test for bright desert is added. Some threshold values for the individual threshold tests are slightly changed from MOD35 according to inspection. One of the remarkable characteristics of our algorithm is that it estimates the clear confidence level, which can be considered as an index of the likelihood of clear condition, by applying the threshold test with two thresholds, an upper limit and a lower limit, rather than a single value. The concept of the confidence level has been introduced by MOD35. However, the MOD35 algorithm used the confidence level only as an intermediate product to divide pixels into four levels (clear, probably clear, uncertain, and cloudy). Our idea is to use the confidence level consistently through to the final solution. To achieve this, we examined, categorized, and rearranged the several threshold tests of MOD35, so that not only the overall concept but also the flow of data analysis differ from MOD35. If the measured value of a pixel is between the lower and upper limit, the algorithm does not identify the pixel as cloudy or clear but calculates the clear confidence level. Consequently, our algorithm is not biased toward cloudy or clear sky, but is neutral.
[8] Section 2 describes our cloud detection algorithm.
Section 3 presents some examples of cloud detection implementation, applying our algorithm to MODIS observation data. In , we evaluate the results and examine the significance of the clear confidence level. A summary and our conclusions are given in .
2. Algorithm Description
2.1. Threshold Tests
[9] The cloud detection algorithm in this study comprises the calculation of clear confidence levels for every threshold test and the comprehensive integration of them. As mentioned in , every threshold test has strong and weak points. Here we briefly explain the theoretical basis of the individual threshold tests and their characteristics.
2.1.1. Single Reflectance Tests
[10] Optically thick clouds usually have a large reflectance at wavelengths in nonabsorption bands in the visible and near infrared regions. Over ocean, the reflectance in the near infrared (e.g., 0.87 μm of MODIS) is efficient, because the effect of Rayleigh scattering by air molecules is smaller than that in the visible region. Over land, however, the reflectance in the visible region (e.g., 0.66 μm) must be used, because leaves of plants have a large reflectance in the near infrared region. The reflectance test may falsely identify bright surfaces, such as deserts, snow covered areas, coral reefs, and Sun glint regions in ocean, as clouds. In order to avoid incorrect identification, we apply the "minimum albedo" map, which consists of the minimum reflectance of GLI/ADEOS-II for a month before the date of satellite data for cloud screening. The spatial resolution of the minimum albedo map is 0.125° by 0.125°. The reflectance is compared to the minimum albedo, instead of using a static threshold like MOD35. This scheme is consistent with the assumption that at least one time in a month will be clear and the minimum value for a month must represent the reflectance of the surface. Clouds and Sun glint areas can be generally excluded from the minimum albedo data.
However, the minimum albedo sometimes includes larger reflectance than surfaces due to lasting clouds.
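The minimum albedo composite described above can be sketched as a per-cell minimum over the daily reflectance grids of the preceding month. This is only an illustration under stated assumptions: the function name `minimum_albedo`, the flat-list grid layout, and the use of `None` for a missing observation are hypothetical and are not taken from the original processing.

```python
from typing import List, Optional, Sequence


def minimum_albedo(
    daily_reflectance: Sequence[Sequence[Optional[float]]],
) -> List[Optional[float]]:
    """Return the per-cell minimum reflectance over all days.

    Each inner sequence is one day's grid flattened to a list of cells.
    None marks a missing observation (an assumption of this sketch).
    A cell with no valid observation at all stays None.
    """
    n_cells = len(daily_reflectance[0])
    composite: List[Optional[float]] = []
    for cell in range(n_cells):
        valid = [day[cell] for day in daily_reflectance if day[cell] is not None]
        composite.append(min(valid) if valid else None)
    return composite


# Three days, three cells; the last cell is never observed.
days = [
    [0.30, 0.05, None],  # cell 0 cloudy on day 1
    [0.06, 0.04, None],
    [0.08, 0.50, None],  # cell 1 cloudy on day 3
]
composite = minimum_albedo(days)
```

A cell that is cloudy on every day of the month keeps a cloud-contaminated minimum, which is the limitation noted in the sentence above; a cell left as `None` corresponds to the case in which a static threshold has to be used instead.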
applied a similar procedure, i.e., dynamic and regional threshold values, to cloud screening for AVHRR data. If the minimum albedo is missing, the static threshold value is applied for the reflectance test. Additionally, the threshold value for ocean regions is changed with the cone angle (θc) in order to deal with increases in the reflectance of ocean due to Sun glint, where θc is the angle between vectors of viewpoint-to-satellite and the specular reflection defined by
$$\cos\theta_c = \sin\theta_{\mathrm{sol}}\,\sin\theta_{\mathrm{sat}}\,\cos\phi + \cos\theta_{\mathrm{sol}}\,\cos\theta_{\mathrm{sat}},$$
where θsol, θsat, and ϕ are the solar zenith angle, satellite zenith angle, and relative azimuthal angle, respectively. On the other hand, at wavelengths in the strong absorption band due to water vapor such as 1.38 μm, radiance reflected by the surface or low-level clouds cannot reach the satellite, whereas the radiance increases when high cloud exists []. Therefore, the reflectance test in an absorption band is quite effective for detecting thin cirrus at high altitude. However, this reflectance test is inadequate in highland areas, because the reduced air mass lowers absorption by air molecules and increases the intensity of radiance reflected by the surface under clear skies.
2.1.2. Reflectance Ratio Tests
[11] The ratio and the difference of reflectance between two wavelengths in the solar radiation region can be applied to detect optically thick clouds. The reflectance of cloud at a wavelength in the solar radiation region is almost independent of the wavelength if the absorption by air molecules is very small, whereas the reflectance of land and ocean usually varies with the wavelength, depending on the condition of the surface. However, certain combinations of wavelengths for the reflectance ratio test are not appropriate for surfaces whose reflectance does not vary with the wavelength. For example, the ratio of reflectances at 0.66 μm and 0.87 μm is sensitive to clouds over ocean but mistakes Sun glint regions and bright deserts for cloud.
The ratio of reflectances at 0.87 μm and 1.64 μm can be applied to discriminate clouds from bright desert surfaces, because the reflectance of desert in the near infrared increases with increasing wavelength [], whereas this combination is not sensitive to clouds over deep forest. The ratio of reflectances at 0.55 μm and 1.24 μm can be applied to detect clouds over bare and half-vegetation areas []. At the wavelengths of large water vapor absorption, the intensity of the radiance reflected by the surface is reduced by water vapor, almost all of which exists near the surface, whereas reduction of the radiance reflected by cloud is small because clouds usually exist above the layer with abundant water vapor. Therefore, the ratio of reflectance at a wavelength with large water vapor absorption to a wavelength that is almost free from such absorption is able to discriminate between clouds and Sun glint regions []. The Normalized Difference of Vegetation Index (NDVI), which uses the large reflectance of leaves in the near infrared region and their small reflectance in the visible region to estimate vegetation density [e.g., ], can be used to identify clouds over deep forests. Several reflectance ratio and difference tests judge pixels according to whether the observation value is within a given range: such a test has both an upper and a lower limit.
2.1.3. Brightness Temperature Tests
[12] A threshold test based on brightness temperature in the window region (where gaseous absorption is very small, e.g., 10.8 μm) can be applied to detect high clouds over ocean. Brightness temperatures in the gaseous absorption wavelengths (e.g., 13.9 μm due to the CO2 absorption) are also efficient for detecting high thin clouds over both ocean and land. The basis of this test is that satellite-observed radiance emitted from high clouds is not affected by gaseous emission and so the intensity is smaller than that of the clear sky.
In daytime, clouds tend to raise the brightness temperature in near infrared wavelengths (e.g., 3.7 μm) because of solar reflection, whereas they tend to reduce the thermal infrared radiation in the window region because of low temperatures at the cloud top. Therefore, brightness temperature difference between the near-infrared and the thermal infrared window region has the potential to detect geometrically thick and high clouds. In addition, snow covered areas under clear skies are identified before cloud detection using the Normalized Difference of Snow Index (NDSI) [].
2.2. Estimation of the Clear Confidence Level
[13] Figure 1 illustrates the concept of the estimation of the clear confidence level for each threshold test. The observed value is compared to the upper limit and the lower limit. If the observed value is larger (smaller) than the upper limit (lower limit), the pixel is discriminated with high confidence as clear (cloudy) and assigned the confidence level of 1 (0). Otherwise a value of between 0 and 1 is assigned to the clear confidence level by linear interpolation. This procedure is based on the assumption that a pixel with an observed value of between the upper limit and the lower limit is ambiguous, but the nearer the observed value is to the upper limit, the larger is the probability of a clear sky. The clear confidence levels of the individual threshold tests must be combined to determine the overall confidence level for each pixel. Here, we explain a method to calculate the confidence level of each group (Gn) and overall (Q), proposing our idea, categorization of threshold tests into two groups.
Figure 1. Concept of the clear confidence level with two threshold values, the upper limit and the lower limit.
2.2.1. Categorization of Each Individual Threshold Test Into Two Groups
[14] The threshold tests can be classified into two groups, according to their weak point.
The first group is efficient for finding clouds but has a possibility of incorrectly identifying clear sky areas as cloudy if the surface under the clear sky is confusing. For example, the reflectance ratio of 0.87 μm to 1.64 μm can discriminate clouds from the desert surface, as shown in Figure 2b, but the ratio for clouds is similar to that for forest beneath clear sky so it is not effective for cloud discrimination over thick vegetation. Threshold tests derived from the specific features of individual surfaces (e.g., the NDVI test) tend to be categorized into this first group. On the other hand, threshold tests of the second group are able to correctly identify pixels as clouds but have a possibility of missing some types of cloud and falsely identifying cloudy pixels as clear. For example, the brightness temperature of 13.9 μm is quite sensitive to the existence of high or geometrically thick clouds, but is not sensitive to low clouds even if they are optically thick, as illustrated in Figure 2c. Therefore, our algorithm divides the threshold tests into two groups, group 1 and group 2, and then estimates a representative value of the confidence level for each group. For example,
Table 1 lists the MODIS channels used for the cloud detection with the spatial resolution, and the threshold tests for each group and main targets are given in Table 2
(hereafter, we use the following notations: the reflectance is R, and the brightness temperature is Tb). The representative value of the clear confidence level for group 1, G1, is derived from the geometric mean as follows:
$$G_1 = 1 - \left[\prod_{k=1}^{n}\left(1 - F_k\right)\right]^{1/n}$$
where Fk is the clear confidence level of the kth threshold test and n is the number of threshold tests in the group.
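The clear confidence level Fk of a single threshold test, as described in section 2.2, is a linear interpolation between two limits. A minimal sketch follows. The function name and the parameter names `cloudy_limit` and `clear_limit` are hypothetical; they are used instead of "lower" and "upper" so that the same function covers tests in which the clear side is numerically smaller (for example, a reflectance test). The 267 K and 273 K values are the Tb(11.0 μm) ocean limits from Table 3a, with the colder value taken as the cloudy side.

```python
def clear_confidence(value: float, cloudy_limit: float, clear_limit: float) -> float:
    """Clear confidence level of one threshold test.

    Returns 0.0 at or beyond cloudy_limit, 1.0 at or beyond clear_limit,
    and a linearly interpolated value in between.
    """
    if clear_limit == cloudy_limit:
        raise ValueError("the two limits must differ")
    fraction = (value - cloudy_limit) / (clear_limit - cloudy_limit)
    return min(1.0, max(0.0, fraction))


# Brightness temperature test over ocean: colder pixels are more likely cloudy.
f_cold = clear_confidence(260.0, cloudy_limit=267.0, clear_limit=273.0)
f_mid = clear_confidence(270.0, cloudy_limit=267.0, clear_limit=273.0)
f_warm = clear_confidence(280.0, cloudy_limit=267.0, clear_limit=273.0)

# Reflectance excess over the minimum albedo: brighter pixels are more likely cloudy,
# so the clear-side limit is the numerically smaller one.
f_refl = clear_confidence(0.05, cloudy_limit=0.07, clear_limit=0.03)
```

Dividing by the signed difference of the two limits is a design choice of this sketch: it lets one function serve both orientations without a separate flag.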
The formula for G1 implies that the representative value of the first group is calculated to be "cloud conservative": if even one threshold test takes the clear confidence level of 1, then G1 = 1 (clear), whereas G1 = 0 (cloudy) only if all the Fk are 0. This determination is valid if at least one of the threshold tests in the group is able to distinguish clouds from the surface. On the other hand, the representative value for group 2, G2, is calculated by
$$G_2 = \left[\prod_{k=1}^{m} F_k\right]^{1/m}$$
where m is the number of threshold tests in group 2.
The formula for G2 implies that the representative value of the second group is considered to be "clear conservative."
Figure 2. MODIS observation scenes at 0825 UTC 18 July 2006 over North Africa. (a) RGB composite image, (b) ratio of reflectance at 0.87 μm and 1.64 μm, and (c) brightness temperature at 13.9 μm.
Table 1. MODIS Channels Used for the Experiments of Cloud Screening in This Study
Channel | Wavelength (μm) | Spatial resolution (m)
1 | 0.66 | 250
2 | 0.87 | 250
4 | 0.55 | 500
5 | 1.24 | 500
6 | 1.64 | 500
17 | 0.905 | 1000
18 | 0.936 | 1000
20 | 3.7 | 1000
21 | 3.9 | 1000
26 | 1.38 | 1000
27 | 6.7 | 1000
31 | 11.0 | 1000
35 | 13.9 | 1000
Table 2. Threshold Tests and the Targets
Group | Test | Target
1 | R(0.87 μm) | Optically thick clouds over ocean
1 | R(0.64 μm) | Optically thick clouds over land
1 | R(0.87 μm)/R(0.64 μm) | Optically thick clouds
1 | NDVI | Clouds over deep forest
1 | R(1.24 μm)/R(0.55 μm) | Clouds over bare and half vegetation
1 | R(0.87 μm)/R(1.64 μm) | Clouds over bright desert
1 | R(0.905 μm)/R(0.935 μm) | Clouds over the Sun glint areas of ocean
2 | Tb(11.0 μm) | High (geometrically thick) clouds
2 | Tb(6.7 μm) | High thin clouds (including cirrus)
2 | Tb(13.9 μm) | High thin clouds (including cirrus)
2 | Tb(11.0 μm) − Tb(3.9 μm) | Optically thick clouds
2 | R(1.38 μm) | Thin cirrus
[15] The grouping of threshold tests is also applied to MOD35, but MOD35 is designed to make the calculation of the confidence level of all the groups "clear conservative," employing
the clear-conservative formula of group 2 for the tests of group 1. Our algorithm is designed to be neutral, taking into account the characteristics and tendencies of each threshold test. This is one of the major differences between MOD35 and our algorithm.
2.2.2. Calculation of the Overall Confidence Level Q From G1 and G2
[16] Our algorithm finally obtains the overall clear confidence level (Q) from the geometric mean of the representative values for the two groups as follows:
$$Q = \sqrt{G_1\,G_2}$$
which means that if the clear confidence level of either of the two groups is 0 (i.e., cloudy), the overall clear confidence level is 0. The flow of the algorithm is briefly explained in Figure 3. Our algorithm is not a cascade or decision-tree type, that is, an algorithm applying threshold tests as decision nodes, such as CLAVR-1 []. This means that our algorithm is versatile enough to be applied to various satellites equipped with different channels, because it can be adjusted simply by adding or removing certain threshold tests according to the wavelength of the available channels. In contrast, the adaptation of a cascade algorithm may require substantial changes.
Figure 3. Flow of the cloud detection algorithm.
[17] Our cloud detection algorithm can make an unbiased discrimination between cloudy and clear areas, assigning an overall clear confidence level of between 0 and 1 to ambiguous pixels. Users of the cloud detection results are allowed to select an arbitrary value of the clear confidence level according to their purposes and targets, by identifying pixels whose confidence level is less than the value selected as cloudy. For example, if a user wants to exclude any small effects of cloud, the confidence level of 1 should be selected to result in clear conservative cloud detection.
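The combination of per-test confidence levels into G1, G2, and Q, together with the user-selected cutoff described in the paragraph above, can be sketched as follows. The sketch assumes the forms G1 = 1 − [∏(1 − Fk)]^(1/n), G2 = [∏ Fk]^(1/m), and Q = √(G1·G2), chosen because they are geometric means that reproduce the behaviour stated in the text (G1 = 1 if any test gives 1, G1 = 0 only if all tests give 0, Q = 0 if either group gives 0). All function names are hypothetical.

```python
import math
from typing import Sequence


def group1_confidence(levels: Sequence[float]) -> float:
    """Cloud-conservative combination: 1 if any test says clear with confidence 1."""
    n = len(levels)
    return 1.0 - math.prod(1.0 - f for f in levels) ** (1.0 / n)


def group2_confidence(levels: Sequence[float]) -> float:
    """Clear-conservative combination: 0 if any test says cloudy with confidence 0."""
    m = len(levels)
    return math.prod(levels) ** (1.0 / m)


def overall_confidence(g1: float, g2: float) -> float:
    """Geometric mean of the two group values; 0 if either group is 0."""
    return math.sqrt(g1 * g2)


def is_cloudy(q: float, user_level: float) -> bool:
    """A pixel counts as cloudy when Q is below the level chosen by the user."""
    return q < user_level


# One decisive group 1 test (F = 1) outweighs the others.
g1 = group1_confidence([1.0, 0.0, 0.0])
# One group 2 test reporting cloud (F = 0) forces the group to 0.
g2 = group2_confidence([1.0, 0.0])
q = overall_confidence(g1, g2)

# A user who wants to exclude any possible cloud selects the level 1.
ambiguous_rejected = is_cloudy(0.5, user_level=1.0)
```

Because each group is a plain product over its tests, adding or removing a threshold test only changes the list passed in, which matches the point made above about adapting the algorithm to sensors with different channels. Both group functions need at least one test; an empty list is not handled here.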
It should be noted that the overall confidence level of between 0 and 1 suggests a vague area that possibly contains clouds, but the clear confidence level does not directly represent either the optical thickness of cloud or cloud amount in the pixel. A detailed discussion of the interpretation of the confidence level is given in .
3. Experimental Results
[18] In this section, we present some examples of the application of our algorithm to cloud detection in MODIS data. We selected typical scenes for several types of surface: ocean, desert, half-vegetation and deep forest in daytime. We use the subsampled calibrated radiances 5 km data (SSH), which is adjusted to the spatial resolution of 1 km by averaging the data of channels that have finer resolution than 1 km (low resolution). We also show comparisons to the MOD35 cloud mask data with the spatial resolution of 1 km. The algorithm needs ancillary data: the minimum albedo for the reflectance test and the water/land flag. For the water/land flag, we referred to the United States Geological Survey (USGS) 1-km land/sea tag file, converting to 10-km by 10-km mesh data for the sake of saving computer resources. Our algorithm avoids using the land ecosystem map and thus the set of threshold tests is not prepared for vegetation and desert individually, unlike MOD35.
[19] Here we explain details of the threshold tests applied to cloud detection (listed in ). Following MOD35, the Tb(3.7 μm) − Tb(11.0 μm) test is combined with the R(0.905 μm)/R(0.935 μm) test, and the Tb(3.7 μm) − Tb(11.0 μm) and the Tb(3.7 μm) − Tb(3.9 μm) tests are combined with the R(0.55 μm)/R(1.24 μm) test, in order to avoid falsely identifying a cloudy pixel as clear. Furthermore, the R(0.905 μm)/R(0.935 μm) test is applied only when the cone angle is smaller than 36° and R(0.905 μm) is smaller than 0.08.
[20] The limits for the individual threshold tests are listed in Table 3a
for ocean and Table 3b
for land. Almost all of the upper and lower limits follow the latest version of MOD35 [], but the values for several of the tests were changed after visual inspections with the composite RGB images and comparisons to the high-resolution cloud mask results of MOD35. Since the threshold values of the R(0.87 μm)/R(0.68 μm) test in MOD35 are set to be slightly cloud conservative, we shifted them. In order to avoid missing cloudy areas, the individual threshold tests in group 1 are desired to be clear conservative, because the representative value of group 1 is estimated to be cloud conservative as mentioned in . Estimations by a plane-parallel radiative transfer model suggest that for the tests using radiances at wavelengths in the solar region, the upper and lower limit roughly correspond to the water cloud optical thickness of about 0.5 and 2, respectively, although the radiances also depend on other conditions, for example, the cone angle and surface albedo.
Table 3a. Values of the Upper and Lower Limits of Each Threshold Test for Ocean
Group | Test | Limits
1 | R(0.87 μm) | minimum albedo + 0.07, minimum albedo + 0.03
1 | R(0.87 μm)/R(0.66 μm) smaller end | 0.9, 0.74
1 | R(0.87 μm)/R(0.66 μm) larger end | 1.15, 1.25
1 | NDVI smaller end | −0.14, −0.18
1 | NDVI larger end | 0.3, 0.4
1 | R(0.905 μm)/R(0.935 μm) | 2.9, 3.0
2 | Tb(11.0 μm) | 267 K, 273 K
2 | Tb(13.9 μm) | 224 K, 228 K
2 | Tb(6.7 μm) | 215 K, 225 K
2 | R(1.38 μm) | 0.03, 0.04
Table 3b. Values of the Upper and Lower Limits of Each Threshold Test for Land
Group | Test | Limits
1 | R(0.66 μm) | minimum albedo + 0.095, minimum albedo + 0.015
1 | R(0.87 μm)/R(0.66 μm) smaller end | 0.9, 0.74
1 | R(0.87 μm)/R(0.66 μm) larger end | 1.4, 2.0
1 | NDVI smaller end | −0.14, −0.18
1 | NDVI larger end | 0.24, 0.4
1 | R(0.87 μm)/R(1.64 μm) | 0.82, 0.94
1 | R(0.55 μm)/R(1.24 μm) | 1.82, 1.98
2 | Tb(11.0 μm) − Tb(3.9 μm) smaller end | −20 K, −16 K
2 | Tb(11.0 μm) − Tb(3.9 μm) larger end | 2 K, −2 K
2 | Tb(11.0 μm) | -, 297.5 K
2 | Tb(13.9 μm) | 224 K, 228 K
2 | Tb(6.7 μm) | 215 K, 225 K
[21] The Tb(11.0 μm) − Tb(3.9 μm) test is not used for ocean areas, because this test tends to misinterpret clear-sky pixels over a bright surface, especially the Sun glint regions, as cloudy.
This difference test, on the other hand, is included in group 2 when the surface is land because of its high sensitivity to clouds, especially over bright deserts. The upper and lower limits of the Tb(11.0 μm) − Tb(3.9 μm) test for land follow those of MOD35 for desert surfaces. The R(1.38 μm) reflectance test is not used for land for the reason mentioned in .
[22] An example of the histogram of the number of pixels against R(0.87 μm)/R(1.64 μm) for a bright desert area is illustrated in Figure 4. Many MODIS scenes over desert show that the peak of the histogram of R(0.87 μm)/R(1.64 μm) exists around 0.8, which is considered to be the typical value of R(0.87 μm)/R(1.64 μm) for desert. We set the upper limit and the lower limit for the R(0.87 μm)/R(1.64 μm) test referring to this result.
Figure 4. Histogram of the number of pixels of the reflectance ratio R(0.87 μm)/R(1.64 μm) in the MODIS observation scene at 0745 UTC 19 April 2006 over the Arabian desert. Only the pixels over land surface were extracted from the scene.
3.1. Ocean
[23] Figures 5 and 6
show the RGB composite image, and the clear confidence level results of cloud detection over ocean without and with a Sun glint region, respectively. The comparison to the MOD35 cloud mask data is also shown in
and . The MOD35 cloud mask results are expressed by four categories, cloudy, uncertain, probably clear, and clear, to which the values 0, 1, 2, and 3, respectively, are assigned (hereafter referred to as Cmod35). Therefore, the difference between the result of our algorithm and MOD35 can be defined by dcld = Q − Cmod35/3. dcld = 1 means that our algorithm is clear and MOD35 is cloudy, whereas dcld = −1 means that our algorithm is cloudy and MOD35 is clear. If dcld = 0, the result is coincident. For the scene without Sun glint, 99.7% of cloudy pixels (Q = 0) by our algorithm (the total number of those is 64835) are identified as "cloudy" by MOD35, whereas only 7.4% of clear pixels (Q = 1) by our algorithm (the total number of those is 16937) are "confident clear" in MOD35. For the scene with Sun glint, 85.8% of pixels with Q = 0 (the total number of those is 24205) are "cloudy" in MOD35, and 36.5% of pixels with Q = 1 (the total number of those is 67458) are "confident clear." Visual inspection with the RGB image reveals that the results of cloud detection by our algorithm are generally appropriate. Thin clouds as well as thick clouds have a confidence level of 0 and ocean surface is labeled as 1. In particular, for the clear sky over Sun glint regions, which is often likely to be falsely identified as cloudy because of large reflectance with wavelength independence, both clouds and clear sky are correctly identified (in MOD35, many pixels in Sun glint regions are categorized in "Probably clear"). The accurate cloud detection for the Sun glint region is mainly attributed to the R(0.905 μm)/R(0.935 μm) test. However, the relatively dark Sun glint region is sometimes mistaken for cloud, owing to the small value of R(0.905 μm)/R(0.935 μm).
Figure 5. Result of cloud detection for the MODIS observation scene at 0510 UTC 18 July 2006 over the Indian Ocean.
(a) RGB composite image, (b) the clear confidence level (Q), and (c) difference from the MOD35 result (dcld = Q − Cmod35/3).
Figure 6. Result of cloud detection for the MODIS observation scene at 2135 UTC 18 July 2006 over the Pacific Ocean with Sun glint. (a) RGB composite image, (b) the clear confidence level (Q), and (c) difference from the MOD35 result (dcld = Q − Cmod35/3).
3.2. Desert
[24] The cloud detection result for a scene over desert is shown in Figure 7. Comparison with the RGB image suggests that our algorithm can accurately distinguish between clear-sky and cloudy areas over the bright and white desert surface. 89.2% of pixels with Q = 0 (the total number of those is 16469) are identified as "cloudy" by MOD35, and 92.2% of pixels with Q = 1 (the total number of those is 67484) are labeled as "confident clear." Some pixels for which our algorithm and MOD35 differ exist around clouds. The R(0.87 μm)/R(1.64 μm) test is a good discriminator for distinguishing clear sky area from the bright surface, whereas the other tests in group 1 falsely identify bright pixels as cloudy. The Tb(11.0 μm) − Tb(3.9 μm) test is quite effective for extracting clouds.
Figure 7. Result of cloud detection for the MODIS observation scene at 0745 UTC 19 April 2006 over the Arabian desert. (a) RGB composite image, (b) the clear confidence level (Q), and (c) difference from the MOD35 result (dcld = Q − Cmod35/3).
3.3. Half Vegetation and Deep Forest
[25] Figure 8 illustrates the result of cloud detection over relatively dark land surface, such as half vegetation or bare surface. 91.9% of pixels with Q = 0 (the total number of those is 20093) are identified as "cloudy" by MOD35, and 86.8% of pixels with Q = 1 (the total number of those is 73470) are labeled as "confident clear." Over such a surface, where MOD35 sometimes discriminates as "probably clear," our algorithm can detect clear regions: the R(1.24 μm)/R(0.55 μm) test is effective for discriminating cloudy areas from clear.
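The comparison quantities used throughout this section, the difference dcld = Q − Cmod35/3 and the percentage of Q = 0 or Q = 1 pixels that fall in a given MOD35 category, can be sketched as follows. The function names and the flat-list pixel layout are hypothetical, and the five sample pixels are made-up values for illustration only, not data from the scenes discussed here.

```python
from typing import Optional, Sequence


def dcld(q: float, c_mod35: int) -> float:
    """Difference between Q and the MOD35 category scaled to the range 0 to 1.

    c_mod35 is 0 (cloudy), 1 (uncertain), 2 (probably clear), or 3 (clear).
    """
    return q - c_mod35 / 3.0


def agreement_percent(
    q_values: Sequence[float],
    mod35_values: Sequence[int],
    q_target: float,
    mod35_target: int,
) -> Optional[float]:
    """Percentage of pixels with Q == q_target whose MOD35 category is mod35_target."""
    selected = [c for q, c in zip(q_values, mod35_values) if q == q_target]
    if not selected:
        return None  # no pixel has the requested Q value
    matches = sum(1 for c in selected if c == mod35_target)
    return 100.0 * matches / len(selected)


# Illustrative pixels only.
q_pixels = [0.0, 0.0, 0.0, 1.0, 1.0]
mod35_pixels = [0, 0, 1, 3, 2]

cloudy_agreement = agreement_percent(q_pixels, mod35_pixels, q_target=0.0, mod35_target=0)
clear_agreement = agreement_percent(q_pixels, mod35_pixels, q_target=1.0, mod35_target=3)
```

With these sample values, two of the three Q = 0 pixels are cloudy in MOD35 and one of the two Q = 1 pixels is clear in MOD35.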
The value of R(0.87 μm)/R(1.64 μm) also has the same tendency as R(1.24 μm)/R(0.55 μm), but is less effective for the dark land surfaces. Cloud detection over a deep forest area also seems to result in good discrimination, as shown in Figure 9. 86.1% of pixels with Q = 0 (31672 pixels in total) are "cloudy" in MOD35, whereas 46.9% of pixels with Q = 1 (42601 pixels in total) are "confident clear." The comparison suggests that our algorithm identifies as clear some regions that MOD35 tends to label as "cloudy" or "uncertain." The R(1.24 μm)/R(0.55 μm) test and the NDVI test are most critical for distinguishing between clear sky and clouds.

Figure 8. Result of cloud detection for the MODIS observation scene at 0140 UTC 1 October 2006 over the Australian continent. (a) RGB composite image, (b) the clear confidence level (Q), and (c) difference from the MOD35 result (dcld = Q − Cmod35/3).

Figure 9. Result of cloud detection for the MODIS observation scene at 1415 UTC 21 April 2006 over the Amazon forest. (a) RGB composite image, (b) the clear confidence level (Q), and (c) difference from the MOD35 result (dcld = Q − Cmod35/3).

[26] The overall tendency is for the confidence level of group 2 to be the same as or larger than that of group 1. This is shown in
Figure 10, where many pixels are distributed in areas I, II, and III, but area IV is sparse. This means that the cloud detection of the examples in this study is almost entirely determined by group 1, i.e., the tests using solar radiation. This is because clouds with high cloud tops are generally well developed and thick, or overlap thick clouds. In contrast, low and optically thick clouds also exist. The occurrence of smaller values of the confidence level for group 2 compared with group 1 is often due to the detection of cloud by the R(1.38 μm) test, implying the existence of thin and nonoverlapped cirrus. Areas I, II, III, and IV in
Figure 10 may correspond to optically thin cloud, low and optically thick cloud, tall and optically thick cloud, and high cirrus, respectively. If an observed area contains a lot of cirrus but no overlapping thick clouds, the threshold tests included in group 2 may be more important for cloud detection.

Figure 10. Example of the relation between the clear confidence level of group 1 and that of group 2. Data are from the MODIS scene at 0510 UTC 18 July 2006 corresponding to
Figure 5 (only over ocean). The graph field is divided into four areas, labeled I, II, III, and IV.

[27] The cloud edges, which are expected to be a transition region from clear to cloudy sky, often have a clear confidence level between 0 and 1. However, the clear confidence levels of several clear-sky pixels over complicated surfaces, such as coral reef, bright desert, and dark Sun glint regions, also tend to be between 0 and 1. Regions that are suspected to contain heavy aerosols also sometimes have a confidence level between 0 and 1.

4. Discussion

4.1. General Feature of Results and Differences From MOD35

[28] The comparisons between the overall clear confidence level by our algorithm and MOD35 for the examples are summarized in Tables 4a–4e, where only pixels over ocean (Figures 5 and 6) or land (Figures 7, 8, and 9) are counted. They suggest that cloudy regions are coincident with each other for all surface conditions except forest, where the number of pixels discriminated as cloudy (Q = 0) by our algorithm and as clear by MOD35 is relatively large. On the other hand, there are many pixels that are discriminated as clear (Q = 1) by our algorithm but are not "confident clear" in MOD35, especially over ocean and forest. The discrepancy over ocean without Sun glint and over forest is mainly due to the difference in the procedure of the reflectance test. Over Sun glint regions, our algorithm discriminates bright pixels as clear, whereas MOD35 tends to discriminate them as "probably clear," because in MOD35 the R(0.905 μm)/R(0.935 μm) test is used only to change "uncertain" or "cloudy" (owing to large reflectance) to "probably clear." Over half-vegetation regions, many pixels that are labeled as "probably clear" by MOD35 are also discriminated as clear (Q = 1) by our algorithm.
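These comparisons suggest that our algorithm is neutral compared to MOD35.

Table 4a. Percentage of the Number of Pixels of Each Overall Clear Confidence Level by Our Algorithm With Each MOD35 Category Over Only Ocean, 0510 UTC 18 July 2006

| Overall clear confidence level (Q) | MOD35 cloudy | MOD35 uncertain | MOD35 probably clear | MOD35 confident clear |
|---|---|---|---|---|
| 0 | 62.54 | 0.09 | 0.06 | 0.01 |
| 0–0.25 | 2.89 | 0.03 | 0.02 | 0.01 |
| 0.25–0.5 | 6.23 | 0.16 | 0.13 | 0.02 |
| 0.5–0.75 | 6.57 | 0.42 | 0.63 | 0.02 |
| 0.75–1 | 3.10 | 0.33 | 0.34 | 0.01 |
| 1 | 8.29 | 2.24 | 4.64 | 1.20 |

Table 4b. Percentage of the Number of Pixels of Each Overall Clear Confidence Level by Our Algorithm With Each MOD35 Category Over Only Ocean, 2135 UTC 18 July 2006

| Overall clear confidence level (Q) | MOD35 cloudy | MOD35 uncertain | MOD35 probably clear | MOD35 confident clear |
|---|---|---|---|---|
| 0 | 19.00 | 0.58 | 1.97 | 0.59 |
| 0–0.25 | 1.05 | 0.10 | 0.36 | 0.13 |
| 0.25–0.5 | 2.22 | 0.30 | 1.45 | 0.38 |
| 0.5–0.75 | 2.61 | 0.43 | 3.23 | 0.93 |
| 0.75–1 | 1.08 | 0.20 | 1.15 | 0.53 |
| 1 | 11.14 | 2.98 | 25.05 | 22.53 |

Table 4c. Percentage of the Number of Pixels of Each Overall Clear Confidence Level by Our Algorithm With Each MOD35 Category for Only Land, 0745 UTC 19 April 2006

| Overall clear confidence level (Q) | MOD35 cloudy | MOD35 uncertain | MOD35 probably clear | MOD35 confident clear |
|---|---|---|---|---|
| 0 | 16.55 | 0.28 | 0.68 | 1.04 |
| 0–0.25 | 0.25 | 0.04 | 0.07 | 0.18 |
| 0.25–0.5 | 0.57 | 0.08 | 0.28 | 0.56 |
| 0.5–0.75 | 0.56 | 0.07 | 0.29 | 0.50 |
| 0.75–1 | 0.92 | 0.10 | 0.43 | 0.57 |
| 1 | 1.86 | 0.38 | 3.69 | 70.05 |

Table 4d. Percentage of the Number of Pixels of Each Overall Clear Confidence Level by Our Algorithm With Each MOD35 Category for Only Land, 0140 UTC 1 October 2006

| Overall clear confidence level (Q) | MOD35 cloudy | MOD35 uncertain | MOD35 probably clear | MOD35 confident clear |
|---|---|---|---|---|
| 0 | 18.93 | 0.35 | 0.44 | 0.87 |
| 0–0.25 | 0.33 | 0.04 | 0.08 | 0.17 |
| 0.25–0.5 | 0.61 | 0.13 | 0.19 | 0.55 |
| 0.5–0.75 | 0.38 | 0.08 | 0.22 | 0.54 |
| 0.75–1 | 0.26 | 0.05 | 0.12 | 0.35 |
| 1 | 1.17 | 0.33 | 8.45 | 65.38 |

Table 4e. Percentage of the Number of Pixels of Each Overall Clear Confidence Level by Our Algorithm With Each MOD35 Category for Only Land, 1415 UTC 21 April 2006

| Overall clear confidence level (Q) | MOD35 cloudy | MOD35 uncertain | MOD35 probably clear | MOD35 confident clear |
|---|---|---|---|---|
| 0 | 26.73 | 1.91 | 0.45 | 1.96 |
| 0–0.25 | 6.31 | 1.29 | 0.31 | 1.36 |
| 0.25–0.5 | 5.77 | 1.71 | 0.48 | 1.96 |
| 0.5–0.75 | 3.05 | 1.12 | 0.33 | 1.46 |
| 0.75–1 | 1.49 | 0.26 | 0.07 | 0.21 |
| 1 | 11.00 | 7.60 | 3.54 | 19.6 |

[29] It is worth noting the conceptual difference in the discrimination procedure between our algorithm and MOD35. MOD35 initially estimates the clear confidence level from the results of many threshold tests in order to be clear conservative, but the pixels labeled as uncertain or cloudy are finally re-examined by several clear-sky restoral tests.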
Consequently, the discrimination, especially for ocean with Sun glint regions and bright land surfaces, is determined almost entirely by the clear-sky restoral tests as if it were cloud conservative, and the previous estimate of the confidence level becomes meaningless. The MOD35 algorithm flow thus sometimes mixes up individual cloud-conservative and clear-conservative tests. In our algorithm, on the other hand, every threshold test is comparable in importance to every other one and the neutral estimation of the overall clear confidence level is assured. This is one of the merits of grouping the threshold tests. As a consequence, the algorithm structure becomes very simple and straightforward, and can be easily applied to other imagers that have more or fewer channels.

4.2. Interpretation of the Values Between 0 and 1 by Cloud Fraction

[30] As described above, confidence levels between 0 and 1 tend to appear at the edge of a thick cloud, in thin and isolated cloud, above confusing surfaces, and in heavy aerosol areas. It seems that one of the factors contributing to changes of the confidence level is the cloud fraction within a pixel. If clouds cover only a part of a pixel, the radiance observed by satellite is a mixture of cloud and surface. The cloud field condition can be retrieved from the high-resolution data (250 m by 250 m) of MOD35, 16 pixels of which correspond to one 1-km pixel. If all 16 pixels of 250 m by 250 m are cloudy (clear), the corresponding pixel of 1 km by 1 km is completely cloudy (clear); otherwise the pixel is considered to be "broken."
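The cloud field classification just described, and the per-condition ratios it leads to, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the 16 high-resolution pixels form an aligned 4 × 4 block, that the 250-m mask has already been reduced to a boolean cloudy/clear flag, and that both grids are plain nested lists. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

BLOCK = 4  # assumed: one 1-km pixel covers an aligned 4 x 4 block of 250-m pixels


def cloud_field_condition(mask_250m: Sequence[Sequence[bool]]) -> List[List[str]]:
    """Label each 1-km pixel from a 250-m boolean cloud mask (True = cloudy)."""
    n_rows, n_cols = len(mask_250m), len(mask_250m[0])
    if n_rows % BLOCK or n_cols % BLOCK:
        raise ValueError("mask dimensions must be multiples of 4")
    labels = []
    for r in range(0, n_rows, BLOCK):
        row = []
        for c in range(0, n_cols, BLOCK):
            cloudy = sum(bool(mask_250m[r + i][c + j])
                         for i in range(BLOCK) for j in range(BLOCK))
            if cloudy == BLOCK * BLOCK:
                row.append("completely cloudy")
            elif cloudy == 0:
                row.append("completely clear")
            else:
                row.append("broken")
        labels.append(row)
    return labels


def q_class(q: float) -> str:
    """Bin the overall clear confidence level into Q = 0, 0 < Q < 1, or Q = 1."""
    if q == 0.0:
        return "Q = 0"
    if q == 1.0:
        return "Q = 1"
    return "0 < Q < 1"


def ratios_by_condition(conditions: Sequence[Sequence[str]],
                        q_values: Sequence[Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """For each cloud field condition, the fraction of 1-km pixels in each Q class."""
    counts: Dict[str, Counter] = {}
    for cond_row, q_row in zip(conditions, q_values):
        for cond, q in zip(cond_row, q_row):
            counts.setdefault(cond, Counter())[q_class(q)] += 1
    return {cond: {cls: n / sum(c.values()) for cls, n in c.items()}
            for cond, c in counts.items()}
```

Normalizing within each cloud field condition, as `ratios_by_condition` does, means that the three Q classes of one condition sum to 1.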
Table 5 lists the ratio of the number of pixels that have an overall clear confidence level (Q) of 0, between 0 and 1, and 1, in each cloud field condition (completely cloudy, broken, and completely clear) for the examples. It indicates that the ratio of the number of pixels with 0 < Q < 1 in the broken cloud condition is larger than that in the completely clear and completely cloudy conditions, whereas the ratio for Q = 0 is largest in the completely cloudy condition, and the ratio for Q = 1 is largest in the completely clear condition except for the scene of ocean with Sun glint. Our algorithm may be able to extract areas with ambiguous cloudiness condition. In fact, the threshold values of the reflectance and the reflectance ratio tests are selected so that a large number of pixels in the broken condition have a confidence level between 0 and 1. However, there are many pixels considered to be in the broken condition with Q = 1, especially in the ocean scenes. In contrast, in MOD35 a large part of the pixels considered to be in the broken condition in the examples were labeled "cloudy," "uncertain," or "probably clear." This suggests that our algorithm tends to identify pixels in the broken condition as clear, whereas MOD35 tends to identify them as cloudy. It should be mentioned that the high-resolution cloud mask data are derived from the reflectance and reflectance ratio tests, which have a tendency to be clear conservative.

Table 5. Ratio of the Number of Pixels That Have the Overall Clear Confidence Level of 0, Between 0 and 1, and 1 in Each Cloud Field Condition Implied by High-Resolution Data of MOD35

| Scene | Surface | Cloud field condition | Q = 0 | 0 < Q < 1 | Q = 1 |
|---|---|---|---|---|---|
| 0510 UTC 18 July 2006 | Ocean | Completely cloudy | 0.70 | 0.21 | 0.09 |
| 0510 UTC 18 July 2006 | Ocean | Broken | 0.08 | 0.32 | 0.60 |
| 0510 UTC 18 July 2006 | Ocean | Completely clear | 0.01 | 0.15 | 0.84 |
| 2135 UTC 18 July 2006 | Ocean | Completely cloudy | 0.63 | 0.16 | 0.21 |
| 2135 UTC 18 July 2006 | Ocean | Broken | 0.17 | 0.24 | 0.60 |
| 2135 UTC 18 July 2006 | Ocean | Completely clear | 0.02 | 0.13 | 0.45 |
| 0745 UTC 19 April 2006 | Land | Completely cloudy | 0.83 | 0.1 | 0.07 |
| 0745 UTC 19 April 2006 | Land | Broken | 0.33 | 0.28 | 0.39 |
| 0745 UTC 19 April 2006 | Land | Completely clear | 0.02 | 0.03 | 0.95 |
| 0140 UTC 1 October 2006 | Land | Completely cloudy | 0.93 | 0.05 | 0.02 |
| 0140 UTC 1 October 2006 | Land | Broken | 0.45 | 0.27 | 0.28 |
| 0140 UTC 1 October 2006 | Land | Completely clear | 0.01 | 0.03 | 0.96 |
| 1415 UTC 21 April 2006 | Land | Completely cloudy | 0.81 | 0.14 | 0.05 |
| 1415 UTC 21 April 2006 | Land | Broken | 0.33 | 0.36 | 0.31 |
| 1415 UTC 21 April 2006 | Land | Completely clear | 0.11 | 0.25 | 0.64 |

[31] The confidence level may be applied to cloud fraction estimation. For example,
previous work tried to estimate the cloud fraction in such a mixed pixel (referred to as a "mixel") by assuming that the observed radiance of a pixel can be decomposed into components whose proportions are linearly correlated with the areas of surface and cloud (called the "unmixing procedure"). However, the relation of the confidence level to the cloud fraction in the pixel seems to be complicated. Several simulations of three-dimensional radiative transfer [e.g., ; ] suggested that the radiation field of cloud depends not only on the cloud fraction but also on the pattern of spatial distribution, which makes direct conversion of the confidence level into the cloud fraction more difficult. The interpretation of the value of the confidence level is not yet sufficiently understood and is left as a topic for future work.

[32] In this study, examples of cloud detection for snow or ice areas are not shown. Our algorithm first calculates the Normalized Difference Snow Index (NDSI) for each pixel and estimates whether pixels are suspected to be snow or not, as illustrated in . The cloud detection over snow areas (and in the polar regions) is carried out using a certain combination of the threshold tests. However, clouds over snow surfaces are very difficult to identify correctly, and more investigation is required.

5. Summary and Conclusions

[33] In this paper, we have proposed a new concept of cloud detection for observations by multispectral spaceborne imagers. We aimed at developing a neutral algorithm, which means that results of cloud detection are not biased to either clear or cloudy. Previous cloud detection algorithms tended to bias the discrimination according to the intended use of the cloud detection results, by identifying an ambiguous pixel as cloudy (clear conservative), such as MOD35, or as clear (cloud conservative), in order to provide a distinct division between cloudy and clear-sky areas.
Our algorithm is designed to avoid a biased discrimination and to provide an estimate of the degree of ambiguity, because we consider that the extraction of ambiguous areas is more useful and informative than a distinct division. Therefore, we use the concept of clear confidence level, which was introduced by MOD35 but was used there only as an intermediate product to assign four levels of confidence to pixels.

[34] The algorithm comprises many pixel-by-pixel threshold tests, which can be classified into two groups: tests included in group 1 tend to falsely identify clear areas as cloudy, while those in group 2 tend to falsely identify cloudy areas as clear. First, our algorithm estimates the clear confidence levels for the individual threshold tests. Then it determines a cloud-conservative representative value for group 1, and a clear-conservative value for group 2. Finally, the values of each group are integrated to derive the overall clear confidence level. Consequently, our algorithm allows users of cloud detection results to determine an arbitrary border between cloudy and clear-sky areas according to their purposes and targets. We referred to the MOD35 algorithm to select the threshold tests and used its threshold values with some slight modifications. For example, to improve the discrimination of clouds over a bright desert, we incorporated the minimum reflectance over a month into the threshold value of the simple reflectance test of solar radiation, unlike MOD35, which uses a static and globally constant value, and combined it with the R(0.87 μm)/R(1.64 μm) test.

[35] Examples of cloud detection using MODIS data with visual inspection and comparisons to MOD35 results indicate that our algorithm is capable of reasonable discrimination between cloudy and clear-sky areas over ocean with and without Sun glint, forest, half vegetation, and desert.
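The three-step structure summarized in paragraph [34] (a confidence level per test, one representative value per group, and an overall clear confidence level with a user-chosen border) can be sketched in Python as below. The combining rules here are illustrative assumptions only, since this section does not restate the formulas: a linear ramp between two thresholds stands in for the per-test confidence, the maximum stands in for the cloud-conservative value of group 1, the minimum for the clear-conservative value of group 2, and a geometric mean for the integration. All function names are hypothetical.

```python
import math
from typing import Sequence


def confidence_from_thresholds(value: float, cloudy_limit: float, clear_limit: float) -> float:
    """Assumed per-test confidence: 0 at cloudy_limit, 1 at clear_limit, linear in between."""
    if cloudy_limit == clear_limit:
        raise ValueError("the two limits must differ")
    level = (value - cloudy_limit) / (clear_limit - cloudy_limit)
    return min(1.0, max(0.0, level))


def group1_value(levels: Sequence[float]) -> float:
    """Cloud-conservative stand-in: group 1 tests tend to call clear areas cloudy,
    so the most clear-leaning test is trusted (maximum, an assumption)."""
    return max(levels)


def group2_value(levels: Sequence[float]) -> float:
    """Clear-conservative stand-in: group 2 tests tend to call cloudy areas clear,
    so the most cloud-leaning test is trusted (minimum, an assumption)."""
    return min(levels)


def overall_confidence(g1: float, g2: float) -> float:
    """Assumed integration of the two group values: geometric mean."""
    return math.sqrt(g1 * g2)


def is_clear(q: float, border: float) -> bool:
    """User-chosen border between cloudy and clear-sky areas."""
    return q >= border
```

A user who needs rigorous cloud exclusion could call `is_clear(q, 1.0)`, whereas a user who wants to keep as many pixels as possible could lower `border`; the overall level `q` itself is not changed by that choice.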
In particular, clouds over a Sun glint region in ocean or over desert surfaces, where discrimination is often difficult owing to bright and white conditions, can be identified. The effective threshold test varies according to the surface type: the R(0.905 μm)/R(0.935 μm), R(0.87 μm)/R(1.64 μm), R(0.55 μm)/R(1.24 μm), and NDVI tests are important for Sun glint regions, bright deserts, half-vegetation regions, and deep forests, respectively. In all the examples, the edges of thick clouds sometimes have a clear confidence level between 0 and 1. The comparisons to high-resolution data of MOD35 suggest that our algorithm is able to extract areas with ambiguous cloudiness condition.

[36] Our algorithm structure is not a decision-tree type but is instead a very simple and straightforward procedure compared to MOD35. This feature will be useful when applying our algorithm to other existing or future imagers that have more or fewer channels, such as the Advanced Very High Resolution Radiometer (AVHRR) aboard the NOAA satellite, and the Cloud and Aerosol Imager (CAI) aboard the Greenhouse gases Observing SATellite (GOSAT). In fact, to make the algorithm applicable to another imager, the user need only add or delete channels and adjust threshold values according to the channel specification of the imager in use. This manipulation also allows the user to assess and quantify the effectiveness of each channel for the detection of several types of cloud.

[37] The confidence level may be related to the optical thickness of thin clouds, and/or the cloud fraction in a pixel, although the quantitative interpretation of the value of the clear confidence level is difficult and will be examined in future work. Data from CloudSat and CALIPSO may help with more detailed validation and improvement of this algorithm.
Strengths and weaknesses of each threshold test can be revealed by comparisons to cloud type and phase of cloud particles, which can be retrieved from these satellites. Comparisons with the liquid water content can be used to estimate the relation between ambiguity of cloud and the clear confidence level. More accurate cloud detection, especially over snow surfaces, requires optimizing the threshold values for the tests by using radiative transfer simulation and satellite data. In fact, the reflectance of clouds and of some surfaces depends on the solar and satellite angles. The threshold tests that use reflectance may become more accurate if dynamic threshold values, which can be estimated by radiative transfer models as a function of the solar and satellite angles, are applied.

Acknowledgments

[38] This research is supported by the GOSAT Science Project of the National Institute of Environmental Studies, Tsukuba, Japan (), a Grant-in-Aid for Young Scientists (A)
provided by the Ministry of Education, Science, Sports and Culture, Japan (), and the ADEOS-II Science Project (PSPC-06, 2007) and the EarthCARE Science Project (2007) of the Japan Aerospace Exploration Agency (JAXA).