DOI: 10.3334/CDIAC/cli.ndp040

1. BACKGROUND INFORMATION

On May 23, 1972, the United States and the Union of Soviet Socialist Republics (USSR) established a bilateral initiative known as the Agreement on Protection of the Environment (Tatusko 1990). The primary goal of the initiative, which remains active despite the breakup of the USSR, is to promote cooperation between the two countries (Russia and the United States) on numerous environmental protection issues. Currently, the agreement fosters joint research in at least 11 "Working Groups" (i.e., areas of study), including:

     I. Prevention of Air Pollution
    II. Prevention of Pollution Effects on Vegetation
   III. Prevention of Pollution Associated with Agricultural Production
    IV. Enhancement of the Urban Environment
     V. Protection of Nature and the Organization of Preserves
    VI. Protection of the Marine Environment from Pollution
   VII. Biological and Genetic Effects of Environmental Pollution
  VIII. Influence of Environmental Changes on Climate
    IX. Earthquake Prediction
     X. Arctic and Subarctic Ecological Systems
    XI. Legal and Administrative Measures for Protecting Environmental Quality

Given recent interest in possible greenhouse gas-induced climate change, Working Group VIII has become particularly useful to the scientific communities of both nations. Since its inception in 1972, Working Group VIII has been the primary conduit through which numerous cooperative studies of climate have been carried out. Its focus has evolved considerably through time and currently is quite broad, ranging from climate change, atmospheric composition, and stratospheric ozone to radiation fluxes, cloud climatology, and climate modeling. Among its achievements, Working Group VIII has established the Climate Data Exchange and Management Agreement Project.
The purpose of this ongoing project is to promote the transfer of climatological information between the principal climate data centers in each country [i.e., the National Oceanic and Atmospheric Administration's National Climatic Data Center (NCDC) in Asheville, North Carolina, and the Research Institute of Hydrometeorological Information (RIHMI) in Obninsk, Russia]. A considerable amount of data has been exchanged as a result of this project. Some of the land-surface data received by NCDC to date include: - daily temperature and precipitation data collected at 223 USSR stations (1881-1989) - 3-hourly synoptic data collected at 223 USSR stations (1966-86) - 6-hourly synoptic data collected at 223 USSR stations (1936-65) - monthly temperature data collected at 243 USSR stations (1891-1988) The exchange of climatic data such as these will probably continue in the coming years. Acquisitions anticipated by NCDC in the near future include data for additional stations and updates for previously supplied stations. Considering the relative lack of climate records previously available for the USSR, data obtained via the bilateral exchange are particularly valuable to researchers outside the former Soviet Union. To expedite the dissemination of these data, this Numeric Data Package (NDP) presents one of the more useful archives that can be applied to the study of climate change and variability in the USSR: the 223-station daily temperature/precipitation data set. 2. DESCRIPTION OF THE DATA SET The data set documented in this NDP contains daily temperature and precipitation measurements collected at 223 USSR stations over the period 1881-1989. It was compiled from digital and manuscript records archived at RIHMI in Obninsk, Russia. 
This section describes:

- the meteorological, geographical, and historical variables contained in the data set;
- the methods and instruments used in collecting the meteorological observations; and
- the temporal and spatial coverage of the station network.

2.1 Variables

Daily mean, minimum, and maximum temperatures are available (to the nearest tenth of a degree Celsius) for each station. Temperature observations were taken eight times a day from 1966-89, four times a day from 1936-65, and three times a day from 1881-1935. Daily mean temperature is defined as the average of all observations for each calendar day. Daily maximum/minimum temperatures are derived from maximum/minimum thermometer measurements. To identify potentially erroneous data, two flag codes accompany each daily value.

Daily precipitation totals are also available (to the nearest tenth of a millimeter) for each station. Throughout the record, daily precipitation is defined as the total amount of precipitation recorded during a 24-h period, snowfall being converted to a liquid total by melting the snow in the gauge. From 1936 on, rain gauges were checked several times each day; the cumulative total of all observations during a calendar day was presumably used as the daily total. Wetting corrections of <=0.2 mm were applied beginning in 1966, depending upon the type and amount of precipitation. As with temperature, two data quality flags accompany each daily total.

Extensive geographical and historical information supplements each time series (this information can be found in the files named "station.inventory" and "station.history"). Geographical parameters include station name, coordinates, and elevation. Historical parameters include station relocation date(s), the distance and direction of any such move(s), and the date on which the station switched to the Tretyakov-type rain gauge.
Only 32 stations remained at their initial locations through 1989, and all stations switched to the Tretyakov-type gauge during the period 1946-60. 2.2 Recording Methods and Instrumentation Recording methods and instrumentation varied considerably over the period of record. The following describes the types of instruments used throughout the network, the apparatus employed to shelter these instruments, and the times at which observations were taken. Temperature and precipitation are addressed separately. Additional information regarding the history of the network is contained in publications and instruction manuals prepared by the Academy of Sciences of the Russian Empire (1892, 1893, 1894, 1896, 1897, 1898, 1900, 1902, 1908, 1912), The Nicholas Main Physical Observatory (1915), The Voyeikov Main Geophysical Observatory (1928, 1931, 1963), the Central Administration of the Unified Hydrometeorological Service of the USSR (1935, 1936, 1939, 1940), the Council of Ministers of the USSR (1946, 1954, 1958, 1962, 1969, 1985), and Gidrometeoizdat (1972). Temperature The types of thermometers in use at each station remained the same throughout the period of record (Table 1). Minimum temperature was consistently measured with an alcohol thermometer, whereas hourly and maximum temperatures were each collected with separate mercury thermometers. When the air temperature approached the freezing point of mercury (-38.9 C), either an alcohol thermometer, or in some cases a minimum thermometer alcohol column, was used in place of the mercury thermometer. Whether or not (much less when) the thermometers themselves were replaced at each station is not currently known. The type of shelter or screen surrounding the thermometers varied considerably before 1930. In 1912, official instructions recommended sheltering thermometers with the Stevenson-type screen (before 1912, no such guidelines existed). However, it is likely that this change was not implemented at many stations. 
From 1920-30, Stevenson screens were replaced with the current screens (name unknown) at all operating stations. In 1928, additional guidelines regarding the exact dimensions of the shelters and their mounting heights were issued (before 1928, no such specifications had been defined). Therefore, from 1930 on, most stations had their thermometers sheltered in roughly the same fashion.

Major changes in the time of observation occurred in 1936 and 1966. Prior to 1936, "hourly" measurements for computing daily mean temperature were taken at 0700, 1300, and 2100 Local Mean Time (LMT) (minimum and maximum thermometers were checked at one of these hours or at 0900 LMT, depending upon the year). Because of the lack of nighttime observations, daily mean temperature was probably overestimated by some location-dependent amount during this period. Beginning in 1936, all thermometers (hourly, minimum, and maximum) were checked at 0100, 0700, 1300, and 1900 LMT at most stations. As a result, the bias in daily mean temperature dropped to ~0.2 C. From 1966 to the present, all thermometers were checked at 3-h intervals beginning at midnight Moscow winter Legal Time (MLT) (MLT being three hours later than Greenwich Mean Time). This rendered the bias in daily mean temperature insignificant.

Table 1. Temperature recording methods and instrumentation

Year  Recording method/instrumentation implemented
----  --------------------------------------------
1881  Measurements for computing daily mean temperature taken at 0700,
      1300, and 2100 LMT; mercury thermometer used; because of lack of
      nighttime observations, daily mean temperature probably
      overstated.
1881  Daily minimum temperature thermometer checked at 0900 LMT;
      alcohol thermometer used.
1881  Daily maximum temperature thermometer checked at 0900 LMT;
      mercury thermometer used.
1881  No regulations regarding type of shelter surrounding
      thermometers.
1883  Daily minimum temperature thermometer checked at 0700 and 2100
      LMT (lower value chosen); multiple measurements taken only to
      determine approximate time of occurrence of minimum.
1891  Daily maximum temperature thermometer checked at 1300 and 2100
      LMT (higher value chosen); multiple measurements taken only to
      determine approximate time of occurrence of maximum.
1912  Official meteorological instructions recommended use of
      Stevenson screen to shelter thermometers; practice not
      implemented at all stations.
1920  Official meteorological instructions recommended use of current
      screen to shelter thermometers; practice implemented over next
      ten years.
1928  Official meteorological instructions specified exact size/height
      of screens.
1936  Measurements for computing daily mean temperature taken at 0100,
      0700, 1300, and 1900 LMT (or at 0700, 1300, 1900, and 2100 LMT);
      bias in daily mean temperature dropped to ~0.2 C; daily maximum
      and minimum thermometers may or may not have been checked each
      hour.
1966  Measurements for all temperature variables collected at 3-h
      intervals beginning at midnight MLT; bias in daily mean
      temperature eliminated.

Precipitation

The type of rain gauge used at each station changed at least once during the period of record (Table 2). In particular, the old-style gauge (type unknown) was replaced with the Tretyakov-type gauge over the period 1946-60 (see the file named "station.history" for the date of implementation at each site). Whether or not other gauge replacements occurred at each station is not currently known.

The type of shielding surrounding the rain gauges varied considerably over time. For example, in 1883, official instructions recommended that cross-shaped zinc strips be inserted into the gauge to prevent snow from drifting. Other shielding guidelines were issued at various times over the next half-century, up until the Tretyakov-type gauge was introduced.
However, whether or not (much less when) any of the shields were installed at each station is not currently known.

Changes in the time of observation occurred in 1936, 1966, and 1986. Before 1936, rainfall was measured only at 0700 LMT. From 1936-65, gauges were checked at 0700 and 1900 LMT. Beginning in 1966, the time of observation became time-zone dependent (the USSR comprising 11 time zones). In particular, from 1966-85, readings were taken at 0300, 0900, 1500, and 2100 MLT in zone 2 (i.e., Moscow); at 0300, 0600, 1500, and 1800 MLT in zones 3-5; at 0300 and 1500 MLT in zones 6-8; at midnight, 0300, 1200, and 1500 MLT in zones 9-11; and at 2100, 0300, 0900, and 1500 MLT in zone 12 (the easternmost part of the USSR). In 1986, the 0300 and 1500 MLT observations were discontinued in all but the second time zone.

Table 2. Precipitation recording methods and instrumentation

Year     Recording method/instrumentation implemented
----     --------------------------------------------
1881     Rain gauge measurements taken at 0700 LMT; snowfall converted
         to a liquid total by melting snow in gauge; type of gauge and
         shielding not standardized.
1883     Official meteorological instructions recommended that
         cross-shaped zinc strips be inserted into the gauge to prevent
         snow from drifting; change probably not implemented at all
         stations.
1887     Official meteorological instructions recommended surrounding
         the gauge with the funnel-shaped Nipher's shield; change
         probably not implemented at all stations.
1892     Official meteorological instructions recommended erecting a
         fence around the gauge; change probably not implemented at all
         stations.
1902     Official meteorological instructions recommended erecting a
         double fence around the gauge; change probably not implemented
         at all stations.
1936     Rain gauge measurements taken at 0700 and 1900 LMT; daily
         total rainfall obtained by summing all measurements for the
         calendar day.
1946-60  Old-style gauge (exact type unknown) replaced with the
         Tretyakov-type gauge (see the file named "station.history" for
         the exact date of implementation at each site).
1966     Rain gauge measurements taken at 0300, 0900, 1500, and 2100
         MLT in time zone 2; at 0300, 0600, 1500, and 1800 MLT in zones
         3-5; at 0300 and 1500 MLT in zones 6-8; at midnight, 0300,
         1200, and 1500 MLT in zones 9-11; and at 2100, 0300, 0900, and
         1500 MLT in zone 12; wetting corrections <=0.2 mm applied to
         each hourly measurement. (Because four observations per day
         were collected at stations in time zones 2-5 and 9-12, four
         corrections were counted in the daily total; therefore, total
         daily corrections are higher for stations in these areas.)
1986     Rain gauge measurements at 0300 and 1500 MLT discontinued at
         all stations except those in time zone 2.

2.3 Temporal and Spatial Coverage

The size of the observing network has increased with time. Twenty-three sites contain daily measurements dating to 1881 (though for 76 stations, maximum and/or minimum temperature observations began several years after mean temperature and precipitation). Aside from the period 1914-21 (i.e., during World War I, the Russian Revolution, and the Civil War), the number of stations rose at a relatively constant rate over the next half-century. The largest change in the network occurred in 1936, when an additional 65 observing posts were opened. Thereafter, only modest additions are evident, with all stations collecting data by 1966 and only five (Adamovka, Vereb'e, Oktiabr'skaya, Rostov-na-Donu, and Surgut) closing before 1989.

As the number of operational stations increased, spatial coverage improved. Early in the record, for example, the distribution of posts is biased: most stations were located in population centers west of the Ural mountains and at ports along the Black and Caspian seas, whereas vast tracts of Siberia were entirely unsampled.
Spatial coverage was much more representative of the country for the mid-1930s, with the exception of certain areas east of the Urals and north of the Arctic Circle. From a practical standpoint, the data set can probably be used to study long-term climate variations over the entire USSR for the period 1936-89. The density of stations, as well as their spatial distribution, was even better by 1985. Except for areas along the coast of the Arctic Ocean, most of the country was extremely well-sampled. In general, however, Arctic regions in the eastern part of the country are somewhat underrepresented throughout the record. The amount of missing data varies from element to element and station to station. Typically, the records of minimum/mean temperature are more complete than those of maximum temperature and rainfall. Most stations (90%) have at least 50 years of data for each parameter. 3. DATA PROBLEMS IDENTIFIED BY CDIAC An important part of the NDP process at the Carbon Dioxide Information Analysis Center (CDIAC) involves the quality assurance (QA) of data before distribution. Data received at CDIAC are rarely in a condition that would permit immediate distribution, regardless of the source. To guarantee data of the highest possible quality, CDIAC conducts extensive QA reviews. Reviews involve examining the data for completeness, reasonableness, and accuracy. Although they have common objectives, these reviews are tailored to each data set, often requiring extensive programming efforts. In short, the QA process is a critical component in the value-added concept of supplying accurate, usable data for researchers. The following summarizes the QA checks performed by CDIAC. The Russian data set compilers also conducted extensive manual and automated QA assessments. 
Although the archive was in fairly good condition upon its arrival at CDIAC, three important data quality problems were identified as a result of our QA checks: - incomplete metadata (particularly station history information) for 202 stations, - suspect data values and flag codes for all stations, and - extensive data problems for 25 stations. 3.1 Incomplete Metadata Metadata (i.e., station inventory/history information) was supplied to CDIAC on three 5.25-in floppy diskettes. Upon arrival, all files on these diskettes were checked for gross data processing problems (e.g., truncation of lines) and corruptions that might have been introduced in transport (e.g., unreadable characters). No problems of this variety were detected. CDIAC then assessed the accuracy of all station inventory parameters (i.e., WMO Nos., station names, coordinates, and elevations). This was accomplished by comparing each post's station inventory information with the official parameters given for that station in the latest version of WMO Publication No. 9, Vol. A, a document that contains station inventory information for all WMO posts. Through this comparison, the WMO Nos., station names, coordinates, and elevations for 221 of the 223 stations were verified. However, stations 26188 (Vereb'e) and 35133 (Adamovka) had no matching entries in WMO Vol. A. The accuracy of the station inventory information for these sites thus could not be corroborated. Station history parameters (i.e., station relocations and rain gauge replacements) were also checked for reasonableness. Three minor problems were noted: (1) the presence of bogus dates such as 31 September; (2) the nonchronological sorting of entries; and, most important (3) the lack of information for some stations. After consulting with the data set compilers, all such problems were resolved. 
However, it should be noted that many station relocation dates and rain gauge replacement dates are only listed as a year or year/month rather than a year/month/day. 3.2 Suspect Data Values and Flag Codes CDIAC received the daily data set in two shipments, the first containing data for the period 1881-1986 and the second extending the record through 1989. As with the metadata files, several general checks were first performed to identify any pervasive data processing problems and to verify that the files had not been corrupted in transport. No problems of this type were identified. Subsequently, all WMO Nos. were cross-referenced with the official list of stations provided by the Russian data set compilers. As a result, it was determined that station number 34731 (Rostov-na-Donu) was incorrectly listed as 34734 for the years 1987-88; the station number was corrected on all relevant lines. Finally, the values of year and month were checked for reasonableness and proper sorting. No year, month, or sorting problems were detected. CDIAC then examined the actual daily data values for reasonableness. In particular, minimum, mean, and maximum temperature on each day were compared to verify that the minimum was less than or equal to the mean and that the mean was less than or equal to the maximum. For 4544 days scattered over 220 stations, this relationship was violated. To alert the user to these cases, CDIAC flagged all such occurrences in the data set. Extreme value checks were applied to identify negative rainfall totals and temperatures that exceeded known world-record values (i.e., temperatures below -73 C or above 58 C). As a result, 230 minimum and 13 maximum temperature observations were flagged as suspect. Precipitation totals above 500 mm were also checked for reasonableness, though none were flagged as problematic. Finally, each time series was plotted and visually inspected for values that were anomalous but that did not exceed the aforementioned thresholds. 
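The internal-consistency and extreme-value checks described above can be illustrated with a short sketch. This is not the program CDIAC used; it is a minimal Python example under stated assumptions. The function name "qa_flags", the flag strings, and the use of None for a missing value are all hypothetical. Only the ordering rule (minimum <= mean <= maximum), the temperature thresholds (-73 C and 58 C), the negative-rainfall check, and the 500-mm review level are taken from the text.

```python
# Illustrative sketch only; not the QA code used by CDIAC.
# Temperatures are in degrees Celsius, precipitation in millimeters.
# None marks a missing value (an assumption made for this example).

TEMP_LOW_C = -73.0      # lower extreme-value threshold quoted in the text
TEMP_HIGH_C = 58.0      # upper extreme-value threshold quoted in the text
PRCP_REVIEW_MM = 500.0  # totals above this were reviewed by hand


def qa_flags(tmin, tmean, tmax, prcp):
    """Return a list of hypothetical flag strings for one station-day."""
    flags = []

    # Internal consistency: minimum <= mean <= maximum.
    if tmin is not None and tmean is not None and tmin > tmean:
        flags.append("min_gt_mean")
    if tmean is not None and tmax is not None and tmean > tmax:
        flags.append("mean_gt_max")

    # Extreme values: temperatures outside the quoted thresholds.
    for name, value in (("tmin", tmin), ("tmean", tmean), ("tmax", tmax)):
        if value is not None and (value < TEMP_LOW_C or value > TEMP_HIGH_C):
            flags.append(name + "_extreme")

    # Precipitation: negative totals are invalid; very large totals are
    # only marked for manual review, as in the text.
    if prcp is not None:
        if prcp < 0:
            flags.append("prcp_negative")
        elif prcp > PRCP_REVIEW_MM:
            flags.append("prcp_review")

    return flags


if __name__ == "__main__":
    print(qa_flags(-12.3, -15.0, -8.1, 0.4))     # minimum above the mean
    print(qa_flags(-80.0, -75.0, -70.0, None))   # two values below -73 C
```

A day is left unflagged when a comparison involves a missing value, which seems the most conservative choice for a sketch; the treatment actually applied to such days is not stated in the text.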
To screen out seasonal effects during this visual inspection, z-scores (i.e., standardized deviations from the long-term monthly mean) were also graphed. Consequently, another 572 minimum, 373 mean, and 346 maximum temperature values were flagged as suspect.

The daily flag codes assigned by the Russian data set compilers were also checked for validity. Seventeen observations were annotated with undocumented codes. Given the infrequency of these unspecified flags, all 17 observations were set to "missing." In addition, 293 minimum, 311 mean, and 997 maximum temperature observations had "missing value" flag codes, yet none of the values had been set to missing. The validity of these observations is uncertain.

3.3 Extensive Data Problems

As described earlier, each time series was plotted and visually inspected for errors. As a result, extensive data problems were identified for 13 sites. For example, station 24966 (Ust'-Maja) has numerous mean temperature observations <-40 C through most of its record, but none prior to 1902. Given the pervasive nature of these findings, no individual values were flagged as suspect; rather, an inventory of problematic stations was constructed (Table 3).

Thirteen stations also contain large gaps early in the record for some variables. For example, the archive for station 29866 (Minusinsk) begins in earnest in 1905, though a few observations are available as early as 1901.
CDIAC did not flag individual values to indicate these gaps; rather, the following inventory of stations with data gaps was prepared:

     Reboly (22602):      all variables
     Borzja (30965):      maximum temperature
     Kirov (27196):       maximum temperature
     Brest (33008):       mean/maximum temperature
     Barabinsk (29612):   maximum temperature
     Celinograd (35188):  maximum temperature
     Minusinsk (29866):   all variables
     Uil (35416):         maximum temperature
     Cita (30758):        maximum temperature
     Leninakin (37686):   maximum temperature
     Ulan-ude (30823):    maximum temperature
     Termez (38927):      all variables
     Kjahta (30925):      maximum temperature

It should also be noted that changes in station location, instrumentation, and time of observation may have introduced other inhomogeneities (ones undetectable by plotting) in each series. Methods for identifying such discontinuities are given in Potter (1981), Alexandersson (1986), Karl and Williams (1987), Gullet et al. (1991), and Peterson and Easterling (1993).

Table 3. Inventory of stations with extensive data problems

WMO No.  Description of data problem and approximate time of occurrence
-------  --------------------------------------------------------------
24266    From 1895-1920, there are few maximum temperature values
         greater than -36 C; thereafter, numerous values are greater
         than -36 C.
24641    From 1900-1930, there are few maximum temperature values
         greater than -36 C; thereafter, numerous values are greater
         than -36 C.
24944    From 1900-1930, there are few maximum temperature values
         greater than -36 C; thereafter, numerous values are greater
         than -36 C.
24959    From 1888-1927, there are few maximum temperature values
         greater than -36 C; thereafter, numerous values are greater
         than -36 C.
24966    From 1897-1901, there are few mean temperature values less
         than -40 C; thereafter, numerous values are less than -40 C.
25551    From 1894-1898, there are few minimum, mean, and maximum
         temperature values less than -40 C; thereafter, numerous
         values are less than -40 C.
26406    From 1881-1886, numerous precipitation totals are equal to 0;
         thereafter, far fewer values are equal to 0.
30823    In 1896, several precipitation totals are anomalously large.
31510    In 1928, several minimum temperature values are anomalously
         high.
36177    From 1918-1921, many precipitation totals are only recorded to
         the nearest millimeter.
37472    From 1898-1911, many minimum temperatures are only recorded to
         the nearest degree Celsius.
38895    In 1889, many maximum temperature values are anomalously high.
38954    In 1910, several precipitation totals are anomalously large.

4. HOW TO OBTAIN THE DATA FILES

This data base is available in machine-readable form, on request, from CDIAC without charge. CDIAC will also distribute subsets of the data base as needed. It can be acquired on two 9-track magnetic tapes or from CDIAC's anonymous FTP area (see FTP address below). However, because of space constraints, it will not be distributed on floppy diskette. Requests should include any specific tape instructions (i.e., 1600 or 6250 BPI, labeled or nonlabeled, ASCII or EBCDIC characters, and variable- or fixed-length records) required by the user to access the data. Requests not accompanied by specific instructions will be filled on 9-track, 6250 BPI, standard-labeled tapes with EBCDIC characters.

Requests should be addressed to:

     Carbon Dioxide Information Analysis Center
     Oak Ridge National Laboratory
     Post Office Box 2008
     Oak Ridge, Tennessee 37831-6335
     U.S.A.

     Telephone: +1 (615) 574-0390
     Fax:       +1 (615) 574-2232
     E-mail:    BITNET:   CDP@ORNLSTC
                INTERNET: CDP@STC10.CTD.ORNL.GOV
                OMNET:    CDIAC

The data files can also be acquired via FTP from CDIAC's anonymous FTP account:

- FTP to CDIAC.ESD.ORNL.GOV (128.219.24.36)
- Enter "ftp" as the userid
- Enter your e-mail address as the password (e.g., "rtv@ornlstc")
- Change to the directory "pub/ndp040"
- Acquire the files using the FTP "get" command

5. REFERENCES

Academy of Sciences of the Russian Empire.
1892, 1893, 1894, 1896, 1897, 1898, 1900, 1902, 1908, 1912. Guide to second grade meteorological stations. St. Petersburg.

Alexandersson, H. 1986. A homogeneity test applied to precipitation data. Journal of Climatology 6:661-75.

Central Administration of the Unified Hydrometeorological Service of the USSR. 1935, 1936. Guide to making meteorological observations and their processing (3rd and 4th eds.). Leningrad.

Central Administration of the Unified Hydrometeorological Service of the USSR. 1939. Guide to making meteorological observations and their processing (5th ed.). Moscow.

Central Administration of the Unified Hydrometeorological Service of the USSR. 1940. Guide to making meteorological observations and their processing (6th ed.). Leningrad-Moscow.

Council of Ministers of the USSR. 1946, 1954, 1958, 1962, 1969, 1985. Manual for hydrometeorological stations and posts, Issue 3 (meteorological observations at stations), Part 1: Main meteorological observations. Leningrad.

Council of Ministers of the USSR. 1958-69. Manual for hydrometeorological stations and posts, Issue 3 (meteorological observations at stations), Part 2: Processing of the meteorological observations. Leningrad.

Gidrometeoizdat. 1972. USSR climate reference book: History, physical, and geographic descriptions of meteorological stations and posts. Leningrad.

Gullet, D. W., L. Vincent, and L. H. Malone. 1991. Homogeneity Testing of Monthly Temperature Series: Application of Multiple-Phase Regression Models With Mathematical Changepoints. Atmospheric Environment Service, Downsview, Ontario, Canada.

Karl, T. R., and C. N. Williams, Jr. 1987. An approach to adjusting climatological time series for discontinuous inhomogeneities. Journal of Climate and Applied Meteorology 26:1744-63.

Peterson, T. C., and D. R. Easterling. 1993. Creation of homogeneous composite climatological reference series. International Journal of Climatology, in press.

Potter, K. W. 1981. Illustration of a new test for detecting a shift in precipitation series. Monthly Weather Review 109:2040-45.

The Nicholas Main Physical Observatory. 1915. Guide to second grade meteorological stations, Issue 1. Petrograd.

The Voyeikov Main Geophysical Observatory. 1928, 1931. Guide to second grade meteorological stations, Issue 1 (main meteorological observations). Leningrad.

The Voyeikov Main Geophysical Observatory. 1963. Review of changes in the technique of making meteorological observations over the network of stations and posts. Leningrad.

Tatusko, R. L. 1990. Cooperation in climate research: An evaluation of the activities conducted under the US-USSR agreement for environmental protection since 1974. National Oceanic and Atmospheric Administration, Washington, D.C.

6. FILE DESCRIPTIONS

This section describes the content and format of each of the 18 files that make up this NDP (Table 4). Because CDIAC distributes the data set in two ways (i.e., via anonymous FTP and on two 9-track magnetic tapes), each of the 18 files is referenced by both an ASCII file name (e.g., "ndp040.txt") and a tape file number (e.g., File 1, Tape 1).
The files and their contents include the following:

- "ndp040.txt" (File 1, Tape 1), a detailed description of both the 223-station network and the 18 data files;
- "inventory.for" (File 2, Tape 1), a FORTRAN data retrieval routine to read "station.inventory" (File 8, Tape 1);
- "history.for" (File 3, Tape 1), a FORTRAN data retrieval routine to read "station.history" (File 9, Tape 1);
- "data.for" (File 4, Tape 1), a FORTRAN data retrieval routine to read "ussr1.data"-"ussr9.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2);
- "inventory.sas" (File 5, Tape 1), a SAS data retrieval routine to read "station.inventory" (File 8, Tape 1);
- "history.sas" (File 6, Tape 1), a SAS data retrieval routine to read "station.history" (File 9, Tape 1);
- "data.sas" (File 7, Tape 1), a SAS data retrieval routine to read "ussr1.data"-"ussr9.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2);
- "station.inventory" (File 8, Tape 1), a listing of station location information and period of record statistics (by variable) for each of the 223 stations;
- "station.history" (File 9, Tape 1), a listing of rain gauge replacement dates and station relocation data for each of the 223 stations; and
- "ussr1.data"-"ussr9.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2), a listing of daily temperature and precipitation data for the 223 stations (25 stations per file).

The remainder of this section describes (or lists, where appropriate) the contents of each of the 18 files. The files are discussed in the order in which they appear on the magnetic tapes.

Table 4. Content, size, and format of data files

File number, name,        Logical   FTP file  Tape file    Block  Record
and description           records   size (K)   size (K)     size  length
----------------------    -------   --------  ---------    -----  ------
Tape 1
 1. ndp040.txt:             1,200       54.6       93.8    8,000      80
 2. inventory.for:             17        0.6        1.3    8,000      80
 3. history.for:               14        0.4        1.1    8,000      80
 4. data.for:                  20        0.6        1.6    8,000      80
 5. inventory.sas:             10        0.3        0.8    8,000      80
 6. history.sas:                8        0.2        0.6    8,000      80
 7. data.sas:                  14        0.4        1.1    8,000      80
 8. station.inventory:        223       21.3       21.8   10,000     100
 9. station.history:          810       22.9       27.7   10,500      35
10. ussr1.data:            74,672   18,972.5   19,688.9    6,750     270
11. ussr2.data:            74,456   18,855.2   19,632.0    6,750     270
12. ussr3.data:            79,908   20,305.4   21,069.5    6,750     270
13. ussr4.data:            83,107   21,098.0   21,913.0    6,750     270
14. ussr5.data:            81,450   20,635.8   21,476.1    6,750     270
15. ussr6.data:            73,585   18,662.8   19,402.3    6,750     270
Tape 2
 1. ussr7.data:            80,354   20,391.7   21,187.1    6,750     270
 2. ussr8.data:            79,700   20,191.8   21,014.6    6,750     270
 3. ussr9.data:            76,073   19,245.6   20,058.3    6,750     270

Total (both tapes)        705,621  178,460.1  185,591.6

NOTE: "FTP file size" applies only to files in CDIAC's anonymous FTP area. All such files have variable-length records.

NOTE: "Tape file size," "Block size," and "Record length" apply only to files distributed on magnetic tape. All such files have fixed-length records.

inventory.for (File 2, Tape 1)

This file contains a FORTRAN data retrieval routine to read "station.inventory" (File 8, Tape 1). The following is a listing of this program. For additional information regarding variable definitions and format statements, please see the file description for "station.inventory."

C FORTRAN data retrieval routine to read the file named
C "station.inventory" (File 8, Tape 1).
C
C Unit 1 is input.
C Unit 6 (terminal) is output.
C
      INTEGER WMO, MINTFYR, MIDTFYR, MAXTFYR, PRCPFYR, LYR
      REAL LAT, LON, ELEV, MINTMISS, MIDTMISS, MAXTMISS, PRCPMISS
      CHARACTER NAME*25
      OPEN (UNIT=1, FILE='station.inventory')
   10 READ (1, 1, END=99) WMO, NAME, LAT, LON, ELEV, MINTFYR, MINTMISS,
     *MIDTFYR, MIDTMISS, MAXTFYR, MAXTMISS, PRCPFYR, PRCPMISS, LYR
    1 FORMAT (I5, 1X, A25, 1X, F5.2, 1X, F7.2, 1X, F6.1, 1X,
     *4(I4, 1X, F4.1, 1X), I4)
      GO TO 10
   99 STOP
      END

history.for (File 3, Tape 1)

This file contains a FORTRAN data retrieval routine to read "station.history" (File 9, Tape 1). The following is a listing of this program.
For additional information regarding variable definitions and format statements, please see the file description for "station.history."

C FORTRAN data retrieval routine to read the file named
C "station.history" (File 9, Tape 1).
C
C Unit 1 is input.
C Unit 6 (terminal) is output.
C
      INTEGER WMO, YEAR, MONTH, DAY
      CHARACTER TYPE*4, DIST*2, DIRECT*3
      OPEN (UNIT=1, FILE='station.history')
   10 READ (1, 1, END=99) WMO, TYPE, YEAR, MONTH, DAY, DIST, DIRECT
    1 FORMAT (I5, 1X, A4, 1X, I4, 1X, I2, 1X, I2, 1X, A2, 1X, A3)
      GO TO 10
   99 STOP
      END


data.for (File 4, Tape 1)

This file contains a FORTRAN data retrieval routine to read "ussr1.data"-"ussr9.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2). The following is a listing of this program. For additional information regarding variable definitions and format statements, please see the file description for "ussr1.data"-"ussr9.data."

C FORTRAN data retrieval routine to read the files named
C "ussr*.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2)
C
C Unit 1 is input.
C Unit 6 (terminal) is output.
C
      INTEGER WMO, YEAR, MONTH, DAY, NOBS, DATA(31)
      CHARACTER TYPE*4, FLAGA(31)*1, FLAGB(31)*1
      OPEN (UNIT=1, FILE='ussr*.data')
   10 DO DAY = 1, 31
         DATA(DAY) = 9999
         FLAGA(DAY) = '9'
         FLAGB(DAY) = '9'
      END DO
      READ (1, 1, END=99) WMO, TYPE, YEAR, MONTH, NOBS,
     *(DAY, DATA(DAY), FLAGA(DAY), FLAGB(DAY), I = 1, NOBS)
    1 FORMAT (I5, A4, I4, I2, I2, 31(I2, I4, A1, A1))
      GO TO 10
   99 STOP
      END


inventory.sas (File 5, Tape 1)

This file contains a SAS data retrieval routine to read "station.inventory" (File 8, Tape 1). The following is a listing of this program. For additional information regarding variable definitions and format statements, please see the file description for "station.inventory."

* SAS data retrieval routine to read the file named;
* "station.inventory" (File 8, Tape 1).;
*;
DATA INVENTRY;
INFILE 'station.inventory';
INPUT WMO 1-5 NAME $ 7-31 LAT 33-37 LON 39-45 ELEV 47-52
      MINTFYR 54-57 MINTMISS 59-62 MIDTFYR 64-67 MIDTMISS 69-72
      MAXTFYR 74-77 MAXTMISS 79-82 PRCPFYR 84-87 PRCPMISS 89-92
      LYR 94-97;
RUN;


history.sas (File 6, Tape 1)

This file contains a SAS data retrieval routine to read "station.history" (File 9, Tape 1). The following is a listing of this program. For additional information regarding variable definitions and format statements, please see the file description for "station.history."

* SAS data retrieval routine to read the file named;
* "station.history" (File 9, Tape 1).;
*;
DATA HISTORY;
INFILE 'station.history';
INPUT WMO 1-5 TYPE $ 7-10 YEAR 12-15 MONTH 17-18 DAY 20-21
      DIST $ 23-24 DIRECT $ 26-28;
RUN;


data.sas (File 7, Tape 1)

This file contains a SAS data retrieval routine to read "ussr1.data"-"ussr9.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2). The following is a listing of this program. For additional information regarding variable definitions and format statements, please see the file description for "ussr1.data"-"ussr9.data."

* SAS data retrieval routine to read the files named;
* "ussr*.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2).;
*;
DATA DAILY;
LENGTH FLAGA1-FLAGA31 FLAGB1-FLAGB31 $ 1;
ARRAY DATA(31);
ARRAY FLAGA(31) $;
ARRAY FLAGB(31) $;
INFILE 'ussr*.data' lrecl=266;
INPUT WMO 5. TYPE $CHAR4. YEAR 4. MONTH 2. NOBS 2. @;
DO I = 1 TO NOBS;
   INPUT DAY 2. DATA(DAY) 4. FLAGA(DAY) $CHAR1. FLAGB(DAY) $CHAR1. @;
END;
RUN;


station.inventory (File 8, Tape 1)

This file provides station location information and period of record statistics for each of the 223 stations. There is one entry for each station; consequently, the file has 223 lines. Each line contains a station's WMO No., name, latitude, longitude, and elevation, as well as the first/last year of record and percentage of data missing for each variable.
The file is sorted by WMO No. and variable type and can be read by using the following FORTRAN code (contained in "inventory.for," which is File 2 on Tape 1):

C FORTRAN data retrieval routine to read the file named
C "station.inventory" (File 8, Tape 1).
C
C Unit 1 is input.
C Unit 6 (terminal) is output.
C
      INTEGER WMO, MINTFYR, MIDTFYR, MAXTFYR, PRCPFYR, LYR
      REAL LAT, LON, ELEV, MINTMISS, MIDTMISS, MAXTMISS, PRCPMISS
      CHARACTER NAME*25
      OPEN (UNIT=1, FILE='station.inventory')
   10 READ (1, 1, END=99) WMO, NAME, LAT, LON, ELEV, MINTFYR, MINTMISS,
     *MIDTFYR, MIDTMISS, MAXTFYR, MAXTMISS, PRCPFYR, PRCPMISS, LYR
    1 FORMAT (I5, 1X, A25, 1X, F5.2, 1X, F7.2, 1X, F6.1, 1X,
     *4(I4, 1X, F4.1, 1X), I4)
      GO TO 10
   99 STOP
      END

This file can also be read by using the following SAS code (contained in "inventory.sas," which is File 5 on Tape 1):

* SAS data retrieval routine to read the file named;
* "station.inventory" (File 8, Tape 1).;
*;
DATA INVENTRY;
INFILE 'station.inventory';
INPUT WMO 1-5 NAME $ 7-31 LAT 33-37 LON 39-45 ELEV 47-52
      MINTFYR 54-57 MINTMISS 59-62 MIDTFYR 64-67 MIDTMISS 69-72
      MAXTFYR 74-77 MAXTMISS 79-82 PRCPFYR 84-87 PRCPMISS 89-92
      LYR 94-97;
RUN;

Stated in tabular form, the contents include the following:

            Variable     Variable   Starting   Ending
Variable    type         width      column     column

WMO         Numeric          5          1         5
NAME        Character       25          7        31
LAT         Numeric          5         33        37
LON         Numeric          7         39        45
ELEV        Numeric          6         47        52
MINTFYR     Numeric          4         54        57
MINTMISS    Numeric          4         59        62
MIDTFYR     Numeric          4         64        67
MIDTMISS    Numeric          4         69        72
MAXTFYR     Numeric          4         74        77
MAXTMISS    Numeric          4         79        82
PRCPFYR     Numeric          4         84        87
PRCPMISS    Numeric          4         89        92
LYR         Numeric          4         94        97

where

WMO       is the WMO No. of the station.

NAME      is the name of the station.

LAT       is the latitude of the station (in decimal degrees).

LON       is the longitude of the station (in decimal degrees).
          Stations in the Western Hemisphere have negative longitudes.

ELEV      is the elevation of the station (in meters). Missing
          elevations are coded as 999.9.

MINTFYR   is the first year in which minimum temperature (MINTFYR),
MIDTFYR   mean temperature (MIDTFYR), maximum temperature (MAXTFYR),
MAXTFYR   or precipitation (PRCPFYR) data are available at this
PRCPFYR   station.

MINTMISS  is the percentage of minimum temperature (MINTMISS), mean
MIDTMISS  temperature (MIDTMISS), maximum temperature (MAXTMISS), or
MAXTMISS  precipitation (PRCPMISS) data that are missing at this
PRCPMISS  station.

LYR       is the last year in which data are available for all
          variables at this station.


station.history (File 9, Tape 1)

This file provides rain gauge replacement dates and station relocation dates for each station. There are two types of entries for each station. One type contains the station's WMO No. and rain gauge replacement date. The other type contains the station's WMO No. and a relocation date, distance, and direction. The file is sorted by WMO No., year, month, and day and can be read by using the following FORTRAN code (contained in "history.for," which is File 3 on Tape 1):

C FORTRAN data retrieval routine to read the file named
C "station.history" (File 9, Tape 1).
C
C Unit 1 is input.
C Unit 6 (terminal) is output.
C
      INTEGER WMO, YEAR, MONTH, DAY
      CHARACTER TYPE*4, DIST*2, DIRECT*3
      OPEN (UNIT=1, FILE='station.history')
   10 READ (1, 1, END=99) WMO, TYPE, YEAR, MONTH, DAY, DIST, DIRECT
    1 FORMAT (I5, 1X, A4, 1X, I4, 1X, I2, 1X, I2, 1X, A2, 1X, A3)
      GO TO 10
   99 STOP
      END

This file can also be read by using the following SAS code (contained in "history.sas," which is File 6 on Tape 1):

* SAS data retrieval routine to read the file named;
* "station.history" (File 9, Tape 1).;
*;
DATA HISTORY;
INFILE 'station.history';
INPUT WMO 1-5 TYPE $ 7-10 YEAR 12-15 MONTH 17-18 DAY 20-21
      DIST $ 23-24 DIRECT $ 26-28;
RUN;

Stated in tabular form, the contents include the following:

            Variable     Variable   Starting   Ending
Variable    type         width      column     column

WMO         Numeric          5          1         5
TYPE        Character        4          7        10
YEAR        Numeric          4         12        15
MONTH       Numeric          2         17        18
DAY         Numeric          2         20        21
DIST        Character        2         23        24
DIRECT      Character        3         26        28

where

WMO       is the WMO No. of the station.

TYPE      is the type of change indicated by this entry. The possible
          values of TYPE are as follows:

          RAIN = rain gauge replacement (i.e., change from old-type
                 gauge to Tretyakov-type gauge). Each station will have
                 only one RAIN entry. In this type of entry, DIST and
                 DIRECT (described below) are not relevant and thus are
                 coded as blanks.

          MOVE = station relocation. Each station will have at least
                 one MOVE entry. If a station moved on more than one
                 occasion, then separate entries are included for each
                 relocation. If a station never moved, then that
                 station will have only one MOVE entry; in this entry,
                 YEAR, MONTH, DAY, DIST, and DIRECT (described below)
                 are all coded as missing. In other words, if a station
                 has only one MOVE entry, and if all variables in that
                 MOVE entry are coded as missing, then the given
                 station never moved.

YEAR      is the year in which the change took place. Missing years
          are coded as -999.

MONTH     is the month in which the change took place. Missing months
          are coded as -9.

DAY       is the day on which the change took place. Missing days are
          coded as -9.

DIST      is the distance (in kilometers) that the station was moved.
          Missing distances are coded as -9. A distance of zero
          indicates that the station moved less than one kilometer.
          DIST only applies to station relocation entries (i.e., lines
          in which TYPE = MOVE). In rain gauge replacement entries
          (i.e., lines in which TYPE = RAIN), DIST is not relevant and
          thus is coded as blanks.

DIRECT    is the direction in which the station was moved (e.g.,
          N = north, SE = southeast). Missing directions are coded as
          -99. DIRECT only applies to station relocation entries (i.e.,
          lines in which TYPE = MOVE). In rain gauge replacement
          entries (i.e., lines in which TYPE = RAIN), DIRECT is not
          relevant and thus is coded as blanks.


ussr1.data-ussr9.data (Files 10-15, Tape 1 and Files 1-3, Tape 2)

These files contain the daily temperature and precipitation values for each of the 223 stations. Each file consists of a block of 25 stations (except ussr9.data, which only has 23). The range of WMO numbers associated with each file is as follows:

File name     WMO No. range
----------    -------------
ussr1.data    20674-23804
ussr2.data    23849-25744
ussr3.data    25913-28064
ussr4.data    28138-30230
ussr5.data    30253-31532
ussr6.data    31594-33837
ussr7.data    33889-35188
ussr8.data    35229-37549
ussr9.data    37686-38987

Each logical record in these files contains one month of data for a given variable. In particular, each line consists of a WMO No., a flag indicating the type of variable (i.e., minimum, mean, maximum temperature, or precipitation), the year and month of the record, a tally of the number of days (n) with data, and n daily values with their respective flag codes. To conserve space, only days with nonmissing values are included in each record. Likewise, if no data are available for a particular month, then there is no entry for that month in the data file. Because only days with nonmissing values are contained in the data base, the record length of the file varies from line to line.
In addition, a given day of the month can fall within a different set of columns from one line to the next. The files are sorted by WMO No., variable type, year, and month and can be read by using the following FORTRAN code (contained in "data.for," which is File 4 on Tape 1):

C FORTRAN data retrieval routine to read the files named
C "ussr*.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2)
C
C Unit 1 is input.
C Unit 6 (terminal) is output.
C
      INTEGER WMO, YEAR, MONTH, DAY, NOBS, DATA(31)
      CHARACTER TYPE*4, FLAGA(31)*1, FLAGB(31)*1
      OPEN (UNIT=1, FILE='ussr*.data')
   10 DO DAY = 1, 31
         DATA(DAY) = 9999
         FLAGA(DAY) = '9'
         FLAGB(DAY) = '9'
      END DO
      READ (1, 1, END=99) WMO, TYPE, YEAR, MONTH, NOBS,
     *(DAY, DATA(DAY), FLAGA(DAY), FLAGB(DAY), I = 1, NOBS)
    1 FORMAT (I5, A4, I4, I2, I2, 31(I2, I4, A1, A1))
      GO TO 10
   99 STOP
      END

These files can also be read by using the following SAS code (contained in "data.sas," which is File 7 on Tape 1):

* SAS data retrieval routine to read the files named;
* "ussr*.data" (Files 10-15, Tape 1 and Files 1-3, Tape 2).;
*;
DATA DAILY;
LENGTH FLAGA1-FLAGA31 FLAGB1-FLAGB31 $ 1;
ARRAY DATA(31);
ARRAY FLAGA(31) $;
ARRAY FLAGB(31) $;
INFILE 'ussr*.data' lrecl=266;
INPUT WMO 5. TYPE $CHAR4. YEAR 4. MONTH 2. NOBS 2. @;
DO I = 1 TO NOBS;
   INPUT DAY 2. DATA(DAY) 4. FLAGA(DAY) $CHAR1. FLAGB(DAY) $CHAR1. @;
END;
RUN;

Stated in tabular form, the contents include the following:

               Variable     Variable   Starting   Ending
Variable       type         width      column     column

WMO            Numeric          5          1         5
TYPE           Character        4          6         9
YEAR           Numeric          4         10        13
MONTH          Numeric          2         14        15
NOBS           Numeric          2         16        17
DAY            Numeric          2        N/A       N/A
DATA(1-31)     Numeric          4        N/A       N/A
FLAGA(1-31)    Character        1        N/A       N/A
FLAGB(1-31)    Character        1        N/A       N/A

The variables contained in "ussr1.data"-"ussr9.data" have the following definitions:

WMO       is the WMO No. of the station.

TYPE      is the variable type.
          The possible values of TYPE are as follows:

          TMIN = minimum temperature (tenths of a degree Celsius);
          TMAX = maximum temperature (tenths of a degree Celsius);
          TMID = mean temperature (tenths of a degree Celsius); and
          PRCP = precipitation (tenths of millimeters).

YEAR      is the year of the data record.

MONTH     is the month of the data record.

NOBS      is the number of days in the month that have nonmissing data
          values. Days with missing values are NOT included in the
          data files.

DAY       is the day of the month.

DATA(1-31)  are the daily data values.

FLAGA(1-31) are daily quality codes that were assigned by the Russian
          data set compilers. The codes and their meanings are as
          follows:

          0 = the value is assumed to be reliable,
          2 = the value is doubtful (beyond the set limit), and
          4 = the value is rejected

          (Note: According to the documentation supplied by the
          Russian data set compilers, all values with a FLAGA code of
          4 should have been set to missing because the meteorological
          observation was never carried out in the first place.
          However, 292 minimum temperature values, 311 mean
          temperature values, and 997 maximum temperature values had
          FLAGA codes of 4. The validity of these values is unknown.)

FLAGB(1-31) are daily quality codes specific to the type of variable.

          For minimum, maximum, and mean temperature, these flags were
          assigned by CDIAC based upon the findings of various visual
          and digital quality assurance checks. The codes and their
          meanings are as follows:

          0 = the temperature value is assumed to be reliable, and
          3 = the temperature value is suspect. The value might have
              been flagged as suspect for two reasons: (1) it appeared
              to be extreme according to digital or visual quality
              assurance checks, or (2) the relationship between
              minimum, mean, and maximum temperature (i.e.,
              MINT<=MIDT<=MAXT) was violated.

          For precipitation, these flags were assigned by the Russian
          data set compilers.
          The codes and their meanings are as follows:

          5 = a rainfall total >0.1 mm (though CDIAC determined that
              some observations after 1986 were in fact 0);
          6 = a multiple-day rainfall total;
          7 = a rainfall total of 0 (i.e., no precipitation recorded);
              and
          8 = a rainfall total <0.1 mm. Note that in these cases the
              actual rainfall total is coded as 0 (i.e., DATA = 0).

The following is a sample line to illustrate the format of "ussr1.data"-"ussr9.data" (in the sample, "b" denotes a blank):

                  1         2         3         4
Column:  12345678901234567890123456789012345678901
Data:    38987PRCP198912b3b4bbb205b6b2780515bb4605

In this sample line:

Record
Position   Contents   Variable    Meaning

1-5        38987      WMO         Line contains data for st. 38987.
6-9        PRCP       TYPE        Line contains precip. data.
10-13      1989       YEAR        Line contains data for 1989.
14-15      12         MONTH       Line contains data for December.
16-17      b3         NOBS        3 days in the month have data.
18-19      b4         DAY         Fourth day of the month.
20-23      bbb2       DATA(4)     0.2 mm of rainfall.
24         0          FLAGA(4)    Value is assumed to be reliable.
25         5          FLAGB(4)    Rainfall total >0.1 mm.
26-27      b6         DAY         Sixth day of the month.
28-31      b278       DATA(6)     27.8 mm of rainfall.
32         0          FLAGA(6)    Value is assumed to be reliable.
33         5          FLAGB(6)    Rainfall total >0.1 mm.
34-35      15         DAY         Fifteenth day of the month.
36-39      bb46       DATA(15)    4.6 mm of rainfall.
40         0          FLAGA(15)   Value is assumed to be reliable.
41         5          FLAGB(15)   Rainfall total >0.1 mm.

- Note lack of data for days 1-3 [i.e., DATA(1-3), FLAGA(1-3), and
  FLAGB(1-3) are missing].
- Note lack of data for day 5 [i.e., DATA(5), FLAGA(5), and FLAGB(5)
  are missing].
- Note lack of data for days 7-14 [i.e., DATA(7-14), FLAGA(7-14), and
  FLAGB(7-14) are missing].
- Note lack of data for days 16-31 [i.e., DATA(16-31), FLAGA(16-31),
  and FLAGB(16-31) are missing].
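
For readers without access to FORTRAN or SAS, the variable-length record layout described above can also be sketched in Python. This is only an illustration and is not part of the distributed package: the function name parse_record and the layout of the returned dictionary are assumptions made for this example. It relies solely on the column positions given in the table for "ussr1.data"-"ussr9.data" and is applied to the sample line, with each "b" written as a blank.

```python
def parse_record(line):
    """Parse one logical record from a "ussr*.data" file (illustrative).

    Header: WMO (cols 1-5), TYPE (6-9), YEAR (10-13), MONTH (14-15),
    NOBS (16-17). It is followed by NOBS groups of 8 characters:
    DAY (2), DATA (4), FLAGA (1), FLAGB (1).
    """
    record = {
        "wmo": int(line[0:5]),
        "type": line[5:9],
        "year": int(line[9:13]),
        "month": int(line[13:15]),
        "days": {},  # keyed by day of month; missing days are absent
    }
    nobs = int(line[15:17])
    for i in range(nobs):
        start = 17 + 8 * i  # 0-based offset of the i-th daily group
        group = line[start:start + 8]
        day = int(group[0:2])
        record["days"][day] = {
            "value": int(group[2:6]),  # raw value, in tenths of a unit
            "flaga": group[6],
            "flagb": group[7],
        }
    return record


# Sample line from the text, with each "b" replaced by a blank.
sample = "38987PRCP198912 3 4   205 6 2780515  4605"
rec = parse_record(sample)
for day, obs in sorted(rec["days"].items()):
    # PRCP values are stored in tenths of millimeters.
    print(day, obs["value"] / 10.0, obs["flaga"], obs["flagb"])
```

Days with no observation (here days 1-3, 5, 7-14, and 16-31) simply do not appear in the dictionary, which mirrors the file's convention of omitting missing values rather than storing a missing-value code.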