Assessment of Spatial Water Quality Observation of Citarum River Bandung Regency Using Multivariate Statistical Methods

The Citarum River is one of the most important rivers in Indonesia. Around 16 million people interact with the river; its watershed covers 12,000 km², it supplies irrigation water for 420,000 hectares of rice fields, and it provides 80% of the water needs of Jakarta, the capital of Indonesia. Unfortunately, the Citarum is also known as one of the most polluted rivers in the world. Although the river receives much attention nowadays, no analysis has yet determined the latent factors that contribute to the distribution of its water quality clusters. This study aims to provide a spatial water quality assessment of the Citarum River in Bandung Regency. The study can help the government decide how to manage the water quality of the Citarum and the socio-cultural factors involved in polluting the river. Open Data initiatives can also use the data and results for further research. The water quality of the Citarum is assessed using multivariate statistical approaches. The data set comprises one month of observation data from 75 stations located on the Citarum in Bandung Regency and its tributaries. Factor Analysis (FA) with Principal Component Analysis (PCA) as the extraction method gives two factors, while Cluster Analysis (CA) shows three clusters, suggesting different physicochemical characteristics and pollution levels across the Citarum water system. BOD, COD, and DO, together with total P and Fecal Coliform, are identified as the two underlying factors of water quality in the Citarum and its tributaries in Bandung Regency. Descriptive statistics confirm the low water quality of the Citarum in Bandung Regency.
Keywords: Citarum; water quality; multivariate statistics; cluster analysis; latent factors; coliform.
Manuscript received 13 Apr. 2020; revised 29 Oct. 2020; accepted 28 Nov. 2020. Date of publication 28 Feb. 2021. IJASEIT is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.


I. INTRODUCTION
Rivers are essential sources of water for human consumption. The Citarum is the longest river in West Java, Indonesia. It starts from Lake Cisanti at Gunung Wayang and flows 269 km to the Java Sea. The Citarum's watershed covers an area of 12,000 km² [1]. Around 16 million people interact with the Citarum, generating typical anthropogenic pressures, including contaminant loads from industry, domestic sewage, agricultural effluents, and farming. The Citarum is also the primary source of irrigation for 420,000 hectares of rice fields in Karawang and Subang, and it provides eighty percent of Jakarta's water needs. It also supplies around 2,500 megawatts of electricity used by Java and Bali [2]. The Citarum hosts three large reservoirs, Saguling (1985), Cirata (1988), and Jatiluhur (1967) [3]; Jatiluhur provides 187.5 MW, Saguling 1,400 MW, and Cirata 1,008 MW of hydroelectric power [1].
The government of Indonesia divides the Citarum management problem into three sections: headwaters, middle, and lower. In the headwaters, the river faces deforestation and farming, among other pressures, which lead to erosion of 31.4% and sedimentation of 7,900 tons/hectare. Unmanaged groundwater exploitation causes the groundwater level to fall by an estimated 5 meters per year.
The water quality of the Citarum has deteriorated because of several factors. Around 400 tons of cow manure are dumped into the river each day. Industrial sewage adds around 280 tons per day in Bandung Regency alone. One popular method among factories for disposing of untreated wastewater is through "stealth pipes." The public becomes aware of these pipes when the water turns red or another unnatural color and has a chemical odor. In various sectors of the Citarum, which is now overseen by the military, these pipes are sealed with cement. This measure was necessary because the responsible factories acted as if they knew nothing about the pipes. In time, the sealed pipes cause wastewater to back up and burst at its source, forcing the factories to admit responsibility and begin fixing their wastewater systems [4], [5]. These discharge points can be 1-2 km or more from the source plants.
Of the roughly 2,000 manufacturers that use the Citarum as their final sewage disposal, only 20% are equipped with industrial water treatment. Existing regulations and laws governing hazardous waste are weakly enforced [6].
In addition to factory waste, the Citarum must also bear household waste. Many residents along the Citarum still lack adequate sanitation facilities. Consequently, household waste is dumped directly into the Citarum or its tributaries. The lack of proper sanitation is closely related to economic conditions and land availability, and it is compounded by a lack of knowledge and awareness about the environment. In 2017, around 1,500 tons of sewage were dumped into the Citarum [7]. Unfortunately, this condition is still ongoing [8].
Farms around the Citarum also damage the water quality of the river and its tributaries. A typical example comes from West Bandung Regency. Currently, nine villages in Lembang District hold 22,400 head of cattle. Assuming each animal produces 10 kg of dung per day, the Citarum receives around 224 tons of cow dung per day from the Lembang area alone [9]. To address this situation, the government has started to build a pilot composter and biodigester installation.
In the middle section, sedimentation in the Saguling Reservoir is as high as 8.2 million m3 per year, in Cirata 6.4 million m3 per year, and in Jatiluhur 1.6 m3 per year. Water quality evaluation data on the three reservoirs show that Saguling and Cirata suffer complete deoxygenation 5 m below the surface, although Jatiluhur remains oxygenated to the bottom [3]. Today, through its "Citarum Harum" initiative, Indonesia's government is working to clean and better manage the Citarum, aiming to clean the river effectively within seven years starting from 2018. Before the "Citarum Harum" program, the government and NGOs launched several programs to restore the quality of the Citarum. "Citarum Bergetar (2001)", "Cita Citarum (2010)", and "Citarum Bestari (2013)" are examples of the many programs launched by the government, with varying results [9].
Multivariate statistical approaches are becoming prevalent for analyzing and understanding water quality because of their ability to give perspective on the large volume of data from many monitoring stations [10]-[18]. This study applies Factor Analysis (FA) with Principal Component Analysis (PCA) as the extraction method. Cluster Analysis is also applied to identify the water quality pattern and pollution status of the Citarum in Bandung Regency.
The contributions of this paper are:
• A spatial water quality observation of the Citarum watershed in Bandung Regency using multivariate statistical analysis.
• The potential latent factors contributing to water quality degradation in the Citarum watershed in Bandung Regency.
• The spatial similarities in water quality, based on cluster analysis.

A. Study Area
The Citarum River is located in West Java Province, Indonesia. This study analyzes only the Citarum watershed in Bandung Regency. Figure 1 and Figure 2 show the 75 stations where water quality samples were collected. These locations cover the important tributaries of the Citarum River in Bandung Regency. The data are typically used for the environmental agency's routine operations and for policy planning, as mandated by Regulation of the Republic of Indonesia No. 82 of 2001 Concerning Water Quality Management and Water Pollution Control [19].

B. Water Quality Parameters
Water quality is always measured in relation to the purpose of use. Water quality can be seen as the suitability of water for one or more purposes, measured by chemical, biological, and physical characteristics [20]. Water suitable for growing plants may not be suitable for drinking. A total of 23 water quality parameters are available for the study and are considered to represent the water quality of the Citarum and its tributaries. More than seventy monitoring sites are set up along the main river and its tributaries. The Environmental Agency of Bandung Regency collects the data. Several parameters are commonly used to measure water quality, including BOD, DO, COD, TSS, pH, Total Coliforms, and Fecal Coliforms.
Dissolved Oxygen (DO) is a measure of how much oxygen is dissolved in water; it is one of the most critical water quality factors. Without enough DO, aquatic life cannot exist. Water obtains oxygen by direct absorption from the atmosphere, usually enhanced by natural turbulent aeration, such as in a stream where water flows over boulders [21]. A sufficient concentration of dissolved oxygen is needed to sustain healthy aquatic life and the aesthetic quality of a river or lake.
Five-day biochemical oxygen demand (BOD5) is the amount of oxygen required by aerobic microorganisms to decompose the organic matter in a water sample within five days. It is the most widely used parameter for measuring organic pollution in rivers, selected by the U.K. Royal Commission on River Pollution in 1908 and then adopted by the American Public Health Association Standard Committee in 1936 [22]-[24]. The amount of available Dissolved Oxygen (DO) in water is strongly related to BOD: BOD5 is the depletion of DO in a water sample over five days, consumed by the metabolism of the microorganisms present. The greater the BOD, the more severe the degree of pollution.
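As a minimal sketch of the definition above, BOD5 can be computed as the drop in DO over the five-day incubation, divided by the fraction of sample in the test bottle when the sample has been diluted. The function name, the dilution parameter, and the readings are illustrative assumptions, not values or procedures taken from this study, and the sketch ignores seed corrections.

```python
def bod5_mg_per_l(do_initial, do_final, sample_fraction=1.0):
    """Estimate BOD5 (mg/L) from DO readings taken five days apart.

    do_initial, do_final: DO in mg/L at day 0 and day 5 (hypothetical inputs).
    sample_fraction: volume fraction of sample in the bottle (1.0 = undiluted).
    """
    if not 0.0 < sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be in (0, 1]")
    if do_final > do_initial:
        raise ValueError("final DO should not exceed initial DO")
    # DO depletion, scaled back up to the undiluted sample.
    return (do_initial - do_final) / sample_fraction


# Hypothetical reading: DO falls from 8.0 to 5.0 mg/L in a bottle
# that contains 25% river water and 75% dilution water.
example_bod5 = bod5_mg_per_l(8.0, 5.0, sample_fraction=0.25)
print(f"BOD5 = {example_bod5:.1f} mg/L")
```

Dilution matters for heavily polluted samples because an undiluted bottle can run out of oxygen before day five, which would understate the demand.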
Total Suspended Solids (TSS) are the solid particles suspended in water. Chemical Oxygen Demand (COD) is the amount of oxygen required to chemically oxidize the matter present in water and is commonly used to measure the acceptable quality of effluent water [25]-[27]. The pH of river water is important because it affects fish life, chemical reactions in water supply and wastewater processes, and the suitability of the water for a particular use [25]. Coliforms are microbiological parameters used to set maximum allowable values for the recreational use of water [25], [28]. Coliforms are bacteria that are always present in the digestive tracts of animals and humans. Although most do not cause disease, coliform bacteria are used as indicator bacteria; their presence is a reasonable indication that other, pathogenic bacteria are present. This approach is used because testing every water sample for pathogens is not practical. Coliform bacteria are relatively easy to detect and occur in larger numbers than dangerous pathogens, so they have become the standard basic test for bacterial contamination of a water supply. Fecal coliform is the group of total coliforms found in the gut and feces of warm-blooded animals or humans. Total coliform includes bacteria from soil, human or animal waste, and water that has been influenced by surface water [29].

C. River Water Quality
The government of Indonesia categorizes water quality into four classes, depending on the intended use of the water, according to Government Regulation No. 82 of 2001 on Management of Water Quality and Pollution [28]. The classes range from Class IV, safe for plant cultivation, up to Class I, safe for drinking water processing, as shown in Table 2. This study mainly checks whether the Bandung Regency watershed meets the Class II criteria. These criteria are the basis of the routine reporting activities of the Environment Agency in Indonesia. Indonesia also defines four water pollution levels, as seen in Table 3 [30]. [32], except oil and grease are not measured in Indonesia. This study analyzes these seven parameters using multivariate statistical methods. Data screening and cleaning is done by removing missing values and abnormal data, leaving 75 valid observations. Outliers are detected using the Z value (standard normal score) to smooth the clustering process.
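The Z-value screening mentioned above can be sketched as follows. This is only an illustration: the cutoff of 3 standard deviations is a common convention assumed here, not a value reported by the study, and the TSS readings are made up.

```python
import numpy as np


def flag_outliers(values, threshold=3.0):
    """Return z-scores and a boolean mask of readings with |z| > threshold.

    threshold=3.0 is an assumed convention, not the study's documented cutoff.
    """
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)  # sample standard deviation
    return z, np.abs(z) > threshold


# Hypothetical TSS readings (mg/L) from 21 stations; the last one is extreme.
tss = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13,
       12, 16, 11, 15, 14, 13, 12, 14, 15, 13, 400]
z_scores, is_outlier = flag_outliers(tss)
print("flagged station indices:", np.flatnonzero(is_outlier))
```

Because a single extreme reading inflates both the mean and the standard deviation, z-score screening works best with a reasonably large number of stations; it is applied per parameter, one column at a time.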

D. Factor Analysis and Principal Component Analysis
Assuming a strong correlation among the data, we want to reveal the structure, factors, or dimensions that define the relations among the observed water quality variables. Factor analysis is a statistical method that yields factors, structures, or latent variables that cannot be observed or determined directly. These factors contain the essential information about the relationships between the observed variables. In doing so, FA also reduces the number of original variables. This study follows the Exploratory Factor Analysis procedure in SPSS using PCA. PCA reduces the dimensionality of a data set by transforming the data into a new set of orthogonal (uncorrelated) variables called Principal Components (PCs), arranged in decreasing order of importance [11], [33]. Factor Analysis gives the number of latent constructs and the underlying factors of the data set [34].
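To make the PCA extraction step concrete, the sketch below decomposes the correlation matrix of a small synthetic data set with NumPy. It is not the SPSS procedure used in the study: the data are randomly generated stand-ins, the column meanings are hypothetical, and the "eigenvalue greater than 1" retention rule is an assumption made for illustration.

```python
import numpy as np


def pca_on_correlation(data):
    """PCA via eigendecomposition of the correlation matrix.

    Returns eigenvalues (descending), component loadings, and the
    proportion of variance explained by each component.
    """
    x = np.asarray(data, dtype=float)
    corr = np.corrcoef(x, rowvar=False)      # columns are variables
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort, largest first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Loadings: correlation between each variable and each component.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    explained = eigvals / eigvals.sum()
    return eigvals, loadings, explained


# Synthetic stand-in for 75 stations x 5 parameters (NOT the study data).
rng = np.random.default_rng(0)
n_stations = 75
organic = rng.normal(size=n_stations)  # hypothetical latent organic load
fecal = rng.normal(size=n_stations)    # hypothetical latent fecal load


def noise():
    return 0.3 * rng.normal(size=n_stations)


data = np.column_stack([
    organic + noise(),   # stands in for BOD
    organic + noise(),   # stands in for COD
    -organic + noise(),  # stands in for DO (falls as organic load rises)
    fecal + noise(),     # stands in for total P
    fecal + noise(),     # stands in for Fecal Coliform
])

eigvals, loadings, explained = pca_on_correlation(data)
n_keep = int(np.sum(eigvals > 1.0))  # assumed retention rule
print("eigenvalues:", np.round(eigvals, 2))
print("components kept:", n_keep)
print("loadings of kept components:")
print(np.round(loadings[:, :n_keep], 2))
```

Working from the correlation matrix rather than the covariance matrix puts parameters with very different units (mg/L versus bacterial counts) on the same footing, which is why the eigenvalues sum to the number of variables.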

E. Cluster Analysis
Cluster Analysis (CA) groups a large volume of data into classes based on similarities within a class and dissimilarities between classes. CA is commonly used together with other multivariate statistical methods to find patterns in and analyze large, complex water quality data sets [6]-[11]. In this study, CA is used to find spatially similar groups among all monitoring stations based on their water quality parameters. Similarities and dissimilarities are measured using Euclidean distance (linkage distance), and stepwise hierarchical clustering is applied. A dendrogram is a diagram that gives a visual summary of how the clusters form during the clustering process. In the dendrogram in Fig. 2, the 75 observation stations (X-axis) start clustering together at the bottom according to how similar they are. For example, stations 24, 43, and 36 form a cluster at the first step, and at the second step they are grouped with stations 19 and 33, 58, …, 50. The Y-axis is the cutting distance, the parameter used to decide whether an object belongs to a cluster. Increasing the cutting Euclidean distance (Y-axis) progressively merges the data and reduces the number of clusters.

A. Descriptive Statistics
Table 4 compares the quality of the samples taken from the Citarum and its tributaries against the Class II criteria (safe for water recreation, fish, and plant cultivation). The mean values are far above the prescribed limits, indicating that water quality is generally low and not suitable for water recreation, fish, or plant cultivation. The high standard deviations show that the data are widely spread, suggesting considerable spatial variation, very likely caused by natural and anthropogenic pollution sources. The values of all parameters are far outside the allowable limits. Most notably, Fecal Coliform and Total Coliform are more than 250 and 388 times the recommended standard, respectively.
These extreme values confirm one of the challenges the Indonesian Army faces in helping to clean the Citarum. Many people who live near the Citarum do not have proper sanitation facilities and use the river as their only alternative.
BOD and COD values are also far above the Class II standards. These values confirm the level of pollution from textile industries that dump their effluent without the proper treatment required by regulations. Despite the government's efforts to clean and restore the water quality of the Citarum, these values unfortunately show that the river is still far from the allowable limits.
Table 5 shows the result of the PCA, which yields two Principal Components: Component 1 comprises BOD, COD, and DO, and Component 2 comprises Total P and Fecal Coliform. Component 1 groups the oxygen-related parameters most commonly used in measuring water quality, especially to assess how well industries treat their effluents or to measure the degree of water pollution in general. The analysis confirms that BOD, COD, and DO move together consistently. Component 1 is easily related to the fact that many industries that use the Citarum or its tributaries for their final effluent discharge do not follow the water quality standards mandated by the government. The factor analysis also presents Total P and Fecal Coliform as an underlying, or latent, factor for the Citarum in Bandung Regency; their values move together consistently. Total P and Fecal Coliform are strongly related to pollution caused by human or animal feces and urine. This result is consistent with the fact that many people who live alongside the rivers do not have adequate sanitation facilities. Based on these two factors, we can safely deduce that the Citarum and its tributaries suffer from both industrial effluents and domestic or farm waste.

C. Spatial Similarity and Sites Grouping, Cluster Analysis
The dendrogram in Fig. 2 shows that the available stations fall into three clusters if we choose 24 as the cutting distance. Cluster 1 consists of two stations, Cikapundung Hilir and Cipadaun Hulu; Cluster 2 consists of Cilebak and Cikaro; and Cluster 3 contains all other stations. Total Coliform in Cluster 1 is very high, far above the standard and far above the average (Cluster 3); indeed, this is the main characteristic of Cluster 1. For Fecal Coliform, the three clusters are not easily distinguished, although all of them exhibit very high values far above the standard.
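Cutting a dendrogram at a fixed distance, as described above, can be sketched with SciPy's hierarchical clustering routines. The station coordinates are synthetic, the two feature columns are hypothetical, and average linkage is an assumption, since the text specifies only Euclidean distance; the cut value of 24 is reused purely for illustration on this made-up scale.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_stations(features, cut_distance, method="average"):
    """Agglomerative clustering with Euclidean distance, cut at a fixed height.

    method="average" is an assumed linkage rule, not one stated in the study.
    """
    x = np.asarray(features, dtype=float)
    tree = linkage(x, method=method, metric="euclidean")
    # Stations whose merge height is at or below cut_distance share a label.
    labels = fcluster(tree, t=cut_distance, criterion="distance")
    return tree, labels


# Synthetic stations with two hypothetical scaled features (NOT study data):
# four "typical" stations and two small groups that sit far from them.
stations = np.array([
    [0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0],  # typical stations
    [100.0, 0.0], [101.0, 1.0],                      # extreme on feature 1
    [0.0, 100.0], [1.0, 101.0],                      # extreme on feature 2
])
tree, labels = cluster_stations(stations, cut_distance=24.0)
print("cluster labels:", labels)
print("number of clusters:", len(set(labels)))
```

Raising `cut_distance` merges groups and lowers the cluster count, while lowering it splits them; `scipy.cluster.hierarchy.dendrogram(tree)` can draw the corresponding tree when a plotting backend is available.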
The TSS value in Cluster 2 is very high compared to the other two clusters. This value calls for a follow-up investigation, since erosion or other factors can cause high TSS. Cluster 1 exhibits a good TSS value, below the allowable standard, while most of the rivers have values above the acceptable standard. The values of BOD and COD are all above the standard. The DO values in Cluster 2 are well within the acceptable standard, even though BOD and COD are far above it. This may result from sampling error or from a naturally rich oxygen supply. These results suggest that the government should enforce the law more vigorously against all industries that use the Citarum as their final sewage disposal, especially those without wastewater treatment plants. Besides law enforcement, win-win alternatives for providing wastewater treatment plants should be considered. One popular option worth further study is a dedicated wastewater pipe alongside the Citarum that carries effluent to a pool of wastewater treatment plants. If this option can be implemented, the quality of the final industrial effluent can be kept within the permitted range.
As Table 6 shows, Total Coliform and Fecal Coliform are all above the allowable standards. This suggests that a massive program to provide adequate sanitation facilities is direly needed by the people who live alongside the Citarum and its tributaries. Besides sanitation, further study on how to better manage effluent from livestock and agriculture is also needed.

D. Challenges
The Environmental Agency is aware that the available data are not enough to analyze the quality of Citarum water, which covers an extensive area. One obstacle is the high cost of measuring BOD, compounded by the scarcity of certified personnel for this purpose. Coordination among different agencies also has considerable room for improvement. Telemetry and other innovative technologies are needed to help manage the Citarum's data.

IV. CONCLUSION
Indications of heavy contamination, with some variables far exceeding the standards recommended by the government, confirm the disastrous condition of the Citarum. Cluster Analysis categorizes the Citarum and its tributaries in Bandung Regency into three clusters. The first cluster (Cikapundung Hilir and Cipadaun Hilir) is marked by extreme Total Coliform values. Cluster 2 (Cilebak and Cikaro) is marked by very high TSS, while Cluster 3 (all other rivers/stations) exhibits values above the prescribed limits. There are two important underlying factors for this study area: the first is BOD, COD, and DO, and the second is Total P and Fecal Coliform.
The government can use these results to take appropriate actions and conduct further study based on the locations and conditions of the clusters, as discussed in this study. One important observation is the magnitude of the socio-cultural influence on the problems of the Citarum watershed. Habits and paradigms arising from economic, land, and knowledge limitations make the Citarum and its tributaries a convenient place to dispose of waste. A technical approach alone will not be able to solve the Citarum problem completely. A holistic approach promises more sustainable solutions than other approaches.