Comparison of Support Vector Machine and Kernel-Based Fuzzy Possibilistic C-Means for Knee Osteoarthritis Data Classification

Osteoarthritis is a chronic joint disease that occurs when the protective cartilage cushioning the ends of the bones wears down over time and fails to be repaired. Although it can affect any part of the body with joints, such as the hands, ankles, hips, and spine, its most common form is knee osteoarthritis. The major cause of knee osteoarthritis is the continuous depletion of the knee cartilage. Because early detection is necessary for proper treatment, machine learning can be used to support diagnosis. This study therefore compares a classification method, Support Vector Machine (SVM), with fuzzy clustering methods, namely Fuzzy C-Means (FCM), Fuzzy Possibilistic C-Means (FPCM), and Fuzzy Possibilistic C-Means based on kernel (FPCMK), for the analysis of knee osteoarthritis data. SVM is a machine learning technique based on the principle of structural risk minimization (SRM) that seeks the best hyperplane separating two or more classes in input space. Fuzzy clustering, in contrast, groups objects by measuring the distance, and hence the similarity, between each observed object and the cluster centers. FPCMK applies the Radial Basis Function (RBF) kernel within the fuzzy clustering method; the kernel function is suited to data that are not linearly separable. The methods are compared on accuracy, recall, precision, and F1 score. SVM achieves the highest accuracy, 86.7%, followed by FPCMK with 85.5%.


I. INTRODUCTION
Osteoarthritis is a chronic joint disease of the cartilage that often occurs in older adults [1]. It arises from the inability of the joint to repair its own damage and is also known as a slow degenerative joint process [1]. Risk factors include being overweight, ageing, joint injuries, defects of the joint cartilage, and activities that damage the joints [2]. The disease can occur in any part of the body that has joints, such as the hands, ankles, knees, hips, and spine, and knee osteoarthritis is its most common form [2]. In knee osteoarthritis the cartilage is depleted continuously until it is gone, so that the bones rub against each other and cysts form at the edges of the bone [3]. Cases are widespread in Japan, America, Vietnam, and Indonesia [3].
Therefore, early prevention of knee osteoarthritis is needed. Clustering and classification with machine learning methods can support this by helping patients receive proper and early treatment. The clustering methods considered in this paper are Fuzzy C-Means (FCM), Fuzzy Possibilistic C-Means (FPCM), and Fuzzy Possibilistic C-Means based on Kernel (FPCMK), while the classification method is Support Vector Machine (SVM).
Several previous studies have examined FCM, FPCM, FPCMK, and SVM. A comparison of fuzzy clustering algorithms for feature extraction in vineyards [4] found FCM to be the best technique in terms of speed when compared with PCM, FPCM, and Robust Fuzzy Possibilistic C-Means (RFCM). Kernel-based fuzzy and possibilistic c-means clustering was proposed in [5]; the results show that Kernel Fuzzy C-Means (KFCM) and Kernel Possibilistic C-Means (KPCM) are more robust than FCM and PCM when the data contain outliers.
A hybrid of fuzzy c-means and fuzzy particle swarm optimization was proposed for fuzzy clustering problems in [6], where the combination of FCM and Fuzzy Particle Swarm Optimization (FPSO) was found to be more efficient than either FCM or FPSO alone. A complete explanation of Support Vector Machines (SVM) for the classification of uncertain data is given in [7]. SVM uses kernel configurations to produce better classification results; in that work, breast cancer data were used with four SVM kernels: linear, polynomial, sigmoid, and radial.

A. Fuzzy C-Means Clustering
In 1981, Jim Bezdek introduced Fuzzy C-Means (FCM), a clustering technique in which the assignment of each data point to a cluster is described by a degree of membership. The basic idea of FCM is to determine cluster centers that mark the average location of each cluster [8]. Every data point has a degree of membership in every cluster, determined by the distance between the data point and the cluster center. Initially, the cluster centers and the degrees of membership are not accurate, so both are corrected repeatedly until the centers settle in the right location [8]. The output of FCM is not a fuzzy inference system but a set of cluster centers and the degrees of membership of each data point.
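The FCM algorithm partitions a finite set of data into clusters according to a given criterion. The iterative improvement is based on minimizing the objective function [8]

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, \|x_k - v_i\|^2$$

with the constraint function

$$\sum_{i=1}^{c} u_{ik} = 1, \quad \forall k \in \{1, 2, \dots, n\},$$

where $n$ is the amount of data, $c$ is the number of clusters, $v_i$ is the center of cluster $i$, $u_{ik}$ is the membership function, $X = \{x_1, \dots, x_n\}$ is the data to be clustered, $m$ is the fuzzy degree ($m > 1$), and $\|x_k - v_i\|$ is the distance between data point $x_k$ and cluster center $v_i$. The value of the degree of membership in the FCM method is

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\|x_k - v_i\|}{\|x_k - v_j\|} \right)^{2/(m-1)} \right]^{-1}.$$

The updated center of cluster $i$ is

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}.$$

The iteration termination criterion is

$$\Delta = \|V^{t} - V^{t-1}\| < \varepsilon,$$

where $V^{t}$ is the set of cluster centers in iteration $t$ and $V^{t-1}$ is the set of cluster centers in the previous iteration.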
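The alternating membership and center updates described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `fcm`, the random initialization, the seed, the default tolerance, and the small floor on distances are all choices made for the example.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy C-Means sketch. Returns (V, U): centers (c, d), memberships (c, n)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumed initialization: random memberships, each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    V = np.zeros((c, X.shape[1]))
    for _ in range(max_iter):
        V_old = V.copy()
        Um = U ** m
        # Center update: membership-weighted mean of the data.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # d2[i, k] = ||x_k - v_i||^2, floored to avoid division by zero.
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update: u_ik is proportional to d2_ik^(-1/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        # Termination: change in the cluster centers below the tolerance.
        if np.linalg.norm(V - V_old) < eps:
            break
    return V, U
```

Calling `V, U = fcm(X, c=3)` on an `(n, 4)` array of the four cartilage-thickness features would return three centers and a `(3, n)` membership matrix; taking `U.argmax(axis=0)` gives a hard label per data point.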

B. Fuzzy Possibilistic C-Means (FPCM)
Fuzzy Possibilistic C-Means (FPCM) is an algorithm developed from the Fuzzy C-Means and Possibilistic C-Means methods.
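In the FCM algorithm, the degree of membership $u_{ik}$ is affected by all the data to be clustered and by all cluster centers [9]. In the Possibilistic C-Means (PCM) algorithm, by contrast, the typicality value $t_{ik}$ is affected by all the data to be clustered and by a single cluster center [9]. The objective function of FPCM is as follows [9]:

$$J_{m,\eta}(U, T, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} \left( u_{ik}^{m} + t_{ik}^{\eta} \right) \|x_k - v_i\|^2,$$

where $n$ is the number of sample data, $c$ is the number of clusters, $m$ is the fuzzy degree, $\eta$ is the possibilistic degree, $x_k$ is the $k$-th data point, $v_i$ is the center of cluster $i$, $u_{ik}$ is the membership value in cluster $i$, and $t_{ik}$ is the typicality value in cluster $i$.

The value of the degree of membership in the FPCM method is

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\|x_k - v_i\|}{\|x_k - v_j\|} \right)^{2/(m-1)} \right]^{-1}.$$

The value of the typicality in the FPCM method is

$$t_{ik} = \left[ \sum_{j=1}^{n} \left( \frac{\|x_k - v_i\|}{\|x_j - v_i\|} \right)^{2/(\eta-1)} \right]^{-1}.$$

The updated center of cluster $i$ is

$$v_i = \frac{\sum_{k=1}^{n} \left( u_{ik}^{m} + t_{ik}^{\eta} \right) x_k}{\sum_{k=1}^{n} \left( u_{ik}^{m} + t_{ik}^{\eta} \right)}.$$

The iteration termination criterion is

$$\Delta = \|V^{t} - V^{t-1}\| < \varepsilon,$$

where $V^{t}$ is the set of cluster centers in iteration $t$ and $V^{t-1}$ is the set of cluster centers in the previous iteration.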
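A single FPCM iteration can be sketched as follows. The helper name `fpcm_step`, the default exponents, and the distance floor are assumptions for illustration; the sketch follows the standard FPCM updates, with memberships normalized over clusters and typicalities normalized over data points, and is not taken from the study's code.

```python
import numpy as np

def fpcm_step(X, V, m=2.0, eta=2.0, floor=1e-12):
    """One FPCM iteration from centers V (c, d). Returns (U, T, V_new)."""
    # d2[i, k] = ||x_k - v_i||^2, floored to avoid division by zero.
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, floor)
    # Membership: for each data point k, normalize over the clusters i.
    inv_u = d2 ** (-1.0 / (m - 1.0))
    U = inv_u / inv_u.sum(axis=0, keepdims=True)
    # Typicality: for each cluster i, normalize over the data points k.
    inv_t = d2 ** (-1.0 / (eta - 1.0))
    T = inv_t / inv_t.sum(axis=1, keepdims=True)
    # Center update weighted by u_ik^m + t_ik^eta.
    W = U ** m + T ** eta
    V_new = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, T, V_new
```

Repeating `fpcm_step` until the change in the centers falls below a tolerance gives the full algorithm. The two normalizations are the only structural difference from FCM: columns of `U` sum to one, while rows of `T` sum to one.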

C. Fuzzy Possibilistic C-Means Based on the Kernel (FPCMK)
Fuzzy Possibilistic C-Means based on the Kernel (FPCMK) generalizes the Fuzzy Possibilistic C-Means method by using the Radial Basis Function (RBF) kernel within FPCM [10]. The kernel function allows a method designed for linear problems to be applied to nonlinear problems through a function $\phi$, which is a nonlinear mapping from the input space to the feature space [10].
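With a kernel $K(x, y) = \phi(x)^{\top} \phi(y)$, where $\phi$ is the nonlinear mapping to the feature space, the kernel distance is defined as follows:

$$\|\phi(x_k) - \phi(v_i)\|^2 = K(x_k, x_k) - 2K(x_k, v_i) + K(v_i, v_i),$$

which reduces to $2\,(1 - K(x_k, v_i))$ for the RBF kernel because $K(x, x) = 1$. The FPCMK objective function replaces the Euclidean distance in the FPCM objective with this kernel distance,

$$J_{m,\eta}(U, T, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} \left( u_{ik}^{m} + t_{ik}^{\eta} \right) \|\phi(x_k) - \phi(v_i)\|^2,$$

with the constraints

$$\sum_{i=1}^{c} u_{ik} = 1, \ \forall k \in \{1, 2, \dots, n\}, \qquad \sum_{k=1}^{n} t_{ik} = 1, \ \forall i \in \{1, 2, \dots, c\},$$

where $n$ is the number of sample data, $c$ is the number of clusters, $m$ is the fuzzy degree, $\eta$ is the possibilistic degree, $x_k$ is the $k$-th data point, $v_i$ is the center of cluster $i$, $u_{ik}$ is the membership value in cluster $i$, and $t_{ik}$ is the typicality value in cluster $i$.

For the RBF kernel, minimizing this objective gives the degree of membership in the FPCMK method as

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{1 - K(x_k, v_i)}{1 - K(x_k, v_j)} \right)^{1/(m-1)} \right]^{-1},$$

and, analogously, the typicality as

$$t_{ik} = \left[ \sum_{j=1}^{n} \left( \frac{1 - K(x_k, v_i)}{1 - K(x_j, v_i)} \right)^{1/(\eta-1)} \right]^{-1}.$$

The updated center of cluster $i$ is

$$v_i = \frac{\sum_{k=1}^{n} \left( u_{ik}^{m} + t_{ik}^{\eta} \right) K(x_k, v_i) \, x_k}{\sum_{k=1}^{n} \left( u_{ik}^{m} + t_{ik}^{\eta} \right) K(x_k, v_i)}.$$

The iteration termination criterion is

$$\Delta = \|V^{t} - V^{t-1}\| < \varepsilon,$$

where $V^{t}$ is the set of cluster centers in iteration $t$ and $V^{t-1}$ is the set of cluster centers in the previous iteration.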
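To make the kernelized variant concrete, the sketch below replaces the Euclidean distance with the feature-space distance induced by an RBF kernel. Several things here are assumptions rather than details from the paper: the kernel is parameterized as exp(-||x - v||^2 / sigma^2) (other scalings are common), and the function names `rbf_kernel` and `fpcmk_step`, the default `sigma`, and the numerical floor are hypothetical.

```python
import numpy as np

def rbf_kernel(X, V, sigma=1.0):
    """K[i, k] = exp(-||x_k - v_i||^2 / sigma^2). The sigma^2 scaling is assumed."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma ** 2)

def fpcmk_step(X, V, m=2.0, eta=2.0, sigma=1.0, floor=1e-12):
    """One kernel FPCM iteration with an RBF kernel. Returns (U, T, V_new)."""
    K = rbf_kernel(X, V, sigma)
    # Feature-space squared distance; K(x, x) = 1 for the RBF kernel.
    dk = np.maximum(2.0 * (1.0 - K), floor)
    # Membership: normalize over clusters for each data point.
    inv_u = dk ** (-1.0 / (m - 1.0))
    U = inv_u / inv_u.sum(axis=0, keepdims=True)
    # Typicality: normalize over data points for each cluster.
    inv_t = dk ** (-1.0 / (eta - 1.0))
    T = inv_t / inv_t.sum(axis=1, keepdims=True)
    # Center update: weights are additionally scaled by the kernel value.
    W = (U ** m + T ** eta) * K
    V_new = (W @ X) / np.maximum(W.sum(axis=1, keepdims=True), floor)
    return U, T, V_new
```

Because the kernel value decays with distance, far-away points contribute almost nothing to a center update, which is one intuition for why kernel variants are reported to be less sensitive to outliers. The choice of `sigma` matters in practice: if it is much smaller than the typical distance between points and centers, the kernel values underflow toward zero and the updates become uninformative.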

D. Support Vector Machine (SVM)
Support Vector Machine (SVM) is a machine learning technique that was first introduced by Vapnik in 1992. SVM works on the principle of structural risk minimization (SRM) to obtain the best hyperplane separating two or more classes in input space [11]-[13]. The best hyperplane separating two classes is found by calculating the margin of the hyperplane and looking for its maximum [11]-[13]. The margin is the distance between the hyperplane and the closest data point of each class. The data points closest to the hyperplane are called the support vectors [11]-[13].
Suppose the dataset is denoted by $x_i \in \mathbb{R}^{d}$ with labels $y_i \in \{-1, +1\}$ for $i = 1, 2, \dots, n$, where $n$ is the amount of data. If the two classes are assumed to be completely separable, the hyperplane between class $-1$ and class $+1$ is

$$w \cdot x + b = 0,$$

where $x$ is a point on the hyperplane, $w$ is the normal vector of the hyperplane, and $b$ is the bias. The optimal hyperplane is obtained by finding the weight parameter $w$ and the bias parameter $b$ in the function below, which is known as the decision function [12,13]:

$$f(x) = \operatorname{sign}(w \cdot x + b).$$

The best hyperplane is the one with the largest margin. The largest margin is found by maximizing the distance between the hyperplane and the closest point of each class [12]. With $w$ and $b$ scaled so that the closest points satisfy $|w \cdot x_i + b| = 1$, the data satisfy

$$w \cdot x_i + b \ge +1 \ \text{for } y_i = +1, \qquad w \cdot x_i + b \le -1 \ \text{for } y_i = -1.$$

The margin between these two boundary hyperplanes is $2 / \|w\|$, so maximizing the margin is equivalent to minimizing $\|w\|$, which is written as the primal optimization problem below [11]-[13]:

$$\min_{w, b} \ \frac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_i \, (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \dots, n.$$
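The relationship between the hyperplane parameters, the separation constraints, and the margin can be checked numerically on a toy example. The sketch below does not train an SVM; it only evaluates a hand-chosen hyperplane, and the function names and the four toy points are hypothetical.

```python
import math

def score(w, b, x):
    """Signed value w.x + b for a point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def decision(w, b, x):
    """Predicted class label: +1 if w.x + b >= 0, otherwise -1."""
    return 1 if score(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance 2/||w|| between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def is_feasible(w, b, points, labels, tol=1e-12):
    """True if every point satisfies the hard-margin constraint y (w.x + b) >= 1."""
    return all(y * score(w, b, x) >= 1 - tol for x, y in zip(points, labels))

# Toy data: class -1 on the line x1 = 1, class +1 on the line x1 = 3.
points = [(1.0, 0.0), (1.0, 2.0), (3.0, 0.0), (3.0, 2.0)]
labels = [-1, -1, 1, 1]
w, b = (1.0, 0.0), -2.0  # hyperplane x1 = 2, halfway between the classes
print(is_feasible(w, b, points, labels), margin_width(w))
```

Here the hyperplane x1 = 2 satisfies all four constraints with equality, so every point is a support vector and the margin equals the gap of 2 between the two classes. Scaling `w` and `b` down by half describes the same hyperplane but violates the constraints, which is why the primal problem pairs the minimization of the norm with the constraint set.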

A. Data
The study used data on 141 knee osteoarthritis patients who were examined at Dr. Cipto Mangunkusumo Hospital (RSCM). The data were divided into three classes based on the grade of osteoarthritis. The Kellgren and Lawrence system uses five grades to assess the severity of knee osteoarthritis, as shown in Table I.

Table I. Grades of knee osteoarthritis severity
Grade 2: Definite osteophytes and possible narrowing of the joint space on the weight-bearing side.
Grade 3: Multiple osteophytes, definite narrowing of the joint space, sclerosis, and possible bone deformity.
Grade 4: Large osteophytes, narrowing of the joint space, severe sclerosis, and definite bone deformity.
Based on this grading, the patients are divided into three classes: class 0 for patients with grade 1 (22 patients), class 1 for patients with grade 2 (81 patients), and class 2 for patients with grade 3 (38 patients).
The data have four features that influence the severity of knee osteoarthritis; they are described in Table II.

Table II. Features of the knee osteoarthritis data
X1: The average thickness of cartilage in the medial femur
X2: The average thickness of cartilage in the lateral femur
X3: The average thickness of cartilage in the medial tibia
X4: The average thickness of cartilage in the lateral tibia

B. Results and Analysis
In this study, the knee osteoarthritis data were split into 70% training data and 30% testing data, and 5-fold cross-validation, which divides the dataset into five parts, was used for validation. The performance of each method is evaluated by accuracy, recall, precision, and F1 score; the greater the value, the better the method performs in clustering and classification. Table III shows the formulas for calculating these values. This study compares FCM, FPCM, FPCMK, and SVM in the classification of knee osteoarthritis. Tables IV, V, VI, and VII and Figure 1 show the performance of each method in terms of accuracy, recall, precision, and F1 score.

Based on Table IV, the SVM method gives the best accuracy, 86.7%, and the FCM method gives the lowest, 79.5%.

From Table VII, the best F1 score is 87.65%, obtained with the SVM method, and the lowest is 79.4%, obtained with the FPCM method.

Based on Tables IV, V, VI, and VII and Figure 1, SVM outperforms FCM, FPCM, and FPCMK on accuracy, recall, precision, and F1 score, with values of 86.7%, 90.7%, 84.8%, and 87.65%, respectively. Among the fuzzy clustering methods, FPCMK outperforms FCM and FPCM, with values of 85.5%, 88.1%, 84.1%, and 86.05%, respectively.
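The four evaluation measures can be computed from confusion counts as in the sketch below. It assumes the usual binary definitions (true and false positives and negatives); how the study aggregates these over its three classes is not stated in the text above, so the function names and the aggregation-free form are assumptions. The last lines recompute F1 from the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported precision and recall: SVM (84.8%, 90.7%), FPCMK (84.1%, 88.1%).
svm_f1 = f1_from(0.848, 0.907)
fpcmk_f1 = f1_from(0.841, 0.881)
print(f"SVM F1 = {svm_f1:.2%}, FPCMK F1 = {fpcmk_f1:.2%}")
# prints "SVM F1 = 87.65%, FPCMK F1 = 86.05%"
```

The recomputed values agree with the reported F1 scores of 87.65% and 86.05%, which indicates that the reported F1 is the harmonic mean of the reported precision and recall.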

IV. CONCLUSIONS
This study applied a classification method, Support Vector Machine (SVM), and fuzzy clustering methods, namely Fuzzy C-Means (FCM), Fuzzy Possibilistic C-Means (FPCM), and Fuzzy Possibilistic C-Means based on the Kernel (FPCMK), to the early classification of knee osteoarthritis patients so that they can obtain the best treatment. The four methods were compared, and SVM performed best in classifying knee osteoarthritis. Among the fuzzy clustering methods, adding the RBF kernel function to FPCM is preferable. Performance was measured by accuracy, recall, precision, and F1 score.