Comparative Analysis of Data Redundancy and Execution Time between Relational and Object-Oriented Schema Table

Database design is one of the important phases in designing software because the database is where the system's data is stored. One of the most popular techniques used in database design is the relational technique, which focuses on the entity relationship diagram and normalization. The relational technique is useful for eliminating data redundancy because normalization brings the schema tables into normal forms. The second technique is the object-oriented technique, which focuses on a class diagram from which the schema tables are generated. An advantage of the object-oriented technique is that it maps closely onto programming languages such as C++ or Java. This paper compares the performance of the relational and object-oriented techniques in terms of resolving data redundancy during the database design phase and in terms of query execution time. The experimental results, based on a course database case study, traced 186 redundant records using the relational technique and 204 redundant records using the object-oriented technique. The measured query execution times were 46.75 ms and 31.75 ms for the relational and object-oriented techniques, respectively.


I. INTRODUCTION
A database is a mechanism for storing data or information in an organized manner. When data are stored in a database, the information has to be easily accessible and optimized for searching, modification, and removal [1]. A good database design is imperative for two reasons. First, poor design results in unwanted data redundancy. Second, poor design generates errors that lead to bad decisions. A practical approach to database design is to focus on the principles and concepts that will result in effective performance.
At present, database design is supported by many methodologies and techniques that strive for a perfect design [2]. One of the most established methodologies is the relational data model concept introduced by Codd [3]. However, because the core of the model is a collection of tables, this design is prone to redundancy among the tables. Data redundancy is the term used to describe databases in which the same data field is stored more than once. It may occur, for a variety of reasons, when a field is repeated multiple times in a database.
Data redundancy is wasteful and inefficient. To solve the redundancy problem, the normalization technique has been widely used [1], [4]. Normalization is an important technique for the design of relational databases. During normalization, the functional dependencies in the tables are first determined, and the tables are then broken down to satisfy the normal forms. The benefits of normalization are as follows.
• It eliminates data redundancy.
• It eliminates insertion, modification and deletion anomalies.
• It saves storage space.
• It allows new tables to be added to the database and new rows to be added to a table without difficulty.
• It ensures data consistency.
• It ensures referential integrity.
Normalization is a technique for breaking down given relational schemas according to their functional dependencies and primary keys in an effort to decrease duplication. It aims to produce a set of relational tables with the least amount of information redundancy while facilitating correct insertion, deletion, and modification. At present, using an automated technique to do this data analysis is very time-consuming compared with doing it manually.
However, the main drawback of normalization is that the higher the normal form applied, the less vulnerable the design is to update anomalies, but the more tables are produced. Consequently, the efficiency of the database is affected, since the updating process becomes more complex, in addition to the added complexity in programming itself [5].
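The trade-off described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table names, column names, and sample rows are hypothetical and are not taken from the case study. A flat table that repeats the department name for every student is decomposed on the assumed functional dependency dept_code -> dept_name, which removes the repeated names but requires a join to read them back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical unnormalized table: dept_name is repeated for every student.
cur.execute(
    "CREATE TABLE student_flat ("
    "student_number TEXT PRIMARY KEY, name TEXT, dept_code TEXT, dept_name TEXT)"
)
cur.executemany(
    "INSERT INTO student_flat VALUES (?, ?, ?, ?)",
    [
        ("S1", "Aina", "CS", "Computer Science"),
        ("S2", "Badrul", "CS", "Computer Science"),
        ("S3", "Chen", "CS", "Computer Science"),
        ("S4", "Devi", "IT", "Information Technology"),
    ],
)

# Decompose on the assumed dependency dept_code -> dept_name.
cur.execute("CREATE TABLE department (dept_code TEXT PRIMARY KEY, dept_name TEXT)")
cur.execute(
    "CREATE TABLE student ("
    "student_number TEXT PRIMARY KEY, name TEXT, "
    "dept_code TEXT REFERENCES department(dept_code))"
)
cur.execute("INSERT INTO department SELECT DISTINCT dept_code, dept_name FROM student_flat")
cur.execute("INSERT INTO student SELECT student_number, name, dept_code FROM student_flat")


def stored_dept_names(table):
    """Count how many department-name values are physically stored in a table."""
    return cur.execute(f"SELECT COUNT(dept_name) FROM {table}").fetchone()[0]


flat_copies = stored_dept_names("student_flat")       # one name per student
normalized_copies = stored_dept_names("department")   # one name per department

# The cost of the decomposition: reading the name back now needs a join.
joined = cur.execute(
    "SELECT s.student_number, d.dept_name "
    "FROM student s JOIN department d ON d.dept_code = s.dept_code "
    "ORDER BY s.student_number"
).fetchall()

print(flat_copies, normalized_copies, len(joined))
```

In this toy data the flat table stores four department names and the normalized design stores two, while every query that needs the name must touch two tables instead of one.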
Another approach to solving data redundancy is to design a relational database system based on the object-oriented methodology. In the object-oriented approach, the database system is created from the schema tables generated from the class diagrams. The rules applied adhere to the object-oriented concept, which is based on the relationships among the classes, multiplicity, attribute names, class names, data types and the behaviours of the classes [6].
The objective of this paper is to analyze the redundancy problem by examining two approaches to database design: the relational technique and the object-oriented technique. The analysis measures the total data redundancy using Structured Query Language (SQL) queries. The important concepts considered are entities, relationships, and attributes, as well as the data schema used when writing the SQL queries. Aside from measuring the total data redundancy, the experiment also measures the query execution time in milliseconds.
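As a rough illustration of generating a schema table from a class, the sketch below maps the class name to a table name and each attribute and its data type to a column. The Student class, the SQL_TYPES mapping, and the class_to_create_table helper are hypothetical names introduced only for this example; the actual mapping rules in [6] also cover relationships and multiplicity, which are omitted here.

```python
from dataclasses import dataclass, fields

# Assumed mapping from Python attribute types to SQL column types.
SQL_TYPES = {str: "VARCHAR(50)", int: "INTEGER", float: "REAL"}


@dataclass
class Student:
    """Hypothetical class standing in for one class of a class diagram."""
    student_number: str
    name: str
    major: str


def class_to_create_table(cls, primary_key):
    """Build a CREATE TABLE statement from a dataclass: class name -> table,
    attribute name and type -> column name and SQL type."""
    columns = [f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls)]
    columns.append(f"PRIMARY KEY ({primary_key})")
    return f"CREATE TABLE {cls.__name__} ({', '.join(columns)});"


ddl = class_to_create_table(Student, "student_number")
print(ddl)
```

Associations between classes would have to be translated into foreign keys or link tables by additional rules, which this sketch does not attempt.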

II. MATERIAL AND METHOD
In order to compare the performance of the relational and object-oriented techniques in terms of resolving data redundancy, an undergraduate database course at Universiti Tun Hussein Onn Malaysia (UTHM) was chosen as the case study. Fig. 1 shows the process of applying the relational technique and the object-oriented technique in designing the database for the course.
As the figure shows, two different sets of schema tables are generated: one from the Entity Relationship Diagram (ERD) in the relational approach and one from the class diagrams in the object-oriented approach. The schema tables from the relational approach undergo a further process, normalization, which brings the tables up to third normal form. The total data redundancy in both sets of schema tables is then calculated and analyzed. Next, a user-friendly window [7] is used to measure the query execution times for both the relational and object-oriented database designs.

A. Relational Technique
In the relational technique, the database was designed based on the Entity Relationship Diagram (ERD), and schema tables were generated from this diagram. Once the schema tables were ready, they were implemented in a physical database system. The steps to produce the database are as follows.
[Figure residue: only fragments of the CREATE TABLE statements for the relational schema tables survive, e.g. "... (10), Class varchar (10), Major varchar (7), Primary key (Student_number));"]

B. Object-oriented Technique
The object-oriented technique began by translating the ER diagram into a class diagram. The Visual Paradigm program [10] supports generating class diagrams from existing ER diagrams. This program mapped the entities and relationships into the corresponding classes and associations. Each class was then translated into a schema table.
[Figure residue: only fragments of the CREATE TABLE statements for the object-oriented schema tables survive, e.g. "... (10), Class varchar (10), Major varchar (7), Primary key (Student_number));"]

The comparative analysis measured two items: the amount of data redundancy produced by each technique and the query execution time for both database designs.

A. Data Redundancy
The total data redundancy in the attributes of both the relational and object-oriented schema tables was calculated using the SQL query shown in Fig. 8. Table 1 shows the total data redundancy resulting from the SQL query for both techniques. According to the table, the database design using the object-oriented technique produced higher data redundancy in the Department attribute. This is because the tables in the Database Course were derived from the class diagram [10]. In the object-oriented technique, the class diagram represents the entities in the system requirements.
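The query in Fig. 8 is not reproduced in this text, so the following is only a sketch of one plausible way to count redundancy per attribute, under the assumption that a value is redundant each time it repeats an earlier value in the same column (total non-null values minus distinct values). The count_redundant helper, the course table, and the sample rows are hypothetical.

```python
import sqlite3


def count_redundant(conn, table, column):
    """Return the number of repeated values in a column, assuming
    redundancy = COUNT(column) - COUNT(DISTINCT column).
    Table and column names are interpolated, so pass trusted identifiers only."""
    query = f"SELECT COUNT({column}) - COUNT(DISTINCT {column}) FROM {table}"
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_code TEXT PRIMARY KEY, department TEXT)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?)",
    [
        ("BIT101", "CS"),
        ("BIT102", "CS"),
        ("BIT103", "CS"),
        ("BIE201", "EE"),
        ("BIE202", "EE"),
    ],
)

redundant_department = count_redundant(conn, "course", "department")
redundant_code = count_redundant(conn, "course", "course_code")
print(redundant_department, redundant_code)
```

Summing such counts over every attribute of every schema table would give a total comparable in spirit to Table 1, although the paper's own counting rule may differ.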

B. Query Execution Times
The second objective of this paper is to compare query execution times between the two databases. To measure the execution time, a user-friendly window written in C# was used to calculate the query duration for four queries on each database in the case study. Fig. 9 shows the user-friendly window, which calculates and displays the query execution time for both the relational and object-oriented techniques.
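The measurement itself can be sketched as follows. The paper's tool was written in C#; this Python version is only an illustration of the same idea, timing a query with a monotonic clock and averaging over several runs. The time_query_ms helper, the student table, and the repeat count are assumptions made for this example.

```python
import sqlite3
import time


def time_query_ms(conn, sql, repeats=5):
    """Run a query several times and return the mean wall-clock time in ms.
    fetchall() is included so that retrieving the rows is part of the timing."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        timings.append((time.perf_counter() - start) * 1000.0)
    return sum(timings) / len(timings)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number TEXT PRIMARY KEY, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(f"S{i}", "CS") for i in range(1000)],
)

elapsed_ms = time_query_ms(conn, "SELECT * FROM student")
print(f"average: {elapsed_ms:.3f} ms")
```

Absolute timings depend on the machine, the database engine, and caching, so only comparisons made under the same conditions are meaningful.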
Next, the query execution times obtained from running the user-friendly window are shown for four queries using the object-oriented database. The queries were student, marks, course, and requisite. The results of running these queries are shown in Table 2, where the query execution times are measured in milliseconds (ms). Finally, Fig. 10 illustrates the average execution times for all queries in the relational and object-oriented databases resulting from the case study.

The core benefit of adopting a common database in a big data system is reduced data redundancy, which in turn reduces the man-hours needed to maintain the database and maximizes database throughput. The basic idea is to reduce the number of associations between objects and to promote object reuse so that database performance is optimized. This, in turn, has a positive effect on the size of the storage space. The important objective is to reduce data redundancy across databases. This paper focused on considering allotments of the relational and identical classes that share relations. The object-oriented approach detects redundancy more easily and efficiently by using comparable classes.
The data redundancy problem in database design was analysed in this paper. The comparative results showed that the relational database design technique achieved a bigger reduction in data redundancy than the object-oriented technique. This is because the relational technique avoids the problem by applying the normalization rules. In the case study using the database course, the relational technique produced only 186 redundant records as opposed to 204 redundant records from the object-oriented technique. However, in terms of query execution time when performing the SQL queries via the user-friendly window, the object-oriented database produced faster results of 31.75 ms as compared to 46.75 ms for the relational database.
In conclusion, both techniques are accepted in database design and are applied in current system development worldwide. However, the database designer must really understand when each technique should be used. For example, for a big database that involves many tables, it is not recommended to apply the relational technique.
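The size of the two differences can be worked out directly from the figures reported above. The short calculation below uses only the four reported numbers; the variable names are introduced here for illustration, and the percentages are derived values, not results stated in the paper.

```python
# Figures reported in the case study.
relational_redundant, oo_redundant = 186, 204
relational_ms, oo_ms = 46.75, 31.75

# Extra redundancy of the object-oriented design, relative to the relational one.
extra_redundant = oo_redundant - relational_redundant
extra_redundant_pct = 100 * extra_redundant / relational_redundant

# Time saved by the object-oriented design, relative to the relational one.
time_saved_ms = relational_ms - oo_ms
time_saved_pct = 100 * time_saved_ms / relational_ms

print(extra_redundant, round(extra_redundant_pct, 2))  # 18 9.68
print(time_saved_ms, round(time_saved_pct, 2))         # 15.0 32.09
```

On these numbers, the object-oriented design carries roughly a tenth more redundant records while answering the measured queries in about a third less time.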