Metadata Schema for Traditional Knowledge

Abstract: Approximately four hundred indigenous communities in Indonesia rely on their traditional knowledge to support their daily lives. Because this knowledge offers many benefits, many stakeholders have started to collect it and record it in digital reports. However, these digital reports are documented in differing metadata formats because no specific metadata schema exists for describing digital data on traditional knowledge. These differences complicate the documentation, management, and dissemination of traditional knowledge. To overcome this problem, this work designs a specific metadata schema for the traditional knowledge domain using five metadata development methods: domain analysis, derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis. These methods were selected based on a literature review of research articles on metadata development. As a result, this paper proposes a metadata schema for traditional knowledge consisting of 37 metadata elements categorized into 6 metadata sections: supporting data, material, supporting tool, success story, knowledge source, and knowledge engineer.


I. INTRODUCTION
Today, traditional knowledge as cultural heritage is important not only to indigenous people but also to urban people, who use it to support their daily lives. For example, traditional knowledge for health is still used by rural populations in developing countries [1], [2]. However, traditional knowledge is disappearing rapidly and remains largely undocumented because it is held by elders and access is restricted to members of the indigenous community itself.
In Indonesia, many stakeholders have begun to preserve traditional knowledge [3] through digital documentation to prevent it from disappearing [4]-[6]. However, the digital data are still documented in different metadata schemas because no specific metadata schema exists for traditional knowledge. Differing metadata formats hinder the reuse, management, and dissemination of traditional knowledge data [7], [8].
Previous research has proposed metadata schemas for traditional knowledge. Examples include the metadata proposed by [9], which was applied to the Chinese Medicine Digital Library (CMDL), and the traditional Chinese medicine literature metadata (TCMLM) developed by [10]. However, metadata cannot be a 'one size fits all' resource because it depends on the purpose of its development [11].
Furthermore, a good metadata schema should cover the needs of all parties that will use it, such as end users and connected systems. According to [12], a metadata schema can be developed through several methods, such as domain analysis, derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis.
Given this background, this research designs a specific metadata schema for the traditional knowledge domain by employing metadata development methods. This paper is organized into introduction, literature review, methodology, result and discussion, conclusion, acknowledgment, and references.

A. Metadata Schema
A metadata schema is commonly defined as a specification for representing metadata elements in order to present the information structure of data or a dataset [13], [14]. The information structure consists of elements or attribute values that describe the data, such as the data owner, data format, and so forth. The main aim of metadata schema development is to make data easier to understand, organize, and store [15], [16]. White (2005) [15] identifies four categories of metadata: structural, content, descriptive, and administrative. Structural metadata describes the information architecture of the document, for example, title, summary, and image. Content metadata provides information about the subjects associated with a particular document. Descriptive metadata presents information that eases searching for a document based on its format. Finally, administrative metadata delivers detailed information such as the date created or modified and the owner of the document.
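As a minimal sketch of the four categories described above, the following Python snippet groups the attributes of a single document record by category. The class names, the field names (`title`, `subjects`, `file_format`, and so on), and the sample values are illustrative assumptions; they are not elements defined by [15].

```python
from dataclasses import asdict, dataclass, field


@dataclass
class StructuralMetadata:
    # Information architecture of the document (hypothetical fields).
    title: str
    summary: str = ""
    images: list = field(default_factory=list)


@dataclass
class ContentMetadata:
    # Subjects associated with the document (hypothetical field).
    subjects: list = field(default_factory=list)


@dataclass
class DescriptiveMetadata:
    # Format information that eases searching (hypothetical field).
    file_format: str = ""


@dataclass
class AdministrativeMetadata:
    # Dates and ownership (hypothetical fields).
    created: str = ""
    modified: str = ""
    owner: str = ""


@dataclass
class DocumentMetadata:
    structural: StructuralMetadata
    content: ContentMetadata
    descriptive: DescriptiveMetadata
    administrative: AdministrativeMetadata


def flatten(record: DocumentMetadata) -> dict:
    """Return 'category.field' -> value pairs for a whole record."""
    flat = {}
    for category, fields in asdict(record).items():
        for name, value in fields.items():
            flat[f"{category}.{name}"] = value
    return flat


# Sample values are invented for illustration only.
example = DocumentMetadata(
    structural=StructuralMetadata(title="Herbal remedy report"),
    content=ContentMetadata(subjects=["medicine"]),
    descriptive=DescriptiveMetadata(file_format="PDF"),
    administrative=AdministrativeMetadata(created="2017-01-01", owner="Example Institute"),
)
```

Calling `flatten(example)` yields keys such as `structural.title` and `administrative.owner`, which makes the category of every attribute explicit.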

B. Traditional Knowledge
Traditional knowledge as intangible cultural heritage is commonly defined as local, intangible, and unique knowledge based on experience that originates from a particular local community and is passed on through oral tradition [18]. Examples of traditional knowledge include knowledge of how indigenous people manage the ecological relations between society and nature and knowledge of how to adapt to environmental or social changes [19]. According to [20], traditional knowledge is categorized into eight fields: beliefs, medicine, knowledge technology, education, communication, agriculture, food technology, and arts and crafts.
C. Method of Metadata Development
1) Domain Analysis: is used as a preliminary study for metadata development. It is adapted from the field of Information Science [21] and works by interpreting the domain of the information resources and defining its scope [22]. The particular aim of the method is to identify the domain of the text resource, the type of text resource, the end-user groups, and their activities in using the text resource [23].
2) Derivation Analysis: is a method for deriving metadata elements by reviewing related existing metadata standards or schemas [24]. For instance, the MARC-XML metadata is the result of derivation analysis from the MARC metadata.
3) System-Centric Analysis: is a method for deriving metadata elements by reviewing the information architecture of an existing system related to the proposed metadata [8]. An example of this method is the development of the metadata schema of the phyknome project [25].
4) User-Centric Analysis: is a method for identifying metadata elements by reviewing the information that end users need [26] to achieve their specific purposes [27]. An example of user-centric analysis in metadata development is the research by [28], which observed the information needs of health practitioners in the library of a large pharmaceutical company. As a result of the user-centric analysis, they found that drugs, diseases, genes, companies, methods, authors, geographic regions, and drug sales were new elements of the metadata schema for the library of a pharmaceutical company.
5) Resource-Centric Analysis: is a method for identifying metadata elements based on the information available in the resource that the metadata schema will describe [7]. An example of resource-centric analysis in metadata development is presented in Chao (2015) [27]. As a result of the resource-centric analysis, Chao (2015) found nine mandatory metadata elements for soil science journal articles: method descriptive name, method type/sub-category, brief method summary, method number/identifier, method source, source citation, media name, method official name, and instrumentation.
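Of the five methods above, derivation analysis lends itself most directly to a small example. The sketch below reviews the element sets of existing schemas and reports which schemas define each element. It is only one possible way to support such a review, not the procedure used in [24]. `DUBLIN_CORE` lists the 15 elements of the Dublin Core Metadata Element Set, whereas `HYPOTHETICAL_HERITAGE_SCHEMA`, the function name `derive_candidates`, and the `min_support` threshold are assumptions introduced for illustration.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical second schema, invented purely for illustration.
HYPOTHETICAL_HERITAGE_SCHEMA = {
    "title", "creator", "description", "date",
    "location", "community", "rights",
}


def derive_candidates(schemas: dict, min_support: int = 1) -> dict:
    """Map each element to the sorted names of the schemas that define it.

    Only elements defined by at least `min_support` schemas are kept.
    """
    support = {}
    for schema_name, elements in schemas.items():
        for element in elements:
            support.setdefault(element, []).append(schema_name)
    return {
        element: sorted(names)
        for element, names in sorted(support.items())
        if len(names) >= min_support
    }


existing = {
    "dublin_core": DUBLIN_CORE,
    "hypothetical_heritage": HYPOTHETICAL_HERITAGE_SCHEMA,
}
shared = derive_candidates(existing, min_support=2)
```

Elements shared by several schemas are natural candidates to reuse, while elements defined by a single schema still need a manual decision by the schema designer.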

D. Related Work
In recent years, a growing number of metadata schemas for traditional knowledge have been developed for different purposes using various methods, because metadata cannot be 'one size fits all' [11]. In 2000, Yang and Chan [9] proposed a metadata schema for traditional knowledge of Chinese medicine, which was applied to the Chinese Medicine Digital Library (CMDL). Yang and Chan divided their metadata schema into three sections: herbs, proprietary, and recipes [9]. The schema was developed using a user-centric method, by asking end users about their information needs, and a system-centric method, by reviewing the information architecture of a previous Chinese medicine system. Eleven years later, [30] proposed a metadata schema for cultural heritage documentation consisting of the following metadata sections: people, restoration, management, basic, history, building, and publishing.
In 2012, [31] proposed metadata for the documentation of archaeological assets (objects, ancient buildings, and archaeological sites) at the Archaeology Research Centre (STARC). The metadata schema was developed using derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis, which defined the metadata elements and grouped them into project information, cultural heritage asset, digital resource provenance, and activities.
In 2013, [32] proposed metadata for Chinese intangible cultural heritage, developed using derivation analysis of Dublin Core and related metadata schemas and user-centric analysis of end-user (government) needs. The result of [32] is a set of 67 metadata elements grouped into 14 sections. In the same year, [33] proposed a metadata schema for traditional knowledge of Chinese living epic traditions. It was developed using derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis. As many as 104 metadata elements were identified and grouped into 19 metadata element sections.
In 2014, [10] proposed the traditional Chinese medicine literature metadata (TCMLM). It was developed using derivation analysis of the Dublin Core metadata and ISO 13119 Health Informatics, together with user-centric analysis to define metadata elements specific to traditional Chinese medicine. TCMLM consists of 24 metadata elements grouped into 7 sections: identification, content, distribution, quality, constraint, maintenance, and relation. All related works are summarized in Table 1 below.

II. MATERIAL AND METHOD
There are eight phases in the research methodology (Fig. 1). We first conduct a literature review to understand metadata schemas and the current research on them. Since the goal of this study is to obtain metadata elements to better document, manage, and disseminate traditional knowledge, we adopted the metadata schema development methods (i.e., domain analysis, derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis) as part of the research methodology. Then, we finalize the metadata schema and write the conclusion.
The first phase is the literature review. In this phase, we limit our search to literature about metadata schema, traditional knowledge, and cultural heritage. After applying these criteria, we selected articles on metadata schema definition, traditional knowledge definition, metadata schema development methods, and related works.
The second phase is domain analysis. This step identifies the domain of the resource, the type of resource, the end-user groups, and their activities in using the resource. Domain analysis was carried out through document analysis and interviews with related stakeholders to gain insight into traditional knowledge.
The third phase is derivation analysis. This phase identifies candidate elements from related existing metadata. The related metadata schemas for derivation analysis were selected based on the domain and purpose of the metadata.
The fourth phase is system-centric analysis. In this phase, we review existing systems related to our proposed metadata schema. The fifth phase is user-centric analysis. This phase was carried out through interviews with individuals and communities that own traditional knowledge in several cities in Indonesia.
The sixth phase is resource-centric analysis. We collected data about traditional knowledge in several formats and studied the information provided by the collected data that the metadata schema will describe. The seventh phase is metadata finalization. All identified elements were mapped and grouped into several sections. The last phase is the conclusion.
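The finalization phase described above can be sketched as a merge of the candidate elements produced by each analysis, followed by a grouping into sections. The snippet below is a minimal illustration under stated assumptions: the candidate element names, the normalization rule, and the element-to-section mapping are all hypothetical, and the paper does not state how duplicates were actually resolved.

```python
def normalize(name: str) -> str:
    """Lower-case a candidate name and unify separators (assumed rule)."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def merge_candidates(candidates_by_method: dict) -> dict:
    """Merge candidates, recording which analyses proposed each element."""
    merged = {}
    for method, names in candidates_by_method.items():
        for name in names:
            merged.setdefault(normalize(name), set()).add(method)
    return merged


def group_by_section(elements, section_of: dict):
    """Group elements by section and list those without a section."""
    grouped = {}
    unassigned = []
    for element in sorted(elements):
        section = section_of.get(element)
        if section is None:
            unassigned.append(element)
        else:
            grouped.setdefault(section, []).append(element)
    return grouped, unassigned


# Hypothetical candidates, invented for illustration only.
candidates = {
    "derivation": ["Title", "Creator"],
    "user-centric": ["tool name", "Title"],
    "resource-centric": ["tool_name", "material-name"],
}
# Hypothetical mapping from element to section.
section_of = {
    "creator": "knowledge source",
    "material name": "material",
    "tool name": "supporting tool",
}

merged = merge_candidates(candidates)
grouped, unassigned = group_by_section(merged, section_of)
```

Keeping the set of proposing methods for every element makes it easy to see which candidates were confirmed by more than one analysis, and the `unassigned` list highlights elements that still need a decision.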

III. RESULT AND DISCUSSION
In the domain analysis phase, we collected and analysed secondary and primary data. The secondary data are documents related to traditional knowledge, including national policy, reports of traditional knowledge research, etc. The primary data were collected by interviewing experts in traditional knowledge. In this research, we interviewed two experts in the traditional knowledge field who have over 10 years of research experience. The purpose of domain analysis is to clearly determine the domain, the end users, and their related activities in utilizing digital resources of traditional knowledge. The summary of the domain analysis is presented in Table 1 below.
Based on the result of the domain analysis, we identified primary and secondary data for derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis. The brief result for each phase is depicted in Fig. 2 below. In the user-centric analysis, we interviewed individuals and communities of traditional knowledge owners in several cities in Indonesia, namely Bengkulu, Gresik, Padang, and Bandung, and identified 30 metadata element candidates. Then, in the resource-centric analysis, we reviewed documentation of traditional knowledge gathered from PROSEA, LINSTRAD, ISJD, and ISTDL and identified 50 metadata element candidates. For the final metadata schema, we found 37 metadata elements related to the traditional knowledge domain, as depicted in Fig. 3.
2) Knowledge Source: describes the people who own the traditional knowledge.
3) Material: describes the materials needed to carry out the guidance given by the traditional knowledge.
4) Supporting Tool: describes the tools needed to process the material in order to achieve the goal of the traditional knowledge.
5) Success Story: describes the experience of people who have successfully tried the traditional knowledge.
6) Supporting Data: describes the data sources used to explain the traditional knowledge in more detail.
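The metadata sections can be sketched as a simple lookup that a documentation tool might use to check a record at the section level. The section names come from the proposed schema. The function `check_sections`, the sample record, and its element names are hypothetical, since the individual metadata elements are listed only in Table 2.

```python
from typing import Optional

# Section name -> short description taken from the list above.
# The description of "knowledge engineer" is not given in the text, so it is None.
SECTION_DESCRIPTIONS: dict = {
    "knowledge engineer": None,
    "knowledge source": "people who own the traditional knowledge",
    "material": "material needed to follow the traditional knowledge",
    "supporting tool": "tool used to process the material",
    "success story": "experience of people who successfully tried it",
    "supporting data": "data sources that explain it in more detail",
}


def describe(section: str) -> Optional[str]:
    """Return the description of a section, or None if none is recorded."""
    return SECTION_DESCRIPTIONS.get(section)


def check_sections(record: dict):
    """Return (unknown_sections, missing_sections) for a record.

    `record` maps section names to dictionaries of element values.
    """
    known = set(SECTION_DESCRIPTIONS)
    unknown = sorted(set(record) - known)
    missing = sorted(known - set(record))
    return unknown, missing


# Hypothetical record: element names and values are invented.
sample_record = {
    "material": {"name": "turmeric"},
    "supporting tool": {"name": "mortar"},
    "recipe": {"steps": "not part of the schema"},
}
unknown, missing = check_sections(sample_record)
```

A check at the element level would follow the same pattern once the element list of each section is available.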
Every metadata section consists of several metadata elements. The details of the metadata elements can be seen in Table 2 below.

IV. CONCLUSION
After conducting domain analysis, derivation analysis, system-centric analysis, user-centric analysis, and resource-centric analysis for the traditional knowledge domain, a metadata element set was proposed. This new schema is designed for describing data in the traditional knowledge domain. The proposed metadata comprises 37 metadata elements categorized into 6 metadata sections: supporting data, material, supporting tool, success story, knowledge source, and knowledge engineer.
As future research, the obtained metadata element set will be implemented in a system named I-Grest (Indonesian Genetic Resources and Traditional Knowledge).