Automatic Assessment of Technology Readiness Level Using LLDA-Helmholtz for Ranking University

Abstract: Assessing the Technology Readiness Level (TRL) of Indonesian universities' academic papers with a questionnaire-based tool is labor-intensive. This paper introduces a new method for determining the TRL of an academic paper using a text mining technique. We take the abstract of a research paper published by university lecturers to represent the technology maturity of the research. Abstracts of papers were collected from the nine most reputable universities in Indonesia. Using Labelled Latent Dirichlet Allocation (LLDA), the abstracts were categorized into one of nine TRL levels. To determine the prior label for LLDA, we built a corpus of keywords representing each TRL level based on Bloom's Taxonomy. Beforehand, the Helmholtz principle was used to select the text features. Since Bloom's Taxonomy has only six levels, we split the keywords into nine levels. Afterward, the reputation score is calculated using our formula. Lastly, the university ranking is generated from the extracted academic reputation score. To evaluate the proposed method, we compare our ranking with the QS ranking using the ranking gap and the Pearson correlation. The Helmholtz principle pruned 86% of the features, and its use significantly improves the Pearson correlation of the proposed method. In short, the new insight into university ranking introduced in this work is promising. In the experiments using all indicators, LLDA-Helmholtz performed best, with a Pearson correlation of 0.95 between the two rankings, while LLDA without Helmholtz reached a correlation of 0.78.


I. INTRODUCTION
Technology Readiness Assessment (TRA) is a tool for evaluating technology maturity [1], originally for space technology, that employs the Technology Readiness Level (TRL) scale ranging from 1 to 9. NASA pioneered the assessment method. Carmack [2] provided a definition of TRL for nuclear fuel technology. The approach adopted the TRA of the US Department of Energy and applied it to nuclear fuels and material systems. The paper adopted nine levels of maturity, divided into three major functional categories: Proof-of-Concept (levels 1-3), Proof-of-Principle (levels 4-6), and Proof-of-Performance (levels 7-9). Several criteria were established for each level, and all of them must be met before a Critical Technology Element (CTE) is considered to have achieved a certain level. The development of a questionnaire based on that of the Air Force Research Laboratory was also discussed.
Through a regulation of the Ministry of Research and Technology of Higher Education, the Indonesian Government has adopted TRL [3] to assess the technology maturity [4] of research and technology development at universities in Indonesia. The evaluation aims to assess the implementation of research programs under the ministry and to reduce the risk of failure [5] in technology implementation [6]. The ministry also uses TRL scoring as a basis for funding researchers. In practice, TRL scoring is conducted by an expert using a spreadsheet-based questionnaire called Teknometer, which contains several indicators. This questionnaire-based evaluation is accurate yet labor-intensive, given the large number of research papers that need to be evaluated.
Likewise, previous related research mostly relied on questionnaire-based methods to determine TRL. TRL metrics have been employed to assess the maturity of thermochemical conversion in biofuel production from the pyrolysis of triglyceride biomass [7]. In this area, the examination of TRL provides insight into future challenges concerning kinetic mechanisms, upgrading techniques, and initial plant research. The assessment itself was conducted using questionnaire-based metrics.
As a promising option for energy production, biorefineries offer a substitute for fossil fuels, and TRL examination in this area of research is beneficial. In describing energy production from residues of micro-algae from Swedish forest, Badr [8] gave insight into the further experiments needed to handle the challenges in advancing biorefinery techniques. Several gaps were identified, such as the environmental and economic examination of the material stream inventory. Integration during scale-up relied on intensifying laboratory experiments, the boundary of material recycling, and the process performance impact.
Subcritical water extraction (SWE) was compared with conventional ethanol extraction (EE) for yielding bioactive kanuka leaf extract [9]. The assessment involved TRL valuation covering process units, total capital expenditure (CAPEX), and profitability. The potential environmental impact (PEI) was also explored using several indicators of the waste reduction (WAR) algorithm. In short, the promising economic return of SWE gives it the potential to displace EE as the favored technique.
In another work, the Parasuraman-based Technology Readiness Index (TRI) was employed along with Davis's Technology Acceptance Model (TAM) to assess technology acceptance in Electronic Human Resource Management (e-HRM) in Turkey [10]. The survey instrument was a questionnaire sent to 500 participants from the 500 largest private sector companies in Turkey. The major finding was that optimism and innovativeness correlate positively with perceived usefulness and ease of use.
Combining TRI and TAM, Walczuch quantified the relationship between personality and technology acceptance. The four personality categories proposed by Parasuraman were used, i.e., optimism, innovativeness, discomfort, and insecurity. TAM was used to represent the perceived usefulness and perceived ease of use of the technology. Data were collected from the employees of Belgian multi-site financial service providers. The result was surprising, since innovativeness was negatively correlated with usefulness.
Straub [11] reviewed in depth the history and codification of the NASA TRL system and how other agencies have used it. The paper proposed the notion of TRL 10 to extend the NASA TRL system, which contains TRL 1-9, i.e., technology concept, proof-of-concept, technology demonstration, conceptual design and prototype demonstration, preliminary design and prototype validation, detailed design and assembly level build, subsystem build and test, and system operators. The work defined TRL 10 as proven operation, aiming to provide more mature technology as a requirement of higher-frequency space access.
To establish a clear source of information on the maturity of partitioning and transmutation (P-T) technology, the Global Nuclear Energy Partnership TRL definition was adopted [12]. Along with the maturity of P-T technology, other systems were also evaluated, i.e., the fast reactor (FR), the accelerator-driven subcritical transmutation system, aqueous reprocessing, molten salt electro-refining partitioning technology, and oxide, metal, and nitride fuels. For every system reviewed, a specific definition of TRL was introduced.
The Malaysian Government has strongly pushed the use of IT in the construction industry. Using the multiple scales of Parasuraman's Technology Readiness Index, construction firm managers' readiness to embrace IT has been reviewed. A TRI score for every respondent was calculated as the average of four components: optimism, innovativeness, discomfort, and insecurity.
The TRL framework was employed to set new constraints on several components, including technology, regulation, and market, in order to model the market penetration of new technology [13]. The framework assists in recognizing the factors and rates that promote technology development satisfying the required technical and policy goals in the coming decades. Similarly, Zhang et al. [14] conducted a self-evaluation involving TRL measurement of technology, organization, and environment in achieving green innovation. Based on a survey of 340 companies in China, using a questionnaire containing the constructs of the research model, the study attempted to align theoretical and practical implications in planning green innovation.
Jafari [15] attempted to recognize the relationship between digital transformation and entrepreneurship involving technology readiness components such as investment in ICT, technology access, education, exploration, and exploitation. Using several control variables, i.e., GDP per capita, GDP growth rate, Cost of starting a business, Time of starting a business, and Procedure for starting a business, the impact of Independent Variables on Technology entrepreneurship and Technological market expansion was then examined. The finding summarizes that such factors were part of the dynamic capabilities encouraging societies to achieve digital innovation.
Most of the work described above relies on expert judgment based on several TRL indicators. When assessing research programs under the Indonesian Ministry of Research and Technology of Higher Education, many research papers need to be evaluated. In this context, TRL evaluation is ineffective if it depends on manual expert evaluation. This work addresses this gap by automatically determining the TRL of a research paper using several text mining techniques.
Accordingly, this work proposes a new technique based on several text mining approaches [16] to evaluate the TRL of Indonesian universities' research papers [17]. Text mining has many applications in different fields of research [18]. The evaluation of the proposed method is based on research papers published by university staff. We introduce the insight that TRL can be represented by the content of the research papers of university staff. Research papers are therefore collected from the nine most reputable Indonesian universities and categorized using Labelled Latent Dirichlet Allocation (LLDA) [19]. The prior label for LLDA is determined by matching the content of the abstract against a corpus of keywords. We build the corpus of keywords based on Bloom's Taxonomy, which involves sorting the Bloom's Taxonomy keywords using the WordNet similarity algorithm.
The TRL generated automatically by the proposed method is then used to assign an academic reputation to each university. TRL indicates the maturity of the research being conducted. In university ranking assessment, this maturity measure can be used to evaluate the university academic reputation [20] by which the ranking is generated. We propose a formula to calculate university academic reputation from the TRL of the university. In the last step, the ranking is generated from the extracted university academic reputation. As the ground truth, we use the university ranking from the QS World University Rankings and compare it with the ranking generated by our proposed method.

II. MATERIAL AND METHOD
In this paper, we introduce a new approach to determining the TRL of research papers from Indonesian universities using a topic modeling technique. Topic modeling is employed to classify the content of a research paper into one of the nine TRLs presented in Figure 1. Therefore, the technique can be considered a text classification task. The proposed method has seven steps: 1) dataset and TRL corpus development, 2) text pre-processing, 3) Helmholtz feature selection, 4) keyword corpus enrichment, 5) label assumption determination, 6) Gibbs sampling inference for L-LDA, and 7) AdaBoost-MH optimization. The flow of these steps is presented in Figure 2.

A. Dataset and Corpus Preparation
The dataset used in this work consists of abstracts of papers by academic staff from the nine most reputable universities in Indonesia. The list used to choose the most reputable universities is the QS World University Rankings for the region of Indonesia. We pick the abstracts with the highest citation counts according to Google Scholar metadata to ensure that the abstracts used in the experiment represent qualified research, since the assessment of an abstract represents the TRL evaluation of a research product.
The TRL corpus contains keywords that represent the research maturity of Indonesian universities. Since the Indonesian TRL has nine categories of maturity, we need a corpus that consists of nine levels of maturity. We develop the TRL corpus from the keyword collection of Bloom's Taxonomy, on the assumption that the level of thinking in Bloom's Taxonomy represents the maturity of TRL. Since Bloom's Taxonomy has only six keyword categories, as presented in Table 1, we split them into nine separate categories by first sorting the keywords. To improve keyword matching, we enrich the corpus with synonyms from the WordNet database. WordNet is a rich lexical database that arranges its collection of words as a semantic network based on psycholinguistic theory [21]. WordNet is used in many applications in the field of Natural Language Processing [22]. WordNet organizes its collection into synonym sets (henceforth synsets) of words that share the same sense, rather than alphabetically. For example, the word "car" shares the sense "a motor vehicle with four wheels" with "auto", "automobile", "machine", and "motorcar". Such a set of words is called a synset and is associated with a certain part of speech (POS): noun, verb, adjective, or adverb. The result of the enrichment process is presented in Table 2.

B. Text Pre-processing
The purpose of text pre-processing is twofold: 1) clean up unimportant words and 2) eliminate non-alphabetic characters. In this work, text pre-processing involves tokenization, stop word removal, and stemming. Tokenization is the process of splitting the document into elements, usually called tokens. Stop word removal aims to remove punctuation, prepositions, connecting words, and other unimportant words. The last stage of text pre-processing is stemming, which aims to obtain the basic form of each word.
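As an illustration, the three pre-processing stages can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the stop-word list is a tiny hypothetical sample, and the suffix-stripping `stem` function is a naive stand-in for a proper stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list (assumption: a real run would use a full list).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "or",
              "is", "are", "to", "for", "with", "by", "this"}

# Suffixes removed by the naive stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and keep only runs of alphabetic characters."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]
```

Because tokenization keeps only alphabetic runs, digits and punctuation are discarded before stop word removal, which covers the second purpose stated above.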

C. Helmholtz Feature Selection
The Helmholtz principle is employed to find the meaningful features of an abstract document and remove the rest. Accordingly, it reduces the number of features being processed and thus the processing time. The Helmholtz principle provides a formula for filtering such features, called the Number of False Alarms (NFA), given in Equation (1).
$$\mathrm{NFA}(w, P, D) = \binom{K}{m} \cdot \frac{1}{N^{m-1}} \tag{1}$$
In Equation (1), w represents a word, P represents a part of a document such as a sentence or paragraph, and D represents the whole document. The word w appears m times in P and K times in D. N = L / B, where L is the length of D and B is the length of P in words. In this formula, N is the total number of documents. According to Alexander, Hellen, and Steven, if in some documents the word w appears m times and NFA < 1, then it is an unexpected event. Based on NFA, the meaning score of a word is calculated using Equation (2).
$$\mathrm{Meaning}(w, P, D) = -\log \mathrm{NFA}(w, P, D) \tag{2}$$
In Equation (2), the logarithm of NFA is used based on the observation that NFA values can be exponentially large or small [14]. If Meaning > ε, then word w is added to the set Kw and marked as a meaningful word for Pi. Equivalently, the set of keywords can be defined through a threshold on NFA below 1, where a smaller NFA corresponds to a more important word; in particular, Meaning > 0 is equivalent to NFA < 1. The parameter ε is used to vary the size of the keyword set and is typically chosen strictly positive, as we are only interested in meaningful words.
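The feature selection step can be illustrated with a short Python sketch. It assumes the NFA form C(K, m) / N^(m-1) commonly used in the Helmholtz-principle literature and the meaning score -log NFA; the function names, the `corpus_counts` mapping, and the default threshold `epsilon=0.0` are hypothetical choices for illustration, not taken from the original experiments.

```python
import math
from collections import Counter


def nfa(m, K, N):
    """Number of False Alarms for a word seen m times locally and K times globally.

    Assumed form: C(K, m) / N**(m - 1), with N the number of documents (or parts).
    """
    return math.comb(K, m) / N ** (m - 1)


def meaning(m, K, N):
    """Meaning score, assumed here to be -log(NFA)."""
    return -math.log(nfa(m, K, N))


def meaningful_words(doc_tokens, corpus_counts, N, epsilon=0.0):
    """Return the words of one document whose meaning score exceeds epsilon.

    corpus_counts maps each word to its total count K over the whole corpus.
    """
    local_counts = Counter(doc_tokens)
    return {
        word
        for word, m in local_counts.items()
        if meaning(m, corpus_counts[word], N) > epsilon
    }
```

For example, a word that occurs three times in the corpus, all within one of ten documents, has NFA = 1/100 and is kept, whereas a frequent word that appears once in the document has NFA equal to its corpus count and is pruned.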

D. LLDA Label Inference
Labeled Latent Dirichlet Allocation (L-LDA) is a topic modeling technique that improves on LDA by incorporating supervision. In this work, a topic generated by L-LDA is considered a TRL label. LDA models a document as a mixture of topics. LDA only infers a discrete probability distribution over topics per document, and the generated topics are often hard to interpret or to align with an end-use application [3]. As an extension of LDA, LLDA offers a solution to this limitation. Unlike LDA and other extensions of LDA such as Disc-LDA [19] and MMLDA [20], LLDA directly associates each document label with one generated topic. LLDA can also be regarded as an improvement on Multinomial Naïve Bayes in its mixture model [3]. In terms of generating a mixture of topics for each document, LDA and LLDA are similar.
However, LLDA introduces supervision to infer topics that correspond to the document's label set. This work provides the document's label set for LLDA by matching the abstract document against the corpus of keywords built from Bloom's Taxonomy. To apply LLDA, we use an open-source Python tool developed by Nakatani Shuyo. Every abstract document is represented as a tuple containing a word index list and a topic binary list.
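The document representation described above can be sketched as follows. This is only an illustration under assumptions: the nine keyword sets in `TRL_KEYWORDS` are hypothetical placeholders rather than the corpus built in this work, and the helper names are not part of the LLDA tool mentioned in the text.

```python
# Hypothetical keyword sets, one per TRL level (index 0 corresponds to TRL 1).
TRL_KEYWORDS = [
    {"define", "identify"},
    {"describe", "explain"},
    {"apply", "demonstrate"},
    {"analyze", "compare"},
    {"design", "construct"},
    {"test", "validate"},
    {"evaluate", "assess"},
    {"integrate", "deploy"},
    {"operate", "produce"},
]


def build_vocab(documents):
    """Assign an integer index to every distinct token, in sorted order."""
    tokens = sorted({t for doc in documents for t in doc})
    return {token: index for index, token in enumerate(tokens)}


def label_vector(tokens, trl_keywords=TRL_KEYWORDS):
    """Binary list with a 1 for each TRL level whose keywords occur in the tokens."""
    present = set(tokens)
    return [1 if present & keywords else 0 for keywords in trl_keywords]


def encode(tokens, vocab, trl_keywords=TRL_KEYWORDS):
    """Represent a document as (word index list, topic binary list)."""
    word_ids = [vocab[t] for t in tokens if t in vocab]
    return word_ids, label_vector(tokens, trl_keywords)
```

A document whose tokens match keywords of several levels receives several candidate labels, and the inference step then decides how its words are distributed among those labels.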

E. Academic Reputation Score Formula
In this work, we propose a formula to calculate the academic reputation score based on the Technology Readiness Level of research. First, we introduce the level weights presented in Table 3, which are used to compute the academic reputation score.

III. RESULT AND DISCUSSION
The experiment is conducted using 450 abstract documents collected from the nine most reputable universities in Indonesia, i.e., Institut Pertanian Bogor (IPB), Institut Teknologi Bandung (ITB), Institut Teknologi Sepuluh Nopember (ITS) Surabaya, Universitas Airlangga (UA), Universitas Brawijaya (UB), Universitas Diponegoro (Undip), Universitas Gadjah Mada (UGM), Universitas Indonesia (UI), and Universitas Muhammadiyah Surakarta (UMS). The abstracts are taken from the most cited papers on Google Scholar from those universities. As the ground truth of the experiment, we use the QS World University Ranking 2017 for Indonesian universities, shown in Table 4. We calculate the ranking gap and the Pearson correlation as the performance parameters of the ranking method. The result of the pre-processing step is a set of cleaned terms. In a text classification task, a term is a feature by which the classification process is carried out [4]. Feature selection is an important step in text classification: reducing the number of features reduces computation time, and selecting meaningful features improves classification performance. In this work, we use the Helmholtz principle [21] to select the meaningful features of the abstract documents.
After pre-processing the abstract documents, we perform feature selection using the Helmholtz principle. The result of feature selection is presented in Table 5. Helmholtz feature selection removes 86% of the features and leaves 14% as meaningful features. In the next two tables, we present the results of the experiment comparing classification using LLDA with and without Helmholtz feature selection. As the performance parameter, we use the ranking gap between the ground truth and the ranking of our proposed method. We present the ranking of LLDA without Helmholtz in Figure 3. The Pearson correlation between the ranking of LLDA without Helmholtz and the ground truth is 0.3. We calculate the Pearson correlation between our ranking and the ground truth using Equation (3).
$$r = \frac{n \sum x y - \sum x \sum y}{\sqrt{\left[n \sum x^{2} - \left(\sum x\right)^{2}\right]\left[n \sum y^{2} - \left(\sum y\right)^{2}\right]}} \tag{3}$$
The Pearson correlation coefficient measures the strength of association between two sets of data. In our case (university ranking), when our ranking is identical to the ground truth ranking, the value of the coefficient is 1. In Equation (3), r denotes the Pearson correlation coefficient, x is our ranking, y is the ground truth ranking, and n is the number of universities in the experiment.
The use of Helmholtz feature selection in L-LDA classification increases the accuracy of the proposed method. The Pearson correlation for L-LDA with Helmholtz is 0.68, significantly higher than for L-LDA without Helmholtz. The results of the experiment described above, in Table 6 and Table 7, are rankings generated solely from the academic reputation score using our proposed formula. We also generate a ranking using all the indicators employed by the QS ranking system, i.e., academic reputation (40%), employer reputation (10%), faculty/student ratio (20%), citations per faculty (20%), number of professors (5%), and quality of citations (h-index and i10-index) (5%). We collected this information from each university.
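The all-indicator ranking can be illustrated with a weighted sum over the six indicators. The sketch below uses the weights listed in the text, but it assumes that every indicator has already been normalized to a common 0 to 100 scale; the dictionary keys and function names are hypothetical, and the normalization actually used in the experiment is not specified here.

```python
# Indicator weights as listed in the text (they sum to 1.0).
QS_WEIGHTS = {
    "academic_reputation": 0.40,
    "employer_reputation": 0.10,
    "faculty_student_ratio": 0.20,
    "citations_per_faculty": 0.20,
    "number_of_professors": 0.05,
    "citation_quality": 0.05,
}


def composite_score(indicators):
    """Weighted sum of indicator scores, each assumed to lie in [0, 100]."""
    return sum(weight * indicators[name] for name, weight in QS_WEIGHTS.items())


def rank_universities(scores):
    """Map each university to its rank, where 1 is the highest composite score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}
```

In this setting, the academic reputation entry would be the score derived from the TRL of each university, while the remaining entries come from the collected university data.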
The results of this experiment, showing the gap with the QS ranking, are presented in Table 8 and Table 9. For LLDA without Helmholtz, the Pearson correlation coefficient is 0.78, while for LLDA with Helmholtz, the coefficient is 0.95.
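The two evaluation measures used in this section, the ranking gap and the Pearson correlation, can be computed as in the following sketch. The function names are illustrative assumptions, and the rankings in the usage note are made-up toy values, not the experimental results.

```python
import math


def ranking_gap(ours, truth):
    """Absolute difference in rank position for each university."""
    return [abs(a - b) for a, b in zip(ours, truth)]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

For instance, with the toy rankings `[1, 2, 3]` and `[2, 1, 3]`, `ranking_gap` gives `[1, 1, 0]` and `pearson` gives 0.5; identical rankings give a coefficient of 1.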

IV. CONCLUSION
This work proposes an automatic university ranking system based on LLDA-Helmholtz. LLDA is an improvement on the topic modeling method Latent Dirichlet Allocation (LDA). To determine the prior label for LLDA, we develop a keyword corpus based on Bloom's Taxonomy, a taxonomy of levels of thinking. We assume that the keywords of Bloom's Taxonomy can represent the maturity levels of TRL. We use the Helmholtz principle to select the noteworthy features of the abstract documents. In the evaluation step, we compare our ranking with the QS ranking. The results of the experiment indicate that the proposed method is promising and that the Helmholtz principle plays a significant role in reducing the number of features and improving the ranking quality. The best performance, validated using the ranking gap and the Pearson correlation coefficient, is achieved by using all QS indicators together with LLDA-Helmholtz to calculate university academic reputation.