Generalized Agile Estimation Method

Abstract: The agile cost estimation process always possesses research prospects due to the lack of algorithmic approaches for estimating cost, size and duration. Existing algorithmic approach i

I. INTRODUCTION
Software cost estimation (also known as software estimation) has been an important and difficult task since the early days of software development. Many formal and informal methods have been proposed for it. Estimation methods must generate realistic estimates in order to build the trust of customers as well as team members; unrealistic estimates are a major factor in software project failure and in reduced software quality [1]. Software estimation becomes more cumbersome in the Agile Software Development Process (ASDP) [2]. ASDP is a lightweight process that accommodates volatile requirements at any stage of development and follows iterative and incremental development [3] [4]. Software estimation in ASDP therefore faces challenges such as uncertainty in requirements and a heavier dependence on oral communication. Many agile estimation techniques have been developed, and they are classified as non-algorithmic or algorithmic methods [8].
Non-algorithmic methods, such as planning poker and disaggregation, are frequently used by agile practitioners [9] [10]. These methods derive estimates from expert opinion and historical data, so they are not useful when neither is available. Further, they may generate different estimates for the same project depending on the intuition of the estimators [8]. In contrast, the Constructive Agile Estimation Algorithm (CAEA) is an iterative estimation method that incorporates vital factors into the estimation of Cost, Size and Duration (CSD) of an agile project [2]. These vital factors are project specific and mainly cover performance, configuration, complex processing, data transfer, security, multiple sites and operational ease. The algorithmic approach generates realistic estimates [8]. At the same time, this method has certain limitations with respect to the number of vital factors, their prioritization and the uncertainty of the project. Therefore, there is a strong need for a generalized estimation algorithm for agile projects.
In this paper, we introduce the Generalized Estimation Method (GEM) and present a GEM-based algorithm for agile projects along with some case studies. Section II describes the terminology used in the proposed method. Section III discusses GEM and its algorithm. Section IV covers the case studies, and Section V concludes the paper.

II. TERMINOLOGY
Several terms are used frequently in the proposed GEM. They are defined below.

A. Project Domain (PD)
A project domain is an environmental class of projects that share specific attributes and may be developed for a specific community or environment. For example, a military project requires higher reliability and security and is developed for defense applications. The classification of PD for CSD estimation of projects is shown in Table I.

B. Vital Factors (v)
Vital factors are the factors that affect the CSD of a project in agile estimation. They are classified into four main subclasses: project, sociological, technological and ergonomic, as shown in Table II.

C. Weights (w)
A weight is the value assigned to each vital factor according to its priority in a project from any project domain. Vital factors of the same priority may be grouped into levels, with each level given an appropriate weight.
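The grouping of vital factors into priority levels can be illustrated with a short Python sketch. It assumes a rank-sum scoring rule (the highest level gets the largest raw score) and normalizes the scores so that the weights sum to 1. Both the scoring rule and the example priority levels are hypothetical and serve only to show the idea.

```python
def weights_from_priority_levels(levels):
    """Map each vital factor to a weight so that all weights sum to 1.

    `levels` maps a factor name to its priority level (1 = highest priority).
    Factors on the same level receive the same weight. The rank-sum scoring
    used here is an assumption for illustration, not a rule of the method.
    """
    if not levels:
        raise ValueError("at least one vital factor is required")
    if any(level < 1 for level in levels.values()):
        raise ValueError("priority levels start at 1")
    lowest = max(levels.values())
    # Raw score: level 1 gets the largest score, the lowest level gets 1.
    raw = {name: lowest - level + 1 for name, level in levels.items()}
    total = sum(raw.values())
    return {name: score / total for name, score in raw.items()}


# Hypothetical priority levels for four vital factors.
example_levels = {
    "performance": 1,
    "security": 1,
    "data transfer": 2,
    "operational ease": 3,
}
weights = weights_from_priority_levels(example_levels)
```

With these inputs the two level-1 factors share the largest weight, and the weights of all four factors add up to 1.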

D. Intensity Levels (I)
The intensity level is the influence or impact of a particular vital factor in a project. It takes one of three values: low (L), medium (M) or high (H).

E. Uncertainty Factor (UF)
The uncertainty factor captures the level of uncertainty in the requirements, technology and resource aspects of a project. Its value lies between 0 and 1.

F. Story Points (SP)
Story points are defined as the approximate lines of code in a story and are also referred to as the size of the story. A story in an agile project is a small piece of a requirement.

G. New Story Points (NSP)
NSP is a measure of the effort required to develop a story after the vital factors are taken into account.

H. Size of Project (SOP)
SOP denotes the size of the project and is measured in NSPs.

I. Duration of Project (DOP)
DOP denotes the duration of the project and is measured in weeks, months or years.

J. Velocity
Velocity is the number of story points developed by the team in a specified time.
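The relationship between NSP, SOP, DOP and velocity can be sketched in Python as follows. The sketch assumes that SOP is the sum of the NSPs of all stories and that DOP is SOP divided by velocity, rounded up to a whole time unit. The rounding and the sample NSP values are assumptions made for illustration.

```python
import math


def size_of_project(new_story_points):
    """SOP: total size of the project, measured in NSPs."""
    return sum(new_story_points)


def duration_of_project(sop, velocity):
    """DOP in whole time units (for example iterations or weeks).

    `velocity` is the number of story points the team completes per time
    unit. Rounding up to a whole unit is an assumption of this sketch.
    """
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(sop / velocity)


nsps = [9.0, 10.5, 8.5, 12.0]      # hypothetical NSP values of four stories
sop = size_of_project(nsps)        # 40.0
dop = duration_of_project(sop, 8)  # 5
```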

III. GENERALIZED ESTIMATION METHOD (GEM)
The Generalized Estimation Method is an iterative algorithmic approach that follows an agile estimation process. The process starts with identification of the project domain class and the vital factors, based on the behavior of the project and initial information about the user class. GEM is divided into two phases: Early Estimation (EE) and Iterative Estimation (IE). EE takes as input the project domain, the identified vital factors and the risk involved in the project; it is a non-chargeable activity intended to build the faith and trust of the customer. IE is a chargeable activity performed during iteration planning. It re-estimates the CSD of the working software once the actual velocity, the influence of the vital factors and the risk involved in requirements and resources are known. It is a way of auto-correcting the estimates after requirements are updated and feedback is received from stakeholders. The following subsection discusses the GEM-based algorithm and its formal description.

A. GEM Based Algorithm and its Description
In the GEM-based algorithm, we assume that the class of project domain (PD) and the associated n vital factors v1, v2, …, vn have been properly identified. The intensity level Ii of the ith vital factor is assigned a value of Low (L), Medium (M) or High (H), which is then quantified accordingly. The weights w1, w2, …, wn associated with the vital factors v1, v2, …, vn are allocated so that the sum of weights (SUM) of all vital factors for a project is 1. Using these weights and intensity levels, the priority factor PF(vi) of each vital factor and the unadjusted value UV are computed using equations (1) and (2) respectively. The risk associated with the project is quantified as the uncertainty factor (UF), whose value may vary from 0 to 1 according to that risk. Every class of project consists of many functionalities corresponding to requirements. These may be further decomposed into m small independent pieces, i.e. stories, so that SPj can be computed for each story j. Using equation (3), the New Story Points (NSPj) are computed for each of the m stories. With these NSPs and the velocity of the project development team, we finally compute SOP and DOP using equations (4) and (5).
It is important to note that the above process is executed to estimate the CSD of a particular case of a specified project domain application. We formally describe an algorithm based on GEM as follows.
Algorithm:
// We assume that the PD and vital factors have been identified for the project to be estimated. The algorithm takes as input the quantified intensity levels of the identified vital factors, the weights of the vital factors, the story points and the uncertainty factor.
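As an illustration of the data flow just described, the following Python sketch chains the steps up to the NSP computation. Equations (1) to (3) are only referenced in the text, so every functional form here is an assumption: PF(vi) is taken as weight times quantified intensity, UV as the sum of the priority factors, intensity levels are quantified as 1, 4 and 9, and NSP is taken as SP + PD + UV × UF. These choices happen to match the web-application NSP limits quoted in Section IV for SP = 7 and PD = 1, but they should not be read as the published equations. In particular, the way PD enters the NSP computation cannot be checked against the text.

```python
# Assumed quantification of intensity levels (not given in the text).
INTENSITY = {"L": 1.0, "M": 4.0, "H": 9.0}


def priority_factors(weights, pattern):
    """Assumed form of equation (1): PF(v_i) = w_i * I_i."""
    if len(weights) != len(pattern):
        raise ValueError("need exactly one intensity level per weight")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [w * INTENSITY[level] for w, level in zip(weights, pattern)]


def unadjusted_value(pfs):
    """Assumed form of equation (2): UV is the sum of the priority factors."""
    return sum(pfs)


def new_story_points(sp, uv, uf, pd_value):
    """Assumed form of equation (3): NSP_j = SP_j + PD + UV * UF."""
    if not 0.0 <= uf <= 1.0:
        raise ValueError("UF must lie between 0 and 1")
    return sp + pd_value + uv * uf


def estimate_nsps(story_points, weights, pattern, uf, pd_value):
    """Run the steps in order and return one NSP per story."""
    uv = unadjusted_value(priority_factors(weights, pattern))
    return [new_story_points(sp, uv, uf, pd_value) for sp in story_points]


equal_weights = [1 / 7] * 7  # hypothetical: seven equally weighted factors
nsps = estimate_nsps([7] * 25, equal_weights, "HHHHHHH", uf=0.2, pd_value=1)
```

Under these assumptions each of the 25 stories receives an NSP of about 9.8. Summing the list and dividing by the team velocity would then give SOP and DOP.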

IV. CASE STUDIES
In this section, we consider small applications to analyze the results obtained from GEM. Our study concentrates on two aspects of agile estimation: first, the impact of assigning weights to the vital factors on the CSD of an agile project, and second, the impact of the uncertainty factor on CSD estimation. We first describe the research setup for our case studies and then discuss the project cases and the results of the study.
Our study included small projects of three PD classes: web applications, MIS projects and military projects. We used a square series to quantify the effort required for each PD class, so a web application has a PD value of 1, an MIS project 4 and a military project 9. The study concentrated only on project-specific vital factors, assuming that the other vital factors are favourable for agile software development and have no effect on the CSD of the projects. Table III lists the project categories and the other parameters considered in the study. The vital factors, with their varying intensity levels and corresponding weights, are shown in Table IV. To identify the impact of uncertainty on CSD estimation, we considered several levels of uncertainty for projects with the same intensity levels of vital factors. The uncertainty factor takes the values 0.2, 0.4, 0.6 and 0.8. We also assumed a team velocity of 8, an SP value of 7 per story, and 25 stories of the same size per project. Vital factors at all three intensity levels were considered in order to assess the general results obtained from the GEM-based algorithm across the project domain classes. In total, the study includes more than 1500 projects across the three domains with varying intensity levels.
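The research setup can be written down as a small Python sketch that enumerates the space of project configurations: three PD values, four uncertainty levels and every assignment of L, M or H to the seven project-specific vital factors. This is the full space from which project cases could be drawn, not a reconstruction of the cases actually selected for the study. The variable names are hypothetical, and the baseline figures are computed before any vital-factor adjustment.

```python
import math
from itertools import product

PD_VALUES = {"web application": 1, "MIS project": 4, "military project": 9}
UF_LEVELS = (0.2, 0.4, 0.6, 0.8)
NUM_FACTORS = 7  # the seven project-specific vital factors
STORIES, SP_PER_STORY, VELOCITY = 25, 7, 8

# Every possible assignment of L/M/H to the seven factors, e.g. "HLMHLMH".
patterns = ["".join(p) for p in product("LMH", repeat=NUM_FACTORS)]

# Full grid: project domain x uncertainty level x intensity pattern.
grid = list(product(PD_VALUES, UF_LEVELS, patterns))

# Baseline before any vital-factor adjustment.
baseline_size = STORIES * SP_PER_STORY                   # 175 story points
baseline_duration = math.ceil(baseline_size / VELOCITY)  # 22 time units
```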
All projects considered in the study fall into two cases. Case I deals with projects in which all vital factors have the same intensity level. It therefore comprises three special cases: all vital factors at low intensity (LLLLLLL), all at medium intensity (MMMMMMM) and all at high intensity (HHHHHHH). The NSP values, SOP and DOP computed by the algorithm for Case I projects of the three domains are shown in Table V.
For web applications, NSP values range from 8.2 to 9.8 for low-uncertainty projects, whereas they range from 8.8 to 15.2 at higher uncertainty (i.e. 0.8). These NSP values were found to be the lower and upper limits of the NSP values for projects of a particular PD class at the same uncertainty level.
In Case II, the GEM-based algorithm takes as input different intensity levels for the different vital factors. For example, HLMHLMH is a project case configured with high performance, low configuration, medium complexity, high data transfer, low security, medium multiple sites and high operational ease. The NSP computations for random combinations of intensity levels are shown in Table VI. It is evident from Table VI that the NSP values for mixed intensity levels lie within the range established in Case I: they are higher than those of projects with all vital factors at low intensity and lower than those of projects with all vital factors at high intensity. It was also noticed that the UF acts as a catalyst in increasing the SOP and DOP of the project. We found that the low-uncertainty project case MMMHHMM has higher NSP and SOP values because most of its vital factors at the higher prioritization level are at a higher intensity level.

V. CONCLUSION
CSD estimation of agile projects is a challenging task because of the principles and practices of agile development. The major challenges in estimation are the volatile requirements and the ergonomic requirements of ASDP. With the GEM-based algorithm, early estimation can be carried out through upfront envisioning, on the basis of the vital factors with their prioritized intensity levels, the uncertainty level and the project domain. These estimates are useful in identifying the early scope of the project and establishing the trust of the customer. Iterative estimation, on the other hand, forces the estimates to be improved once the actual team efficiency is known and the requirements are clearer. We also presented two cases covering three project domains for study purposes, and computed statistics such as the size and duration of these project cases at various uncertainty levels.
We analyzed the cost estimation of a particular project domain at various uncertainty levels, as well as the cost estimation of various project domains at the same uncertainty level, as shown in Fig. 1 and Fig. 2. We observed the following:
1) Iterative estimation makes it possible to improve the estimates after monitoring and tracking the previous iteration, giving scope for better estimates and giving the user and the team the actual facts.
2) The GEM-based algorithm eliminates the need for historical data and experts. Therefore, an average project manager can estimate more precisely.
3) It provides flexibility in the number and type of vital factors, the uncertainty level, the project domains and the quantification of these parameters, depending on project behavior and team makeup.
4) The algorithm always generates different estimates for projects with different levels of uncertainty. For example, CSD estimates for different uncertainty levels of web applications are shown in different colors in Fig. 2. The algorithm also generates different estimates for projects whose high-intensity vital factors belong to different prioritization levels.
5) The algorithm also resolves the limitations of CAEA by associating weights with each of the vital factors and by accounting for different uncertainty factors.
6) Prioritization plays a dominant role when uncertainty is low, whereas the impact of uncertainty dominates when uncertainty is high.