A Study of Criteria for Evaluating the Performance of World Expo Taiwan Pavilion

Corresponding Author: Wen-Pang Liao
Department of Leisure Management, Yu Da University of Science and Technology, Taiwan
Email: debbieliulbm@gmail.com

Abstract: As local governments in Taiwan face severe funding shortages, an exhibition pavilion’s commitment to inheriting and passing down diverse cultures and its business performance grow more important. This study draws on the balanced scorecard concept proposed by Kaplan and Norton, reviews the relevant literature and uses the Delphi method to gather and systematize the opinions of experts in related fields in order to design the “criteria for evaluating the performance of the World Expo Taiwan Pavilion”, which comprise 19 criteria in four dimensions. The Analytic Hierarchy Process (AHP) is then performed to calculate the “relative importance of the criteria for evaluating the performance of the World Expo Taiwan Pavilion” in order to support its operations strategies in planning, forecasting, making judgments, distributing resources and determining investment portfolios. The results suggest that the relative importance of the four dimensions of evaluation criteria, in descending order, is: financial management (36%), customer service (34%), internal operations (20%) and learning and growth (10%). Among the 19 evaluation criteria, those with the highest relative importance are: annual budget allocation planning (33%), a hygienic and safe environment (40%), reduced operational issues (40%) and enhancing employees’ foreign language skills (25%). In conclusion, the World Expo Taiwan Pavilion should prioritize financial management, improve its various facilities and provide better customer service in order to raise customer satisfaction.


Introduction
In recent years, the Taiwanese government has been committed to promoting all types of cultural industries. Despite its efforts in cultural heritage conservation, the government is still criticized because some government-operated exhibition pavilions perform poorly as businesses or fail to attract enough visitors. To shield the World Expo Taiwan Pavilion from such criticism amid the local government's severe funding shortage, it is necessary to develop better operational strategies for the pavilion, identify problems in its operations, seek strategies for its future development and align the strategies, visions and goals of the management and decision-making levels. In doing so, the World Expo Taiwan Pavilion could gain a good command of the information and key elements needed for business management, formulate specific strategies for action, heighten its competitiveness and advantages and achieve better business performance. Furthermore, the World Expo Taiwan Pavilion could become a role model from which other culture parks learn operations strategies, and thereby achieve the goals of cultural heritage conservation and promotion.

Literature Review
Following data collection and literature review, this study adopts the Delphi method to determine the relevant criteria question items and uses the Analytic Hierarchy Process (AHP) questionnaire analysis method together with the Balanced Scorecard (BSC) to help the World Expo Taiwan Pavilion determine the relative importance of the evaluation criteria. The four dimensions of the evaluation criteria (financial management, customer service, internal operations and learning and growth) are then combined with the strategies, vision and goals of the management and decision-making levels to guide the operations strategy of the World Expo Taiwan Pavilion. The findings are expected to promote the World Expo Taiwan Pavilion and ensure its continuous operation.

Performance Evaluation System
The definition of performance evaluation, the traditional performance evaluation system and the strategic performance evaluation system are described as follows.
1. The definition of performance evaluation. Fang and Lin (2003) pointed out that an internal performance evaluation system is required to understand and check each management level of an organization and to enable a business to reach its goals in operations and management. This system requires the highest decision-maker of a business to carry out the following tasks:
1) Determining the current state of the business's operations.
2) Improving business performance.
3) Judging whether the business objectives or operations strategy are correct or call for adjustment, making the necessary decisions and judgments quickly and confidently and taking appropriate action.
4) Considering whether the company is allocating and using its limited resources effectively from the standpoint of improving business performance.
5) Impartially assessing the performance of managers at all levels.
2. The traditional performance evaluation system. Scholars have identified the following disadvantages of the traditional performance evaluation system:
1) Li (1995) suggested that the main disadvantages of the traditional financial performance evaluation system include:
(1) Valuing results over process and failing to assist management personnel in making decisions.
(2) Valuing the past while neglecting the future, and being unable to forecast the future.
(3) Possibly leading to wrong evaluation results.
(4) Tending to be short-sighted and to value short-term profits while overlooking long-term competitive advantages.
2) Hoffecker and Goldenberg (1994) proposed that the traditional performance evaluation system centers on the internal accounting system. However, performance evaluation that focuses on financial performance cannot provide information about other important dimensions of a business, such as customers and competitors, and cannot carry out non-financial performance evaluation; the business therefore loses the opportunity to heed warnings from the market.
3. A strategic performance evaluation system. Before making any business decision, a business must understand market demand in order to determine which products are profitable. As the traditional performance evaluation system is no longer useful to management in the modern economic environment, a business must shift from the traditional performance evaluation system to the strategic performance evaluation system. Compared with the traditional system, the strategic performance evaluation system, which consists of financial (quantitative) criteria, non-financial (qualitative) criteria and criteria for evaluating both process and results, integrates performance evaluation with strategy. It can therefore better satisfy the needs of modern businesses, help businesses understand their competitive position and indicate a direction for continuous improvement, serving as the guideline for a business's management activities and management evaluation (Fang and Lin, 2003; Santeramo et al., 2012). Eccles (1991) proposed that a performance evaluation should cover four dimensions: the workplace, the factory, the entire business and the market. Details of these four dimensions are as follows:
1) Workplace evaluation: Stresses the evaluation of important steps in the work process, such as quality, costs and transport.
2) Factory evaluation: Similar to the workplace evaluation, but focuses more on evaluating overall factory performance.
3) Overall business evaluation: Focuses on evaluating the performance of a business's individual departments or sections, using measures such as net income, return on sales and market share.
4) Market evaluation: Focuses on the status of competition, the overall economic climate and the industry to which the business belongs, as well as the evaluation of quality and customer services.

The Meaning and Scopes of the Balanced Scorecard
Peng (2010) advocated using the strategic management concept to evaluate a business's performance. This concept uses the balanced scorecard evaluation mechanism to evaluate a business's financial performance (financial aspect), customer satisfaction (customer aspect), effectiveness of internal business processes (internal aspect) and "learning, innovation and growth" (learning and growth aspect). Peng (2010) also stressed that the evaluation measures should align with an organization's strategies and mission so as to form a set of "comprehensive performance evaluation criteria". Such a set not only adds measures that drive future performance (the driving factors), but also compensates for the shortcomings of financial measures, which report only past results.

Features and Applications of the Delphi Method
The Delphi method asks participants to draw on their prior relevant knowledge to discuss a particular issue over several rounds until a consensus is reached. The procedures of the Delphi method have the following features (Delbecq et al., 1975; Murry and Hammons, 1995):
1) Anonymity: To avoid the bandwagon effect among participants or participants' conformity to an authoritative group leader, the questionnaire survey is conducted individually. In addition, to prevent participants from answering some questions while ignoring others because of their social status, the anonymity of participants is protected, which encourages them to give more information. Although a researcher may decide the extent of anonymity based on the issues under discussion, participants' answers should remain completely anonymous in order to avoid research results biased by the influence of public opinion.
2) Iteration: Each participant must answer the same question at least twice. After the researcher presents participants with the answers to the previous questionnaire, participants can reconsider the questions and provide new answers.
3) Controlled feedback: The Delphi method centers on a research topic and the researcher. Therefore, when presenting the results of a round, the researcher can filter out irrelevant answers or suggestions to guide the research direction. As a result, participants have only limited knowledge of other participants' suggestions on the issue under discussion.
4) Consensus: Consensus in this context refers to participants' final opinions. After repeated discussion through written correspondence, a Delphi survey terminates when participants reach a consensus. The survey is completed when the researcher determines that participants' opinions regarding an issue are converging to a consensus and that neither changes nor modifications are necessary after at least three rounds of questionnaire survey.

Advantages and Limitations of the Delphi Method
The Delphi method makes use of anonymous group participation. Besides gathering experts' collective decisions, drawing on collective wisdom and absorbing all useful ideas, the Delphi method also precludes the disturbances that may result from experts' face-to-face communication and discussion of a topic. In addition, the Delphi method has the following advantages (Murry and Hammons, 1995):
• By going through a specific process and repeated steps, the Delphi method leads group members gradually to reach a consensus on an issue
• The Delphi method is very suitable for deciding a collective goal or drafting a plan, as group members gradually reach a consensus on an issue during the Delphi survey. Moreover, a consensus reached by the public is more likely to be backed by the public
• Participants in the Delphi method do not attend a face-to-face meeting; therefore, neither a particular time nor a location must be prearranged, which saves participants time and energy. For that reason, the Delphi method is not restricted by participants' geographical location. Even if participants are in different places, they can still discuss an issue together
• As Delphi survey participants do not need face-to-face discussions, common problems of meetings, such as agreeing with the majority's opinions or being unwilling to give opinions that contradict public opinion, may be minimized. Even those who are too shy to speak at a meeting have an equal opportunity to express their opinions. Therefore, while the Delphi method requires no group meetings, it is able to bring together experts' opinions, draw on collective wisdom and absorb all useful ideas
• Participants in a Delphi survey are all experts. Inviting experts to answer a question together may bring about more valuable and objective ideas
• A Delphi survey is easily conducted. Neither historical data nor difficult statistical analysis techniques are required for the analysis of a complex, multi-aspect issue
• During the course of a Delphi survey, each issue can be thoroughly clarified. As such, in comparison with a single-round questionnaire survey, the results of a Delphi survey can better represent the subtle differences in collective opinions
• After systematic investigation, analysis and repeated revision of incongruent opinions, a result that represents almost all experts' opinions can be obtained
Despite the advantages listed above, the Delphi method is subject to the following limitations (Delbecq et al., 1975):
• As research that uses the Delphi method must rely on experts' intuition and knowledge, research results are likely to be affected by experts' subjective judgments and interference
• As the researcher coordinates and oversees the process of the Delphi method, the research might be subject to the intervention of the researcher
• The Delphi method involves time-consuming processes and therefore its progress cannot be easily controlled. An expert's opinions given at different times may also contradict one another. Moreover, participants without strong motivation may drop out halfway
• Most final conclusions resulting from the Delphi method are general and thus are unable to offer meticulous planning and specific details. Therefore, the method can only provide directional guidance and a reference for designing strategies

The Modified Delphi Method
Murry and Hammons (1995) pointed out that the Delphi method involves repeated written communication and expression of opinions in order to obtain experts' congruent opinions. However, the procedures are usually modified or abridged because of constraints such as time, human resources and money, so that the research can continue. The Delphi method with modified procedures is called the modified Delphi method. There are two common revised versions:
1. The first-round open-ended questionnaire is omitted, so that no open-ended questionnaire is used to collect experts' opinions. Instead, question items are designed based on the results of previous literature or the researcher's own experience, and experts are invited to express their opinions regarding the question items. This modification can mitigate the problem of a low questionnaire return rate caused by the burden of answering an open-ended questionnaire.
2. The third round and the fourth round are combined, so that there are only three steps in the process. The results of the second round are sent to the participating experts, who are requested to evaluate the importance and rating of the items classified by the researcher. In this way, experts have fewer chances to reexamine opinions. In a study that adopts the Delphi method, the more obvious convergence of expert opinions usually takes place in the first and second rounds.

Features and Applications of the AHP
The AHP uses a systematic method to decompose a problem into a hierarchy of more comprehensible sub-problems, performs pairwise comparisons to determine the relative importance ratio of each pair of compared elements, ranks the alternative options and thereby systematizes a complex problem. The AHP makes it possible to use the subjective opinions and evaluations of experts and scholars to decompose a complex decision problem systematically and to structure a decision-making scenario. A decision goal and its evaluation criteria form a hierarchy, in which a rating scale is used to evaluate the relative importance of two compared goals or criteria through quantitative pairwise comparisons. The results of the comparisons are then used to determine the principal eigenvector of each pairwise comparison matrix, the relative weight of each decision goal or evaluation criterion and the relative advantage of each alternative plan or scheme. The alternative plans or schemes are ranked to provide decision makers with sufficient information for choosing an appropriate option and minimizing the risk of making poor decisions (Chang, 2013; Huang, 2015; Ma et al., 2014; Saaty, 1980; 1990; Teng and Tzeng, 1989a; 1989b; Trapani et al., 2014).

Research Method
This study used the Delphi method to design the "criteria for evaluating the performance of the World Expo Taiwan Pavilion" and applied the AHP to determine the "relative weights of criteria for evaluating the performance of the World Expo Taiwan Pavilion". Then, it cross-referenced the theories and ideas of existing relevant literature and various data and utilized the opinions and ideas from different parties and criteria of different aspects to evaluate whether current strategies and results meet the expected goals and satisfy public demand.

Determining the Evaluation Criteria
The modified Delphi method was used to design a structured questionnaire on the "criteria for evaluating the performance of the World Expo Taiwan Pavilion" for the first-round questionnaire survey. The anonymous experts' collective decision-making technique was adopted for the questionnaire survey. Expert opinions were collected and organized and statistical analysis was performed to produce valid criteria for evaluating the performance of the World Expo Taiwan Pavilion.
In the modified Delphi survey questionnaire, the importance of each criterion for evaluating the performance of the World Expo Taiwan Pavilion was rated using a 1 to 5 rating scale. Experts were requested to rate the importance of each criterion's influence on the performance of the World Expo Taiwan Pavilion. The semi-closed questionnaire included a blank space for "other suggestions", where participating experts could offer opinions regarding modification of the criteria or other matters, which this study used as a reference for making improvements.

The Modified Delphi Questionnaire Survey
Based on the research questions, this study gathered and reviewed literature on evaluation criteria and, through brainstorming, classified the collected criteria into the financial dimension, customer service dimension, internal operations dimension and learning and growth dimension. A preliminary hierarchy of criteria for evaluating the performance of the World Expo Taiwan Pavilion was then developed. This study commenced the first-round modified Delphi questionnaire survey with experts and scholars in related fields as survey participants. The ratings, opinions and ideas that the participating experts and scholars presented in the completed and returned questionnaires were systematized and scrutinized to establish the validity of the preliminary hierarchy of criteria for evaluating the performance of the World Expo Taiwan Pavilion. After the preliminary hierarchy was modified, the second round of the modified Delphi questionnaire survey was administered to obtain further endorsement of the validity of the hierarchy of evaluation criteria.

Collecting Data of the Modified Delphi Method Questionnaire
Employing the modified Delphi method, this study collected data by conducting two questionnaire surveys. The first open-ended questionnaire was designed based on the results of literature review and expert interviews. After copies of the first questionnaire were returned, statistical analysis was performed, modifications were made and the second questionnaire was provided to members of the same expert panel:
(1) The first questionnaire survey: The evaluation dimensions and the criteria under each evaluation dimension in the first Delphi method questionnaire were designed based on the results of literature review and consultation with experts. Questionnaires were distributed to 15 selected experts and the return rate was 100%. Experts rated the evaluation criteria in the first questionnaire from 0 to 100; evaluation criteria receiving a score lower than 75 were removed, which reduced the initial 24 evaluation criteria to 19. After the answers to the first questionnaire survey were collected, this study used Microsoft Excel to calculate the mean, standard deviation and coefficient of variation of experts' agreement on each evaluation criterion. It then analyzed the experts in individual groups to determine the correlation between different expert groups and experts' agreement on the evaluation criteria. Experts' supplements and suggestions to the first questionnaire were scrutinized and categorized. Evaluation criteria were added or deleted accordingly and the second modified Delphi method questionnaire was designed.
(2) The second questionnaire survey: The second questionnaire was designed after modifications were made to the evaluation criteria of the first questionnaire. After the answers to the second questionnaire survey were collected, this study used Microsoft Excel to calculate the congruency of experts' agreement on each evaluation criterion, determine the guideline for each level of the hierarchy of evaluation criteria and reorganize the hierarchy for the subsequent AHP. Again, all distributed questionnaire copies were returned, for a return rate of 100%. Experts also rated the evaluation criteria in the second questionnaire from 0 to 100. As the evaluation dimensions and the evaluation criteria under each dimension had already been modified in line with the opinions of the expert panel after the first questionnaire, experts reached a consensus on more items in the second questionnaire. There were a total of 19 evaluation criteria at the third level of the evaluation hierarchy and the coefficient of variation was lower than 0.1, indicating that the experts' opinions had converged. The evaluation dimensions and evaluation criteria resulting from the two questionnaire surveys are shown in Table 1.
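The screening statistics described above (mean, standard deviation, coefficient of variation, a cut-off score of 75 and a convergence threshold of 0.1) can be sketched in a few lines of Python. This is only an illustration, not the authors' Excel workbook: the function names, the sample ratings and the criterion names are hypothetical, and reading the "score" as the mean expert rating and using the sample standard deviation are both assumptions.

```python
from statistics import mean, stdev

def summarize(ratings):
    """Return the mean, sample standard deviation and coefficient of variation."""
    m = mean(ratings)
    sd = stdev(ratings)  # sample SD; assumption, the paper does not say which SD was used
    return m, sd, sd / m

def screen_criteria(ratings_by_criterion, min_mean=75.0, max_cv=0.1):
    """Drop criteria whose mean rating is below min_mean and flag convergence (CV < max_cv)."""
    kept = {}
    for name, ratings in ratings_by_criterion.items():
        m, sd, cv = summarize(ratings)
        if m >= min_mean:
            kept[name] = {"mean": m, "sd": sd, "cv": cv, "converged": cv < max_cv}
    return kept

# Hypothetical 0-100 ratings from 15 experts for two made-up criteria.
ratings = {
    "criterion_a": [90, 85, 88, 92, 87, 90, 86, 89, 91, 88, 90, 87, 89, 90, 88],
    "criterion_b": [60, 70, 65, 80, 55, 75, 60, 70, 65, 72, 58, 66, 74, 61, 69],
}

kept = screen_criteria(ratings)
for name, stats in kept.items():
    print(name, round(stats["mean"], 1), round(stats["cv"], 3), stats["converged"])
```

In this made-up data, criterion_b has a mean rating below 75 and is removed, while criterion_a is retained and its coefficient of variation is well below 0.1.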

Designing a Preliminary Hierarchy of Evaluation Criteria
To design the preliminary evaluation criteria, this study consulted relevant literature, applied brainstorming and systematized the experts' consensus after discussion and exchange of opinions. This study first constructed a preliminary hierarchy of criteria for evaluating the performance of the World Expo Taiwan Pavilion, which consists of three levels in a hierarchical structure. The first level is the goal of building an evaluation model for the World Expo Taiwan Pavilion. The second level is the four evaluation dimensions, which are the World Expo Taiwan Pavilion's financial, customer, internal operations and learning and growth dimensions. The third level is the evaluation criteria (elements). Based on the expert interviews, literature review and on-site inspection, the evaluation criteria of the third level of the hierarchy were proposed, as shown in Table 2.

Steps for Calculating the Relative Weight of Each Evaluation Criterion
The AHP consists of two parts. The first part is the establishment of the levels of the hierarchical structure and the second part is the evaluation of the different levels of the hierarchy. The AHP involves inviting experts to determine and evaluate the key elements of a complex decision problem, presenting the decision problem in a simple hierarchical structure, using a rating scale to carry out pairwise comparisons of elements, forming pairwise comparison matrices, calculating the eigenvectors and determining the order of the elements at the same level. The consistency of the pairwise comparison matrices is tested to detect possible errors and to determine whether the results provide a valuable reference.

Designing the Questionnaire
In the AHP, pairwise comparisons are conducted in order to determine the relative importance of the compared elements. The AHP uses a five-level scale: equal importance, weak importance, essential importance, demonstrated importance and absolute importance. The values 1, 3, 5, 7 and 9 represent these degrees of importance, while the values 2, 4, 6 and 8 are intermediate values between two adjacent levels. What each value represents is explained in Table 3.

Table 3. The numerical scale used in the AHP and the explanation for the numerical scale (Peng, 2010)
Scale of the relative importance of Element A against Element B | Definition | Explanation
1 | Equal importance | Element A and Element B contribute equally to the objective
3 | Weak importance | Experience and judgment slightly favor Element A over Element B
5 | Essential importance | Experience and judgment strongly favor Element A over Element B
7 | Demonstrated importance | Element A is strongly favored and its dominance is demonstrated in practice
9 | Absolute importance | Element A has absolutely superior importance over Element B
2, 4, 6, 8 | Intermediate values between two adjacent scales | When compromise is needed
Reciprocals of the above | In comparing the relative importance of Element B against Element A |

Establishing Pairwise Comparison Matrices
Comparisons of the importance of elements are conducted through a questionnaire survey. The values of the relative importance of the compared elements obtained from the questionnaire survey are used to establish pairwise comparison matrices. Elements at one level are compared pairwise with respect to their importance to the element at the next higher level and a pairwise comparison matrix is constructed on this basis. If there are n elements, there are $n(n-1)/2$ pairwise comparisons. The results of the comparisons of the n elements are placed in the upper triangle of matrix A (the entries on the main diagonal of matrix A are all 1, as each represents the comparison of an element against itself). The values in the lower triangle are the reciprocals of the values at the corresponding positions in the upper triangle, which means $W_{ij} = 1/W_{ji}$. The elements of the pairwise comparison matrix are as follows:

$$A = [W_{ij}] = \begin{bmatrix} 1 & W_{12} & \cdots & W_{1n} \\ 1/W_{12} & 1 & \cdots & W_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 1/W_{1n} & 1/W_{2n} & \cdots & 1 \end{bmatrix}$$
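The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name build_pairwise_matrix and the example judgments for four elements are hypothetical and are not taken from the study's questionnaire data. Exact fractions are used so that the reciprocal property can be checked without rounding error.

```python
from fractions import Fraction

def build_pairwise_matrix(n, upper):
    """Build an n x n reciprocal matrix from upper-triangle judgments.

    upper maps (i, j) with i < j (0-based) to the importance of element i
    relative to element j on the 1-9 scale (or a reciprocal such as 1/3).
    """
    if len(upper) != n * (n - 1) // 2:
        raise ValueError("expected n(n-1)/2 pairwise comparisons")
    a = [[Fraction(1)] * n for _ in range(n)]  # diagonal entries stay 1
    for (i, j), value in upper.items():
        if not 0 <= i < j < n:
            raise ValueError("keys must satisfy 0 <= i < j < n")
        a[i][j] = Fraction(value)
        a[j][i] = 1 / Fraction(value)  # lower triangle holds the reciprocals
    return a

# Hypothetical judgments for four elements (six comparisons in total).
judgments = {
    (0, 1): 1, (0, 2): 2, (0, 3): 4,
    (1, 2): 2, (1, 3): 3,
    (2, 3): 2,
}
matrix = build_pairwise_matrix(4, judgments)
for row in matrix:
    print([str(x) for x in row])
```

With four elements, six judgments fill the upper triangle and the remaining entries follow from the reciprocal rule.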

Calculating the Eigenvalue and Eigenvector
Once a pairwise comparison matrix is established, the relative importance of the elements at each level can be calculated. The eigenvalue equation from numerical analysis is used to calculate the eigenvector, or priority vector. Saaty (1980) proposed the following procedure: build a comparison matrix, use the eigenvalue equation to compute the eigenvector and calculate the relative weight of each decision criterion. As most pairwise comparison matrices are not perfectly consistent, the calculation of eigenvectors in the AHP mostly adopts the highly accurate standardized average mean of eigenvectors.
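The text does not spell out which normalization is meant by the standardized average mean. One common approximation of the principal eigenvector, used here purely as an assumption, normalizes each column of the matrix so that it sums to 1 and then averages across each row. The function name priority_weights and the 3 x 3 example matrix below are hypothetical.

```python
def priority_weights(a):
    """Approximate the AHP priority vector by averaging normalized columns."""
    n = len(a)
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    # Row i of the column-normalized matrix, averaged over the n columns.
    return [sum(a[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]

# Hypothetical, perfectly consistent matrix whose exact weights are 4/7, 2/7 and 1/7.
example = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = priority_weights(example)
print([round(w, 4) for w in weights])
```

For a perfectly consistent matrix such as this one, the approximation reproduces the exact eigenvector; for inconsistent matrices it is only an estimate, and an exact eigenvalue solver may give slightly different weights.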

Results and Discussion
For the hierarchical structure of the relative importance of criteria for evaluating the performance of the World Expo Taiwan Pavilion, Microsoft Excel was used to examine the consistency of the pairwise comparison matrices derived from the survey participants' answers. The consistency index values of all participants' pairwise comparisons of the four evaluation dimensions and 19 evaluation criteria are smaller than 0.1, which indicates that all pairwise comparison matrices are highly consistent (as shown in Table 4). Table 5 shows the relative importance of the criteria for evaluating the performance of the World Expo Taiwan Pavilion.
1) From the experts' point of view, the most important dimension for evaluating the performance of the World Expo Taiwan Pavilion is the financial dimension, followed by the customer service dimension, the internal operations dimension and the learning and growth dimension, in descending order of importance.
2) The relative importance and ranking of evaluation criteria in the financial dimension: In the first dimension (the financial dimension), "annual budget allocation planning" is the most important, followed by "executing various types of budgets efficiently", "achieving profit growth", "correct filing" and "the percentage of personnel costs".
3) The relative importance and ranking of evaluation criteria in the customer service dimension: In the second dimension (the customer service dimension), "improving the quality of the exhibition pavilion" is the most important, followed by "boosting customer satisfaction", "theme-based events and innovative marketing" and "improving the convenience of transportation".
4) The relative importance and ranking of evaluation criteria in the internal operations dimension: In the third dimension (the internal operations dimension), "reduced operational issues" is the most important, followed by "organizing regular theme-based events", "effective internal horizontal and vertical liaison" and "controlling the schedule of events".
5) The relative importance and ranking of evaluation criteria in the learning and growth dimension: In the fourth dimension (the learning and growth dimension), "enhancing employees' foreign language skills" is the most important, followed by "crisis management training", "inheriting and passing down professional knowledge and skills", "the training and quality of exhibition docents" and "valuing the provision of a high-tech e-learning environment for employees".
The relative importance and priority of these evaluation criteria, as listed in Table 5, can provide the exhibition pavilion with guidelines for strategy-making.
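The consistency screening reported at the start of this section can be illustrated with a short sketch. It assumes the standard AHP definitions, CI = (λmax − n)/(n − 1) and CR = CI/RI, with Saaty's commonly tabulated random index values; the paper itself reports only that the consistency index values are below 0.1. The function names and the 3 x 3 example matrix are hypothetical, and λmax is estimated from column-normalized weights rather than from an exact eigenvalue solver.

```python
# Commonly tabulated random index (RI) values for matrix orders 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def priority_weights(a):
    """Approximate the priority vector by averaging normalized columns."""
    n = len(a)
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    return [sum(a[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]

def consistency(a):
    """Return (lambda_max, CI, CR) for a reciprocal matrix of order 3 to 9."""
    n = len(a)
    w = priority_weights(a)
    aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n  # estimate of the principal eigenvalue
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]
    return lambda_max, ci, cr

# Hypothetical judgments: slightly inconsistent, since 3 * 3 is not equal to 5.
example = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
lambda_max, ci, cr = consistency(example)
print(round(lambda_max, 3), round(ci, 3), round(cr, 3), ci < 0.1)
```

A matrix with a consistency index at or above the 0.1 threshold would normally be returned to the respondent for revision before its weights are used.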

Conclusion
In this study, a preliminary questionnaire regarding the "criteria for evaluating the performance of the World Expo Taiwan Pavilion" was developed based on a review of the literature about the World Expo Taiwan Pavilion. Expert interviews were conducted to gather opinions from industry, government, academia and consumers. After the evaluation criteria were finalized using the Delphi method, experts were invited to complete a questionnaire about the relative importance of the criteria for evaluating the performance of the World Expo Taiwan Pavilion. Judging from the relative importance of the evaluation criteria, the operations of the World Expo Taiwan Pavilion should prioritize financial performance. The customer dimension should be the second focus of the World Expo Taiwan Pavilion. As such, there should be more customer-centric strategies, such as improving the visitor-friendly service facilities at the exhibition pavilion, improving the environmental quality and the various dining services, enhancing visitors' satisfaction with the exhibition pavilion's services, combining holiday festivals with local theme-based events and using various innovative marketing approaches.