© Science Publications, 2005 Analyzing Concept Maps as an Assessment (Evaluation) Tool in Teaching Mathematics

In this research, concept mapping was used as a testing instrument. Within our country's education system, the relationship between the scores given to concept maps and the scores given to traditional written examinations and multiple-choice examinations in mathematics teaching was analyzed. In particular, examinations on functions, numbers, exponents, radicals and absolute values were evaluated. Literature class scores, which are assumed to reflect students' verbal thinking and their ability to express their thoughts, were compared with concept map scores. The research showed that reliable testing and evaluation can be carried out with concept mapping. There is no meaningful correlation between concept mapping and multiple-choice examinations. On the other hand, there is a meaningful correlation between the scores of concept mapping and traditional mathematics examinations. A correlation meaningful at about the p<0.1 level was found between concept map testing and literature examinations. These results are discussed and various suggestions are offered on the basis of them.


INTRODUCTION
It is well known that testing and evaluation play a crucial role in education. If you can determine the real outcomes of your teaching and whether your students have actually understood what you taught, the teaching and learning process can be planned well. Around the world, studies have been conducted to evaluate learning outcomes as appropriately as possible, taking into account how delicate testing and evaluation are. To coordinate such studies in the United States, the University of California hosts an institution called CRESST, the National Center for Research on Evaluation, Standards, and Student Testing. Technical reports on the evaluation of teaching have been published at various times across different disciplines. In one study that applied concept mapping to mathematics teaching, students' learning was analyzed over time (McGowen, 1999). Mason (1992), Shavelson, Lang and Lewin used concept mapping in testing and evaluation and obtained successful results. In addition, research by Okebukola in 1992 found that students who were successful at solving problems were also successful at concept mapping. In light of these studies, the ability to make concept maps can be accepted as an evaluation criterion related to problem solving. Mason evaluated students' concept maps by scoring the concepts at the main points of the maps, the validity of the connections, the number of connections, the parallel and perpendicular flow, and the sense of order in the connections.
Problem: In our country in particular, the use of concept mapping to evaluate mathematics teaching is not a common subject of study. In this context, the problem of this research is the question "Is it appropriate to use concept mapping for testing and evaluation of the subjects in mathematics lessons in our country?" If it is not appropriate, the reasons are analyzed, together with the suggestions that can be offered in view of them.
The purpose: In this research, concept mapping was used as a testing and evaluation method in mathematics teaching. The research investigated whether concept mapping can be an alternative to traditional testing and evaluation methods. It also investigated whether students' verbal thinking and their ability to express their thoughts are related to their ability to construct concept maps in mathematics.

Minor problems
• Is there a significant relationship between testing and evaluation with concept mapping and traditional written examinations?
• Is there a significant relationship between multiple-choice examinations and concept mapping based testing, given that concept maps test students' knowledge from a conceptual point of view?
• Making a concept map requires knowing the relationships between concepts and being able to express them. Assuming that literature lesson scores are the most closely related to this ability, is there a significant relationship between literature lesson scores and concept map scores?
• In the education system of our country, can concept mapping be used as an alternative testing and evaluation method in mathematics teaching?

Hypothesis
• There will be a significant relationship between testing and evaluation with concept mapping and traditional written examinations.
• There will not be a significant relationship between multiple-choice examinations and concept mapping.
• Since making a concept map requires knowing the relationships between concepts and being able to express them, there should be a significant relationship between concept map scores and the scores of the literature lesson, which is the most closely related to this ability.
• In the education system of our country, concept mapping can be used as an alternative testing and evaluation method in mathematics teaching.
The importance of the research: A search, including one carried out at a higher education institute's documentation center, found no experimental study on the use of concept maps for testing and evaluation in mathematics teaching. Therefore, our research is important in presenting ideas about using concept maps for testing and evaluation in mathematics teaching.

Assumptions
• The sample of the research is accepted as sufficient.
• It is accepted that the literature lessons sufficiently reflect the students' verbal thinking and their ability to express their thoughts.
• The subjects of testing and evaluation are assumed to be sufficient.
• It is accepted that the examinations prepared by the school's mathematics group and used during the application stage had construct and content validity.

Limitations of the study
• The sample group of the research is limited to 17 students from the 9th grade of Anatolia A Science High School.
• The duration of the research is limited to the fall term of the 2002-2003 education year.
• Since the use of concept maps in testing and evaluation was to be investigated, the students in the sample group were given training on concept maps. The training was limited to 21 days (3 weeks).
• The application process of the research is limited to subjects such as numbers, exponents and radicals, and absolute values.
Literature review: Concept mapping is a teaching and learning method that was developed by Joseph D. Novak and his students at Cornell University in 1981. Its theoretical basis lies in the cognitive learning theories of Jean Piaget and David Ausubel. The kind of learning that Ausubel defines as meaningful learning is the formation of a new conceptual framework in the learner's mind through the interaction of new and previous concepts. When learners try to learn a new item, they try to relate it to the concepts already in their minds (Hamachek, 1986).
In 1981, as stated above, Novak drew on Ausubel's ideas to develop the concept mapping procedure so that students could organize concepts in a meaningful structure. From that point on, the research of West (1981), Stewart (1980), Novak and Gowin (1984) and Charden (1985) showed that concept mapping was an effective teaching method. Later, research on concept mapping was carried out in many countries around the world.
Concept maps in testing and evaluation: Concept maps are generally used as a teaching method rather than as a formal evaluation method. Maps can typically be used as an evaluation method before and after teaching. Only Lomask and his colleagues used concept maps on a large scale, in research carried out in 1992, and they reported the validity and reliability of examinations based on concept maps at the end of their research. The image below is intended to express the relationship between reliability and validity (http://trochim.human.cornell.edu, 2003): Think of a dart board whose center is the concept that we are trying to test. Suppose we take one shot for each student we are trying to evaluate. Shots that hit the center represent perfect testing. Under these circumstances, Fig. 1 expresses the relationship between reliability and validity.
Evaluating concept maps: In order to score concept maps, students must first have learned to make concept maps adequately. Once students have learned to make concept maps, their maps can be scored. Scorers who use the unified scoring method are trained to assess each concept map and the student's understanding of the concepts stated in the map. In this evaluation, every map is rated on a scale from 1 to 10 (McClure and Bell, 1990). The interrelated scoring system was adapted from a method developed by McClure and Bell (1990). In this method, the independent propositions that make up an individual map are scored. A proposition is defined as a relationship between concepts: two concepts joined by a connection line. Every proposition that is accepted as true is scored between 1 and 3 according to a scoring protocol. The structural scoring model was adapted from a method defined by Novak and Gowin (1984). In this model, 1 point is given for each proposition, 5 points for every hierarchical proposition, 10 points for every diagonal connection and 1 point for every example.
Variety in scoring methods of concept maps: As might be expected, concept maps can be scored in various ways.
At one extreme is the suggestion that concept maps should not be scored but used to follow students' conceptual development clinically (White and Gunstone, 1992). At the other extreme, the most sophisticated scoring system was produced by Novak and Gowin (1984); for example, exemplifying concepts with specific events or objects earns 1 point each. Between these two extremes there are many alternative scoring methods. Comparing students' maps with a standard map is one of them. Novak and Gowin added the following fifth rule for scoring concept maps.
A criterion map can be constructed and scored. Students' maps are then compared with it and scored out of 100. Bear in mind that a student may make a better map than the criterion map and so earn more than 100 points.
In some scoring methods the connections between concepts are counted (White and Gunstone, 1992). Connections can be hierarchical, multiple or diagonal. Points are given for connections that match those in the target map (the teacher's map). Extra points are given for additional meaningful connections, and points are deducted for false connections. Alternatively, connections can be sorted into meaning categories and a score formed by dividing the total number of connections among these categories (Mahler, Hoz, Fischl, Tov-ly and Lernau, 1991).
Another method focuses on the propositions in the concept map. A proposition relates two terms or concepts with a directed arrow. With this method, three parts of a proposition can be scored:
• The relationship between the concepts
• The label
• The direction of the arrow, which shows a hierarchy between the concepts or the reason for the relationship
For example, McClure and Bell (1990) used concept maps to answer the question "How does teaching STS (Science Technology and Society) affect cognitive structure?" and focused on students' propositions for scoring. Another method focuses on the definitions of the terms given in the map. True definitions receive 4 points, partly true definitions receive 3 to 1 points, and false definitions receive 0 points (Mahler and others, 1991).

Reliability and validity of concept maps:
The reliability of concept maps can be interpreted as the consistency or generalizability of the scores given to students (Cronbach, Gleser, Nanda and Rajaratnam, 1972). Lomask, in his research, analyzed the scores given by four teachers to 39 students' maps and tested the reliability of the evaluation by the consistency among the teachers' scores. The validity of concept maps was tested by establishing concurrent validity. In 1989, Anderson and Huang determined the validity of concept maps by examining the correlation between concept map scores and other examinations.
Research on using concept maps for testing and evaluation: Bolte studied the use of traditional testing methods together with concept maps as an evaluation method (Bolte, 1997). The main purposes of the research were the following:
• Using concept maps and traditional testing to evaluate the connectedness of students' knowledge.
• Determining the correlation between the scores that students received on concept maps, written examinations and finals.
• Determining what students gained from the application of concept maps and written examinations.
The study concluded that using concept maps together with written examinations was a reliable instrument for testing and evaluating mathematical knowledge.
Another study was carried out by Okebukola (1992) to investigate the effect of concept maps on problem solving. It analyzed whether students who were successful at constructing concept maps were also successful at problem solving. Of the 40 students in the sample, the 20 who were considered successful at making concept maps formed the control group. They achieved significant success on three different questions. The study also examined making concept maps in groups. Some of the students made concept maps in groups and some made them individually. There was no significant difference between making concept maps in groups and making them individually.
The model of the research: The research was carried out on a sample. Traditional testing and evaluation methods and testing and evaluation with concept maps were applied at the same time, and the correlation between the results was analyzed. In this context, among survey (scanning) models, the model of the research is a relational survey model of the correlational type. The correlation between the data from traditional testing and evaluation and the data from concept maps was examined.
So that they could first learn how to make a concept map, the students were asked to make maps on the subject of sets. This map was not scored; it was analyzed only to give feedback on how better concept maps could be made. After this training, concept maps were made alongside three mathematics examinations during the fall term of the 2002-2003 education year. The correlations between the scores from these examinations and the concept maps were analyzed.
The concept maps were evaluated with a synthesis of the systems used by Novak (1984) and McClure (1999). This system is explained in detail below.
Propositions: Is the relationship between two concepts shown with a connection line and linking words? Is the relationship sufficient? Score every meaningful and valid proposition from 1 to 3 points as follows: 1 point if only a relation is shown, 2 points if the relation is named, and 3 points if the direction of the proposition is also shown with an arrow.
Hierarchical structures: Does the map show a hierarchy? Every sub-concept should be more specific and less general than the concept above it (with respect to the subject for which the map was drawn). For every valid hierarchical level, give 5 points.
Diagonal connections: Are there meaningful connections between one part of the hierarchical structure and another? Are the connections valid and meaningful? Give 10 points for each connection that is both meaningful and valid, and 2 points for each connection that is valid but does not show a synthesis of concepts or sets of propositions.
Examples: Give 1 point for each example that illustrates a concept with a specific event or object.
Finally, a map made by a teacher was scored according to these four items. This expert map's score was taken as the total score (100 in our study), and the students' map scores were rescaled out of 100.
Findings and comments: Since our research analyzed the correlation between testing and evaluation with concept maps and traditional testing methods, the students in the sample group were first given a two-week training on what a concept is and how a concept map is made. During this process, concept maps were shown to the students and they were asked to make a concept map on a subject of their choice. During the application process, which covered the first semester of the 2002-2003 education year, the school gave the students in the sample group three examinations in the Mathematics 1 lesson, two of them traditional and one multiple choice. Students were asked to make concept maps on the same subjects in parallel with these examinations. To remove doubts about when concept maps should be administered, the first concept map was made before the examination and the third after the examination. The second concept map was made at the same time as the examination, by asking students to convert concepts into maps. The first examination pair (the school's examination and the concept map) was about functions, the second about numbers, and the third about exponents, radicals and absolute values. Students' concept maps were scored as described earlier. The teacher made a concept map for each of the three subjects for use in scoring.
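The scoring scheme used in this study (points for propositions, hierarchical levels, diagonal connections and examples, rescaled so that the teacher's map counts as 100) can be sketched in Python as follows. This is only an illustrative sketch: the point values come from the text, while the `MapTally` structure, the label names and the sample tallies are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass

# Point values taken from the rubric described in the text.
PROPOSITION_POINTS = {"linked": 1, "labelled": 2, "directed": 3}
HIERARCHY_LEVEL_POINTS = 5
DIAGONAL_POINTS = {"meaningful": 10, "valid_only": 2}
EXAMPLE_POINTS = 1


@dataclass
class MapTally:
    """Counts a scorer reads off one concept map (hypothetical structure)."""
    propositions: list      # one quality label per valid proposition
    hierarchy_levels: int   # number of valid hierarchical levels
    diagonal_links: list    # one quality label per valid diagonal connection
    examples: int           # number of valid examples


def raw_score(tally):
    """Sum the rubric points for a single map."""
    score = sum(PROPOSITION_POINTS[q] for q in tally.propositions)
    score += HIERARCHY_LEVEL_POINTS * tally.hierarchy_levels
    score += sum(DIAGONAL_POINTS[q] for q in tally.diagonal_links)
    score += EXAMPLE_POINTS * tally.examples
    return score


def scaled_score(student, teacher):
    """Rescale a student's raw score so that the teacher's map equals 100."""
    return 100.0 * raw_score(student) / raw_score(teacher)


# Invented tallies, for illustration only.
teacher_map = MapTally(["directed"] * 10, 4, ["meaningful", "meaningful"], 5)
student_map = MapTally(
    ["directed"] * 4 + ["labelled"] * 2 + ["linked"], 3, ["valid_only"], 2
)

print(raw_score(teacher_map))                   # 75
print(raw_score(student_map))                   # 36
print(scaled_score(student_map, teacher_map))   # 48.0
```

Because the score is a ratio to the teacher's map, a student whose map earns more rubric points than the teacher's map would receive more than 100.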
The reliability of mathematics' examinations: The first and second mathematics examinations had 10 questions worth 10 points each. The third examination was multiple choice and had 20 questions worth 5 points each. Table 2 shows that the reliability coefficients of the examinations were about 0.50, 0.70 and 0.60, respectively.
Before analyzing the correlation between the concept maps and the mathematics examinations given by the school, it is desirable for the reliability coefficients of the mathematics examinations to be higher. The reliability analysis in SPSS software can identify which questions to omit and the resulting new reliability of the examination. The reliability of the examinations could thus be increased by omitting suitable questions.
As the table above shows, the reliability of Mathematics examination-1 could be increased to about 70% by omitting questions 5, 6 and 8. The reliability of Mathematics examination-2 could be increased to about 76% by omitting questions 1, 2 and 6. The reliability of Mathematics examination-3 could be increased to about 70% by omitting questions 9, 10, 15, 16 and 19.
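The text does not name the reliability coefficient it reports. As a minimal sketch, assuming it is Cronbach's alpha (the default model of the SPSS reliability analysis), the coefficient and the "alpha if item deleted" values that guide which questions to omit could be computed as below. The function names and the score table are invented for illustration and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores has one row per student, one column per question."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def alpha_if_item_deleted(item_scores):
    """Alpha recomputed with each question left out in turn (1-based question numbers)."""
    k = len(item_scores[0])
    result = {}
    for j in range(k):
        reduced = [list(row[:j]) + list(row[j + 1:]) for row in item_scores]
        result[j + 1] = cronbach_alpha(reduced)
    return result


# Invented scores for 5 students on 4 questions; question 4 behaves
# inconsistently with the other three.
demo_scores = [
    [10, 8, 9, 2],
    [8, 7, 8, 9],
    [6, 5, 6, 3],
    [4, 4, 3, 10],
    [2, 1, 2, 5],
]

print(round(cronbach_alpha(demo_scores), 2))   # about 0.62
for question, alpha in alpha_if_item_deleted(demo_scores).items():
    print(question, round(alpha, 2))
```

In this invented table, leaving out the inconsistent fourth question raises alpha, which mirrors the way omitting suitable questions raised the reliability of the school examinations.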

The reliability of concept maps:
The reliability of concept maps was tested by examining how well the scores given by different readers agreed with one another (Lomask and others). For this purpose, each concept map was scored by three teachers. Table 4 shows that even concept map 3, which has the lowest agreement coefficient, had an agreement coefficient of about 0.8. Moreover, the three teachers agreed almost completely on the scores of concept map 1. This result shows statistically that the scoring system used in our research was sufficiently reliable. Since we accepted the scoring system as reliable, in the rest of our analysis each student's concept map score was taken as the average of the three teachers' scores.
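The text does not specify which agreement coefficient was computed. As one possible illustration, the sketch below uses Kendall's coefficient of concordance W (without tie correction) to measure how closely three scorers rank the same maps, and then averages the three scores per student as described above. Kendall's W is an assumption here, and the function names and scores are invented.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m raters (rows) scoring n maps (columns), no tie correction."""
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(col) for col in zip(*rank_rows)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((rs - mean_rank_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def mean_scores(ratings):
    """Average the raters' scores for each map."""
    return [sum(col) / len(col) for col in zip(*ratings)]


# Invented scores from three teachers for four students' maps.
teacher_a = [60, 70, 80, 90]
teacher_b = [55, 65, 85, 95]
teacher_c = [50, 78, 72, 88]   # ranks the two middle maps the other way round

ratings = [teacher_a, teacher_b, teacher_c]
print(round(kendalls_w(ratings), 3))   # 0.911
print(mean_scores(ratings))
```

W equals 1 when all scorers rank the maps identically and falls toward 0 as their rankings diverge.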

The reliability of literature examinations:
The reliability of the literature examinations was determined as 43, 63 and 43%, respectively. These reliability values were increased to about 60, 70 and 50% by omitting the questions stated in the table.
Analyzing the correlation: According to the table above, as expected, there is no significant relationship between the multiple-choice Mathematics-3 examination and the corresponding concept map. In addition, no significant relationship was found between the results of Mathematics Examinations 1 and 2 and the concept maps.
Surprisingly, the average of the literature examinations is highly related to the concept map scores. These findings can be interpreted as follows:
- Concept maps tested the students' knowledge from a conceptual point of view. Given the university entrance examination in our country, however, it is normal for high schools to focus only on knowledge rather than on conceptual development. Moreover, the examination system rewards answering as many questions as possible in a limited time, without any need to interpret the meaning of a question. In this situation, it is to be expected that an examination testing students' conceptual knowledge does not correlate with that kind of examination.
The lowest correlation in Table 6 is the one involving the multiple-choice mathematics examination. This result supports our second hypothesis, that there is no correlation between multiple-choice tests and concept map testing. Although it is not statistically significant, the correlation between the concept map and the second mathematics examination, which were administered together, was the highest, at .12.
- In addition, the students of the science high school are among the most successful students in our country's testing system. Unfortunately, these students' conceptual education is being neglected. Table 6 also shows a correlation significant at the .01 level between the average of the literature examinations and the average of the concept maps. That is, a student who scores high on concept mapping relative to his or her class also scores high on the literature examinations, and a student who scores low on the literature examinations also scores low on concept maps.
In conclusion, even in mathematics, making a concept map requires verbal thinking and the ability to express one's thoughts.
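The text reports correlations and their significance without naming the coefficient or the test. A minimal sketch, assuming Pearson's r with a two-tailed t test, shows why a correlation of about .12 cannot reach significance with 17 students (15 degrees of freedom). The critical t values are standard table values; the function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Two-tailed critical t values for 15 degrees of freedom (n = 17).
T_CRIT_DF15 = {0.05: 2.131, 0.01: 2.947}


def is_significant(r, n, t_critical):
    """True if |t| exceeds the supplied critical value."""
    return abs(t_statistic(r, n)) > t_critical


n_students = 17
print(round(t_statistic(0.12, n_students), 2))                  # 0.47
print(is_significant(0.12, n_students, T_CRIT_DF15[0.05]))      # False

# Smallest |r| that reaches each significance level when n = 17.
for level, t_crit in T_CRIT_DF15.items():
    r_crit = t_crit / math.sqrt(t_crit ** 2 + (n_students - 2))
    print(level, round(r_crit, 2))
```

With this sample size, r has to be roughly .48 or larger to be significant at the .05 level and roughly .61 or larger at the .01 level.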

CONCLUSIONS
In this research, concept maps were used as a testing and evaluation method. Within our country's education system, the relationship between the scores of traditional written examinations and multiple-choice tests and the scores of concept maps was analyzed. In particular, examinations on functions, numbers, exponents, radicals and absolute values were examined.
The application process covered the first term of the 2002-2003 education year. The sample consisted of 17 students from class 9-A of Anatolia A Science High School. The research was built upon the question "Is it suitable to use concept mapping to test and evaluate the subjects taught in mathematics lessons in our country?" The main hypothesis of the research was that the results of the traditional testing method in mathematics lessons and those of concept maps would be concordant. In addition, although the concept maps made in the application were about mathematics, it was thought that other factors might be related to the ability to make concept maps. Alongside students' numerical intelligence, these include verbal thinking and the ability to express one's thoughts; literature lesson scores, which were believed to reflect these abilities sufficiently, were therefore compared with concept map scores.
During the application process, two of the three mathematics examinations were about functions and numbers, and they were traditional written examinations. The third examination was about exponents and absolute values and was multiple choice. The students were asked to make a concept map on each examination's subject. The concept map related to the first mathematics examination was made before the examination, the second during the examination and the third after the examination. The statistical analysis made after the application showed the following:
• The reliability of the examinations carried out with concept maps was analyzed. The reliability of these three examinations was 94, 84 and 80%, respectively.
• The reliability of the examinations carried out by the school was 50, 71 and 60%, respectively.
• There is no statistically significant correlation between the scores of concept mapping based testing and evaluation and the mathematics examinations.
• There is a correlation significant at the .01 level between the scores of concept mapping based testing and the literature examinations.
The following results were obtained when we analyzed the concept map as an instrument for testing and evaluation in mathematics teaching.
• Reliable testing can be carried out with concept maps in mathematics lessons.
• There is no significant correlation between concept mapping based testing and evaluation and testing and evaluation with traditional written examinations. In other words, within the limits of the subjects of this research, testing with concept maps is not accepted as valid.
With this result, our first hypothesis has been rejected. There are two or three views on scoring concept maps in the literature. One holds that it is not necessary to score concept maps (White and Gunstone, 1992); another, presented by Novak (1984), uses a very complex system to score them. Other views about scoring concept maps fall between these two. In short, the views on using concept maps for testing and evaluation are still unsettled. This research examined the use of concept maps in mathematics teaching and a relationship was expected; the possible factors that led to the observed result are stated below. It should also be kept in mind that the finding that testing with concept maps was not valid rests on the assumption that the mathematics examinations prepared by the school's mathematics group were valid.
• Students may not have been sufficiently interested in concept maps.
• The fifteen-day training period on concept maps before the application may not have been sufficient. That is, some students may not have learned how to make a concept map.
• Although the sample was chosen from the 9th grade, students in our country's university entrance examination system had already entered the examination mood. This mindset may not give much importance to conceptual learning.
• The second concept map was made in the same session as the examination. Although it is not significant, the correlation between this map's score and the mathematics examination score was the highest of the three (approximately 0.1). Asking students to make a concept map in a separate session may take them out of the examination atmosphere.
• There is no significant correlation between the third mathematics examination and the concept map on the same subject. Since multiple-choice testing is not sufficient to test students' conceptual intelligence, this situation was expected in the second hypothesis.
• When we analyzed the relationship between the three concept maps and the average of the three literature examinations, we found a correlation significant at the .05 level. The third hypothesis was stated to find an answer to this question: Does making a concept map on any subject require, in addition to knowing that subject, recognizing the relationships between concepts and expressing one's thoughts? The starting assumption was that these abilities are related to verbal thinking and expressing one's thoughts, and that the lesson which most directly reflects these abilities is literature. When the relationship between concept map scores and literature examination scores was examined, a correlation significant at the .05 level, very close to the .01 level, was found. This result supports our third hypothesis: since making a concept map requires knowing the relationships between concepts and being able to express them, there should be a significant relationship between concept map scores and the scores of the literature lesson, which is the most closely related to these abilities.
Taking these results together, it does not appear suitable to use concept maps on their own for testing and evaluation in the education system of our country.

Suggestions
• Before using concept maps to test and evaluate your students, you should make sure that they are using concept maps for their own learning. In this way, students can gain a better understanding of the relationships between concepts. After this stage, you may obtain more accurate results when using concept maps to test your students.
• Concept maps should not be used only for testing; they should be used from the early stages of education (primary school), with every aspect of concept mapping integrated into the education programme.
• Most of the studies that reported positive results from using concept maps for testing and evaluation were carried out in the field of science (Novak and Gowin, 1984; Lomask and others, 1992). By virtue of its content, science can achieve its targets through learning the relationships between concepts. Mathematics, on the other hand, gives more importance than science lessons do to practical application of knowledge relative to conceptual learning. In this context, it is not sufficient to use concept maps alone as a testing and evaluation method. More successful results can be achieved when concept maps are combined with other testing methods that are directed towards application.
• When you want to use concept maps for testing, you should use them in the same session as the other testing method.
• The university entrance examination system used in our country has affected the whole education process. It is generally accepted that this examination system fails to test conceptual knowledge. In such an environment, it is natural for students to look for a practical way to solve questions instead of learning the essence of the subjects. Given that concept maps have positive effects on meaningful learning, more importance should be given to testing students' conceptual intelligence in our country's general testing and evaluation system.
• Studies should first be carried out on using concept maps for teaching and learning over a long period (at least two months), and only later on testing and evaluating students with concept maps.
• Similar research using different mathematics subjects could produce more general results.
• This research has once again brought to light an already known problem in our education system. A system for testing students' conceptual structure is not concordant with the system by which we officially test our students. This situation shows that our students lack conceptual education.
Survey-type research into conceptual education in our country could yield very important results and very fruitful suggestions for our education system.