Effects of an Educational Scenario Exercise on Participants’ Competencies of Systemic Thinking

Problem statement: Education for Sustainable Development (ESD) aims to shape key competencies of individuals and therefore needs methods that enable learners to acquire these competencies. Systemic thinking can be regarded as a meta-competency in ESD, because it contains many important aspects found in most key competencies of ESD. Scenario analysis is described as a learning environment that fosters the acquisition of systemic thinking and other important competencies, but empirical proof of this assumption is rarely found in the literature. This article presents such an empirical study and develops a specific instrument to investigate the effects on the competencies of participants in ESD seminars in which scenario analysis was used as the methodology.
Approach: A study was conducted of four educational seminars, using a pre/post design with two treatment groups, which took part in a scenario analysis seminar, and two control groups. Altogether 72 university students from different disciplines (semesters 1-6) were involved. In a questionnaire, constructs like domain specific knowledge and the perception of the future as well as systemic thinking were operationalized quantitatively in order to achieve a practicable and quick measurement. Similarity Judgement Rating was used to elicit participants’ knowledge structures about climate change in order to gain concept maps for comparison with reference models. The data were analyzed using paired t-tests, mean values, factor and cluster analysis, and correlation.
Results: No significant changes in the structural knowledge, perception and formal knowledge of the groups could be observed, although some developments were noted, e.g. in the perception of the future. The fact that treatment groups showed little advancement in factual knowledge could be read as a hint that generally only weak (measurable) effects had taken place.
Conclusion: Some indications were found that the measurement instrument works in principle, but that its application in the thematic domain of climate change seems to be problematic due to a relatively high level of general knowledge and systemic interrelations of concepts (of climate change) that cannot be precisely described. The hypothesis of educational effects of participation was not refuted but should be investigated in a different thematic context.


INTRODUCTION
The literature describes educational effects on participants in an (educational) scenario analysis (SA), such as competence building through reflection on feedback loops and interrelations, stimulation of imaginative and explorative thinking in alternatives, and the ability to cope with complexity and uncertainties (Godet, 2000; van Notten et al., 2003; Swart et al., 2004). Therefore the use of scenario analysis is discussed as a teaching method in environmental education and Education for Sustainable Development (ESD).
So far there have only been a few attempts to test the educational effects attributed to scenario analysis. This article describes the development of a method and the results of its application in order to investigate the hypothesis that participation in an educational scenario analysis leads to more sophisticated systemic thinking and also affects participants' perception of the future and domain specific knowledge.
For this purpose, a learning setting (higher education for sustainable development) was developed in the form of a project seminar that used a basically qualitative and explorative scenario exercise to arrange a didactic setting. All goals of the seminar were derived from the relevant literature. The seminar focused among other driving forces on the development of regional future scenarios with a special emphasis on climate change. It aimed at answering the question of "what could the environmental conditions for tourism look like in the Black Forest in 2050 when climate change is taken into account" (Burandt and Barth, 2010).
The hypothesis mentioned above was studied especially by means of participants' cross-linked knowledge structure of the domain of climate change. Climate change was chosen because it was part of the seminar's framework of thought and is also a very complex topic that demands well-linked knowledge structures to understand it. Therefore a new method for the measurement of the competency of systemic thinking was developed. One requirement for this method was that it should enable a quantitative collection of data that uses as little of the participants' time as possible.
The article first gives an overview of the background of systemic thinking and integrates it into the discourse about competencies for education for sustainable development. This is followed by an introduction to different methods of measuring systemic thinking before the theoretical framework for its measurement in this study is deduced by using Similarity Judgment Ratings. The second part of the article describes the actual study, the research instrument, results and conclusions of the study.

Systems, cross-linked thinking and structural knowledge:
What is systemic thinking?: The concept of "systems thinking" or "systemic thinking" has evolved over the last 50 years and is now used in many disciplines; this is why it has acquired diverse meanings, ranging from a set of skills to a proper discipline. From an educational perspective, all approaches agree more or less on the existence of thinking skills that help people to better understand interdependencies and processes in systems e.g. in order to achieve improved decision-making or to foresee the outcome of an action. The different approaches deliver tools and methods to cope better with complex situations or to facilitate the acquirement of useful thinking skills.
The quantitatively oriented branch of systems thinking emerged from Forrester's "industrial dynamics" and later "system dynamics" (Forrester, 1987). These concepts follow a strict quantitative paradigm and are often linked with the use of simulation software. Later, Richmond introduced the concept of "systems thinking" and enabled a broader use of simulation software by using flow charts and a graphic user interface instead of a simulation language only. Richmond describes systems thinking as a set of skills indispensable for the (competent) use of simulation software, including: dynamic thinking, closed-loop thinking, generic thinking, structural thinking, operational thinking, continuum thinking and scientific thinking (Richmond, 1993).
The qualitative use of the concept was, in the German speaking countries, introduced and established by Vester (leitmotif of cross linked thinking) (Vester, 1989). Thus, system oriented management approaches refer to Vester's concept and have developed a methodology to "model" and analyze systems without a computer, e.g., with flowcharts (Gomez and Probst, 1995). It is possible to identify a system's core drivers, feedback loops and certain aspects of system dynamics in order to deduce possibilities of managing the system that is being looked at. Senge's "systems thinking", though originating in the quantitative branch, has been developed into the idea of (qualitative) organizational learning (Ossimitz, 2000). Senge describes systems thinking as the most important "fifth discipline" in organizational learning that integrates four other disciplines. In its quintessence, systems thinking is viewed as a skill which enables one to see feedback and processes of change instead of snapshots; and to see "interrelationships rather than linear cause effect chains" (Senge, 1993).
In cognitive psychology Dörner introduced the approach of complex problem solving, which refers to Vester's idea of cross linked thinking (Dörner and Wearing, 1995). He initiated a series of experiments from which evolved a whole new field of research. These experiments were intended to measure how people perform in complex problem situations (e.g. in a complex computer simulation with variables and a lot of feedback loops: "Tanaland") where systems thinking skills had to be used. Research was also done on whether special (educational) trainings can improve performance and skills. As some conclusions from large studies have shown, "correct" behavior is dependent on the situation. This demonstrates that individual systems thinking is context dependent and cross-references to different contexts are difficult to establish. Finally, it was concluded that systemic thinking is a bundle of abilities that cannot be described and investigated like a single ability (Dörner and Wearing, 1995).
It is obvious that a common definition or understanding of systems thinking is difficult to obtain, as is also stated by Sweeney and Sterman (2000): "There are as many lists of systems thinking skills as there are schools of systems thinking". Ossimitz (2000) gives a general definition of systemic thinking that attempts to integrate different system approaches from the literature.
Systemic thinking embraces four interrelated dimensions:
• Interrelated thinking: A thinking in interrelated, systemic structures
• Thinking in models: Explicitly comprehended modeling
• Dynamic thinking: A thinking in dynamic processes (delays, feedback loops, oscillations)
• Steering systems: The ability for practical system management and system control
This is a very comprehensive definition that does not focus on a special "systemic thinking school". However, there is considerable evidence that thinking and knowing does not necessarily lead to appropriate action (Kollmuss and Agyeman, 2002). Furthermore, thinking in models does explicitly integrate a constructivist perspective, but is also understood as a skill at using software tools to represent (mental) models. As "dynamic thinking in cross-linked structures is always thinking in models" (Seel, 1991), people still could fail in representing models in an unfamiliar way (Sweeney and Sterman, 2000). So, Ossimitz's definition (especially from the viewpoint of ESD) seems too limiting. In this article systemic thinking is understood accordingly to comprise only the first three dimensions: first, the ability to recognize, describe and model complex parts of reality as systems; second, the ability to identify drivers, elements and their interrelations; and third, the consideration of dynamic processes (dimension of time) for the further development of (mental) models. (Although using different arguments, Rieß and Mischo (2008) arrived at a similar understanding of systemic thinking.)
Systemic thinking - a competency in education (for SD)?: Internationally, the OECD project "Definition and Selection of Competencies: Theoretical and Conceptual Foundations (DeSeCo)" established a conceptual framework that embraces "Key Competencies for a Successful Life and a Well-Functioning Society". Independently from the ESD discourse, a normative framework was developed consisting of three categories into which certain key competencies can be classified: (1) interacting effectively in socially heterogeneous groups, (2) using tools interactively and (3) acting autonomously (Rychen, 2009).
Education for sustainable development embraces attempts to educate, enable and empower people to contribute and participate actively in the sustainable development of our society (de Haan, 2006). Therefore the German discourse led to the central goal for ESD "to offer possibilities to acquire shaping competence" ("Gestaltungskompetenz") (de Haan, 2006).
Shaping competence covers a set of corresponding key competencies. There is substantial agreement about these key competencies even if the exact definitions and distinctions are still under discussion. Barth classified key competencies of shaping competence into the DeSeCo framework, offered a theoretical background and showed that they are internationally linkable (Barth, 2009).
In neither discourse can "systemic thinking" be found explicitly among the discussed key competencies. But they consist in any case of a set of subsidiary competencies that in their interplay account for the full competence. As an example, the importance of systemic thinking on this subsidiary level is illustrated in the following.
The competency to plan processes and sequences of action refers directly to individual action but also implies a certain degree of systemic understanding, or the ability to identify steps and to correlate them. The importance of systemic thinking becomes more obvious in both the competency to think anticipatorily, to cope with uncertainties and to develop prognoses and the competency for dealing with uncertainties and thinking proactively. Coping with complex systems requires the ability to identify elements as part of a system as well as to adopt a holistic view of interrelations and dynamics (Burandt and Barth, 2010). The necessity to think proactively generally involves anticipating (unintentional) effects and accounting for the possibilities of risk, both of which necessitate a high degree of systemic and dynamic thinking. For the competency to collaborate interdisciplinarily (and transdisciplinarily) and the competency for using, shaping, handling and sharing different sets of information and knowledge, it is necessary to have a systemic understanding of the specific knowledge of e.g. one's own discipline and to be able to transfer it to new contexts. Systemic knowledge in the form of knowledge about structures, processes or interrelations also has to be linked to new contexts or problem areas and it has to be applied and communicated. In addition, the DeSeCo project stresses the general importance of systemic and cross-linked thinking because it contributes to (self) reflection, which is a core aspect of most key competencies (Rychen, 2009).
In summary, systemic thinking is an essential part of most key competencies. Therefore it is understood in the sense of a meta-competency that is part of, or facilitates the use of, specific competencies (Weinert, 2004).

Measuring systemic thinking:
Like measuring the success of (educational) interventions, measuring systemic thinking has always been a focus of interest in the field of educational assessment. Today, this whole area is linked to the different discourses about competencies. The assessment and measurement of competencies is a very complex topic needing "sound models of competence structures, competence levels and competence development" (Klieme et al., 2008).
Pioneering work from cognitive psychology on the assessment of systems thinking in educational contexts is known especially for the Hilu-scenario task. Among other things, pupils had to represent a text that describes the complex life of the Hilu tribe in the form of a chart and also had to answer questions about future states/scenarios (of this dynamic system). Over the years the survey methods became more complex and their scale increased. Later Niedderer et al. (1991) used a very sophisticated method consisting of eight different types of task, and Ossimitz operationalized his definition of systemic thinking by assigning seven sub-aspects (skills) to the four dimensions, e.g. the identification of cross-linkings, whereby these aspects contribute to some extent to all dimensions. Among other tasks he used a refined version of the Hilu-scenario in his survey instrument that he assessed as suitable (Ossimitz, 2000). Sweeney and Sterman (2000) summarize a great amount of (international) research which deals with the efficacy of interventions designed to develop systemic thinking. Their conclusion is that most research questions have so far remained unanswered. They identify (like Ossimitz) a couple of specific "systems thinking skills" and introduce a new type of test to explore students' baseline systems thinking abilities. One of these is the "bathtub task": here a relatively simple task about in- and outflowing water in a bathtub without any feedback loops had to be visualized in a graph (diagram). This type of test captures systems thinking through some technical, quantitatively oriented skills and was repeated in a lot of similar studies in other countries. Sweeney and Sterman found that there is only a weak relationship between education and performance (even in the case of MIT students who are likely to be very familiar with higher mathematics). They assumed that there might be a difference between (everyday) understanding of systems and the presentation of the problem in the form of a graph (Sweeney and Sterman, 2000).
Although these types of tests have delivered viable results, the capturing of systemic thinking in the form of representing a linear quantitative development (water in a bathtub) without (complex) interrelations seems to be too narrowly focused to map it in the sense of education for sustainable development. As introduced before, the meta-competency systems thinking requires a more general understanding. Sustainability implies complex, ill-structured, real-world problems where relations of system elements can often not be exactly quantified. A tool would be desirable that captures quantitatively all qualitative dimensions of systems thinking, but this would lead to very complex survey methods that would be difficult to apply in practice: Niedderer et al. (1991), for example, needed about 8½ hours and Ossimitz (2000) up to 5 hours for their respective surveys. Rieß and Mischo (2008) have developed and validated a questionnaire on the comprehension of systemic thinking in a context relevant to sustainability. They do not provide information about the time needed to answer all questions, but used open questions, as did Sweeney and Sterman (2000). These tasks generally require an elaborate procedure for the analysis of the data because, for example, charts or diagrams have to be interpreted and translated into codes by human raters.
So, a wide range of different methods has evolved over the last decades, each with its own advantages and disadvantages. None of them is able to "measure" the construct of systems thinking independently from context, so that results cannot be generalized. Finally, the right balance has to be found for each study between practicability and complexity regarding the coverage of the theoretical construct.
Measuring systemic thinking in ESD: Following Ossimitz (2000) some sub-aspects can be assigned to the four different dimensions of systemic thinking - in the sense of a meta-competency of key competencies for sustainable development. To keep the survey method as lean as possible, the construct of systems thinking will be represented here mainly by one aspect: the identification of connections and relations of drivers in a complex system of a given domain, because cross-linked thinking is an important aspect that is closely interwoven with all dimensions of systemic thinking (Ossimitz, 2000).
Cross-linked thinking demands both a certain degree of knowledge of the system one is looking at and a specific use (transfer) of this knowledge. Cognitive psychology gives different constructions of knowledge that can be useful for cross-linked thinking. Beside declarative (Ryle and White, 1972), procedural (Schank and Abelson, 1979) and tacit knowledge (Polanyi, 1997), an intermediate type of knowledge appears most useful "that mediates the translation of declarative into procedural knowledge and facilitates the application of procedural knowledge. Structural knowledge is the knowledge of how concepts within a domain are interrelated" (Diekhoff, 1983). According to this the following assumption can therefore be made: systemic thinking as a meta-competency for ESD can be represented largely by structural knowledge because qualitative relationships among concepts play a particularly important role in contexts relevant to sustainability.
Structural knowledge is a widely accepted construct of cognitive structure. It is based on the theory of semantic networks (Collins and Quillian, 1969), the most important feature of which is that human memory is organized semantically. Memory structures are composed of nodes and ordered relationships or links connecting them. Three important aspects build up the rationale for structural knowledge (Jonassen et al., 1993):
• Structure is inherent in all knowledge (Mandler, 2004)
• Learners assimilate structural knowledge (Shavelson, 1972)
• Experts' structural knowledge differs from that of novices (Chi et al., 1981)
Generally, when one works with structural knowledge, two assumptions have to be made. The first is the concept of semantic similarity: it refers to the spreading activation theory (Collins and Loftus, 1975) which basically says that the more closely two concepts are linked, or the more common properties two concepts have, the more similar they are processed in semantic networks. The second assumption that has very often to be made is that semantic similarity in the form of semantic space of concepts in memory can be represented in terms of geometric space.
The literature offers several ways to analyze structural knowledge. Generally it is necessary to (1) elicit the structural knowledge (directly or indirectly) in order to (2) represent and analyze the underlying structure (Jonassen et al., 1993). The elicitation of the structure of knowledge can be done by similarity ratings or similarity judgment tests (SJTs). For this, after a set of related concepts that define a subject's domain have been identified, the respondent is asked to assess the degree of relationship/proximity of each pair of concepts (e.g., on a Likert scale). It is the "most direct method for rating or comparing the semantic similarity between concepts in an individual's cognitive structure" (Jonassen et al., 1993). The results of the rating can be transformed into a (proximity) matrix that delivers the data basis for the representation of the knowledge structure. This is an indirect method to elicit the knowledge structure (Stanners et al., 1983; Goldsmith et al., 1991).
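The step from pairwise ratings to a proximity matrix can be illustrated with a minimal Python sketch. The concept names, the ratings and the function name `ratings_to_matrix` are hypothetical and serve only as illustration; the sketch assumes that each pair is rated once and that the rating is symmetric.

```python
def ratings_to_matrix(concepts, ratings):
    """Build a symmetric proximity matrix from pairwise ratings.

    `ratings` maps a pair of concept names to one rating value.
    The diagonal stays 0 because a concept is not rated against itself.
    """
    index = {name: pos for pos, name in enumerate(concepts)}
    n = len(concepts)
    matrix = [[0] * n for _ in range(n)]
    for (first, second), value in ratings.items():
        i, j = index[first], index[second]
        matrix[i][j] = value  # fill both halves so the matrix is symmetric
        matrix[j][i] = value
    return matrix


# Hypothetical example with three placeholder concepts.
concepts = ["A", "B", "C"]
ratings = {("A", "B"): 1, ("A", "C"): 6, ("B", "C"): 3}
matrix = ratings_to_matrix(concepts, ratings)
for row in matrix:
    print(row)
```

Each row of the result holds the rated proximity of one concept to all others, which is the form required for the network representation.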
In a next step, the underlying knowledge structure of the matrix can be represented and analyzed. It is for example possible to transform the matrix into cognitive maps (Schvaneveldt et al., 1989; Jonassen et al., 1993; Shavelson et al., 2005). The software package "Pathfinder KNOT" delivers several possibilities of creating cognitive maps with nodes and links, e.g. in the form of data nets. The Pathfinder algorithm is in addition also able to extract the system's latent structure by identifying the closest connections of concepts. Cognitive maps can be analyzed qualitatively or quantitatively. To process cognitive maps automatically it is possible either to correlate their raw data or to compare maps with a reference system. KNOT supports the comparison of networks, e.g. by the index of correspondence "csim" that basically relates the number of interrelations to the number of shared interrelations (corrected by probability). Figure 1 shows a data matrix (of concept pairs) and the resulting Pathfinder network.
Fig. 1: Similarity matrix and Pathfinder network (∞, n-1) of a first semester student at t1
It is therefore suggested that a comparison of students' networks with that of an expert at different stages of an educational setting can show changes in individual knowledge structures and indicate an advancement in systemic thinking.
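The idea of extracting the closest connections and comparing two networks can be sketched in Python. This is a simplified illustration and not the KNOT implementation: it assumes the common Pathfinder variant in which a direct link is kept only if no indirect path has a smaller maximum step, and it reports a plain ratio of shared links to all links without the chance correction that csim applies. The function names and the example distances are hypothetical.

```python
from itertools import combinations


def pathfinder_links(dist):
    """Return the links kept when a direct link must not be longer than
    the largest step on the best indirect path (minimax path distance)."""
    n = len(dist)
    minimax = [row[:] for row in dist]
    # Floyd-Warshall variant: path length = largest single step on the path.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    return {(i, j) for i, j in combinations(range(n), 2)
            if dist[i][j] <= minimax[i][j]}


def link_overlap(links_a, links_b):
    """Shared links divided by all links of both networks (uncorrected)."""
    union = links_a | links_b
    return len(links_a & links_b) / len(union) if union else 1.0


# Hypothetical distance matrices (small value = closely related).
student = [[0, 1, 5], [1, 0, 2], [5, 2, 0]]
reference = [[0, 1, 2], [1, 0, 5], [2, 5, 0]]

student_links = pathfinder_links(student)
reference_links = pathfinder_links(reference)
overlap = link_overlap(student_links, reference_links)
print(sorted(student_links), sorted(reference_links), overlap)
```

In this toy example the student network drops the weak direct link between the first and third concept, because the route via the second concept is closer at every step.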

MATERIALS AND METHODS
The primary purpose of this study was to test the hypothesis that participation in an educational scenario analysis has an effect on the competency of systemic thinking and, secondly, an impact on participants' knowledge and their perception of the future.
These hypotheses were tested in a pre/post design with four university seminars: Two Treatment Groups (TG) and two Control Groups (CG). Table 1 gives an overview of the samples. The CGs participated in a "normal" seminar while the TGs took part in the educational scenario exercise. The study was conducted in the winter semesters of 2008/2009 and 2009/2010; course length was 14 weeks.
In the first sessions students had to fill in a questionnaire (t1) and at the end of the seminars (t2) the same questionnaire had to be filled in again in an online version. It took students about 35-45 minutes each to answer all questions. As the data in the questionnaires were collected anonymously, participants were asked to create their own personal code so that t1 and t2 could be compared individually. At t1, each seminar was attended by about 30 students from various disciplines, such as environmental sciences, cultural sciences, law, economics and education in their (obligatory) complementary part of their disciplinary studies at [name of university] (semesters 1-5). Due to drop-outs and to wrong code entries the number of data pairs was reduced so that not all respondents could be taken into the analysis (see N paired datasets).
The questionnaire consisted of two parts: one was to measure systemic thinking and the other was to investigate changes in constructs of perception and knowledge. As the TGs' class work concerned climate change, that was also the topic which was chosen as the questionnaire's thematic domain. To elicit structural knowledge an SJT about climate change was developed. Table 2 shows 12 concepts from the domain of "climate change" which were derived covering ecological, societal and economic aspects. These concepts were paired in all possible combinations (12 × (12 - 1) / 2 = 66) and ordered randomly in the questionnaire to avoid context dependent effects. The 66 concept pairs had to be rated for their proximity (or semantic similarity, which could include causal relationships) on a scale from one (very close) to seven (no relation).
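The pairing and random ordering can be illustrated with a short Python sketch. The concept labels are placeholders (the actual concepts are those of Table 2), and the seed is an arbitrary assumption used only to make the example reproducible.

```python
import random
from itertools import combinations

# Placeholder labels standing in for the 12 concepts of Table 2.
concepts = [f"concept_{k}" for k in range(1, 13)]

# All unordered pairs: 12 * (12 - 1) / 2 combinations.
pairs = list(combinations(concepts, 2))

# Shuffle a copy so that neighbouring items do not share a context.
rng = random.Random(2008)  # arbitrary seed, assumption for reproducibility
questionnaire_order = pairs[:]
rng.shuffle(questionnaire_order)

print(len(pairs))
print(questionnaire_order[:3])
```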
For the analysis of students' responses the list of pairs was converted into a similarity (or distance) matrix and transformed by the Pathfinder algorithm into concept maps in order to represent the underlying knowledge structure. To analyze the quality of and differences in individual systemic thinking, both the structural knowledge networks and the data matrices were compared to reference systems by two indices, csim and R SJT (correlation of data matrix), calculated with KNOT. The main reference system was created as the median of the responses of three national and international experts on climate change. The group average and the TG teacher's concept map were used as additional references.
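A minimal Python sketch can illustrate how a median reference matrix and a correlation of data matrices might be computed. It rests on assumptions that are not taken from KNOT: the median is formed cell by cell, and the correlation is a Pearson coefficient over the off-diagonal pairs only. All function names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def upper_triangle(matrix):
    """Flatten the ratings above the diagonal into one list."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def median_reference(expert_matrices):
    """Cell-wise median of several expert matrices (assumed procedure)."""
    n = len(expert_matrices[0])
    return [[median(m[i][j] for m in expert_matrices) for j in range(n)]
            for i in range(n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical ratings of three experts and one student (three concepts).
experts = [
    [[0, 1, 5], [1, 0, 3], [5, 3, 0]],
    [[0, 2, 6], [2, 0, 3], [6, 3, 0]],
    [[0, 1, 7], [1, 0, 4], [7, 4, 0]],
]
student = [[0, 2, 7], [2, 0, 4], [7, 4, 0]]

reference = median_reference(experts)
r_value = pearson(upper_triangle(student), upper_triangle(reference))
print(reference, r_value)
```

Restricting the correlation to the upper triangle avoids counting each pair twice and keeps the constant diagonal out of the coefficient.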
Attitude to climate change, perception of climate change and perception of the future were tested by different item batteries (Likert scale 1-7). Constructs were validated by factor analysis and tested for reliability (Cronbach's alpha range 0.56-0.74 for t1 and t2). Students were asked to self-assess their knowledge about climate change by assigning school grades to it. Knowledge was also tested by two item batteries:
• Assessment of importance of 18 given "drivers" for climate change (Likert scale 1-4)
• "'Facts' about climate change" consisted of a list of 15 true/false statements, arranged in order of increasing difficulty, that also contained some of the public "myths" about climate change in order to investigate whether participants' knowledge would advance to the level of expert knowledge. A third option ("don't know") was given to minimize guesses
The data were analyzed by descriptive and multivariate statistics (percentage, mean, paired t-tests). Factor and cluster analysis were also applied to the SJT data in order to gain additional access to data as well as to identify different types of responding structures/areas of improvement within the 12 concepts of climate change and latent responding structures among participants. Other aspects that were calculated included bivariate correlation (Pearson and Spearman) of knowledge, different constructs and the similarity of structural knowledge to changes in systemic thinking (csim and R SJT) and demographic aspects.
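As an illustration of the reliability test, Cronbach's alpha can be computed from the item variances and the variance of the summed scores. The following Python sketch uses the standard formula with sample variances; the response data are invented and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented answers of four respondents to a three-item battery (scale 1-7).
responses = [[2, 3, 3], [4, 4, 5], [5, 6, 6], [6, 7, 7]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values close to 1 indicate that the items of a battery vary together; batteries with few items, as in short questionnaires, tend to yield lower values.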

RESULTS
Changes occurred in all four groups during the period of the seminars in both the latent (csim) and the direct structure (R SJT ) of participants' knowledge. These changes in relation to the different reference systems are shown in Table 3. Taking the structural knowledge of the "3-experts average" model as reference point, both TGs' mean values of csim and R SJT are relatively high at the beginning (t1), but csim for example decreases at t2 (0.021 and 0.015), while CG01 showed an increase of -0.026 and CG02 a loss of 0.043 in correspondence with the reference model. In the TGs especially no common effect could be observed. Tests with additional reference models, like the teacher's one, delivered similar results.
A significant measurable effect was that participants in CG02 veered away from a shared knowledge structure of climate change ("Group Average" csim), whereas in both TGs the participants approximated their structural knowledge minimally (<0).
The data of the similarity matrices were investigated to identify structures and parts in which significant differences between the groups could be observed. No usable factors were yielded by a factor analysis of the 66 concept-pair variables of the difference t1-t2 (showing areas of strong and slight change) and of the absolute values |t1-reference model|-|t2-reference model| (showing areas of improvement in relation to the reference model). Concepts of major change were identified: within the top third of concept pairs with the highest number of changes (t1-t2), the dominant concepts were groundwater (7x), diseases (6x), erosion (6x) and cryosphere (5x). But no connections to different groups could be established.
A cluster analysis of participants' responding structures showed that those students who showed an above-average improvement in their knowledge structure during the course also evinced an above average correspondence with the reference model at t1. Additionally, there were no usable results of clusters of participants (by analyzing the strength of changes in the structural knowledge between t1 and t2). An analysis of only those areas which showed major changes (see above) did not deliver any clusters or groups that had describable differences.
A paired t-test of perception, attitude and knowledge did not deliver significant results either. Therefore only descriptive data about attitude and perception (Table 4) and relevant knowledge (Table 5) have been included in this article. These results do not indicate differences between the TGs and the CGs between t1 and t2. TG01 achieved a higher score for seeing/accepting the unpredictability of the future after the seminar. Both TGs reached a higher score in their attitude to act proactively in the face of an uncertain future, but they had also shown higher values than the CGs at t1 (Table 4).
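The paired comparison of t1 and t2 values can be sketched in Python as follows. The scores are invented, the function name is hypothetical, and the sketch returns only the test statistic and the degrees of freedom; a p-value would additionally require the t distribution, for example via a library routine such as `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """t statistic and degrees of freedom for paired pre/post scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Invented scores of four participants at t1 and t2.
pre_scores = [1, 2, 3, 4]
post_scores = [2, 3, 5, 6]
t_value, dof = paired_t(pre_scores, post_scores)
print(round(t_value, 3), dof)
```

Because the test works on individual differences, it depends on matching each participant's t1 and t2 questionnaires, which is why the personal codes were needed.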
Concerning knowledge, all groups tended generally to consider all given impacts as relevant and to have difficulties in distinguishing between important and unimportant impacts; only about 56% of the maximum score of "wrong/not important impact factors" (like waste separation) was reached (Table 5); no major differences between t1 and t2 were noted. In "'Facts' about climate change", TG02 reached a higher percentage (3.9%), while TG01 lost and CG02 gained slightly. CG01 lost about 6.5% of correct answers.
Table notes: ***: significant at the <0.001 level; a: index of correspondence corrected by probability, csim (range 0 to 1, similarity of cognitive maps with reference system); b: correlation coefficient R SJT (accordance of data matrices with reference system); M = mean, SD = standard deviation
Table 4: Descriptive data of participants' attitude and perception of the future (t1, pre; t2, post)
Students were asked to assign school grades to their own knowledge about climate change. This item correlates with "impacts on climate change (all)" (0.338, p<0.01) and is also a predictor for "true facts about climate change" (0.247, p<0.05). This self-assessment correlates very clearly (0.605, p<0.01) with the csim index (3-experts model) at t1 but not at t2. The similarity of the knowledge structure (csim) with the teacher's model (not shown in table) correlates negatively with the number of wrong answers given in the true/false statement test (-0.207, p<0.05). No demographic data showed a correlation with the constructs and parameters used.

DISCUSSION
Although there were some changes visible in the data between t1 and t2, no significant results could be found to verify the hypothesis that perception, knowledge and knowledge structure were affected by participation in the educational scenario analysis. The following discusses the individual aspects in more detail.

Formal Knowledge, perception and attitude:
Considering the mean values of all four groups it is noticeable that the two treatment groups behave differently: for instance, while the expert knowledge of TG02 improved slightly it decreased in TG01 (generally the standard deviation is relatively high), but in other cases TG01 performed better.
The age, number of semesters and course of studies did not correlate significantly with results in the SJT and knowledge tests. These results suggest that a) cross-linked thinking is a competency that is independent of courses of studies and b) while knowledge about climate change is general knowledge, it is nevertheless more difficult than expected to advance (partly wrong) general knowledge to expert level. To illustrate: a frequent topic of discussion in the TGs was climate simulations that calculate mean values over time periods of 30 years. This means that it is very difficult to draw conclusions from the results of climate simulations about individual weather events in a specific year. One item (knowledge true/false statement) stated that it will never snow in the Black Forest if the global mean temperature rises by 3°C. This statement is incorrect (as is shown e.g., by the freak snowfall in the Sahara in 2005), but with few exceptions this question was answered incorrectly at t1 as well as at t2 in all groups. In both TGs enough information on and discussion of this topic was offered so that a transfer of this knowledge would have been possible, yet it did not happen. Assuming that the question was understood correctly, some support can be found in this result for the supposition that participants in the TGs learned less than expected.
Perception and attitudes did not correlate significantly with other variables, nor could significant differences be observed. A positive change in personal proactivity was noted, but the values of the TGs were higher than those of the CGs at all stages. These higher values can also be interpreted as showing that students' interest in and awareness of this topic made them choose the seminars about scenario analysis and future planning.

Structural knowledge:
The results for this aspect can be interpreted in three different ways: (a) the educational scenario analysis did not have a measurable effect on participants' knowledge structure; or (b) the teacher or the design of the educational scenario analysis was insufficient; or (c) SJT and comparison of indices (with reference models) did not capture relevant knowledge structure or delivered unreliable results.
The falsification of the main hypothesis would contradict all the literature that describes educational effects of participation in a scenario analysis, although the attributed effects have not been proved and measured quantitatively. However, the knowledge tests did not show any significant advancement in knowledge either. The CGs did not work on climate change, so no significant change in formal knowledge could be expected. Due to the great relevance of climate change in the media, a significant rise in this performance could have indicated strong outside influence on (all) the seminars, but there was none. Additionally, the CGs and TG01 participated in at least one further lecture or seminar treating the topic of climate change. In the TGs, climate change was discussed as driver and impact factor in connection with the guiding question of the seminars of how tourism in the Black Forest could develop in future. While there was no explicit factual input from the teacher, there were students' presentations and, as mentioned at the beginning of this chapter, relevant facts and information were offered during the seminars to advance individual performance in the knowledge test. As no significant increase in formal knowledge about climate change was visible, this did not contribute to a confirmation of the main hypothesis. These considerations lead to possibility (b).
Insufficiency of the teacher or the design of the learning setting cannot be investigated by the design of this study. For one, the TG students' final reports (assessments) indicated a general understanding of the methodology, and most students did indeed report new insights gained into this complex knowledge topic. Although no qualitative analyses of the seminars' assessments were carried out for this study, some learning effects of participation in the seminar can nonetheless be assumed from them. The regular course evaluation of the Leuphana University of Lueneburg indicated a (self-reported) success of the seminar. For another, the design of the learning setting for the TGs varied (Table 1) while the guiding question and steps within the seminars were exactly the same. Differing influences of the setting of the seminars on the results could not be found.
Measuring systemic thinking and knowledge structure is a difficult task, as was indicated in the theory section of this article. Similarity judgment ratings are used in practice to capture cognitive concept maps. Concept maps deliver insightful qualitative results, but statistical processing is difficult. The two indices (csim and R SJT) showed contradictory prognostic validity in previous studies: csim had greater validity (Goldsmith et al., 1991) or lower validity (Großschedl, 2010) in relation to R SJT. Nevertheless, the indices have delivered valid results for the comparison of conceptual knowledge (Großschedl, 2010) and its structure (Shavelson et al., 2005; Beatty and Gerace, 2002) gained by SJTs. How can the results of SJT comparison in this study be appraised?
First, it can be asserted that participants' self-assessed climate change knowledge at t1 correlates highly with the correspondence of structural knowledge to the 3-experts-model at t1, which indicates some validity of the measurement instrument. From a constructivist perspective it could be argued that a comparison with an expert's model need not show any approximation as long as the expert did not teach the lessons. Therefore the teacher's cognitive map was analyzed in addition, but it showed results similar to those of the external experts' model.
Nor were usable results delivered by the factor analysis carried out to elicit certain areas of perturbation (differences between t1 and t2 in the absolute values of the difference in single pair ratings from the teacher's model) or areas of strong reframing in the knowledge structure (individual change from t1 to t2). The individual differences from a common shared knowledge structure (group average model) decreased in the TGs, although not significantly, while the models differed more strongly (significantly in one case, Table 3) in the CGs. This suggests a more common reframing of the TGs' mental models.
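The comparison with a group average model can be illustrated with a small sketch. It assumes that each participant's pair ratings are stored as a flat list and that the individual difference is measured as the mean absolute deviation from the group's average rating per pair; both the measure and the ratings are hypothetical choices for illustration, not the procedure documented in this study.

```python
def group_average(ratings):
    """Average rating per concept pair across all participants."""
    n = len(ratings)
    return [sum(column) / n for column in zip(*ratings)]


def mean_abs_deviation(ratings):
    """Each participant's mean absolute distance from the group average."""
    avg = group_average(ratings)
    return [
        sum(abs(r - a) for r, a in zip(person, avg)) / len(avg)
        for person in ratings
    ]


# Hypothetical ratings of three concept pairs by three participants.
t1 = [[1, 7, 4], [7, 1, 4], [4, 4, 4]]
t2 = [[3, 5, 4], [5, 3, 4], [4, 4, 4]]

dev_t1 = mean_abs_deviation(t1)
dev_t2 = mean_abs_deviation(t2)
print(dev_t1, [round(d, 3) for d in dev_t2])
```

In this constructed example the deviations shrink from t1 to t2, which is the pattern a more commonly shared knowledge structure would produce; whether such a decrease is significant would still need a separate test.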
Generally it can be debated whether the 12 concepts chosen for the SJT focus too exclusively on certain aspects, so that even in the case of enhanced systemic thinking hardly any changes could be made visible or measurable. Another explanation also suggests itself: the general knowledge of climate change, one of the most hotly debated topics at present, was initially relatively high (at t1), so that, again, changes could hardly be measured because the concepts were already strongly linked at t1. Other studies using concept maps for quantitative measurements were confronted with only a few direct links between the concepts at t1 (e.g. only one connection (Wasmann-Frahm, 2005)), which offers broader possibilities for respondents to improve. In contrast, SJT offers the possibility to assign strength to the connection (here from 1 to 7), whereas in concept maps it is the kind of connection that is assessed (and whether there is one or not). Interestingly, neither the factor analysis described above nor the cluster analysis led to usable results in identifying structures. Even if the whole construct of structural knowledge did not change significantly during the treatment, it was reasonable to expect that at least some of its parts would undergo a change. It seems that, if the treatment has an effect on participants, there is no common response pattern.
Finally, climate change as a domain for measuring and comparing structural knowledge could be a problematic choice because even in science many interrelations of concepts are under discussion and cannot be described precisely. Although the three expert ratings used had a correlation > 0.8, a fourth expert could not be included in the model because his rating differed too much from that of the other three. Climate change and other complex sustainability problems are probably too difficult to be used as domains for exact quantitative comparisons of structural knowledge.

CONCLUSION
The hypothesis that participation in an educational scenario analysis has effects on participants' knowledge, future perception and competency of cross-linked thinking could not be verified by the data; no final conclusion could be drawn from them. Pedagogical research is often confronted with weak effects.
On the one hand, some hints were given that while the effects of scenario analysis attested in the literature may be describable qualitatively, a quantitative effect is difficult to measure. The design of the learning setting should also be rethought in order to foster the acquisition of formal knowledge.
On the other hand, a methodology was developed both to measure the competency of systemic thinking by investigating the knowledge structure quantitatively by SJT and by comparison with experts' reference models and also to investigate a way to enable a large scale investigation of competence development. This instrument should be tested with a different set of concepts which are less linked to each other per se. The domain of climate change was probably not an ideal choice for eliciting structural knowledge.
In addition, the research instrument developed in this article should be tested in a different context and with a different thematic domain. Finally, a quantitative, empirically founded proof of educational effects of scenario analysis has yet to be given.