Faced with increasing class sizes, even for seminars, we decided to try holding some seminars over a computer conferencing system. We used the Network Telepathy computer conferencing system (Ashmount Research 1992), which supports threaded conferencing (similar to the COSY system developed at the University of Waterloo) through a simple browsing interface, as seen in Fig. 1. It shows the messages that start discussion threads, including the author, number of comments made and title. Students can browse through the messages that start a discussion thread, pick one, then look at the comments on that subject within the thread. Every day they can look at the new comments, and write some of their own. Network Telepathy (TPNET) runs on our Novell computer network, so it is accessible from any of our 60 machines. TPNET lets students and tutors write and leave messages contributing to the discussion whenever convenient, rather than forcing them to turn up at the same time (unlike audio and video conferencing or chat systems, which require participants to be present at their computers at the same time).
Fig. 1. Reading an on-line tutorial in the Telepathy browser.
This way of working promises a more efficient use of students' and tutors' time, since they no longer need to keep on repeating points others may have forgotten, they can work at times of their own choosing, and tutors can handle more than one seminar group at their convenience. Also, the system organises the discussion points by subject and by threads of linked comment, promising to help students link ideas together.
But there is no point in making the teaching more efficient if the key learning objectives cannot be met. So we set out to evaluate the quality of learning when the students were doing their seminars face-to-face and when taking part in seminars on our Network Telepathy computer conferencing system. This paper provides an overview of the resulting work. More detailed accounts of particular parts of this work have appeared elsewhere (Webb et al, 1994, Newman et al, 1995, Webb et al, 1995, Newman et al, 1996).
From educational research we know that learners adopt deep or surface learning strategies. When surface learning, they skim, memorise and regurgitate for tests; when deep learning, they try to develop a critical understanding of the material. Deep learners integrate new learning into their existing knowledge, whereas in surface learning uninterpreted information is transferred from book to brain to examination paper. There are many aspects of current university education that encourage surface learning styles: from an emphasis on getting high marks to large class sizes (Gibbs & Jenkins, 1992). So what can we do to promote deep learning?

Fig. 2. Key learning quality relationships.
Deep learning is promoted by active learner participation. Biggs associated deep learning approaches with 'affective involvement' which is supported by interaction (Biggs, 1987). This interaction takes place in a social context, such as group learning. Lipman emphasised this social context, claiming that the development of a 'community of enquiry' is essential for the development of higher level, critical thinking skills within the individual (Lipman, 1991). In group learning we can find such communities of enquiry.
So there are relationships between deep learning, critical thinking and group learning, as shown in the semantic network in Fig. 2. Critical thinking is a key skill that is required in deep learning. Group learning provides a good (but not unique) educational context for critical thinking processes and deep learning styles, as well as promoting critical thinking through interaction. The lack of interaction in lectures and other unidirectional information transfers severely limits the scope for the testing of ideas, justification and criticism that occurs in more challenging group learning situations.
Best educational practice provides techniques for supporting deep learning. Large, face-to-face, classes can be broken up into smaller active groups using rounds, line-ups, pyramids, projects, courts of enquiry, posters, brainstorming and so on. Students can carry out group research or development projects, case studies or presentations. Critical peer and self-assessment can be used instead of tutor marks to encourage critical thinking (Gibbs & Jenkins, 1992).
Such techniques work without a computer in sight. The challenge is to use computer tools to support critical thinking during group learning.
The key feature of CSCL is the deliberate support for group learning processes. So it is not the use of computer tools by members of a team to do things they could do as individuals, such as designing a poster using presentation software, or responding as a group to the questions of a CBT package.
Instead the CSCL software mediates between group members, and provides tools that either force, or can be used to facilitate, desired group learning processes: processes such as creative ideas generation, critical discussion, or drawing together ideas into an agreed report.
We can design the instructional context to promote group learning while using current computer tools, such as shared whiteboards (where people on a network can all work on the same document at the same time) or computer conferencing. Or we can design computer tools to deliberately support particular group learning processes, just as Group Decision Support Systems, like CM/1, support particular models of decision-making, communication and thinking (Briggs & Nunamaker, 1994).
Second level groupware converts exchanged opinions into a shared understanding of the subject, and shared lists of priorities. Third level groupware goes further, to produce a shared mental model, perhaps as a group causal map. These types of groupware require the participants to think critically and develop in-depth understanding. We need to design second and third level groupware to directly support participants who adopt deep learning strategies, and teaching methods based on course needs, using educationally proven group learning techniques.
Until they arrive it is possible for teachers and learners to change methods to fit in with the tool's limitations. This is what we did when using the Network Telepathy computer conferencing system.
In either case, we need to find a way to evaluate the learning process, in order to find out whether the educational or technical innovation is worthwhile. For that we need an educational theory.
Different people consider different levels more important, leading to a lack of agreed criteria and techniques among evaluators, as shown by the work presented at the evaluation workshop held before the European CSCW conference in Milan in September 1993. Some relied on observation and picked out striking statements, which gave clues to help designers, but did not measure anything. Others measured everything in sight, then tried to analyse the data--without either theories of good work, or control groups.
What we need are theories of what constitutes good work. In particular, theories of quality working processes that can be used long before we can measure performance outcomes. For at least one type of work we have such theories: the work we call learning.
What are we trying to evaluate? We are looking for signs of critical thinking in a social context, as evidence for group and deep learning.
This is a measure of the learning process, not the final outcome as in assessing student performance. The comparison is between face-to-face and CSCL seminars, not between individual students.
We are looking for an assessment of the quality of this learning process, not its quantity. Robin Mason found that surveys, user interviews, empirical experimentation, case studies and computer generated statistical measurements are being used to evaluate computer conferencing (Mason, 1991). She rightly criticised them, as none tell us much about the quality of student learning taking place.
Nor does assessing the system usability tell us much about the learning quality. Measures of usability help system designers improve their systems. They do not help teachers decide whether and how to use CSCL in classes. A usable system is a necessary but not sufficient condition for CSCL.
So what do we measure? Henri identified five dimensions for analysing Computer-Mediated Communication (CMC): participative, social, interactive, cognitive and metacognitive (Henri 1991). Briefly, the first three dimensions reflect the degree of active participation on the system, the social effects of taking part in CMC, and an analysis of the interactions taking place over the system. The cognitive and metacognitive dimensions relate to the psychological processes taking place during learning. It is the cognitive (and metacognitive) dimension on which we focused, since the participative dimension leads to measures of quantity, not quality, and the social and interactive dimensions tell us about motivations and conversations rather than the learning taking place. Our evaluation included both declarative knowledge concerning the person, task and learning strategy (Henri's cognitive dimension) and procedural knowledge relating to evaluation, planning, regulation and self-awareness (what Henri calls metacognitive).
To evaluate this cognitive dimension of CSCL, we need a theory of critical thinking. Garrison's theory envisages critical thinking as a sequential problem-solving process with five stages: problem identification, problem definition, problem exploration, problem applicability and problem integration (Garrison, 1992). In these stages learners use the corresponding five critical reasoning skills that Henri (1991) identified, as defined and illustrated in Fig. 3. For a fuller discussion, see Garrison's or Henri's papers, or Newman et al (1995).
Given this theory of critical thinking, it is possible to identify indicators of critical thinking at each stage. Example indicators are shown in Fig. 3.
| Garrison's CT stages | Henri's critical reasoning skills |
|---|---|
| 1. Problem identification | Elementary clarification |
| a triggering event arouses interest in a problem e.g. aroused interest, triggered a desire to understand, aware of issues | observing or studying a problem, identifying its elements, observing their linkages e.g. identifying relevant elements, reformulating the problem, asking a relevant question, identifying previously stated hypotheses |
| 2. Problem definition | In-depth clarification |
| define problem boundaries, ends and means e.g. clarified subject, identified personal experience | analysing a problem to understand its underlying values, beliefs and assumptions e.g. defining the terms, identifying assumptions, establishing referential criteria, seeking out specialised information |
| 3. Problem exploration | Inference |
| ability to see to heart of problem based on deep understanding of situation e.g. explore new ideas, develop new solutions, understand issues, disentangle ideas | admitting or proposing an idea based on links to admittedly true propositions e.g. drawing conclusions, making generalisations, formulating a proposition which proceeds from previous statements |
| 4. Problem applicability | Judgement |
| evaluation of alternative solutions and new ideas e.g. critical assessment, judge solutions, critically evaluate, assess practical knowledge | making decisions, evaluations and criticisms e.g. judging the relevance of solutions, value judgements, judging inferences |
| 5. Problem integration | Strategies |
| acting upon understanding to validate knowledge e.g. previous knowledge, test solutions, apply ideas, relating to other course tasks | for application of solution following on choice or decision e.g. deciding on the action to be taken, proposing one or more solutions, interacting with those concerned |
The indicators for Garrison's stages of critical thinking were expressed as 17 Likert scale questions in a pair of student post-experience questionnaires, one for their experience of face-to-face seminars, the other for computer conferencing, designed to measure, stage-by-stage:
In addition to post-experience questionnaires, we set out to develop a content analysis technique to measure the quality of learning taking place.
We found, as Henri did, that it was harder to use indicators of the stage of critical thinking in content analysis. Henri suggested using paired opposites, one indicating surface processing the other in-depth processing (like surface and deep learning). For example, 'making judgements without offering justification', versus 'setting out the advantages and disadvantages of a situation or solution'. On examining her list, you find these are indicators of critical versus uncritical thinking, at different stages of the critical thinking process.
We developed our own set of paired indicators, by simplifying Henri's pairs, by looking for indicators in all of Garrison's stages, and from our experience of using similar techniques for assessing student work in computer conferences. These are shown in Fig. 4. These pairs of indicators were then used to define the type of statements to look for in seminar and computer conference transcripts.
| R+/- | Relevance | ||
|---|---|---|---|
| R+ | relevant statements | R- | irrelevant statements, diversions |
| I+/- | Importance | ||
| I+ | Important points/issues | I- | unimportant, trivial points/issues |
| N+/- | Novelty. New info, ideas, solutions | ||
| NP+ | New problem-related information | NP- | Repeating what has been said |
| NI+ | New ideas for discussion | NI- | False or trivial leads |
| NS+ | New solutions to problems | NS- | Accepting first offered solution |
| NQ+ | Welcoming new ideas | NQ- | Squashing, putting down new ideas |
| NL+ | learner (student) brings new things in | NL- | dragged in by tutor |
| O+/- | Bringing outside knowledge/experience to bear on problem | ||
| OE+ | Drawing on personal experience | O- | Sticking to prejudice or assumptions |
| OC+ | Refer to course material | ||
| OM+ | Use relevant outside material | ||
| OK+ | Evidence of using previous knowledge | ||
| OP+ | Course related problems brought in. E.g. students identify problems from lectures and texts | ||
| OQ+ | Welcoming outside knowledge | OQ- | Squashing attempts to bring in outside knowledge |
| A+/- | Ambiguities: clarified or confused | ||
| AC+ | Clear, unambiguous statements | AC- | Confused statements |
| A+ | Discuss ambiguities to clear them up | A- | Continue to ignore ambiguities |
| L+/- | Linking ideas, interpretation | ||
| L+ | Linking facts, ideas and notions | L- | Repeating information without making inferences or offering an interpretation. |
| L+ | Generating new data from information collected | L- | Stating that one shares the ideas or opinions stated, without taking these further or adding any personal comments. |
| J+/- | Justification | ||
| JP+ | Providing proof or examples | JP- | Irrelevant or obscuring questions or examples |
| JS+ | Justifying solutions or judgements | JS- | Offering judgements or solutions without explanations or justification |
| JS+ | Setting out advantages and disadvantages of situation or solution | JS- | Offering several solutions without suggesting which is the most appropriate. |
| C+/- | Critical assessment | ||
| C+ | Critical assessment/evaluation of own or others contributions | C- | Uncritical acceptance or unreasoned rejection |
| CT+ | Tutor prompts for critical evaluation | CT- | Tutor uncritically accepts |
| P+/- | Practical utility (grounding) | ||
| P+ | relate possible solutions to familiar situations | P- | discuss in a vacuum (treat as if on Mars) |
| P+ | discuss practical utility of new ideas | P- | suggest impractical solutions |
| W+/- | Width of understanding (complete picture) | ||
| W+ | Widen discussion. E.g. problem within a larger perspective, intervention strategies within a wider framework | W- | Narrow discussion. E.g. address bits or fragments of situation, suggest glib, partial, interventions |
For full details of the content analysis technique, see Newman, Webb & Cochrane (1995).
For this reason, we did not include students outside the class in the computer conferences, even though we could have used Network Telepathy to link our students to expert practitioners and to other teachers and students outside Queen's University. Telepathy is also an off-line reader: a program that can dial up a central conferencing system such as CIX, Compuserve or a Usenet server, download the new messages, and store them on our server for students to study later. The students were divided into seminar groups at the beginning of the semester, and stayed together in the same groups for both their face-to-face and computer mediated seminars. So the group sizes (11, 19 and 19) were the same for both technologies.
At the end of the semester, we gave all the students questionnaires on their experience of seminars and computer conferences. They were asked to indicate how much they agreed with a set of 17 questions based on Garrison's stages. The replies were then analysed using principal component factor analysis, to identify relationships and see how far they corresponded to Garrison's stages. This method was used in the development of the original Approaches to Studying Inventory (Entwistle and Ramsden 1983) and by Biggs (1987 and 1993) in the development of the Study Process Questionnaire. Varimax orthogonal rotation was used to allow each variable to load highly on only one factor while all factors remained uncorrelated and distinct, given the small number of variables (17) and cases (26-28). For details of the methodology and results, see our earlier paper presented at the 1st International Symposium on Improving Student Learning (Webb et al, 1994) and a more recent update (Webb et al, 1995).
We also tape recorded and transcribed the face-to-face seminars. The Network Telepathy system automatically keeps a transcript of all conference messages. The lecturer and a research assistant went through the transcripts marking any statements that were obvious examples of +ve or -ve critical thinking indicators listed in Fig. 4. These statements (phrases or sentences) could be marked on more than one criterion. If a statement was not obviously important or unimportant, critical or uncritical, ... it was ignored. The marked statements in each transcript were counted, producing counts of R+, R-, JS+, JS- and so on.
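The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure the scorers actually used: the `marked_statements` data and the `tally_indicators` helper are hypothetical, and the only assumption taken from the text is that each marked statement may carry more than one indicator code from Fig. 4.

```python
from collections import Counter

# Hypothetical data: each entry lists the Fig. 4 indicator codes that the
# scorers attached to one marked statement. Unmarked statements are omitted.
marked_statements = [
    ["R+", "JS+"],
    ["R+", "OC+", "L+"],
    ["R-"],
    ["JS-"],
    ["R+", "C+"],
]


def tally_indicators(statements):
    """Count how often each indicator code was marked in one transcript."""
    counts = Counter()
    for codes in statements:
        # A statement marked on several criteria adds one to each of them.
        counts.update(codes)
    return counts


counts = tally_indicators(marked_statements)
print(counts["R+"], counts["R-"], counts["JS+"], counts["JS-"])  # prints: 3 1 1 1
```

A `Counter` returns 0 for codes that never occur, so indicators with no marked statements need no special handling at this stage.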
We analysed these counts by converting the counts for each pair of indicators to a depth of critical thinking ratio, producing a -1 (all surface learning) to +1 (all deep learning) scale by dividing the excess of +ve to -ve counts by the total for each pair of indicators. For details of the methodology, see our IPCT-J article (Newman, Webb & Cochrane, 1995). For detailed results see Newman, Johnson, Cochrane and Webb 1996.
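Written out, the ratio for an indicator pair with counts x+ and x- is (x+ - x-) / (x+ + x-). The Python sketch below illustrates that calculation under some assumptions of our own: the function names and the example counts are hypothetical, codes are paired simply by stripping the trailing sign (so sub-codes such as OE+ or OC+ would have to be summed into their category beforehand), and returning `None` for a pair with no marked statements is an illustrative choice that the text does not specify.

```python
def depth_ratio(positive, negative):
    """Depth of critical thinking ratio: (x+ - x-) / (x+ + x-).

    Ranges from -1 (all surface) to +1 (all deep). Returns None when no
    statements were marked for the pair (an assumption of this sketch).
    """
    total = positive + negative
    if total == 0:
        return None
    return (positive - negative) / total


def ratios_by_indicator(counts):
    """Map counts such as {"R+": 30, "R-": 10} to ratios such as {"R": 0.5}."""
    bases = {code[:-1] for code in counts if code[-1] in "+-"}
    return {
        base: depth_ratio(counts.get(base + "+", 0), counts.get(base + "-", 0))
        for base in sorted(bases)
    }


def overall_ratio(counts):
    """Overall ratio computed from the total +ve and total -ve counts."""
    positive = sum(n for code, n in counts.items() if code.endswith("+"))
    negative = sum(n for code, n in counts.items() if code.endswith("-"))
    return depth_ratio(positive, negative)


# Invented counts, for illustration only.
example_counts = {"R+": 30, "R-": 10, "JS+": 6, "JS-": 6, "C+": 4}
print(ratios_by_indicator(example_counts))  # {'C': 1.0, 'JS': 0.0, 'R': 0.5}
print(overall_ratio(example_counts))
```

Because the ratio divides by the total for each pair, it does not depend on how long a transcript is, which is what makes counts from seminars and conferences of different lengths comparable.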

Fig. 5. Overall depth of CT ratio for each group.
The content analysis even showed that the overall depth of critical thinking (calculated from the total counts of statements obviously indicating deep and surface learning) was higher when learning took place on the computer conferencing system (see Fig. 5). This difference was significant at 4% on a matched-sample t-test.
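For readers unfamiliar with the matched-sample (paired) t-test, the following Python sketch shows how its test statistic is formed from paired observations. It is a generic illustration using only the standard library: the `paired_t_statistic` helper and the ratio values are invented, and they are not the data or the exact pairing used in this study. The significance level would then be read from Student's t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a matched-sample t-test."""
    if len(first) != len(second):
        raise ValueError("samples must be matched pair by pair")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented depth of CT ratios, paired by discussion topic (hypothetical).
conference = [0.80, 0.75, 0.90, 0.85]
seminar = [0.70, 0.72, 0.78, 0.80]
t, df = paired_t_statistic(conference, seminar)
print(f"t = {t:.2f}, df = {df}")
```

Pairing matters here because each difference is taken within the same group or topic, so variation between groups does not inflate the error term.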
The two evaluation techniques allow us to say more about the type and quality of learning taking place.
| Computer conferencing loading (factors F1-F5) | Questionnaire item | Face-to-face seminars loading (factors F1-F6) |
|---|---|---|
| Problem identification | | |
| .83 | Aroused interest | .74 |
| .89 | Triggered a desire to understand | .77 |
| .80 | Aware of issues | .76 |
| Problem description | | |
| .51 | Clarified subject | .87 |
| -.80 | Identify personal experience | .71 |
| Problem exploration | | |
| .59 | Explore new ideas | .85 |
| .70 | Develop new solutions | .51 |
| .73 | Understand issues | .78 |
| .60 | Disentangle ideas | .65 |
| Problem applicability | | |
| .86 | Critical assessment | .58 |
| .73 | Judge solutions | .83 |
| .90 | Critically evaluate | .76 |
| .61 | Assess practical utility | .44 |
| Problem integration | | |
| .69 | Previous knowledge | .62 |
| .53 | Test solutions | .85 |
| .78 | Apply ideas | .78 |
| .71 | Concerns relating to project | .77 |
| (highest loading on each factor for all variables, varimax orthogonal rotation, factors with eigenvalues > 1) | | |
Although the factor analysis of our small sample of student questionnaires did not correspond exactly to Garrison's stages of critical thinking, 4 factors from each questionnaire were highly loaded with respect to questions deriving from Garrison's stages 1, 3, 4 and 5: i.e. problem identification, exploration, applicability and integration (see Fig. 6). Of these, problem exploration-related factors were more important in face-to-face seminars, whereas problem integration-related factors came on top in computer conferencing.

Fig. 7. Average patterns of depth of CT ratios for different technologies.
Fig. 8. Seminar depth of CT for different indicators.
Fig. 9. CC depth of CT for different indicators.
From the content analysis, we found more positive critical thinking ratios for bringing in outside information (from personal experience, course material and elsewhere), linking ideas and interpretation, and important ideas in the computer conference transcripts, but with a slightly lower ratio for new information, ideas and solutions (see Fig. 7). The outside information depth of CT ratios were consistently greater in computer conferences than in face-to-face seminars (significant at 0.3% in an analysis of variance). The differences in other ratios were not significantly consistent across each group, as shown in Figs. 8-9. Indeed, Fig. 8 shows that the depth of CT ratios for importance and linking ideas are strongly affected by the subject discussed in the seminar, with the privacy seminars turning up more unimportant, unrelated statements. A more detailed account of the content analysis results will appear in a forthcoming paper (Newman et al 1996).
Together, these results suggest that the face-to-face seminars were better for creative problem exploration and idea generation, and that the computer conferencing environment better supported the later stages of linking ideas, interpretation and problem integration.
It seems that the seminars produced more spontaneous interaction, stimulating more new ideas, and greater participation. But the CC encouraged a worthier, more considered, style of interaction, leading to more important statements, and making it easier to link ideas together.
Furthermore, a question on how much the discussion medium helped the students identify personal experience made a strongly negative contribution to factor 5 in the computer conferencing questionnaire analysis, but contributed positively to factor 1 in the seminar questionnaire. This suggests that the computer conferencing discouraged input of personal experience as opposed to the bringing in of impersonal related material and course references that contributed to the O+ counts and ratios in the content analysis. Again, it is as if they wrote for their teachers but spoke for themselves.
From this theory and Henri's work, we have developed two techniques for evaluating critical thinking in group learning: a student questionnaire and a content analysis scoring system. With these tests we can measure the relative amount of critical thinking on different media, and get some idea of which critical thinking activities are encouraged or hindered by particular technologies.
These methods have been piloted in a small scale test, but need more work in other learning contexts to validate and improve them. The learning contexts should not be limited to seminar equivalents, but should cover a range of group learning situations, using different CSCL technologies, but always with control groups. The latter point is particularly important for content analysis work, as any scorer biases (e.g. a reluctance to identify shallow statements) will apply equally to the test and control groups.
In our experiment, we found, to our mild surprise, evidence for critical thinking in both face-to-face seminars and computer conferences. When used as a conventional seminar equivalent, our Network Telepathy conferences helped students link ideas and integrate problems back into the world and their knowledge, but it was easier to participate and generate new ideas in face-to-face seminars.
There are a number of factors which might affect this, that suggest lines for future research:
1. The human-computer interface. The students learned how to use Network Telepathy in the same semester as they used it. From their comments on the questionnaire, a number found this difficult enough to distract them from the subject under discussion. We have since changed to the Windows interface for the same conferencing system, PowWow (see Fig. 10). A promising environment for the future is to run discussion groups on our World-Wide Web server, as students have found WWW browsers like Netscape extremely easy to use. University College Dublin has already developed a WWW environment for teaching (WEST, 1995), although at present it only supports individual student work, so to support critical thinking tutors would need to give enough frequent, detailed feedback on assignments to match the interaction found in group learning. There are also WWW-based discussion lists and group editing tools being developed.

Fig. 10. PowWow user interface
2. It is possible that the lack of immediacy in computer conferencing is inhibiting spontaneous ideas generation and bringing in personal experience. We can test this by doing controlled comparisons between synchronous CSCL tools, like Internet Relay Chat, and asynchronous ones (like PowWow).
3. The software we have been using was not designed to support particular stages of critical thinking, nor proven group learning techniques. By using or designing second or third level groupware, these issues can be addressed directly. For example, to facilitate problem exploration, ideas mapping software (like CM/1) might help. With such software, a group carefully map out issues, their positions and arguments on an issue across a computer screen (or board). To support problem description, group hypertext could suit, rather like the archetypal Intermedia system used for learning English literature at Brown University over 10 years ago. Today we would attempt to implement group hypertext in the World-Wide Web. Some educational voting and ranking software could support the critical judgement tasks in the problem applicability stage. Other groupware can be designed to support the techniques used to form and reform groups within large classes, such as pyramids (1 then 2 then 4...).
4. Since Garrison considers critical thinking as problem solving, we may find better evidence for his stages when student groups are carrying out explicit problem-solving tasks, such as group projects and case studies, rather than seminars, which are not so tightly focused on problem-solving.
5. In neither the seminars nor the computer conferences were the learning techniques optimised to make use of the environment. There is a need for further studies that compare optimised situations, be it the introduction of explicit creativity techniques such as Synectics in face-to-face seminars, or the use of computer technologies to do things impossible in face-to-face meetings.