2011 Third International Conference on Intelligent Networking and Collaborative Systems

Emotion Measurement in Intelligent Tutoring Systems: What, When and How to Measure

Michalis Feidakis¹, Thanasis Daradoumis¹,², Santi Caballé²

¹ Department of Cultural Technology and Communication (D.C.T.C), School of Social Sciences, University of Aegean, Mytilini, Greece
² Department of Computer Science, Multimedia and Telecommunications, Open University of Catalonia, Barcelona, Spain

978-0-7695-4579-0/11 $26.00 © 2011 IEEE
DOI 10.1109/INCoS.2011.82

Abstract: Affective or emotion-oriented computing constitutes an emerging research field that is still in its early stages. The lack of empirical results, together with the complexity that characterises emotions, subjects research to a diversity of theories, models and tools. In the current paper we present a critical review of the state of the art on emotion measurement models, methods and tools, and we suggest some informal rules towards their realistic use in education settings.

Keywords: Emotion, affect, affective computing, emotion measurement, detection, models, tools

I. INTRODUCTION

Emotions and affectivity in learning technology are a hot topic in the research agenda. Numerous studies are struggling to reliably collect information about students’ emotions and the so-called “emotional cartography”. Although there have already been some promising results, we are still at the first steps of this new field. The quiver of tools that measure emotions has been enhanced by the advancement of Intelligent Computing and Neuroscience. Today’s instruments range from simple pen-and-paper rating scales to dazzling high-tech equipment that measures brain waves or eye movements.

So far, however, there are no adequate, empirically proven strategies to address the appropriateness of each method in relation to the measurement’s needs and singularities. In the current paper we move in that direction. In section II, we present models and theories for the basic emotion and emotional dimension approaches of emotion research, based on previous work. In section III, we extend our review of emotion measurement models and tools in more depth, and we identify advantages and drawbacks of each method depending on the application context, time, cost, etc. In section IV, we provide some informal rules towards their auspicious implementation.

II. EMOTION MEASUREMENT THEORY

Despite the few attempts to understand and define emotion, the literature still lacks a widely accepted definition that discriminates it from affect or mood. In line with Davou and Zimmermann, we suggest the following discrimination:

• Emotion (derives from the Latin prefix emot = moving away) refers to a “shaking” of the organism as a response to a particular stimulus (person, situation or event), which is generalized and occupies the person as a whole. It is usually an intense experience of short duration (seconds to minutes) and the person is typically well aware of it.
• Affect is a synthesis of all likely effects of emotion (cognitive, organic, etc.) and includes their dynamic interaction, but is not equated individually with any of them.
• Feeling is always experienced in relation to a particular object of which the person is aware. It may have various levels of intensity, and its duration depends on the length of time that the representation of the object remains active in the mind of the individual.
• Mood tends to be subtler, longer lasting, less intensive and more in the background, giving the affective state of a person a tendency in a positive or negative direction.

In general, affect is the effect of emotion in the organism. Mood is a result as well as an influencing factor of emotion.

The definition of what we want to measure is the first and most prominent step. The next step is to confine our measurement by taking into account the following issues:

• Consciousness: Emotion research runs the risk of focusing on subjective emotional experience. On the other hand, an accurate evaluation of what is felt can only come from the subject itself. There is still a debate among scientists about the degree of consciousness when experiencing emotions, and about whether or not cognitive appraisal is a necessary pre-condition for affective arousal. As a result, there is hesitance over whether we can assess emotions by simply asking “How do you feel?” or “What do you prefer?”. Many scientists prefer not to ask at all. They “plug” human subjects into sensors and start measuring their physiological reactions.
• Duration: An emotional experience can last from only a couple of seconds up to several hours or even longer. Emotions unfold over time, and yet individuals are likely to differ dramatically in the time to recover from a negative emotion, such as anger. Measurement has to be either precise (capturing emotion signals by using sensors) or retrospective (by using self-reporting).
• Distinction: Although it is quite clear to humans what they usually feel, it is difficult to find the correct word-tokens to express it. “Everyone knows what (emotion) is until they are asked to define it”. Emotions constitute a rather primary, non-verbal way of communication. They are stored in cell constellations that have been significantly developed in humans’ early years, when their verbal system did not even exist. From three months after conception until five years old, a human’s emotional repertoire has almost completed its mature cycle.

Scherer has distinguished three major schools of emotion research: the basic emotion, the emotional dimension, and the eclectic approach.
We focus on the first two and present models and theories for each category.

A. Basic Emotions

Patterns are equivalent to basic emotions that can be easily recognised universally. The list of models and theories that examine basic emotions is quite long. In a preliminary study, our attempt to classify fundamental models and theories of basic emotion resulted in ten basic emotions: anger, happiness, fear, sadness, surprise, disgust, love, anticipation, joy and trust.

B. Emotional Dimensions

In the literature, learning theories and models usually adopt the following dimensions:
• Arousal (deactivating/activating)
• Valence (negative/positive)
• Intensity (low–intense)
• Duration (short–long)
• Frequency of its occurrence (seldom–frequent)
• Time dimension (retrospective like relief, actual like enjoyment, prospective like hope).

Researchers are striving to combine the above dimensions in a multi-dimensional emotional space that accurately projects the subject’s emotional experience. Below we refer to three widely accepted and used models:

• Russell’s two-dimensional “circumplex model of affect” has served as a fundamental emotional model for many subsequent theories in emotion research. According to his model, emotions are seen as combinations of arousal (high activation/low activation) and valence (positive/negative).

Figure 1: Affective Circumplex Model

• Kort and Reilly have developed a model of a learning cycle that integrates affect, providing a framework for the role of emotions in learning. They have suggested six possible emotion axes (anxiety-confidence, ennui-fascination, frustration-euphoria, dispirited-enthusiasm, terror-excitement, humiliated-proud) that may arise in the course of learning, together with a four-quadrant model relating phases of learning to positive and negative emotions (dimension of valence).

Figure 2: Learning Cycle Model

• Csikszentmihalyi has identified a zone where most people have concentrated their attention so intensely on solving a problem or doing things that they lose track of time. Such flow is an optimal experience that leads to happiness and creativity. If a task is not challenging enough, boredom sets in, while too great a challenge results in anxiety, and both cases result in task avoidance and thus learning avoidance. Steels developed an architecture that conceptualises flow.

Figure 3: Architecture of Flow

III. EMOTION MEASUREMENT TOOLS

Emotion measurement tools can be grouped into three areas: Psychological, Physiological and Behavioral. Each group has its strengths and weaknesses, and the final choice depends on the educational settings (in lab, learning, class, test), the issues of measurement we want to cope with (consciousness, duration, distinction), the time and money that we are able to spend, and in some cases the independent variables we wish to investigate (gender, student’s academic level, location of residence, parents’ educational level, etc.). In the majority of studies, multimodal integration (a combination of the three methods) is preferred.

A. Psychological tools (self-reporting)

They originate from Clinical Psychology and employ verbal and non-verbal descriptions of emotions. They are inexpensive tools that measure the subjective experience of emotions in an unobtrusive and non-invasive way. Self-reporting is the only way to measure a user’s subjective feelings, although users are often reluctant to disclose their inner feelings to researchers in order to avoid embarrassment. These tools cannot easily be used in parallel with the user task, except in very specific cases where mannequins and images are used for quick and short answers. Further classification includes:

1) Verbal Self-Reporting: Subjects report on their emotions through questionnaires with pre-defined, open-ended questions, verbal rating scales or verbal protocols. Interviews, conductive chat and logbooks (like an emotion diary) are also used, so that subjects can indicate their affective state in their own words. These instruments can be assembled to represent any set of emotions or mixed emotions. They meet language and cultural barriers, though. Examples:
a) The Academic Emotions Questionnaire (AEQ): Developed by Pekrun et al., it is a Likert-type questionnaire with 5 degrees ranging from (1) Strongly Agree to (5) Strongly Disagree. It is used to measure the academic emotions (anger, anxiety, hopelessness, shame, joy, hope, pride, boredom). More specifically, 77 items of the questionnaire measure academic emotions about learning, 84 items about tests and 81 items about class. A higher score on a factor indicates that the student has the academic emotion related to that factor.
b) The Semantic Differential Scale: The respondent is asked to choose between two bipolar adjectives, e.g. unhappy-happy. It measures the valence, arousal and dominance dimensions.
c) The Positive and Negative Affect Schedule (PANAS): It includes 20 items rated on 5-point Likert scales, providing an assessment of positive and negative affect.
d) The Affect Grid: A two-dimensional model (valence, arousal) designed to assess core affect by positioning a cross in a 9 x 9 grid. The more central the cross, the weaker the affective experience. It allows for easy, very fast and repeated assessments.

2) Non-Verbal Self-Reporting: It includes unobtrusive, language-independent tools that can be used in different cultures. They are claimed to be less subjective than verbal self-report instruments, because they are not limited by the student’s vocabulary. On the other hand, the range of emotions that they can assess is limited.
a) Self-Assessment Manikin (SAM): A non-verbal scale used to rate the dimensions of valence, arousal and dominance.
b) PrEmo: A non-verbal self-report instrument that measures a set of 14 emotions, 7 pleasant (desire, pleasant surprise, inspiration, amusement, admiration, satisfaction, fascination) and 7 unpleasant (indignation, contempt, disgust, unpleasant surprise, dissatisfaction, disappointment, boredom).
c) International Affective Picture System: It provides a large set of standardized, emotionally evocative, internationally accessible color photographs that include content across a wide range of semantic categories.

B. Physiological tools (use of sensors)

By using sensors, scientists are able to measure the subject’s physiological reactions. Usually, the subject’s affective state is projected in an emotional space determined by emotional dimensions (arousal, intensity, control, etc.). Research findings, however, have shown that these measures are more reliable for arousal than for emotional valence. Most of them are based on recordings of electrical signals produced by the brain, heart, muscles and skin. For example:
1) Electromyogram (EMG), which measures muscle activity.
2) Electroencephalography (EEG), which measures brain activity.

Figure 4: EMG and EEG

3) Electrodermal Activity or Skin Conductance (EDA or SC), which measures the hydration in the epidermis and dermis of the skin. It is typically recorded from the surface of the hand or wrists.
4) Electrocardiogram (EKG or ECG), which measures heart activity (heart rate, inter-beat interval, heart rate variability).
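The heart-activity measures named in item 4 (heart rate, inter-beat interval, heart rate variability) can all be derived from a series of detected beats. The following is a minimal Python sketch under stated assumptions, not a method taken from the paper: the function name `heart_features`, the millisecond input format, and the choice of SDNN and RMSSD as variability statistics are all introduced here for illustration.

```python
import math
import statistics


def heart_features(ibi_ms):
    """Summarise heart activity from inter-beat intervals in milliseconds.

    Hypothetical helper for illustration only. SDNN and RMSSD are common
    heart-rate-variability statistics chosen here as an assumption; the
    paper itself does not name a specific statistic.
    """
    if len(ibi_ms) < 2:
        raise ValueError("need at least two inter-beat intervals")
    mean_ibi = statistics.fmean(ibi_ms)
    # Differences between successive intervals, used by RMSSD.
    successive = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return {
        "mean_ibi_ms": mean_ibi,
        # 60,000 ms per minute divided by the mean interval gives beats/min.
        "heart_rate_bpm": 60000.0 / mean_ibi,
        # SDNN: sample standard deviation of the intervals.
        "sdnn_ms": statistics.stdev(ibi_ms),
        # RMSSD: root mean square of successive differences.
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in successive])),
    }


if __name__ == "__main__":
    # Made-up intervals, not data from any study.
    print(heart_features([1000, 900, 1000, 900]))
```

How such features map onto arousal is left open here; the text only notes that physiological measures are more reliable for arousal than for valence.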
Figure 5: EDA and ECG

5) Electrooculogram (EOG), which measures the eye pupil’s size and movement.

Figure 6: EOG

6) Blood Volume Pulse (BVP), which measures changes in blood volume in the peripheral vessels.

Figure 7: Hand and Head BVP

7) Respiration, where the rate of respiration and the depth of breath are the most common measures.

Figure 8: Measuring respiration

Physiological sensors provide an objective measure of physiological signals. A substantial advantage of psychophysiological measures is that they provide continuous monitoring of user state and, usually, are not disruptive of task performance. A major pitfall is that they are often obtrusive or even invasive, troubling the user’s experience with the interface. Furthermore, they necessitate specialised and frequently expensive equipment, as well as the technical expertise to run it. Moreover, because of the sensitivity of the sensors to confounding factors (e.g. heat, lighting), they have been blamed for producing noisy data.

C. Motor-Behavioural

Motor-behaviour expression is the most common way humans evaluate one’s affective state in everyday life. These tools measure behavioural expressions and changes in physical body states that communicate one’s emotional experience. Their major asset is that they make it possible to evaluate the subject’s affective state by using traditional devices like a PC camera or a microphone, or the traditional mouse and keyboard, though special software is needed. This area also uses sensors that are less obtrusive and invasive and more discreet than the physiological tools. For example:
1) Facial expressions: For example, the Facial Action Coding System analyses 44 face muscles that are linked to six basic emotions (anger, disgust, fear, joy, sadness, and surprise).
2) Voice modulation/intonation: Sound features like pitch, tempo-rhythm, volume, modulation, intonation and vibration are used to differentiate affective states.
3) Hand tracking and body posture, which can be analysed through observation with the help of video recording or by using special devices like the Body Posture Measurement System (BPMS), developed by Tekscan.
4) Mouse-keyboard movements, captured by recording data (mouse movements, buttons pressed, idle time, etc.) from log files or by using special devices like a pressure-force sensitive mouse and keyboard.

Figure 9: Posture analysis seat and IBM BlueEyes video camera

5) Corrugator’s activity, which in combination with the activity of the zygomaticus muscle can give us information about the subject’s valence.

Motor-Behavioural tools can pick up emotion cues that cannot be measured by self-reporting or physiological signals. However, they require experience and objectivity from the observer. These methods have been tested almost exclusively on “produced” affect expressions. Recognition accuracy would drop heavily in natural situations. Furthermore, video cameras are considered obtrusive.

D. Measuring Emotional Intelligence (EQ)

In the literature, emotion and learning have mainly been described by the term Emotional Intelligence (EQ). Different trends in EQ have led to the development of various instruments for the assessment of the construct, and while some of these measures may overlap, most researchers agree that they tap different constructs. EQ measurement can be divided into two trends:

1) The Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), which builds on the work of Peter Salovey and John Mayer. The MSCEIT is based on a series of emotion-based problem-solving items and measures four types of EQ abilities:
a) Perceiving emotions: The ability to detect and identify emotions.
b) Using emotions: The ability to harness emotions.
c) Understanding emotions: The ability to comprehend emotion language and to appreciate complicated relationships among emotions.
d) Managing emotions: The ability to regulate emotions in both ourselves and in others.
The updated MSCEIT V2.0 is a 141-item scale.

2) Social and Emotional Learning (SEL), which is based on the writing of Daniel Goleman. The SEL competencies assessment tools evaluate five core SEL competencies, namely self-awareness, self-management, social awareness, responsible decision-making and relationship/social skills. In the Compendium of SEL and Associated Assessment Measures, more than 50 tools that assess the SEL of preschool and elementary school students (i.e. 5-10 years old) are reviewed, along with aspects of the contexts in which they learn and their learning behaviours. Additionally, CASEL has published a Safe-and-sound guide that reviews 242 health, prevention, and positive youth development programs to assist schools in choosing SEL programs that best meet their needs.

IV. DISCUSSION

Emotions have a stigma in science as they are believed to be inherently non-scientific, and there is one reason for that: how real are subjects’ reactions when they know that they are taking part in a lab experiment that is trying to explore their deep emotional thoughts? The literature has produced successful studies where students’ affective states have been evaluated with high accuracy [2, 3]. Nevertheless, affect detection by intelligent systems is still in its infancy.

There is no golden rule for which method or tool is more suitable, in which context, and when it is better applied. A fundamental criterion is the availability of resources. Sensors are more precise but cost more money and time. Self-reporting, on the other hand, is free of charge but usually out of context. One informal rule is that self-reporting is more suitable when investigating the impact of discrete basic or secondary emotions or affective states. Sensors that capture physiological and motor-behavioural signals can be more beneficial for projecting affective states into emotional dimensions.

Another rule states that sensors are mostly preferred when evaluating the impact of negative emotions, e.g. in stress or fear conditions. Neuroscience has shown that negative emotions such as fear or anger are triggered before the Pre-Frontal Cortex has even received the signal to be processed. The human brain is able to sense fear before a human can think of it. The short duration of emotions (especially the negative ones) indicates the use of sensors, in contrast with long-lasting mood, which can be examined through self-reporting.

The main drawback of sensors that capture physiological signals is that they are obtrusive or even invasive. They have been designed mostly for lab experiments and they lack “ecological” validity, although less intrusive methods of gathering physiological data are being developed. The association between physiological measures and “traditional” methods can offer interesting solutions, such as sensors embedded in an office chair to detect heart rate, sensors in glasses to detect facial muscle activity, sensors in a computer mouse to collect measures of skin temperature, a computer bracelet for SC, or an emotion-sensitive pendant for heart rate. MIT has developed a series of wearable computers, enabling measurement devices to be deployed comfortably without encumbering daily activity (e.g. the “iCalm” sensor can easily be worn in daily life to wirelessly gather electrodermal, temperature, and motion data).

Data validity requires that measurements be taken in context, in parallel with the task, without interrupting the student’s flow of interaction. However, self-reporting mostly takes place right before or after the task, except in cases where non-verbal tools provide short answers without diverting the user’s attention to aspects irrelevant to the task. Brevity in assessment minimizes disruption of the associated task performance and can be more easily accommodated in repeated-measures research designs.

Subjective feelings can only be measured through self-report. Non-verbal self-reporting constitutes a more student-friendly way (emoticons and mannequins are often used by today’s adolescents), which can easily be applied in class or in school labs, as it requires short answers that do not consume much task time. Self-reports can evaluate only a limited range of emotions, though. Verbal questions can be used when there is no time limit, while studying at home for example, or in the form of pre- and post-tests. In some cases we need students to indicate their affective state in their own words. The subjectivity of this method can be mitigated through indirect questions.

V. CONCLUSION

It is possible to measure almost anything, but the concern is whether the measure is meaningful, useful and valid. We have attempted a critical review of the state of the art of emotion measurement models, methods and tools. We have also proposed some informal rules towards their realistic use in education settings. Future work entails the implementation of case studies to refine the presented framework.

ACKNOWLEDGMENT

This work has been partially supported by the European Commission under the Collaborative Project ALICE "Adaptive Learning via Intuitive/Interactive, Collaborative and Emotional Systems", VII Framework Programme, Theme ICT-2009.4.2 (Technology-Enhanced Learning), Grant Agreement n.257639.

REFERENCES

[1] ADI Instruments, retrieved August 3rd, 2011 from http://www.adinstruments.com/.
[2] I. Arroyo, D. Cooper, W. Burleson, B. P. Woolf, K. Muldner, and R. Christopherson, “Emotion sensors go to school,” Proc.
of International Conference on Artificial Intelligence in Education (AIED), July 6th-10th, 2009, Brighton, UK, IOS Press, pp. 17-24.
[3] R. Calvo, “Incorporating affect into educational design patterns and technologies,” Proc. of the 9th IEEE international conference on advanced learning technologies, July 14-18, Riga, Latvia, 2009.
[4] Collaborative for Academic, Social, and Emotional Learning, retrieved August 3rd, 2011 from http://www.casel.org/.
[5] M. Csikszentmihalyi, “Flow: The psychology of optimal experience,” New York: Harper and Row, 1990.
[6] R. J. Davidson, K. R. Scherer, and H. H. Goldsmith, “Handbook of Affective Sciences,” Oxford: Univ. Press, 2003.
[7] B. Davou, “Interaction of emotion and cognition in the processing of textual material,” Meta: journal des traducteurs / Meta: Translators’ Journal, Vol. 52, No 1, 2007, pp. 37-47.
[8] P. M. A. Desmet, “Measuring emotions: Development and application of an instrument to measure emotional responses to products,” In M.A. Blythe, A.F. Monk, K. Overbeeke, and P.C. Wright (Eds.), Funology: From Usability to Enjoyment. Dordrecht: Kluwer Academic Publishers, 2003.
[9] P. Ekman, and W.V. Friesen, “Facial Action Coding System: A technique for the measurement of facial movement,” Palo Alto, CA: Consulting Psychologists Press, 1978.
[10] Emotional intelligence (Wikipedia), retrieved August 3rd, 2011 from http://en.wikipedia.org/.
[11] M. Feidakis, and T. Daradoumis, “A five-layer approach in collaborative learning systems design with respect to emotion,” Proc. of the International Conference On Intelligent Networking and Collaborative Systems (INCOS 2010), Thessaloniki, Greece, 2010.
[12] M. Feidakis, T. Daradoumis, and S. Caballé, “Endowing e-learning systems with emotion awareness,” Proc. of the 3rd International Conference on Networking and Collaborative Systems (INCOS), Fukuoka, Japan, Nov 30 - Dec 2, 2011 (submitted for publication).
[13] Find me a cure, retrieved August 3rd, 2011 from http://findmeacure.com
[14] L. Feldman-Barrett, and J. A. Russell, “Independence and bipolarity in the structure of current affect,” Journal of Personality and Social Psychology, Vol. 74, 1998, pp. 967–984.
[15] D. Goleman, “Emotional intelligence,” New York: Bantam Books, 1995.
[16] T. Hascher, “Learning and emotion: perspectives for theory and research,” European Educational Research Journal, Vol. 9, 2010, pp. 13-28.
[17] International Affective Picture System, retrieved August 3rd, 2011 from http://csea.phhp.ufl.edu/
[18] B. Kort, and R. Reilly, “Analytical models of emotions, learning and relationships: towards an affect-sensitive cognitive machine,” Proc. of the International Conference on Virtual Worlds and Simulation (VWSim), San Antonio, Texas, 2002.
[19] P. J. Lang, “Behavioral treatment and bio-behavioral assessment: Computer applications,” In: Sidowski, J.B., Johnson, J.H., Williams, T.A. (Eds.), Technology in mental health care delivery systems, Ablex, Norwood, NJ, 1980, pp. 119-137.
[20] J. E. LeDoux, “The emotional brain: the mysterious underpinnings of emotional life,” New York: Simon & Schuster, 1996.
[21] J. D. Mayer, P. Salovey, D. R. Caruso, and G. Sitarenios, “Measuring emotional intelligence with the MSCEIT V2.0,” Emotion, Vol. 3, 2003, pp. 97-105.
[22] Mind Media B.V., retrieved August 3rd, 2011 from http://www.mindmedia.nl
[23] A. Mehrabian, and J. A. Russell, “An approach to environmental psychology,” Cambridge, MA, USA; London, UK: MIT Press, 1974.
[24] R. Pekrun, T. Goetz, A. C. Frenzel, and R. P. Perry, “Measuring emotions in students’ learning and performance: the Achievement Emotions Questionnaire (AEQ),” Contemporary Educational Psychology, Elsevier, Vol. 36, Issue 1, 2011, pp. 36-48.
[25] P. Petta, C. Pelachaud, and R. Cowie, “Emotion-Oriented systems: The Humaine handbook,” Berlin: Springer, ISBN: 3642151833, 2011.
[26] R. W. Picard, “Affective computing,” Cambridge MA, USA: MIT Press, 1997.
[27] R. W. Picard, “Emotion research by the people, for the people,” Cambridge MA, USA: MIT Press, 2010.
[28] R. W. Picard, S. Papert, W. Bender, B. Blumberg, C. Breazeal, D. Cavallo, T. Machover, M. Resnick, D. Roy, and C. Strohecker, “Affective Learning-a manifesto,” BT Technology Journal, Vol. 22, 2004, pp. 253-269.
[29] J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, Vol. 39, 1980, pp. 1161–1178.
[30] J. A. Russell, A. Weiss, and G. A. Mendelsohn, “Affect Grid: A single item scale of pleasure and arousal,” Journal of Personality and Social Psychology, Vol. 57, No 3, 1989, pp. 493-502.
[31] K. R. Scherer, “Which emotions can be induced by music? What are the underlying mechanisms? And how can we measure them?” Journal of New Music Research, Vol. 33, Issue 3, 2005, pp. 239-251.
[32] L. Steels, “An Architecture of Flow,” In: M. Tokoro and L. Steels, A Learning Zone of One's Own. Amsterdam: IOS Press, 2004, pp. 137-150.
[33] P. Verduyn, I. Van Mechelen, and F. Tuerlinckx, “The relation between event processing and the duration of emotional experience,” Emotion, Vol. 11, No 1, Feb 2011, pp. 20-28.
[34] D. Watson, and A. Tellegen, “Toward a consensual structure of mood,” Psychological Bulletin, Vol. 98, 1985, pp. 219–235.
[35] M. Wong, “Emotion assessment in evaluation of affective interfaces,” Master thesis, University of Waterloo, Ontario, Canada, 2006.
[36] P. G. Zimmermann, “Beyond usability-Measuring aspects of user experience,” Doctoral dissertation, Swiss Federal Institute of Technology, Zurich, 2008.