Group Communication Analysis: A computational linguistics approach



slide-1
SLIDE 1

Group Communication Analysis

A computational linguistics approach for detecting sociocognitive roles in multi-party interactions

Dr. Nia Dowell
Postdoctoral Research Fellow
School of Information
Digital Innovation Greenhouse
University of Michigan
ndowell@umich.edu
http://niadowell.com

slide-2
SLIDE 2
slide-3
SLIDE 3

Social Roles Research

Psychology, Sociology, Education, Social Computing, Business Management, Team Effectiveness, Workplace Meetings, Collaborative Learning, Group Cognition, Collaborative Problem Solving, Social Media

slide-4
SLIDE 4

Prominent Perspectives on Roles

Role Concept

Assigned Roles

A position to which a person is assigned and then performs the behavior associated with that position

Concerns

  • Dysfunctional group roles
  • What is actually captured in role assignment research?
  • Disregards the dynamic and interactive way in which roles are created, negotiated, and evolve among group members during social interaction
  • A single role inhibits role and group flexibility, and the potential advantages of this
slide-5
SLIDE 5

Prominent Perspectives on Roles

Role Concept

Assigned Roles

A position to which a person is assigned and then performs the behavior associated with that position

Emergent Roles

Develop naturally out of the interpersonal interaction without any prior instruction or assignment, and are characterized by their behavioral proximity (similarities and differences) to other interactional partners

slide-6
SLIDE 6

[Figure: roles arranged by effort (low to high) and group orientation (“I” to “We”): Ghost, Free-rider, Captain, Over-rider. Sources: Strijbos & De Laat (2010); Marcos-Garcia et al. (2015)]

slide-7
SLIDE 7

Conversational Patterns

Language and Discourse

Sociocognitive Processes

Social Roles

Online Multi-party Interaction

Can we automatically identify the roles students take on during collaborative interactions?

slide-8
SLIDE 8

Time Speaker Discourse

Infer semantic relationship among students’ contributions

How do we go from this semi-unstructured data to something meaningful, something that allows us to capture the important sociocognitive processes taking place within the interaction?

slide-9
SLIDE 9

Latent Semantic Analysis

This similarity measure represents the semantic and conceptual meanings of individual words, utterances, texts, and larger stretches of discourse based on the statistical regularities between words in a large corpus of naturalistic text

Discourse Cohesion
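As a rough illustration of the similarity measure on this slide: once each contribution is mapped to a vector in an LSA space, the semantic relatedness of two contributions is commonly computed as the cosine of their vectors. The sketch below assumes that setup. The three-dimensional vectors are made-up placeholders, not output from a trained LSA space, and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Placeholder vectors standing in for LSA representations of chat turns
# (assumed values, for illustration only).
turn_a = [0.2, 0.7, 0.1]
turn_b = [0.25, 0.6, 0.05]
turn_c = [0.9, 0.0, 0.4]

print(cosine_similarity(turn_a, turn_b))  # close in meaning: near 1
print(cosine_similarity(turn_a, turn_c))  # less related: noticeably lower
```

Cohesion-style measures can then be built by averaging such similarities over pairs of contributions, although the exact windows and weights used in GCA are not given on the slide.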

slide-10
SLIDE 10

Group Communication Analysis

[Diagram labels: Communication Density; Newness; Internal Cohesion; Social Impact; Overall Responsivity; Participation Dynamics; Responsivity Measures; Discourse Cohesion Analyses]

slide-11
SLIDE 11

Jamie Pennebaker + Team

slide-12
SLIDE 12

Participants: 840 undergraduates in an introductory-level psychology course

Groups: 184 randomly assigned groups

Talking Questions

slide-13
SLIDE 13

Measuring Performance

Student Level: Pre-test, Post-test

Group Level: Proportion of on-topic discussion

Proportional learning gain = [% Posttest - % Pretest] / [1 - % Pretest]
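The gain formula above can be sketched as a small function. It assumes test scores are expressed as proportions between 0 and 1, and `proportional_learning_gain` is a hypothetical name rather than one taken from the talk.

```python
def proportional_learning_gain(pretest, posttest):
    """(posttest - pretest) / (1 - pretest), with scores as proportions in [0, 1]."""
    if not (0.0 <= pretest <= 1.0 and 0.0 <= posttest <= 1.0):
        raise ValueError("scores must be proportions between 0 and 1")
    if pretest == 1.0:
        # No room left to improve, so the ratio is undefined.
        raise ValueError("gain is undefined when the pretest score is 100%")
    return (posttest - pretest) / (1.0 - pretest)


# A student who moves from 50% to 75% has closed half of the remaining gap.
print(proportional_learning_gain(0.5, 0.75))  # 0.5
```

Dividing by the room left to improve keeps students with high pretest scores from being penalized for having little headroom. Negative values indicate a lower posttest than pretest score.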

slide-14
SLIDE 14

Detecting Emergent Roles

Pre-Clustering

  • Data split: Training, Testing
  • Multicollinearity
  • Cluster Tendency: Hopkins statistic = .15

slide-15
SLIDE 15

Optimal Number of Clusters

[Figure: elbow plot of total within sum of squares (WSS) against number of clusters k, for k = 1 to 10]

The disadvantage of the elbow and similar methods is that they measure a global clustering characteristic only.

[Figure: optimal number of clusters using K-means; frequency among all indices by number of clusters k]

[Figure: optimal number of clusters using PAM; frequency among all indices by number of clusters k]

Majority rule
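To make the two ideas on this slide concrete, here is a minimal sketch of (a) the total within sum of squares that an elbow plot tracks and (b) a majority rule over the k values proposed by several validity indices. The points and the list of proposed k values are invented for illustration, and the function names are hypothetical. The slide does not say which indices or software were used.

```python
from collections import Counter, defaultdict


def total_within_ss(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def majority_rule(proposed_k):
    """Return the k proposed by the most indices (ties go to the first seen)."""
    return Counter(proposed_k).most_common(1)[0][0]


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(total_within_ss(points, [0, 0, 0, 0]))  # 104.0 with a single cluster
print(total_within_ss(points, [0, 0, 1, 1]))  # 4.0 with two clusters

# Invented votes: the k each (hypothetical) index considers optimal.
print(majority_rule([6, 4, 6, 2, 6, 4]))  # 6
```

WSS always shrinks as k grows, which is why the elbow alone is a weak criterion and a vote across several indices is a common complement.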

slide-16
SLIDE 16

Cluster Evaluation and Validation

  • Internal Validation
  • Theoretical Justification
  • Stability Validation
  • External Validation

4 Cluster Model & 6 Cluster Model

slide-17
SLIDE 17

From Model to Meaning

[Figure: cluster centroids (axis from -0.8 to 0.8) on Participation, Social Impact, Overall Responsivity, Internal Cohesion, Newness, and Communication Density for each role: Task-Leader, Over-rider, Lurker, Socially-Detached, Driver, Follower]

slide-18
SLIDE 18

Student Roles and Learning

slide-19
SLIDE 19

Linear Mixed Effect Models

Dependent Variables

  • Individual Learner: Proportional learning gains
  • Group: Proportion of topic-relevant discussion

Independent Variables

  • Identified roles
  • Proportional occurrence of each identified role

Random Variables

  • Group
  • Learner and Group

slide-20
SLIDE 20

Linear Mixed Effect Models

Null Models

Dependent Variables

  • Individual Learner: Proportional learning gains
  • Group: Proportion of topic-relevant discussion

Random Variables

  • Group
  • Learner and Group

slide-21
SLIDE 21

Linear Mixed Effect Models Evaluation

  • Akaike Information Criterion (AIC)
  • Log Likelihood (LL)
  • Likelihood ratio test
  • Marginal R2 (R2m)
  • Conditional R2 (R2c)
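As a small worked illustration of two of these criteria, the sketch below computes AIC from a model's log likelihood and the likelihood ratio statistic for a null model nested in a fuller model. The log-likelihood values and parameter counts are invented. In practice they would come from fitted mixed-effects models, which this sketch does not fit.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*LL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_statistic(ll_null, ll_full):
    """2 * (LL_full - LL_null); compared against a chi-square distribution
    whose degrees of freedom equal the difference in parameter counts."""
    return 2 * (ll_full - ll_null)


# Invented values for illustration only.
ll_null, k_null = -120.0, 3   # null model: intercept plus random effects
ll_full, k_full = -112.0, 9   # full model: adds role predictors

print(aic(ll_null, k_null))  # 246.0
print(aic(ll_full, k_full))  # 242.0
print(likelihood_ratio_statistic(ll_null, ll_full), k_full - k_null)  # 16.0 6
```

Turning the statistic into a p-value needs a chi-square distribution function, which the Python standard library does not provide, so that step is left out. Marginal and conditional R2 depend on the fitted variance components and are likewise not covered here.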

slide-22
SLIDE 22

How do learners’ roles influence individual learners’ performance?

[Figure: LME coefficients (B), axis from -0.5 to 0.4, by role: Driver, Task-Leader, Over-rider, Lurker, Follower, Socially Detached. Significance markers in the plot: *, **, **, **]

* p < .05, ** p < .01, *** p < .001; N = 704; χ2(7) = 14.93, p = .001, R2m = .02, R2c = .95.

slide-23
SLIDE 23

How do learners’ roles influence overall group performance?

χ2(3) = 20.92, p < .001, R2m = .13, R2c = .89

χ2(3) = 23.62, p < .001, R2m = .15, R2c = .90

[Figure: two panels of LME coefficients (B), axis from -2.5 to 2, by role (Drivers, Task-Leaders, Socially-Detached, Over-riders, Lurkers, Followers): Productive roles model and Unproductive roles model. Significance markers in each panel: **, *, *, *, **]

slide-24
SLIDE 24

Take Home

  • Optimal group composition ≠ simply high participating learners
  • Optimal group composition = high and low participators aware of and invested in the social climate of the group interaction
  • Effect size differences

Driver, Task-leader, Socially-detached, Over-rider, Follower, Lurker

  • Roles influence student and group outcomes
  • Drivers > Lurkers
  • Drivers = Task leaders and Socially-detached learners
  • Difference in learning is not a result of the students simply being more prolific

slide-25
SLIDE 25

How well do the identified clusters generalize to held-out and completely different computer-mediated collaborative learning contexts?

slide-26
SLIDE 26

SMOC: Synchronous Massive Online Class

  • Intro psychology course
  • Students randomly assigned to groups
  • 200-300 groups of 4-5 students per day
  • learner N = 1,713, group N = 3,380
  • Interactions last 3-9 minutes, averaging 5 minutes
  • Over 26 different chat topics
  • Similar to the Traditional CSCL dataset, but larger and more distributed in terms of people and topics
  • Students were in 9 chat groups throughout the semester

slide-27
SLIDE 27

Land Science: A Virtual Internship

  • Land Science is an interactive urban-planning simulation with collaborative problem-solving in a simulation environment
  • Interns receive instructions and coaching from Mentors
  • Interns participate in collaborative problem solving chat sessions to achieve collective goals
  • learner N = 38, group N = 630

slide-28
SLIDE 28

[Diagram: each dataset (Traditional CSCL, SMOC, Land Science) is split into training data and testing data]

slide-29
SLIDE 29

[Diagram: clusters from the Traditional CSCL training data are used to predict cluster assignments in the Traditional CSCL testing data, SMOC, and Land Science]
slide-30
SLIDE 30

Prediction Evaluation

Cross-tabulation assessment

Adjusted Rand Index (ARI)

  • Computes the proportion of agreement between 2 cluster partitions and penalizes for any randomness in the overlap
  • Steinley (2004) considers ARI values greater than 0.90 excellent, values greater than 0.80 good, values greater than 0.65 moderate, and values less than 0.65 poor

Cramer V

  • Effect size for the strength of the relationship between 2 cluster partitions
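Both agreement measures on this slide can be computed from the cross-tabulation of two label vectors. The following is a minimal standard-library sketch using the usual textbook formulas (the pair-counting form of ARI, and Cramer's V as sqrt(chi2 / (n * (min(rows, cols) - 1)))). The label vectors are invented and the function names are hypothetical. It is not the code used for the reported results.

```python
import math
from collections import Counter


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    cells = Counter(zip(labels_a, labels_b))
    rows, cols = Counter(labels_a), Counter(labels_b)
    sum_cells = sum(math.comb(c, 2) for c in cells.values())
    sum_rows = sum(math.comb(c, 2) for c in rows.values())
    sum_cols = sum(math.comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / math.comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (sum_cells - expected) / (max_index - expected)


def cramers_v(labels_a, labels_b):
    """Strength of association in the cross-tabulation, from 0 to 1."""
    n = len(labels_a)
    if n != len(labels_b) or n == 0:
        raise ValueError("need two non-empty label lists of equal length")
    cells = Counter(zip(labels_a, labels_b))
    rows, cols = Counter(labels_a), Counter(labels_b)
    chi2 = 0.0
    for r, n_r in rows.items():
        for c, n_c in cols.items():
            expected = n_r * n_c / n
            chi2 += (cells.get((r, c), 0) - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Invented labels: predicted vs. actual cluster for six learners.
predicted = [0, 0, 0, 1, 1, 1]
actual = [0, 0, 1, 1, 1, 1]
print(adjusted_rand_index(predicted, actual))
print(cramers_v(predicted, actual))
```

Cluster labels only need to be consistent within each vector. Both measures are unchanged if clusters are renamed, which matters when comparing partitions produced by separate clustering runs.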
slide-31
SLIDE 31

[Diagram: clusters from the Traditional CSCL training data predict cluster assignments in the Traditional CSCL testing data]

slide-32
SLIDE 32

Cross-tabulation of the predicted and actual cluster assignments (Traditional CSCL training data predicting Traditional CSCL testing data)

ARI = .83; Cramer V = .92

Axes: Testing Clusters and Training Predicted Clusters (Cluster 1 to Cluster 6). Nonzero counts per row, in the order given (column positions not preserved):

  Cluster 1: 32
  Cluster 2: 2, 29
  Cluster 3: 15, 2, 1
  Cluster 4: 18
  Cluster 5: 4, 1, 13
  Cluster 6: 19

slide-33
SLIDE 33

[Diagram: each dataset (Traditional CSCL, SMOC, Land Science) is split into training data and testing data]

slide-34
SLIDE 34

Internal & External Generalization

[Figure: Six-Cluster Model. Axis from 0.1 to 1; predictor datasets (Traditional CSCL, SMOC, Land Science) evaluated on Traditional CSCL, SMOC, and Land Science]

slide-35
SLIDE 35

Internal & External Generalization

[Figure: Six-Cluster Model. Adjusted Rand Index (axis from 0.1 to 1) for predictor datasets (Traditional CSCL, SMOC, Land Science) evaluated on Traditional CSCL, SMOC, and Land Science]

slide-36
SLIDE 36

Internal & External Generalization

[Figure: Six-Cluster Model. Adjusted Rand Index (axis from 0.1 to 1) for predictor datasets (Traditional CSCL, SMOC, Land Science) evaluated on Traditional CSCL, SMOC, and Land Science]

slide-37
SLIDE 37

Take-Away

  • The GCA method appears to be a robust method for identifying conversational roles
  • We see good generalization of the roles both within and between datasets
  • But the roles seem to be context dependent, which is seen in how they do not generalize as well to the Land Science collaborative problem solving interactions
  • This does not mean the GCA is not a valid approach for identifying roles, just that care should be taken when transferring roles from one type of interaction to another

slide-38
SLIDE 38

Onward and Upward: Preliminary findings

If roles are indeed an emergent property of interactions, then they will exhibit certain properties:

  1. They should not be consistently or highly associated with trait-based characteristics
  2. They will not be static, but instead will change in different contexts

Dowell et al., in prep

slide-39
SLIDE 39

Claim 1. They should not be consistently or highly associated with trait-based characteristics

Traditional CSCL

Big Five Personality Measures

  1. Openness to Experience
  2. Conscientiousness
  3. Extroversion
  4. Agreeableness
  5. Neuroticism

GCA Measures & Social Roles Association

Correlation and Linear Discriminant Function Analyses

slide-40
SLIDE 40

[Figure: correlations (r), axis from -0.15 to 0.2, between the Big Five traits (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) and the GCA measures (Participation, Social Impact, Overall Responsivity, Internal Cohesion, Newness, Communication Density). A DFA panel lists Driver, Task-Leader, Over-rider, Lurker, and Openness to Experience. Significance markers in the plot: ***, ***, **, *, **, *]

slide-41
SLIDE 41

Claim 2. They will not be static, but instead will change in different contexts

[Figure: SMOC Social Roles Over Time. Role frequency (axis from 50 to 500) by chat day (1 to 9) for Over-rider, Lurker, Driver, and Task-Leader]

SMOC Data Set

  1. Qualitative look at the data
  2. State Transition Networks
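A state transition network over roles can be built by counting how often a learner moves from one role to another between consecutive chats. The sketch below shows that counting step under stated assumptions: the role sequences are invented, each list is taken to be one student's roles in chat-day order, and the function names are hypothetical.

```python
from collections import Counter


def transition_counts(sequences):
    """Count (current role, next role) pairs across all students' sequences."""
    counts = Counter()
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[(current, nxt)] += 1
    return counts


def transition_probabilities(counts):
    """Normalize counts so that outgoing edges from each role sum to 1."""
    totals = Counter()
    for (current, _), c in counts.items():
        totals[current] += c
    return {pair: c / totals[pair[0]] for pair, c in counts.items()}


# Invented sequences: one list of roles per student, ordered by chat day.
sequences = [
    ["Lurker", "Lurker", "Driver"],
    ["Driver", "Driver", "Task-Leader"],
    ["Lurker", "Driver", "Driver"],
]

counts = transition_counts(sequences)
probs = transition_probabilities(counts)
for (current, nxt), p in sorted(probs.items()):
    print(f"{current} -> {nxt}: {p:.2f}")
```

Each role is a node and each probability an edge weight. Heavy self-loops would suggest stable, trait-like roles, while substantial off-diagonal mass is what an emergent view of roles would predict.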
slide-42
SLIDE 42

Take-Away

  • The roles do not appear to be highly or consistently related to trait-based characteristics
  • The roles are not static, but instead change in different contexts
  • Most of those transitions appear to support a more emergent perspective

[Diagram labels: Trait-based, Random, Emergent]

slide-43
SLIDE 43

Conclusions

  • The GCA appears to be a robust method for identifying conversational roles
  • The identified roles have practical value in adding to our understanding of why some groups and students perform better than others

Conversational Patterns Sociocognitive Processes Social Roles

slide-44
SLIDE 44

Next Steps: Diving Deeper

  • Temporal dynamics
      • Right now we are looking at averages
      • It is possible that an individual shifts roles throughout an interaction or over longer periods of time as they gain experience
  • Other variables
      • Internal (linguistic)
          • Affect, Topic Relevance
      • External (individual/contextual)
          • Number of resources viewed
          • Other demographic variables
  • Other contexts and outcomes
      • Crowd sourcing design interactions
      • OpenIDEO: creativity
slide-45
SLIDE 45

Many Thanks!

Jamie Pennebaker, Tristan Nixon, Art Graesser, Zhiqiang Cai

slide-46
SLIDE 46