On network analysis and user behavior Ramayya Krishnan iLab, The H. - PowerPoint PPT Presentation

On network analysis and user behavior
Ramayya Krishnan
iLab, The H. John Heinz III College
Carnegie Mellon University
Pittsburgh, PA
rk2x@cmu.edu

Outline
• Two examples
  – Intra-organizational KM: the role of triadic closure or cliques in determining user behavior
  – Product adoption: the role of social influence vs. homophily
• Key points
  – A multi-disciplinary perspective that blends computational and social science is needed
  – New estimation methods are needed to work with novel data sets
  – New methods are needed to design and conduct experiments in a networked world

Example 1: Social Media and Knowledge Management in a Global Organization

Sample data posting of query and responses

Sample Query
• Query on: Singleton class and threads in Java
• Responses:
  1. Singleton class means that any given time only one instance of the class is present, in one JVM. So, it is present at JVM level.
  2. The thing is if two users (on two different machines which has separate JVMs) are requesting for singleton class then both can get one-one instance of that class in their JVM.

Data description
• Message-level and thread-level data from the forum
• Message characteristics
  – Posting time, EmployeeID, Thread, Type of message (query or response), content of message, etc.
• User characteristics
  – EmployeeID, Tenure at firm, Age, Gender, Location, Division, Job Title

Network structure evolution
Sequence of actions:
• User 301 posts a query Q1000
• Users 502, 641 post responses
• User 900 posts a query Q1001
• Users 301, 641 post responses
[Figure: Directed Response Graph over users 301, 502, 641, 900]
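The sequence of actions above can be turned into a directed response graph in a few lines of Python. This is only a sketch: the record layout and the names `threads` and `build_response_graph` are assumptions made for illustration, not the study's actual pipeline. An edge (a, b) with weight w means that user a responded w times to queries posted by user b.

```python
from collections import Counter

# Hypothetical thread records: (query_id, asker, [responders]).
# The two threads mirror the sequence of actions listed on the slide.
threads = [
    ("Q1000", 301, [502, 641]),
    ("Q1001", 900, [301, 641]),
]

def build_response_graph(threads):
    """Return a Counter mapping (responder, asker) -> number of responses."""
    edges = Counter()
    for _query_id, asker, responders in threads:
        for responder in responders:
            if responder != asker:  # ignore self-replies
                edges[(responder, asker)] += 1
    return edges

graph = build_response_graph(threads)
for (responder, asker), weight in sorted(graph.items()):
    print(f"{responder} -> {asker} (responses: {weight})")
```

Keeping edge weights rather than a 0/1 indicator preserves the number of responses, which is the kind of count used as a dyadic dependent variable later in the deck.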

Network structure
Asymmetric tie:
• A has responded to B’s query but B has not responded to A
Sole-symmetric tie:
• Users have responded to each other, but not as part of a clique
Simmelian tie:
• Users are part of a ‘clique’, whose members have all responded to one another
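The three tie types can be read off a directed response graph mechanically. The sketch below is one possible reading, under the assumption that a reciprocated pair counts as Simmelian when at least one third user is reciprocally tied to both members (the smallest clique, a triad); the function name `classify_ties` and the toy edge list are hypothetical.

```python
from itertools import combinations

def classify_ties(edges):
    """Classify each connected pair of users by tie type.

    edges: iterable of (responder, asker) pairs.
    Returns a dict mapping frozenset({a, b}) to 'asymmetric',
    'sole-symmetric', or 'simmelian'.
    """
    edges = set(edges)
    users = {u for edge in edges for u in edge}

    def mutual(a, b):
        return (a, b) in edges and (b, a) in edges

    ties = {}
    for a, b in combinations(sorted(users), 2):
        if mutual(a, b):
            # Assumed rule: Simmelian if some third user is reciprocally
            # tied to both a and b; otherwise the tie is sole-symmetric.
            in_clique = any(
                mutual(a, c) and mutual(b, c) for c in users - {a, b}
            )
            ties[frozenset((a, b))] = "simmelian" if in_clique else "sole-symmetric"
        elif (a, b) in edges or (b, a) in edges:
            ties[frozenset((a, b))] = "asymmetric"
    return ties

# Toy data (hypothetical): users 1, 2, 3 all answer one another,
# 3 and 4 answer each other, and 5 answers 4 without reciprocation.
toy_edges = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2),
             (3, 4), (4, 3), (5, 4)]
ties = classify_ties(toy_edges)
```

In the toy data the pairs among users 1, 2, 3 come out Simmelian, the pair 3 and 4 sole-symmetric, and the pair 4 and 5 asymmetric.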

Simmelian Ties
Research Questions
1. Can Simmelian ties be established in an electronic communications medium with repeated interactions? Will they matter?
2. Do these ties depend upon the context? Do more instrumental contexts result in weaker or less effective Simmelian ties?
3. Do both the current context (the type of query) and the past context in which the tie was established matter?


Dyadic QAP Regression Results
Dependent variable: Number of responses by A to B in period two
Explanatory variables: Dyadic homophily measures, structural properties in period one
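QAP (quadratic assignment procedure) tests handle the dependence among dyads by permuting node labels rather than individual observations. The sketch below is a deliberately simplified, single-regressor version written for illustration: it is not the specification used in the study, and the matrices, function names, and permutation count are all assumptions.

```python
import random

def dyad_slope(y, x):
    """OLS slope of y on x over all ordered off-diagonal dyads (i, j)."""
    n = len(y)
    pairs = [(x[i][j], y[i][j]) for i in range(n) for j in range(n) if i != j]
    mean_x = sum(px for px, _ in pairs) / len(pairs)
    mean_y = sum(py for _, py in pairs) / len(pairs)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pairs)
    sxx = sum((px - mean_x) ** 2 for px, _ in pairs)
    return sxy / sxx

def qap_test(y, x, n_perm=2000, seed=0):
    """Return (observed slope, two-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(y)
    observed = dyad_slope(y, x)
    hits = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        # Relabel the nodes of y: rows and columns move together, so the
        # row/column dependence inside y is preserved under the null.
        y_perm = [[y[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if abs(dyad_slope(y_perm, x)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy matrices for five users: x[i][j] counts responses by i
# to j in period one, y[i][j] counts responses by i to j in period two.
x = [[0, 2, 0, 1, 0],
     [1, 0, 0, 0, 3],
     [0, 0, 0, 2, 0],
     [1, 0, 1, 0, 0],
     [0, 2, 0, 0, 0]]
y = [[0, 3, 0, 1, 0],
     [1, 0, 0, 0, 2],
     [0, 1, 0, 2, 0],
     [2, 0, 1, 0, 0],
     [0, 2, 0, 1, 0]]
slope, p_value = qap_test(y, x)
```

A full QAP regression would add the homophily and structural regressors and fit them jointly; the permutation step stays the same.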

Example 2: Social Influence vs. Homophily in product/service adoption
• Focus on identifying users that can help diffuse “information” over the network
• Learn about the power of “social influence” as a trigger for the diffusion process
• Learn about how social influence is associated with “contagious churn”

Research Question
• Can we predict consumers’ product purchase decisions…
• Using social network information?

Theoretical Foundation
• Homophily (McPherson et al. 2001)
  – “Birds of a feather flock together”
[Figure: connected consumers exchanging opinions: “Looks good”, “Like this?”, “Looks good”]

The Challenge
• Large-scale network
[Figure: three consumers, Adam (“I like it”), Bob (“?”), and Chris (“No, I don’t”)]

Literature
• A rich literature on networks from various fields (e.g. Kleinberg 1999, Brin and Page 1998)
• Network-based marketing
  – Network Neighbors: Hill, Provost, Volinsky (2006)
  – Viral Marketing: Richardson and Domingos (2002)
  – Classification: Macskassy and Provost (2003, 2007)
• What about unobserved product taste?
  – For small, tightly connected groups: Hartmann (2010)
  – But what about large-scale networks of arbitrary connection structure?

This Study
• Model correlated purchase behaviors of consumers in a large social network…
• Using a Gaussian Markov Random Field (GMRF) to characterize latent product taste
  – Handles networks of arbitrary topology
  – Encapsulates conditional independence
• Estimation results confirm the positive taste correlation among connected people
• Predictive performance is better than that of existing LR-based models, and better than that of SVM-based models, too

Data
• Obtained from a large Asian telecom company
  – 231,416 customers
  – 6-month period
• Detailed phone call data
  – Who called whom, when
• Demographic information: gender, age
• Purchase records of caller ringback tone (CRBT)
  – Who purchased what, when
• Can we predict CRBT adoption decisions?

Descriptive Statistics

Gender                                         Number
Male                                           218017
Female                                         13399

                                               Mean     SD       Min   Max
Age                                            40.56    13.67
Number of Consumers Called by Each Consumer    13.73    22.9     1     2858
Number of Phone Calls Per Consumer             410.4    942.7    1     59016

Adoption                                       Number   Percentage
Number of Consumers                            231416
Number of Consumers Who Adopted CRBT           79505    34.36%

Adoption Percentage by Gender
Male                                           34.50%
Female                                         31.89%

Preliminary analysis: gender doesn’t help much in prediction…

Data – Preliminary Analysis
Age doesn’t help much, either…
[Figure: Adoption By Age. Number of consumers and adoption percentage by age group: <20, 20-29, 30-39, 40-49, 50-59, >=60]

Data – Preliminary Analysis
Node degree helps a lot (need for social network)!
[Figure: Consumer Adoptions By Degree. Number of consumers (log scale) and adoption percentage by degree bin: 0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90+]

Data – Preliminary Analysis
Can we do better?
[Figure: example network of consumers A, B, C, and D, each marked as adopter or non-adopter]
Maybe, but need the discipline of a model

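Model
• There are $I$ consumers in a social network
• Connection matrix: $C = [c_{ij}]$, where
  $$c_{ij} = \begin{cases} 1 & \text{if consumers } i \text{ and } j \text{ are connected} \\ 0 & \text{otherwise} \end{cases}$$
• Adoption decision:
  $$D_i = \begin{cases} 1 & \text{if consumer } i \text{ adopts the product} \\ 0 & \text{otherwise} \end{cases}$$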
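As a concrete illustration, the connection matrix and the adoption vector could be assembled from who-called-whom records roughly as follows. This is a sketch under stated assumptions: a call in either direction is taken to connect two consumers (so the matrix is symmetric), and the consumer ids, call pairs, and adopter set are made up.

```python
def build_connection_matrix(consumers, calls):
    """Build the symmetric 0/1 connection matrix C = [c_ij].

    consumers: ordered list of consumer ids (defines the index i).
    calls: iterable of (caller, callee) pairs.
    Assumption: a call in either direction connects the two consumers.
    """
    index = {cid: i for i, cid in enumerate(consumers)}
    n = len(consumers)
    c = [[0] * n for _ in range(n)]
    for caller, callee in calls:
        i, j = index[caller], index[callee]
        if i != j:  # no self-connections
            c[i][j] = c[j][i] = 1
    return c

consumers = ["A", "B", "C", "D"]                       # hypothetical ids
calls = [("A", "B"), ("B", "C"), ("C", "B"), ("A", "D")]
adopters = {"B", "D"}                                  # hypothetical records

C = build_connection_matrix(consumers, calls)
D = [1 if cid in adopters else 0 for cid in consumers]  # adoption decisions
degree = [sum(row) for row in C]                        # connection degree
```

The row sums of the matrix give each consumer's connection degree, which is one of the observed characteristics mentioned in the deck.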

Adoption Probability
Binary Probit Model
$$\Pr(D_i = 1) = \Pr(U_i > 0), \qquad U_i = X_i \beta + \theta_i + \varepsilon_i$$
• $\varepsilon_i \sim N(0, 1)$: random disturbance
• $X_i$: observed individual characteristics (gender, age, connection degree)
• $\theta_i$: unobserved product taste, modeled as a GMRF!
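Because the random disturbance is standard normal, the probit model implies that, conditional on the observed characteristics and the latent taste, the adoption probability is the standard normal CDF evaluated at the deterministic part of the utility. The sketch below writes β for the coefficients on X_i and θ_i for the unobserved taste (notation assumed here); the covariate coding and coefficient values are invented for illustration and are not estimates from the study.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def adoption_probability(x_i, beta, theta_i):
    """Pr(D_i = 1 | X_i, theta_i) = Phi(X_i . beta + theta_i)."""
    utility_mean = sum(x * b for x, b in zip(x_i, beta)) + theta_i
    return std_normal_cdf(utility_mean)

# Hypothetical covariates: [intercept, male indicator, age / 10, degree / 10]
x_i = [1.0, 1.0, 4.0, 1.4]
beta = [-0.8, 0.05, -0.02, 0.30]   # made-up coefficients, not estimates

p_neutral = adoption_probability(x_i, beta, theta_i=0.0)
p_keen = adoption_probability(x_i, beta, theta_i=0.5)
```

A higher latent taste shifts the utility upward and therefore raises the adoption probability, which is how correlated tastes among connected consumers translate into correlated purchases.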

Gaussian Markov Random Field (GMRF)
Definition (GMRF): A random vector $x = (x_1, \ldots, x_n)^T$ is called a GMRF w.r.t. the undirected graph $G = (V = \{1, \ldots, n\}, E)$ with mean $\mu$ and precision matrix $Q > 0$ if and only if its density has the form
$$\pi(x) = (2\pi)^{-n/2} \, |Q|^{1/2} \exp\left(-\tfrac{1}{2} (x - \mu)^T Q (x - \mu)\right)$$
and
$$Q_{ij} \neq 0 \iff \{i, j\} \in E \quad \text{for all } i \neq j.$$
• A multivariate normal vector
• Connection structure encoded in its precision matrix
• Non-zero off-diagonal elements correspond to connections
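The density in the definition can be evaluated directly from the precision matrix, without ever forming the covariance matrix. The following is a minimal pure-Python sketch for small examples; the helper names and the numeric precision matrix are assumptions for illustration, and a real application on a large network would use sparse linear algebra instead.

```python
import math

def cholesky(q):
    """Lower-triangular L with L L^T = q; raises if q is not positive definite."""
    n = len(q)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = q[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("precision matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def gmrf_log_density(x, mu, q):
    """log pi(x) = -(n/2) log(2 pi) + (1/2) log|Q| - (1/2) (x-mu)^T Q (x-mu)."""
    n = len(x)
    low = cholesky(q)
    log_det_q = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    d = [xi - mi for xi, mi in zip(x, mu)]
    quad = sum(d[i] * q[i][j] * d[j] for i in range(n) for j in range(n))
    return -0.5 * n * math.log(2.0 * math.pi) + 0.5 * log_det_q - 0.5 * quad

# Illustrative precision matrix for the graph 1 - 2 - 3 (values are made up):
# the zero in position (1, 3) encodes the missing edge between nodes 1 and 3.
Q_chain = [[1.0, -0.4, 0.0],
           [-0.4, 1.0, -0.4],
           [0.0, -0.4, 1.0]]
log_p = gmrf_log_density([0.2, -0.1, 0.3], [0.0, 0.0, 0.0], Q_chain)
```

Working with the precision matrix is the natural choice here because it is sparse whenever the graph is sparse, whereas the covariance matrix is generally dense.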

Properties of GMRF
• Can model connections of arbitrary topology
  – Better than using in-group correlation
• Encodes conditional independence
  $$x_i \perp x_j \mid x_{-ij} \iff Q_{ij} = 0, \quad i \neq j$$
  e.g. for the chain 1 - 2 - 3: consumers 1 and 3 should be correlated, but conditional on consumer 2 they should be independent
• Model parameters have intuitive explanations
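The three-consumer chain can be checked numerically: a zero in the precision matrix does not mean a zero in the covariance matrix. The sketch below inverts a small illustrative precision matrix (the numeric values are assumptions) and compares the marginal correlation between consumers 1 and 3 with their conditional correlation given consumer 2.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Chain 1 - 2 - 3: consumers 1 and 3 are not directly connected, so the
# (1, 3) entry of the precision matrix is zero. Values are illustrative only.
Q = [[1.0, -0.4, 0.0],
     [-0.4, 1.0, -0.4],
     [0.0, -0.4, 1.0]]
Sigma = invert(Q)  # covariance matrix

# Marginal correlation, from the covariance matrix.
marginal_corr_13 = Sigma[0][2] / (Sigma[0][0] * Sigma[2][2]) ** 0.5
# Conditional correlation given consumer 2, from the precision matrix.
conditional_corr_13 = -Q[0][2] / (Q[0][0] * Q[2][2]) ** 0.5
```

Here the marginal correlation is positive because taste propagates through consumer 2, while the conditional correlation is exactly zero, matching the conditional independence property stated above.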

Model Latent Product Taste Using GMRF
The latent product taste vector $\theta = (\theta_1, \ldots, \theta_I)^T$ is multivariate normal with precision matrix $Q = [q_{ij}]$, where $q_{ij} = 0$ if $c_{ij} = 0$.
Straightforward interpretation:
$$\text{Precision}(\theta_i \mid \theta_{-i}) = q_{ii}$$
$$\text{Cor}(\theta_i, \theta_j \mid \theta_{-ij}) = -\frac{q_{ij}}{\sqrt{q_{ii} \, q_{jj}}}$$
Parameterization (base model, model B): $Q$ is governed by two parameters
• $r$: conditional correlation between connected consumers (enters the off-diagonal entries where consumers are connected; all other off-diagonal entries are 0)
• Conditional precision (on the diagonal)
[Matrix form of $Q$ not recoverable from the extracted slide text]
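One way to make the two-parameter structure concrete is sketched below. The exact matrix on the slide did not survive, so the construction is an assumption derived from the interpretation formulas: the diagonal holds a common conditional precision (called `kappa` here, a hypothetical name), and each connected pair gets the off-diagonal entry `-kappa * r`, so that the conditional correlation between connected consumers equals `r`. The connection matrix and parameter values are invented.

```python
import math

def build_precision(c, kappa, r):
    """Assumed form: q_ii = kappa, q_ij = -kappa * r if c_ij = 1, else 0."""
    n = len(c)
    return [[kappa if i == j else (-kappa * r if c[i][j] else 0.0)
             for j in range(n)] for i in range(n)]

def is_positive_definite(q):
    """Attempt a Cholesky factorization; it succeeds only if q is positive definite."""
    n = len(q)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = q[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True

def conditional_correlation(q, i, j):
    """Cor(theta_i, theta_j | rest) = -q_ij / sqrt(q_ii * q_jj)."""
    return -q[i][j] / math.sqrt(q[i][i] * q[j][j])

# Hypothetical four-consumer network with edges 0-1, 1-2, and 0-3.
C = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 0]]

Q = build_precision(C, kappa=2.0, r=0.3)
valid = is_positive_definite(Q)                                   # usable precision matrix
too_strong = is_positive_definite(build_precision(C, 2.0, 0.9))   # r too large for this graph
```

Under this assumed form, not every value of `r` yields a valid precision matrix: the admissible range depends on the network, which is why the sketch checks positive definiteness explicitly.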
