Network Text Analysis of R Mailing Lists, UseR! Rennes 2009



SLIDE 1

Network Text Analysis of R Mailing Lists

UseR! Rennes 2009
Angela Bohn, Ingo Feinerer, Kurt Hornik, Patrick Mair, Stefan Theußl
7/10/2009

SLIDE 2

A mailing list social network

R-help mailing list: Jan 2008 to May 2009
Number of authors: 5326
Number of mails: 41457
Avg. degree: 4.4
Diameter: 7

Legend: Author A and Author B are joined by an edge labeled "answered".
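The two network statistics on this slide can be illustrated on a toy graph. The sketch below is written in Python with the standard library only and is not the authors' code. It assumes an undirected network in which each author is a node, and the toy network `toy` is made up. Average degree is the mean number of neighbours per node, and diameter is the longest shortest path between any two nodes.

```python
from collections import deque

def average_degree(adj):
    """Mean number of neighbours per node in an undirected graph."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)

def diameter(adj):
    """Longest shortest path, found by a breadth-first search from every node.

    Only reachable pairs are considered, so the graph is assumed connected.
    """
    def eccentricity(source):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return max(dist.values())

    return max(eccentricity(node) for node in adj)

# Hypothetical toy network: edges A-B, B-C, B-D, D-E.
toy = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B"},
    "D": {"B", "E"},
    "E": {"D"},
}

toy_avg_degree = average_degree(toy)
toy_diameter = diameter(toy)
```

Running a breadth-first search from every node is fine for a sketch. A network with thousands of authors would normally be handled by a dedicated graph library.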

SLIDE 3

Combine SNA and TM

◮ Goal: Combine social network analysis (SNA) and text mining (TM) to find out more
◮ Data: Mailing lists R-help and R-devel
◮ Packages: sna and tm
◮ Results:
    ◮ “Interest maps” of R users
    ◮ Detection of bottlenecks in communication

SLIDE 4

Data preparation for social network analysis

◮ Create a social network from e-mail headers (tm):

From: dwinsemius at comcast.net (David Winsemius)
Date: Thu, 30 Apr 2009 18:49:55 -0400
Subject: [R] Extracting Element from S4 objects
In-Reply-To: <23302265.post@talk.nabble.com>
Message-ID: <A6039F4E-ABF4-41C5-B03E-FFF32E07C37A@comcast.net>

Author A: "Hello, I have a question."
Author B: "This is the answer."
Author C: "This, too."
Author D: "And this."
Author D: "And this."
Author A: "Thank you."

[Figure: resulting network with nodes Author A, Author B, Author C, Author D]
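One way to turn such headers into network edges is to look up each In-Reply-To value among the known Message-IDs and connect the replier to the original sender. The following Python sketch (standard library only, not the authors' tm-based code) shows this on a made-up thread. The helper `make_message`, the message IDs, and the choice that Author A's "Thank you" replies to Author B are all assumptions for illustration.

```python
from collections import Counter
from email import message_from_string

def reply_edges(raw_messages):
    """Count (replier, original author) pairs from raw e-mail texts."""
    parsed = [message_from_string(raw) for raw in raw_messages]

    # Map every Message-ID to the author who sent that message.
    author_of = {}
    for msg in parsed:
        if msg["Message-ID"] and msg["From"]:
            author_of[msg["Message-ID"].strip()] = msg["From"].strip()

    # A reply whose parent is known yields one edge replier -> parent author.
    edges = Counter()
    for msg in parsed:
        parent = msg["In-Reply-To"]
        if parent and msg["From"] and parent.strip() in author_of:
            edges[(msg["From"].strip(), author_of[parent.strip()])] += 1
    return edges

def make_message(sender, msg_id, reply_to=None):
    """Hypothetical helper that builds a minimal message for the example."""
    lines = [f"From: {sender}", f"Message-ID: <{msg_id}>"]
    if reply_to:
        lines.append(f"In-Reply-To: <{reply_to}>")
    return "\n".join(lines) + "\n\nbody\n"

thread = [
    make_message("Author A", "1"),        # the question
    make_message("Author B", "2", "1"),   # answers
    make_message("Author C", "3", "1"),
    make_message("Author D", "4", "1"),
    make_message("Author D", "5", "1"),
    make_message("Author A", "6", "2"),   # thanks, assumed to reply to B
]

edges = reply_edges(thread)
```

Keeping counts rather than a plain set of edges preserves how often one author answered another, which could serve as an edge weight.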

◮ Find aliases:

knoblauch at lyon.inserm.fr (knoblauch)
knoblauch at lyon.inserm.fr (Ken Knoblauch)
ken.knoblauch at inserm.fr (Kenneth Knoblauch)
ken.knoblauch at inserm.fr (Ken Knoblauch)

Levenshtein distance: agrep(base)
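The Levenshtein distance counts the single-character insertions, deletions, and substitutions needed to turn one string into another. The Python sketch below computes it directly and uses it to flag likely aliases. It is only an illustration of the idea, not a reimplementation of R's agrep. The function `likely_same_person` and its threshold of 0.35 are assumptions chosen for this example.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (dynamic programming, row by row)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # delete char_a
                current[j - 1] + 1,                     # insert char_b
                previous[j - 1] + (char_a != char_b),   # substitute or match
            ))
        previous = current
    return previous[-1]

def likely_same_person(name_a, name_b, max_relative_distance=0.35):
    """Hypothetical rule: names are aliases if their distance is small
    relative to the longer name. The threshold is an arbitrary choice."""
    a, b = name_a.lower(), name_b.lower()
    return levenshtein(a, b) / max(len(a), len(b)) <= max_relative_distance

d_ken = levenshtein("Ken Knoblauch", "Kenneth Knoblauch")
alias_hit = likely_same_person("Ken Knoblauch", "Kenneth Knoblauch")
alias_miss = likely_same_person("Ken Knoblauch", "David Winsemius")
```

A relative threshold is used so that short and long names are treated alike. Any real threshold would need to be tuned against known aliases.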

SLIDE 5

Data preparation for text mining

E-mail subjects:

[R] passing args from the command line
[R] navigating ggplot viewports
[R] how to go to a line in R
...

Term Frequencies (termFreq(tm)):

[Word cloud of subject-line terms for the list named below. Terms shown include: question, data, help, function, plot, using, sub, package, matrix, error, values, frame, vector, list, file, regression, test, model, multiple, ...]

R-help

[Word cloud of subject-line terms for the list named below. Terms shown include: package, bug, windows, error, function, packages, cmd, using, check, data, install, names, code, file, building, plot, files, help, ...]

R-devel
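The term-frequency step can be sketched as follows: strip the list tag, split each subject into lowercase words, drop uninformative words, and count what remains. This Python sketch uses only the standard library and is not the tm implementation. The stopword set and the minimum word length of three characters are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the three sample subjects.
STOPWORDS = {"from", "the", "how"}

def term_frequencies(subjects, min_length=3):
    """Count terms across e-mail subjects after removing a leading "[R]" tag."""
    counts = Counter()
    for subject in subjects:
        text = re.sub(r"^\[R\]\s*", "", subject).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            if len(word) >= min_length and word not in STOPWORDS:
                counts[word] += 1
    return counts

subjects = [
    "[R] passing args from the command line",
    "[R] navigating ggplot viewports",
    "[R] how to go to a line in R",
]

freq = term_frequencies(subjects)
```

Sorting `freq.most_common()` gives the ranking that a word cloud would display by font size.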

SLIDE 6

Word Networks

“lattice”

[Network figure for the term “lattice”. Node labels are author names, among them Thierry ONKELINX, Jim Lemon, Deepayan Sarkar, Weidong Gu, Felix Andrews, Sundar Dorai Raj, Bert Gunter, John Kane, ...]

SLIDE 7

Word Networks

“ggplot”

[Network figure for the term “ggplot”. Node labels are author names, among them hadley wickham, Paul Murrell, Domenico Vistocco, Thierry ONKELINX, John Kane, Felipe Carrillo, Uwe Ligges, ...]

SLIDE 8

Word Networks

“legend”

[Network figure for the term “legend”. Node labels are author names, among them Henrique Dallazuanna, Tom Snowden, Marc Schwartz, tom soyer, John Kane, Peter Dalgaard, Duncan Murdoch, ...]

SLIDE 9

Word Networks

“boxplot”

[Network figure for the term “boxplot”. Node labels are author names, among them Brian Ripley, Henrique Dallazuanna, Petr Pikal, Jim Lemon, Chuck Cleland, David Hewitt, Gabor Grothendieck, ...]

SLIDE 10

Centrality Measures

[Figures: the four word networks for “lattice”, “ggplot”, “legend” and “boxplot”]

Notion     Most central persons
lattice    Deepayan Sarkar, Sundar Dorai Raj, baptiste auguie
ggplot     hadley wickham, Thierry ONKELINX
legend     Duncan Murdoch, hadley wickham, Greg Snow
boxplot    Gabor Grothendieck, hadley wickham
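The slide does not say which centrality measure was used. As one common option, the Python sketch below ranks nodes by degree centrality, that is, the share of other nodes a node is directly connected to. The toy edge list and the node names are made up, and the choice of degree centrality is an assumption, not the authors' stated method.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree centrality for an undirected graph given as a list of node pairs."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    # Normalise by the number of other nodes so scores lie between 0 and 1.
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}

def most_central(edges, k):
    """Top-k nodes by degree centrality, ties broken alphabetically."""
    scores = degree_centrality(edges)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]

# Hypothetical toy word network: who exchanged mails with whom about one term.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

top_two = most_central(toy_edges, 2)
```

Other measures, such as betweenness or closeness, can rank the same network differently, so the measure should be reported along with the ranking.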

SLIDE 11

Results: Interest maps

[Figure: interest map showing frequent subject terms (question, data, help, function, plot, using, sub, package, matrix, error, ...) together with author names (Gabor Grothendieck, Henrique Dallazuanna, jim holtman, Duncan Murdoch, Peter Dalgaard, Wacek Kusnierczyk, Brian Ripley, Greg Snow, hadley wickham, Deepayan Sarkar, ...)]

SLIDE 12

Results: Communication bottlenecks

[Network figure. Node labels are author names, among them Gabor Grothendieck, Henrique Dallazuanna, jim holtman, Duncan Murdoch, Peter Dalgaard, Wacek Kusnierczyk, Brian Ripley, Greg Snow, hadley wickham, Deepayan Sarkar, ...]

SLIDE 13

Results: Communication bottlenecks

Good. [Figure: example with nodes A and B]

Can be improved. [Figure: example with nodes C and D]

SLIDE 14

Thank you!

Packages:

sna: Carter T. Butts (2008). Social Network Analysis with sna. Journal of Statistical Software 24/6.

tm: I. Feinerer, K. Hornik, and D. Meyer (2008). Text Mining Infrastructure in R. Journal of Statistical Software 25/5.

References:

C. Bird, A. Gourley, P. Devanbu, M. Gertz, and A. Swaminathan. Mining email social networks. In Proceedings of the 2006 international workshop on Mining software repositories. ACM, New York, 2006.

Contact: angela.bohn@gmail.com, www.angela-bohn.de