Quantifying Program Complexity and Comprehension

Michael Hansen, Andrew Lumsdaine, Rob Goldstone, Raquel Hill, Chen Yu
Dissertation Proposal
Indiana University, November 22, 2013

Big Question
How do we quantify the psychological or cognitive complexity of a program?

Motivation
- Explicit psychological theory of programming
- Automated identification of error-prone or potentially confusing code
- More objective design decisions for tools, programs, libraries, and languages
- Constrain code generators to produce less complex programs

Medium-sized Questions
- What is cognitive complexity in the context of programming?
- Which aspects of a program/programmer should affect this cognitive complexity?
- How might we quantify a program's cognitive complexity?
- Knowing a program's cognitive complexity, what could we predict?

Cognitive Complexity in the Context of Programming

Software complexity is "a measure of resources expended by a system [human or other] while interacting with a piece of software to perform a given task." (Basili, 1980)

"One feature which all of these [theoretical] approaches have in common is that they begin with certain characteristics of the software and attempt to determine what effect they might have on the difficulty of the various programmer tasks. A more useful approach would be first to analyze the processes involved in programmer tasks, as well as the parameters which govern the effort involved in those processes. From this point one can deduce, or at least make informed guesses, about which code characteristics will affect those parameters." (Cant et al., 1995)

Of Models and Metrics

Cognitive complexity is...
- A function of source code (complexity metrics)
  - $\vec{c} = f(\text{code})$
  - $\sum_i c_i > \text{threshold}$ is bad
- Poor support for user activities (Cognitive Dimensions of Notation)
  - Resistance to change, hidden dependencies, etc.
  - Programming languages are used to try out ideas
- Unfamiliar schemas/implicit rule violations (Détienne/Soloway)
  - Explains novice/expert differences
  - Don't , IF/THEN rules
- Properties of a cognitive model trace (Cognitive Complexity Metric, Mr. Bits)
  - Cognitive resource constraints + effects of notation
  - Task time, eye movement metrics, contents of memory, etc.

Thesis Contributions
1. Thorough review of relevant literature in software complexity and the psychology of programming
2. Analysis of code/cognitive/demographic factors affecting programmer output predictions
3. Methodology and Python library for analyzing programmers' responses and eye movements
4. Analysis of collected eye movement data
5. Design, prototype, and evaluation of quantitative process model (output prediction task)

Presentation Overview
1. Measuring Software Complexity
   - Kinds of complexity
2. Psychology of Programming
   - Cognitive models of program comprehension
3. Experiments
   - Aspects of code/programmer that affect comprehension
4. Modeling: Mr. Bits
   - Quantifying resource expenditure
5. Conclusion and Research Timeline
   - Finish by Spring 2015 at the latest

1. Measuring Software Complexity

Completed work
- Literature review
- Complexity vs. reuse experiment

Proposed work
- Cohesive write-up
- Readability vs. complexity (Buse)

Kinds of Software Complexity
- Problem/computational complexity
  - Complexity of underlying problem or domain
  - Usually considered fixed
- Representational complexity
- Cognitive/psychological complexity

Computational Complexity
- Bounds on computing resources as a function of input size
- $O(c) < O(n) < O(n^c) < O(c^n)$

Kolmogorov Complexity
- Computational resources needed to specify an object
- Size of smallest program in language $L$
- Not computable in the general case
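
Because Kolmogorov complexity is not computable in general, a common practical stand-in is the size of a compressed encoding, which only gives an upper bound. The sketch below is an illustration under that assumption, not a method from the slides; the function name `compressed_size` and both sample programs are hypothetical.

```python
import zlib


def compressed_size(source: str) -> int:
    """Bytes in the zlib-compressed source text.

    This is an upper-bound proxy for description length, not the
    Kolmogorov complexity itself (which cannot be computed in general).
    """
    return len(zlib.compress(source.encode("utf-8"), 9))


# Hypothetical sample programs: one highly repetitive, one more varied.
repetitive = "print(1)\n" * 50
varied = "".join(f"print({i * i % 97})\n" for i in range(50))

print(len(repetitive), compressed_size(repetitive))
print(len(varied), compressed_size(varied))
```

The repetitive program compresses far better than the varied one, which matches the intuition that it can be specified by a much shorter description.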

Kinds of Software Complexity
- Problem/computational complexity
- Representational complexity
  - Physical form of the program
  - Language, formatting, naming, etc.
  - Problem representation
- Cognitive/psychological complexity

Source Code Metrics
- Syntactic: Size/Spatial/Graph/Counter-Factual
  - Lines of Code
  - Function Complexity
  - Inheritance Depth
  - Minimum Description Length
- Readability
  - Line length
  - Number of identifiers/identifier length
  - Indentation/blank lines
- Concepts and beacons (stack, queue, etc.)
  - Formal Concept Analysis (lattice)
  - Concept Identification (Biggerstaff)
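
Several of the surface metrics listed above (lines of code, line length, identifier counts, blank lines) can be read directly off the source text. The following is a minimal sketch using Python's standard `tokenize` module; the function name `surface_metrics`, the choice of metrics, and the sample program are assumptions made for illustration.

```python
import io
import keyword
import tokenize


def surface_metrics(source: str) -> dict:
    """Compute a few size/readability metrics from Python source text."""
    all_lines = source.splitlines()
    code_lines = [ln for ln in all_lines if ln.strip()]
    # NAME tokens include keywords, so filter those out to keep identifiers.
    names = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
    ]
    return {
        "lines_of_code": len(code_lines),
        "blank_lines": len(all_lines) - len(code_lines),
        "max_line_length": max((len(ln) for ln in code_lines), default=0),
        "identifier_count": len(names),
        "distinct_identifiers": len(set(names)),
    }


# Hypothetical sample program.
sample = "total = 0\nfor x in [1, 2, 3]:\n    total = total + x\n\nprint(total)\n"
metrics = surface_metrics(sample)
print(metrics)
```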

Weyuker's Properties (1988)

Proposed properties of syntactic software complexity measures. $|P|$ is the complexity of program $P$.

1. Not all programs should have the same complexity: $(\exists P, Q)(|P| \neq |Q|)$
2. The set of programs whose complexity is $c$ is finite: $(\forall c)(\{P \mid |P| = c\} \text{ is finite})$
3. Some programs share the same complexity: $(\exists P, Q)(|P| = |Q| \text{ and } P \neq Q)$
4. Functional equivalence does not imply complexity equivalence: $(\exists P, Q)(P \equiv Q \text{ and } |P| \neq |Q|)$
5. Concatenation cannot decrease complexity: $(\forall P, Q)(|P| \leq |P;Q| \text{ and } |Q| \leq |P;Q|)$
6. Context matters for complexity after concatenation: $(\exists P, Q, R)(|P| = |Q| \text{ and } |P;R| \neq |Q;R|)$
7. The order of statements matters: $(\exists P)(|P| \neq |\text{permute}(P)|)$
8. Identifier and operator names do not matter: $(\forall P)(|P| = |\text{rename}(P)|)$
9. Concatenated programs may be more complex than the sum of their parts: $(\exists P, Q)(|P| + |Q| < |P;Q|)$
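
To make the properties concrete, the sketch below spot-checks a few of them for one simple measure, lines of code. The measure, the helpers `loc` and `concat`, and the sample programs are assumptions chosen for illustration, and checks on a single sample are not proofs. Because a line count is additive and ignores ordering, it is consistent with the monotonicity and renaming properties but produces no witness for the ordering or "more than the sum of the parts" properties.

```python
from itertools import permutations


def loc(program: str) -> int:
    """Complexity measure under test: number of non-blank lines."""
    return sum(1 for line in program.splitlines() if line.strip())


def concat(p: str, q: str) -> str:
    """Program concatenation P;Q, modeled as joining the source text."""
    return p.rstrip("\n") + "\n" + q


# Hypothetical sample programs.
P = "a = 1\nb = a + 2\n"
Q = "print(b)\n"
PQ = concat(P, Q)

# Concatenation cannot decrease complexity.
monotonic = loc(P) <= loc(PQ) and loc(Q) <= loc(PQ)

# Names do not matter (naive textual rename of identifier `a`).
rename_invariant = loc(P) == loc(P.replace("a", "alpha"))

# Order of statements matters: look for a reordering that changes the measure.
order_sensitive = any(
    loc("\n".join(perm)) != loc(P) for perm in permutations(P.splitlines())
)

# Concatenation may exceed the sum of the parts: check this sample pair.
superadditive = loc(P) + loc(Q) < loc(PQ)

print(monotonic, rename_invariant, order_sensitive, superadditive)
```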

Kinds of Software Complexity
- Problem/computational complexity
- Representational complexity
- Cognitive/psychological complexity
  - Influenced by problem, representational complexity
  - Function of programmer experience, mental resource constraints
  - Task dependent: reuse vs. debugging vs. modification

Qualitative Models
- Integrated Metamodel (von Mayrhauser, 1995)
  - Program/situation models + top-down planning
- Cognitive Dimensions of Notation (Blackwell and Green, 1995)
  - Programming languages are used to try out ideas
  - Hidden dependencies, viscosity, consistency, ...
- Rules of Discourse (Soloway and Ehrlich, 1984)
  - Unwritten rules internalized by experts
  - Expectations that drive understanding process

Kinds of Software Complexity
- Problem/computational complexity
- Representational complexity
- Cognitive/psychological complexity
  - Influenced by problem, representational complexity
  - Function of programmer experience, mental resource constraints
  - Task dependent: reuse vs. debugging vs. modification

Quantitative Models
- Cognitive Weights (Chhabra, 2011; Shao et al., 2003)
  - Assign weights to syntactic & semantic elements
  - Complexity = $f(\text{weights})$
- Cognitive Complexity Metric (Cant et al., 1995)
  - "Process" model based on chunking & tracing
  - Terms for chunk size, control structures, boolean expressions, etc.
  - Complexity = $f(\text{chunking}) + g(\text{tracing})$
- Mr. Bits
  - Embodied process model based on eye movements, memory, spatial reasoning, inference
  - Task is to predict printed output
  - Complexity = time spent, steps taken, representation, etc.
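
The idea behind a weight-based metric, complexity as a function of weights assigned to syntactic elements, can be sketched with Python's standard `ast` module. The weight values, the flat summation (nesting is not treated specially), and the name `weighted_complexity` are all assumptions for illustration; they are not the weights or formula published by Shao et al. or Chhabra.

```python
import ast

# Hypothetical weights per control structure (illustrative values only).
WEIGHTS = {ast.If: 2, ast.For: 3, ast.While: 3}


def weighted_complexity(source: str) -> int:
    """Sum the weights of the control structures found in the source.

    A base weight of 1 stands for plain sequential code. Nested structures
    are simply added together here, which is a simplifying assumption.
    """
    total = 1
    for node in ast.walk(ast.parse(source)):
        total += WEIGHTS.get(type(node), 0)
    return total


# Hypothetical sample programs.
flat = "x = 1\ny = x + 1\nprint(y)\n"
loopy = "for i in range(3):\n    if i % 2:\n        print(i)\n"

print(weighted_complexity(flat))   # 1: sequence only
print(weighted_complexity(loopy))  # 6: base 1 + for 3 + if 2
```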

Readability vs. Complexity
- Readability is "accidental" while complexity is "essential"
  - Problem/computational complexity
- Readability is local, line-by-line (Buse, 2010)
  - Number of identifiers
  - Line length
  - Indentation
- Software Readability Ease Score (SRES)
  - Like Flesch score (FRES)
  - Tokens = syllables, statements = words, units = sentences
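
The analogy between the Flesch score and SRES can be illustrated by computing the same two averages for prose and for code. The Flesch Reading Ease formula below is the standard one for English prose; the slides do not give the SRES formula or its constants, so the code side only computes the two ratios implied by the mapping (tokens per statement, statements per unit). The function names and all counts are hypothetical.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease score for prose."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def code_ratios(tokens: int, statements: int, units: int) -> tuple:
    """Code-side analogues of the two Flesch averages.

    Mapping from the slide: tokens = syllables, statements = words,
    units = sentences. How SRES weights these ratios is not stated in
    the slides, so no score is computed here.
    """
    tokens_per_statement = tokens / statements   # like syllables per word
    statements_per_unit = statements / units     # like words per sentence
    return tokens_per_statement, statements_per_unit


# Hypothetical counts for a prose passage and for a small program.
prose_score = flesch_reading_ease(words=100, sentences=5, syllables=150)
ratios = code_ratios(tokens=60, statements=12, units=3)

print(round(prose_score, 3))
print(ratios)
```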

2. Psychology of Programming

Completed work
- Literature review
- Onward! workshop paper (cognitive architectures)

Proposed work
- Review literature on text understanding models (Kintsch, 1978)
- Consider recent eye-tracking studies of programming

"Many claims are made for the efficacy and utility of new approaches to software engineering - structured methodologies, new programming paradigms, new tools, and so on. Evidence to support such claims is thin and such evidence, as there is, is largely anecdotal. Of proper scientific evidence there is remarkably little. Furthermore, such as there is can be described as "black box", that is, it demonstrates a correlation between the use of certain technique and an improvement in some aspect of the development. It does not demonstrate how the technique achieves the observed effect."
From Software Design - Cognitive Aspects (Détienne, 2001)

Periods of Research

Early: 1960-1980
- Importing of experimental techniques to CS
- Correlations between task performance and PL/human factors
- Novice participants on toy programs
- Contradictory and confusing results

Later: 1980-Present
- Use of cognitive models to explain internal processes
- Verbal reports, real-time code changes, gaze patterns, etc.
- Experienced/professional participants on real-world programs
- Models are largely qualitative

Early Study Example
- Effect of variable naming on code understanding
- No effect for simple programs, positive effect for complex programs
- Experienced programmers recognize schemas (Soloway and Ehrlich, 1984)
