Scripting for All . Lindsay Popowski '21, Kewei Zhou '21, Zach - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Scripting for All .

ZD Lindsay Popowski '21, Kewei Zhou '21, Zach Dodds Harvey Mudd College

slide-2
SLIDE 2

CS for All ?

ZD

slide-3
SLIDE 3

CS for All ~ Scripting for All

ZD

slide-4
SLIDE 4

Different views of "The World" ...

slide-5
SLIDE 5

"Scripting for World Domination"

ZD

slide-6
SLIDE 6

"All" ?

Scripting for All Disciplines

ZD

slide-7
SLIDE 7

Scripting for All Colleges

Claremont's Petri Dish

ZD

slide-8
SLIDE 8

From Specialty to Literacy

Premise ~ Computing is (becoming) a professional literacy.

Challenges:
[Q1] How can CS departments support computing skills? … while encouraging students to retain and grow in other academic identities?
[Q2] What does a CS-for-All college curriculum look like?

Our answer: College Computing ~ College Writing

ZD

slide-9
SLIDE 9

Literacy, not a specialty

College Writing

Many ways in
Many ways through
Many ways from
Writing department?

ZD

slide-10
SLIDE 10

Literacy, not a specialty

College Computing

Many ways in
Many ways through
Many ways from
Computing department?

ZD

slide-11
SLIDE 11

Claremont ~ 2025

Discipline-owned School-owned

College Computing Data 8 Computing for Insight

Who owns this?

ZD

What is this?

slide-12
SLIDE 12

Since 2009...

CS1 "gold": students new to CS; Intro to CS in Python (breadth)
CS1 "black": students with some CS; Intro to CS in Python (more breadth)
CS1 "green": Biology-owned CS1 course

ZD

slide-13
SLIDE 13

Biology-owned CS1

Projects Recursion Iteration

CS1 in which all practice is biologically motivated… Lectures: ½ CS ½ Bio

ZD

slide-14
SLIDE 14

What happens beyond CS1 … ?

CS1 "gold" CS1 "black"

students new to CS students with some CS

CS1 "green" CS2 CS3

Java & Racket C++

CS for Insight

Python

ZD - LP

slide-15
SLIDE 15

Biology-owned CS1 ~ Aftermath

Intro CS course taken... | CS majors | Biology majors
Black | 27.6% | 2.0%
Gold | 17.2% | 3.1%
Green | 15.0% | 18.7%

Intro CS course taken... | Avg # CS courses taken | Avg # Bio courses taken
Black | 5.6 | 1.6
Gold | 4.0 | 1.8
Green | 4.3 | 3.8

9 years of data: ~300 green students, ~400 black students, and ~2000 gold students

LP

slide-16
SLIDE 16

Biology-owned CS1 ~ Aftermath

There is room to make CS1 half biology -- not only without harm but with considerable benefit...


LP


slide-17
SLIDE 17

Of the students who took CS5 green (biology's CS1)...

  • 100 students requested to take the biology-flavored CS1
  • 71 students did not request to take the biology-flavored CS1: either they opted for a different section or did not express a preference

We asked: How did interest in the biology facets of the course impact the academic journey of these students?

LP

slide-18
SLIDE 18

Results ~ paths chosen vs. paths unchosen

Subsequent-course selection (p < 0.05 for BIOL054 and BIOL113)

Course | Fraction who chose CS5 Green taking course | Fraction who didn't choose CS5 Green taking course | p value
BIOL054 HM | 0.38 | 0.24 | 0.046
BIOL113 HM | 0.35 | 0.17 | 0.008
CSCI060 HM | 0.68 | 0.67 | 0.854
CSCI070 HM | 0.45 | 0.42 | 0.664
CSCI081 HM | 0.20 | 0.14 | 0.297

Subsequent-course grades (no evidence of a significant difference)

Course | Average grade (4-pt scale) of those who chose CS5 Green | Average grade (4-pt scale) of those who didn't choose CS5 Green | p value
BIOL052 HM | 2.98 | 2.76 | 0.137
BIOL054 HM | 3.63 | 3.63 | 0.978
BIOL113 HM | 3.42 | 3.31 | 0.542
CSCI060 HM | 3.59 | 3.47 | 0.148
CSCI070 HM | 2.93 | 3.08 | 0.290
CSCI081 HM | 3.13 | 2.63 | 0.088

LP
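The slides do not say which statistical test produced the p values for course selection. As a minimal sketch of one common choice, the snippet below runs a pooled two-proportion z-test using only the standard library. The counts (38 of 100 and 17 of 71) are illustrative assumptions back-calculated from the reported BIOL054 fractions and group sizes, not the study's raw data, so the result need not match the reported value exactly.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed, illustrative counts (not the study's raw data).
z, p = two_proportion_z_test(38, 100, 17, 71)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A chi-square or exact test on the same counts would give a somewhat different p value, which may account for small gaps between a sketch like this and the numbers on the slide.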

slide-19
SLIDE 19

Results ~ paths chosen vs. paths unchosen

There is room to make CS1 half biology -- even for students not predisposed to biology!

LP

slide-20
SLIDE 20

Results ~ all paths

Goal: Not to make everyone be the same, but to make everyone experientially confident

LP

slide-21
SLIDE 21

CS2 identities, past decade

Beyond CS1… ?

ZD

slide-22
SLIDE 22

CS2 growth, past decade CS2 raw enrollments (also F/M)

If you build it, they will come...

Beyond CS1… ?

ZD

slide-23
SLIDE 23

CS2 for non-majors: 2016, 2017, 2018

Homework | Subject | Topics and assignments
 | Text & File Analysis | Python review, Reading/writing text files, GitHub; Ongoing scavenger hunt across a broad, deep directory utilizing particular skills learned in each week
1 | Webscraping and APIs | Retrieving data from Google Maps, iTunes, and USGS Earthquake API
2 | Web Technologies | HTML/CSS, Text annotation
3 | Data Visualization | Matplotlib, Distinguishing human-generated and batch-mode inputs; Evaluating data in relation to Benford's Law
4 | Machine Learning | K nearest neighbors using scikit-learn library; Neural networks using scikit-learn library
5 | Machine Learning | Decision trees & random forests using scikit-learn library; Neural networks, TensorFlow
6 | Natural Language Processing | Using NLTK, gensim (Google's vector representation of word meanings), and TextBlob libraries; Predicting Amazon product review scores using sentiment analysis
7 | Computer Vision | Pixel processing, Steganography, Green-screening; "Photoshopping" text algorithmically
8 | Computer Vision | K-means image posterization/implementation; Reading pictures of letters with pixel processing and neural networks

ZD
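The syllabus lists k nearest neighbors via the scikit-learn library. To show the idea without that dependency, here is a plain-Python sketch of the same technique: classify a point by majority vote among its k closest labeled neighbors. The function name, the toy points, and the labels are all hypothetical and are not taken from the course materials.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest neighbors."""
    # Sort labeled points by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    # The most frequent label among the k nearest points wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(points, labels, (1.1, 1.0))
prediction_near_b = knn_predict(points, labels, (5.1, 5.0))
print(prediction_near_a, prediction_near_b)
```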

slide-24
SLIDE 24

CS2 for non-majors: 2016, 2017, 2018

"CS for Insight"

Neither CS nor SE for its own sake but for amplifying other paths.

ZD

slide-25
SLIDE 25

Building "Overlaps" with other disciplines

Professor A. Sinha, Government Dept., Claremont McKenna College: NYT webscraping
Professor L. Connolly, Physics Dept., Harvey Mudd College: Data analysis

ZD - KZ

slide-26
SLIDE 26

NYT article scraping

  • Using scripting to scan a wide breadth of texts for historical information
  • Internet Archive and New York Times APIs
  • Examining word frequencies over time
  • Finding clusters of words that frequently appear together
  • Can generate questions for further and closer investigation...

KZ
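"Examining word frequencies over time" can be sketched in a few lines once article text is in hand. The example below skips the API calls entirely and works on a hypothetical in-memory list of (year, text) pairs. The function name and the toy articles are assumptions for illustration, not the project's code or data.

```python
import re
from collections import Counter, defaultdict

def word_counts_by_year(articles):
    """Map each year to a Counter of lowercase word frequencies."""
    counts = defaultdict(Counter)
    for year, text in articles:
        # Keep runs of letters and apostrophes; everything else separates words.
        words = re.findall(r"[a-z']+", text.lower())
        counts[year].update(words)
    return counts

# Hypothetical toy corpus standing in for scraped articles.
toy_articles = [
    (1950, "Trade talks open. Trade ministers meet."),
    (1950, "Ministers discuss trade and borders."),
    (1960, "Border dispute dominates the talks."),
]

counts = word_counts_by_year(toy_articles)
trade_trend = {year: counts[year]["trade"] for year in sorted(counts)}
print(trade_trend)
```

Looking up a word that never appears in a given year returns 0 from a `Counter`, so the trend can be built for any word without special cases.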

slide-27
SLIDE 27

NYT Scraping for Insight...

… into government and international relations, via Professor Sinha

How can we improve the process of data collection and analysis in non-STEM fields?

Goal ~ speed up / automate it, while:
a) Maintaining accuracy,
b) Keeping the process transparent, and
c) Offering new insights or paths ...

KZ

slide-28
SLIDE 28

Connective computing

  • Scanning for articles related to a specific topic (e.g., India) or of a specific type (e.g., Letters to the editor)
  • Using Google Cloud Natural Language API for more detailed text analysis
    ○ Finding overall article sentiments and entity sentiments
    ○ Analyzing syntax of the texts, including using morphology and dependency trees

KZ

slide-30
SLIDE 30

Insights for Computing in Non-STEM majors

  • How do we maintain accuracy?
    We want an effective search that removes articles we don't want. What would a human be looking for or sorting out?
  • How do we keep transparency?
    We run into this issue with Google Natural Language Processing: the more advanced and complicated the analysis, the less transparent it is.
  • What additional insight can we offer?
    Text analysis and identifying key phrases/sentences

KZ
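One way to keep a search both effective and transparent is to have the filter report why each article was kept. The sketch below is a deliberately simple keyword filter that records the matched keywords alongside each kept title. The function name, the record layout, and the sample articles are hypothetical, not drawn from the project.

```python
def filter_articles(articles, keywords):
    """Keep articles mentioning any keyword; record which keywords matched."""
    kept = []
    for title, text in articles:
        lowered = text.lower()
        # Naive substring match: easy to explain, but prone to false positives.
        matched = [kw for kw in keywords if kw.lower() in lowered]
        if matched:
            kept.append({"title": title, "matched": matched})
    return kept

# Hypothetical sample articles as (title, text) pairs.
sample_articles = [
    ("Letter A", "Relations between India and its neighbors are improving."),
    ("Letter B", "Baseball scores from Ohio."),
    ("Letter C", "A visit to Indiana."),
]

kept = filter_articles(sample_articles, ["India"])
kept_titles = [record["title"] for record in kept]
print(kept_titles)
```

The substring match also keeps "Letter C" because "Indiana" contains "india". That is the kind of false positive a human reader would sort out, and because each record lists its matched keywords, the mistake is easy to spot and audit.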

slide-31
SLIDE 31

Physics Lab - Data Analysis

  • Switch to Python
  • Leveraging matplotlib, scikit-learn, and Google's suite
  • Curve fitting, plus pixel processing, spreadsheet handling, … computing!

Challenge: How to incorporate computing clearly into a non-CS course without distracting from the primary academic focus, physics.

Don't overrun!

LP
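As a rough illustration of the curve-fitting step, here is a least-squares straight-line fit written with the standard library only. The lab itself leans on matplotlib and scikit-learn, and the slides do not show its code, so this is a sketch of the underlying arithmetic. The function name and the measurements are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up measurements: position (m) sampled at one-second intervals.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
positions = [1.0, 3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(times, positions)
print(f"velocity ~ {slope:.2f} m/s, initial position ~ {intercept:.2f} m")
```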

slide-32
SLIDE 32

Physics Lab ~ Physics, not CS

LP

slide-33
SLIDE 33

Physics Lab - Challenges & Decisions

  • What parts do we have the students do manually? We let them choose - based on time
  • How do we teach the students enough so that they can replicate the analyses, without distracting from the lab?
  • What is the best way to introduce Python libraries' capabilities, while keeping the lab clear and concise?

LP

Colab: Live-python Google Docs

slide-34
SLIDE 34

Looking back, looking forward...

We asked, "How can computing serve all disciplines well?"

Our project has
1. Explored and assessed our current landscape: CS's transition-to-service
2. Developed tools to support and encourage collaborations
3. Verdict

The future's directive is: Overlap! Don't Overrun.

ZD

slide-35
SLIDE 35


Computing-as-Literacy ~ More and more, IntroX will be/include CS1

slide-36
SLIDE 36

Slides we're not using...

slide-37
SLIDE 37

Expanding HMC Introductory CS courses

slide-38
SLIDE 38

Benford's Law!

It works - remarkably well! Introducing dictionaries
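A minimal sketch of how Benford's Law can serve as a first exercise with dictionaries: tally leading digits in a dict, then compare each observed share with the predicted log10(1 + 1/d). Powers of 2 are used here as a stand-in dataset known to follow the law. The slides do not say what data the assignment used, so both the data and the function name are assumptions.

```python
import math

def leading_digit_counts(values):
    """Count how often each leading digit occurs, using a plain dictionary."""
    counts = {}
    for value in values:
        digit = int(str(value)[0])
        counts[digit] = counts.get(digit, 0) + 1
    return counts

# Stand-in dataset: the first 1000 positive powers of 2.
values = [2 ** n for n in range(1, 1001)]
counts = leading_digit_counts(values)
total = sum(counts.values())

for digit in range(1, 10):
    observed = counts.get(digit, 0) / total
    expected = math.log10(1 + 1 / digit)
    print(f"{digit}: observed {observed:.3f}, Benford {expected:.3f}")
```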

slide-39
SLIDE 39

R versus Python

Introducing R as a programming language Side-by-side Comparisons

slide-40
SLIDE 40

Survey of 50 "influential" computer science programs*

Big-picture Landscape

Most common Least common

*Niche ranking 2017

Flavored CS1 CS2 for non-CS majors Interdisciplinary Majors CS+X Majors & Courses

21 out of 50 . 8 out of 50 . 27 out of 50 . 50 out of 50 .

slide-41
SLIDE 41


Computing is growing outward ~ faster than CS departments can adapt!

slide-42
SLIDE 42


Local experiments in this part of CS space...

slide-43
SLIDE 43

Total Number of Students

Choosing biology...

slide-44
SLIDE 44

CS2 for non-CS-majors: First try

Homework | Subject | Current topics and new assignments
 | Text & File Analysis | Python review, Reading/writing text files, GitHub; Ongoing scavenger hunt across a broad, deep directory utilizing particular skills learned in each week
1 | Webscraping and APIs | Retrieving data from Google Maps, iTunes, and USGS Earthquake API
2 | Web Technologies | HTML/CSS, Text annotation
3 | Data Visualization | Matplotlib, Distinguishing human-generated and batch-mode inputs; Evaluating data in relation to Benford's Law
4 | Machine Learning | K nearest neighbors using scikit-learn library; Neural networks using scikit-learn library
5 | Machine Learning | Decision trees & random forests using scikit-learn library; Neural networks, TensorFlow
6 | Natural Language Processing | Using NLTK, gensim (Google's vector representation of word meanings), and TextBlob libraries; Predicting Amazon product review scores using sentiment analysis
7 | Computer Vision | Pixel processing, Steganography, Green-screening; "Photoshopping" text algorithmically
8 | Computer Vision | K-means image posterization/implementation; Reading pictures of letters with pixel processing and neural networks

Machine Learning Files: Local & Online Applications

slide-45
SLIDE 45

Interviews

Maduka Ogba, Chemistry @ Pomona
Robin Melnick, Linguistics @ Pomona
Vivien Hamilton, History @ Mudd

Takeaways:

  • Non-CS majors want computational skills.
  • CS can come across as a challenge to students’ and profs’ academic identities
  • Richest applications of computing happen after students have developed academic identities
  • Opportunity to pursue overlap, while preserving -- and supporting -- deeply held priorities.
slide-46
SLIDE 46

Should there be a CS2 for non-CS majors?

slide-47
SLIDE 47

Opportunities to “bridge” elsewhere...