ASTR633 Astrophysical Techniques Course slides Big Data
https://www.youtube.com/watch?v=VXIWeUBb2Jk&t=1713s
Your thoughts (>1 person)
- Open access to data => broadening of talent pool / large
collaborations (“the sky is flat”)
- Four paradigms of scientific discovery
- Challenge of visualizing high dimensional data
- Data-driven science is not about data, it's about knowledge
extraction
- Computer science is the new mathematics
- The limitations of human bandwidth and memory
- Theory produces datasets now, not formulae
- Any individual will do only a small fraction of the science
that is inherently possible with a (large) dataset
Your thoughts (1 person)
- Growing need to bridge gap between astronomy and computer
science (X-informatics)
- You can show at most ~10 (probably fewer) dimensions on paper
- Applied CS and IT are creating a new scientific methodology
- Science may progress more through incremental advances in
collaborations than through paradigm shifts by a super genius
Some additional thoughts
- for the first time in history, most data will not be seen by humans
- data mining algorithms don't scale well to large datasets
=> new interesting research areas
- most data and constructs cannot be comprehended by humans
directly => we need machine assisted discovery
- but we still need good scalable data exploration and mining tools
- effective visualization is the bridge between quantitative
information and human intuition; how do you do this for multi-dimensional space?
Some additional thoughts (some yours, some mine)
- Implications for education and training from open/flat skies?
- the complexity of data sets and meaningful constructs is
starting to exceed the cognitive capacity of the human brain:
“science on the carbon-silicon interface”
- data fusion + data mining + machine learning = 4th paradigm
Some additional thoughts (some yours, some mine)
- computer science is the "new mathematics" and the key to
interdisciplinary science / crossing 19th-century science boundaries
- dealing with data is just the first step; knowledge discovery
should be the primary focus
- we need good scientists to lead archives & virtual observatories
Some additional thoughts (some yours, some mine)
- Science is changing focus from ownership of data to
ownership of expertise
- implications for our “unfair” Hawaii advantage?