SLIDE 30 O’Reilly Strata Data Conference, New York, 2019 strataconf.com #stratadata
TO SMOOTH OR NOT TO SMOOTH…
It depends…
Data Science or Business Use?
Are you quantifying uncertainty or forecasting unseen data?
Is the purpose simplification for heuristics or audience understanding?
Proceed
Are you going to be making inferences from the data?
Will the in-sample smoothed data be used for further analysis or modeling?
Are you sure you’re not fooling yourself?
Do you believe it will support simpler conclusions than raw data?
Do you believe that the smoothed data will be more accurate than the raw data?
Are you accounting for uncertainty of smoothing in results?
Proceed with Caution: Some methods may be more appropriate than others
Not Recommended: High chance of statistical malpractice
Do you have exogenous data or theory to distinguish signal from noise?
Not Recommended: Compromising veracity
Not Recommended: Errors underestimated. Simulates higher certainty than warranted
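A minimal sketch of how smoothing can simulate higher certainty than warranted. It is illustrative only and not from the talk: the helper names, the window of 10, the sample size, and the white-noise data are all assumptions. A moving average shrinks the spread of the series and makes neighboring points dependent, so a standard error that still treats the smoothed points as independent comes out too small.

```python
import random
import statistics


def moving_average(values, window):
    """Simple moving average over a sliding window (hypothetical helper)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def naive_standard_error(values):
    """Standard error of the mean, assuming the points are independent.

    That assumption is false for smoothed data, which is the point here.
    """
    return statistics.stdev(values) / len(values) ** 0.5


random.seed(0)  # fixed seed so the illustration is repeatable
raw = [random.gauss(0.0, 1.0) for _ in range(2000)]  # assumed: pure noise
smoothed = moving_average(raw, window=10)

se_raw = naive_standard_error(raw)
se_smoothed = naive_standard_error(smoothed)

print(f"naive SE, raw data:      {se_raw:.4f}")
print(f"naive SE, smoothed data: {se_smoothed:.4f}")
```

The smoothed series reports a noticeably smaller standard error even though it contains no information beyond the raw series.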
Proceed with Caution: Some methods may be more appropriate than others
Are small scale structures in the data very important?
Not Recommended: Small scale structures are likely not preserved
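A small sketch of why small-scale structure tends not to survive smoothing. The signal, the spike height, and the window length are made-up values for illustration: a one-sample spike of height 10 passed through a 5-point moving average is spread over five points and its peak drops to 2.

```python
def moving_average(values, window):
    """Simple moving average over a sliding window (hypothetical helper)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Assumed toy signal: flat baseline with a single narrow spike.
signal = [0.0] * 21
signal[10] = 10.0

smoothed = moving_average(signal, window=5)

print("peak before smoothing:", max(signal))
print("peak after smoothing: ", max(smoothed))
```

Any feature narrower than the smoothing window is attenuated in this way, so the window has to be chosen with the smallest structure of interest in mind.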
Should I smooth the data?
Will the data be undergoing any other statistical procedures?