Statistics, the science of data analysis, is the applied mathematics in the 21st century.
Data is increasing in volume, velocity, and variety.
My favorite definition of a data scientist:
A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.
Huber (1994); (1996)
Data Size | Bytes | Storage Mode |
---|---|---|
tiny | \(10^2\) | piece of paper |
small | \(10^4\) | a few pieces of paper |
medium | \(10^6\) (MB) | a floppy disk |
large | \(10^8\) | hard disk |
huge | \(10^9\) (GB) | hard disk(s) |
massive | \(10^{12}\) (TB) | hard disk(s); RAID storage |
Huber, P. J. (1994). Huge data sets. In COMPSTAT 1994 (Vienna) (pp. 3–13). Heidelberg: Physica.
Huber, P. J. (1996). Massive data sets workshop: The morning after. In Massive data sets: Proceedings of a workshop (pp. 169–184). Washington: National Academy Press.