When data is so huge that the usual approaches don’t work anymore.
- Hadoop has made storing a lot of data reliably (“resiliently”) very cheap.
- This data needs to be analysed to gain knowledge.
- This process should work automatically and in production.
- This process should work in real-time (streaming).