Hello world, hello Hadoop!
Hadoop is a framework for distributed storage and processing of big data.
Big data is data that contains greater variety, arrives in increasing volumes, and moves with more velocity.
Hadoop stores data of any kind by spreading files across the disks of many machines.
Hadoop also processes that data in parallel on the machines that hold it.
With Hadoop, you can store and process billions of records.
The Hadoop ecosystem includes many tools, such as HDFS, MapReduce, YARN, and Spark.
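
To make the MapReduce idea concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to a word count. It is plain Python, not the Hadoop API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and a real job would spread these steps across many machines.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in re.findall(r"[a-z]+", line.lower()):
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three steps in order over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Illustrative input: the first line of this text.
    print(word_count(["Hello world, hello Hadoop!"]))
    # prints {'hello': 2, 'world': 1, 'hadoop': 1}
```

The steps are kept as separate functions because that separation is what lets a framework run many map and reduce tasks independently; here they simply run one after another in memory.
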
Spark is a fast and general engine for large-scale data processing.
Spark can run on Hadoop YARN, on Apache Mesos, in standalone mode, or in the cloud.
Spark provides high-level APIs in Java, Scala, Python, and R, and supports SQL queries through Spark SQL.
