Java is arguably the most fundamental programming language for big data, and a large part of the big data development I have come into contact with grew out of Java Web development.
Second, the Hadoop big data ecosystem
Hadoop is an open-source distributed computing platform written in Java that provides both distributed storage and distributed computation for big data. It grew out of research and development work on big data platforms and is now one of the most widely used platforms for supporting big data workloads.
Third, Scala (the "golden language") and Spark
Scala is very similar to Java. Both languages run on the JVM, and code written in one can call code written in the other seamlessly during development.
Spark is a fast, general-purpose computing engine designed for large-scale data processing. It can take the place of MapReduce, is compatible with HDFS and Hive, and can be integrated into the Hadoop ecosystem to make up for MapReduce's shortcomings.
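To make the map-then-reduce idea behind MapReduce and Spark more concrete, here is a minimal word-count sketch in plain Java. It runs locally and in memory and does not use the Hadoop or Spark APIs; the class name `WordCountSketch` and the method `countWords` are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: a local, in-memory imitation of the map/reduce model.
// No Hadoop or Spark classes are involved.
class WordCountSketch {

    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // "Map" step: split every line into lowercase words.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                // "Reduce" step: group identical words and count each group.
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        List<String> lines = List.of("spark and hadoop", "spark on hadoop");
        System.out.println(countWords(lines)); // prints {and=1, hadoop=2, on=1, spark=2}
    }
}
```

On a real cluster the same two steps would be spread over many machines, with the input typically read from HDFS; the sketch only shows the shape of the computation.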
Fourth, hands-on big data projects
A hands-on project typically walks through data collection, data processing, data analysis, data display, and data application.
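The stages above can be sketched as a tiny end-to-end pipeline in plain Java. This is only an illustration of how the stages hand data to one another: the class `PipelineSketch`, its method names, and the "page,visits" sample records are all hypothetical, and a real project would read from actual sources and use a framework such as Hadoop or Spark.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: each method stands in for one project stage.
class PipelineSketch {

    // Data collection: hard-coded "page,visits" lines (hypothetical sample data).
    static List<String> collect() {
        return List.of("home,3", "cart,1", "home,2", "bad-line");
    }

    // Data processing: drop malformed lines and parse the rest into (page, visits) pairs.
    static List<Map.Entry<String, Integer>> process(List<String> raw) {
        return raw.stream()
                .map(line -> line.split(","))
                .filter(parts -> parts.length == 2)
                .map(parts -> Map.entry(parts[0], Integer.parseInt(parts[1].trim())))
                .collect(Collectors.toList());
    }

    // Data analysis: total visits per page, sorted by page name.
    static Map<String, Integer> analyze(List<Map.Entry<String, Integer>> records) {
        return records.stream()
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        Integer::sum, TreeMap::new));
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = analyze(process(collect()));
        // Data display: print one line per page.
        totals.forEach((page, visits) -> System.out.println(page + ": " + visits));
    }
}
```

The data application stage would consume `totals`, for example to feed a report or a recommendation feature; it is left out here to keep the sketch short.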