Understanding Spark and Sparkling Water

Spark is a library of code that can be used to process data in parallel on a cluster. The basic idea of Spark is parallelism: Spark breaks the data into pieces, sends the pieces to different computers for processing, then collects the results and combines them to get the final result. More specifically, the basic computing paradigm is: distribute a large data set across multiple nodes, map functions row by row, group data by a key, and then perform aggregate operations. Sparkling Water is the happy marriage between the open source technologies Apache Spark and H2O. It combines the advanced machine-learning algorithms from H2O with the execution power of Spark.
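The distribute, map, group-by-key, and aggregate steps described above can be sketched on a single machine. This is a rough illustration in plain Python, not Spark's API: threads stand in for cluster nodes, and the names `split_into_partitions`, `process_partition`, `merge_partials`, `total_by_key`, and the `city`/`amount` fields are all hypothetical, chosen only for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, num_partitions):
    """Distribute rows round-robin into partitions (stand-ins for cluster nodes)."""
    return [rows[i::num_partitions] for i in range(num_partitions)]


def process_partition(partition):
    """Map each row to a (key, value) pair and sum the values per key locally."""
    partial = defaultdict(int)
    for row in partition:
        key, value = row["city"], row["amount"]  # map step: row -> (key, value)
        partial[key] += value                    # group by key, aggregate locally
    return dict(partial)


def merge_partials(partials):
    """Combine the per-partition results into the final aggregate."""
    totals = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            totals[key] += value
    return dict(totals)


def total_by_key(rows, num_partitions=3):
    """Run the whole pipeline: distribute, process in parallel, combine."""
    partitions = split_into_partitions(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(process_partition, partitions))
    return merge_partials(partials)


# Made-up sample data, purely for illustration.
SAMPLE_ROWS = [
    {"city": "Oslo", "amount": 10},
    {"city": "Lima", "amount": 5},
    {"city": "Oslo", "amount": 7},
    {"city": "Pune", "amount": 3},
    {"city": "Lima", "amount": 8},
]

if __name__ == "__main__":
    print(total_by_key(SAMPLE_ROWS))
```

Because summation is associative and commutative, the final totals do not depend on how the rows are split across partitions, which is the property that lets this kind of aggregation be spread over many machines.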
Submitted by elementlist on Dec 01, 2016