Modern distributed computing problems

Modern distributed computing problems need a programming abstraction that is better than MapReduce: one that wraps MapReduce into a single higher-level concept. Hadoop introduced the MapReduce processing framework, a fundamentally new abstraction for programmers to approach distributed computing problems. Similarly, Spark is a new programming abstraction for distributed computing problems, and it is better and faster than MapReduce in several areas of distributed processing.
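To make the MapReduce abstraction concrete, here is a minimal word-count sketch in plain Python (no Hadoop required; the two-phase split mirrors what a real mapper and reducer would do, and the sample lines are illustrative assumptions):

```python
from collections import defaultdict

def map_phase(lines):
    # Map step: emit (word, 1) pairs, as a MapReduce mapper would.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Reduce step: sum the counts for each key.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["spark is fast", "spark is an abstraction"]
print(reduce_phase(map_phase(lines)))
# → {'spark': 2, 'is': 2, 'fast': 1, 'an': 1, 'abstraction': 1}
```

In a real cluster, the map and reduce phases run on different machines, with a shuffle in between that groups the pairs by key; the programmer only supplies the two functions.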

Spark grew out of the need to solve iterative algorithms, which were a tough challenge under the traditional MapReduce abstraction. Distributed computing algorithms such as iterative and graph algorithms work on the same data again and again. Hadoop, by contrast, is designed to transform data, reduce it, and then move ahead to the next step, writing intermediate results out each time. This limitation, primarily, gave rise to Spark and its new abstraction called the RDD (Resilient Distributed Dataset).
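The cost of repeated passes can be sketched in plain Python (no Spark required; the data list, step size, and iteration count are illustrative assumptions). Each pass below re-reads `data` from a fast in-memory list, which is what Spark's RDD caching enables; under classic MapReduce, every pass would instead round-trip through the distributed file system:

```python
data = [1.0, 4.0, 2.0, 8.0]   # imagine this dataset cached in cluster memory

guess = 0.0
for _ in range(20):           # iterative refinement toward the mean of data
    # "map": per-element error against the current guess
    errors = [x - guess for x in data]
    # "reduce": aggregate the errors, then update and loop again
    guess += 0.5 * sum(errors) / len(data)

print(round(guess, 3))
# → 3.75
```

The point is the access pattern, not the arithmetic: twenty passes over cached data are cheap, while twenty MapReduce jobs each re-reading input from disk are not.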

In June 2010, Matei Zaharia released his research paper on the project, titled "Spark: Cluster Computing with Working Sets," and followed up later that year by releasing the project as open source. In 2013, Spark collaborators founded Databricks, a company focused on supporting Spark as an open-source product, and Spark switched its license to Apache.
