The Spark documentation defines an RDD as a collection of elements partitioned across the nodes of the cluster that can be operated on in parallel. An RDD can be thought of as a collection, similar to a list or an array. Everything from loading the text file to manipulating and saving the data is done with an RDD. This interface makes it easy for users to think in terms of collections. Because the work and data are distributed, a failure at one point does not stop the rest of the system: the other nodes continue processing while the failed work is restarted elsewhere. This fault tolerance is possible because most functions in Spark are lazy.

So, instead of executing the instructions immediately, Spark stores them for later use in a DAG, or Directed Acyclic Graph. This graph of instructions continues to grow through a series of calls to transformations such as map and filter. It is this lineage awareness that makes it possible for Spark to handle failures so gracefully. Each RDD in the graph knows how it was built, which allows it to choose the best path for recovery. Actions such as collect, count, and reduce trigger execution of the DAG and produce a final result from the data.
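The difference between recording a transformation and running it can be illustrated with a small pure-Python sketch. This is a toy model on a single machine, not Spark's API: the `ToyRDD` class, its `describe` method, and the `double` helper are hypothetical names made up for this example. It only shows the idea that map and filter append to a lineage, while collect and count replay it.

```python
from typing import Any, Callable, Iterable, List, Tuple


class ToyRDD:
    """A toy, single-machine stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, source: Iterable[Any],
                 lineage: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = list(source)
        self._lineage = lineage  # recorded steps, oldest first

    # Transformations: record the step and return a new ToyRDD. Nothing runs yet.
    def map(self, fn: Callable) -> "ToyRDD":
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, pred: Callable) -> "ToyRDD":
        return ToyRDD(self._source, self._lineage + (("filter", pred),))

    def describe(self) -> List[str]:
        """Return the names of the recorded steps, i.e. how this dataset was built."""
        return [kind for kind, _ in self._lineage]

    # Actions: replay the recorded lineage against the source data.
    def collect(self) -> List[Any]:
        data = iter(self._source)
        for kind, fn in self._lineage:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)

    def count(self) -> int:
        return len(self.collect())


calls = []  # records every element that `double` actually processes


def double(x: int) -> int:
    calls.append(x)
    return x * 2


rdd = ToyRDD(range(1, 6)).map(double).filter(lambda x: x > 4)
steps = rdd.describe()           # the lineage: ["map", "filter"]
ran_before_action = len(calls)   # 0, because no action has been called yet
result = rdd.collect()           # the action replays the lineage: [6, 8, 10]
```

Because each `ToyRDD` carries the full list of steps that produced it, a lost result could be rebuilt by replaying those steps from the source, which is the intuition behind lineage-based recovery.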

An action triggers an execution of the graph. An RDD may be operated on like any other collection, but it is really distributed across your cluster, to be executed in parallel. The driver application is just like any other application, at least until an action is triggered. At that point the driver and its SparkContext distribute tasks to the nodes, each of which transforms its chunk of the data as quickly as it can. Once all the nodes have completed their tasks, the next stage of the DAG can be triggered.
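The driver's role can be sketched in plain Python as well. This is only an analogy under stated assumptions: threads in a `ThreadPoolExecutor` stand in for cluster nodes, and `split_into_partitions`, `run_stage`, and `square_all` are hypothetical helpers, not Spark functions. The sketch shows one task per partition and a stage that finishes only when every task has returned.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


def split_into_partitions(data: List[Any], num_partitions: int) -> List[List[Any]]:
    """Slice the data into roughly equal chunks, one per simulated node."""
    size = max(1, -(-len(data) // num_partitions))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_stage(partitions: List[Any], task: Callable, workers: int = 4) -> List[Any]:
    """Run one task per partition in parallel and wait for all of them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consuming pool.map with list() blocks until every task is done,
        # so the next stage cannot start early. Results keep partition order.
        return list(pool.map(task, partitions))


def square_all(chunk: List[int]) -> List[int]:
    return [x * x for x in chunk]


data = list(range(1, 11))
partitions = split_into_partitions(data, 3)      # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
stage_one = run_stage(partitions, square_all)    # stage 1: square within each partition
partial_sums = run_stage(stage_one, sum)         # stage 2: one partial sum per partition
total = sum(partial_sums)                        # the "driver" combines the results: 385
```

In this analogy the script itself plays the driver: it behaves like any ordinary program until `run_stage` is called, hands out the per-partition tasks, and moves to the next stage only after all of them have completed.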
