
Apache Spark Architecture

Spark follows a master-slave architecture in which the master is called the “Driver” and the slaves are called “Workers”. When you run a Spark application, the Spark Driver creates a context that serves as the entry point to your application; all operations (transformations and actions) are executed on the worker nodes, and the resources are managed by the Cluster Manager.
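
The context the Driver creates is typically obtained through a SparkSession, which also exposes the SparkContext used in the examples below. Here is a minimal sketch of creating it; the master URL and application name are illustrative assumptions, not values from this article.

import org.apache.spark.sql.SparkSession

//Minimal sketch: create the SparkSession entry point on the Driver.
//The master URL and appName below are illustrative assumptions.
val spark: SparkSession = SparkSession.builder()
  .master("local[1]")      //local mode; on a real cluster this points at the Cluster Manager
  .appName("SparkExample") //hypothetical application name
  .getOrCreate()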

//Create RDD from parallelize
val dataSeq = Seq(("Java", 20000), ("Python", 100000), ("Scala", 3000))
val rdd = spark.sparkContext.parallelize(dataSeq)

//Create RDD from an external data source
val rdd2 = spark.sparkContext.textFile("/path/textFile.txt")
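
To see how the pieces described above interact, the short sketch below applies a transformation and then an action to the rdd created earlier; the uppercasing logic is an illustrative assumption, not part of the original example.

//Transformations such as map are recorded lazily on the Driver.
val upper = rdd.map { case (lang, count) => (lang.toUpperCase, count) }
//Actions such as collect trigger a job that executes on the Workers
//and return the results to the Driver.
upper.collect().foreach(println)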
