Spark SQL is one of the most used Spark modules which is used for processing structured columnar data format. Once you have a DataFrame created, you can interact with the data by using SQL syntax. In other words, Spark SQL brings native RAW SQL queries on Spark meaning you can run traditional ANSI SQL on Spark Dataframe. In the later section of this Apache Spark tutorial, you will learn in detail using SQL select, where,group,join,union e.t.c
val groupDF = spark.sql("SELECT gender, count(*) from PERSON_DATA group by gender")
groupDF.show()