Quiz: Spark basics

Below are a few questions that should come in handy on a first pass:

  • Spark architecture? Cluster types, modes, and spot instances? Mounting storage? Job vs stage vs task?
  • Actions vs transformations? Directed acyclic graphs (DAGs)? Lazy evaluation?
  • RDD vs DataFrame vs Dataset? Parquet file vs Avro file?
  • StructType vs StructField? Delta Lake? Time travel?
  • Syntax errors vs exceptions?
  • startsWith() vs endsWith()? withColumn vs select vs withColumnRenamed? map vs flatMap? Why use ‘literals’?
  • .collect()? show vs display? How to display the full values of a column?
  • Create an RDD from a list? Create an RDD from a text file? current_date vs current_timestamp?
  • Reading and writing a file? Create an empty DataFrame?
  • Convert a DataFrame to an RDD and an RDD to a DataFrame?
  • Broadcast variables, explode, coalesce, and repartition?
  • Merge or union two DataFrames with different numbers of columns?
  • Iterate through each row of a DataFrame in PySpark?
  • How to handle NULL values?
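
As a warm-up for the map vs flatMap and lazy evaluation questions above, here is a minimal sketch in plain Python rather than Spark itself. It is only an analogy: the built-in `map` and `itertools.chain.from_iterable` stand in for an RDD's `map` and `flatMap`, and `list()` stands in for an action such as `.collect()`. The sample strings and variable names are made up for illustration.

```python
from itertools import chain

# Hypothetical sample data standing in for the lines of a text file.
lines = ["spark is fast", "spark is lazy"]

# map: exactly one output element per input element
# (here, one list of words per line).
mapped = map(str.split, lines)

# flatMap analog: each input may yield several outputs,
# and the results are flattened into a single sequence.
flat_mapped = chain.from_iterable(map(str.split, lines))

# Both objects above are lazy iterators: no line has been split yet.
# Calling list() forces evaluation, much like an action does in Spark.
mapped_result = list(mapped)
flat_result = list(flat_mapped)

print(mapped_result)  # nested: one list of words per line
print(flat_result)    # flat: a single list of words
```

In this sketch `mapped_result` keeps one entry per input line, while `flat_result` is a single flat list of words. In Spark the same contrast applies to `rdd.map(...)` versus `rdd.flatMap(...)`, with the extra point that nothing runs on the cluster until an action is called.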