
Databricks Certified Associate Developer for Apache Spark — Preparation Series

Bilal Maqsood
Dec 28, 2020

I have had my eyes set on Databricks’ Spark certification for some time now, and I have finally started preparing for the exam. This time I have decided to document my preparation in a series so others can benefit as well. This is Part I of the series.

For those of you who don’t know, Databricks is a company founded by the original creators of Apache Spark. It offers a cloud-based solution that helps businesses get the maximum benefit from Spark. You can find more details at databricks.com.

In this part I will explain certification exam details, prerequisites, and what you’ll need to prepare. There’s also a resource for practice exams shared at the end of this article.

Exam Details:

  • The exam costs USD 200, and there are no free retakes.
  • You will have 120 minutes to answer 60 multiple-choice questions.
  • The passing score is 70%, which means you must answer at least 42 questions correctly.
  • The exam is online and proctored.
  • A PDF version of the Spark documentation for your chosen language (Python or Scala) will be available during the exam.
  • Unofficial results are shared immediately after the exam, and the certificate becomes available on Databricks Academy in one week.
  • The certificate does not expire, but it is tied to a particular Spark version (e.g. Databricks Certified Associate Developer for Apache Spark 3.0).
  • The exam is only available in English.
  • Databricks does not offer any practice exams for now.
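
To make the numbers above concrete, here is a small sketch of the arithmetic behind the pass mark and the time budget. It only uses the figures listed in the exam details; the variable names are my own.

```python
# Figures taken from the exam details listed above.
TOTAL_QUESTIONS = 60
TIME_LIMIT_MINUTES = 120
PASSING_PERCENT = 70

# Smallest whole number of correct answers that reaches the passing score
# (integer ceiling division, so no floating-point rounding is involved).
questions_to_pass = (TOTAL_QUESTIONS * PASSING_PERCENT + 99) // 100

# How many questions you can afford to get wrong and still pass.
allowed_mistakes = TOTAL_QUESTIONS - questions_to_pass

# Average time available per question.
minutes_per_question = TIME_LIMIT_MINUTES / TOTAL_QUESTIONS

print(f"Correct answers needed: {questions_to_pass}")
print(f"Mistakes you can afford: {allowed_mistakes}")
print(f"Average minutes per question: {minutes_per_question}")
```

In other words, there are about two minutes per question, which leaves little room for long searches through the documentation PDF.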

How to register for the exam:

  • Click on the Certifications tab to see all available certification exams.
  • Click the View button for the exam you would like to take, then click Register at the bottom of the next page. You will be taken to Webassessor’s page, where you need to sign up. Webassessor is a third party that conducts and proctors the exam for Databricks.
  • Click on “Register for an exam” and choose the exam that you want to take.
  • Select a suitable date and time, acknowledge the terms of service, and pay on the next page. Congrats, you have registered.
  • Once registered, try not to change the booking. If you must, the FAQs say you can reschedule up to 24 hours before the exam by logging into your Webassessor account.

Topics covered in Exam:

The exam consists of 60 multiple-choice questions. There are three main categories:

  • Spark Architecture (~17%): You will only need a conceptual understanding of Spark’s architecture. I will write about it in upcoming posts in this series. Related resources are covered in the preparation section of this article.
  • Applied Spark Architecture (~11%): This section tests your applied understanding of Spark’s architecture. These questions are slightly more advanced, so you should be prepared. Together, the two architecture categories make up roughly 28% of the exam, which is about 17 of the 60 questions.
  • Spark DataFrame API Applications (~72%): The major portion of the exam relates to Spark’s DataFrame API, so this is where preparation should focus. We should be familiar with common transformations, built-in functions, and their usage in a Spark application, so that we do not lose time looking them up in the documentation.
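
As a rough guide to how the 60 questions split across the categories, the sketch below converts the approximate weights into question counts. The weights are the approximate percentages quoted above, so the counts are estimates and not an official breakdown.

```python
TOTAL_QUESTIONS = 60

# Approximate category weights (in percent) as listed above.
weights = {
    "Spark Architecture": 17,
    "Applied Spark Architecture": 11,
    "Spark DataFrame API Applications": 72,
}

# Estimated number of questions per category, rounded to whole questions.
approx_counts = {
    name: round(TOTAL_QUESTIONS * pct / 100) for name, pct in weights.items()
}

# The two architecture categories combined.
architecture_pct = weights["Spark Architecture"] + weights["Applied Spark Architecture"]
architecture_questions = round(TOTAL_QUESTIONS * architecture_pct / 100)

for name, count in approx_counts.items():
    print(f"{name}: about {count} questions")
print(f"Architecture combined: {architecture_pct}% (about {architecture_questions} questions)")
```

The DataFrame API category alone accounts for more questions than the pass mark requires, which is why it deserves most of the preparation time.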

Difference between Spark 2.4 and Spark 3.0 exams:

As per the Databricks FAQs, the two exams are conceptually very similar, because the parts of Spark covered by the exam syllabus changed little between Spark 2.4 and Spark 3.0. One major change is Adaptive Query Execution in Spark 3.0, which is covered in this blog post by Databricks.

Pre-requisites:

Anyone can register for this exam; however, candidates should meet the basic prerequisites given below:

  • Basic understanding of Spark’s architecture (for Spark 3.0 this includes Adaptive Query Execution).
  • Comfort with using Spark’s DataFrame API to manipulate data. This includes column operations, aggregations, join operations and working with UDFs.
  • Working knowledge of the programming language that you choose for the exam (Python or Scala).
  • It is also desirable to have about six months of working experience with Spark.

Preparation:

Now let’s talk about the preparation. As per Databricks, the following resources will be helpful in preparing for the exam.

  • Apache Spark Programming with Databricks: This is a Databricks Academy course. Luckily, I have access to this on my academy account. I will share the important aspects covered in this course in my upcoming articles, so stay tuned.
  • Spark Architecture — Quick Reference: This is also a Databricks Academy course and in fact I took that as well. However, I will write about Spark’s architecture from the certification’s point of view in upcoming posts in this series.
  • Spark: The Definitive Guide (Sections I, II and IV): This book is co-authored by Matei Zaharia, the original creator of Apache Spark, and it covers every topic in detail. I am using my O’Reilly subscription to read it. Section I is about big data and Spark, Section II covers the DataFrame API, and Section IV covers Spark in production. You can get the book online, and the code samples used in the book are here. I will try to cover these aspects in a separate post.
  • Learning Spark: This is also a great, and slightly easier, book on Spark. It is also available on O’Reilly, and its code samples can be found here.

So, this is all you need to know about Databricks’ Spark certification. I will write in detail about the preparation in upcoming posts in this series; the rest of the aspects are quite straightforward. I encourage you to go ahead and start exploring these resources.

Edit: 30 May 2021
I have found some practice exams on Udemy, which you can access through the link below. It is good practice to take a mock exam before the actual attempt. This way you can find out how prepared you are for the exam, and you can also work on your time management so you don’t waste any time in your actual attempt.

Databricks Certified Developer for Apache Spark 3.0 Practice Exams

I hope that you will find this series useful. You can find Part II here.

Happy learning!

Bilal Maqsood

Contemplate, Create, Contribute || Data Engineer @Quinyx