Dataflow and Notebook - Performance
Hi everyone,
I just ran a test and would like to hear your opinions. I built two separate pipelines: one using Dataflow (Power Query) and the other using a notebook with Spark. The task is simple, but it involves two large tables: I join (merge) them and expand the columns, combining the two tables into one.
The problem is that this simple process takes 30 minutes to run in Dataflow but only 3 minutes in the notebook. My question is: if an analyst can choose either Dataflow or a notebook to build this pipeline, my understanding is that the notebook will always perform better. Is that assumption correct? What is your opinion?
Thank you.
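For anyone who wants to reproduce the comparison, the notebook side of a test like the one described above could look roughly like the sketch below. This is not the code from the original test, only a minimal PySpark illustration of a join that keeps the columns of both tables. The helper names `merge_and_expand` and `run_pipeline`, and every table and column name passed to them, are hypothetical.

```python
def merge_and_expand(left, right, key, how="left"):
    """Join two Spark DataFrames on `key`, keeping the columns of both.

    Passing the key as a list of column names makes Spark emit the key
    column once in the result instead of duplicating it.
    """
    return left.join(right, on=[key], how=how)


def run_pipeline(spark, left_table, right_table, key, target_table):
    """Read two tables, join them, and write the unified result.

    `spark` is an existing SparkSession; all table names are supplied by
    the caller and are placeholders in this sketch.
    """
    left = spark.read.table(left_table)
    right = spark.read.table(right_table)
    unified = merge_and_expand(left, right, key)
    # Overwrite so the step can be re-run while timing it.
    unified.write.mode("overwrite").saveAsTable(target_table)
    return unified
```

In a notebook with a session already available as `spark`, a call such as `run_pipeline(spark, "orders", "customers", "customer_id", "orders_unified")` would perform the join and write the result, assuming tables with those names exist. Timing that call against the equivalent Dataflow refresh gives the kind of comparison described in the post.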
Washington Ribeiro