V-Order
Hey all, we had a good discussion about V-Order in the recent monthly call. Since I have now also got answers to some open questions I had, I'm sharing everything I learned here. [Long post ahead]
What is V-Order?
  • V-Order is a write-time optimization for Parquet files that is specific to Fabric.
  • Since a Delta table stores its data as Parquet files underneath, V-Order applies to those Parquet files as well.
Idea of V-Order:
  • It applies additional sorting and compression to the Parquet files at WRITE time (costing roughly 15% more write time), which makes READS considerably faster (up to 50%) for the Fabric engines (Spark, SQL, Power BI).
Key points:
  • Any Parquet file that you "write" in Fabric (as opposed to one that is copied, uploaded, or reached through a shortcut) gets V-Order applied by default.
  • For example, if you write a Parquet file using a Copy data activity in a data pipeline, the resulting file will be V-Ordered.
  • If you write a Parquet file from a Spark notebook, the resulting file will be V-Ordered as well.
  • Both examples also hold when the output format is Delta.
How to disable it: (Screenshot attached)
  1. For Spark notebooks, set the V-Order Spark configuration to false with a spark.conf command.
  2. For data pipelines, untick the V-Order option in the file format settings. (This option only appears when the file format is Parquet.)
  3. Dataflows only write Delta tables, and I couldn't find any option to disable V-Order there.
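For option 1, here is a minimal sketch of what the notebook cell could look like. The configuration key below is my assumption based on Microsoft's documentation at the time of writing and may differ between Fabric runtime versions, so check it against the screenshot or the docs for your runtime. The helper name set_vorder is just illustrative, and spark is the session object a Fabric notebook already provides.

```python
# Assumed key name: verify it against the docs for your Fabric runtime.
VORDER_CONF_KEY = "spark.sql.parquet.vorder.enabled"


def set_vorder(spark, enabled: bool) -> str:
    """Turn V-Order writing on or off for the given Spark session.

    Returns the string value that was written to the session config.
    """
    value = "true" if enabled else "false"
    # spark.conf.set is the standard way to set a session-level Spark config.
    spark.conf.set(VORDER_CONF_KEY, value)
    return value
```

In a notebook you would call set_vorder(spark, False) before writing. The setting is session-level, so it only affects writes made later in the same session.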
How to check whether a Parquet file is V-Ordered? (Screenshot attached)
  • A V-Ordered Parquet file looks no different from a normal Parquet file. The only difference is in the file's metadata.
  • You can read the Parquet file's metadata using code.
  • Or you can open the file directly in a Parquet viewer.
  • The key "com.microsoft.parquet.vorder.enabled" (highlighted in the screenshot) will NOT be present in the metadata of a normal Parquet file.
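To make the "using code" option concrete, here is a rough sketch that looks for that key in a file's footer metadata. It assumes the pyarrow package is available and that the file is at a local path. The function names are my own, and I only check that the key is present, not what its value is.

```python
VORDER_KEY = "com.microsoft.parquet.vorder.enabled"


def has_vorder_key(metadata) -> bool:
    """Return True if a Parquet key/value metadata mapping contains the V-Order key."""
    if not metadata:
        return False
    # pyarrow returns metadata keys as bytes, so normalise them to str first.
    keys = {k.decode("utf-8") if isinstance(k, bytes) else k for k in metadata}
    return VORDER_KEY in keys


def is_vordered(path: str) -> bool:
    """Read only the footer of the Parquet file at `path` and check for the key."""
    # Imported lazily so has_vorder_key can be used without pyarrow installed.
    import pyarrow.parquet as pq

    return has_vorder_key(pq.read_metadata(path).metadata)
```

Calling is_vordered on a file written by Fabric should find the key. For a file that was uploaded or copied in from elsewhere, the key would normally be absent.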
PS: These are just based on my own findings, so please correct me if anything is inaccurate 😊
Vinayak K