Jul '24 (edited) • Technical
Loading file with Great Expectations
I'm trying to follow the instructions from the "End-to-end data validation strategies in Microsoft Fabric" video with my own data. I'm a newbie here and new to the notebook concept. I'm on the video's first phase, Schema Validation (and am thankful for the guidance in the video).
I have a parameter node where I've defined the parameter like:
fileToTest = "Files/_folder_/_fileName_.csv"
Second node is:
%pip install --q great_expectations
--- update ---
It seems that the pip install command is causing at least part of my problem. If I comment it out, I can use the fileToTest parameter in subsequent cells, but then I can't use the functions that depend on the great_expectations package.
Third node is:
import great_expectations as gx
gxContext = gx.get_context()
validator = gxContext.sources.pandas_default.read_csv(fileToTest)
I consistently receive a "NameError" for the reference to "fileToTest".
I wanted to validate that the fileToTest parameter works, so in the same parameter node I added the following (I started by dragging the file into the notebook):
df = spark.read.format("csv").option("header","true").load(fileToTest)
display(df)
This works, displaying the content of the CSV.
What am I missing?
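One possible explanation, offered as an assumption rather than a confirmed diagnosis: an inline %pip install in a Fabric notebook is documented to restart the Python interpreter, which would discard any variable defined in an earlier cell, including fileToTest. The toy model below is plain Python, not Fabric or Great Expectations code. The run_cells helper and the "pip" marker are hypothetical names that only imitate a session whose namespace is wiped by the install step.

```python
# Toy model of a notebook session: every "cell" runs against one shared namespace.
# Assumption: an inline %pip install restarts the interpreter. That restart is
# modelled here by clearing the namespace; nothing is actually installed.

def run_cells(cells):
    """Run cells in order; return the first NameError message, or None."""
    namespace = {}
    for kind, source in cells:
        if kind == "pip":
            namespace.clear()  # simulated restart: earlier variables are gone
            continue
        try:
            exec(source, namespace)
        except NameError as err:
            return f"NameError: {err}"
    return None

# The parameter cell from the post (path placeholders kept as written).
PARAM = 'fileToTest = "Files/_folder_/_fileName_.csv"'
# Stand-in for the cell that calls read_csv(fileToTest).
USE = "path_seen = fileToTest"

# Order used in the post: parameter, then install, then use.
original_order = [("code", PARAM), ("pip", ""), ("code", USE)]
# Alternative order: install first, then parameter, then use.
reordered = [("pip", ""), ("code", PARAM), ("code", USE)]

print(run_cells(original_order))
print(run_cells(reordered))
```

If that assumption holds for this notebook, moving the %pip install cell above the parameter cell (or installing the library through the workspace environment instead of inline) might be enough to keep fileToTest defined when the read_csv cell runs.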
John Nickell