Loading file with Great Expectations · Learn Microsoft Fabric

Jul '24 (edited) • Technical

Loading file with Great Expectations

I'm trying to follow the instructions from the "End-to-end data validation strategies in Microsoft Fabric" video using my own data. I'm new here and new to notebooks in general. I'm on the first phase from the video, Schema Validation (and am thankful for the guidance in the video).

I have a parameter node where I've defined the parameter like:

fileToTest = "Files/_folder_/_fileName_.csv"

The second node is:

%pip install --q great_expectations

--- update ---

It seems that the pip install command is causing at least part of my problem. If I comment it out, I can use the fileToTest parameter in subsequent cells, but then I can't use the Great Expectations functions that depend on the install.
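One possible explanation, stated here as an assumption rather than something confirmed in this thread: in Fabric notebooks, running %pip install may restart the Python interpreter, which would discard any variables defined in earlier cells. The toy sketch below only models that idea, with a plain dictionary standing in for the notebook session. The helpers run_cell and restart_interpreter and the path "Files/example.csv" are hypothetical and are not part of Fabric or Great Expectations.

```python
# Toy model only: a dict stands in for the notebook session namespace.
session = {}


def run_cell(source, namespace):
    """Execute one 'cell' of Python source inside the given namespace."""
    exec(source, namespace)


def restart_interpreter(namespace):
    """Simulate an interpreter restart by discarding every variable."""
    namespace.clear()


# Order used in the post: parameter cell first, install cell second.
run_cell('fileToTest = "Files/example.csv"', session)
restart_interpreter(session)  # stands in for the %pip install cell (assumption)
lost = "fileToTest" not in session

# Alternative order: install cell first, parameter cell second.
restart_interpreter(session)  # stands in for the %pip install cell (assumption)
run_cell('fileToTest = "Files/example.csv"', session)
kept = "fileToTest" in session

print(lost, kept)
```

If the assumption holds, moving the install cell ahead of the parameter cell (or re-running the parameter cell after the install) would keep fileToTest available to later cells.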

Third node is:

import great_expectations as gx

gxContext = gx.get_context()

validator = gxContext.sources.pandas_default.read_csv(fileToTest)

I consistently receive a "NameError" for the reference to "fileToTest".
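A quick way to narrow this down is to check, at the top of the failing cell, whether the variable still exists in the session before it is used. This is a minimal diagnostic sketch, not a fix: require_variable is a hypothetical helper, and the demo namespace and path are made up for illustration. In a notebook cell it would be called with globals() as the namespace.

```python
def require_variable(name, namespace):
    """Return namespace[name], or raise a NameError with a more helpful hint."""
    if name not in namespace:
        raise NameError(
            f"{name!r} is not defined in this session; "
            "try re-running the cell that defines it"
        )
    return namespace[name]


# In a notebook: path = require_variable("fileToTest", globals())
demo_namespace = {"fileToTest": "Files/example.csv"}  # hypothetical stand-in
path = require_variable("fileToTest", demo_namespace)

# A missing name produces the clearer error instead of a bare NameError.
try:
    require_variable("fileToTest", {})
    missing_detected = False
except NameError:
    missing_detected = True
```

If the check fails only in cells that run after the install cell, that points at the session state being reset rather than at the Great Expectations call itself.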

I wanted to confirm that the fileToTest parameter itself works, so in the same parameter node I added the following (I started by dragging the file into the notebook):

df = spark.read.format("csv").option("header","true").load(fileToTest)

display(df)

This works, displaying the content of the CSV.

What am I missing?
