Please help if you can. I have been running PySpark notebooks in Fabric that load data from an Azure Synapse dedicated SQL pool into a Fabric lakehouse, using spark.read with the synapsesql connector as shown below. Code using this access method had been running successfully for over a month; I was using it to load our new Enterprise Data Warehouse and it performed flawlessly. It quit working Monday afternoon 5/20, US Central time.
from pyspark.sql.functions import col
# standard imports for the dedicated SQL pool Spark connector
import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants
df = (spark.read
    .option(Constants.SERVER, "myserver.sql.azuresynapse.net")    # redacted
    .option(Constants.USER, "myfabriclogin")
    .option(Constants.PASSWORD, "myfabricloginpassword")
    .option(Constants.DATABASE, "dw")
    .option(Constants.DATA_SOURCE, "my_external_data_source")     # redacted
    .option(Constants.QUERY, "select * from dw.Dealer")
    .synapsesql()
)
df.count()
I have of course changed all the values shown above for the server, login, password, and data source.
The code errors out at the df.count() statement. I believe this is because Spark uses lazy evaluation, and that is the point where it actually has to pull the data.
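To illustrate the lazy-evaluation point, here is roughly what I mean (a minimal sketch; "DealerId" is just a hypothetical column name):
# Transformations are lazy, so nothing is read from the dedicated pool yet
# and these lines succeed even though the actual read is failing.
df_small = df.select(col("DealerId")).limit(1)
# Actions force execution; this is where the connector really goes through the
# ADLS staging area, and where the NullPointerException shows up.
df_small.collect()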
The head of the exact error message is here:
Py4JJavaError: An error occurred while calling o4666.count. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in stage 13.0 failed 4 times, most recent failure: Lost task 6.3 in stage 13.0 (TID 38) (vm-e1e05730 executor 1): java.lang.NullPointerException
The last part of the error message references a null pointer, so I think the data frame is empty. If I try to do a "head" on the data frame I get the same sort of error. I am able to log in directly to the dedicated SQL pool using SSMS with the user, password, and database shown above, and everything works.
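For what it's worth, the same credentials could also be tested from inside the notebook with a plain JDBC read, which bypasses the connector's ADLS staging step entirely (a sketch only, with the same placeholder values; it assumes the SQL Server JDBC driver is available on the runtime, which it normally is in Fabric):
# Plain JDBC read: no external data source, no staging in ADLS Gen2.
# If this works while synapsesql() fails, the login/database are fine and the
# problem is somewhere in the staging / external data source path.
jdbc_df = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://myserver.sql.azuresynapse.net:1433;database=dw")  # redacted
    .option("query", "select top 10 * from dw.Dealer")
    .option("user", "myfabriclogin")
    .option("password", "myfabricloginpassword")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load())
jdbc_df.show()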
A word about the "DATA_SOURCE": it is used to set up a temporary holding place in ADLS Gen2 for data being imported from the dedicated pool. The data source exists as an "EXTERNAL DATA SOURCE" within the dedicated SQL pool, and it uses a database scoped credential, also created on the dedicated SQL pool. The database scoped credential is designed to be a temporary access method. This failure occurred exactly two months after we initially started using this, so I thought the credential was the problem, but creating a new credential and a new data source did not fix the issue; it did, however, slightly change the error messages.
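Since the DATA_SOURCE only points the connector at that temporary ADLS Gen2 staging location, another check I can think of is whether the staging container is even reachable from the Spark side (again a sketch; the abfss path is a placeholder for our real staging container):
# mssparkutils is built into the Fabric notebook runtime.
from notebookutils import mssparkutils
# List the staging location the external data source points at. If this fails,
# the Spark side cannot read what the dedicated pool exports to staging.
mssparkutils.fs.ls("abfss://staging@mystorageaccount.dfs.core.windows.net/")
On the pool side, the sys.external_data_sources and sys.database_scoped_credentials catalog views (queryable from SSMS) show which data source and credential the pool is actually using.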
Sorry for the very long post. I would greatly appreciate any new ideas, as I am all out of them.
I originally got the idea to do this from the following YouTube video:
Microsoft Fabric: Import Azure Synapse Dedicated Pool FAST via Spark Notebook!