During an ongoing implementation of Fabric, I ran into a drawback while ingesting data from an on-premises MS SQL Server into Fabric using `spark.read.format("jdbc")`. To make the on-prem server reachable from Fabric, a component called the "on-premises data gateway" is configured (I am not a networking expert, so I can't explain the nitty-gritty of this gateway). When ingesting via Dataflow Gen2, we supply this gateway in addition to the SQL Server database credentials, and the ingestion runs smoothly.
However, the same is not achievable when ingesting with `spark.read.format("jdbc")`, as there is no option to specify this gateway. My suggestion, therefore, is to avoid the notebook-based approach for ingestion from an on-prem source.
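For reference, here is a minimal sketch of the kind of JDBC read I attempted. The hostname, database, table, and credentials are placeholders, not my actual environment; note that none of the available options reference the gateway.

```python
from pyspark.sql import SparkSession

# In a Fabric notebook the `spark` session already exists; this line is only
# needed if you run the sketch outside that environment.
spark = SparkSession.builder.getOrCreate()

# Placeholder connection details -- illustrative only.
jdbc_url = (
    "jdbc:sqlserver://onprem-sql-host:1433;"
    "databaseName=SalesDB;encrypt=true;trustServerCertificate=true"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.Orders")
    .option("user", "sql_user")           # placeholder credential
    .option("password", "sql_password")   # placeholder credential
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)

df.show(5)
```

In my case this fails at connection time, since the Fabric Spark cluster has no network path to the on-prem server and the JDBC reader offers no option to route the connection through the on-premises data gateway.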
I would also like input from fellow Fabricators:
- Is there a better alternative approach, apart from the "on-premises data gateway", to achieve data ingestion from an on-prem source using a notebook? Has anyone tried one successfully?