Fabric Notebooks give great connectivity options to the default Lakehouse for Spark and Spark SQL, but they are more limited when it comes to connecting securely and smoothly to a SQL endpoint, be that a Lakehouse or a Warehouse.
We want two things:
- Programmatically retrieve the connection string rather than having to specify it manually or pass it in as a parameter.
- Authenticate to the SQL endpoint with an AAD token rather than using a service principal or passing a username/password without MFA.
Sample code is below and also at the GitHub link that follows. Paste it into a notebook that is attached to a default lakehouse and has a data warehouse in the same workspace.
https://github.com/ProdataSQL/Fabric/blob/main/10_SqlEndPoint/SQL%20Endpoint.py
dw_name="FabricDW" # Change this to your DW Name
import sempy.fabric as fabric
import struct
import sqlalchemy
import pyodbc
import pandas as pd
from notebookutils import mssparkutils
# Function to return a SQLAlchemy ODBC engine, given a connection string, using integrated AAD auth to Fabric
def create_engine(connection_string: str):
    # Get an AAD token for the Power BI/Fabric audience and pack it into the structure ODBC expects
    token = mssparkutils.credentials.getToken('https://analysis.windows.net/powerbi/api').encode("UTF-16-LE")
    token_struct = struct.pack(f'<I{len(token)}s', len(token), token)
    SQL_COPT_SS_ACCESS_TOKEN = 1256  # pyodbc pre-connect attribute for passing an access token
    return sqlalchemy.create_engine("mssql+pyodbc://", creator=lambda: pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct}))
# Get the ODBC connection string for the default LH in this notebook
tenant_id=spark.conf.get("trident.tenant.id")
workspace_id=spark.conf.get("trident.workspace.id")
lakehouse_id=spark.conf.get("trident.lakehouse.id")
lakehouse_name=spark.conf.get("trident.lakehouse.name")
sql_end_point= fabric.FabricRestClient().get(f"/v1/workspaces/{workspace_id}/lakehouses/{lakehouse_id}").json()['properties']['sqlEndpointProperties']['connectionString']
connection_string = f"Driver={{ODBC Driver 18 for SQL Server}};Server={sql_end_point}"
print (f"connection_string={connection_string}")
engine = create_engine(connection_string)
with engine.connect() as alchemy_connection:
    # Run a T-SQL query on the LH SQL endpoint
    query = f"exec {lakehouse_name}.[sys].[sp_server_info] 2"
    df = pd.read_sql_query(query, alchemy_connection)
    print(df)

    # Run a T-SQL query on the DW SQL endpoint
    query = f"exec {dw_name}.[sys].[sp_server_info] 2"
    df = pd.read_sql_query(query, alchemy_connection)
    print(df)
# Execute a T-SQL stored procedure or DDL/DML on a Fabric DW
connection = engine.raw_connection()
cursor = connection.cursor()
sql = f"USE {dw_name};CREATE TABLE tmpTable (Column1 INT NULL);DROP TABLE tmpTable"
cursor.execute(sql)
connection.commit()
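Once you have the engine, the pandas results can also be handed straight back to Spark if you want to continue in PySpark. Below is a minimal sketch, assuming the engine and lakehouse_name variables from the sample above (the INFORMATION_SCHEMA query is just an illustration):
# Query the SQL endpoint with pandas, then hand the result to Spark (assumes engine/lakehouse_name from above)
with engine.connect() as conn:
    pdf = pd.read_sql_query(f"SELECT TOP 10 * FROM {lakehouse_name}.INFORMATION_SCHEMA.TABLES", conn)
sdf = spark.createDataFrame(pdf)  # convert the pandas DataFrame to a Spark DataFrame
sdf.show()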
How could Microsoft improve this?
- While there is an official DW connector for notebooks, it currently only supports Scala, for roughly the one person on the moon still using Scala. If this could be extended to PySpark, with support like Spark SQL already has, that would be amazing.
https://learn.microsoft.com/en-us/fabric/data-engineering/spark-data-warehouse-connector
- Currently mssparkutils.credentials.getToken() doesn't return a token that is immediately usable by ODBC or other database libraries. It would be great if it did, so we don't have to manually mess with the token structure, and also if the Fabric SQL endpoint were an audience option.
https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/microsoft-spark-utilities?pivots=programming-language-python#get-token
- Currently the AAD token is issued under the security context of the person who last saved the notebook. Ideally we need managed identity so a notebook can run under a system account rather than a user account. This is my biggest concern security-wise (you can verify which identity your queries run as with the sketch below).
This is not currently on the published roadmap for notebooks ;-(
https://learn.microsoft.com/en-gb/fabric/release-plan/data-engineering
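To see which identity the SQL endpoint actually receives (and confirm the last-saver behaviour described above), a quick check helps. This is a minimal sketch, assuming the engine from the sample above and that SUSER_SNAME() is available on the endpoint:
# Check which login the AAD token maps to on the SQL endpoint (assumes `engine` from the sample above)
with engine.connect() as conn:
    whoami = pd.read_sql_query("SELECT SUSER_SNAME() AS login_name", conn)
print(whoami)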
Credits
Thanks to Aidan@prodata.ie, who did most of the coding and testing.
Also thanks to https://x.com/mim_djo for pushing me to publish this workaround 😉
