Help listing files in subdirectories 4 levels deep below a top-level directory using a notebook
Hey notebook gurus, maybe I'm making this harder than it needs to be. In a Fabric notebook I want to input the name of a top-level directory and have it return the paths and names of all parquet files in the subdirectories beneath it. The subdirectories follow a year/month/day folder hierarchy, with the files sitting in the lowest (day) folders, so four levels down. I have tried functions off Google using dbutils, os.walk and mssparkutils.fs.ls (which doesn't support recursive folder searches as far as I know), but nothing seems to work. Does anybody have a method or function they can share to get this list of file names/paths? I want to load the list into an array variable and then feed each individual file into downstream DataFrame tasks. Any help would be greatly appreciated.
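To make the ask concrete, here is a rough sketch of the kind of helper I mean, not a tested Fabric solution. It walks the tree itself by calling a non-recursive listing function on each folder it finds. The names `list_parquet_files`, `local_ls` and `Entry` are made up for illustration, and it assumes that whatever `ls` is passed in returns entries exposing `path` and `isDir` attributes, which is what I understand `mssparkutils.fs.ls` results to have.

```python
import os
from collections import namedtuple

# Stand-in for one directory-listing entry. Assumption: it mirrors the
# `path` and `isDir` attributes of the objects mssparkutils.fs.ls returns.
Entry = namedtuple("Entry", ["path", "isDir"])


def local_ls(path):
    """Non-recursive listing of a local folder (hypothetical helper so the
    sketch can be tried outside Fabric)."""
    return [Entry(e.path, e.is_dir()) for e in os.scandir(path)]


def list_parquet_files(root, ls=local_ls):
    """Collect every .parquet path under `root`, at any depth, using only a
    non-recursive `ls` callable."""
    found = []
    pending = [root]              # folders still to be listed
    while pending:
        current = pending.pop()
        for entry in ls(current):
            if entry.isDir:
                pending.append(entry.path)    # descend into year/month/day
            elif entry.path.endswith(".parquet"):
                found.append(entry.path)
    return sorted(found)


# In a Fabric notebook the call would presumably look like this
# ("Files/sales" is a hypothetical top-level folder):
#   paths = list_parquet_files("Files/sales", ls=mssparkutils.fs.ls)
#   for p in paths:
#       df = spark.read.parquet(p)
```

The explicit `pending` list is used instead of recursion only to keep the loop flat; depth does not matter, so the same function would work if the hierarchy ever grows beyond year/month/day.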