Hey everyone,
I wanted to ask if anyone can offer some advice on a problem I have encountered.
Scenario
I am conducting a review of our institution's recruitment campaigns.
To do this, I have been given access to a SharePoint folder which contains a daily extract of recruitment data. These extracts are CSV files and are around 50MB.
Every day at around 5am, a CSV extract appears with yesterday's recruitment data.
From these files I want to create a simple dataset with the application, offer and new registration numbers for each date, and possibly break these down by other dimensions, such as domicile country.
Dataflow Gen2 won't work, as it can't handle such a large volume of data.
I thought a data pipeline would work: if I can perform the aggregation on one CSV extract, I could then use a ForEach loop to apply the same aggregation to all of the other extracts.
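To make the idea concrete, here is a rough Python sketch of the logic I have in mind: aggregate one extract, then repeat for every extract in a folder. It is only an illustration, not a working pipeline. The column names (event_date, stage, domicile_country) are made up, and I am assuming one row per application, offer or registration event, which may not match the real extract layout. It also assumes the files have already been copied somewhere the code can read them as local files.

```python
import csv
from collections import Counter
from pathlib import Path

# Hypothetical column names; the real extract headers may differ.
DATE_COL = "event_date"
STAGE_COL = "stage"  # assumed values such as "application", "offer", "registration"
COUNTRY_COL = "domicile_country"


def aggregate_extract(path):
    """Count rows per (date, country, stage) in a single daily extract."""
    counts = Counter()
    # utf-8-sig also copes with a byte-order mark at the start of the file.
    with open(path, newline="", encoding="utf-8-sig") as f:
        for row in csv.DictReader(f):
            counts[(row[DATE_COL], row[COUNTRY_COL], row[STAGE_COL])] += 1
    return counts


def aggregate_folder(folder, out_path):
    """Run the same aggregation over every CSV in a folder and write one summary file."""
    total = Counter()
    for path in sorted(Path(folder).glob("*.csv")):
        total.update(aggregate_extract(path))
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([DATE_COL, COUNTRY_COL, STAGE_COL, "count"])
        for (date, country, stage), n in sorted(total.items()):
            writer.writerow([date, country, stage, n])
    return total
```

Each file is read row by row, so a 50MB extract never has to sit fully in memory, and only the small table of counts is kept. The summary file should be written outside the extract folder, otherwise it would be picked up as an input on the next run.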
Any help please?