Activity

[Contribution heatmap, Apr–Mar]

Memberships

Learn Microsoft Fabric

Public • 8.8k • Free

Fabric Dojo 织物

Private • 313 • $49/m

26 contributions to Learn Microsoft Fabric
Quick question - delta tables
I am working on a function to clean and optimize multiple delta tables in a lakehouse. Do we OPTIMIZE before VACUUM, or is it the other way around? Which is the best approach?
0
3
New comment 28d ago
3 likes • 28d
Typically optimize, then vacuum. I've never heard of anyone doing it in reverse. OPTIMIZE compacts the small files and leaves the old ones unreferenced; VACUUM then removes those older, now-unreferenced files. If you vacuum before you optimize, the small files are all still referenced, so nothing gets cleaned up and they're still present afterwards.
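For anyone scripting this across a whole lakehouse like the question describes, here is a minimal PySpark sketch of that order of operations. It assumes Delta Lake 2.0+ (for the Python OPTIMIZE API) and catalog-resolvable table names; maintain_tables and retention_hours are illustrative names, not from the thread.

```python
from delta.tables import DeltaTable

def maintain_tables(spark, table_names, retention_hours=168):
    """Hypothetical maintenance loop: OPTIMIZE first, VACUUM second."""
    for name in table_names:
        dt = DeltaTable.forName(spark, name)
        # Compact small files into larger ones; the old small files
        # become unreferenced in the Delta transaction log.
        dt.optimize().executeCompaction()
        # Now delete unreferenced files older than the retention
        # window (168h = 7 days, matching Delta's default).
        dt.vacuum(retention_hours)
```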
DAX Help - Trailing average for the previous 12 weeks
Our team needs a little DAX help from any experts out there... For each row we're looking to do a trailing average over the previous 12 weeks for a metric, lifts per hour = successful lifts / time on clock. I attached an image of the grid, and here is what our DAX looks like at the moment:

Trailing_12_Week_Avg_Lifts_Per_Hour =
VAR CurrentDate = MAX(dim_date[event_date])
VAR PreviousDate = CurrentDate - 84
VAR Result =
    CALCULATE(
        [lifts_per_hour],
        DATESINPERIOD(
            dim_date[event_date],
            MAX(dim_date[event_date]),
            -84,
            DAY
        )
    )
RETURN
    Result

Any assistance would be appreciated!
1
2
New comment Feb 14
0 likes • Feb 13
@Justin Sweet Can you provide the DAX for your [lifts_per_hour] measure? If it's not something like lifts_per_hour = DIVIDE(SUM('Table'[Lifts]), SUM('Table'[Time])), that might be the issue. If you're trying to look at every individual day's rate rather than the true average (total lifts / total time), you'd need to be using AVERAGEX in your Trailing_12_Week_Avg_Lifts_Per_Hour.
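To spell out why that distinction matters (my gloss, not from the thread): the average of daily ratios is not the ratio of the totals, so AVERAGEX and a DIVIDE of sums answer different questions:

$$\frac{1}{n}\sum_{d=1}^{n}\frac{\text{lifts}_d}{\text{time}_d} \;\ne\; \frac{\sum_{d=1}^{n}\text{lifts}_d}{\sum_{d=1}^{n}\text{time}_d}$$

For example, 10 lifts in 1 hour one day and 10 lifts in 10 hours the next gives a mean of daily rates of 5.5/hour, but a true rate of 20/11 ≈ 1.8/hour.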
ERROR: FAILED TO RENDER CONTENT
So, apparently I have been trying to ingest over 7 million rows from an API into a lakehouse table using a notebook, and due to rate limiting it has taken quite a while (more than expected). Two challenges:
1. I get a "Failed to render content" screen whenever I try to access the pipeline while it runs.
2. The pipeline times out after 12 hours.
Is this normal?
2
5
New comment Feb 13
2 likes • Feb 12
@Wilfred Kihara Can you provide a little more context? What SKU are you running on? 7 million rows shouldn't be enough to challenge any SKU from F8 up. You mention both a notebook and a pipeline. Are you running the notebook from the pipeline? If so, have you successfully tested the notebook outside the pipeline? And are you passing parameters into the notebook?
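For reference, a rate-limit-tolerant ingestion loop in a notebook usually looks something like the sketch below. The endpoint, paging scheme, and table name are placeholders, not details from the post; the two ideas it shows are backing off on HTTP 429 and appending each batch as it lands, so a 12-hour pipeline timeout doesn't throw away everything ingested so far.

```python
import time
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

API_URL = "https://api.example.com/records"  # placeholder endpoint
TARGET_TABLE = "ingested_records"            # placeholder table name

def fetch_page(page, max_retries=5):
    # Back off exponentially when the API rate-limits us (HTTP 429).
    for attempt in range(max_retries):
        resp = requests.get(API_URL, params={"page": page})
        if resp.status_code == 429:
            time.sleep(2 ** attempt)
            continue
        resp.raise_for_status()
        return resp.json()  # assumed to be a list of flat dicts
    raise RuntimeError(f"Still rate-limited after {max_retries} tries on page {page}")

page = 0
while True:
    rows = fetch_page(page)
    if not rows:
        break
    # Append per batch rather than holding 7M rows in memory.
    spark.createDataFrame(rows).write.format("delta") \
        .mode("append").saveAsTable(TARGET_TABLE)
    page += 1
```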
How do you manage your study time?
Hello, I was just wondering how people manage their study time between work and family. Do you study before/after work? Maybe you study in the evening till quite late, after putting the kids to bed? Do you stay late at work and study after hours before going home? How do you deal with the frustration of not getting in any study for a few days because of life commitments? Do you have an arrangement with your family that study time is study time? I'd be really interested in how any of you shape or organise your busy lives around study time.
8
14
New comment Feb 12
4 likes • Feb 9
Hi @Paul Williams. This seems more like a productivity / self-development question, but I'm here for it! Jim Rohn is my favorite motivational speaker when it comes to this stuff, with a lot of good content on time management and prioritization. Ali Abdaal and Dan Martell also have some really good content, with the latter emphasizing "energy management". However, my #1 recommendation would be a book called The 12 Week Year. It's all about connecting your larger life goals to your short-term goals, then to your daily habits, and introducing a sense of urgency. If you read up on the concept of the book and like it, let me know!
0 likes • Feb 11
@Paul Williams It sounds like you've got a few folks you could reach out to and create some kind of accountability group!
Gen2 DF issues, G1 != G2
Hi folks, having quite a few errors with G2 Dataflows. Even when using a .pqt and importing the exact G1 DF to a G2, it's failing substantially. I've been trying a bunch of things, like toggling Fast Copy / Enable Staging and writing to a LH or not; even in its most simple form it seems to fail. Granted, there is a bit going on in this DF: originally it was 4 merges in each of 2 queries and then a join. I've moved half to a notebook, but there still seem to be considerable issues and differences. Just wondering if others are experiencing the same issues? My final state will move all of the transform to a NB, but surely DFs should still perform like G1!

Similar issues:
https://community.fabric.microsoft.com/t5/Service/Dataflow-Gen2-Issue/m-p/3275868
https://community.fabric.microsoft.com/t5/Dataflow/Couldn-t-refresh-the-entity-because-of-an-issue-with-the-mashup/m-p/3506887
https://community.fabric.microsoft.com/t5/Dataflow/Gen2-Dataflow-Failed-to-insert-a-table/m-p/3372458

Additional questions: do Dataflows use Data Factory in the background? This would make sense if the schemas are not perfect; it would need the same treatment as delta schema rules etc. And why can't the errors be more helpful?!

Cheers
Ross
1
6
New comment Feb 11
2 likes • Feb 10
@Ross Garrett Not sure about your primary issue, but I do have some insight on your other questions. My understanding is that Dataflows are effectively a point-and-click way to generate Spark under the hood; Alex Powers, a Program Manager on Fabric, has said this to our user group several times. As for errors not being helpful... welcome to Microsoft? 😅 I feel like it's always been this way. "Arithmetic overflow..." Okay... what column??? "Truncation..." Okay... what column??? For some reason, Microsoft just doesn't do a good job of helping us debug in SQL or GUIs. This is one of the reasons I prefer PySpark: I feel I'm able to debug much faster because I get better error messages, though still not as much detail as I'd like.
0 likes • Feb 11
@Ross Garrett Out of curiosity, what do you find time-consuming about the notebooks? Do you have a standardized ETL notebook that you pass metadata into? Or are you customizing each individual ingestion?
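For context on the pattern that question alludes to: a metadata-driven notebook takes its source and destination as parameters from the pipeline, so one standardized notebook covers every ingestion instead of a copy per source. A minimal sketch, with all paths and names hypothetical:

```python
from pyspark.sql import SparkSession

# In a Fabric notebook these would live in a "parameters" cell so the
# pipeline's notebook activity can override them on each run.
source_path = "Files/raw/example.csv"   # hypothetical default
target_table = "staging_example"        # hypothetical default

spark = SparkSession.builder.getOrCreate()

df = spark.read.format("csv").option("header", "true").load(source_path)
df.write.format("delta").mode("overwrite").saveAsTable(target_table)
```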
Anthony Kain
Level 4 • 83 points to level up
@anthony-kain-6916
Lead Data Engineer with 5 heavy years of experience in TSQL and PySpark. Connect with me on LinkedIn: https://www.linkedin.com/in/tonykain/

Active 14d ago
Joined Sep 9, 2024
INTJ
St. Louis, MO