Hi everyone,
I'm sure many of you are experimenting with, or even implementing, RAG (Retrieval-Augmented Generation) systems. You've likely noticed how hard it is to compare different systems or to make definitive statements about their effectiveness.
I'm curious to know what evaluation methods you've tried and which ones you prefer.
I'm looking forward to hearing your recommendations!