Batch processing best practices
Right now I'm designing a pipeline/workflow using CrewAI flows, though the specific tech stack may not matter much here.
My workflow takes a list of elements as input (a list of JSON objects, for example), runs the pipeline on each element, and then returns the combined results.
My question is: what is the best practice for deciding where to split the inputs and gather the results? In my immediate example, do I build the flow to act on a single input and have my runner (FastAPI in this case) kick off a new instance of the crew for each input, then gather the results of all the flow instances into one response? Or do I put the looping mechanism inside the workflow itself (CrewAI flows here), meaning one outer flow kickoff that takes the whole list as input, processes it, and returns everything?
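For concreteness, here's a rough sketch of the first option (fan-out in the runner). It assumes a hypothetical `ItemFlow` class with a synchronous `kickoff()` method standing in for one CrewAI flow instance per element; the FastAPI endpoint spins up one instance per input and gathers the results with asyncio.

```python
import asyncio
from typing import Any

from fastapi import FastAPI

app = FastAPI()


class ItemFlow:
    """Hypothetical stand-in for a flow that processes a single element."""

    def __init__(self, item: dict[str, Any]) -> None:
        self.item = item

    def kickoff(self) -> dict[str, Any]:
        # Placeholder for the real per-item pipeline work.
        return {"input": self.item, "result": "processed"}


@app.post("/process")
async def process_batch(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Option 1: one flow instance per input, fanned out by the runner.
    # kickoff() is synchronous here, so each call runs on a worker thread;
    # swap in an async kickoff if the flow exposes one.
    tasks = [asyncio.to_thread(ItemFlow(item).kickoff) for item in items]
    return await asyncio.gather(*tasks)
```

The second option would instead pass the whole list to a single flow kickoff and loop inside the flow, which keeps the iteration (and any shared or expensive-to-create state) inside the workflow itself.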
I can see there being an "it depends" answer based on complicated state or expensive-to-create objects, but this feels like a pretty common use case, so there's probably some great common wisdom I should draw from.