Hi all,

I want to post a couple of sample projects that I have been using to evaluate multi-agent tools. The samples I usually come across are very simple, even the ones labeled advanced, so I have been using these projects to evaluate several combinations of tool + LLM + environment.

For the tools, I've used AutoGPT, AutoGen, ChatDev, MetaAI, and CrewAI. To be honest, none of them was able to complete either project, even with some intervention on my part. As for LLMs, I've used (when available) GPT-4o, Claude 3 Opus, Mixtral 8x22B, and Llama 3. For the environment, I've tried Windows x64 (with PowerShell 7), Ubuntu 22.04 (the real deal, with zsh), and WSL2 (running Ubuntu 22.04 with zsh).

Full disclosure: the best result I've gotten so far is with AutoGPT + GPT-4o + WSL2. But I think CrewAI is the tool that offers the most potential, so I suspect I am not setting up the agents, tools, and tasks correctly. I would love to get some help with this exercise.

Below I am posting the specification of two of the projects I am using. Both projects are for experimentation only. They are not production code. We wanted to test two types of scenarios:

1. A wide project with low coding complexity (shallow) that involves many interconnected parts (frontend, backend, tests, etc.). That project was named FacePuppy.
2. A narrow project with high coding complexity (deep) that involves a lot of computation and is performance-sensitive, in a language that is not very "popular". That project was named SoundSpectrumRT.

These are samples very close to real projects that could be given to our dev team to develop. Here are the specs that were used in all the tests with the different combinations of tool + LLM + environment:

```
# FacePuppy

## Scope

The FacePuppy web application aims to provide a platform for users to create profiles for their pets, specifically focusing on puppies.
Users will be able to register, create and manage the puppy's profile, upload the puppy's photos with an optional caption into a gallery, write blog posts about the puppy, search for the puppies of other users, view the blog posts of the puppies of other users, and, if the user is an administrator, access an admin dashboard. The web application must be mobile-responsive to cater to a wide range of devices.