I’m building an app with a custom front end, an AI agent built with Google ADK, and a FastAPI backend that connects everything. I want the agent to have persistent per-user memory, so I’m planning to use Memory Bank, the new managed memory feature in Vertex AI Agent Engine.
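For context, here is roughly how I’m wiring the pieces together today, a minimal sketch assuming recent ADK versions that ship `VertexAiMemoryBankService`; the project ID, Agent Engine ID, model name, and the `/chat` route are placeholders, not my real values:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.memory import VertexAiMemoryBankService

PROJECT_ID = "my-project"        # placeholder
LOCATION = "us-central1"         # placeholder
AGENT_ENGINE_ID = "1234567890"   # Agent Engine instance backing Memory Bank
APP_NAME = "my_app"

agent = Agent(
    name="assistant",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant with long-term user memory.",
)

# Memory Bank provides persistent, per-user memory across sessions.
memory_service = VertexAiMemoryBankService(
    project=PROJECT_ID,
    location=LOCATION,
    agent_engine_id=AGENT_ENGINE_ID,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=InMemorySessionService(),
    memory_service=memory_service,
)

app = FastAPI()

class ChatRequest(BaseModel):
    user_id: str
    session_id: str
    message: str

@app.post("/chat")
async def chat(req: ChatRequest):
    # Get or create the session before running the agent.
    session = await runner.session_service.get_session(
        app_name=APP_NAME, user_id=req.user_id, session_id=req.session_id
    )
    if session is None:
        await runner.session_service.create_session(
            app_name=APP_NAME, user_id=req.user_id, session_id=req.session_id
        )

    content = types.Content(role="user", parts=[types.Part(text=req.message)])
    reply = ""
    async for event in runner.run_async(
        user_id=req.user_id, session_id=req.session_id, new_message=content
    ):
        if event.is_final_response() and event.content and event.content.parts:
            reply = event.content.parts[0].text
    return {"reply": reply}
```

So the FastAPI service currently owns the ADK `Runner` and only uses Agent Engine for Memory Bank, which is what makes me unsure about where the agent itself should live.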
For deployment, I’m unsure about the best approach:
- Should I deploy the AI agent to Vertex AI Agent Engine and host the FastAPI backend separately (e.g., on Cloud Run)?
- Or should I package the agent and the FastAPI app together and deploy them as a single service (e.g., on Cloud Run)?
What would be the best practice or most efficient setup for this kind of use case?