I added a breaker / queuing action for all high-volume endpoints. Rather than going down from server overload (I'm really not joking when I say we process more volume than I ever thought we would), we queue actions.
So during high-volume periods or black swan volume events, actions now wait in the queue instead of overloading the endpoints and taking them down. I figured delays > downtime, but I'm open to feedback as we plan for the migration.
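For anyone who wants the rough shape of the idea, here is a minimal Python sketch. It is not the actual implementation: `QueueingBreaker`, `max_in_flight`, `submit`, and `finish` are hypothetical names I made up for illustration, and it assumes a simple cap on in-flight actions. Actions run right away while there is capacity, and once the cap is hit they are queued rather than rejected.

```python
from collections import deque
from typing import Callable, Deque


class QueueingBreaker:
    """Hypothetical sketch: run actions while under capacity, queue them once the cap is hit."""

    def __init__(self, max_in_flight: int) -> None:
        self.max_in_flight = max_in_flight  # assumed capacity limit, not a real config value
        self.in_flight = 0
        self.pending: Deque[Callable[[], None]] = deque()

    def submit(self, action: Callable[[], None]) -> str:
        # At capacity: delay the action instead of letting the endpoint fall over.
        if self.in_flight >= self.max_in_flight:
            self.pending.append(action)
            return "queued"
        self._start(action)
        return "started"

    def _start(self, action: Callable[[], None]) -> None:
        self.in_flight += 1
        action()

    def finish(self) -> None:
        """Call when a started action completes; frees a slot and starts the oldest queued action."""
        self.in_flight -= 1
        if self.pending:
            self._start(self.pending.popleft())


# Usage: with a cap of 2, the third and fourth actions are queued, not dropped.
handled = []
breaker = QueueingBreaker(max_in_flight=2)
statuses = [breaker.submit(lambda i=i: handled.append(i)) for i in range(4)]
# statuses is ["started", "started", "queued", "queued"]; handled is [0, 1]
breaker.finish()
# one slot freed, so the oldest queued action ran; handled is [0, 1, 2]
```

The tradeoff is the same one as above: a queued action is late, but nothing is lost and the endpoint stays up. A real version would also need a cap on queue length and some handling for actions that fail, which this sketch leaves out.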