• Oh I have nothing against microservices as a concept and they can be very maintainable and easy to operate, but that’s rarely the outcome when the people building the systems don’t quite know what they’re doing or have bad infra.

    Monoliths are perfectly fine for many use cases, but once you go over some thresholds they get hard to scale unless they’re very well designed. Lock-free systems scale fantastically because they don’t have to wait around for e.g. state writes, but those designs often mean ditching regular databases for the critical paths and having to use deeper magics like commutative data structures and gossiping, so your nodes can be as independent as possible and not have any bottlenecks. But even a mediocre monolith can get you pretty far if you’re not doing anything very fancy.
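To make "commutative data structures and gossiping" a bit more concrete, here is a minimal sketch of a grow-only counter (a G-Counter, one of the simplest CRDTs). The class and method names are made up for illustration, and the gossip exchange is simulated with direct `merge` calls rather than a real network.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per node, merged with element-wise max."""

    node_id: str
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A node only ever writes to its own slot, so it never waits on a
        # lock or on another node.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent, so
        # gossip messages can arrive in any order, or more than once.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


# Two nodes count independently, then exchange state (simulated gossip).
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
b.merge(a)  # a duplicate message changes nothing
```

After the exchange both nodes report the same total, and neither had to block on a shared write to get there.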

    Microservices give you more granular control “out of the box”, but that doesn’t come without a price, so you need much better tooling and a more experienced team. It’s still a bit of a minefield because reasoning about distributed systems is hard. They have huge benefits, but you really need to be on your game or you’ll end up in a world of pain 😀 I was the person who unfucked distributed systems at one company I worked at, and I was continuously surprised by how little thought many coders gave to e.g. making sure the service state stays in a “legal” state. Database atomicity guarantees were often either misused or not used at all. So if a service had to do multiple writes to complete some “transaction” (loosely) and it died during the write, or a database node died and only part of the writes went through to the master, you could suddenly be looking at some sort of spreading Byzantine horror where nothing makes sense anymore, because that partially completed group of writes has affected other systems. Extreme example, sure, but Byzantine faults where a corrupted state spreads and fucks your consensus are something you only see in a distributed context.
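A rough sketch of what "using the atomicity guarantees" looks like in the simplest case: several writes wrapped in one transaction, so a crash in the middle leaves nothing half-applied. It uses Python's built-in `sqlite3` with an in-memory database; the `accounts` table, the `transfer` helper and the `crash_midway` flag are invented for the example. It only covers writes to a single database, not writes that span several services.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount, crash_midway=False):
    # "with conn" commits if the block completes and rolls back if it raises,
    # so both updates land together or not at all.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if crash_midway:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# A failure between the writes leaves the data untouched.
try:
    transfer(conn, "alice", "bob", 30, crash_midway=True)
except RuntimeError:
    pass
after_crash = balances(conn)

# The same transfer without the failure applies both writes.
transfer(conn, "alice", "bob", 30)
after_ok = balances(conn)
```

Without the transaction, the simulated crash would leave alice 30 short with bob never credited, which is exactly the kind of "illegal" state that then leaks into other systems.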

    • Yeah so that’s one place I sort of sacrifice microservice purity. If you have a transaction that needs to update multiple domains then you need a non-microservice to handle that IMO. All of these rules are really good rules of thumb, but there will always be complexity that doesn’t fit into our perfect little boxes.

      The important thing is to document the hell out of the exceptions and do everything you can to keep them on the periphery of business logic. Fight like hell to minimize dependencies on them.

      But that’s true of any architecture.

      If it’s not clear, I pretty much agree with everything you say here.