Problem is that companies are using them for all scenarios. It’s often their entire tech stack now, with kubernetes.
It’s similar to the object oriented hype that came before it, where developers had to write all their programs in a way so they could be extended and prepared for any future changes.
Everything became complex and difficult to work with. And almost none of those programs were ever extended in any significant way where object oriented design made it easier. On the contrary, it made it far more difficult to understand the program since you had to know which method was called in which object due to polymorphism when you looked at the code. You had to jump around like crazy to see what code was actually running.
Now with kubernetes, it’s all about making the programs easier to scale and easier to develop for the developers, but it shifts the complexity to the infrastructure needed to support the networking requirements.
All these programs now need to talk over the network instead of simply communicating in the same process. And with that you have to think about failure scenarios, out of order communication, missing messages, separate databases and data storage for different services etc.
I don’t think people have a choice. If you join a company where they use kubernetes, you have to use that technology for everything. You can’t escape the complexity even if you just want to make a simple program. It still needs to run in kubernetes.
Yes but in practice, companies don’t want to replace their entire tech stacks, and specially if it’s a large company. It costs an enormous amount of money (because of the time and effort it takes) and means the entire company has to relearn how to work with that stack instead.
It’s not impossible and it can happen, but in my experience from working at probably 20 companies now, there is almost always a strong resistence to change.
People don’t even change their default search engine or browser most of the time.
If object oriented design is fundamentally about components sending messages to each other, then microservices are a different route to OO design. If people are bad at OO design, then they’re likely bad at designing microservices, as well. The two aren’t so separate.
All these programs now need to talk over the network instead of simply communicating in the same process.
This is where things go really wrong. Separating components over the network can be useful, but needs careful consideration. The end result can easily be noticeably slower than the original, and I’m surprised anybody thought otherwise.
By that chart, 1MB read from an SSD is only 4 times slower than 1MB read from RAM. Wouldn’t have to be an order of magnitude improvement to have an important affect there.
There is no way to make a network request faster than a function call.
Apologies in advance if this it too pedantic, but this isn’t necessarily true. If you’re talking about an operation call that takes ~seconds to run, then the network overhead is negligible. And if you need specialized hardware for it, then it definitely could be delegate it out to a separate machine over the network. Examples could include requiring a GPU, more RAM, or even a faster CPU if your main application is running on more power-efficient CPUs.
I’m not saying that this is true in every case - they are definitely niche cases. But I definitely wouldn’t say that network requests are never faster than local function calls.
Well put. And this is a generic pattern; for example, GPUs are only faster than CPUs if the cost of preparing the GPU and retrieving the result is faster than directly evaluating the algorithm on the CPU. This also applies to main memory! Anything outside of the CPU can incur a latency/throughput/scaling tradeoff.
I don’t disagree with there being tradeoffs in terms of speed, like function vs network requests. But eventually your whole monolith gets so fuckin damn big that everything else slows down.
The whole stack sits in a huge expensive VM, attached to maybe 3 or 4 large database instances, and dev changes take forever to merge in or back out.
Every time a dev wants to locally test their build, they type a command and have to wait for 15-30 minutes. Then troubleshoot any conflicts. Then run over 1000 unit tests. Then check that they didn’t break coverage requirements. Then make a PR. Which triggers the whole damn process all over again except it has to redownload the docker images, reinstall dependencies, rerun 1000+ unit tests, run 1000+ integration tests, rebuild the frontend, which has to happen before running end to end UI tests, pray nothing breaks, merge to main, do it ALL OVER AGAIN FOR THE STAGING ENVIRONMENT, QA has to plan for and execute hundreds of manual tests, and we’re not even at prod yet. The whole way begging for approvals from whoever gets impacted by anything from a one line code change to thousands.
When this process gets so large that any change takes hours to days, no matter how small the change is, then you’re fucked. Because unfucking this once it gets too big becomes such a monstrous effort that it’s equivalent to rebuilding the whole thing from scratch.
I’ve done this song and dance so many times. If you want your shit to be speedy on request, great, just expect literally everything else to drag down. When companies were still releasing software like once a quarter this made sense. It doesn’t anymore.
Think you’re understating it there. Network call takes milliseconds at best. Function call, if the CPU has correctly predicted the indirect branch, is basically free, but even if it hasn’t then you’re talking nanoseconds. It’s slower by millions of times.
Yeah it’s insane. But of course if scaling different parts of the application, I guess micro services are the way to do it. But otherwise one could scale the entire app by just putting more of the entire app on servers. No need for micro services. It just needs to be written to be able to listen to message queues and you can have any number of app instances.
In theory, it can be faster with parallelization. Of course, all the usual caveats about parallelization apply, and you’re most likely going to create a slower system if you don’t think it through.
On the contrary, it made it far more difficult to understand the program since you had to know which method was called in which object due to polymorphism when you looked at the code. You had to jump around like crazy to see what code was actually running.
I agree with this point, but polymorphism is often the better alternative.
Using switch statements for the same thing still have the problem that you need to jump around like crazy just to find where the variable was once set. It also tends to make the code more bloated.
Same with using function references, except this time it can be any function in the entire program.
The solution is to only use polymorphism when it’s absolutely needed. In my experience, those cases are actually quite rare. You don’t need to use it everywhere.
Yeah I agree. With experience you know where to use it and where it really shines, and when not to use it because it will just make everything harder to reason about.
But a lot of devs are not that experienced when they make these decisions. All of us learn from mistakes, and those mistakes stay in the code base. :)
The problem is that they become a buzz word for at scale companies that need them because they have huge complex architects, but then non at scale companies blindly follow the hype when they were created out of necessity for giant tech stacks that are a totally different use case.
They add a lot of overhead and require extra tooling to stay up to date in a maintainable way. At a certain scale that overhead becomes worth it, but it takes a long time to reach that scale. Lots of new companies will debate which architecture to adopt to start a project, but if you’re starting a brand new project it’s probably too early to benefit from the extra overhead of micro architectures.
Of course there are pros and cons to everything, don’t rely on memes for making architecture decisions.
I guess I’m not sure how others build with micro services, but using AWS SAM is stupid simple, and the only maintenance we’ve ever had to do is update a Node version. 🤷
But why? Microservices do have some good advantages in some scenarios
Problem is that companies are using them for all scenarios. It’s often their entire tech stack now, with kubernetes.
It’s similar to the object oriented hype that came before it, where developers had to write all their programs in a way so they could be extended and prepared for any future changes.
Everything became complex and difficult to work with. And almost none of those programs were ever extended in any significant way where object oriented design made it easier. On the contrary, it made it far more difficult to understand the program since you had to know which method was called in which object due to polymorphism when you looked at the code. You had to jump around like crazy to see what code was actually running.
Now with kubernetes, it’s all about making the programs easier to scale and easier to develop for the developers, but it shifts the complexity to the infrastructure needed to support the networking requirements.
All these programs now need to talk over the network instead of simply communicating in the same process. And with that you have to think about failure scenarios, out of order communication, missing messages, separate databases and data storage for different services etc.
You can have the best tool in the world and still find people just hitting their own face with it.
I don’t think people have a choice. If you join a company where they use kubernetes, you have to use that technology for everything. You can’t escape the complexity even if you just want to make a simple program. It still needs to run in kubernetes.
Depends on who you think the people are.
CTOs, technical team leads and such can make those decisions. And devs can also suggest migrating to simpler solutions.
If a tech giant like Amazon can do it like they did with Prime Video, I don’t think it’s impossible other companies can do so too.
Yes but in practice, companies don’t want to replace their entire tech stacks, and specially if it’s a large company. It costs an enormous amount of money (because of the time and effort it takes) and means the entire company has to relearn how to work with that stack instead.
It’s not impossible and it can happen, but in my experience from working at probably 20 companies now, there is almost always a strong resistence to change.
People don’t even change their default search engine or browser most of the time.
“hello OPS-team. Here is my simple program. Have fun running it on your kubernetes”
If object oriented design is fundamentally about components sending messages to each other, then microservices are a different route to OO design. If people are bad at OO design, then they’re likely bad at designing microservices, as well. The two aren’t so separate.
This is where things go really wrong. Separating components over the network can be useful, but needs careful consideration. The end result can easily be noticeably slower than the original, and I’m surprised anybody thought otherwise.
It’s absolutely slower. There is no way to make a network request faster than a function call. It’s slower by probably thousands of times.
I have to look it up every time, but this is always worth reading once a year to remind yourself:
https://gist.github.com/hellerbarde/2843375
Yeah I’ve seen it before. It’s a very good reminder for everyone to keep in mind isn’t it. :)
Since this is from 12 years ago, have any of these numbers changed much? Especially the SSD numbers.
Not in any order of magnitude
By that chart, 1MB read from an SSD is only 4 times slower than 1MB read from RAM. Wouldn’t have to be an order of magnitude improvement to have an important affect there.
Apologies in advance if this it too pedantic, but this isn’t necessarily true. If you’re talking about an operation call that takes ~seconds to run, then the network overhead is negligible. And if you need specialized hardware for it, then it definitely could be delegate it out to a separate machine over the network. Examples could include requiring a GPU, more RAM, or even a faster CPU if your main application is running on more power-efficient CPUs.
I’m not saying that this is true in every case - they are definitely niche cases. But I definitely wouldn’t say that network requests are never faster than local function calls.
Well put. And this is a generic pattern; for example, GPUs are only faster than CPUs if the cost of preparing the GPU and retrieving the result is faster than directly evaluating the algorithm on the CPU. This also applies to main memory! Anything outside of the CPU can incur a latency/throughput/scaling tradeoff.
I don’t disagree with there being tradeoffs in terms of speed, like function vs network requests. But eventually your whole monolith gets so fuckin damn big that everything else slows down.
The whole stack sits in a huge expensive VM, attached to maybe 3 or 4 large database instances, and dev changes take forever to merge in or back out.
Every time a dev wants to locally test their build, they type a command and have to wait for 15-30 minutes. Then troubleshoot any conflicts. Then run over 1000 unit tests. Then check that they didn’t break coverage requirements. Then make a PR. Which triggers the whole damn process all over again except it has to redownload the docker images, reinstall dependencies, rerun 1000+ unit tests, run 1000+ integration tests, rebuild the frontend, which has to happen before running end to end UI tests, pray nothing breaks, merge to main, do it ALL OVER AGAIN FOR THE STAGING ENVIRONMENT, QA has to plan for and execute hundreds of manual tests, and we’re not even at prod yet. The whole way begging for approvals from whoever gets impacted by anything from a one line code change to thousands.
When this process gets so large that any change takes hours to days, no matter how small the change is, then you’re fucked. Because unfucking this once it gets too big becomes such a monstrous effort that it’s equivalent to rebuilding the whole thing from scratch.
I’ve done this song and dance so many times. If you want your shit to be speedy on request, great, just expect literally everything else to drag down. When companies were still releasing software like once a quarter this made sense. It doesn’t anymore.
I agree with you, and that is a hellish environment to work in.
There must be a better middle ground for all of this.
Think you’re understating it there. Network call takes milliseconds at best. Function call, if the CPU has correctly predicted the indirect branch, is basically free, but even if it hasn’t then you’re talking nanoseconds. It’s slower by millions of times.
Yeah it’s insane. But of course if scaling different parts of the application, I guess micro services are the way to do it. But otherwise one could scale the entire app by just putting more of the entire app on servers. No need for micro services. It just needs to be written to be able to listen to message queues and you can have any number of app instances.
In theory, it can be faster with parallelization. Of course, all the usual caveats about parallelization apply, and you’re most likely going to create a slower system if you don’t think it through.
I agree with this point, but polymorphism is often the better alternative.
Using switch statements for the same thing still have the problem that you need to jump around like crazy just to find where the variable was once set. It also tends to make the code more bloated.
Same with using function references, except this time it can be any function in the entire program.
The solution is to only use polymorphism when it’s absolutely needed. In my experience, those cases are actually quite rare. You don’t need to use it everywhere.
Yeah I agree. With experience you know where to use it and where it really shines, and when not to use it because it will just make everything harder to reason about.
But a lot of devs are not that experienced when they make these decisions. All of us learn from mistakes, and those mistakes stay in the code base. :)
but they have a lot more disadvantages for most scenarios (if you’re not a faang scale company, you probably don’t need them)
The problem is that they become a buzz word for at scale companies that need them because they have huge complex architects, but then non at scale companies blindly follow the hype when they were created out of necessity for giant tech stacks that are a totally different use case.
They add a lot of overhead and require extra tooling to stay up to date in a maintainable way. At a certain scale that overhead becomes worth it, but it takes a long time to reach that scale. Lots of new companies will debate which architecture to adopt to start a project, but if you’re starting a brand new project it’s probably too early to benefit from the extra overhead of micro architectures.
Of course there are pros and cons to everything, don’t rely on memes for making architecture decisions.
I guess I’m not sure how others build with micro services, but using AWS SAM is stupid simple, and the only maintenance we’ve ever had to do is update a Node version. 🤷
SAM is serverless but that doesn’t necessarily mean micro services.