Ask HN: Micro services at macro-scale?
14 points by themartorana on Nov 30, 2014 | 7 comments
I am the main developer of a relatively high-throughput API - upwards of 1000 req/sec at peak times. The core app is built in Python, mostly because it started as an experiment and was never meant to handle so much traffic.

We use AWS autoscaling, and build ephemeral front-end servers to handle wildly fluctuating traffic levels throughout the day. We can be as low as 3 servers, and as high as 13 (front-end only, does not include DB/Redis).

I've been tearing off pieces of what has become a large-scale app into "micro-services" (for lack of a better name). Most are written in Go, as I both enjoy it and find it fast to develop in. But when looking at the future, I'm trying to decide between Go and Elixir for a port of the largest part of the app. I know Elixir's OTP layer lets servers almost work as a hive-mind, whereas Go would be better suited to services that could have several instances launched in parallel that are not necessarily aware of each other. Our current practice of having all front-end servers load-balance requests and have no inter-communication (so scaling down doesn't cause any real problems) seems to work well.

My question is mostly around architecture, and comparing OTP-based smart clustering with "dumb scaling" of many instances of the same small services that don't inter-communicate. I don't want to get into a language debate, but rather am looking for thoughts about building services to scale to many thousands of requests per second, and the different approaches to doing so.
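To make the "dumb scaling" side of that comparison concrete, here is a minimal sketch, assuming a stateless handler and a toy round-robin balancer. The names `handle` and `RoundRobinBalancer` are hypothetical and only stand in for the real front-end servers and the AWS load balancer; the point is that because no instance holds state, resizing the pool does not change any response.

```python
from itertools import cycle


def handle(request: dict) -> dict:
    """Stateless handler: the response depends only on the request."""
    return {"user": request["user"], "result": request["value"] * 2}


class RoundRobinBalancer:
    """Toy stand-in for a load balancer in front of identical instances."""

    def __init__(self, instance_count: int) -> None:
        self.resize(instance_count)

    def resize(self, instance_count: int) -> None:
        # Every "instance" is the same stateless function, so scaling up
        # or down loses nothing and needs no coordination between instances.
        self._instances = [handle] * instance_count
        self._next = cycle(range(instance_count))

    def dispatch(self, request: dict) -> tuple:
        # Pick the next instance in rotation and return (index, response).
        index = next(self._next)
        return index, self._instances[index](request)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(3)
    print(balancer.dispatch({"user": "a", "value": 21}))
    balancer.resize(13)  # scale out; responses are unaffected
    print(balancer.dispatch({"user": "a", "value": 21}))
```

An OTP-style cluster would differ exactly here: instances would know about each other, so a resize would involve membership changes rather than just a longer or shorter list.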

Thoughts? (Edit: clarity)



> many instances of the same small services that don't really need to be aware of each other

Then your problem is embarrassingly parallel and your solution does not need to be "micro services" to scale.

Sorry if I missed something, I skimmed your text.


Again, it's not about micro-services the buzzword. It's easier to have different services that are logically separable - say, push notification services - pulled out of the main app. Testing, maintenance, etc., are much simpler.

Of course certain services have to be aware of others, but they don't necessarily need to be aware of other copies of themselves behind the same LB.

But yes, currently, everything is just parallel instances.


Elixir is usually the best bet for high throughput with low memory requirements.

But I think you should read some presentations about microservices before you decide to code a giant new architecture. There are other solutions already developed. These two articles are a useful, well-written guide to the microservices mindset: http://www.pwc.com/us/en/technology-forecast/2014/cloud-comp... and http://www.tigerteam.dk/2014/micro-services-its-not-only-the...

Xing used Elixir/Erlang to build a service based on RabbitMQ, Riak, and Redis that seemed like a good design for scaling a large system with minimal memory and latency. You can watch that video here: https://www.youtube.com/watch?v=38yKu5HR-tM

For scaling microservices you generally want to use HTTP and queueing services to distribute messages. There are many options in this respect; Redis and RabbitMQ are some of the better ones. Something still under development that could be of interest is SyncFree: https://github.com/SyncFree . The reason SyncFree is a good fit for a distributed service is that it will not get bulky from updates based on timed services. For a RabbitMQ exchange that stores messages in Riak: https://github.com/jbrisbin/riak-exchange
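As a rough illustration of the queueing idea, here is a minimal sketch, assuming an in-process `queue.Queue` as a stand-in for a real broker such as Redis or RabbitMQ. The function `run_workers` and the squaring "job" are hypothetical; the pattern shown is competing consumers, where identical workers pull from one queue without knowing about each other.

```python
import queue
import threading


def run_workers(jobs, worker_count):
    """Distribute jobs to identical workers through a shared queue."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            if job is None:  # sentinel: no more work for this worker
                break
            with lock:
                results.append(job * job)  # placeholder for real processing

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_workers(range(10), worker_count=3))
```

Changing `worker_count` changes throughput but not the result, which is the property that makes this style easy to autoscale.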

An alternative to Riak is HibariDB, but it may take a bit more work to get it functional with other stuff: https://github.com/hibari/

With HTTP APIs you can just curl into your APIs. Or load balance and proxy them with HAProxy, Nginx, Varnish etc. Amazon has Route 53 for managing it by DNS. There is also zonify: https://github.com/airbnb/zonify

Microservices are also usually built on some form of actor framework. Akka is what Gilt used. You can watch a video on Gilt's microservices: http://tech.gilt.com/post/65070094551/gilts-kevin-scaldeferr... .

Serialization is just as important when scaling up small microservices to handle larger loads. For example, when Gilt wanted to scale they used Jackson JSON serialization. Jackson (www.github.com/fastxml/) scales fairly well because it is as fast as Protocol Buffers and supports hierarchies. They also used sbt runtime packaging for automatically re-creating and re-compiling their components. I think they built components using ClusterMate (https://github.com/cowtowncoder/ClusterMate).

You might be interested in a startup that hooks GitHub gists up as microservices, at hook.io .

As for updates, microservice systems often use pub/sub architectures. I think Gilt used one as well. You can infer a lot of what they did from their released GitHub code at www.github.com/gilt/ .
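For anyone unfamiliar with the pattern, here is a minimal in-memory sketch of pub/sub, assuming a hypothetical `Broker` class. It is not Gilt's implementation or any real broker's API; it only shows that publishers send to a topic and never need to know which services are listening.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback) -> None:
        # Register a callback to run for every message on this topic.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message) -> int:
        # Deliver to every subscriber of the topic; return how many received it.
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("user.updated", lambda msg: print("cache saw", msg))
    broker.subscribe("user.updated", lambda msg: print("search saw", msg))
    broker.publish("user.updated", {"id": 7})
```

In a real deployment the broker would be an external service (Redis and RabbitMQ both offer pub/sub), so subscribers could be separate processes on separate machines.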

Other resources to read about microservices: https://news.ycombinator.com/item?id=7994540

Disclaimer: I do a lot of reading, not a lot of building. Though sometimes a lot of reading means reading research papers, and whitepapers from companies to get an idea of what works or not.


Awesome links - thank you!


No problem. I also post stuff so I can refer to stuff later as well. : ) If you have any other questions feel free to ask.

In addition, I thought I'd mention there are a few more good resources listed on this GitHub account, based on Java actors/microservices: https://github.com/puniverse


Unless I missed something, there is no mention of why you want to rewrite/alter your system.

- Is it for cost?

- Reliability?

- Scratch a tech itch?

- Not flexible enough for future needs?

Obviously, that will affect a lot of decisions.


"Not flexible enough for future needs" is a large part of it. There are some slow-moving parts that need to be better executed, and some migration to better data stores for certain types of data, so touching a lot of the code is going to be happening anyway.

Concurrency as a first-class feature would allow me to actually simplify things, which is a bonus, as I am loath to use the threading or multiprocessing libraries in Python. I could rewrite parts in, say, Twisted, but that's still rewriting.

This is not scratch-a-tech-itch, I assure you. I have plenty of new work to be done that will give me that. It's about taking an academic exercise and battle-hardening it and making appropriate design decisions for scale even above where we currently are.

(Squeezing more req/s out of each AWS instance isn't going to hurt our bottom line either, but that's secondary.)



