Winter hibernation

Happy Sunday!

You haven't heard from me in quite a while, so a quick reminder: This is Mastering Heroku, my personal email newsletter to help folks like you be awesome at running web apps on Heroku.

I took an unplanned hibernation over the winter, but I'd love to start sharing with you again. You can expect personal anecdotes, useful links, and sizzling hot tips.

Experienced people say a consistent schedule is critical for a newsletter, but that's not going to happen with me. You’ll only hear from me if I have something worthy of sharing, which I hope to be once or twice a month. Hopefully you can handle the unpredictability. ;-)

That's it for now. Just wanted to get us both excited for what's to come.

Not your cup of tea? Please unsubscribe below with no hard feelings.

If you're sticking around, what would you like to hear more about?

4 Ways to Scale on Heroku

Mastering Heroku — Issue #6

A few weeks ago I gave a presentation at the Columbus Ruby Brigade about my approach and mental model for scaling Heroku apps. My attempt to record the talk failed, so I rerecorded it as a screencast just for you. ❤️

Here’s what I cover in the video:

  • Scaling option #1: Horizontal—add more dynos. If you see increased request queue times in Scout or New Relic, you need to make your app faster or add more dynos. As soon as you’re using more than one dyno, automate it instead of playing a guessing game.

  • Scaling option #2: Vertical—increase dyno size. Because of Heroku’s random routing, you need concurrency within a single dyno. This means running more web processes, which consume more memory and may require a larger dyno type. Aim for at least three web processes.

  • Scaling option #3: Process types. You’re not limited to just “web” and “worker” process types in your Procfile. Consider multiple worker process types that pull from different job queues. These can be scaled independently for more control and flexibility.

  • Scaling option #4: App instances. Heroku Pipelines make it relatively easy to deploy a single codebase to multiple production apps. This can be helpful to isolate your main app traffic from traffic to your API or admin endpoints, for example. Heroku will route traffic to the correct app based on the request subdomain and the custom domains configured for each app.
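As a sketch of option #3, a Procfile with multiple worker process types might look like the following (the queue names and commands are hypothetical—adapt them to your own job queues):

```
web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq -q default
worker_reports: bundle exec sidekiq -q reports
```

Each process type can then be scaled independently, e.g. `heroku ps:scale worker=2 worker_reports=1`.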

My general advice:

  • Start simple.

  • Configure multiple web processes per dyno, increasing dyno size if needed.

  • If you need more than one web dyno, autoscale it.

  • If certain background workers are resource hogs, they may require a larger dyno size. Split into their own process types with dedicated job queues so they can be scaled independently.

  • If you have dedicated sections of your web app such as an API or admin section, split them into their own subdomain so you can divert traffic to a separate app instance. Keep them on a single app instance until the additional complexity is absolutely necessary.
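To make the "multiple web processes per dyno" advice concrete, here’s a minimal `config/puma.rb` sketch. The defaults shown (3 workers, 5 threads) are illustrative assumptions, not a universal recommendation—tune them to your app’s memory footprint and dyno size:

```ruby
# config/puma.rb — minimal sketch; numbers are illustrative defaults
workers Integer(ENV.fetch("WEB_CONCURRENCY", 3))   # at least 3 web processes per dyno
max_threads = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads max_threads, max_threads                   # threads per process
preload_app!                                       # load app before forking workers
port ENV.fetch("PORT", 3000)
```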

Did you find the video helpful? Anything you’d add or change? Let me know!

Happy scaling!
— Adam (@adamlogic)

Mastering Heroku — Issue #5

A reader asked me for some help this week:

Hi Adam, saw your post and decided to reach out. I used AutoScale in the past but am now on performance dyno. We keep running into Rack::Timeout::RequestTimeoutException errors. Wondering if you may have any suggestion.

I feel this pain. Request timeouts are the worst. If you're not a Rubyist or if you're just unfamiliar with the error above, Rack::Timeout is a Ruby library for timing out requests in Rack applications such as Rails.

Why would you want to time out a request? Because if you don't, Heroku will:

Occasionally a web request may hang or take an excessive amount of time to process by your application. When this happens the router will terminate the request if it takes longer than 30 seconds to complete.

We’ve all seen these infamous H12 errors appear in our logs and in our Heroku metrics panel.

It's unfortunate when a user sees an error page, but what's worse is that your app has no idea when Heroku times out a request. Your app will continue processing to completion, whether it takes an additional five seconds or an additional five minutes.

While the router has returned a response to the client, your application will not know that the request it is processing has reached a time-out, and your application will continue to work on the request.

Libraries like Rack::Timeout allow you to halt processing of a long-running request before Heroku times out. This gives you more control over the error the user sees and prevents a hung request from bringing down your app server.

It's a Band-Aid, though, and this particular Band-Aid often introduces more problems than it solves.

Raising mid-flight in stateful applications is inherently unsafe. A request can be aborted at any moment in the code flow, and the application can be left in an inconsistent state.

This is straight from the Rack::Timeout docs, which do an excellent job of warning you about the risks and tradeoffs. I’ve personally seen all kinds of strange and frustrating behavior with Rack::Timeout. As far as I’m concerned, it’s just not worth it.

So what’s a safe alternative that prevents hung requests from bringing down your app? As usual, Nate Berkopec sums it up well:

Nate references the Ultimate Guide to Ruby Timeouts, which, if you’re a Rubyist, you should bookmark right now. By setting library-specific timeouts on database queries and network requests, you can gracefully handle unpreventable slowdowns—roll back transactions, show a meaningful error message, whatever you need to do.
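Here’s a minimal sketch of what a library-specific timeout looks like in practice. The hostname and values are illustrative; the point is that the library raises its own catchable error at a well-defined I/O boundary instead of aborting your code mid-flight:

```ruby
require "net/http"

# Bound how long we wait to connect and to read a response.
# If either limit is exceeded, Net::HTTP raises Net::OpenTimeout
# or Net::ReadTimeout, which you can rescue and handle gracefully.
http = Net::HTTP.new("api.example.com", 443)
http.use_ssl = true
http.open_timeout = 2 # seconds to establish the connection
http.read_timeout = 5 # seconds to wait for response data
```

For Postgres in a Rails app, `config/database.yml` offers similar knobs, e.g. `connect_timeout` and a `statement_timeout` (in milliseconds) under `variables`, which tells Postgres itself to kill queries that run too long.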

This isn’t a perfect solution, of course. You could still have a single request with 1,000 database queries, none of which individually time out, but collectively are way over Heroku’s 30-second limit.

In these cases, I still don’t think it’s worth reaching for a “big, big hammer”. Instead, set up alerting for Heroku’s H12 errors. You can use Heroku’s threshold alerting for this, or set up alerting in your log management tool (I use both). Heroku add-ons like Logentries will alert you on H12 errors out of the box.

With these alerts in place, you can investigate your timeouts to fix the root cause instead of relying on a Band-Aid. The H12 error will tell you the exact URL that timed out, so use that along with an APM tool like Scout to determine what went wrong. Chances are, you either have an N+1 query or you’ve omitted a timeout on some I/O.
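For reference, an H12 in your router logs looks roughly like this (the path, host, and dyno here are made up, but the `code=H12` and `service=30000ms`-ish fields are what you’ll key your alerts on):

```
heroku[router]: at=error code=H12 desc="Request timeout" method=GET path="/reports/monthly" host=myapp.herokuapp.com dyno=web.2 connect=1ms service=30001ms status=503 bytes=0
```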

To recap:

  • Set library-specific timeouts for all I/O (database, network, etc.)

  • Avoid solutions that arbitrarily halt application processing in the middle of a request. It’ll lead to unpredictable and hard-to-debug behavior.

  • Monitor your H12 errors.

  • Use an APM tool to fix those slow endpoints.

And of course, use autoscaling to ensure a few slow requests don’t slow down your entire app. 😁

Happy scaling!

Mastering Heroku — Issue #4

I recently had a consulting call with Jesse Hanley (creator of Bento) that got me thinking about an approach many of us are guilty of when our apps are struggling: We just turn the knobs up to 11.

Jesse Hanley (@jessethanley)

If you're using @heroku and have questions about scale, you should hit up @adamlogic.

Just had a really insightful call with him that took me from "I have no idea what I'm doing, app needs more Performance-L dynos lol" to confident to experiment finding a profitable balance.

September 17, 2018

On Heroku, "turning it to 11" translates to blindly adding dynos and increasing dyno size. This approach can work, but it comes with major strings attached:

  • It gets expensive fast.

  • You risk overwhelming your downstream dependencies, such as hitting a connection limit on Postgres (I touched on this last week).

  • You're masking or ignoring underlying root causes of performance issues.

I subscribe to this crazy idea that Heroku is not expensive. When optimally configured, it can be a steal.

Adam McCrea (@adamlogic)

Yesterday @railsautoscale handled 1.64M requests. I pay ~$100/mo to host it on @heroku. It *can* get expensive, but that's avoidable. Small teams don't need a dev-ops engineer, they just need guidance.

August 14, 2018

So how do you optimize your Heroku setup? Here are some tips that got Jesse on the right track:

  • If absolutely consistent performance is a hard requirement, you need performance dynos. The shared architecture of standard dynos means your performance will fluctuate due to factors completely out of your control.

  • Remove the guesswork from choosing the number of dynos. Use Heroku's own autoscaling, HireFire, or Rails Autoscale to automatically scale up and down as needed. Autoscaling should be a hard requirement for a production app.

  • Once you’re autoscaling, there's little reason to use a larger dyno type than necessary. A smaller dyno lets you autoscale at a finer granularity and save a whole lot more cash. This is another reason jumping straight to Perf-L dynos is usually a bad idea.

  • On the other hand, you do need a large enough dyno to run multiple web processes. Heroku's random routing architecture means you’ll stabilize your performance by adding web processes so your app server can intelligently route requests within a dyno. A good rule of thumb is to choose a dyno type with enough memory to run at least three web processes.

  • Do the math to ensure you don’t exceed database connection limits: [connection pool] × [processes] × [max dynos]. Do this calculation for each process type (web, worker, release, etc.).
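With hypothetical numbers plugged in, that math looks like this (all values here are illustrative—substitute your own pool sizes, process counts, and autoscale maximums):

```ruby
# Connections per process type = pool per process * processes per dyno * max dynos
pool_per_process = 5                  # AR pool size, typically == RAILS_MAX_THREADS
web    = pool_per_process * 3 * 4     # 3 Puma workers/dyno, up to 4 web dynos  => 60
worker = 10 * 1 * 2                   # Sidekiq concurrency 10, 1 process, 2 dynos => 20
total  = web + worker                 # => 80; must stay under your Postgres plan limit
```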

With that basic setup in place, you can focus your efforts on the app itself. Use an APM like New Relic or Scout to measure and diagnose potential bottlenecks. Any improvements there will result in faster response times, less scaling up, and lower Heroku bills.

Happy scaling!

Mastering Heroku — Issue #3

I’ve been thinking a lot about these tweets from Nate Berkopec (author of The Complete Guide to Rails Performance):

Nate Berkopec (@nateberkopec)

Hi! Do you have a Rails application? Open up your code *right now* and double check to make sure that your production database pool (config/database.yml), Sidekiq concurrency, and Puma thread count are all the same number (preferably using an env variable like RAILS_MAX_THREADS).

September 7, 2018

Nate Berkopec (@nateberkopec)

This is true if on Heroku or not: 5 for Puma/web procs, 10-25 for Sidekiq. Increase sidekiq number if your jobs do lots of HTTP calls. If mostly DB, shoot lower (10 is fine). To set this number on a per-dyno basis, add it your Procfile (e.g. "web: RAILS_MAX_THREADS=5 rails s")

September 7, 2018

It’s good, simple advice, and it’s absolutely an area where many apps get into trouble, on Heroku or otherwise. If your connection pools are too small, your application will unnecessarily waste time waiting for an available connection, perhaps encountering timeout errors after waiting too long. If your connection pools are too large, you might run up against a connection limit in your datastore. It’s a delicate balance.
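The standard Rails way to keep these numbers in sync is to drive the ActiveRecord pool from the same env var that sets your Puma thread count—a sketch of the relevant `config/database.yml` fragment:

```yaml
# config/database.yml — one env var drives both Puma threads and the AR pool
production:
  adapter: postgresql
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
```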

Multiply those considerations by your number of datastores (I use Postgres and Redis) and the number of different processes you’re running (I have 5 defined in my Procfile), and suddenly you’re juggling a lot of information.

Nate’s advice above is a great starting point, but you’ll eventually need a solid understanding of why it’s good advice and how changing any of these settings will impact your application as a whole. I find it hard to wrap my head around without some kind of visualization. Something like this tool from Manuel van Rijn, but more visual and not specific to Sidekiq and Redis. Here’s a rough sketch of what I’m imagining:

This is too app-specific to be a Heroku feature, or even an add-on. It’s just a tool for you to plug in your own numbers and visualize the result. Here are the kinds of questions I want it to answer:

  • If I need to scale my app from 3 to 5 web dynos, how many extra datastore connections would I create?

  • What if I increase the number of processes per dyno (Puma workers, for example)?

  • How can I maximize the usage of my limited connections on an entry-level Postgres/Redis tier?
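The first question above reduces to simple multiplication. As a back-of-the-envelope sketch (all numbers hypothetical):

```ruby
pool        = 5      # connections per process (RAILS_MAX_THREADS)
processes   = 3      # Puma workers per dyno
extra_dynos = 5 - 3  # scaling from 3 to 5 web dynos
extra_connections = pool * processes * extra_dynos  # => 30 new Postgres connections
```

The tool would just do this arithmetic across every process type and datastore, and show you the totals against your plan limits.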

It could also highlight potential issues such as when a connection pool is smaller than the number of threads or when the total datastore connections exceed the current plan limit. Taking it a step further, I’d love to provide guidance on how to implement these settings, but that’s very framework-specific. I’m not sure… it’s a very rough idea right now that I thought would be fun to share.

Would you use something like this? Is it a dumb idea? Reply and let me know what you think!
