I have an app running Rails 4.1.6 and Ruby 2.1.3 on Heroku. After it has been running for a while, I see a lot of swap space in use, and I don’t know why. I also get Error R14 (Memory quota exceeded).
Is anyone familiar with this situation, or does anyone know how to free up swap space?
We recently saw a similar issue. We did not see the large swap usage (looks like about 30% of total), but we did see memory consumption increase continually until we hit the maximum allowed per dyno, leading to a forced restart. We are running Unicorn with Rails 4.1.5 and Ruby 2.1.3.
We experimented with some different Ruby GC settings, and while the changes did adjust the way GC was happening, we still had R14 errors, so we implemented unicorn-worker-killer and our R14s went away.
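For anyone who wants to try the same thing, here is a minimal sketch of a config.ru using the unicorn-worker-killer gem's two middlewares. The request counts and memory thresholds are illustrative assumptions, not the values used by the poster above, and would need tuning to your dyno size and worker count.

```ruby
# config.ru (sketch; assumes the unicorn-worker-killer gem is in the Gemfile)
require 'unicorn/worker_killer'

# Restart a worker after it has served a random number of requests
# between 3072 and 4096 (assumed values, chosen for illustration).
use Unicorn::WorkerKiller::MaxRequests, 3072, 4096

# Restart a worker once its memory use passes a random limit
# between 192MB and 256MB (assumed values, chosen for illustration).
use Unicorn::WorkerKiller::Oom, (192 * (1024**2)), (256 * (1024**2))

# These lines must come after the middlewares above.
require ::File.expand_path('../config/environment', __FILE__)
run Rails.application
```

The randomized ranges are there so that workers do not all restart at the same moment. Note that this works around the memory growth by recycling workers; it does not fix whatever is causing it.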
We have had similar issues on our apps on every patch version of Ruby 2.1. I haven’t tried unicorn-worker-killer yet, but I have experienced R14 errors that persist until we restart.
@jferris did a visual read of the codebase looking for global state in our
application code:
global variables
class variables
singletons
per-process instance state
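To make the list above concrete, here is a small sketch of what each kind of state can look like when it grows with every request. All class and method names are hypothetical and made up for illustration; none of them come from the application being discussed.

```ruby
require "singleton"

# Global variable: lives for the whole life of the process.
$seen_paths = []

# Class variable: shared by the class and never released.
class RequestLog
  @@entries = []

  def self.record(path)
    @@entries << path
  end

  def self.size
    @@entries.size
  end
end

# Singleton: one instance per process, so its hash only ever grows.
class PathCache
  include Singleton

  def initialize
    @store = {}
  end

  def put(key, value)
    @store[key] = value
  end

  def size
    @store.size
  end
end

# Per-process instance state: an instance variable on the class object itself.
class Lookup
  @memo = {}

  class << self
    def fetch(key)
      @memo[key] ||= key.to_s * 10
    end

    def size
      @memo.size
    end
  end
end

# Hypothetical request handler that touches all four.
def handle_request(path)
  $seen_paths << path
  RequestLog.record(path)
  PathCache.instance.put(path, Time.now)
  Lookup.fetch(path)
end

1_000.times { |i| handle_request("/items/#{i}") }
puts "retained entries: #{$seen_paths.size}, #{RequestLog.size}, #{PathCache.instance.size}, #{Lookup.size}"
```

Each structure holds one entry per distinct request and nothing ever removes them, which is the pattern to look for when reading through a codebase.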
Not seeing any examples of those, he did a binary search through the gems,
commenting out one gem at a time, restarting Foreman, and changing the PID on
the watch command.
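The exact watch command is not shown in this thread. As a rough stand-in, this sketch reads the resident set size (RSS) of a process from Ruby, so you can sample it for each experiment. The helper name rss_kb is made up for illustration; it assumes either a Linux /proc filesystem or a ps that supports -o rss=.

```ruby
# Returns the resident set size of the given PID in kilobytes,
# or nil if it cannot be determined on this platform.
def rss_kb(pid = Process.pid)
  status_file = "/proc/#{pid}/status"
  if File.readable?(status_file)
    # The line looks like "VmRSS:     12345 kB".
    line = File.foreach(status_file).find { |l| l.start_with?("VmRSS:") }
    return line.split[1].to_i if line
  end

  # Fallback for systems without /proc (for example macOS).
  out = `ps -o rss= -p #{Integer(pid)} 2>/dev/null`.strip
  out.empty? ? nil : out.to_i
end

puts "RSS of this process: #{rss_kb} KB"
```

Passing the PID of the web process started by Foreman, and calling this every few seconds, gives the same kind of rising-or-flat signal described above.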
Removing New Relic was the only change that stopped memory from rising
indefinitely. It settled around 130MB for the web process.
We’re trying out removing the New Relic gem tonight to see if we can improve this graph:
@Simon_Taranto, we added Unicorn Worker Killer to our app on staging today, and with the default settings out of the box it works wonderfully. Thanks very much for the recommendation!
We were, but I gave Skylight.io a try a couple months ago and was extremely impressed with it. We got some huge performance wins out of conditions that we diagnosed through it. I am fairly sure that it’s possible to do a lot of the same stuff with New Relic, but Skylight’s low resource usage was a nice plus for us given our prior issues with memory use on dynos.
After controlling the experiment better, I was not able to isolate the issue to the 3.9.x series of the New Relic gem.
Talking with their support, the recommended solution is:
heroku config:set NEW_RELIC_AGGRESSIVE_KEEPALIVE=1 --remote production
It sounds like there may be a bad interaction between the New Relic Ruby agent and the generational garbage collector in Ruby 2.1. The agent re-establishes a new SSL connection to New Relic servers each minute in order to submit data. This allocates only a handful of Ruby objects but triggers lots of native memory allocations through malloc in the openssl library. These allocations aren’t accounted for by Ruby’s GC triggering logic, so unless something else triggers a GC, they can hang around too long.
Somewhat paradoxically, this issue affects idling applications (or apps with infrequent requests) more than applications under a steady amount of load, because GC runs are triggered less frequently in an idling application.
The environment variable above is a configuration setting that causes the agent to reuse a single SSL connection to New Relic servers. It may become the default in upcoming New Relic Ruby gem releases.
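The point about idle applications can be observed from plain Ruby. This sketch counts GC runs during an idle stretch and during a busy one. It only shows how allocation activity drives GC frequency; it does not reproduce the OpenSSL native allocations described above, and the helper name gc_runs_during is made up for illustration.

```ruby
# Counts how many GC runs happen while the given block executes.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# An "idle" stretch: the thread sleeps and allocates almost nothing.
idle_runs = gc_runs_during { sleep 0.2 }

# A "busy" stretch: many short-lived strings, as request handling might create.
busy_runs = gc_runs_during { 500_000.times { |i| "request-#{i}" * 10 } }

puts "GC runs while idle: #{idle_runs}"
puts "GC runs while busy: #{busy_runs}"
```

With few or no GC runs while idle, anything that is only freed when Ruby objects are collected stays allocated for longer, which fits the behavior New Relic support described.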
After setting the variable in one of our apps with this problem, while keeping New Relic in the app, we saw this result:
@samnang, I dropped gctools because I didn’t find it was that effective for us (it may be more effective if you have 2x dynos or are working in a non-Heroku environment). Now that we’re using unicorn worker killer our RAM use is (mostly) staying below the Heroku R14 limit, so I’m holding out hope for Ruby 2.2 this December 25th before I spend much further time on memory issues.
You can find the config option by searching for “aggressive_keepalive”. I believe the NEW_RELIC_AGGRESSIVE_KEEPALIVE variable may be specific to the Heroku add-on: the prefix indicates which add-on it belongs to, and the rest of the name is the config option used by the service.
I believe an alternative to using the ENV variable would be to set aggressive_keepalive: true in your config/newrelic.yml. I believe that will be the default in upcoming New Relic gem versions.
@croaky I see in the Upcase Gemfile that the Ruby version is 2.2.2 and the New Relic gem is at 3.8.1. Is this working well on Heroku, without memory leak issues like 2.1? I downgraded to Ruby 2.0 and memory got a lot better. I was having memory issues with 2.2.2.