Hello fellow rubyists,
continuing working on my current Rails project(updating a legacy Rails/Wordpress mix to a decent iOS/Android JSON API server) I stumbled upon yet another pitfall I couldn’t predict… max_connection mysql error in resque jobs handled by a single worker.
The error looks like this:
Worker
6813a6ae-acb1-4299-9784-49048efe6001:2 on PHOTOS_QUEUE at about 10 hours ago Retry or Remove
Class
PhotoDownloader
Arguments
10461
2311
"https://fbcdn-sphotos-h-a.akamaihd.net/hphotos-ak-ash4/296216_195621207167014_6213639_n.jpg"
Exception
Mysql2::Error
Error
User 'b10656ee5a6bc3' has exceeded the 'max_user_connections' resource (current value: 30)
and it occurs in production when I queue about a 700 to 1000 jobs to the worker.
We have a mini CMS for admins of this API and the feature I was trying to implement is a csv file upload. Admins can upload a csv file with multiple entries instead of entering things manually through the form.
Upon submission of the csv file I parse it and create ActiveRecord objects and also I detach external image downloads(from an external url) to the resque worker(about 2 image per entry from the csv file) which ends up in 700 to 1200 image download jobs queued per csv file.
For image storage I use AWS S3 using Carrierwave.
But unfortunately as I mentioned above I had a max_connection mysql error which made me think that when the worker takes on a job to download an image it open a db connection but doesn’t close it when the job is done… am I correct? should I close it manually somehow? I did try to do it with no success.
I solved the problem by rethinking how I process this downloads. Now instead of detaching a job per image download I collect all the new entries from the csv parsing and then create a single job to download all the images… but it doesn’t seem right because something might fail and it’s hard to retry partially executed job… But it solved the max_connection issue because now I have only one open connection for single monstrous download job.
Am I doing it the right way? Was my initial approach better? If so how could I solve max_connection issue?
P.S. I can’t list all the links/documentation I read while I was solving it (just too many urls and open tabs :D) but I can assure you I went through everything I could find, especially on heroku but it seems nobody experienced quite the same problem.
Thanks.