February 20, 2013
Guys, sorry about the recent server slowdowns. Many of you even reported you couldn’t make your purchases because the shopping cart was broken. We have finally found – and fixed – the problem. We have also extended the 4th anniversary 40% discount offer till 22nd February.
They say when it rains, it rains cats and dogs. We had a number of difficulties with our site in the last two weeks.
This is how I felt initially!
- Orders paid via 2Checkout stayed in “pending” state. The automatic status change on payment confirmation stopped working. This meant we had to manually review and process these orders.
- Tasks Plus product information page started showing 500 Internal Server errors. For no apparent reason.
- Cart page gave 404 errors at times. It would also show different products from the ones you’d added, or no products at all…
- The website got extremely slow at times – again, with no clear reason for it to slow down (like a surge in traffic etc)
- People who ordered the Master Pack only got the Planning module and no other products!!
- Notifications Plus page gave errors at times, and worked at others. Some other pages too behaved erratically.
When everything was going wrong…
Everything crashing.. just when you want it all to work…
We kept discovering these problems while the 4th anniversary promotion was going on, and you can understand how stressful it can be if your billing counter does not work when people are lined up with their wallets open and credit cards in their hands…
We tried a few solutions. Without much success.
I replaced the Tasks Plus page with a new page and that started working. I also disabled a caching mechanism we use to improve performance, but that did not cure the problem either.
Blamed it on Putler…
Then I “blamed” it on the Putler PayPal Proxy API… (Putler is our ecommerce analytics tool, that supports PayPal, 2Checkout, Shopify etc) Basically, we host Putler APIs on the same server and process 100,000 requests on an average day. I thought that must be slowing down things.
Putler must be slowing down the server…
Then I did some code optimizations on that – like connecting to the MySQL database only when there is a query to perform – and while that gave some relief, the troubles did not go away.
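For the curious, the "connect only when there is a query" idea looks roughly like the sketch below. This is not the actual Putler code: the class name `LazyDatabase` and its methods are made up for illustration, and it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere.

```python
import sqlite3


class LazyDatabase:
    """Hypothetical wrapper that opens its connection on first use only."""

    def __init__(self, path):
        self.path = path
        self._conn = None        # nothing is opened at construction time
        self.connect_count = 0   # counts how many times we really connected

    def _connection(self):
        # Open the connection the first time a query needs it, then reuse it.
        if self._conn is None:
            self._conn = sqlite3.connect(self.path)
            self.connect_count += 1
        return self._conn

    def query(self, sql, params=()):
        return self._connection().execute(sql, params).fetchall()


db = LazyDatabase(":memory:")
connections_before = db.connect_count   # no query yet, so no connection yet
rows = db.query("SELECT 1 + 1")
connections_after = db.connect_count    # the first query opened the connection
```

Requests that never touch the database never pay for a connection, which is probably where the relief came from.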
We’d had enough by yesterday. So the first thing I did was buy an additional dedicated MySQL instance from MediaTemple (our hosting provider), hoping that would take the load off the database. That helped, yet problems persisted.
I then rewrote the Putler API code to not use databases at all. That gave superb performance improvement to Putler users (doubled the transaction import speed), but still did not solve the Apps Magnet website issues.
Nothing’s working…
Figuring out the problem, finally!!
Late in the evening, I got a response on our support ticket from MediaTemple saying that the internal server errors were due to PHP memory limits. Three minutes after that, Rahul Bansal emailed me a PHP error message from the site – and suggested turning off error display – because it can expose the actual file path.
Rahul was spot on, so I turned off error display. But now I had two confirmed reports of memory exhaustion, and 100 MB ought to be enough for any request that comes our way. So there was certainly something going wrong….
My experience bringing down servers (with my code) and getting them back again told me there had to be an infinite loop somewhere.
2 + 2 = 4. I knew where the problem was….
We use WooCommerce, and our own Chained Products plugin to automatically provide access to products when you buy a bundle. So when you buy a combo pack, there is a chain of products that automatically get added to your order. I had updated to the latest version of Chained Products two weeks ago. And a recent version added support for “nested” chained products.
Tasks Plus chains Tickets Plus. And Tickets Plus chains Tasks Plus. There you have it… A circular, infinite loop that would exhaust RAM and could bring the server down.
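To see why this blows up, and how little it takes to guard against it, here is a minimal sketch. It is not the Chained Products plugin code (that is a WooCommerce/PHP plugin); the function `expand_chain` and the `chains` mapping are hypothetical names used only to illustrate the idea of remembering which products have already been visited.

```python
def expand_chain(product, chains, seen=None):
    """Return the product plus everything it chains, visiting each one once.

    `chains` maps a product name to the products it chains (hypothetical data).
    Without the `seen` check, Tasks Plus -> Tickets Plus -> Tasks Plus -> ...
    would recurse until memory runs out.
    """
    if seen is None:
        seen = []
    if product in seen:
        # Already added to the order: stop here instead of looping forever.
        return seen
    seen.append(product)
    for child in chains.get(product, []):
        expand_chain(child, chains, seen)
    return seen


# The circular setup described above.
chains = {
    "Tasks Plus": ["Tickets Plus"],
    "Tickets Plus": ["Tasks Plus"],
}

order_items = expand_chain("Tasks Plus", chains)
```

With the guard in place, the circular setup simply yields both products once each, which is most likely what a store owner configuring it this way expects.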
What a silly reason for all the mess
Rolled back to the earlier version of Chained Products, and the cart started working again…
Talked to my teammates who built the Chained Products plugin and they told me: “yes, we’d thought of that test case. But we didn’t think anyone would chain products like that, so omitted handling it from the first release…”. There you have it.. I was the dumb store owner who configured my products exactly like that!!
This is how Chained Products team felt when I explained the problem
We finally recalled Murphy’s Law and I asked them to “fix” the issue and release an update.
Back on track now…
At this time, the server is responding swiftly, the new MySQL instance is taking care of the heavy lifting, Putler is serving double the requests every second, the Apps Magnet cart is working, Tasks Plus and other pages are back to life… And the 2Checkout automatic order completion is working too (it was my mistake.. the latest version required a slight change in the way 2Checkout sends back order confirmation data to us. I had done one change, but not the other.. didn’t read the fine print in the manual… 😉).
40% Discount Extended till 22nd Feb
To compensate for the troubles you’ve had over the last week or two, we have extended the 40% off 4th anniversary offer till 22nd Feb.
So go ahead and get all the activeCollab modules you need. You won’t get this huge discount for another year (when we turn 5 and consider giving 50% off)!!
Yay!! Everything is working now!
40% Off on our activeCollab 3 modules – instant download
Thanks Rahul for the nudge. That led to the whole solution. I owe you a coffee when we meet!
Hopefully this is not how you feel now ;-)
Go ahead, get your favorite activeCollab 3 solution from here
PS: Images courtesy Educational Colours.