How We Fixed a Gunicorn Reentrant Logging Bug in Production

By hientd, at: June 10, 2025, 3:59 p.m.



At Glinteco, we recently faced a puzzling production issue where one of our clients’ Flask apps intermittently went down, even though all services were running and logs showed no errors at first glance. After a deep dive, we uncovered a critical problem:

 

RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>

 

In this post, we’ll walk you through:

  • What caused it
  • How we debugged it
  • The fix we applied
  • Key takeaways for any team using Gunicorn + logging in Python

The Problem

 

The client’s setup was fairly common:

  • Flask app running behind Gunicorn
  • Deployed on two servers behind a load balancer
  • New Relic monitoring installed
  • Gunicorn version: 20.0.1
  • Python version: 3.6

 

Suddenly, users started reporting on-and-off access to the site. But system metrics showed normal CPU and memory usage, and the processes were alive. It didn’t make sense until we found this error in stderr:

 

RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
Worker with pid 12345 was terminated due to signal 9

Debugging the Issue

 

This bug was hard to trace. Here’s what we learned through investigation:

 

  • The error comes from Gunicorn’s logging mechanism: the reentrant-call RuntimeError is raised when a worker’s write to the buffered stderr stream is interrupted and re-entered, typically by a signal handler that also logs (see the sketch after this list).
  • This is a known issue in older versions of Gunicorn, where reentrant or recursive logging can crash a worker.
  • Signal 9 (SIGKILL) means the worker process was forcefully killed by the system, possibly due to logging deadlocks or memory exhaustion.
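
To make the failure mode concrete, here is a minimal sketch (ours, not Gunicorn’s code) of how writing to stderr from a signal handler can re-enter a buffered write that is already in progress. The race is timing-dependent, so it may take many signals to reproduce, but when it fires it raises exactly the reentrant-call RuntimeError shown above.

import os
import signal
import sys

# Minimal sketch, not Gunicorn's actual code: the main loop writes to the
# buffered stderr stream, and the signal handler writes to the same stream.
# If a signal lands while a write/flush is in progress, CPython's
# BufferedWriter detects the re-entry on the same thread and raises
# "RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>".

def on_term(signum, frame):
    # Writing (or logging) from inside a signal handler re-enters stderr.
    sys.stderr.write(f"got signal {signum} in pid {os.getpid()}\n")

signal.signal(signal.SIGTERM, on_term)

while True:
    # Simulates a worker that logs on every request; send SIGTERM repeatedly
    # (e.g. kill -TERM <pid> in a shell loop) to race against these writes.
    sys.stderr.write("handling request\n")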

 

We checked Gunicorn’s changelog and GitHub issues, where the community had reported the same reentrant logging crashes and confirmed that newer releases addressed them.

The Fix: Upgrade in Stages

 

We didn’t jump to the latest version immediately. Instead, we followed a safe upgrade process:

 

  1. Upgraded Gunicorn to 21.2.0, since the community confirmed this version resolved many logging issues (a version check like the one sketched after this list makes it easy to confirm what each server is actually running).
  2. Monitored the system for a few days. The errors disappeared, and the system became stable again.
  3. After confirming stability, we moved to Gunicorn 23.0.0, the latest stable version.
  4. Also updated Python to 3.10 for improved performance and compatibility.
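
To keep a staged rollout honest, a small post-deploy check can confirm which Gunicorn version each server actually runs. This is a hedged sketch, not part of the client’s codebase; it assumes the packaging library is installed and reads Gunicorn’s __version__ attribute.

# post_deploy_check.py - hypothetical smoke check run on each server after a
# staged upgrade; the minimum version is whichever stage is being rolled out.
from packaging.version import Version

import gunicorn

MINIMUM = Version("21.2.0")  # bump to "23.0.0" for the second stage

current = Version(gunicorn.__version__)
if current < MINIMUM:
    raise SystemExit(f"Gunicorn {current} is older than expected ({MINIMUM})")
print(f"Gunicorn {current} OK (>= {MINIMUM})")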

 

No issues since.

 

 

Bonus Fixes (Optional but Recommended)

 

  • We replaced direct stderr logging with a proper logging handler in Python using logging.StreamHandler(sys.stdout) (sketched after the command below).
  • Reduced the number of Gunicorn workers slightly to limit concurrency during peak hours.
  • Added the max_requests setting in Gunicorn to recycle workers after N requests, for example:

 

gunicorn app:app --workers=4 --max-requests=1000 --max-requests-jitter=100
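
For the first bonus fix, here is a minimal sketch of the handler setup we mean; the logger name and format string are illustrative choices, not the client’s exact configuration.

import logging
import sys

# Route application logs through an explicit StreamHandler on stdout instead
# of writing to stderr directly.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("app")  # illustrative logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("application logging now goes through a StreamHandler on stdout")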

 

 

Key Takeaways

 

  • Gunicorn < 20.1 is no longer production-safe: upgrade ASAP.
  • Logging can silently kill your app if it is not properly configured.
  • Always monitor stderr and worker signals, even if the app logs look fine (the config sketch below shows one way to surface them).
  • Follow a safe upgrade path, and subscribe to the GitHub issues or changelogs of critical dependencies.
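
For the monitoring point above, Gunicorn’s server hooks give you a place to log worker signals explicitly. The sketch below uses the documented worker_int, worker_abort, and worker_exit hooks in a gunicorn.conf.py; the log messages themselves are our own additions, not Gunicorn defaults.

# gunicorn.conf.py
import logging

log = logging.getLogger("gunicorn.error")

def worker_int(worker):
    # Called just after a worker exits on SIGINT or SIGQUIT.
    log.warning("worker %s interrupted (SIGINT/SIGQUIT)", worker.pid)

def worker_abort(worker):
    # Called when a worker receives SIGABRT, e.g. on a worker timeout.
    log.warning("worker %s aborted (SIGABRT)", worker.pid)

def worker_exit(server, worker):
    # Called in the master process just after a worker has exited.
    log.warning("worker %s exited", worker.pid)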

 

Final Thoughts

 

This issue reminded us why even mature libraries like Gunicorn need active maintenance and monitoring. At Glinteco, we help clients not just build apps, but run and scale them safely.

 

If you’re experiencing strange behavior in your Flask/Django app, or need a second opinion on your deployment pipeline, reach out. We’ve been there.

 

Tag list:
- reentrant call inside BufferedWriter
- Gunicorn RuntimeError
- Python logging issue
- Gunicorn production best practices
- Gunicorn SIGKILL
- Gunicorn upgrade fix
- Gunicorn logging bug
- Flask app crash
