Fix intermittent S3 timeouts by giving metron-web Network=host #517
Merged
Outbound connections from the bridge network container (to DigitalOcean Spaces S3) were proxied through pasta's user-space network stack, causing intermittent ConnectTimeoutErrors at a ~4% rate regardless of S3 health.

Switch metron-web to Network=host so its outbound traffic goes through the kernel network stack directly. metron-postgres and metron-redis stay on the bridge network but now publish to 127.0.0.1 so metron-web can reach them via the host loopback. Anubis (still on the bridge) reaches metron-web via host.containers.internal:8000.

Update metron.env.example to reflect the new 127.0.0.1 addresses for DB_HOST, REDIS_URL, and THUMBNAIL_REDIS_HOST, and update DEPLOYMENT.md with the revised architecture diagram and explanation.
Summary
- Switch metron-web from Network=metron.network to Network=host so outbound S3 traffic (sorl-thumbnail, media uploads) goes through the kernel network stack instead of pasta's user-space proxy
- Publish metron-postgres and metron-redis to 127.0.0.1 so metron-web can reach them via the host loopback; update DB_HOST, REDIS_URL, and THUMBNAIL_REDIS_HOST in metron.env.example accordingly
- Point the Anubis TARGET to http://host.containers.internal:8000, since metron-web is no longer addressable by container name from the bridge network

Background
This PR moves metron-web's outbound S3 connections from pasta's user-space networking to the host kernel network stack. Testing showed a ~4% failure rate on S3 connects from inside the container regardless of whether pasta or slirp4netns (both user-space network backends) was used, while the same test run directly on the host had 0 failures.
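A comparison like the one described above could be reproduced with a small helper that counts failed attempts of a command. This is a minimal sketch, not the test used in this PR: the `count_failures` function, the attempt count, and the endpoint URL are all hypothetical, and it assumes curl is available wherever it runs.

```shell
# Sketch: run a command N times and print how many attempts failed.
# Usage: count_failures ATTEMPTS COMMAND [ARGS...]
count_failures() {
  attempts="$1"; shift
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Any non-zero exit status (for curl: DNS, connect, or timeout errors) counts.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}

# Hypothetical endpoint; substitute the real Spaces endpoint.
# S3_ENDPOINT="https://example.digitaloceanspaces.com"
# count_failures 500 curl --silent --output /dev/null --connect-timeout 5 "$S3_ENDPOINT"
```

Running the same call once on the host and once inside the container gives two failure counts to compare. Without `--fail`, curl exits 0 on HTTP error responses, so only connection-level failures are counted.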
Deploy
Update metron.env — change three values:
DB_HOST=127.0.0.1
REDIS_URL=redis://127.0.0.1:6379/0
THUMBNAIL_REDIS_HOST=127.0.0.1
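Before restarting, it may help to confirm that the edited file really contains the loopback addresses. The following is a sketch under assumptions: `check_loopback_env` is a hypothetical helper, and the file path passed to it is whatever location your deployment uses for metron.env.

```shell
# Sketch: verify that the three settings in an env file point at 127.0.0.1.
# Returns 0 if all three do, 1 otherwise (details go to stderr).
check_loopback_env() {
  env_file="$1"
  status=0
  for key in DB_HOST REDIS_URL THUMBNAIL_REDIS_HOST; do
    # Use the last assignment of the key in case it appears more than once.
    value="$(grep -E "^${key}=" "$env_file" | tail -n 1 | cut -d= -f2-)"
    case "$value" in
      *127.0.0.1*) ;;
      *) echo "${key} does not point at 127.0.0.1: '${value}'" >&2; status=1 ;;
    esac
  done
  return "$status"
}

# Example (path is an assumption):
# check_loopback_env ./metron.env && echo "env looks good"
```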
Reload the systemd user units and restart everything
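One way to do this step is sketched below. The `restart_metron` wrapper is hypothetical, and the unit names for postgres and redis are assumed to match the container names; only metron-web and metron-anubis appear as unit names in this PR.

```shell
# Sketch: reload user unit definitions, then restart the given units.
restart_metron() {
  # Re-read unit files so systemd picks up the changed container definitions.
  systemctl --user daemon-reload || return 1
  # Restart every unit passed as an argument.
  systemctl --user restart "$@"
}

# Example (postgres/redis unit names are assumptions):
# restart_metron metron-postgres metron-redis metron-web metron-anubis
```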
How to Test
- systemctl --user status metron-web metron-anubis: both active
- No ConnectTimeoutError in logs
- journalctl CONTAINER_NAME=metron-web -n 50: no timeout errors
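The log check can be made less error-prone by counting matching lines instead of reading them by eye. This is a small sketch; `count_timeouts` is a hypothetical helper that reads log lines on standard input.

```shell
# Sketch: print how many input lines mention ConnectTimeoutError.
count_timeouts() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c 'ConnectTimeoutError' || true
}

# Example, using the journalctl command from the checklist above:
# journalctl CONTAINER_NAME=metron-web -n 50 --no-pager | count_timeouts
```

A result of 0 corresponds to the "no timeout errors" expectation.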