Server death does not automatically release LUN mapping #6

Open
dannert opened this issue Nov 29, 2019 · 0 comments

I deployed a mongodb-dev Helm chart with the 1.0.2 driver. After a successful deployment, I killed the server hosting MongoDB to simulate a server failure and observe the recovery behavior.

Based on my testing, the MongoDB database does not recover without significant manual intervention, because the Flex driver does not force an unmap and remap of the LUN to the new worker node.

Issue 1) Deleting the pod from the CLI hangs. Kubernetes registers the delete, puts the pod into Terminating, and starts a new pod on another worker. Creation of that new pod fails because the Flex driver does not force-unmap the LUN, even though the worker node is down and the old pod is Terminating, and does not map the LUN to the new worker. Why does the pod deletion hang? Could that be in the Flex driver?
Issue 2) There is no forced unmap / remap of the LUN when the worker node dies.

The only clean way to get MongoDB running again without manual intervention is to restart the "failed" worker node. At that point the pod delete command completes and the LUN is successfully remapped. In real life, a server that has died is unlikely to come back within a short time frame, so any application relying on that MongoDB instance would hang.
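
For anyone trying to reproduce this, here is a minimal Python sketch of the kubectl commands that could be used to capture the cluster state while the old pod is stuck in Terminating. The pod name `mongodb-dev-0`, the `default` namespace, and the helper names are assumptions for illustration and are not taken from the chart or the driver. By default the script only prints the commands instead of running them.

```python
import shlex
import subprocess


def diagnostic_commands(pod, namespace="default"):
    """Build kubectl commands for inspecting a pod after its node has died.

    `pod` and `namespace` are hypothetical placeholders; substitute the
    names from your own deployment.
    """
    base = ["kubectl", "--namespace", namespace]
    return [
        # Show each pod's status and the node it is scheduled on.
        base + ["get", "pods", "-o", "wide"],
        # Show the pod's events, where volume mount failures are reported.
        base + ["describe", "pod", pod],
        # Show node readiness as seen by the control plane.
        ["kubectl", "get", "nodes"],
    ]


def run(commands, dry_run=True):
    """Return the command lines (dry run) or their captured stdout."""
    results = []
    for cmd in commands:
        if dry_run:
            results.append(shlex.join(cmd))
        else:
            proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
            results.append(proc.stdout)
    return results


if __name__ == "__main__":
    for line in run(diagnostic_commands("mongodb-dev-0")):
        print(line)
```

Calling `run(..., dry_run=False)` executes the commands against whatever cluster the current kubectl context points at, so the output can be attached to a report like this one.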

Screenshot of new MongoDB POD log illustrating the issue:
image
