Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for Patches to api-server and other changes #13

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

FayeGibbins
Copy link

  • Update to apline 3.9
  • 1 sec sleep if api-server errs when patching
  • small refactor
  • new flag --patchnode (default off) to enable patching
  • More info logs
  • Some JSON munging to enable patching
  • Munge the AWS update from 2 to 1 method
  • Allow the passing of an options struct through the code base.

@gjtempleton
Copy link

Having a look at the changes now, but you'll need to add the new github.com/mattbaird/jsonpatch dependency to Gopkg.toml as the build is currently failing.

@FayeGibbins
Copy link
Author

FayeGibbins commented Jun 12, 2019 via email

@ottoyiu
Copy link
Owner

ottoyiu commented Jun 12, 2019

Thanks for the PR! Can you explain the intent for introducing json patch? I seem to be lacking some context for this change by just reading the code.

@FayeGibbins
Copy link
Author

Hi @ottoyiu-san,
Thank you responding so quickly! I'm afraid I'm very new to Go and this is my first ever PR written in Go. Apologies for my code not being clear.
I was asked to alter your code so that it could using the api-server's PATCH utility. I do not know how to do this so searched the Internet for an example of how it was to be done. I found this: https://github.com/tamalsaha/patch-demo/blob/master/main.go#L113 and so coppied it.
As you can see it uses JSONPatch.
Perhaps there is a better way?
Yours
Faye


flag.Set("logtostderr", "true")
flag.Parse()

var my_opts = common.K8sEc2SrcdstOpts{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By go convention this should be camelCase rather than snake_case: https://github.com/golang/go/wiki/CodeReviewComments#variable-names

@@ -18,10 +18,19 @@ import (
func main() {
kubeconfig := flag.String("kubeconfig", "", "Path to a kubeconfig file")
version := flag.Bool("version", false, "Prints current k8s-ec2-srcdst version")
patchnode := flag.Bool("patchnode", false, "By default k8s-ec2-srcdst updates the node opbject, this flag turns on patching instead. This can be useful for when AWS api rate limiting is an issue.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/opbject/object/g

@@ -1,4 +1,4 @@
FROM alpine:3.6
FROM alpine:3.9

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for updating the Alpine version?

@@ -48,8 +53,9 @@ func NewSrcDstController(client kubernetes.Interface, ec2Client *ec2.EC2) *Contr
60*time.Second,
// Callback Functions to trigger on add/update/delete
cache.ResourceEventHandlerFuncs{
AddFunc: c.handler,
UpdateFunc: func(old, new interface{}) { c.handler(new) },
//AddFunc: c.handler,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this commented out code.

@@ -7,6 +7,10 @@ import (
"k8s.io/client-go/tools/clientcmd"
)

type K8sEc2SrcdstOpts struct {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're exporting this it should have a comment for godoc to pick up as with the funcs in this file.

// Sleep for a second and give it one
// more try. The sleep also stops us
// slamming AWS if api-server is
// containtly refusing to allow the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/containtly/constantly/g

// https://github.com/kubernetes/client-go/blob/master/kubernetes/typed/core/v1/node.go#L170
if _, err := c.client.Core().Nodes().Patch(nodeCopy.Name, types.JSONPatchType, json_patch); err != nil {
glog.Errorf("Failed to patch %s annotation: %v", SrcDstCheckDisabledAnnotation, err)
// Sleep for a second and give it one

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is slightly oddly formatted with so many line breaks. It also doesn't make clear to me in what circumstance the retry after 1 second would handle an initial failure. The reference to AWS is also confusing me here as the calls to AWS' APIs occur outside of this logic loop.

glog.V(5).Info(string(json_patch))
// https://github.com/kubernetes/client-go/blob/master/kubernetes/typed/core/v1/node.go#L170
if _, err := c.client.Core().Nodes().Patch(nodeCopy.Name, types.JSONPatchType, json_patch); err != nil {
glog.Errorf("Failed to patch %s annotation: %v", SrcDstCheckDisabledAnnotation, err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an Errorf if we're going to give it a second try? Seems like Warningf would make more sense to me as it's not necessarily an error yet. I'd also make clear in this message that it's only the first attempt.

// containtly refusing to allow the
// patches.
time.Sleep(1 * time.Second)
if _, err := c.client.Core().Nodes().Patch(nodeCopy.Name, types.JSONPatchType, json_patch); err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't log that we've managed a successful patch on a second attempt that I can see.

if err != nil {
glog.Errorf("Failed to MarshalIndent: %v", err)
}
glog.V(5).Info(string(json_patch))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is basically debug info this doesn't feel like it should be displayed at a lower verbosity level than some of the meaningful info messages around which nodes have been successfully patched.

@maruina
Copy link

maruina commented Jun 18, 2019

Hello @ottoyiu ,
I work with Faye and Guy and I'll try to give you some context about why she's submitting this PR.

We've been using the controller in production for quite some time and it's working well. Unfortunately, we have a very busy AWS account and we often get rate limited. When we have a scaling event and we get rate limited, we see the controller reporting the following errors

E0618 07:31:45.442463 1 srcdst_controller.go:102] Failed to set kubernetes-ec2-srcdst-controller.ottoyiu.com/srcdst-check-disabled annotation: Operation cannot be fulfilled on nodes "ip-XX.XX.XX.XX.eu-west-1.compute.internal": the object has been modified; please apply your changes to the latest version and try again

and getting a lot of 409 responses from the Kubernetes API server.

In this scenario, the node is marked as ready but the src/dst check is still enabled. This means that we drop all the cross-az traffic.

By using a patch method, we think we will be able to solve this issue. We already tested in one of our production cluster where we added ~100 nodes at once and we didn't see any problem or delay in disabling the src/dst check.

Please get in touch with us if you think there is something we're missing or anything else we can do for you.

@@ -48,14 +49,18 @@ func TestDisableSrcDstIfEnabled(t *testing.T) {
ec2Client := NewMockEC2Client()
kubeClient := fake.NewSimpleClientset(&v1.NodeList{Items: []v1.Node{*node0, *node1}})

var my_opts = common.K8sEc2SrcdstOpts{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also seems like we should add a test for this new functionality when PatchNode = true

@@ -7,6 +7,10 @@ import (
"k8s.io/client-go/tools/clientcmd"
)

type K8sEc2SrcdstOpts struct {
Patchnode bool

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be UpperCamelCased

@ottoyiu
Copy link
Owner

ottoyiu commented Jun 18, 2019

@maruina @FayeGibbins @gjtempleton thanks for the contribution, explanation and the code review. This seem to work at first glance as a workaround for the update conflicts.

We have an internal branch for this project that is completely re-written to make use of informers, and indexers that already take care of this issue. We want to ultimately merge that back upstream here. However, I'm not sure where it is in the timeline unfortunately. I will see if I can get an update from the person involved with maintaining this project at my office.

@gjtempleton
Copy link

Thanks for getting back to us so quickly.

If those internal changes are able to be merged back upstream here that would be great as they sound like they'll also be a solution to the problem we're seeing. We really appreciate the effort you put into maintaining this project.

@gjtempleton gjtempleton mentioned this pull request Dec 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants