Increase gRPC message size #759
Comments
@LahiLuk I'll have to look into it. Can you share a minimal example that results in the error? If it is as simple as adding an option when we initialize the gcp client, feel free to open a PR yourself. Currently, I can't give you a timeline for when I can resolve this.
Hi @dacbd, anything that will produce a log larger than 4 MB should result in the error. Here's an example:
I'm not sure if I'll be able to open a PR since I'm new to Terraform and I've never used Go, but I'll try to look into it. In the meantime, do you have a suggestion for a workaround? A crude one I came up with is to redirect all shell output to a file, but that complicates log monitoring... Since TPI is geared towards running ML experiments, I find it a bit weird that no one has run into this issue yet, since those logs tend to be quite detailed and the datasets large...
I'll take a look at the example you linked.
…On Fri, Sep 29, 2023, 07:35 Lahorka Nikolovski ***@***.***> wrote:
Hi @dacbd <https://github.com/dacbd>,
anything that will produce a log larger than 4 MB should result in the
error. Here's an example:
terraform {
  required_providers { iterative = { source = "iterative/iterative" } }
}
provider "iterative" {}
resource "iterative_task" "grpc-error-example" {
  cloud   = "aws"
  machine = "t2.micro"
  spot    = -1
  image   = "ubuntu"
  region  = "eu-west-1"
  storage {
    workdir = ""
    output  = ""
  }
  script = <<-END
    #!/bin/bash
    while true; do
      echo "Hello, World!"
      sleep 0.01 # Slow down log creation a bit
    done
  END
}
I'm not sure if I'll be able to open a PR since I'm new to Terraform, and
I've never used Go, but I'll try and look into it.
It seems in any case that other providers were able to increase the
maximum message size, see for example terraform-plugin-go
<hashicorp/terraform-plugin-go#139>.
In the meantime, do you have a suggestion for a workaround? A crude one I
came up with is to redirect all shell output to a file, but that
complicates log monitoring... Since TPI is geared towards running ML
experiments, I find it a bit weird that no one ran into this issue yet,
since those logs tend to be quite detailed and the datasets large...
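The crude workaround mentioned above (redirecting all shell output to a file) could look roughly like the sketch below. It is only an illustration built on the example resource: the resource name, the `logs` directory, the `run.log` file name, and the `workdir`/`output` values are assumptions, and it relies on the script running inside the task's working directory so that the file is synced back with the rest of the output.

```hcl
resource "iterative_task" "grpc-error-workaround" {
  cloud   = "aws"
  machine = "t2.micro"
  spot    = -1
  image   = "ubuntu"
  region  = "eu-west-1"
  storage {
    workdir = "."    # assumption: use the current directory as the working directory
    output  = "logs" # assumption: directory (relative to workdir) to sync back
  }
  script = <<-END
    #!/bin/bash
    mkdir -p logs
    # Send stdout and stderr to a file instead of the task log,
    # so the log retrieved by terraform stays small.
    exec > logs/run.log 2>&1
    while true; do
      echo "Hello, World!"
      sleep 0.01
    done
  END
}
```

As noted, the trade-off is that the output no longer shows up in the task log, so monitoring means inspecting the synced file instead.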
I also forgot to mention... It seems that the task itself actually keeps running after the error, and the logs keep being written to S3. It's just that any terraform commands run locally fail with the error.
@0x2b3bfa0 can you try to take a look at this? I'm hoping it might be as simple as updating our terraform-provider-sdk, or we need to add something more to the
Hello,
I encountered the following error while running a task with TPI:
Due to the error, the provisioned EC2 instance stopped producing the expected outputs but kept running. I could not run terraform destroy and had to terminate the instance and all other resources manually.
When running the same task on a smaller subset of data, the task completes successfully.
If I understand correctly, gRPC uses a default maximum message size of 4 MB unless configured to allow a larger size. Is there a way for TPI plugin users to configure this setting?
Environment Details: