Sending large binary data (Or doing custom marshalling/framing) #7930

Open · tjad opened this issue Dec 13, 2024 · 14 comments

Labels: Area: RPC Features · stale · Status: Requires Reporter Clarification · Type: Question
tjad commented Dec 13, 2024

I have to integrate with a gRPC endpoint that takes large binary data. I have no control over the endpoint - I don't define how it works.

My solution is to do my own framing - almost.

I noticed there is a PreparedMsg, and I want to make use of it so that I can do the encoding step manually: encode the protobuf request myself and modify the region where the binary data should be.
Basically I would:
1.) encode a stubbed protobuf request (which includes a placeholder for the binary input)
2.) split the encoded protobuf into 2 parts (before and after where the binary data should be)
3.) modify the size indication within the protobuf (so that it matches the size of my binary data)
4.) prepare a message with PreparedMsg.Encode() for the first part of the protobuf (everything before the binary data)
5.) SendMsg with the client stream
6.) using an io.Reader, loop, populating a mem.BufferSlice at a fixed size (and yes, respecting that I may not modify this slice as per the internals), and SendMsg for each mem.BufferSlice containing a portion of the binary data
7.) prepare the final part of the protobuf in a PreparedMsg and send it
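
To make step 6 concrete, here is a minimal sketch of the chunking loop described above. It is only a sketch: it relies on the mem.BufferSlice type check proposed below being added to rpc_util.go (stock grpc-go rejects a mem.BufferSlice passed to SendMsg), and the function name and chunk size are illustrative:

	// sendBinary streams the raw payload (step 6). stream is the grpc.ClientStream
	// and r the io.Reader producing the binary data. Relies on the proposed
	// mem.BufferSlice type check in rpc_util.go.
	func sendBinary(stream grpc.ClientStream, r io.Reader) error {
		buf := make([]byte, 64*1024) // fixed chunk size
		for {
			n, err := r.Read(buf)
			if n > 0 {
				chunk := make([]byte, n) // fresh slice per send; never modified afterwards
				copy(chunk, buf[:n])
				if sendErr := stream.SendMsg(mem.BufferSlice{mem.SliceBuffer(chunk)}); sendErr != nil {
					return sendErr
				}
			}
			if err == io.EOF {
				return nil
			}
			if err != nil {
				return err
			}
		}
	}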

Currently the problem is that there is no type check for mem.BufferSlice here, so it returns an error telling me the encoder is expecting a proto.Message (i.e. it still tries to encode the already-encoded message):
https://github.com/grpc/grpc-go/blob/master/rpc_util.go#L690

Something like this:

	if msgBuf, ok := msg.(mem.BufferSlice); ok {
		return msgBuf, nil // already serialized; skip the codec
	}

I have tested this: with the check added, the rest of the process goes normally. The server receives my data and responds correctly.

I am certain there would be a better way of doing this, but currently there is no indication of support for streaming data from an io.Reader. This seems like a big design flaw, or at least a massive limitation.

I need to do this in order to avoid loading the entire payload into memory, encoding it as protobuf, and then having gRPC do its thing. The size of my data can be as small as 1MB and span all the way up to 200MB (or more). This is a massive constraint on the scalability of my client service, which becomes memory-bound.

I think ideally there should be a way to "StreamMsg" (like SendMsg) where I can pass in an io.Reader and the transport layer can handle it appropriately. The io.Reader would output marshalled protobuf.

This way we would effectively be able to write our own custom data processing while still harnessing the gRPC transport (and HTTP/2 framing).

tjad commented Dec 13, 2024

In my scenario above, the server acts upon the entire binary data immediately anyway, so it doesn't matter that the server loads all the data at once. But obviously, if it could do similar io.Reader streaming of the binary data out of the protobuf message, that would be more ideal for it - I think, anyway.

I think in the general case it is not always possible to stream parts of the data: say I were sending RGB image data and pushing it into an AI model, where the model uses all 3 layers of data as a single input sample.

I really think that gRPC (for Golang anyway) would benefit from this - it would remove the only constraint on TRUE streaming.

tjad changed the title Sending large binary data (Or doing custom marshalling) → Sending large binary data (Or doing custom marshalling/framing) Dec 13, 2024
purnesh42H self-assigned this Dec 16, 2024
purnesh42H added the Area: RPC Features label Dec 16, 2024
purnesh42H commented Dec 17, 2024

@tjad thanks for the question. Any specific reason why you are using PreparedMsg? It is a special message type that represents a pre-serialized Protobuf message. It essentially allows you to bypass gRPC's internal marshaling step if you've already serialized your message manually.

I think in your case you probably just need to implement your own codec, since you don't want a dependency on proto: https://github.com/grpc/grpc-go/blob/master/Documentation/encoding.md#implementing-a-codec

And then you can create a client stream and use stream.Send() to send your raw binary data (preferably in chunks).
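
For illustration, a minimal passthrough codec along those lines might look like the sketch below. The codec name "raw-bytes" and the package name are made up for the example; it simply forwards pre-serialized bytes unchanged:

	package rawcodec

	import (
		"fmt"

		"google.golang.org/grpc/encoding"
	)

	// rawCodec forwards already-serialized payloads without touching them.
	type rawCodec struct{}

	func (rawCodec) Marshal(v any) ([]byte, error) {
		b, ok := v.([]byte)
		if !ok {
			return nil, fmt.Errorf("rawcodec: expected []byte, got %T", v)
		}
		return b, nil
	}

	func (rawCodec) Unmarshal(data []byte, v any) error {
		p, ok := v.(*[]byte)
		if !ok {
			return fmt.Errorf("rawcodec: expected *[]byte, got %T", v)
		}
		*p = data
		return nil
	}

	func (rawCodec) Name() string { return "raw-bytes" }

	func init() { encoding.RegisterCodec(rawCodec{}) }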

tjad commented Dec 17, 2024

Thank you for responding @purnesh42H
The encoder would need to encapsulate the above logic, yes?
It would then mean that I would return a mem.BufferSlice holding 3 Buffers:
1.) the head (the first part of the modified protobuf, as per the above)
2.) an io.Reader (this returns the actual binary data)
3.) the tail (the last part of the protobuf, as per the above)

Does this sound doable?

I would only call stream.Send() once, as it is a single gRPC request for the method. The API does not permit sending "multiple requests" with separate payloads and having them treated as a single request for processing.

I can't serialize all the data at once into a single protobuf binary, as it would consume too much memory, resulting in an application that needs to be scaled by memory; the memory would be depleted quickly given the size of the data I'm dealing with: 1MB up to 200MB or greater (we haven't started processing larger data yet).

The binary data is originally streamed from another data source (I/O bound).

tjad commented Dec 17, 2024

It essentially allows you to bypass gRPC's internal marshaling step if you've already serialized your message manually.

This does not actually work in the current state of the code, as I indicated above. Even though I had already serialized the protobuf and sent it as a PreparedMsg, I got an error telling me it was expecting an (unserialized) protobuf message.

Currently the problem is that there is no type check for mem.BufferSlice here, so it returns an error telling me the encoder is expecting a proto.Message (i.e. it still tries to encode the already-encoded message): https://github.com/grpc/grpc-go/blob/master/rpc_util.go#L690

Something like this:

	if msgBuf, ok := msg.(mem.BufferSlice); ok {
		return msgBuf, nil // already serialized; skip the codec
	}

You will notice that encode is just encoding and is not currently doing any type checks.

Using a codec would get around this problem.

purnesh42H commented Dec 21, 2024

The encoder would need to encapsulate the above logic, yes?
It would then mean that I would return a mem.BufferSlice holding 3 Buffers:
1.) the head (the first part of the modified protobuf, as per the above)
2.) an io.Reader (this returns the actual binary data)
3.) the tail (the last part of the protobuf, as per the above)

Does this sound doable?

If you want to do manual encoding, there is no other way but to define a custom codec. You need to implement the Codec or CodecV2 interface, which will have its own init method to register itself as a codec and a Name() method to return the name of the codec. Codecs are registered by name into a global registry maintained in the encoding package. The Marshal and Unmarshal methods of CodecV2 use mem.BufferSlice, which offers a means to represent data that spans one or more Buffer instances, so that might suit your use case better. Codec, however, works only on a single byte slice. Note that CodecV2 was only released in grpc-go version 1.66.0.

You can refer to the grpc's implementation of the proto codec to get an idea https://github.com/grpc/grpc-go/blob/master/encoding/proto/proto.go.
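
As a rough sketch of what such a CodecV2 could look like for this head/body/tail use case (the framedMsg type and the codec name are assumptions for the example, not part of grpc-go):

	package framingcodec

	import (
		"fmt"

		"google.golang.org/grpc/encoding"
		"google.golang.org/grpc/mem"
	)

	// framedMsg is a hypothetical application type: a pre-encoded protobuf
	// head and tail surrounding a large binary payload.
	type framedMsg struct {
		head, body, tail []byte
	}

	type codecV2 struct{}

	func (codecV2) Marshal(v any) (mem.BufferSlice, error) {
		m, ok := v.(*framedMsg)
		if !ok {
			return nil, fmt.Errorf("framingcodec: expected *framedMsg, got %T", v)
		}
		// Each part becomes its own Buffer; nothing is concatenated or copied.
		return mem.BufferSlice{
			mem.SliceBuffer(m.head),
			mem.SliceBuffer(m.body),
			mem.SliceBuffer(m.tail),
		}, nil
	}

	func (codecV2) Unmarshal(data mem.BufferSlice, v any) error {
		p, ok := v.(*[]byte)
		if !ok {
			return fmt.Errorf("framingcodec: expected *[]byte, got %T", v)
		}
		*p = data.Materialize() // flattens the slice into one []byte
		return nil
	}

	func (codecV2) Name() string { return "framing" }

	func init() { encoding.RegisterCodecV2(codecV2{}) }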

I would only call stream.Send() once - as it is a single gRPC request for the method. The API does not permit sending "multiple Requests" with separate payloads and having them treated as a single request for processing.

One thing to call out here is that codecs provide the wire format of the request data. It is not recommended to send more than a few KB of data in a single message on the wire. If you are using a ClientStream to send data, you will be sending the individual messages/chunks over the network, and gRPC will receive the chunks from the network and make them available to the server-side streaming RPC method. They will all be part of the same streaming call. If you haven't already, take a look at the client streaming example here: https://grpc.io/docs/languages/go/basics/#client-side-streaming-rpc-1. Are you saying the endpoint won't process the chunks even if they are in the correct order?

I can't serialize all the data at once into a single protobuf binary, as it would consume too much memory, resulting in an application that needs to be scaled by memory; the memory would be depleted quickly given the size of the data I'm dealing with: 1MB up to 200MB or greater (we haven't started processing larger data yet).

The binary data is originally streamed from another data source (i/o bound)

After starting the client stream, you will have to chunk at the application level, and the custom codec will be used for each chunk sent using stream.Send(). See how to use a custom codec: https://github.com/grpc/grpc-go/blob/master/Documentation/encoding.md#using-a-codec
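
Putting the pieces together, usage might look roughly like the sketch below. The method name, stream descriptor, and reader are hypothetical; the "raw-bytes" codec is the one sketched earlier and is selected per call with grpc.CallContentSubtype:

	func sendLargePayload(ctx context.Context, conn *grpc.ClientConn, r io.Reader) error {
		desc := &grpc.StreamDesc{StreamName: "Upload", ClientStreams: true} // hypothetical method
		stream, err := conn.NewStream(ctx, desc, "/pkg.Service/Upload",
			grpc.CallContentSubtype("raw-bytes"))
		if err != nil {
			return err
		}
		buf := make([]byte, 64*1024)
		for {
			n, readErr := r.Read(buf)
			if n > 0 {
				chunk := make([]byte, n)
				copy(chunk, buf[:n])
				if err := stream.SendMsg(chunk); err != nil { // marshalled by the raw-bytes codec
					return err
				}
			}
			if readErr == io.EOF {
				break
			}
			if readErr != nil {
				return readErr
			}
		}
		return stream.CloseSend()
	}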

purnesh42H commented:
This does not actually work in the current state of the code, as I indicated above. Even though I had already serialized the protobuf and sent it as a PreparedMsg, I got an error telling me it was expecting an (unserialized) protobuf message.

If you have already serialized your message, you don't need PreparedMsg. Just send the serialized bytes directly (if your API supports it), or encapsulate them in a suitable protobuf message with a bytes field and use regular stream.Send().

PreparedMsg.Encode() still expects the unserialized protobuf message as input. The Encode() method uses the codec associated with the RPC stream to perform the serialization: when you call Encode(), it internally serializes the message you provide using the configured codec. It does not simply take bytes you've already serialized.

The main benefit of using PreparedMsg is that the encoding happens before the sending path. So, if you have many messages to send, you can encode them ahead of time before calling stream.SendMsg():

	preparedMsgs := make([]*grpc.PreparedMsg, len(messages))
	for i, msg := range messages {
		preparedMsgs[i] = &grpc.PreparedMsg{}
		if err := preparedMsgs[i].Encode(stream, msg); err != nil { // pre-encode ahead of the send path
			return err
		}
	}

	for _, pMsg := range preparedMsgs {
		if err := stream.SendMsg(pMsg); err != nil { // fast: just sends the pre-encoded data
			return err
		}
	}

github-actions bot commented Dec 27, 2024
This issue is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

github-actions bot added the stale label Dec 27, 2024
tjad commented Dec 29, 2024

@purnesh42H I think you're not fully comprehending my use case provided above. I have tried to explain very elaborately what needs to be achieved, and why this library's implementation is the only thing preventing, or limiting, support for such requirements.

I have taken a lot of time to review the internals of this gRPC client, and it certainly does not permit efficient use of the full 2GB that protobuf's message-size limit allows. HTTP/2's DATA frames are just transport; they impose no limit other than sending packets in 16KB parts. This library is memory-bound.

Please try to have someone who is very experienced with the internals assist - or at least try to comprehend fully what I am trying to achieve; I have provided enough information.

In short, I want to have an io.Reader be read by the underlying layers and transport. Encoding is only one part of the problem; compression is another layer, which "materializes" the whole mem.BufferSlice into memory, which is yet another problem. Why is an io.Reader not used through the layers for pipelining data through the gRPC stack? All applications are memory-bound due to this, and memory is not cheap.

The internals need to be reworked. I'm not saying it's a quick thing to do, but it is certainly a limitation and an oversight in terms of the efficiency of the technologies brought together (Golang, Protobuf, gRPC).

Do I need to open a feature request, or assist in fixing the internals with a PR, etc.? @dfawley

tjad commented Dec 29, 2024

And FYI, I have built a custom serializer and attached/registered it to the encoding registry; this alone does not resolve the problem.

tjad commented Dec 29, 2024

I honestly don't understand how "mem buffers" were seen as the means for efficiently transferring data between layers. mem.BufferSlice is an interesting idea, but loading data into memory should not have been treated as the only possible way of sending data over gRPC. Loading data into memory should be a last resort: I should only load data into memory if I am acting upon that data. When I have binary data, I don't particularly want to act on it, and there should be no need for a second means/technology for transferring large binaries. The mere fact that I can swap out protobuf for different message framing means that I can send > 2GB anyway, if I really need to (that's really the ultimate limitation when using protobuf).

It doesn't have to be limited to binary data either. Imagine a large database query: I could potentially stream data out of the database with database streaming, as a different kind of use case.

Nothing is made perfect, it's an iterative process, and we can only make progress when we acknowledge where there is room for improvement.

arjan-bal commented Jan 2, 2025

@tjad, if I understand correctly, you want to avoid holding the entire 200MB message in memory by streaming things from the application through gRPC to the network. There are some problems that I think will prevent this:

  1. The Golang object that represents the proto needs to be held in memory by the application. Holding the object takes up 200MB of stack or heap space.
  2. The protobuf Marshal API that serializes Golang objects to bytes returns a byte slice and isn't capable of streaming the response.

Due to these points, I can't see a way to stream the bytes through the system. On the network, the bytes are streamed by TCP in smaller IP packets.

Based on the above facts, I don't think it's possible to avoid holding an entire proto message in memory. The smallest entity in a gRPC stream is a message, so I think you should try to split a request into multiple messages instead of splitting a single message. This would require logic on the receiver to assemble these messages based on other metadata in the proto messages being sent (a sketch follows below). Since you mentioned that you don't control the server, I think you should ask the service owner about ways to efficiently use their API.
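
For reference, receiver-side reassembly of such a chunked client stream could look roughly like this; the Upload method and a chunk message with a Data field are assumptions about a hypothetical service:

	func (s *server) Upload(stream pb.Service_UploadServer) error {
		var assembled bytes.Buffer
		for {
			chunk, err := stream.Recv()
			if err == io.EOF {
				break // client finished sending; act on assembled.Bytes()
			}
			if err != nil {
				return err
			}
			assembled.Write(chunk.GetData())
		}
		return stream.SendAndClose(&pb.UploadResponse{})
	}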

tjad commented Jan 6, 2025

When using the default protobuf serializer, yes, that is correct about 1. and 2.

However, when actively managing your data and memory usage, it is possible to split message serialization into parts and stream them independently. When the size of the binary is known (as a member of the protobuf message structure), that size can be written into the wire header for that field's tag, and the data can then be streamed across the network without ever being loaded into memory.
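
For what it's worth, the wire-format header for such a length-delimited field can be precomputed with google.golang.org/protobuf/encoding/protowire; a sketch (the field number is an arbitrary example):

	import "google.golang.org/protobuf/encoding/protowire"

	// bytesFieldHeader builds the tag + length prefix for a bytes field,
	// after which the field body could be streamed separately.
	func bytesFieldHeader(field protowire.Number, size uint64) []byte {
		hdr := protowire.AppendTag(nil, field, protowire.BytesType)
		return protowire.AppendVarint(hdr, size)
	}

	// e.g. bytesFieldHeader(2, 200<<20) for a 200MB payload in field number 2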

And what happens when I have a different serializer? gRPC does not mandate using protobuf. Protobuf does indeed make it difficult, but not impossible, with its 2GB limit. On the receiving end, if I were smart, I would just as easily deserialize the data as a stream, which can be done directly from network I/O - not at all impossible.

My ultimate constraint right now is being forced to work with mem buffers, whereas when passing data around we've traditionally used the io package interfaces in Golang: io.Reader/io.Writer. These interfaces permit using memory buffers, or not, depending on exact needs/requirements, especially when the primary topic is I/O. This fundamental construct, which we've always held dear for efficiency, has been relinquished from our tool set.

dfawley commented Jan 7, 2025

This request feels very niche and like you're doing some borderline inadvisable things.

protobuf generally doesn't work well with huge messages. You'd be better suited by using a streaming RPC so that you can keep message sizes down. I know you don't control the service, but you should request this of them, as it will benefit all of their users, and themselves. (Large messages will incur a huge cost to the service owner, too -- during the deserialization step, they will have to have two copies of the data in memory unless they have a pretty sophisticated deserializer that is able to either lazily deserialize, or alias memory.)

Short of that, you should be able to implement a custom codec for this kind of problem, as mentioned earlier. You would still need one full copy of the data in memory at all times, but at least you wouldn't need two copies. You would do this by making the codec serialize the data to output and deallocate the data in the input at the same time.
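
A sketch of that one-copy idea, assuming a CodecV2 whose input message holds the payload as a list of chunks (the chunkedMsg type and codec are hypothetical; other CodecV2 methods are omitted):

	type chunkedMsg struct{ Chunks [][]byte } // hypothetical application type

	type chunkedCodec struct{}

	func (chunkedCodec) Marshal(v any) (mem.BufferSlice, error) {
		m, ok := v.(*chunkedMsg)
		if !ok {
			return nil, fmt.Errorf("chunkedCodec: unexpected type %T", v)
		}
		out := make(mem.BufferSlice, len(m.Chunks))
		for i, c := range m.Chunks {
			out[i] = mem.SliceBuffer(c) // aliases the chunk; no extra copy
			m.Chunks[i] = nil           // drop the input reference as we go
		}
		m.Chunks = nil
		return out, nil
	}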

Why is an io.Reader not used through the layers for pipelining data through the gRPC layers

io.Reader is horribly inefficient, and not good for composition. It requires a copy at every layer. Further, protobuf does not support any kind of streaming-style API, and protobuf represents 99+% of gRPC's usage.

If you're wondering about the current codec / BufferSlice design, you should check out this presentation at the last grpconf about the topic: https://www.youtube.com/watch?v=SjWa636dpP8

Either way, I don't think we're going to be able to implement this request.

  1. Because gRPC messages are length-prefixed, we must know the full length of the serialized message before we begin transmitting it (see the sketch of the message prefix after this list). If this was ever incorrect -- i.e. N bytes were promised but M were delivered -- the whole stream would have to be shut down. This requirement significantly limits the usefulness of such a feature.

  2. We would need to redesign significant pieces of grpc code to handle sending partial messages through from the API level all the way to the HTTP/2 transport level.

  3. In the process, we could lose optimizations that allow us to batch writes to the wire, hurting performance for everyone, without other additional design work to account for this.
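
For context on point 1, each gRPC message on the wire carries a 5-byte prefix: a 1-byte compressed flag followed by a 4-byte big-endian length (per the gRPC PROTOCOL-HTTP2 spec), which is why the full serialized length must be known up front:

	// gRPC-over-HTTP/2 message prefix; msgLen is the serialized message size.
	hdr := make([]byte, 5)
	hdr[0] = 0 // compressed flag: 0 = uncompressed
	binary.BigEndian.PutUint32(hdr[1:], uint32(msgLen))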

Given that this is not a concern that many of our users have, I don't think we would be able to justify this kind of project. It would be better to work with the service owner to support a streaming API instead, or to implement your own codec that is able to stream the serialized bytes into mem.BufferSlices instead.

github-actions bot commented Jan 13, 2025
This issue is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

github-actions bot added the stale label Jan 13, 2025