Add EIP: EVM64 - Little endian opcodes #9821

Open
wants to merge 2 commits into base: master

sorpaas
Contributor

@sorpaas sorpaas commented May 26, 2025

This defines BYTE64, MLOAD64, MSTORE64 and PUSH*64.

@sorpaas sorpaas requested a review from eth-bot as a code owner May 26, 2025 01:32
@github-actions github-actions bot added c-new Creates a brand new proposal s-draft This EIP is a Draft t-core labels May 26, 2025
@eth-bot
Collaborator

eth-bot commented May 26, 2025

File EIPS/eip-evm64-le.md

Requires 1 more reviewers from @g11tech, @lightclient, @SamWilsn

@eth-bot eth-bot added e-consensus Waiting on editor consensus e-review Waiting on editor to review labels May 26, 2025
@eth-bot eth-bot changed the title EVM64 - Little endian opcodes Add EIP: EVM64 - Little endian opcodes May 26, 2025
@github-actions github-actions bot added the w-ci Waiting on CI to pass label May 26, 2025
Member

@jochem-brouwer jochem-brouwer left a comment

I think the EIP should motivate why the bytes are "reversed" in some cases. I do not know enough about low-level hardware, but it seems to me that even on 64-bit processors you still need to reverse the bytes at some point (either when coming from the legacy 256-bit EVM to here, or when going from here back into the 256-bit EVM).

The gas costs are also likely too low, especially because of bitmask and reversal "extra" work.


### `BYTE64`

`BYTE64` (0xc01a) is defined as `(x >> i * 8) & 0xFF`. Note that the definition is changed from big endian to little endian.

Member

I would assume this masks the stack bits above the 64 bit range? If I would query BYTE 9 this should always push 0 to the stack - right?

Contributor Author

Yeah, this is the same behavior as 256-bit BYTE -- if the range is exceeded then we push 0.
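
To make the out-of-range behavior concrete, here is a minimal Python sketch of `BYTE64` next to the legacy `BYTE`. The function names `byte64` and `byte256` are illustrative only and do not come from the EIP; the sketch assumes that only the low 64 bits of both operands are read, as described in this thread.

```python
MASK64 = (1 << 64) - 1

def byte64(i: int, x: int) -> int:
    """Little-endian byte extraction: index 0 is the least significant byte."""
    # Assumption: only the low 64 bits of each stack operand are read.
    i &= MASK64
    x &= MASK64
    if i >= 8:
        return 0  # out of range, same convention as the 256-bit BYTE
    return (x >> (i * 8)) & 0xFF

def byte256(i: int, x: int) -> int:
    """Legacy BYTE: index 0 is the most significant byte of the 256-bit word."""
    if i >= 32:
        return 0
    return (x >> (8 * (31 - i))) & 0xFF
```

With `x = 0x1020`, index 0 selects `0x20` under `byte64`, while the legacy `byte256` needs index 31 to reach the same byte.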


`BYTE64` (0xc01a) is defined as `(x >> i * 8) & 0xFF`. Note that the definition is changed from big endian to little endian.

The gas cost is `G_VERYLOW64`.

Member

The 256-bit variant costs 3 gas, and this one likely also needs to mask the topmost bits (the "bits on the left") first (?)

Contributor Author

It does not need to mask the topmost bits; it simply ignores them. x is 64-bit and i is 64-bit. Whatever the topmost 192 bits are, we never touch them!


### `MLOAD64` and `MSTORE64`

`MLOAD64` (0xc051) will load a 64-bit integer in little endian onto the stack. `MSTORE64` (0xc052) will read a 64-bit integer from the stack, and store it to memory in little endian.

Member

@jochem-brouwer jochem-brouwer May 26, 2025

This changes the memory offset from big endian to little endian. So MSTORE(0, 0x1020) will behave differently from MSTORE64(0, 0x1020)? 🤔 EDIT: yes, it will, as explained in the text below 😄 👍

I'm assuming this works like MSTORE8, so it will only write to the 64 bits it could write to and not more, not less?

Contributor Author

Yeah, they behave differently. The length is also different. MSTORE(0, 0x1020) would be memory [0x0 .. 0x0 0x10 0x20] (total 32 bytes), and MSTORE64(0, 0x1020) would be [0x20 0x10 0x0 .. 0x0] (total 8 bytes).

MSTORE8 is only one byte so endianness does not matter.
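
The two layouts described above can be reproduced with a short Python sketch. The helper names `mstore`, `mstore64` and `mload64` are hypothetical, memory is modeled as a pre-sized `bytearray`, and memory expansion and gas are deliberately left out.

```python
MASK64 = (1 << 64) - 1

def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Legacy MSTORE: writes 32 bytes, most significant byte first."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")

def mstore64(memory: bytearray, offset: int, value: int) -> None:
    """MSTORE64 as described in this thread: writes 8 bytes, least significant first."""
    # Assumption: only the low 64 bits of the stack value are stored.
    memory[offset:offset + 8] = (value & MASK64).to_bytes(8, "little")

def mload64(memory: bytearray, offset: int) -> int:
    """MLOAD64: reads 8 bytes and interprets them as little endian."""
    return int.from_bytes(memory[offset:offset + 8], "little")
```

Calling `mstore(mem, 0, 0x1020)` on 32 zero bytes leaves `0x10 0x20` in the last two positions, whereas `mstore64(mem, 0, 0x1020)` leaves `0x20 0x10` in the first two and touches only 8 bytes.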


`MLOAD64` (0xc051) will load a 64-bit integer in little endian onto the stack. `MSTORE64` (0xc052) will read a 64-bit integer from the stack, and store it to memory in little endian.

The gas cost for both opcodes is `G_VERYLOW64`.

Member

What about the dynamic memory resizing costs? Since this also needs to "reverse" the bytes before writing to the internal memory, it seems like more work, so the cost is likely too low.

Contributor Author

As for the resizing cost, it extends the memory by 8 bytes (rather than 32 bytes).

See the comment below about "reversing" the bytes!


`PUSH2_64` (0xc061) to `PUSH8_64` (0xc067) are followed by a 2-byte to 8-byte literal, respectively. The literal is read as little endian and pushed onto the stack.

The gas cost for them is `G_VERYLOW64`.

Member

PUSH costs 3 gas, and this has to read code plus reverse it, so this is also likely too low.
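
For reference, decoding such a literal could look like the following Python sketch. The function name `read_push64_literal` and its parameters are hypothetical, the literal width `n` is passed in directly rather than derived from the opcode, and truncated code is rejected here only because the thread does not say how it should be handled.

```python
def read_push64_literal(code: bytes, pos: int, n: int) -> int:
    """Read an n-byte little-endian literal starting at code[pos]."""
    if not 2 <= n <= 8:
        raise ValueError("PUSH2_64..PUSH8_64 take 2 to 8 literal bytes")
    literal = code[pos:pos + n]
    if len(literal) != n:
        # Assumption: behavior at the end of code is unspecified in this sketch.
        raise ValueError("literal runs past the end of the code")
    # The first literal byte is the least significant one.
    return int.from_bytes(literal, "little")
```

The literal bytes `0x20 0x10` decode to `0x1020` here, whereas a legacy big-endian `PUSH2` would need `0x10 0x20` for the same value.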

@sorpaas
Contributor Author

sorpaas commented May 26, 2025

@jochem-brouwer

> I think the EIP should motivate why the bytes are "reversed" in some cases. I do not know enough about low-level hardware, but it seems to me that even on 64-bit processors you still need to reverse the bytes at some point (either when coming from the legacy 256-bit EVM to here, or when going from here back into the 256-bit EVM).
>
> The gas costs are also likely too low, especially because of bitmask and reversal "extra" work.

I want to comment together on the point of needing reversal "extra" work.

An efficient EVM interpreter already stores each 256-bit stack value as a little-endian U256. Generally, that is a [u64; 4] (the least significant u64 can be the first or the last item of the array; it doesn't matter). This is actually the only way the interpreter can work, because it cannot do math, or practically any operation other than clone/copy, on a 256-bit value unless it is little endian. We run on little-endian native machines anyway! The interpreter then "reverses" the bytes whenever it needs to store values into memory.

So as you can see, it's not that those 64-bit little endian opcodes need reversal "extra" work, but the 256-bit big endian opcodes! I think this should also answer your concern about the other points for gas costs.
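
The argument can be illustrated with a small Python model of the limb representation. This is only a sketch: the helper names are made up for illustration, the limb order (least significant first) is one of the two options mentioned above, and `struct.pack("<Q", ...)` stands in for how a u64 sits in RAM on a little-endian machine.

```python
import struct

MASK64 = (1 << 64) - 1

def to_limbs(value: int) -> list:
    """Split a 256-bit value into four u64 limbs, least significant limb first."""
    return [(value >> (64 * k)) & MASK64 for k in range(4)]

def limb_native_bytes(limb: int) -> bytes:
    """Bytes of one u64 as laid out on a little-endian machine."""
    return struct.pack("<Q", limb)

def mstore_bytes(limbs: list) -> bytes:
    """Legacy MSTORE output: walk limbs from most to least significant
    and reverse each limb's native bytes to get 32 big-endian bytes."""
    return b"".join(limb_native_bytes(limb)[::-1] for limb in reversed(limbs))

def mstore64_bytes(limbs: list) -> bytes:
    """MSTORE64 output: the low limb's native bytes are already in the
    target order, so no reversal is needed."""
    return limb_native_bytes(limbs[0])
```

In this model the byte reversal appears only in `mstore_bytes`, the big-endian path; `mstore64_bytes` copies the low limb as it is.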

But on the other hand, I agree this is "confusing". It took me a long time to convince even myself that these little-endian opcodes would be beneficial and efficient (they indeed are!). So I perfectly understand that all this little endian / big endian stuff sounds really confusing to any core dev. This is also why this is defined as a separate "optional EIP" rather than as part of the core EVM64 EIP-7937. I believe that one is rather straightforward and easy to understand, but for this one it will probably take an awfully long time before everyone is convinced that everything ought to be defined this way.

@jochem-brouwer
Member

Aha, ok, yes this clarifies a lot. I think my confusion also comes from writing stuff in JavaScript which is obviously very far away from the low-level code. I also realized (or: had to look up) that most systems run in little endian and considering that, you are right, then indeed the big endian would first need to be converted.

I think it would clarify a lot if you could include your post above in the EIP 😄 👍 On the other hand, this might also be common knowledge and I come from a different "world" where this is not obvious and thus confusing.

The commit c17b9ab (as a parent of da5b456) contains errors.
Please inspect the Run Summary for details.

@sorpaas
Contributor Author

sorpaas commented May 28, 2025

@Editors can you help to assign a number for this EIP? Thank you!
