Add EIP: EVM64 - Little endian opcodes #9821
I think the EIP should motivate why the bytes are "reversed" in some cases. I do not know enough about lower-level hardware, but it seems to me that even with 64-bit processors you still need to reverse the bytes at some point (either going from the 256-bit (legacy) EVM to here, or from here back into the 256-bit EVM).
The gas costs are also likely too low, especially because of the "extra" work for bitmasking and reversal.
### `BYTE64`

`BYTE64` (0xc01a) is defined as `(x >> i * 8) & 0xFF`. Note that the definition is changed from big endian to little endian.
I would assume this masks the stack bits above the 64-bit range? If I query `BYTE 9`, this should always push 0 to the stack, right?
Yeah, this is the same behavior as the 256-bit `BYTE`: if the range is exceeded then we push 0.
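To make the two byte orders concrete, here is a minimal Python sketch of the formula quoted above next to the legacy big endian `BYTE`. The function names `byte64` and `byte256` are illustrative only, and masking both operands to their low 64 bits is an assumption based on the replies in this thread, not something taken from the EIP text.

```python
MASK64 = (1 << 64) - 1

def byte64(i: int, x: int) -> int:
    """Little endian BYTE64 sketch: index 0 is the least significant byte."""
    # Assumption: only the low 64 bits of each stack item are considered.
    i &= MASK64
    x &= MASK64
    if i >= 8:
        return 0  # out of range pushes 0, as with the 256-bit BYTE
    return (x >> (i * 8)) & 0xFF

def byte256(i: int, x: int) -> int:
    """Legacy big endian BYTE: index 0 is the most significant byte."""
    if i >= 32:
        return 0
    return (x >> (8 * (31 - i))) & 0xFF
```

With `x = 0x1020`, `byte64(0, x)` selects `0x20`, while the legacy opcode needs index 31 to reach the same byte.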
The gas cost is `G_VERYLOW64`.
The 256-bit variant costs 3 gas, and this one likely also needs to mask the topmost bits (the "bits on the left") first (?)
It does not need to mask the topmost bits, just discard them. `x` is 64-bit, and `i` is 64-bit. Whatever the topmost 192 bits were, we never touch them!
### `MLOAD64` and `MSTORE64`

`MLOAD64` (0xc051) will load a 64-bit integer in little endian onto the stack. `MSTORE64` (0xc052) will read a 64-bit integer from the stack, and store it to memory in little endian.
This changes the byte order in memory from big endian to little endian. So `MSTORE(0, 0x1020)` will behave differently than `MSTORE64(0, 0x1020)`? 🤔 EDIT: yes, it will; this is explained in the text below 😄 👍
I'm assuming this works like `MSTORE8`, so it will write exactly the 64 bits it is meant to write, no more and no less?
Yeah, they behave differently. The length is also different. `MSTORE(0, 0x1020)` would produce memory `[0x0 .. 0x0 0x10 0x20]` (total 32 bytes), and `MSTORE64(0, 0x1020)` would produce `[0x20 0x10 0x0 .. 0x0]` (total 8 bytes).
`MSTORE8` is only one byte, so endianness does not matter.
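The two layouts described above can be reproduced with a short Python sketch. The helper names are hypothetical, memory is modeled as a preallocated `bytearray`, and memory expansion and gas accounting are deliberately left out.

```python
MASK64 = (1 << 64) - 1

def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Legacy MSTORE sketch: writes 32 bytes in big endian order."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")

def mstore64(memory: bytearray, offset: int, value: int) -> None:
    """MSTORE64 sketch: writes only 8 bytes, in little endian order."""
    # Assumption: only the low 64 bits of the stack item are stored.
    memory[offset:offset + 8] = (value & MASK64).to_bytes(8, "little")

def mload64(memory: bytearray, offset: int) -> int:
    """MLOAD64 sketch: reads 8 bytes as a little endian integer."""
    return int.from_bytes(memory[offset:offset + 8], "little")

legacy = bytearray(32)
mstore(legacy, 0, 0x1020)    # ends with ... 0x10 0x20

le64 = bytearray(32)
mstore64(le64, 0, 0x1020)    # starts with 0x20 0x10 ...
```

Reading back with `mload64(le64, 0)` returns `0x1020`, so the store and the load agree on the byte order.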
EIPS/eip-evm64-le.md
Outdated
The gas cost for both opcodes is `G_VERYLOW64`.
What about the dynamic resizing costs of the memory? Since this also needs to "reverse" the bytes before writing to the inner memory, it seems like more work, so the gas cost is likely too low.
Regarding the resizing cost: it extends the memory by 8 bytes (rather than 32 bytes).
See the comment below about "reversing" the bytes!
`PUSH2_64` (0xc061) to `PUSH8_64` (0xc067) are each followed by a 2-byte to 8-byte literal. The literal is read as little endian and pushed onto the stack.

The gas cost for them is `G_VERYLOW64`.
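As a rough illustration of the little endian literal, the following Python sketch decodes one such push from a code buffer. It assumes the opcode is encoded as the two bytes `0xc0` followed by `0x61` to `0x67`, which is one reading of the values above and not confirmed by the text; the function name `push_64` is hypothetical.

```python
def push_64(code: bytes, pc: int) -> tuple[int, int]:
    """Decode a PUSHn_64 at code[pc]; return (pushed value, next pc)."""
    # Assumption: two-byte opcode, prefix 0xc0 followed by 0x61..0x67.
    prefix, op = code[pc], code[pc + 1]
    if prefix != 0xC0 or not 0x61 <= op <= 0x67:
        raise ValueError("not a PUSH2_64..PUSH8_64 opcode")
    n = op - 0x5F  # 0x61 -> 2-byte literal, ..., 0x67 -> 8-byte literal
    literal = code[pc + 2:pc + 2 + n]
    # The literal is interpreted least significant byte first.
    return int.from_bytes(literal, "little"), pc + 2 + n

value, next_pc = push_64(bytes([0xC0, 0x61, 0x20, 0x10]), 0)
```

Here the literal bytes `0x20 0x10` decode to `0x1020`, the same byte order that `MSTORE64` writes to memory.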
PUSH costs 3 gas, and this has to read code plus reverse it, so this is also likely too low.
I want to respond to the point about reversal "extra" work for all of these together. An efficient EVM interpreter already stores each 256-bit stack value in little endian. So as you can see, it's not the 64-bit little endian opcodes that need "extra" reversal work, but the 256-bit big endian opcodes! I think this should also answer your concerns about gas costs on the other points.

On the other hand, I agree this is "confusing". It took me a long time to convince even myself that these little endian opcodes would be beneficial and efficient (they indeed are!). So I perfectly understand that all this little endian / big endian stuff sounds really confusing to core devs. This is also the reason why this is defined as a separate "optional EIP" rather than together with the core EVM64 EIP-7937. I believe that one is rather straightforward and easy to understand, but for this one it will probably take an awfully long time before everyone is convinced that everything ought to be defined this way.
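The argument can be sketched in Python under one assumption: the interpreter keeps each 256-bit stack word as four 64-bit limbs, least significant first, on a little endian host. That layout is common in practice but is not specified anywhere in this thread, and all names below are illustrative.

```python
import struct

MASK64 = (1 << 64) - 1

def to_limbs(value: int) -> list[int]:
    """Split a 256-bit value into four 64-bit limbs, least significant first."""
    return [(value >> (64 * k)) & MASK64 for k in range(4)]

def host_bytes(limbs: list[int]) -> bytes:
    """Bytes of the limbs as a little endian host would hold them in RAM."""
    return struct.pack("<4Q", *limbs)

def mstore64_image(limbs: list[int]) -> bytes:
    # Little endian 64-bit store: the low limb's bytes are copied unchanged.
    return host_bytes(limbs)[:8]

def mstore_image(limbs: list[int]) -> bytes:
    # Big endian 256-bit store: all 32 bytes have to be reversed first.
    return host_bytes(limbs)[::-1]

limbs = to_limbs(0x1020)
```

Under this layout the 64-bit little endian store is a plain copy of the first limb, and the reversal step appears only on the 256-bit big endian path.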
Aha, ok, yes, this clarifies a lot. I think my confusion also comes from writing stuff in JavaScript, which is obviously very far away from low-level code. I also realized (or rather: had to look up) that most systems run in little endian, and considering that, you are right: the big endian values would indeed need to be converted first. I think if you could include your explanation above in the EIP, it would clarify a lot 😄 👍 On the other hand, this might also be common knowledge, and I come from a different "world" where this is not obvious and thus confusing.
The commit c17b9ab (as a parent of da5b456) contains errors.
@Editors can you help to assign a number for this EIP? Thank you!
This defines `BYTE64`, `MLOAD64`, `MSTORE64` and `PUSH*64`.