Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler_flags: Change optimization_fast from '-Ofast' to '-O3' #87495

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

thughes
Copy link
Contributor

@thughes thughes commented Mar 21, 2025

'-Ofast' enables optimizations that are not standards compliant, which can
produce unexpected results unless used carefully.

When building with clang 19 and newer, it warns:

clang: error: argument '-Ofast' is deprecated; use '-O3 -ffast-math' for
the same behavior, or '-O3' to enable only conforming optimizations
[-Werror,-Wdeprecated-ofast]

Relevant discussion:

@thughes
Copy link
Contributor Author

thughes commented Mar 21, 2025

There are other possibilities here, such as disable the warning for clang, only change clang to use -O3, or change clang to use -O3 -ffast-math to be similar to gcc.

However, it seems like we want the meaning of this flag to be the same across all compilers, so I think it makes sense for whatever we do to be consistent. If we want to keep the unsafe optimizations, it might be good to rename it to something like optimization_fast_unsafe so people pay attention when using it.

@@ -21,7 +21,7 @@ endif()
set_compiler_property(PROPERTY optimization_speed -O2)
set_compiler_property(PROPERTY optimization_size -Os)
set_compiler_property(PROPERTY optimization_size_aggressive -Oz)
set_compiler_property(PROPERTY optimization_fast -Ofast)
set_compiler_property(PROPERTY optimization_fast -O3)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we choose to make this change, I'll add a context comment here so it doesn't accidentally get changed back in the future.

@thughes thughes marked this pull request as ready for review March 21, 2025 20:31
@zephyrbot zephyrbot added area: Toolchains Toolchains size: XS A PR changing only a single line of code area: Build System labels Mar 21, 2025
@thughes thughes force-pushed the push-sppopnmtpnxw branch from 9919c29 to a52b9d4 Compare March 21, 2025 22:04
Copy link
Collaborator

@nordicjm nordicjm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR says changing for clang but is changing for gcc, clang changes should go in the clang file as per https://github.com/zephyrproject-rtos/zephyr/blob/a52b9d4084133edbdf547f308f5cb349a764fcc3/cmake/compiler/clang/compiler_flags.cmake#L1-L4]

GCC has no plans to deprecate this flag https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115990

'-Ofast' enables optimizations that are not standards compliant, which can
produce unexpected results unless used carefully.

When building with clang 19 and newer, it warns:

clang: error: argument '-Ofast' is deprecated; use '-O3 -ffast-math' for
the same behavior, or '-O3' to enable only conforming optimizations
[-Werror,-Wdeprecated-ofast]

Relevant discussion:

* llvm/llvm-project#98736
* https://discourse.llvm.org/t/rfc-deprecate-ofast/78687/109
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115990

Signed-off-by: Tom Hughes <tomhughes@chromium.org>
@thughes thughes force-pushed the push-sppopnmtpnxw branch from a52b9d4 to e42bd38 Compare March 24, 2025 17:30
@thughes
Copy link
Contributor Author

thughes commented Mar 25, 2025

I think we want to be consistent on this on all compilers.

-Ofast is not standards compliant on both gcc and clang. We probably don't want people to use it unless they're really sure that it's safe for the code they're enabling it on. The name optimization_fast suggests that it's just faster, not necessarily unsafe. We could rename it to optimization_fast_unsafe to make it clear that it's not compliant. Or we could just choose to make optimization_fast be something that is compliant like I proposed in this change. (It's not clear whether the unsafe aspect was taken into account when it was enabled for the code that is currently using it.)

@thughes thughes requested a review from nordicjm March 25, 2025 16:25
@tejlmand
Copy link
Collaborator

I think we want to be consistent on this on all compilers.

That's not possible, cause compilers has their own way / flags for handling this.

on both gcc and clang

you forgot armclang, arcmwdt, iar, armcc, and whatever compiler that exists out there 🙈 .

The name optimization_fast suggests that it's just faster, not necessarily unsafe. We could rename it to optimization_fast_unsafe to make it clear that it's not compliant.

I would say it's outside scope of Zephyr project to determine if the fast implementation for a given compiler is to be considered non-compliant, or unsafe as you say.

The main problems with appending unsafe are:

  1. It gives an impression that code may just crash, but that's not the case.
    For gcc it simply means that fast-math is enabled, which takes short-cuts to calculate faster, but the result can be imprecise and thus not following IEEE. That's not the same as unsafe.
  2. Not all compilers takes such short-cut, in which case unsafe is wrong.

Developers should really understand what a given option does instead of just assuming.

Or we could just choose to make optimization_fast be something that is compliant like I proposed in this change. (It's not clear whether the unsafe aspect was taken into account when it was enabled for the code that is currently using it.)

I don't know the history for this flag, but changing it know will really impact things for everyone depending on current usage.
In embedded, and especially battery operated, systems, then a slight math impression is usually accepted if you can save power or execute faster.
Embedded systems are usually resource constrained, so having a fast optimization makes a lot of sense, so you can go back to sleep faster.

Changing from -Ofast --> -O3 will be hard to debug for developers who might suddenly see increased power usage / faster battery drain because floating point ops takes longer time.

@thughes
Copy link
Contributor Author

thughes commented Mar 26, 2025

but the result can be imprecise and thus not following IEEE. That's not the same as unsafe.

If you're working on a safety critical system, that can indeed be unsafe. Perhaps a better name would be "non-compliant-math" or something to that effect.

For gcc it simply means that fast-math is enabled

It also enables -fallow-store-data-races, which could lead to correctness issues: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-Ofast

Not all compilers takes such short-cut, in which case unsafe is wrong.

One possibility would be to abstract into two different settings. A "compliant/safe fast" and "non-compliant / unsafe fast". Each compiler could then define what that means (similar to how the other compiler settings are abstracted to allow each one to define an implementation that makes sense for it).

@tejlmand
Copy link
Collaborator

tejlmand commented Mar 27, 2025

but the result can be imprecise and thus not following IEEE. That's not the same as unsafe.

If you're working on a safety critical system, that can indeed be unsafe. Perhaps a better name would be "non-compliant-math" or something to that effect.

I would still say that it depends on what you're using floats for, and if you remember to check for nan/inf, or error checks (if you don't do those checks, then fast-math or not doesn't really matter much as your code itself is unsafe).

But I agree that in a safety critical system you would generally want to trade performance for extra checks, just to be on the extra safe side.

For gcc it simply means that fast-math is enabled

It also enables -fallow-store-data-races, which could lead to correctness issues: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-Ofast

True, but this again depends on whether you have guarded your threads properly.
Which is quite different from the fast-math issue where you can't do anything regarding imprecise floating point math.
Hence the reason I believe the math part to be of highest importance in this regard.

Not all compilers takes such short-cut, in which case unsafe is wrong.

One possibility would be to abstract into two different settings. A "compliant/safe fast" and "non-compliant / unsafe fast". Each compiler could then define what that means (similar to how the other compiler settings are abstracted to allow each one to define an implementation that makes sense for it).

We could have a extra STRICT_COMPLIANCE Kconfig flag, similar to what we have for MISRA:

zephyr/Kconfig.zephyr

Lines 616 to 617 in 43e8f11

config MISRA_SANE
bool "MISRA standards compliance features"

We could then have the Kconfig fast setting depend on such STRICT_COMPLIANCE setting.
Also setting CONFIG_MISRA_SANE=y could result in a strict compliance setting being enabled.

Note, i've used STRICT_COMPLIANCE as name, but I would like a better name because I find it too generic, because it's not identifying compliance against what. In this context it's fine, but a better name would be preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Build System area: Toolchains Toolchains size: XS A PR changing only a single line of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants