Farm v2.0 performance improvement #2113

Open
1 of 12 tasks
fu050409 opened this issue Mar 11, 2025 · 0 comments
Labels
enhancement New feature or request good first issue Good for newcomers performance Performance dependent v2.0 v2.0 version


fu050409 commented Mar 11, 2025

As an extremely fast front-end compiler, Farm should undoubtedly be performance-first. We mostly didn't think about detailed performance optimizations in v1, and in fact we have more options for maximizing performance than the standard implementations such as String and Vec. I think Farm v2 should be an extremely optimized and powerful compiler.

To this end, I found a large number of Rust crates that implement alternatives to the Rust std library and began benchmarking them comprehensively. The next step is to gradually replace and benchmark the code in Farm line by line. Yes, I think it's time we did fine-grained benchmarking. All of these alternative crates exist to squeeze more performance out of Rust, and I think we can do more comprehensive performance optimization under our own design philosophy.


Tracking pull request: #2116


Benchmarks of std alternatives

  1. Alternatives for String

    See https://github.com/fu050409/str-bench

  2. Alternatives for Vec

    See https://github.com/fu050409/vec-bench

  3. Alternatives for HashMap

    See https://github.com/fu050409/hashmaps-bench

Prerequisites

  • Optimize the Farm profile to allow fine-grained performance benchmarking
  • Add more benchmarks

Refactor program

  • 1. String

    The benchmark results show that EcoString performs extremely well when cloning large strings, but does not perform as expected on string replacement. ImString matches EcoString on cloning, but performs catastrophically in several other benchmarks. ArcStr, first mentioned by @ErKeLost, shows only a slight advantage in conversions to and from the standard library String. CompactString, which is highly prized by oxlint, performs impressively on small strings but barely wins any of the other benchmarks, and even performs rather poorly on large strings. For basic string manipulation, the standard String clearly offers the best performance and is the most stable implementation.

    A wholesale replacement is therefore likely to cause performance regressions; we should use a combination of these libraries instead. Both EcoString and CompactString are fully compatible with String and each performs very well in specific scenarios, so we can use the former where large strings are copied frequently and the latter for storing small strings. Wherever string operations are needed, the value should be converted back to a standard String first. The to_string conversion has an overhead, but it is very small: an order of magnitude lower than that of a string push operation.

    • Convert all high-clone large strings to EcoString
    • Convert all high-clone small strings to CompactString
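
The "clone cheaply, convert to String for edits" pattern described for item 1 can be sketched with the standard library alone. This is only an illustration under stated assumptions: `Arc<str>` stands in for a clone-optimized string such as EcoString (it is not EcoString's implementation), and `SharedText` and `append_suffix` are hypothetical names that do not exist in Farm.

```rust
use std::sync::Arc;

/// Stand-in for a clone-optimized string such as EcoString (assumption):
/// cloning an `Arc<str>` bumps a reference count instead of copying bytes.
#[derive(Clone, Debug)]
struct SharedText(Arc<str>);

impl SharedText {
    fn new(s: &str) -> Self {
        SharedText(Arc::from(s))
    }

    /// Convert to an owned `String` when the text must be modified.
    fn to_owned_string(&self) -> String {
        self.0.to_string()
    }

    /// True when both handles point at the same heap buffer.
    fn shares_buffer_with(&self, other: &SharedText) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }
}

/// Hypothetical helper: edit through `String`, then freeze the result again.
fn append_suffix(text: &SharedText, suffix: &str) -> SharedText {
    let mut owned = text.to_owned_string();
    owned.push_str(suffix);
    SharedText::new(&owned)
}

fn main() {
    // A large, frequently cloned string (for example, a module's source).
    let source = SharedText::new(&"x".repeat(1 << 20));
    let copy = source.clone(); // cheap: no byte copy
    assert!(source.shares_buffer_with(&copy));

    // Editing goes through a plain String and yields a new buffer.
    let edited = append_suffix(&source, "//# sourceMappingURL=a.map");
    assert!(!edited.shares_buffer_with(&source));
    println!("{} -> {} bytes", source.0.len(), edited.0.len());
}
```

The trade-off this makes visible is that every edit pays for one conversion and one re-freeze, so the pattern only pays off for strings that are cloned far more often than they are modified.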
  • 2. Vec

    In the benchmarks of Vec and its alternatives, SmallVec outperforms Vec on both large and small data volumes. EcoVec clones quickly but performs poorly in every other Vec operation. We should use EcoVec where the data is mostly cloned, and SmallVec in all other cases.

    • We should benchmark the conversion between SmallVec and EcoVec. If the conversion overhead is tolerable compared to ordinary Vec operations, we can use SmallVec and EcoVec interchangeably to improve performance.
    • Convert all Vec to SmallVec
    • Convert all high-clone vectors to EcoVec
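
The conversion benchmark proposed for item 2 could take roughly the following shape. This is a dependency-free sketch, not the benchmark used in vec-bench: `Vec<u32>` stands in for the mutable representation (SmallVec) and `Arc<[u32]>` for the clone-optimized one (EcoVec), and `time_it`, `freeze`, and `thaw` are hypothetical helper names. A real measurement would use the actual crates and a proper benchmarking harness.

```rust
use std::hint::black_box;
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Run `f` `iters` times and return the total elapsed time.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed()
}

/// Mutable form -> shared form (stand-in for SmallVec -> EcoVec).
fn freeze(v: Vec<u32>) -> Arc<[u32]> {
    Arc::from(v)
}

/// Shared form -> mutable form (stand-in for EcoVec -> SmallVec).
fn thaw(shared: &Arc<[u32]>) -> Vec<u32> {
    shared.to_vec()
}

fn main() {
    let data: Vec<u32> = (0..1024).collect();
    let shared = freeze(data.clone());
    let iters = 10_000;

    // Cost of a round trip between the two representations.
    let convert = time_it(iters, || {
        let v = thaw(black_box(&shared));
        black_box(freeze(v));
    });

    // Cost of an ordinary Vec operation to compare against.
    let push = time_it(iters, || {
        let mut v = black_box(&data).clone();
        v.push(0);
        black_box(v);
    });

    // Cost of cloning the shared form (reference count bump only).
    let clone = time_it(iters, || {
        black_box(Arc::clone(black_box(&shared)));
    });

    println!("round trip: {convert:?}, clone + push: {push:?}, shared clone: {clone:?}");
}
```

Comparing the round-trip time against the ordinary operation is what decides whether switching representations on the fly is tolerable; the numbers printed here say nothing about SmallVec or EcoVec themselves.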
  • 3. HashMap

    The benchmarks make it clear that rustc_hash far outperforms the Rust standard library hasher in every case tested, so we should refactor all hash tables to use rustc_hash.

    Tracking PR: refactor(v2): use faster hasher to optimize #1941
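
To show why the hash table refactor in item 3 is mostly mechanical, here is a minimal sketch of plugging a different hasher into the standard HashMap through a type alias. `SimpleFastHasher`, `FastHashMap`, and `count_imports` are hypothetical names for illustration; the hasher is a simplified multiply-rotate function and is not the rustc_hash implementation. With the real crate, the alias would be replaced by `rustc_hash::FxHashMap`.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical, simplified multiply-rotate hasher, for illustration only.
/// It is NOT the rustc_hash implementation and is not DoS-resistant.
#[derive(Default)]
struct SimpleFastHasher {
    state: u64,
}

/// An odd 64-bit constant used to mix the state (illustrative choice).
const MULTIPLIER: u64 = 0x9E37_79B9_7F4A_7C15;

impl Hasher for SimpleFastHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.state = (self.state.rotate_left(5) ^ u64::from(byte)).wrapping_mul(MULTIPLIER);
        }
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// The main change call sites need: a different map type alias.
type FastHashMap<K, V> = HashMap<K, V, BuildHasherDefault<SimpleFastHasher>>;

/// Count how often each import specifier occurs.
fn count_imports(specifiers: &[&str]) -> FastHashMap<String, usize> {
    // `default()` replaces `new()`, which only exists for the std hasher.
    let mut counts = FastHashMap::default();
    for spec in specifiers {
        *counts.entry(spec.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_imports(&["react", "vue", "react"]);
    assert_eq!(counts["react"], 2);
    assert_eq!(counts.get("vue"), Some(&1));
    println!("{} distinct specifiers", counts.len());
}
```

Because the map API is unchanged, most of the refactor consists of swapping type names and replacing `HashMap::new()` with `default()`. The faster hashers give up the collision-attack resistance of the standard hasher, which is usually acceptable for a compiler's internal tables.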

  • 4. Check whether we do a lot of string searching; if so, try SIMD (needs benchmarks)

  • 5. Check if it's OK to use other allocators to optimize performance
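
For item 5, the mechanism for swapping allocators is the `#[global_allocator]` attribute. The sketch below does not use any alternative allocator: it wraps the system allocator in a hypothetical `CountingAllocator` to show where the swap happens and to count allocation calls, which can help judge how allocation-heavy a code path is before trying a different allocator. An alternative allocator crate would be installed through the same attribute.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards to the system allocator and counts calls.
struct CountingAllocator;

/// Process-wide number of allocation calls seen so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Swapping the process-wide allocator is a one-line change in the binary crate.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Run `f` and return its result with the number of allocation calls made
/// meanwhile (all threads are counted, so treat it as an upper bound).
fn allocations_during<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let result = f();
    (result, ALLOCATIONS.load(Ordering::Relaxed) - before)
}

fn main() {
    let (names, count) = allocations_during(|| {
        (0..100).map(|i| format!("module_{i}")).collect::<Vec<String>>()
    });
    println!("{} strings took {} allocation calls", names.len(), count);
}
```

Only one global allocator can be set per program, so this choice belongs in the final binary rather than in library crates.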

@fu050409 fu050409 added the enhancement: pending triage Untriaged enhancement label Mar 11, 2025
@fu050409 fu050409 self-assigned this Mar 11, 2025
@fu050409 fu050409 added enhancement New feature or request good first issue Good for newcomers performance Performance dependent v2.0 v2.0 version and removed enhancement: pending triage Untriaged enhancement labels Mar 12, 2025