Skip to content

Conversation

jdksjolen
Copy link
Contributor

@jdksjolen jdksjolen commented Aug 29, 2025

Hi,

The MemBaseline used to access the VMT instance directly without a lock. We fix that, and we switch from using a LinkedList to a copied RegionsTree instead.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8366363: MemBaseline accesses VMT without using lock (Bug - P4)

Contributors

  • Casper Norrbin <cnorrbin@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27003/head:pull/27003
$ git checkout pull/27003

Update a local copy of the PR:
$ git checkout pull/27003
$ git pull https://git.openjdk.org/jdk.git pull/27003/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27003

View PR using the GUI difftool:
$ git pr show -t 27003

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27003.diff

Using Webrev

Link to Webrev Comment

@jdksjolen jdksjolen force-pushed the cleanup-membaseline branch from 80e5eef to bfa2d72 Compare August 29, 2025 12:14
@bridgekeeper
Copy link

bridgekeeper bot commented Aug 29, 2025

👋 Welcome back jsjolen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Aug 29, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title 8366363 8366363: MemBaseline accesses VMT without using lock Aug 29, 2025
@openjdk
Copy link

openjdk bot commented Aug 29, 2025

@jdksjolen The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Aug 29, 2025
@jdksjolen jdksjolen force-pushed the cleanup-membaseline branch from bfa2d72 to 206b36d Compare August 29, 2025 16:50
@jdksjolen
Copy link
Contributor Author

/label hotspot-runtime

@jdksjolen jdksjolen marked this pull request as ready for review August 29, 2025 18:53
@openjdk openjdk bot added rfr Pull request is ready for review hotspot-runtime hotspot-runtime-dev@openjdk.org labels Aug 29, 2025
@openjdk
Copy link

openjdk bot commented Aug 29, 2025

@jdksjolen
The hotspot-runtime label was successfully added.

@mlbridge
Copy link

mlbridge bot commented Aug 29, 2025

@@ -451,6 +451,13 @@ class RBTree : public AbstractRBTree<K, RBNode<K, V>, COMPARATOR> {

public:
RBTree() : BaseType(), _allocator() {}
RBTree(const RBTree& other) : BaseType(), _allocator() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caspernorrbin , I added this. Does it look correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct to me, as long as the value type can also be properly copy-constructible. Maybe we can add in an assert to avoid potential future misuse? :-)

Something like:
assert(std::is_copy_constructible<V>(), "Value type must be copy-constructible");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can do that! I'll put it in a static_assert even.

@openjdk
Copy link

openjdk bot commented Sep 2, 2025

@jdksjolen Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@jdksjolen
Copy link
Contributor Author

Oops, I accidentally committed to my current branch, I meant to commit to a new one. I'm going to rewrite history a bit, sorry bot.

Copy link
Contributor

@afshin-zafari afshin-zafari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jdksjolen for taking this issue.
I have only one concern when we copy a RBTree to a new one. We do a visit_in_order on the source tree and upsert the key-value to the destination tree. We do a sorted access on both trees. We can avoid one, by a new function like: visit_all() for the source tree, or for the destination tree use the hint_node somehow (that I don't know how, ping @caspernorrbin) in RBTree::upsert(K, V, Node* hint_node)

@caspernorrbin
Copy link
Member

We can avoid one, by a new function like: visit_all() for the source tree, or for the destination tree use the hint_node somehow (that I don't know how, ping @caspernorrbin) in RBTree::upsert(K, V, Node* hint_node)

I don't think we would see much improvement iterating differently on the source tree. However, for the destination tree we traverse down the tree for each upsert, which could have quite an impact on large trees. hint_node decides where to start searching from, so if we cache the previous node we can avoid this traversal and start directly at the insert point.

To achieve this we could change the return type of upsert from void to RBNode<K, V>*, which is easily done and change the copy-constructor to something like:

  RBTree(const RBTree& other) : BaseType(), _allocator() {
    static_assert(std::is_copy_constructible<K>::value, "Key type must be copy-constructible when copying a RBTree");
    static_assert(std::is_copy_constructible<V>::value, "Value type must be copy-constructible when copying a RBTree");
    RBNode<K, V>* prev_node = nullptr;
    other.visit_in_order([&](RBNode<K, V>* node) {
      RBNode<K, V>* new_node = this->upsert(node->key(), node->val(), prev_node);
      prev_node = new_node;
      return true;
    });
  }

@jdksjolen
Copy link
Contributor Author

We can avoid one, by a new function like: visit_all() for the source tree, or for the destination tree use the hint_node somehow (that I don't know how, ping @caspernorrbin) in RBTree::upsert(K, V, Node* hint_node)

I don't think we would see much improvement iterating differently on the source tree. However, for the destination tree we traverse down the tree for each upsert, which could have quite an impact on large trees. hint_node decides where to start searching from, so if we cache the previous node we can avoid this traversal and start directly at the insert point.

To achieve this we could change the return type of upsert from void to RBNode<K, V>*, which is easily done and change the copy-constructor to something like:

  RBTree(const RBTree& other) : BaseType(), _allocator() {
    static_assert(std::is_copy_constructible<K>::value, "Key type must be copy-constructible when copying a RBTree");
    static_assert(std::is_copy_constructible<V>::value, "Value type must be copy-constructible when copying a RBTree");
    RBNode<K, V>* prev_node = nullptr;
    other.visit_in_order([&](RBNode<K, V>* node) {
      RBNode<K, V>* new_node = this->upsert(node->key(), node->val(), prev_node);
      prev_node = new_node;
      return true;
    });
  }

I think we need to switch out the iterator for that to make sense, specifically it needs to be DFS. Anyway, let's do that.

@afshin-zafari
Copy link
Contributor

When we do a visit_in_order we can set next(right node) of new nodes one by one. IOW, since Key-Values are retrieved in order, we can insert them one after another. Right?
Inspired by the visit_in_order() itself: there we first get leftmost() and then use next() in a loop.

@openjdk
Copy link

openjdk bot commented Sep 2, 2025

@jdksjolen Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@jdksjolen
Copy link
Contributor Author

Sorry about the ridiculous amount of commits. For each one I thought "this is the last one". I've created a temp branch that I'll push to so you don't have to see this WIP.

enum Dir { Left, Right };
struct node_pair { const RBNode<K, V>* current; RBNode<K, V>* other_parent; Dir d; };
struct stack {
node_pair s[64];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify the choice of 64 as an upper bound? It's not immediately obvious to me if we guarantee (or if we don't care about) the maximum visitation size in the code below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an arbitrary depth (it's used in other parts of the RBTree code as well). The maximum length of the stack represents the maximum depth of the tree that we support. A perfectly balanced binary tree of depth 64 has 2**64 - 1 nodes, so in practice this will never be reached.

@jdksjolen
Copy link
Contributor Author

/contributo add @caspernorrbin

@jdksjolen
Copy link
Contributor Author

/contributor add @caspernorrbin

@openjdk
Copy link

openjdk bot commented Sep 2, 2025

@jdksjolen this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout cleanup-membaseline
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Sep 2, 2025
@openjdk
Copy link

openjdk bot commented Sep 2, 2025

@jdksjolen Unknown command contributo - for a list of valid commands use /help.

@openjdk
Copy link

openjdk bot commented Sep 2, 2025

@jdksjolen
Contributor Casper Norrbin <cnorrbin@openjdk.org> successfully added.

while ((rgn = itr.next()) != nullptr) {
VirtualMemoryAllocationSite tmp(*rgn->call_stack(), rgn->mem_tag());
bool failed_oom = false;
_vma_allocs_replacement->visit_reserved_regions([&](ReservedMemoryRegion& rgn) {
Copy link
Contributor

@afshin-zafari afshin-zafari Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _vma_allocs_replacement is used only locally here. Instead of creating a copy of the VMATree, we can do our loop here within a lock. Can't we?
Then we save one tree traverse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used here:

void MemDetailReporter::report_virtual_memory_map() {
  // Virtual memory map always in base address order
  output()->print_cr("Virtual memory map:");
  _baseline.virtual_memory_allocations()->visit_reserved_regions([&](ReservedMemoryRegion& rgn) {
    report_virtual_memory_region(&rgn);
    return true;
  });
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org merge-conflict Pull request has merge conflict with target branch rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

4 participants