Getting current worker number #88

Open
ahmafi opened this issue Nov 27, 2024 · 3 comments


@ahmafi

ahmafi commented Nov 27, 2024

Are there any ids or index numbers assigned to workers? I want to pass the worker number to the Submit function.

@alitto
Owner

alitto commented Nov 27, 2024

Hi @ahmafi!
As of version 2.1.5 there is no mechanism to get the worker ID/number. I'm curious about your use case. If you can, could you provide a bit more context?

@ahmafi
Author

ahmafi commented Nov 27, 2024

Hi, I'm spawning a headless Chrome in each pool and I want to keep a separate Chrome userDataDir for each pool, so they can benefit from the cached content. They can't share a userDataDir, so the best solution IMO is to use one per running pool. I could create them before starting the pool and pass them in, but my code is already written so that it creates them in the worker function based on the pool index.

@alitto
Owner

alitto commented Nov 28, 2024

Oh, I see. IMO your use case could benefit from a sync.Pool, which is part of Go's sync package. It is meant to hold a "pool" of reusable objects that can be safely obtained from multiple concurrent goroutines.
Assuming you create such a pool of "user data dirs" using sync.Pool, you can cap the number of concurrent Chrome processes by using a pond pool with limited concurrency, like this:

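package main

import (
    "log"
    "os"
    "sync"

    "github.com/alitto/pond/v2"
)

// launchChromeProcess is a placeholder: replace it with the code that starts
// headless Chrome with the given user data dir and waits for it to finish.
func launchChromeProcess(userDataDir string) {}

func main() {
    userDataDirPool := sync.Pool{
        New: func() any {
            // Create a new user data dir in a random location
            // (the "chrome-user-data-" prefix is only an example)
            dir, err := os.MkdirTemp("", "chrome-user-data-")
            if err != nil {
                log.Fatal(err)
            }
            return dir
        },
    }

    // Create a pool with a max concurrency of 10
    pool := pond.NewPool(10)
    defer pool.StopAndWait()

    // Submit several concurrent tasks
    for i := 0; i < 1000; i++ {
        pool.Submit(func() {
            // Get a dir from the pool (it may be a new one or reused)
            userDataDir := userDataDirPool.Get().(string)
            // Return the user data dir to the pool when the task finishes
            defer userDataDirPool.Put(userDataDir)

            // Optionally, clean up the user data dir in case some of its contents cannot be reused
            // cleanupDir(userDataDir)

            // Launch chrome process using the obtained userDataDir
            launchChromeProcess(userDataDir)
        })
    }
}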

The max concurrency limit on pond.Pool ensures that no more than 10 Chrome processes run in parallel, while the sync.Pool guarantees that only one worker is using each user data dir at any given time.
Please let me know if you find this useful and thanks for sharing your use case!
