Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Sorting Drivers based on the devices provided #259

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

nrspruit
Copy link
Contributor

  • Added support for reporting drivers sorted based on the types of devices returned.

  • By default, drivers will be sorted such that the ordering will be:

    • Drivers with Discrete GPUs only - Drivers with Discrete and Integrated GPUs - Drivers with Integrated GPUs - Drivers with Mixed Devices Types (ie GPU + NPU) - Drivers with Non GPU Devices Only
  • If ZE_ENABLE_PCI_ID_DEVICE_ORDER is set, then the following ordering is provided:

    • Drivers with Integrated GPUs - Drivers with Discrete and Integrated GPUs - Drivers with Discrete GPUs only - Drivers with Mixed Devices Types (ie GPU + NPU) - Drivers with Non GPU Devices Only
  • Expanded the zello_world sample to test both init paths for easy prototyping.

- Added support for reporting drivers sorted based on the types of
  devices returned.
- By default, drivers will be sorted such that the ordering will be:
	- Drivers with Discrete GPUs only
        - Drivers with Discrete and Integrated GPUs
        - Drivers with Integrated GPUs
        - Drivers with Mixed Devices Types (ie GPU + NPU)
        - Drivers with Non GPU Devices Only
- If ZE_ENABLE_PCI_ID_DEVICE_ORDER is set, then the following ordering
  is provided:
	- Drivers with Integrated GPUs
        - Drivers with Discrete and Integrated GPUs
        - Drivers with Discrete GPUs only
        - Drivers with Mixed Devices Types (ie GPU + NPU)
        - Drivers with Non GPU Devices Only

- Expanded the zello_world sample to test both init paths for easy
  prototyping.

Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
@nrspruit nrspruit marked this pull request as ready for review January 15, 2025 14:36
@@ -61,13 +62,22 @@ int main( int argc, char *argv[] )
{
tracing_runtime_enabled = true;
}
if( argparse( argc, argv, "-legacy", "--enable_legacy_init" ) )
Copy link
Contributor

@rwmcguir rwmcguir Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor suggestion, rename -legacy to -legacy_init as many other future options may be legacy in nature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -26,6 +26,18 @@
#include "spdlog/spdlog.h"
namespace loader
{
///////////////////////////////////////////////////////////////////////////////
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional, put a comment here that the ordering here is critical and/or used by sorting algorithms, so care should be used when adding. Also this should NOT be used anyplace else so we can modify this ENUM (I see this is internal so should be fine, but let's ensure it stays internal) without ABI issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This remains internal and is not used outside of the loader nor released. I updated the comment accordingly.

rwmcguir
rwmcguir previously approved these changes Jan 16, 2025
Copy link
Contributor

@rwmcguir rwmcguir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than silly comments, I don't have any commentary on the ordering mechanism. Seems simple and easily modifiable via the ENUM, which is ... well pretty cool.

Comment on lines +33 to +34
ZEL_DRIVER_TYPE_DISCRETE_GPU= 0, ///< The driver has Discrete GPUs only
ZEL_DRIVER_TYPE_GPU = 1, ///< The driver has Heterogenous GPU types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that iGPU can be assumed as slower than dGPU but why assuming that dGPU only driver is better than dGPU + iGPU ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an assumption based on an arch review. This is not guaranteed to always be true which is why customers should always ensure the right device is being used in their workloads and not always assume device 0 is the best.

@@ -135,6 +135,47 @@ namespace loader
return flags_value;
}

bool driverSortComparitor(const driver_t &a, const driver_t &b) {
bool pciOrderingRequested = getenv_tobool( "ZE_ENABLE_PCI_ID_DEVICE_ORDER" );
Copy link
Contributor

@JablonskiMateusz JablonskiMateusz Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case of ZE_ENABLE_PCI_ID_DEVICE_ORDER set I believe that all runtimes should also respect the flag and expose lowest pci id first, so we can store pci_id of first device of each driver and the sort by that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I can remove the redundant setting and set in the first call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest, the comparator function now reads the driver's pci device order setting such that the env is only read one time and then the value is set for all drivers.

Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants