[RFC] Introduce Clock Management Subsystem (Clock Driver Based) #72102

Open
wants to merge 27 commits into base: main

Conversation

danieldegrasse
Collaborator

@danieldegrasse danieldegrasse commented Apr 29, 2024

Introduction

This PR proposes a clock management subsystem. It is opened as an alternative to #70467. The eventual goal would be to replace the existing clock control drivers with implementations using the clock management subsystem. This subsystem defines clock control hardware within the devicetree, and abstracts configuration of clocks to "clock states", which reference the clock control devicetree nodes in order to configure the clock tree.

Problem description

The core goal of this change is to provide a more device-agnostic way to manage clocks. Although the clock control subsystem does define clocks as an opaque type, drivers themselves still often need to be aware of the underlying type behind this opaque definition, and there is no standard for how many cells will be present on a given clock controller, so implementation details of the clock driver are prone to leaking into drivers. This presents a problem for vendors that reuse IP blocks across SOC lines with different clock controller drivers.

Beyond this, the clock control subsystem doesn't provide a simple way to configure clocks. clock_control_configure and clock_control_set_rate are both available, but clock_control_configure is ripe for leaking implementation details to the driver, and clock_control_set_rate will likely require runtime calculations to achieve requested clock rates that aren't realistic for small embedded devices (or leak implementation details, if clock_control_subsys_rate_t isn't an integer).

Proposed change

This proposal provides the initial infrastructure for clock management, as well as an implementation on the LPC55S69 and an initial driver conversion for the Flexcomm serial driver (mostly for demonstration purposes). Long term, the goal would be to transition all SOCs to this subsystem, and deprecate the clock control API. The subsystem has been designed so it can exist alongside clock control (much like pinmux and pinctrl) in order to make this transition smoother.

The application is expected to assign clock states within devicetree, so the driver should have no awareness of the contents of a clock state, only the target state name. Clock outputs are also assigned within the SOC devicetree, so drivers do not see the details of these either.

In order to fully abstract clock management details from consumers, the clock management layer is split into two layers:

  • clock management layer: public facing, used by consumers to query rates from and reconfigure their clocks
  • clock driver layer: internal to clock management, used by clock drivers to query rates from and reconfigure their parent sources

This split is required because not all applications want the flash overhead of enabling runtime rate resolution, so clock states need to be opaque to the consumer. When a consumer requests a rate directly via clock_mgmt_req_rate, the request will be satisfied by one of the predefined states for the clock, unless runtime rate resolution is enabled. Consumers can also apply states directly via clock_mgmt_apply_state.
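
As a rough illustration of the fallback described above, the sketch below picks a predefined state for a requested frequency range when no runtime rate resolution is available. The names `sketch_clock_state` and `sketch_pick_state` are hypothetical and are not part of the API proposed in this PR; the real subsystem may select states differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a statically defined clock state */
struct sketch_clock_state {
	uint32_t freq_hz; /* frequency this state produces, known at build time */
};

/*
 * Return the index of the first predefined state whose frequency lies in
 * [min_hz, max_hz], or -1 if no state satisfies the request.
 */
static int sketch_pick_state(const struct sketch_clock_state *states,
			     size_t count, uint32_t min_hz, uint32_t max_hz)
{
	assert(states != NULL);

	for (size_t i = 0; i < count; i++) {
		if (states[i].freq_hz >= min_hz && states[i].freq_hz <= max_hz) {
			return (int)i;
		}
	}
	return -1;
}
```

Because the candidate frequencies are constants, a lookup like this needs no PLL or divider math at runtime, which is the flash saving the split is meant to preserve.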

Detailed RFC

Clock Management Layer

The clock management layer is the public API that consumers use to interact with clocks. Each consumer will have a set of clock states defined, along with an array of clock outputs. Consumers can query rates of their output clocks, and apply new clock states at any time.

The Clock Management API exposes the following functions:

  • clock_mgmt_get_rate: Reads a clock rate from a given clock output in Hz
  • clock_mgmt_apply_state: Applies a new clock state from those defined in the consumer's devicetree node
  • clock_mgmt_set_callback: Sets a callback to fire before any of the clock outputs defined for this consumer are reconfigured. A negative return value from this callback will prevent the clock from being reconfigured.
  • clock_mgmt_disable_unused: Disables any clock sources that are not actively in use
  • clock_mgmt_req_rate: Request a frequency range from a clock output

A given clock consumer might define clock states and outputs like so:

&mydev_clock_source {
    mydev_state_default: mydev-state-default {
        compatible = "clock-state";
        clock-frequency = <DT_FREQ_M(10)>;
        clocks = <&mydev_div 1 &mydev_mux 3>;
    };

    mydev_state_sleep: mydev-state-sleep {
        compatible = "clock-state";
        clock-frequency = <DT_FREQ_M(1)>;
        clocks = <&mydev_div 2 &mydev_mux 0>;
    };
};

mydev {
    ...
    compatible = "vnd,mydev";
    clock-outputs = <&mydev_clock_source>;
    clock-output-names = "default";
    clock-state-0 = <&mydev_state_default>;
    clock-state-1 = <&mydev_state_sleep>;
    clock-state-names = "default", "sleep";
    ...
};

Note that the cells for each node within the clocks property of a state are specific to that node's compatible. It is expected that these values will be used to configure settings like multiplexer selections or divider values directly.

The consumer's driver would then interact with the clock management API like so:

/* A driver for the "vnd,mydev" compatible device */
#define DT_DRV_COMPAT vnd_mydev

...
#include <zephyr/drivers/clock_mgmt.h>
...

struct mydev_config {
    ...
    /* Reference to default clock */
    const struct clock_output *clk;
    /* clock default state */
    const clock_mgmt_state_t default_state;
    /* clock sleep state */
    const clock_mgmt_state_t sleep_state;
    ...
};

...

int mydev_clock_cb(const struct clock_mgmt_event *ev, const void *data)
{
    const struct device *dev = (const struct device *)data;

    if (ev->new_rate > HS_MAX_CLK_RATE) {
        /* Can't support this new rate */
        return -ENOTSUP;
    }
    if (ev->type == CLOCK_MGMT_POST_RATE_CHANGE) {
        /* Handle clock rate change ... */
    }
    return 0;
}

static int mydev_init(const struct device *dev)
{
    const struct mydev_config *config = dev->config;
    int clk_rate;
    int ret;
    ...
    /* Set default clock to default state */
    clk_rate = clock_mgmt_apply_state(config->clk, config->default_state);
    if (clk_rate < 0) {
        return clk_rate;
    }
    /* Register for a callback if default clock changes rate */
    ret = clock_mgmt_set_callback(config->clk, mydev_clock_cb, dev);
    if (ret < 0) {
        return ret;
    }
    ...
}

#define MYDEV_DEFINE(i)                                                    \
    /* Define clock output for default clock */                            \
    CLOCK_MGMT_DT_INST_DEFINE_OUTPUT_BY_NAME(i, default);                  \
    ...                                                                    \
    static const struct mydev_config mydev_config_##i = {                  \
        ...                                                                \
        /* Initialize clock output */                                      \
        .clk = CLOCK_MGMT_DT_INST_GET_OUTPUT_BY_NAME(i, default),          \
        /* Read states for default and sleep */                            \
        .default_state = CLOCK_MGMT_DT_INST_GET_STATE(i, default,          \
						         default),         \
        .sleep_state = CLOCK_MGMT_DT_INST_GET_STATE(i, default,            \
						       sleep),             \
        ...                                                                \
    };                                                                     \
    static struct mydev_data mydev_data##i;                                \
    ...                                                                    \
								           \
    DEVICE_DT_INST_DEFINE(i, mydev_init, NULL, &mydev_data##i,             \
		          &mydev_config_##i, ...);

DT_INST_FOREACH_STATUS_OKAY(MYDEV_DEFINE)

Requesting Clock Rates versus Configuring the Clock Tree

Clocks can be configured using one of two methods: a rate can be requested using clock_mgmt_req_rate, or a state can be applied directly using clock_mgmt_apply_state. If CONFIG_CLOCK_MGMT_SET_RATE is enabled, clock rate requests can also be resolved at runtime, which may result in more accurate clocks for a given request. However, some clock configurations may only be possible by directly applying a state using clock_mgmt_apply_state.

Directly Configuring Clock Tree

For flash optimization or advanced use cases, the devicetree can be used to configure clock nodes directly with driver-specific data. Each clock node in the tree defines a set of specifiers within its compatible, which can be used to configure node specific settings. Each node defines two macros to parse these specifiers, based on its compatible: Z_CLOCK_MGMT_xxx_DATA_DEFINE and Z_CLOCK_MGMT_xxx_DATA_GET (where xxx is the device compatible string as an uppercase token). The expansion of Z_CLOCK_MGMT_xxx_DATA_GET for a given node and set of specifiers will be passed to the clock_configure function as a void * when that clock state is applied. This allows the user to configure clock node specific settings directly (for example, the precision targeted by a given oscillator or the frequency generation method used by a PLL). It can also be used to reduce flash usage, as parameters like PLL multiplication and division factors can be set in the devicetree, rather than being calculated at runtime.

Defining a clock state that directly configures the clock tree might look like so:

&mydev_clock_source {
    mydev_state_default: mydev-state-default {
        compatible = "clock-state";
        clock-frequency = <DT_FREQ_M(10)>;
        clocks = <&mydev_div 1 &mydev_mux 3>;
    };
};

This would set up the mydev_mux and mydev_div using hardware-specific settings (given by their specifiers). In this case, these settings might be selected so that the clock output of mydev_clock_source would be 10MHz, matching the clock-frequency property of the state.

Runtime Clock Requests

When CONFIG_CLOCK_MGMT_RUNTIME is enabled, clock requests issued via clock_mgmt_req_rate will be aggregated, so that each request from a consumer is combined into one set of clock constraints. This means that if a consumer makes a request, that request is "sticky", and the clock output will reject an attempt to reconfigure it to a range outside of the requested frequency. For clock states in devicetree, the same "sticky" behavior can be achieved by adding the locking-state property to the state definition. This should be done for states on critical clocks, such as the CPU core clock, that should not be reconfigured due to another consumer applying a clock state.

Clock Driver Layer

The clock driver layer describes clock producers available on the system. Within an SOC clock tree, individual clock nodes (e.g. clock multiplexers, dividers, and PLLs) are considered separate producers, and should have separate devicetree definitions and drivers. Clock drivers can implement the following API functions:

  • notify: Called by parent clock to notify child it is about to reconfigure to a new clock rate. Child clock can return error if this rate is not supported, or simply calculate its new rate and forward the notification to its own children
  • get_rate: Called by child clock to request frequency of this clock in Hz
  • configure: Called directly by clock management subsystem to reconfigure the clock node. Clock node should notify children of its new rate
  • round_rate: Called by a child clock to request best frequency in Hz a parent can produce, given a requested target frequency
  • set_rate: Called by a child clock to set parent to best frequency in Hz it can produce, given a requested target frequency

To implement these APIs, the clock drivers are expected to make use of the clock driver API. This API has the following functions:

  • clock_get_rate: Read the rate of a given clock
  • clock_round_rate: Get the best clock frequency a clock can produce given a requested target frequency
  • clock_set_rate: Set a clock to the best clock frequency it can produce given a requested target frequency. Also calls clock_lock on the clock to prevent future reconfiguration by clocks besides the one taking ownership
  • clock_notify_children: Notify all clock children that this clock is about to reconfigure to produce a new rate.
  • clock_children_check_rate: Verify that children can accept a new rate
  • clock_children_notify_pre_change: Notify children a clock is about to reconfigure
  • clock_children_notify_post_change: Notify children a clock has reconfigured
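
To make the round_rate idea concrete, here is a minimal sketch for a plain integer divider with a fixed parent rate. It is a simplification: a real driver would likely also ask its parent what it can produce via clock_round_rate. The function name `sketch_div_round_rate` and the divider model are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute difference between two rates */
static uint32_t sketch_abs_diff(uint32_t a, uint32_t b)
{
	return (a > b) ? a - b : b - a;
}

/*
 * Hypothetical integer divider: output = parent_hz / div, with
 * 1 <= div <= max_div. Returns the achievable rate closest to target_hz.
 */
static uint32_t sketch_div_round_rate(uint32_t parent_hz, uint32_t max_div,
				      uint32_t target_hz)
{
	uint32_t best_hz = parent_hz; /* div = 1 */
	uint32_t best_err = sketch_abs_diff(parent_hz, target_hz);

	assert(max_div >= 1);

	for (uint32_t div = 2; div <= max_div; div++) {
		uint32_t hz = parent_hz / div;
		uint32_t err = sketch_abs_diff(hz, target_hz);

		if (err < best_err) {
			best_err = err;
			best_hz = hz;
		}
	}
	return best_hz;
}
```

A matching set_rate would run the same search and then program the chosen divider value. This search loop is the kind of code that only needs to be linked in when runtime rate setting is enabled.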

As an example, a vendor multiplexer driver might get its rate like so:

static int vendor_mux_get_rate(const struct clk *clk_hw)
{
	const struct vendor_mux_get_rate *data = clk_hw->hw_data;
	int sel = *data->reg & VND_MUX_MASK;
	
	/* Return rate of active parent */
	return clock_get_rate(data->parents[sel]);
}
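
For readers who want to experiment outside Zephyr, the following self-contained sketch models the same parent lookup with mock types. Everything prefixed `sketch_` is hypothetical, a plain variable stands in for the selector register, and only get_rate is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_clk;

/* Hypothetical, trimmed-down driver API: only get_rate is modelled */
struct sketch_clk_api {
	int (*get_rate)(const struct sketch_clk *clk);
};

struct sketch_clk {
	const struct sketch_clk_api *api;
	const void *hw_data;
};

/* Generic entry point, analogous in spirit to clock_get_rate() */
static int sketch_clock_get_rate(const struct sketch_clk *clk)
{
	assert(clk != NULL && clk->api != NULL);
	return clk->api->get_rate(clk);
}

/* Fixed-rate source: hw_data points at the rate in Hz */
static int sketch_fixed_get_rate(const struct sketch_clk *clk)
{
	return *(const int *)clk->hw_data;
}

/* Multiplexer: a variable stands in for the selector register */
struct sketch_mux_data {
	const struct sketch_clk *const *parents;
	size_t parent_count;
	const uint32_t *sel;
};

static int sketch_mux_get_rate(const struct sketch_clk *clk)
{
	const struct sketch_mux_data *data = clk->hw_data;

	if (*data->sel >= data->parent_count) {
		return -1; /* selector points at no valid parent */
	}
	/* Rate of a mux is the rate of its active parent */
	return sketch_clock_get_rate(data->parents[*data->sel]);
}

static const struct sketch_clk_api sketch_fixed_api = {
	.get_rate = sketch_fixed_get_rate,
};
static const struct sketch_clk_api sketch_mux_api = {
	.get_rate = sketch_mux_get_rate,
};
```

Each node only knows its parents and its own hardware data, so a rate query walks up the tree one get_rate call at a time until it reaches a fixed source.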

SOC devicetrees must define all clock outputs in devicetree. This approach is required because clock producers can reference each node directly in a clock state, in order to configure the clock tree without needing to request a clock frequency and have it applied at runtime.

An SOC clock tree therefore might look like the following:

mydev_clock_mux: mydev-clock-mux@400002b0 {
	compatible = "vnd,clock-mux";
	#clock-cells = <1>;
	reg = <0x400002b0 0x3>;
	offset = <0x0>;
	/* Other clock nodes that provide inputs to this multiplexer */
	input-sources = <&fixed_12m_clk &pll &fixed_96m_clk &no_clock>;
	#address-cells = <1>;
	#size-cells = <1>;

	/* Divider whose parent is this multiplexer */
	mydev_clock_div: mydev-clock-div@40000300 {
		compatible = "vnd,clock-div";
		#clock-cells = <1>;
		reg = <0x40000300 0x8>;

		mydev_clock_source: mydev-clock-source {
			compatible = "clock-output";
			#clock-cells = <1>;
		};
	};
};

Producers can provide specifiers when configuring a node, which will be used by the clock subsystem to determine how to configure the clock. For a clock node with the compatible vnd,clock-compat, the following macros must be defined:

  • Z_CLOCK_MGMT_VND_CLOCK_COMPAT_DATA_DEFINE: Defines any static data that is needed to configure this clock
  • Z_CLOCK_MGMT_VND_CLOCK_COMPAT_DATA_GET: Gets reference to previously defined static data to configure this clock. Cast to a void* and passed to clock_configure. For simple clock drivers, this may be the only definition needed.

For example, the vnd,clock-mux compatible might have one specifier: "selector", and the following macros defined:

/* No data structure needed for mux */
#define Z_CLOCK_MGMT_VND_CLOCK_MUX_DATA_DEFINE(node_id, prop, idx)
/* Get mux configuration value */
#define Z_CLOCK_MGMT_VND_CLOCK_MUX_DATA_GET(node_id, prop, idx)         \
	DT_PHA_BY_IDX(node_id, prop, idx, selector)

The value that Z_CLOCK_MGMT_VND_CLOCK_MUX_DATA_GET expands to will be passed to the clock_configure API call for the driver implementing the vnd,clock-mux compatible. Such an implementation might look like the following:

static int vendor_mux_configure(const struct clk *clk_hw, const void *mux)
{
	const struct vendor_mux_get_rate *data = clk_hw->hw_data;
	int ret;
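	/* The specifier value is carried in the pointer itself */
	uint32_t mux_val = (uint32_t)(uintptr_t)mux;
	int current_rate = clock_get_rate(clk_hw);
	int new_rate;

	/* Valid selections index into parents[], so parent_count is out of range */
	if (mux_val >= data->parent_count) {
		return -EINVAL;
	}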
	
	new_rate = clock_get_rate(data->parents[mux_val]);

	/* Notify children of new rate, and check if they can accept it */
	ret = clock_children_check_rate(clk_hw, new_rate);
	if (ret < 0) {
		return ret;
	}
	
	/* Notify children we are about to change rate */
	ret = clock_children_notify_pre_change(clk_hw, current_rate, new_rate);
	if (ret < 0) {
		return ret;
	}

	(*data->reg) = mux_val;
	
	/* Notify children we have changed rate */
	ret = clock_children_notify_post_change(clk_hw, current_rate, new_rate);
	if (ret < 0) {
		return ret;
	}
	return 0;
}

A clock state to set the mydev_clock_mux to use the pll clock as input would then look like this:

clocks = <&mydev_clock_mux 1>;

Note the mydev_clock_source leaf node in the clock tree above. These nodes must exist as children of any clock node that can be used by a peripheral, and the peripheral must reference the mydev_clock_source node in its clock-outputs property. The clock management subsystem implements clock drivers for nodes with the clock-output compatible, which handles mapping the clock management APIs to internal clock driver APIs.

Framework Configuration

Since the clock management framework would likely be included with every SOC build, several Kconfig options are defined to enable/disable features that will not be needed for every application and that increase flash usage when enabled. These Kconfig options are the following:

  • CONFIG_CLOCK_MGMT_RUNTIME: Enables clocks to notify children of reconfiguration. Needed any time that peripherals will reconfigure clocks at runtime, or if clock_mgmt_disable_unused is used. Also makes requests from consumers to clock_mgmt_req_rate aggregate, so that if a consumer makes a request that the clock accepts, it is guaranteed the clock will not change frequency outside of those parameters.
  • CONFIG_CLOCK_MGMT_SET_RATE: Enables clocks to calculate a new rate and apply it at runtime. When enabled, clock_mgmt_req_rate will use runtime rate resolution if no statically defined clock states satisfy a request. Also enables CONFIG_CLOCK_MGMT_RUNTIME.

Dependencies

This is of course a large change. I'm opening the RFC early for review, but if we choose this route for clock management we will need to create a tracking issue and follow a transition process similar to how we did for pin control.

Beyond this, there are a few key dependencies I'd like to highlight:

  • in order to reduce flash utilization (by avoiding linking in children clock nodes an application does not need), I am referencing child clocks by clock handles (which work in a manner similar to device handles). This requires a 2 stage link process for runtime clock support as the first stage link must discard unused clock structures and the second stage link adds clock handles where needed
  • Some minimal devicetree scripting changes are needed to handle clock states

Flash usage

NOTE: these numbers are subject to change! They are presented simply as a rough benchmark of the flash impact of the clock framework with and without certain features.

The builds below all configure the clock tree to output 96MHz using the internal oscillator to drive the flexcomm0 serial, and configure the LPC55S69's PLL1 to output a core clock at 144MHz (derived from the 16MHz external crystal).

# Baseline, without clock management enabled
$ west build -p always -b lpcxpresso55s69//cpu0 samples/hello_world/ -DCONFIG_CLOCK_MANAGEMENT=n -DCONFIG_CLOCK_CONTROL=y
Memory region         Used Size  Region Size  %age Used
           FLASH:       16578 B       160 KB     10.12%
             RAM:        4240 B       192 KB      2.16%
        USB_SRAM:          0 GB        16 KB      0.00%
        IDT_LIST:          0 GB        32 KB      0.00%
# Disable clock control, enable clock management. 170 additional bytes of FLASH, 16 bytes of extra RAM
$ west build -p always -b lpcxpresso55s69//cpu0 samples/hello_world/ -DCONFIG_CLOCK_MANAGEMENT=y -DCONFIG_CLOCK_CONTROL=n
Memory region         Used Size  Region Size  %age Used
           FLASH:       16748 B       160 KB     10.22%
             RAM:        4256 B       192 KB      2.16%
        USB_SRAM:          0 GB        16 KB      0.00%
        IDT_LIST:          0 GB        32 KB      0.00%
# Clock Management with notification support. 2356 additional bytes of FLASH, 56 bytes of RAM		
$ west build -p always -b lpcxpresso55s69//cpu0 samples/hello_world/ -DCONFIG_CLOCK_MANAGEMENT=y -DCONFIG_CLOCK_CONTROL=n -DCONFIG_CLOCK_MANAGEMENT_RUNTIME=y
Memory region         Used Size  Region Size  %age Used
           FLASH:       19104 B       160 KB     11.66%
             RAM:        4312 B       192 KB      2.19%
        USB_SRAM:          0 GB        16 KB      0.00%
        IDT_LIST:          0 GB        32 KB      0.00%
# Clock Management with runtime rate setting and notification support. 4632 additional bytes of FLASH, 0 bytes of RAM		
$ west build -p always -b lpcxpresso55s69//cpu0 samples/hello_world/ -DCONFIG_CLOCK_MANAGEMENT=y -DCONFIG_CLOCK_CONTROL=n -DCONFIG_CLOCK_MANAGEMENT_RUNTIME=y -DCONFIG_CLOCK_MANAGEMENT_SET_RATE=y
Memory region         Used Size  Region Size  %age Used
           FLASH:       23736 B       160 KB     14.49%
             RAM:        4312 B       192 KB      2.19%
        USB_SRAM:          0 GB        16 KB      0.00%
        IDT_LIST:          0 GB        32 KB      0.00%

Concerns and Unresolved Questions

I'm unsure what the implications of requiring a 2 stage link process for all builds with the clock control framework that have runtime clocking enabled will be for build/testing time overhead.

In many ways, the concept of clock states duplicates operating points in Linux. I'm not sure if we want to instead define clock states as operating points. The benefit of this would be for peripherals (or CPU cores) that support multiple operating points and use a regulator to select them, since we could define the target voltage with the clock state.

Currently, we aggregate clock requests sent via clock_mgmt_req_rate within the clock output driver, and the clock output driver will handle rejecting any attempt to configure a frequency outside of the constraints that have been set on it. While this results in simple application usage, I am not sure if it would instead be better to rely on consumers to reject rates they cannot handle. An approach like this would likely use less flash.

Currently we also issue callbacks at three points:

  • to validate children can accept a new frequency
  • before setting a new rate for the clock
  • after setting a new rate for the clock

We could potentially issue only one callback, directly before reconfiguring the clock. The question is whether this would satisfy all use cases, or whether there are consumers that need to take one action right before their clock reconfigures and a different action after. One example I can think of is a CPU that needs to raise core voltage before its frequency rises, but would reduce core voltage after its core frequency drops.

Alternatives

The primary alternative to this PR would be #70467. That PR uses functions to apply clock states, while this PR implements the SOC clock backend using a method much more similar to the common clock framework, but wraps the implementation in a clock management subsystem using states, similar to how #70467 does. This allows us to work around the "runtime rate setting" issue, since this feature can now be optional.

@danieldegrasse danieldegrasse force-pushed the rfc/clock-mgmt-drivers branch 9 times, most recently from b4d7f94 to 8cc246a Compare April 29, 2024 21:08
@danieldegrasse danieldegrasse marked this pull request as ready for review April 29, 2024 22:07
@zephyrbot zephyrbot added labels Apr 29, 2024: area: UART, platform: NXP, area: native port, area: Linker Scripts, area: Build System, area: Devicetree Binding, platform: NXP Drivers, area: Devicetree
Add clock-device include to flexcomm, as it can be used as a clock
consumer within the clock subsystem.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add clock outputs for all LPC55Sxx flexcomm nodes, so these nodes can
request their frequency via the clock management subsystem

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add support for clock management to the serial flexcomm driver, dependent
on CONFIG_CLOCK_MGMT. When clock management is not enabled, the flexcomm
driver will fall back to the clock control API.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add support for clock management on CPU0. This requires adding clock
setup for the CPU0 core clock to run at 144MHz from PLL1, and adding
clock setup for the flexcomm0 uart node to use the FROHF clock input.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
For most builds, CONFIG_CLOCK_CONTROL is still required. However, for
simple applications that only use UART on the lpcxpresso55s69, it is now
possible to build with CONFIG_CLOCK_CONTROL=n and run the application as
expected. Move to implying this symbol so applications can opt to
disable it.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
The native_sim board will be used within the clock mgmt API test to
verify the clock management subsystem (using emulated clock node
drivers). Therefore, indicate support for clock mgmt in the board YAML
so that twister will run the clock mgmt API test on it.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add clock_management_api test. This test is intended to verify features
of the clock management API, including the following:
- verify that clock notification callbacks work as expected when a clock
  root is reconfigured
- verify that if a driver returns an error when configuring a clock,
  this will be propagated to the user.
- verify that consumer constraints will be able to block other consumers
  from reconfiguring clocks
- verify that consumers can remove constraints on their clocks

The test is supported on the `native_sim` target using emulated clock
drivers for testing purposes in CI, and on the
`lpcxpresso55s69/lpc55s69/cpu0` target to verify the clock management
API on real hardware.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add clock management hardware test. This test applies a series of clock
states for a given consumer, and verifies that each state produces the
expected rate. This test is intended to verify that each clock node
driver within an SOC implementation works as expected.

Boards should define their test overlays for this test to exercise as
much of the clock tree as possible, and ensure some clock states do not
define an explicit clocks property, to test their `clock_set_rate` and
`clock_round_rate` implementations.

Initial support is added for the `lpcxpresso55s69/lpc55s69/cpu0` target,
as this is the only hardware supporting clock management.

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
@dleach02 dleach02 dismissed stale reviews from cfriedt and XenuIsWatching via 0faa5e4 December 10, 2024 21:11
@dleach02 dleach02 force-pushed the rfc/clock-mgmt-drivers branch from c43c6df to 0faa5e4 Compare December 10, 2024 21:11
@danieldegrasse
Collaborator Author

@danieldegrasse
Resolving clock conflict is exactly the feature I need, could you also include a drawing in the doc to demo how it is done?

Latest push includes a diagram and blurb describing this process

danieldegrasse and others added 2 commits December 10, 2024 16:51
Add clock_management_minimal test. This test is intended to verify that
clock management functions correctly when runtime notifications and rate
setting are disabled. It also verifies that support for multiple clock
outputs on a device works as expected.

The test has the following phases:
- apply default clock state for both clock outputs of the emulated
  consumer. Verify that the resulting clock frequencies match what is
  expected.
- apply sleep clock state for both clock outputs of the emulated
  consumer. Verify that the resulting clock frequencies match what is
  expected.
- Request a clock frequency from each clock output, which should match
  the frequency of one of the defined states exactly. Verify that the
  expected state is applied.

The test is supported on the `native_sim` target using emulated clock
drivers for testing purposes in CI, and on the
`lpcxpresso55s69/lpc55s69/cpu0` target to verify the clock management
API on real hardware.

Signed-off-by: Daniel DeGrasse <ddegrasse@tenstorrent.com>
Add documentation for clock management subsystem. This documentation
includes descriptions of the clock management consumer API, as well as
implementation guidelines for clock drivers themselves

Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
@dleach02 dleach02 force-pushed the rfc/clock-mgmt-drivers branch from 0faa5e4 to 14760f9 Compare December 10, 2024 21:57
@swift-tk
Collaborator

swift-tk commented Dec 11, 2024

@danieldegrasse
Resolving clock conflict is exactly the feature I need, could you also include a drawing in the doc to demo how it is done?

Latest push includes a diagram and blurb describing this process

Thanks, it looks good.

Do you have a list of all possible return values from the APIs lying around somewhere? Could we identify from the return values whether a clock consumer denied clock change due to conflict or general error or simply locked?

Member

@erwango erwango left a comment

Nacking the PR as I think the complexity induced on device tree side is not acceptable as is.

Since clock configuration is the basis of any Zephyr target configuration, this change will affect all users, much like the introduction of hwmv2 or pinctrl, with the difference that it is significantly more complex.
I want to avoid this being merged too quickly just because a smooth transition from clock_control is possible and the risk is therefore seen as lower than for the changes mentioned above.

Note that I'm not acting out of bias because dts clock configuration is already available on STM32 and I don't want to port a new clock framework to STM32. That said, the user feedback I've had on the complexity of STM32 dts clock configuration makes me think the solution proposed in this PR will be even more difficult to swallow for Zephyr users at large (not only people involved in Zephyr "core" development, not only STM32 users).

Unfortunately, I don't have quick fixes or ready to use alternative to this problem.

My request is that this PR could be reviewed more widely at project level, the complexity induced on user side assessed, and improved when needed.

Excellent job otherwise, @danieldegrasse.

@XenuIsWatching
Member

XenuIsWatching commented Dec 19, 2024

Nacking the PR as I think the complexity induced on device tree side is not acceptable as is.

Since clock configuration is the basis of any Zephyr target configuration, this change will affect all users, much like the introduction of hwmv2 or pinctrl, with the difference that it is significantly more complex. I want to avoid this being merged too quickly just because a smooth transition from clock_control is possible and the risk is therefore seen as lower than for the changes mentioned above.

Note that I'm not acting out of bias because dts clock configuration is already available on STM32 and I don't want to port a new clock framework to STM32. That said, the user feedback I've had on the complexity of STM32 dts clock configuration makes me think the solution proposed in this PR will be even more difficult to swallow for Zephyr users at large (not only people involved in Zephyr "core" development, not only STM32 users).

Unfortunately, I don't have quick fixes or ready to use alternative to this problem.

My request is that this PR could be reviewed more widely at project level, the complexity induced on user side assessed, and improved when needed.

Excellent job otherwise, @danieldegrasse.

While I understand where you are coming from, having had to lead a team myself through the transition to Zephyr on an STM32 device, there was a lot of foot-dragging in the beginning while people tried to understand it, even with clock configuration in the devicetree. That makes sense given where they came from, which was STM32Cube, where a GUI with maximum pretties shows graphically how the clock tree can be configured and automagically generates the code for them. With DTS, which is text based, it is easy to not know what you are doing if you've never used it before. It did take a while, but everyone eventually came around to really like and enjoy having this in DTS despite the complaints in the beginning ❤️, as it allowed us to move faster than going through STM32Cube IDE (or was it STM32CubeMX, I can no longer remember the exact tool name 😵 )

As for how this works for me as is: it wasn't that complex, and it was actually a simplification, especially when it came to DVFS for power management, where it gave us incredible portability. For example, if the clocks need to change due to power management state transitions, that change can bleed down the clock tree to a poor child such as a UART, which is very dependent on the system input clock that is about to change, and mess up its already configured baud rate. This API allowed us to implement a single driver (with no hacks reaching around the Zephyr driver) and a lot less custom C for every application and SoC. Implementing this at the devicetree level saved us a lot of time, and keeping multiple levels of sleep states in the DTS did reduce complexity.

I don't want to be speaking for @danieldegrasse here (I just really want to see this keep moving forward), but there already was the talk at ZDS 2024. The only next step I can think of would be at the Architecture committee review. Would this be a good agenda item to add @carlescufi ?

@danieldegrasse
Collaborator Author

I don't want to be speaking for @danieldegrasse here (I just really want to see this keep moving forward), but there already was the talk at ZDS 2024. The only next step I can think of would be at the Architecture committee review. Would this be a good agenda item to add @carlescufi ?

No worries, I want this to move forward as well. I am willing to discuss/consider whether there is a better way to describe clock settings in devicetree though: using node specifiers gets difficult to understand when we configure things like PLLs.

As far as I can see, the one thing we need that is non-negotiable (and different from Linux) is the ability to define multiple independent states for each given clock producer, and directly apply those states at runtime.

Another way we could define states would be something like the following:

&pll {
	pll_state0: state0 {
		vnd,vco-div = <4>;
		vnd,pdiv = <2>;
	};
	pll_state1: state1 {
		vnd,vco-div = <3>;
		vnd,pdiv = <1>;
	};
};

&uart_mux {
	uart_mux_pll: state0 {
		vnd,mux-sel = <3>;
	};
	uart_mux_fixed_clk: state1 {
		vnd,mux-sel = <2>;
	};
};

&uart_div {
	uart_div0: state0 {
		vnd,div = <1>;
	};
};

&uart_clock {
	uart-clock-default {
		compatible = "clock-state";
		frequency = <DT_FREQ_M(75)>;
		clocks = <&uart_div0 &uart_mux_pll &pll_state0>;
	};
};

The issue I see with this definition is that it will get really verbose for things like dividers/gates. Moreover, what do we name nodes like PLL states? They cannot be named based on frequency (as the PLL is probably just configuring a multiplier and divider, not a source clock).

Another option would be something like this:

&uart_clock {
	uart-clock-default {
		compatible = "clock-state";
		frequency = <DT_FREQ_M(75)>;
		pll-setting {
			compatible = "vnd,pll-setting";
			clock = <&pll>;
			vnd,vco-div = <4>;
			vnd,pdiv = <2>;
		};
		mux-setting {
			compatible = "vnd,mux-setting";
			clock = <&uart_mux>;
			vnd,mux-sel = <3>;
		};
		div-setting {
			compatible = "vnd,div-setting";
			clock = <&uart_div>;
			vnd,div = <1>;
		};
	};
};

This will double the number of new compatibles we need for clock management, but it would at least solve the issue of how to name nodes that describe configuration settings.

One final option we could explore would be moving parts of (or all of) the configuration for clocks into C code, and only keeping some limited state descriptions in devicetree for each clock consumer to reference. This would involve a large rework of the framework, but might make describing clock settings easier?

@erwango, do you have any other thoughts here? If you have a structure for describing clock configuration in mind, I'm open to it.

Collaborator

@swift-tk swift-tk left a comment


It does look quite nasty in DTS form with so many layers of clock dependency, but that is just the nature of Zephyr DTS written as code. A GUI tool or a script that displays the tree visually, like https://github.com/dottspina/dtsh, might be helpful to users.

On the other hand, to reduce clock dependencies when adding support for an SoC, one should collapse unnecessary nodes if they just form a "straight line". After all, there are only 3 genuine layers: vendor-specific drivers, clock management common drivers, and clock consumers. For example, one may collapse a divider, mux and source into one node if the path is extremely dedicated, while the framework still maintains maximum flexibility.

So one up vote from me.

@swift-tk
Collaborator

BTW, @danieldegrasse, does the framework support collapsing gate, divider and mux?

@erwango
Member

erwango commented Dec 20, 2024

Thanks @danieldegrasse for working on this point.

Your proposal is indeed much more readable, but I understand the verbosity concern. I haven't yet had much time to spend on this. As a first step, I wanted to check the effect of using macros and defines. For instance:

Instead of

	sys_clk_144mhz: sys_clk_144mhz {
		compatible = "clock-state";
		clocks = <&ahbclkdiv 1 &fro_12m 1 &mainclkselb 0 &mainclksela 0
			&xtal32m 1 &clk_in_en 1 &pll1clksel 1
			&pll1_pdec 2 &pll1 288000000 8 144 0 53 31
			&pll1_directo 0 &pll1_bypass 0
			&mainclkselb 2>;
	};

Having: (please don't mind the correctness)

	sys_clk_144mhz: sys_clk_144mhz {
		compatible = "clock-state";
		clocks = <[....]
			&pll1 MHZ(288) PLL_MNPXY(8 144 0 53 31)
			&pll1_directo DIRECT
			&pll1_bypass BYPASS
			&mainclkselb 2>;
	};

PLL_MNPXY() just expands to 8 144 0 53 31, but at least you get a helper to check that the config is what you want without needing to have the binding in front of you.
It helps partially, but remains vastly insufficient. For instance, &mainclkselx specifiers don't provide any hint about the configuration unless you have the reference manual with you (and that node binding doesn't add more tricks).

The issue with defines is that you'll need to make them specific to each platform, as BYPASS may be 0 or 1 or anything else, depending on your HW.

Another tricky point to understand is mainclkselb, which gets 2 different specifiers in the same array (&mainclkselb 0 [...] &mainclkselb 2). I'm not sure this is correct, btw.

@danieldegrasse
Collaborator Author

BTW, @danieldegrasse, does the framework support collapsing gate, divider and mux?

Yes- the vendor is free to write their clock drivers however they please. Generally the guidance would be to make each independently configurable clock producer a separate node, but things can be combined if it makes logical sense (for example, PLLs will likely just be one node). If a vendor wants to combine a multiplier and divider for example, that would be normal and expected.

PLL_MNPXY() just expands to 8 144 0 53 31, but at least you get a helper to check that the config is what you want without needing to have the binding in front of you.
It helps partially, but remains vastly insufficient. For instance, &mainclkselx specifiers don't provide any hint about the configuration unless you have the reference manual with you (and that node binding doesn't add more tricks).

Sure, this is permitted- we could also have per-device headers with some of these specifier definitions if desired. Really up to the vendor.

Another tricky point to understand is mainclkselb, which gets 2 different specifiers in the same array (&mainclkselb 0 [...] &mainclkselb 2). I'm not sure this is correct, btw.

This is intentional: clock settings in a state are applied sequentially, so that clock state actually moves the main clock to a safe low-frequency source (FRO12M), then configures a PLL, and finally moves the main clock to that source.

Your proposal is indeed much more readable, but I understand the verbosity concern. I haven't yet had much time to spend on this. As a first step, I wanted to check the effect of using macros and defines. For instance:

@erwango would you prefer something like my proposal? I think the first option might work well, provided we can determine a way to name nodes that we reference. PLLs in particular are challenging- do we name the states based on the settings being applied? Something like pll0_mult_4_3 to say multiply the input by 4/3?

(when nonzero)
Other specifier values may cause undefined behavior.

compatible: "clock-source"
Collaborator


I don't see where this compat is used anywhere, what is the purpose of this?
There is a gate-offset in the source? Should it be a combination of div, mux, gate and output?

I see there is a nxp,syscon-clock-gate.yaml, but no base binding for clock gate?

@XenuIsWatching
Member

@erwango would you prefer something like my proposal? I think the first option might work well, provided we can determine a way to name nodes that we reference. PLLs in particular are challenging- do we name the states based on the settings being applied? Something like pll0_mult_4_3 to say multiply the input by 4/3?

pinging @erwango

Member

@nandojve nandojve left a comment


Hi @danieldegrasse ,

Very nice work!

I have a few questions:

Is the proposal currently able to know that multiple peripherals can share the same clock line? I mean, if we disable one peripheral, can the clock system prevent the clock from being disabled while another peripheral is still consuming it?

Comment on lines +279 to +331
pll0clksel: pll0clksel@40000290 {
	compatible = "nxp,syscon-clock-mux";
	#clock-cells = <1>;
	/* SYSCON::PLL0CLKSEL[SEL] */
	reg = <0x40000290 0x3>;
	offset = <0x0>;
	input-sources = <&fro_12m &clk_in_en &fro_1m &rtcosc32ksel
			&no_clock>;
	#address-cells = <1>;
	#size-cells = <1>;

	pll0: pll0@40000580 {
		compatible = "nxp,lpc55sxx-pll0";
		reg = <0x40000580 0x20>;
		#clock-cells = <9>;
		#address-cells = <1>;
		#size-cells = <1>;

		pll0_pdec: pll0-pdec@4000058c {
			compatible = "nxp,lpc55sxx-pll-pdec";
			#clock-cells = <1>;
			/* SYSCON::PLL0PDEC[PDIV] */
			reg = <0x4000058c 0x5>;
		};
	};
};

pll0_directo: pll0-directo@40000580 {
	compatible = "nxp,syscon-clock-mux";
	#clock-cells = <1>;
	/* SYSCON::PLL0CTRL[BYPASSPOSTDIV] */
	reg = <0x40000580 0x1>;
	offset = <0x14>;
	input-sources = <&pll0_pdec &pll0>;
};

pll0_bypass: pll0-bypass@40000580 {
	compatible = "nxp,syscon-clock-mux";
	#clock-cells = <1>;
	/* SYSCON::PLL0CTRL[BYPASSPLL] */
	reg = <0x40000580 0x1>;
	offset = <0xf>;
	input-sources = <&pll0_directo &pll0clksel>;
	#address-cells = <1>;
	#size-cells = <1>;

	pll0div: pll0div@400003c4 {
		compatible = "nxp,syscon-clock-div";
		#clock-cells = <1>;
		/* SYSCON::PLL0CLKDIV[DIV] */
		reg = <0x400003c4 0x8>;
	};
};
Member


I'm looking at your work and it is very interesting. I liked how the rtcosc32ksel node was created. However, pll0 is a little more difficult to understand. I didn't see compatible = "clock-output"; for instance. Also, the way it is arranged makes it hard to understand what the outputs will be; I imagine that is because of the nature of the PLL itself. So clock-output-names can be omitted because there is now clock-output, is this correct?

In this case, a PLL with multiple outputs will be something like:

pll {
  my-out-0 {
    compatible = "clock-output";
  };
  my-out-1 {
    compatible = "clock-output";
  };
  ...
  my-out-n {
    compatible = "clock-output";
  };
};

I was wondering if you could elaborate a little more on this for us.

@XenuIsWatching
Member

XenuIsWatching commented Apr 2, 2025

Hi @danieldegrasse ,

Very nice work!

I have a few questions:

Is the proposal currently able to know that multiple peripherals can share the same clock line? I mean, if we disable one peripheral, can the clock system prevent the clock from being disabled while another peripheral is still consuming it?

I actually did something like this: I just use the existing pm_device_busy_set(dev); and pm_device_busy_clear(dev); in the driver to 'lock' changes to the clock:

int cdns_i3c_clock_cb(const struct clock_management_event *ev, const void *data)
{
    const struct device *dev = data;

    if (ev->type == CLOCK_MANAGEMENT_PRE_RATE_CHANGE) {
        /* Do not allow a clock change if currently active */
        if (pm_device_is_busy(dev) && (ev->old_rate != ev->new_rate)) {
            return -EBUSY;
        }
    } else if (ev->type == CLOCK_MANAGEMENT_POST_RATE_CHANGE) {
        /* Update the prescaler values */
        if (ev->old_rate != ev->new_rate) {
            cdns_i3c_set_prescalers(dev);
        }
    }
    return 0;
}

Labels
area: Build System area: Devicetree Binding PR modifies or adds a Device Tree binding area: Devicetree area: Linker Scripts area: native port Host native arch port (native_sim) area: UART Universal Asynchronous Receiver-Transmitter platform: NXP Drivers NXP Semiconductors, drivers platform: NXP NXP
Projects
Status: In Progress