[Fleet] Create API to report status of integrations synchronization #216178

Open
wants to merge 33 commits into
base: main

Contributor

@criamico criamico commented Mar 27, 2025

Closes #192363

Summary

Add an endpoint that compares the integrations installed on the remote cluster with the integrations listed in the CCR index fleet-synced-integrations-ccr-<outputId>. Feature flag: enableSyncIntegrationsOnRemote

  • Use the CCR info API to check that the CCR index is active
  • Compare the content of the two indices and report the sync status for each integration:
GET kbn:/api/fleet/remote_synced_integrations/status

{
  "integrations": [
    {
      "package_name": "akamai",
      "package_version": "2.28.0",
      "updated_at": "2025-03-27T10:29:52.485Z",
      "sync_status": "completed"
    },
    {
      "package_name": "auth0",
      "package_version": "1.21.0",
      "updated_at": "2025-03-26T12:06:26.268Z",
      "sync_status": "failed",
      "error": "Installation status: not_installed"
    }
  ]
}
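
A minimal sketch of the per-integration comparison described above, assuming the CCR index yields a list of expected packages and the remote installations are available keyed by package name. The type names, the `compareIntegrations` helper, and the version-mismatch message are hypothetical; only the field names and the `Installation status: ...` error format come from this PR.

```typescript
interface CcrIntegration {
  package_name: string;
  package_version: string;
  updated_at: string;
}

// Simplified, assumed view of an installation on the remote cluster.
interface InstalledIntegration {
  version: string;
  install_status: string; // e.g. 'installed' or 'install_failed'
}

interface IntegrationSyncStatus extends CcrIntegration {
  sync_status: 'completed' | 'failed';
  error?: string;
}

// Compare every integration listed in the CCR index with what is installed remotely.
function compareIntegrations(
  ccrIntegrations: CcrIntegration[],
  installedByName: Record<string, InstalledIntegration | undefined>
): IntegrationSyncStatus[] {
  return ccrIntegrations.map((ccr): IntegrationSyncStatus => {
    const local = installedByName[ccr.package_name];
    if (!local || local.install_status !== 'installed') {
      const status = local ? local.install_status : 'not_installed';
      return { ...ccr, sync_status: 'failed', error: `Installation status: ${status}` };
    }
    if (local.version !== ccr.package_version) {
      // Hypothetical message: the PR does not show how a version mismatch is reported.
      return {
        ...ccr,
        sync_status: 'failed',
        error: `Found version ${local.version}, expected ${ccr.package_version}`,
      };
    }
    return { ...ccr, sync_status: 'completed' };
  });
}
```

Keying the installed packages by name keeps the comparison to a single pass over the CCR list.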

Testing

Set up a local environment following the guide added in dev_docs (preview)

  • Install some integrations on the local cluster and wait until they are synced to the remote cluster
  • From Dev Tools on the remote cluster, run
GET kbn:/api/fleet/remote_synced_integrations/status
  • To verify that custom assets are synced, choose an integration, for instance system
  • From the package policy, select a var, open the advanced options, and add a custom mapping and a custom pipeline. In my example I used system
  • Run the endpoint again and you should see the status of custom assets too:
{
  "integrations": [
    {
      "package_name": "akamai",
      "package_version": "2.28.0",
      "updated_at": "2025-03-27T10:29:52.485Z",
      "sync_status": "completed"
    },
    {
      "package_name": "elastic_agent",
      "package_version": "2.2.0",
      "updated_at": "2025-03-26T14:06:29.216Z",
      "sync_status": "completed"
    },
    {
      "package_name": "synthetics",
      "package_version": "1.4.1",
      "updated_at": "2025-03-26T14:06:31.909Z",
      "sync_status": "completed"
    },
    {
      "package_name": "system",
      "package_version": "1.67.3",
      "updated_at": "2025-03-28T10:08:00.602Z",
      "sync_status": "completed"
    }
  ],
  "custom_assets": {
    "component_template:logs-system.auth@custom": {
      "name": "logs-system.auth@custom",
      "type": "component_template",
      "package_name": "system",
      "package_version": "1.67.3",
      "sync_status": "completed"
    },
    "ingest_pipeline:logs-system.auth@custom": {
      "name": "logs-system.auth@custom",
      "type": "ingest_pipeline",
      "package_name": "system",
      "package_version": "1.67.3",
      "sync_status": "completed"
    }
  }
}
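
For readers consuming this response, here is a hedged sketch of how its shape could be typed and summarized. The interface names and the `listNotCompleted` helper are hypothetical; the field names are copied from the example output above, and `sync_status` is typed as a plain string because the full set of values is not listed here.

```typescript
interface IntegrationStatus {
  package_name: string;
  package_version: string;
  updated_at: string;
  sync_status: string;
  error?: string;
}

interface CustomAssetStatus {
  name: string;
  type: 'component_template' | 'ingest_pipeline';
  package_name: string;
  package_version: string;
  sync_status: string;
}

interface RemoteSyncedIntegrationsStatus {
  integrations: IntegrationStatus[];
  // Keyed by "<type>:<name>", as in the example response.
  custom_assets?: Record<string, CustomAssetStatus>;
}

// Return a label for every integration or custom asset that is not yet 'completed'.
function listNotCompleted(status: RemoteSyncedIntegrationsStatus): string[] {
  const integrations = status.integrations
    .filter((integration) => integration.sync_status !== 'completed')
    .map((integration) => `integration:${integration.package_name}`);
  const assets = status.custom_assets ?? {};
  const customAssets = Object.keys(assets).filter(
    (key) => assets[key].sync_status !== 'completed'
  );
  return integrations.concat(customAssets);
}
```

An empty result means that everything reported by the endpoint is in sync.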

Checklist

@criamico criamico added the Team:Fleet (Team label for Observability Data Collection Fleet team) label Mar 27, 2025
@criamico criamico self-assigned this Mar 27, 2025
@criamico
Contributor Author

@elasticmachine merge upstream

if (!installedIntegrations) {
  return {
    items: [],
    error: `No integrations installed on remote`
Contributor

it's not necessarily an error if the main cluster doesn't have any integrations either


try {
  const installedPipelines = await getPipeline(esClient, abortController);
  const installedComponentTemplates = await getComponentTemplate(esClient, abortController);
Contributor Author

@criamico criamico Apr 1, 2025

I went for the solution of fetching all the pipelines and component templates and keeping them in memory, instead of doing a call for each one in the loop below. @juliaElastic do you think it could become a performance issue?

Contributor

@juliaElastic juliaElastic Apr 1, 2025

We should only compare pipelines and component templates that match '*@custom', like it's done in custom_assets.ts. The other assets are already installed when the package is installed, no need to compare them.
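
A minimal sketch of the suggested filtering, assuming the fetched pipelines or component templates are held in an object keyed by asset name. The `pickCustomAssets` helper is hypothetical and is not the logic in custom_assets.ts.

```typescript
// Keep only the assets whose name ends with '@custom'.
function pickCustomAssets<T>(assetsByName: Record<string, T>): Record<string, T> {
  const result: Record<string, T> = {};
  for (const name of Object.keys(assetsByName)) {
    if (/@custom$/.test(name)) {
      result[name] = assetsByName[name];
    }
  }
  return result;
}

// Example: only the first entry survives the filter.
const customOnly = pickCustomAssets({
  'logs-system.auth@custom': { processors: [] },
  'logs-system.auth-1.67.3': { processors: [] },
});
```

The Elasticsearch APIs for pipelines and component templates should also accept a wildcard name such as *@custom, so the filter could probably be pushed into the request itself, which would address the memory concern raised above.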

Object.entries(ccrCustomAssets).forEach(([ccrCustomName, ccrCustomAsset]) => {
  if (ccrCustomAsset.type === 'ingest_pipeline') {
    const installedAsset = installedPipelines[ccrCustomAsset?.name];
    if (isEqual(installedAsset?.processors, ccrCustomAsset?.pipeline?.processors)) {
Contributor

Pipelines have an optional version that we can use for the comparison, like here:

(existingPipeline.version && existingPipeline.version < customAsset.pipeline.version) ||

We should probably share a common comparison logic between this code and custom_assets.ts.

@criamico criamico changed the title 192363 integrations sync status [Fleet] Create API to report status of integrations synchronization Apr 1, 2025
@criamico
Contributor Author

criamico commented Apr 1, 2025

@elasticmachine merge upstream

@criamico criamico added the v9.1.0, release_note:skip (Skip the PR/issue when compiling release notes), and backport:skip (This commit does not require backporting) labels Apr 1, 2025
@criamico criamico marked this pull request as ready for review April 1, 2025 16:05
@criamico criamico requested a review from a team as a code owner April 1, 2025 16:05
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

kibanamachine and others added 3 commits April 3, 2025 14:30
…t --include-path /api/status --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/streams --include-path /api/fleet --include-path /api/dashboards --update'
@criamico
Contributor Author

criamico commented Apr 3, 2025

@juliaElastic based on the new requirements in #217025 we should be able to query by output_id:

Create API that queries remote kibana sync status API by output ID (to be used by the UI to show status)
e.g. /api/fleet/remote_synced_integrations/{output_id}/status -> https://{remote_kibana}/api/fleet/remote_synced_integrations/status

In a previous commit it was already by output_id. Do you think we'll need to keep the general status? Otherwise I'll change it directly in this PR.

@criamico
Contributor Author

criamico commented Apr 4, 2025

@elasticmachine merge upstream

elasticmachine and others added 3 commits April 4, 2025 07:26
…t --include-path /api/status --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/streams --include-path /api/fleet --include-path /api/dashboards --update'
@juliaElastic
Contributor

We can keep it as is in the current PR, since it collects the status on the remote cluster. The new API by output_id will only call the remote API, using the remote output's Kibana URL and API key.
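
A hedged sketch of how the follow-up API might build its call to the remote cluster, assuming the remote output stores a Kibana URL and an API key. The `buildRemoteStatusRequest` helper and its return shape are hypothetical; the path is the endpoint added in this PR, and `Authorization: ApiKey ...` is the usual header for API key authentication.

```typescript
interface RemoteRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the request for the status endpoint on the remote Kibana.
function buildRemoteStatusRequest(kibanaUrl: string, apiKey: string): RemoteRequest {
  // Drop trailing slashes so the path is not doubled up.
  const base = kibanaUrl.replace(/\/+$/, '');
  return {
    url: `${base}/api/fleet/remote_synced_integrations/status`,
    headers: {
      Authorization: `ApiKey ${apiKey}`,
      'Content-Type': 'application/json',
    },
  };
}
```

The result can be handed to any HTTP client, for example fetch(request.url, { headers: request.headers }).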

@criamico
Contributor Author

criamico commented Apr 4, 2025

@elasticmachine merge upstream

    return { info: res.follower_indices[0] };
  } catch (err) {
    if (err?.body?.error?.type === 'index_not_found_exception')
      throw new IndexNotFoundError(`Index not found`);
Contributor

should we return the error message instead of throwing an error?

Contributor Author

It's handled here: https://github.com/elastic/kibana/pull/216178/files#diff-f8de6a6d308d65b9d61c400a2fdbebe1078e67eaf036537c028476260be254f5R312-R315

I left the throw in place and handled the error outside, because we might need this function elsewhere; it is a basic utility for the CCR case.
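
The split described here (throw in the utility, translate into a message at the call site) could look roughly like the sketch below. `IndexNotFoundError` is the name used in the snippet under review; `EsLikeError`, `rethrowCcrError`, and `toStatusResponse` are hypothetical, and the real code is asynchronous while this sketch is synchronous for brevity.

```typescript
// Typed error thrown by the low-level CCR utility.
class IndexNotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'IndexNotFoundError';
    // Keep instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, IndexNotFoundError.prototype);
  }
}

// Assumed shape of the error raised by the Elasticsearch client.
interface EsLikeError {
  body?: { error?: { type?: string } };
}

// Utility layer: translate the raw error and rethrow, so each caller decides what to do.
function rethrowCcrError(err: EsLikeError): never {
  if (err?.body?.error?.type === 'index_not_found_exception') {
    throw new IndexNotFoundError('Index not found');
  }
  throw err;
}

// Call site: report the problem in the response body instead of failing the request.
function toStatusResponse(
  loadIntegrations: () => string[]
): { integrations: string[]; error?: string } {
  try {
    return { integrations: loadIntegrations() };
  } catch (err) {
    if (err instanceof IndexNotFoundError) {
      return { integrations: [], error: err.message };
    }
    throw err; // Unknown errors still propagate.
  }
}
```

Keeping the throw in the utility means other callers can choose a different reaction, such as retrying or failing hard.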

Contributor

Contributor Author

Fixed, I added some unit tests to cover these cases

  };
} else if (
  installedPipeline?.version &&
  installedPipeline.version < ccrCustomAsset.pipeline.version
Contributor

the version comparison should be done before the equality check, so we can skip the equality check if the version is not equal
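
A sketch of the ordering suggested here: run the cheap version check first and fall back to the deep comparison only when the version does not settle it. `isPipelineInSync` and the `PipelineLike` type are hypothetical, and `deepEqual` is a small stand-in for the `isEqual` helper used in the PR (presumably lodash's).

```typescript
interface PipelineLike {
  version?: number;
  processors?: unknown[];
}

// Minimal structural equality for JSON-like values.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const aObj = a as Record<string, unknown>;
  const bObj = b as Record<string, unknown>;
  const aKeys = Object.keys(aObj);
  if (aKeys.length !== Object.keys(bObj).length) return false;
  return aKeys.every(
    (key) => Object.prototype.hasOwnProperty.call(bObj, key) && deepEqual(aObj[key], bObj[key])
  );
}

function isPipelineInSync(installed: PipelineLike | undefined, expected: PipelineLike): boolean {
  if (!installed) return false;
  // Cheap check first: an older installed version cannot be in sync.
  if (
    installed.version !== undefined &&
    expected.version !== undefined &&
    installed.version < expected.version
  ) {
    return false;
  }
  // Only now pay for the deep comparison of the processors.
  return deepEqual(installed.processors, expected.processors);
}
```

Because the version is optional, the deep comparison remains the fallback whenever either side has no version.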

if (ccrCustomAsset.is_deleted === true && installedPipeline) {
  return {
    ...result,
    sync_status: 'failed' as SyncStatus.FAILED,
Contributor

should this be synchronizing (the deletion might not have happened yet) unless we know there was an error deleting?

if (ccrCustomAsset.is_deleted === true && installedCompTemplate) {
  return {
    ...result,
    sync_status: 'failed' as SyncStatus.FAILED,
Contributor

same here, should these be synchronizing unless we know there was an error deleting?
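
A sketch of the rule proposed in the two comments above, for an asset that is marked is_deleted on the main cluster. The `deletionSyncStatus` helper and its `deleteError` argument are hypothetical: the PR does not show where a known deletion error would come from, and the status values are assumed to be plain strings.

```typescript
type SyncStatus = 'completed' | 'synchronizing' | 'failed';

interface AssetSyncResult {
  sync_status: SyncStatus;
  error?: string;
}

// Status of an asset that was deleted on the main cluster.
function deletionSyncStatus(existsOnRemote: boolean, deleteError?: string): AssetSyncResult {
  if (!existsOnRemote) {
    // The deletion has already been applied on the remote cluster.
    return { sync_status: 'completed' };
  }
  if (deleteError) {
    // Only report a failure when the deletion is known to have failed.
    return { sync_status: 'failed', error: deleteError };
  }
  // Still present, no known error: the deletion may simply not have run yet.
  return { sync_status: 'synchronizing' };
}
```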

@criamico
Contributor Author

criamico commented Apr 7, 2025

@elasticmachine merge upstream

@@ -255,6 +250,11 @@ const compareCustomAssets = ({
sync_status: 'failed' as SyncStatus.FAILED,
Contributor

this should be synchronizing too

@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id     before  after  diff
fleet  1199    1201   +2

History

cc @criamico

return {
  ...ccrIntegration,
  sync_status: 'failed' as SyncStatus.FAILED,
  error: `Installation status: ${localIntegrationSO?.attributes.install_status}`,
Contributor

If install_status is install_failed, can we return the error message from latest_install_failed_attempts or latest_executed_state?

latest_install_failed_attempts?: InstallFailedAttempt[];
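
One way this suggestion might be sketched, assuming simplified shapes for the installation attributes. The interfaces below are reduced, assumed versions of the Fleet types, and the `describeInstallError` helper and its message format are hypothetical.

```typescript
// Assumed, simplified shapes; the real Fleet types carry more fields.
interface InstallFailedAttempt {
  created_at: string; // ISO timestamp
  error: { message: string };
}

interface InstallationAttributes {
  install_status: string;
  latest_install_failed_attempts?: InstallFailedAttempt[];
  latest_executed_state?: { error?: string };
}

// Build the error string, adding the underlying failure message when one is known.
function describeInstallError(attrs?: InstallationAttributes): string {
  const status = attrs ? attrs.install_status : 'not_installed';
  const summary = `Installation status: ${status}`;
  if (!attrs || attrs.install_status !== 'install_failed') {
    return summary;
  }
  // Most recent attempt first; ISO timestamps compare correctly as strings.
  const attempts = (attrs.latest_install_failed_attempts ?? [])
    .slice()
    .sort((a, b) => (a.created_at < b.created_at ? 1 : a.created_at > b.created_at ? -1 : 0));
  const latest: InstallFailedAttempt | undefined = attempts[0];
  const detail = latest ? latest.error.message : attrs.latest_executed_state?.error;
  return detail ? `${summary}. ${detail}` : summary;
}
```

Falling back to the plain status string keeps the current behaviour when no failure detail is recorded.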

Contributor

@juliaElastic juliaElastic left a comment

LGTM, thanks for the updates

@criamico criamico enabled auto-merge (squash) April 7, 2025 16:07

Successfully merging this pull request may close these issues.

[Fleet] Report status of integration synchronization in Fleet API
4 participants