|
21 | 21 | | Label name | Description |
|
22 | 22 | | :----------|-------------|
|
23 | 23 | | docs-required| default tag used to request documentation, has to be removed before merge |
|
24 |
| -| ok-package-test | run all package tests | |
| 24 | +| ok-package-test | Build for all possible targets | |
25 | 25 | | ok-to-test | run all integration tests |
|
26 | 26 | | ok-to-merge | run mergebot and merge (rebase) current PR |
|
27 | 27 | | ci/integration-docker-ok | integration test is able to build docker image |
|
@@ -66,14 +66,162 @@ For some reason this is not automatically done via permission inheritance or sim
|
66 | 66 |
|
67 | 67 | Each major version (e.g. 1.8 & 1.9) supports different targets to build for, e.g. 1.9 includes a CentOS 8 target and 1.8 has some other legacy targets.
|
68 | 68 |
|
69 |
| -This is all handled by the [build matrix generation composite action](../actions/generate-package-build-matrix/action.yaml) so make sure to update appropriately. |
70 |
| -The build matrix is then fed into the reusable job that builds packages which will then fire for the appropriate targets. |
| 69 | +This is all handled by the [build matrix generation composite action](../actions/generate-package-build-matrix/action.yaml). |
| 70 | +This uses a [JSON file](../../packaging/build-config.json) to specify the targets so ensure this is updated. |
| 71 | +The build matrix is then fed into the [reusable job](./call-build-linux-packages.yaml) that builds packages which will then fire for the appropriate targets. |
| 72 | +The reusable job is used for all package builds including unstable/nightly and the PR `ok-package-test` triggered ones. |
71 | 73 |
|
72 | 74 | ## Releases
|
73 | 75 |
|
74 |
| -Currently the process is as follows: |
| 76 | +The process at a high level is as follows: |
75 | 77 |
|
76 |
| -1. Tag the source with whatever tag you like on master. |
77 |
| -2. The [`Deploy to staging`](./staging-build.yaml) workflow will then kick in to build everything and upload it either to the S3 staging bucket (packages) or ghcr.io (containers). |
78 |
| -3. Once this completes, the [`Test staging`](./staging-test.yaml) workflow will then run to carry out smoke tests on these packages and containers. |
79 |
| -4. The [`Release from staging`](./staging-release.yaml) workflow can then be manually initiated to promote staging to release. |
| 78 | +1. Tag created with `v` prefix. |
| 79 | +2. [Deploy to staging](https://github.com/fluent/fluent-bit/actions/workflows/staging-build.yaml) workflow runs. |
| 80 | +3. [Test staging](https://github.com/fluent/fluent-bit/actions/workflows/staging-test.yaml) workflow runs. |
| 81 | +4. Manually initiate [release from staging](https://github.com/fluent/fluent-bit/actions/workflows/staging-release.yaml) workflow. |
| 82 | +5. A PR is auto-created to increment the Fluent Bit minor version using the [`update_version.sh`](../../update_version.sh) script. |
| 83 | +6. Create PRs for doc updates: Windows checksums and container versions (automating this is a work in progress). |
| 84 | + |
| 85 | +The steps are broken down in more detail below. |
| 86 | + |
| 87 | +### Deploy to staging and test |
| 88 | + |
| 89 | +This should run automatically when a tag is created matching the `v*` pattern. |
| 90 | +It currently copes with 1.8+ builds although automation is only exercised for 1.9+ releases. |
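
As a rough illustration of which tags would trigger the staging build, the sketch below approximates the `v*` filter with Python's `fnmatch`. GitHub Actions tag filters are glob patterns, and `fnmatch` is only an approximation of that matching (its `*` also matches `/`), so treat this as a local sanity check rather than the workflow's actual logic. The function name `is_release_tag` is hypothetical.

```python
from fnmatch import fnmatchcase


def is_release_tag(tag: str, pattern: str = "v*") -> bool:
    """Return True if the tag name matches the glob pattern (case-sensitive)."""
    return fnmatchcase(tag, pattern)


if __name__ == "__main__":
    for candidate in ("v1.9.0", "1.9.0", "v2.0.0-rc1"):
        print(candidate, is_release_tag(candidate))
```

A tag without the `v` prefix, such as `1.9.0`, does not match and so would not start the staging build.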
| 91 | + |
| 92 | +Once this is completed successfully the staging tests should also run automatically. |
| 93 | + |
| 94 | + |
| 95 | + |
| 96 | +If both complete successfully then we are good to go. |
| 97 | + |
| 98 | +Occasional failures are seen with package builds not downloading dependencies (CentOS 7 in particular seems bad for this). |
| 99 | +A re-run of failed jobs should resolve this. |
| 100 | + |
| 101 | +The workflow builds all Linux, macOS and Windows targets to a staging S3 bucket plus the container images to ghcr.io. |
| 102 | + |
| 103 | +### Release from staging workflow |
| 104 | + |
| 105 | +This is a manually initiated workflow; the intention is that multiple staging builds can happen but we only release one. |
| 106 | +Note that currently we do not support parallel staging builds of different versions, e.g. master and 1.9 branches. |
| 107 | +**We can only release the previous staging build and there is a check to confirm version.** |
| 108 | + |
| 109 | +Ensure the AppVeyor build for the tag has completed successfully as well. |
| 110 | + |
| 111 | +To trigger: <https://github.com/fluent/fluent-bit/actions/workflows/staging-release.yaml> |
| 112 | + |
| 113 | +All this job does is copy the various artefacts from staging locations to release ones; it does not rebuild them. |
| 114 | + |
| 115 | + |
| 116 | + |
| 117 | +In this example the wrong `version` was supplied: the workflow requires it without the `v` prefix (it is used for the container tag, etc.), so the run fails. |
| 118 | + |
| 119 | + |
| 120 | + |
| 121 | +Make sure to provide the version without the `v` prefix. |
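
To avoid the mistake shown above, the `version` input can be derived from the tag name before triggering the workflow. This is a minimal sketch; the helper name `version_from_tag` is hypothetical and not part of the repository.

```python
def version_from_tag(tag: str) -> str:
    """Strip a single leading 'v' from a release tag, e.g. 'v1.9.0' -> '1.9.0'."""
    return tag[1:] if tag.startswith("v") else tag


if __name__ == "__main__":
    print(version_from_tag("v1.9.0"))
```

A value that already lacks the prefix is returned unchanged, so the helper is safe to apply to either form.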
| 122 | + |
| 123 | + |
| 124 | + |
| 125 | +Once this workflow is initiated it must also be approved by the designated "release team", otherwise it will not progress. |
| 126 | + |
| 127 | + |
| 128 | + |
| 129 | +They will be notified for approval by GitHub. |
| 130 | +Unfortunately, approval is required for each job in the sequence rather than once for the whole workflow, although this can be useful for checking between jobs. |
| 131 | + |
| 132 | + |
| 133 | + |
| 134 | +This makes it possible to delay the final smoke test of packages until after the manual steps are done, so that it verifies them all for you. |
| 135 | + |
| 136 | +#### Packages server sync |
| 137 | + |
| 138 | +The workflow above ensures all release artefacts are pushed to the appropriate container registry and S3 bucket for official releases. |
| 139 | +The packages server then periodically syncs from this bucket to pull down and serve the new packages so there may be a delay (up to 1 hour) before it serves the new versions. |
| 140 | +The syncs happen hourly. |
| 141 | +See <https://github.com/fluent/fluent-bit-infra/blob/main/terraform/provision/package-server-provision.sh.tftpl> for details of the dedicated packages server. |
| 142 | + |
| 143 | +The main reason for a separate server is to accurately track download statistics. |
| 144 | +Container images are handled by ghcr.io and Docker Hub, not this server. |
| 145 | + |
| 146 | +#### Transient container publishing failures |
| 147 | + |
| 148 | +The parallel publishing of multiple container tags for the same image seems to fail occasionally with network errors, more often for ghcr.io than Docker Hub. |
| 149 | +This can be resolved by just re-running the failed jobs. |
| 150 | + |
| 151 | +#### Windows builds from AppVeyor |
| 152 | + |
| 153 | +This is automated; however, confirm that the actual build is successful for the tag: <https://ci.appveyor.com/project/fluent/fluent-bit-2e87g/history> |
| 154 | +If not then ask a maintainer to retrigger. |
| 155 | + |
| 156 | +It can take a while to find the one for the specific tag... |
| 157 | + |
| 158 | +#### ARM builds |
| 159 | + |
| 160 | +All builds are carried out in containers and intended to be run on a valid Ubuntu host to match a standard GitHub Actions runner. |
| 161 | +This can take some time for ARM as we have to emulate the architecture via QEMU. |
| 162 | + |
| 163 | +<https://github.com/fluent/fluent-bit/pull/7527> introduces support to run ARM builds on a dedicated <actuated.dev> ephemeral VM runner. |
| 164 | +A self-hosted ARM runner is set up and provisioned for this per the [documentation](https://docs.actuated.dev/provision-server/). |
| 165 | +For forks, this should all be skipped and run on a normal Ubuntu GitHub-hosted runner, but be aware this may take some time. |
| 166 | + |
| 167 | +### Manual release |
| 168 | + |
| 169 | +As long as it is built to staging we can manually publish packages as well via the script here: <https://github.com/fluent/fluent-bit/blob/master/packaging/update-repos.sh> |
| 170 | + |
| 171 | +Containers can be promoted manually too; make sure to include all architectures and signatures. |
| 172 | + |
| 173 | +### Create PRs |
| 174 | + |
| 175 | +Once releases are published we need to provide PRs for the following documentation updates: |
| 176 | + |
| 177 | +1. Windows checksums: <https://docs.fluentbit.io/manual/installation/windows#installation-packages> |
| 178 | +2. Container versions: <https://docs.fluentbit.io/manual/installation/docker#tags-and-versions> |
| 179 | + |
| 180 | +<https://github.com/fluent/fluent-bit-docs> is the repo for updates to docs. |
| 181 | + |
| 182 | +Take the checksums from the release process above: the AppVeyor stage provides them all and we attempt to auto-create the PR with them. |
| 183 | + |
| 184 | +## Unstable/nightly builds |
| 185 | + |
| 186 | +These happen every 24 hours and [reuse the same workflow](./cron-unstable-build.yaml) as the staging build, so they are identical except that they skip the upload to S3 step. |
| 187 | +This means all targets are built nightly for `master` and `2.0` branches including container images and Linux, macOS and Windows packages. |
| 188 | + |
| 189 | +The container images are available here (the tag refers to the branch): |
| 190 | + |
| 191 | +* [ghcr.io/fluent/fluent-bit/unstable:2.0](ghcr.io/fluent/fluent-bit/unstable:2.0) |
| 192 | +* [ghcr.io/fluent/fluent-bit/unstable:master](ghcr.io/fluent/fluent-bit/unstable:master) |
| 193 | +* [ghcr.io/fluent/fluent-bit/unstable:windows-2019-2.0](ghcr.io/fluent/fluent-bit/unstable:windows-2019-2.0) |
| 194 | +* [ghcr.io/fluent/fluent-bit/unstable:windows-2019-master](ghcr.io/fluent/fluent-bit/unstable:windows-2019-master) |
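
The four image references above follow a simple naming scheme: the branch name is the tag, optionally prefixed with `windows-2019-`. The sketch below reproduces that scheme for a given branch. The function name is hypothetical, and the scheme is inferred only from the list above, so it may not hold for other branches.

```python
from typing import List

# Image repository taken from the list of unstable images above.
UNSTABLE_REPO = "ghcr.io/fluent/fluent-bit/unstable"


def unstable_image_refs(branch: str) -> List[str]:
    """Return the Linux and Windows unstable image references for a branch."""
    return [
        f"{UNSTABLE_REPO}:{branch}",
        f"{UNSTABLE_REPO}:windows-2019-{branch}",
    ]


if __name__ == "__main__":
    for ref in unstable_image_refs("master"):
        print(ref)
```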
| 195 | + |
| 196 | +The Linux, macOS and Windows packages are available to download from the specific workflow run. |
| 197 | + |
| 198 | +## Integration tests |
| 199 | + |
| 200 | +On every commit to `master` we rebuild the [packages](./build-master-packages.yaml) and [container images](./master-integration-test.yaml). |
| 201 | +The container images are then used to [run the integration tests](./master-integration-test.yaml) from the <https://github.com/fluent/fluent-bit-ci> repository. |
| 202 | +The container images are available as: |
| 203 | + |
| 204 | +* [ghcr.io/fluent/fluent-bit/master:x86_64](ghcr.io/fluent/fluent-bit/master:x86_64) |
| 205 | + |
| 206 | +## PR checks |
| 207 | + |
| 208 | +Various workflows are run for PRs automatically: |
| 209 | + |
| 210 | +* [Unit tests](./unit-tests.yaml) |
| 211 | +* [Compile checks on CentOS 7 compilers](./pr-compile-check.yaml) |
| 212 | +* [Linting](./pr-lint.yaml) |
| 213 | +* [Windows builds](./pr-windows-build.yaml) |
| 214 | +* [Fuzzing](./pr-fuzz.yaml) |
| 215 | +* [Container image builds](./pr-image-tests.yaml) |
| 216 | +* [Install script checks](./pr-install-script.yaml) |
| 217 | + |
| 218 | +We try to guard these to only trigger when relevant files are changed to reduce any delays or resources used. |
| 219 | +**All should be able to be triggered manually for explicit branches as well.** |
| 220 | + |
| 221 | +The following workflows can be triggered manually for specific PRs too: |
| 222 | + |
| 223 | +* [Integration tests](./pr-integration-test.yaml): Build a container image and run the integration tests as per commits to `master`. |
| 224 | +* [Performance tests](./pr-perf-test.yaml): WIP to trigger a performance test on a dedicated VM and collect the results as a PR comment. |
| 225 | +* [Full package build](./pr-package-tests.yaml): builds all Linux, macOS and Windows packages as well as container images. |
| 226 | + |
| 227 | +To trigger these, apply the relevant label. |