Commit aae8b47

Feature: Union schema compatibility (#21)
* MagicBot/add-union-schema updates
* add union schema
* update for databricks and version
* update identifiers
* update pkgs
* update materialization
* update grouping
* update changelog & readme
* add consistency tests
* update consistency report
* update changelog & add autoreleaser
* regen docs
* Update packages.yml
1 parent be158f0 commit aae8b47

46 files changed, +627 -57 lines changed

.buildkite/hooks/pre-command

+2 -1
@@ -21,4 +21,5 @@ export CI_SNOWFLAKE_DBT_USER=$(gcloud secrets versions access latest --secret="C
 export CI_SNOWFLAKE_DBT_WAREHOUSE=$(gcloud secrets versions access latest --secret="CI_SNOWFLAKE_DBT_WAREHOUSE" --project="dbt-package-testing-363917")
 export CI_DATABRICKS_DBT_HOST=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HOST" --project="dbt-package-testing-363917")
 export CI_DATABRICKS_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
-export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
+export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
+export CI_DATABRICKS_DBT_CATALOG=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_CATALOG" --project="dbt-package-testing-363917")

.buildkite/pipeline.yml

+1
@@ -69,5 +69,6 @@ steps:
 - "CI_DATABRICKS_DBT_HOST"
 - "CI_DATABRICKS_DBT_HTTP_PATH"
 - "CI_DATABRICKS_DBT_TOKEN"
+- "CI_DATABRICKS_DBT_CATALOG"
 commands: |
   bash .buildkite/scripts/run_models.sh databricks

.buildkite/scripts/run_models.sh

+3 -1
@@ -19,4 +19,6 @@ dbt deps
 dbt seed --target "$db" --full-refresh
 dbt run --target "$db" --full-refresh
 dbt test --target "$db"
-dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
+dbt run --vars '{apple_store__using_subscriptions: true, google_play__using_earnings: true, google_play__using_subscriptions: true}' --target "$db" --full-refresh
+dbt test --vars '{apple_store__using_subscriptions: true, google_play__using_earnings: true, google_play__using_subscriptions: true}' --target "$db"
+dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"

.github/workflows/auto-release.yml

+13
@@ -0,0 +1,13 @@
+name: 'auto release'
+on:
+  pull_request:
+    types:
+      - closed
+    branches:
+      - main
+
+jobs:
+  call-workflow-passing-data:
+    if: github.event.pull_request.merged
+    uses: fivetran/dbt_package_automations/.github/workflows/auto-release.yml@main
+    secrets: inherit

CHANGELOG.md

+16
@@ -1,3 +1,19 @@
+# dbt_app_reporting v0.4.0
+[PR #21](https://github.com/fivetran/dbt_app_reporting/pull/21) includes the following updates:
+
+## 🚨 Breaking Changes 🚨
+- Identifier variables for the following packages have been updated for consistency with the source name and compatibility with the union schema feature. See each package's changelog for a full list of changes.
+  - [dbt_apple_store](https://github.com/fivetran/dbt_apple_store/blob/main/CHANGELOG.md#dbt_apple_store-v040)
+  - [dbt_google_play](https://github.com/fivetran/dbt_google_play/blob/main/CHANGELOG.md#dbt_google_play-v040)
+
+## Feature update 🎉
+- Unioning capability! This adds the ability to union source data from multiple app_reporting connectors. Refer to the [README](https://github.com/fivetran/dbt_app_reporting/blob/main/README.md#union-multiple-connectors) for more details.
+- Added a `source_relation` column in each upstream model for tracking the source of each record.
+  - The `source_relation` column is also persisted from the upstream models to the end models.
+
+## Under the hood
+- Included auto-releaser GitHub Actions workflow to automate future releases.
+
 # dbt_app_reporting v0.3.2
 ## Bug Fixes
 [PR #19](https://github.com/fivetran/dbt_app_reporting/pull/19) includes the following update:

README.md

+23 -7
@@ -45,7 +45,7 @@ Include the following github package version in your `packages.yml`
 ```yaml
 packages:
   - package: fivetran/app_reporting
-    version: [">=0.3.0", "<0.4.0"] # we recommend using ranges to capture non-breaking changes automatically
+    version: [">=0.4.0", "<0.5.0"] # we recommend using ranges to capture non-breaking changes automatically
 ```
 
 Do NOT include the individual app platform packages in this file. The app reporting package itself has dependencies on these packages and will install them as well.
@@ -114,15 +114,31 @@ models:
 > Provide a blank `+schema: ` to write to the `target_schema` without any suffix.
 
 ## (Optional) Step 7: Additional configurations
-<details><summary>Expand to view configurations</summary>
+<details open><summary>Expand/collapse configurations</summary>
+
+### Union multiple connectors
+If you have multiple app reporting connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source each record came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `<package_name>_union_schemas` OR the `<package_name>_union_databases` variable (you cannot set both) in your root `dbt_project.yml` file. Below are the variables and examples for each connector:
+
+```yml
+vars:
+    apple_store_union_schemas: ['apple_store_usa','apple_store_canada']
+    apple_store_union_databases: ['apple_store_usa','apple_store_canada']
+
+    google_play_union_schemas: ['google_play_usa','google_play_canada']
+    google_play_union_databases: ['google_play_usa','google_play_canada']
+```
+Please be aware that the native `source.yml` connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined `source.yml`.
+
+To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.
 
 ### Change the source table references
 If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
 > IMPORTANT: See the Apple Store [`dbt_project.yml`](https://github.com/fivetran/dbt_apple_store_source/blob/main/dbt_project.yml) and Google Play [`dbt_project.yml`](https://github.com/fivetran/dbt_google_play_source/blob/main/dbt_project.yml) variable declarations to see the expected names.
 
 ```yml
 vars:
-    <default_source_table_name>_identifier: your_table_name
+    apple_store_<default_source_table_name>_identifier: your_table_name
+    google_play_<default_source_table_name>_identifier: your_table_name
 ```
 
 </details>
@@ -143,16 +159,16 @@ This dbt package is dependent on the following dbt packages. For more informatio
 ```yml
 packages:
     - package: fivetran/apple_store
-      version: [">=0.3.0", "<0.4.0"]
+      version: [">=0.4.0", "<0.5.0"]
 
     - package: fivetran/apple_store_source
-      version: [">=0.3.0", "<0.4.0"]
+      version: [">=0.4.0", "<0.5.0"]
 
     - package: fivetran/google_play
-      version: [">=0.3.0", "<0.4.0"]
+      version: [">=0.4.0", "<0.5.0"]
 
     - package: fivetran/google_play_source
-      version: [">=0.3.0", "<0.4.0"]
+      version: [">=0.4.0", "<0.5.0"]
 
     - package: fivetran/fivetran_utils
       version: [">=0.4.0", "<0.5.0"]

dbt_project.yml

+1 -1
@@ -1,5 +1,5 @@
 name: 'app_reporting'
-version: '0.3.2'
+version: '0.4.0'
 config-version: 2
 models:
   app_reporting:

docs/catalog.json

+1
Large diffs are not rendered by default.

docs/index.html

+75
Large diffs are not rendered by default.

docs/manifest.json

+1
Large diffs are not rendered by default.

docs/run_results.json

+1
Large diffs are not rendered by default.

integration_tests/ci/sample.profiles.yml

+1 -1
@@ -45,7 +45,7 @@ integration_tests:
       schema: app_reporting_integrations_test_5
       threads: 8
     databricks:
-      catalog: null
+      catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
       host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
       http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
       schema: app_reporting_integrations_test_5

integration_tests/dbt_project.yml

+38 -34
@@ -1,46 +1,49 @@
 name: 'app_reporting_integration_tests'
-version: '0.3.2'
+version: '0.4.0'
 profile: 'integration_tests'
 config-version: 2
 vars:
+  # apple_store__using_subscriptions: true # uncomment this line when generating docs!
+  # google_play__using_subscriptions: true # uncomment this line when regenerating docs!
+  # google_play__using_earnings: true # uncomment this line when regenerating docs!
   google_play_schema: app_reporting_integrations_test_5
   apple_store_schema: app_reporting_integrations_test_5
   google_play_source:
-    stats_installs_app_version_identifier: "stats_installs_app_version"
-    stats_crashes_app_version_identifier: "stats_crashes_app_version"
-    stats_ratings_app_version_identifier: "stats_ratings_app_version"
-    stats_installs_device_identifier: "stats_installs_device"
-    stats_ratings_device_identifier: "stats_ratings_device"
-    stats_installs_os_version_identifier: "stats_installs_os_version"
-    stats_ratings_os_version_identifier: "stats_ratings_os_version"
-    stats_crashes_os_version_identifier: "stats_crashes_os_version"
-    stats_installs_country_identifier: "stats_installs_country"
-    stats_ratings_country_identifier: "stats_ratings_country"
-    stats_store_performance_country_identifier: "stats_store_performance_country"
-    stats_store_performance_traffic_source_identifier: "stats_store_performance_traffic_source"
-    stats_installs_overview_identifier: "stats_installs_overview"
-    stats_crashes_overview_identifier: "stats_crashes_overview"
-    stats_ratings_overview_identifier: "stats_ratings_overview"
-    earnings_identifier: "earnings"
-    financial_stats_subscriptions_country_identifier: "financial_stats_subscriptions_country"
+    google_play_stats_installs_app_version_identifier: "stats_installs_app_version"
+    google_play_stats_crashes_app_version_identifier: "stats_crashes_app_version"
+    google_play_stats_ratings_app_version_identifier: "stats_ratings_app_version"
+    google_play_stats_installs_device_identifier: "stats_installs_device"
+    google_play_stats_ratings_device_identifier: "stats_ratings_device"
+    google_play_stats_installs_os_version_identifier: "stats_installs_os_version"
+    google_play_stats_ratings_os_version_identifier: "stats_ratings_os_version"
+    google_play_stats_crashes_os_version_identifier: "stats_crashes_os_version"
+    google_play_stats_installs_country_identifier: "stats_installs_country"
+    google_play_stats_ratings_country_identifier: "stats_ratings_country"
+    google_play_stats_store_performance_country_identifier: "stats_store_performance_country"
+    google_play_stats_store_performance_traffic_source_identifier: "stats_store_performance_traffic_source"
+    google_play_stats_installs_overview_identifier: "stats_installs_overview"
+    google_play_stats_crashes_overview_identifier: "stats_crashes_overview"
+    google_play_stats_ratings_overview_identifier: "stats_ratings_overview"
+    google_play_earnings_identifier: "earnings"
+    google_play_financial_stats_subscriptions_country_identifier: "financial_stats_subscriptions_country"
 
   apple_store_source:
-    app_identifier: "app"
-    app_store_platform_version_source_type_report_identifier: "app_store_platform_version_source_type"
-    app_store_source_type_device_report_identifier: "app_store_source_type_device"
-    app_store_territory_source_type_report_identifier: "app_store_territory_source_type"
-    crashes_app_version_device_report_identifier: "crashes_app_version"
-    crashes_platform_version_device_report_identifier: "crashes_platform_version"
-    downloads_platform_version_source_type_report_identifier: "downloads_platform_version_source_type"
-    downloads_source_type_device_report_identifier: "downloads_source_type_device"
-    downloads_territory_source_type_report_identifier: "downloads_territory_source_type"
-    sales_account_identifier: "sales_account"
-    sales_subscription_event_summary_identifier: "sales_subscription_events"
-    sales_subscription_summary_identifier: "sales_subscription_summary"
-    usage_app_version_source_type_report_identifier: "usage_app_version_source_type"
-    usage_platform_version_source_type_report_identifier: "usage_platform_version_source_type"
-    usage_source_type_device_report_identifier: "usage_source_type_device"
-    usage_territory_source_type_report_identifier: usage_territory_source_type
+    apple_store_app_identifier: "app"
+    apple_store_app_store_platform_version_source_type_report_identifier: "app_store_platform_version_source_type"
+    apple_store_app_store_source_type_device_report_identifier: "app_store_source_type_device"
+    apple_store_app_store_territory_source_type_report_identifier: "app_store_territory_source_type"
+    apple_store_crashes_app_version_device_report_identifier: "crashes_app_version"
+    apple_store_crashes_platform_version_device_report_identifier: "crashes_platform_version"
+    apple_store_downloads_platform_version_source_type_report_identifier: "downloads_platform_version_source_type"
+    apple_store_downloads_source_type_device_report_identifier: "downloads_source_type_device"
+    apple_store_downloads_territory_source_type_report_identifier: "downloads_territory_source_type"
+    apple_store_sales_account_identifier: "sales_account"
+    apple_store_sales_subscription_event_summary_identifier: "sales_subscription_events"
+    apple_store_sales_subscription_summary_identifier: "sales_subscription_summary"
+    apple_store_usage_app_version_source_type_report_identifier: "usage_app_version_source_type"
+    apple_store_usage_platform_version_source_type_report_identifier: "usage_platform_version_source_type"
+    apple_store_usage_source_type_device_report_identifier: "usage_source_type_device"
+    apple_store_usage_territory_source_type_report_identifier: usage_territory_source_type
 
   apple_store__subscription_events:
     - 'Renew'
@@ -55,6 +58,7 @@ models:
   +persist_docs:
     relation: "{{ false if target.type in ('spark','databricks') else true }}"
     columns: "{{ false if target.type in ('spark','databricks') else true }}"
+  +schema: "app_reporting_{{ var('directed_schema','dev') }}" ## To be used for validation testing
 
 seeds:
   app_reporting_integration_tests:
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
+{{ config(
+    tags="fivetran_validations",
+    enabled=var('fivetran_validation_tests_enabled', false)
+) }}
+
+-- this test ensures the app_reporting__app_version_report end model matches the prior version
+with prod as (
+    select *
+    from {{ target.schema }}_app_reporting_prod.app_reporting__app_version_report
+),
+
+dev as (
+    select *
+    from {{ target.schema }}_app_reporting_dev.app_reporting__app_version_report
+),
+
+prod_not_in_dev as (
+    -- rows from prod not found in dev
+    select * from prod
+    except distinct
+    select * from dev
+),
+
+dev_not_in_prod as (
+    -- rows from dev not found in prod
+    select * from dev
+    except distinct
+    select * from prod
+),
+
+final as (
+    select
+        *,
+        'from prod' as source
+    from prod_not_in_dev
+
+    union all -- union since we only care if rows are produced
+
+    select
+        *,
+        'from dev' as source
+    from dev_not_in_prod
+)
+
+select *
+from final
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
+{{ config(
+    tags="fivetran_validations",
+    enabled=var('fivetran_validation_tests_enabled', false)
+) }}
+
+-- this test ensures the app_reporting__country_report end model matches the prior version
+with prod as (
+    select *
+    from {{ target.schema }}_app_reporting_prod.app_reporting__country_report
+),
+
+dev as (
+    select *
+    from {{ target.schema }}_app_reporting_dev.app_reporting__country_report
+),
+
+prod_not_in_dev as (
+    -- rows from prod not found in dev
+    select * from prod
+    except distinct
+    select * from dev
+),
+
+dev_not_in_prod as (
+    -- rows from dev not found in prod
+    select * from dev
+    except distinct
+    select * from prod
+),
+
+final as (
+    select
+        *,
+        'from prod' as source
+    from prod_not_in_dev
+
+    union all -- union since we only care if rows are produced
+
+    select
+        *,
+        'from dev' as source
+    from dev_not_in_prod
+)
+
+select *
+from final
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
+{{ config(
+    tags="fivetran_validations",
+    enabled=var('fivetran_validation_tests_enabled', false)
+) }}
+
+-- this test ensures the app_reporting__device_report end model matches the prior version
+with prod as (
+    select *
+    from {{ target.schema }}_app_reporting_prod.app_reporting__device_report
+),
+
+dev as (
+    select *
+    from {{ target.schema }}_app_reporting_dev.app_reporting__device_report
+),
+
+prod_not_in_dev as (
+    -- rows from prod not found in dev
+    select * from prod
+    except distinct
+    select * from dev
+),
+
+dev_not_in_prod as (
+    -- rows from dev not found in prod
+    select * from dev
+    except distinct
+    select * from prod
+),
+
+final as (
+    select
+        *,
+        'from prod' as source
+    from prod_not_in_dev
+
+    union all -- union since we only care if rows are produced
+
+    select
+        *,
+        'from dev' as source
+    from dev_not_in_prod
+)
+
+select *
+from final
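
Note on the three validation tests in this commit: each one compares a `prod` and a `dev` build of an end model with a two-way `except distinct`, and the test fails if either direction returns rows. A minimal sketch of that comparison, run against an in-memory SQLite database rather than a dbt target, might look like the following. The table names, the columns, and the `diff_rows` helper are hypothetical and not part of the package. SQLite's `except` already removes duplicates, so the `distinct` keyword is dropped here.

```python
import sqlite3


def diff_rows(conn, prod_table, dev_table):
    """Return rows present in only one table, labelled with their origin.

    Hypothetical helper: table names are interpolated directly into the
    query, so only pass trusted identifiers.
    """
    query = f"""
        select *, 'from prod' as source
        from (select * from {prod_table} except select * from {dev_table})
        union all
        select *, 'from dev' as source
        from (select * from {dev_table} except select * from {prod_table})
    """
    return conn.execute(query).fetchall()


# Illustrative data only: one matching row and one row that differs.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table prod (app_name text, date_day text, downloads integer);
    create table dev  (app_name text, date_day text, downloads integer);
    insert into prod values ('app_a', '2023-01-01', 10), ('app_b', '2023-01-01', 5);
    insert into dev  values ('app_a', '2023-01-01', 10), ('app_b', '2023-01-01', 6);
    """
)

# Sort so the result order does not depend on the database engine.
mismatches = sorted(diff_rows(conn, "prod", "dev"))
for row in mismatches:
    print(row)
```

An empty result means the two builds agree. Any returned row carries a `source` label showing which side it came from, which is the same signal the dbt tests rely on.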
