Feature/add union schema #19

Merged
merged 17 commits into from
Mar 11, 2025
Changes from 12 commits
17 changes: 16 additions & 1 deletion CHANGELOG.md
@@ -1,9 +1,24 @@
# dbt_youtube_analytics version.version
# dbt_youtube_analytics v0.5.0
[PR #19](https://github.com/fivetran/dbt_youtube_analytics/pull/19) includes the following updates:

## Feature Updates
- Introduced the ability to union multiple schemas or databases. For more information on how to leverage this feature, refer to the [README](https://github.com/fivetran/dbt_youtube_analytics/blob/main/README.md#unioning-multiple-youtube-analytics-connections).

## Breaking Changes
- To support the unioning functionality, added a new `source_relation` field that identifies the source of each record.
- Updated the source table references so that the union macro executes successfully:
  - The `channel_basic` reference has been changed to `channel_basic_a_2`.
  - The `channel_demographics` reference has been changed to `channel_demographics_a_1`.

## Documentation
- Added Quickstart model counts to README. ([#18](https://github.com/fivetran/dbt_youtube_analytics/pull/18))
- Corrected references to connectors and connections in the README. ([#18](https://github.com/fivetran/dbt_youtube_analytics/pull/18))

## Under the Hood
- Updated the uniqueness tests to include `source_relation`.
- Updated Copyright and README format.
- Added validation tests for `youtube__video_report`.

# dbt_youtube_analytics v0.4.0

The following changes were all made as a result of the [latest updates to the Fivetran YouTube Analytics connector](https://fivetran.com/docs/applications/youtube-analytics/changelog#june2023).
2 changes: 1 addition & 1 deletion LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]
Copyright © 2025 Fivetran Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
23 changes: 19 additions & 4 deletions README.md
@@ -1,4 +1,6 @@
<p align="center">
# Youtube Analytics Transformation dbt Package ([Docs](https://fivetran.github.io/dbt_youtube_analytics/))

<p align="left">
<a alt="License"
href="https://github.com/fivetran/dbt_youtube_analytics/blob/main/LICENSE">
<img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" /></a>
@@ -10,7 +12,6 @@
<img src="https://img.shields.io/badge/Contributions-welcome-blueviolet" /></a>
</p>

# Youtube Analytics Transformation dbt Package ([Docs](https://fivetran.github.io/dbt_youtube_analytics/))
## What does this dbt package do?
- Produces modeled tables that leverage data in the format described by the [YouTube Channel Report schemas](https://fivetran.com/docs/applications/youtube-analytics#schemainformation) and builds off the output of our [Youtube Analytics source package](https://github.com/fivetran/dbt_youtube_analytics_source).
- Transforms the core object tables into analytics-ready models.
@@ -57,7 +58,7 @@ Include the following Youtube Analytics package version in your `packages.yml` f
# packages.yml
packages:
  - package: fivetran/youtube_analytics
    version: [">=0.4.0", "<0.5.0"] # we recommend using ranges to capture non-breaking changes automatically
    version: [">=0.5.0", "<0.6.0"] # we recommend using ranges to capture non-breaking changes automatically
```
Do NOT include the `youtube_analytics_source` package in this file. The transformation package itself has a dependency on it and will install the source package as well.

@@ -103,6 +104,20 @@ If an individual source table has a different name than the package expects, add
vars:
    youtube_analytics_<default_source_table_name>_identifier: your_table_name
```
#### Unioning Multiple Youtube Analytics Connections
If you have multiple Youtube Analytics connections in Fivetran and want to run this package on all of them at once, the package can union the data together and pass the unioned table(s) into the final models. The `source_relation` column of each model identifies the source each record came from. To enable this functionality, set either the `youtube_analytics_union_schemas` or the `youtube_analytics_union_databases` variable (**note that you cannot use both**):

```yml
# dbt_project.yml
...
config-version: 2
vars:
    ## Set EITHER the schemas variable below
    youtube_analytics_union_schemas: ['youtube_analytics_one','youtube_analytics_two']

    ## OR the databases variable below (not both)
    youtube_analytics_union_databases: ['youtube_analytics_one','youtube_analytics_two']
```
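
To illustrate what unioning means for downstream models, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not the package's union macro: the `video` table, its columns, the sample rows, and the exact `source_relation` values are hypothetical, and the schema names are simply the example values from the configuration above. It only shows the idea of stacking the same table from several schemas and tagging each row with its origin.

```python
import sqlite3

# Example schema names from the README snippet; everything else is hypothetical.
SCHEMAS = ["youtube_analytics_one", "youtube_analytics_two"]

conn = sqlite3.connect(":memory:")
for schema in SCHEMAS:
    # Each attached in-memory database stands in for one connection's schema.
    conn.execute(f"attach database ':memory:' as {schema}")
    conn.execute(f"create table {schema}.video (video_id text, title text)")

conn.execute("insert into youtube_analytics_one.video values ('v1', 'Intro')")
conn.execute(
    "insert into youtube_analytics_two.video values ('v1', 'Intro (second channel)')"
)

# One select per schema, each tagging its rows with the schema they came from.
union_sql = "\nunion all\n".join(
    f"select '{schema}' as source_relation, video_id, title from {schema}.video"
    for schema in SCHEMAS
)

rows = conn.execute(union_sql + "\norder by 1").fetchall()
for row in rows:
    print(row)
```

Both rows share the same `video_id`, so only the `source_relation` value tells them apart. That is why the package's uniqueness tests now include `source_relation`.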

### (Optional) Step 6: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for details</summary>
@@ -119,7 +134,7 @@ This dbt package is dependent on the following dbt packages. These dependencies
```yml
packages:
    - package: fivetran/youtube_analytics_source
      version: [">=0.4.0", "<0.5.0"]
      version: [">=0.5.0", "<0.6.0"]
    - package: fivetran/fivetran_utils
      version: [">=0.4.0", "<0.5.0"]
    - package: dbt-labs/dbt_utils
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'youtube_analytics'
version: '0.4.0'
version: '0.5.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
vars:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/run_results.json

This file was deleted.

10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: youtube_analytics_integration_tests_2
schema: youtube_analytics_integration_tests_4
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: youtube_analytics_integration_tests_2
schema: youtube_analytics_integration_tests_4
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: youtube_analytics_integration_tests_2
schema: youtube_analytics_integration_tests_4
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: youtube_analytics_integration_tests_2
schema: youtube_analytics_integration_tests_4
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: youtube_analytics_integration_tests_2
schema: youtube_analytics_integration_tests_4
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
7 changes: 5 additions & 2 deletions integration_tests/dbt_project.yml
@@ -1,13 +1,16 @@
name: 'youtube_analytics_integration_tests'
version: '0.4.0'
version: '0.5.0'
profile: 'integration_tests'
config-version: 2
vars:
youtube_analytics_source:
youtube_analytics_channel_basic_a_2_identifier: "youtube_channel_basic_data"
youtube_analytics_channel_demographics_a_1_identifier: "youtube_channel_demographics_data"
youtube_analytics_video_identifier: "youtube_video_data"
youtube_analytics_schema: youtube_analytics_integration_tests_2
youtube_analytics_schema: youtube_analytics_integration_tests_4

models:
+schema: "youtube_analytics_{{ var('directed_schema','dev') }}"

seeds:
youtube_analytics_integration_tests:
45 changes: 45 additions & 0 deletions integration_tests/tests/consistency/consistency__video_report.sql
@@ -0,0 +1,45 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

with prod as (
select *
from {{ target.schema }}_youtube_analytics_prod.youtube__video_report
),

dev as (
select *
from {{ target.schema }}_youtube_analytics_dev.youtube__video_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
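
The consistency test above is a symmetric difference between the prod and dev builds of the model: it returns rows only when the two disagree. A minimal sketch of the same pattern follows, in Python with the standard-library `sqlite3` module. The `prod` and `dev` tables and their rows are hypothetical, and SQLite spells the operator `except` (which already removes duplicates) rather than `except distinct`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the prod and dev builds of a model.
    create table prod (video_id text, views integer);
    create table dev  (video_id text, views integer);
    insert into prod values ('v1', 10), ('v2', 20);
    insert into dev  values ('v1', 10), ('v2', 25);
""")

SYMMETRIC_DIFF = """
with prod_not_in_dev as (
    select * from prod
    except
    select * from dev
),
dev_not_in_prod as (
    select * from dev
    except
    select * from prod
)
select *, 'from prod' as source from prod_not_in_dev
union all
select *, 'from dev' as source from dev_not_in_prod
"""

# An empty result means the two builds match; any row is a discrepancy.
mismatches = sorted(conn.execute(SYMMETRIC_DIFF).fetchall())
for row in mismatches:
    print(row)
```

Here `v1` matches in both tables and is dropped, while `v2` appears twice: once with the prod value and once with the dev value.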
53 changes: 53 additions & 0 deletions integration_tests/tests/integrity/integrity__video_report.sql
@@ -0,0 +1,53 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

with staging as (

select
video_id,
source_relation,
sum(comments) as comments,
sum(dislikes) as dislikes,
sum(likes) as likes,
sum(shares) as shares,
sum(subscribers_gained) as subscribers_gained,
sum(subscribers_lost) as subscribers_lost,
sum(views) as views,
sum(watch_time_minutes) as watch_time_minutes
from {{ ref('stg_youtube__channel_basic') }}
group by 1, 2
),

report as (

select
video_id,
source_relation,
sum(comments) as comments,
sum(dislikes) as dislikes,
sum(likes) as likes,
sum(shares) as shares,
sum(subscribers_gained) as subscribers_gained,
sum(subscribers_lost) as subscribers_lost,
sum(views) as views,
sum(watch_time_minutes) as watch_time_minutes
from {{ ref('youtube__video_report') }}
group by 1, 2
)

select *
from staging
full outer join report
on staging.video_id = report.video_id
and staging.source_relation = report.source_relation
where
-- flag a record when any one metric differs between staging and the report
staging.comments != report.comments
or staging.dislikes != report.dislikes
or staging.likes != report.likes
or staging.shares != report.shares
or staging.subscribers_gained != report.subscribers_gained
or staging.subscribers_lost != report.subscribers_lost
or staging.views != report.views
or staging.watch_time_minutes != report.watch_time_minutes
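
How the mismatch conditions in the `where` clause are combined determines which rows this test returns. The sketch below is an illustration in Python with the standard-library `sqlite3` module, not the package's test: the tables, rows, and two-metric comparison are hypothetical, and it uses an inner join because older SQLite versions lack `full outer join`. Chaining the conditions with `and` flags a video only when every metric differs, while `or` flags it when any single metric differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical aggregated totals; only 'views' for v2 disagrees.
    create table staging (video_id text, source_relation text, likes integer, views integer);
    create table report  (video_id text, source_relation text, likes integer, views integer);
    insert into staging values ('v1', 'a', 5, 100), ('v2', 'a', 7, 200);
    insert into report  values ('v1', 'a', 5, 100), ('v2', 'a', 7, 250);
""")

COMPARE = """
select staging.video_id
from staging
join report
    on staging.video_id = report.video_id
    and staging.source_relation = report.source_relation
where staging.likes != report.likes {op} staging.views != report.views
"""

# 'and' requires every metric to differ; 'or' requires at least one.
all_differ = conn.execute(COMPARE.format(op="and")).fetchall()
any_differ = conn.execute(COMPARE.format(op="or")).fetchall()

print("and:", all_differ)
print("or: ", any_differ)
```

With this data the `and` form returns nothing because `likes` still matches, whereas the `or` form reports `v2`.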
3 changes: 2 additions & 1 deletion models/intermediate/int_youtube__age_pivot.sql
@@ -10,10 +11,11 @@ with demographics as (
select
date_day,
video_id,
source_relation,
{{ dbt_utils.pivot(column='age_group', values=dbt_utils.get_column_values(ref('stg_youtube__channel_demographics'), 'age_group'),then_value='views_percentage') }}
from demographics

{{ dbt_utils.group_by(n=2) }}
{{ dbt_utils.group_by(n=3) }}
)

select *
3 changes: 2 additions & 1 deletion models/intermediate/int_youtube__gender_pivot.sql
@@ -10,10 +11,11 @@ with demographics as (
select
date_day,
video_id,
source_relation,
{{ dbt_utils.pivot(column='gender', values=dbt_utils.get_column_values(ref('stg_youtube__channel_demographics'), 'gender'),then_value='views_percentage') }}
from demographics

{{ dbt_utils.group_by(n=2) }}
{{ dbt_utils.group_by(n=3) }}
)

select *
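
Both pivot models now group by three columns instead of two. The sketch below suggests why, using Python with the standard-library `sqlite3` module. It is a hand-written approximation of a sum-of-case pivot, not the output of `dbt_utils.pivot`, and the table contents and the `one`/`two` source values are hypothetical. Grouping without `source_relation` adds percentages from different sources into a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical demographics rows for the same video from two sources.
    create table demographics (
        date_day text, video_id text, source_relation text,
        gender text, views_percentage real
    );
    insert into demographics values
        ('2025-01-01', 'v1', 'one', 'FEMALE', 60.0),
        ('2025-01-01', 'v1', 'one', 'MALE',   40.0),
        ('2025-01-01', 'v1', 'two', 'FEMALE', 30.0),
        ('2025-01-01', 'v1', 'two', 'MALE',   70.0);
""")

PIVOT = """
select
    {keys},
    sum(case when gender = 'FEMALE' then views_percentage else 0 end) as female,
    sum(case when gender = 'MALE' then views_percentage else 0 end) as male
from demographics
group by {keys}
order by {keys}
"""

# Two grouping keys: rows from both sources collapse into one.
two_keys = conn.execute(PIVOT.format(keys="date_day, video_id")).fetchall()
# Three grouping keys: each source keeps its own row.
three_keys = conn.execute(
    PIVOT.format(keys="date_day, video_id, source_relation")
).fetchall()

print(two_keys)
print(three_keys)
```

With two keys the single output row carries percentages that sum to 200, which is not meaningful. With `source_relation` included, each source keeps a row whose percentages sum to 100.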
27 changes: 24 additions & 3 deletions models/youtube.yml
@@ -3,12 +3,16 @@ version: 2
models:
- name: youtube__video_report
description: Each record represents the daily aggregation of your YouTube video performance.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- daily_video_id
- source_relation
columns:
- name: daily_video_id
description: Unique identifier which represents a composite key made up of the date_day and video_id.
tests:
- not_null
- unique
- name: date_day
description: The date the video was viewed
- name: video_id
@@ -51,6 +55,8 @@ models:
description: Total watch time in minutes the video received
- name: video_duration_seconds
description: The total duration of the video in seconds
- name: source_relation
description: The source of the record, if the unioning functionality is being used.

- name: youtube__age_demographics_pivot
description: Each record represents a daily video view percentage with the age ranges pivoted out for quicker analysis.
@@ -59,6 +65,7 @@ models:
combination_of_columns:
- date_day
- video_id
- source_relation
columns:
- name: date_day
description: The date the video was viewed
@@ -88,6 +95,8 @@ models:
description: Total number of views percent attributed to the age range 55 - 64 years old
- name: AGE_13_17_view_percentage
description: Total number of views percent attributed to the age range 13 - 17 years old
- name: source_relation
description: The source of the record, if the unioning functionality is being used.

- name: youtube__demographics_report
description: Each record represents a daily video view percentage by gender, age, and country.
@@ -99,6 +108,7 @@ models:
- age_group
- country_code
- gender
- source_relation
columns:
- name: date_day
description: The date the video was viewed
@@ -126,6 +136,8 @@ models:
description: Gender of the user who watched the video. Either 'MALE', 'FEMALE', or 'GENDER_OTHER'
- name: views_percentage
description: Total percent of views the user makes up for the video
- name: source_relation
description: The source of the record, if the unioning functionality is being used.

- name: youtube__gender_demographics_pivot
description: Each record represents a daily video view percentage with the gender options pivoted out for quicker analysis.
@@ -134,6 +146,7 @@ models:
combination_of_columns:
- date_day
- video_id
- source_relation
columns:
- name: date_day
description: The date the video was viewed
@@ -155,14 +168,20 @@ models:
description: Total number of views percent attributed to the viewers who identify as female
- name: GENDER_OTHER_view_percentage
description: Total number of views percent attributed to the viewers who identify as neither male nor female
- name: source_relation
description: The source of the record, if the unioning functionality is being used.

- name: youtube__video_metadata
description: Each record represents an individual video enriched with metadata.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- video_id
- source_relation
columns:
- name: video_id
description: Unique identifier of the video
tests:
- unique
- not_null
- name: _fivetran_synced
description: Timestamp of when the record was synced by Fivetran
@@ -253,4 +272,6 @@ models:
- name: medium_thumbnail_url
description: The medium quality thumbnail url
- name: high_thumbnail_url
description: The high quality thumbnail url
description: The high quality thumbnail url
- name: source_relation
description: The source of the record, if the unioning functionality is being used.
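
The schema changes above replace single-column `unique` tests with a combination test that includes `source_relation`. A minimal sketch of what such a composite-key check amounts to is shown here in Python with the standard-library `sqlite3` module. The `duplicate_keys` helper, the table contents, and the `one`/`two` source values are hypothetical, and this is a plain group-by/having query rather than the `dbt_utils` test itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical unioned output: video 'v1' exists in two sources.
    create table youtube__video_metadata (video_id text, source_relation text);
    insert into youtube__video_metadata values
        ('v1', 'one'), ('v1', 'two'), ('v2', 'one');
""")


def duplicate_keys(columns):
    """Return key combinations that occur more than once (hypothetical helper)."""
    cols = ", ".join(columns)
    sql = (
        f"select {cols} from youtube__video_metadata "
        f"group by {cols} having count(*) > 1"
    )
    return conn.execute(sql).fetchall()


# Checking video_id alone reports a false duplicate once sources are unioned.
single_column = duplicate_keys(["video_id"])
# Adding source_relation to the key makes every row unique again.
composite = duplicate_keys(["video_id", "source_relation"])

print(single_column)
print(composite)
```

A column-level `unique` test on `video_id` would fail as soon as two connections contain the same video, which is the case the combination test avoids.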