[FEATURE][EXPERIMENTAL] Dependency graph based testing strategy and related pipeline #3738

Merged
60 commits
62dfac7
feat: Add script to scripts/ dir
cdkini Nov 23, 2021
b811e4f
feat: Use script in pipeline
cdkini Nov 23, 2021
5415422
test: Add test change
cdkini Nov 23, 2021
1a40916
feat: Add git integration
cdkini Nov 23, 2021
ab92796
test: Add another test change
cdkini Nov 23, 2021
1328d14
chore: Revert data context
cdkini Nov 23, 2021
50292cc
test: Another test change
cdkini Nov 23, 2021
47a9236
test: Another test change
cdkini Nov 23, 2021
9ffd9ae
docs: Add docstr to script
cdkini Nov 23, 2021
2917fcd
chore: Bring down depth to 2
cdkini Nov 23, 2021
e69d070
docs: Additional docstr updates
cdkini Nov 23, 2021
0f97f51
docs: Additional docstr updates
cdkini Nov 23, 2021
8486139
docs: Add comment in graph traversal func
cdkini Nov 23, 2021
1d32194
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Nov 23, 2021
5da95b5
feat: Misc tweaks to algo
cdkini Nov 24, 2021
8b99f40
feat: Misc changes
cdkini Nov 24, 2021
1d3fb19
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Nov 24, 2021
872c930
chore: Revert test changes
cdkini Nov 24, 2021
371244f
docs: Update all docstrs
cdkini Nov 24, 2021
bb8af24
docs: Comment on script output
cdkini Nov 24, 2021
4fd4889
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Nov 24, 2021
5568228
feat: Add support for picking up test files in diff
cdkini Nov 24, 2021
eb8e5de
refactor: Move logic from helper into `determine_files_to_test`
cdkini Nov 24, 2021
9cee8d0
test: Test change
cdkini Nov 24, 2021
236d21d
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Nov 29, 2021
6f450bc
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Nov 29, 2021
8d2eaae
chore: Add graphs for demonstration purposes
cdkini Nov 29, 2021
17fd8b3
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 2, 2021
cbb2a64
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 3, 2021
e02aeb0
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 4, 2021
8a7b9b5
chore: General cleanup
cdkini Dec 4, 2021
9251dd1
refactor: Misc cleanup
cdkini Dec 4, 2021
04a3b80
fix: Add CLI arg to CI/CD
cdkini Dec 4, 2021
6c89411
fix: Update check for proper args
cdkini Dec 4, 2021
4bcf7ad
Update scripts/determine_tests_to_run.py
cdkini Dec 5, 2021
c97f0ec
Update azure-pipelines.yml
cdkini Dec 5, 2021
a68a94f
feat: Use argparse for --depth arg
cdkini Dec 5, 2021
b186e50
test: Trigger pipeline
cdkini Dec 5, 2021
326781d
feat: Add --filter opt
cdkini Dec 5, 2021
cb1467f
fix: Leave CLI integration alone due to errors
cdkini Dec 5, 2021
2b599be
Update azure-pipelines.yml
cdkini Dec 5, 2021
e288ac3
chore: Cleanup of AST traversal
cdkini Dec 5, 2021
b0a64c5
Merge branch 'feature/script-for-streamlined-testing-in-azure-pipelin…
cdkini Dec 5, 2021
c3818d1
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 7, 2021
94ecaea
chore: Try change in renderer
cdkini Dec 7, 2021
cc15d0e
chore: Revert test changes
cdkini Dec 7, 2021
7715aab
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 7, 2021
ed133f7
chore: Move changes to their own YAML
cdkini Dec 7, 2021
f74b682
chore: Test change
cdkini Dec 7, 2021
30d4b26
feat: Limit Azure YAML to 3.8
cdkini Dec 8, 2021
e71dcf9
feat: Clean up pipeline YAML
cdkini Dec 8, 2021
2215050
chore: Restore test change
cdkini Dec 8, 2021
ab2b438
feat: Misc changes
cdkini Dec 8, 2021
8385768
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 8, 2021
366eddc
feat: Further clean up pipeline YAML
cdkini Dec 8, 2021
b07eed5
test: See what final pipeline looks like
cdkini Dec 8, 2021
cf48f35
chore: Revert test changes
cdkini Dec 8, 2021
a82a4f7
chore: Add back remainder of matrices
cdkini Dec 8, 2021
1e49934
Merge branch 'develop' of github.com:great-expectations/great_expecta…
cdkini Dec 8, 2021
300d1c9
fix: Fix edge case paths
cdkini Dec 8, 2021
333 changes: 333 additions & 0 deletions azure-pipelines-dependency-graph-testing.yml
@@ -0,0 +1,333 @@
# This pipeline is meant to run the GE test suite with an experimental test runner strategy.
Review comment (Contributor): ❤️

# The significant changes between this YAML and the primary azure-pipelines.yml file are:
# - Only tests compatibility and comprehensive matrices
# - Removes stages irrelevant to this strategy (i.e. CLI integration and usage stats)
# - Utilizes a custom script to filter the test files selected and passed on to pytest
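
The custom script mentioned in the last bullet is `scripts/determine_tests_to_run.py`, whose source is not shown in this file. As a rough illustration only, and assuming it behaves the way the PR title and commit messages suggest (AST traversal, a dependency graph, a `--depth` limit), the idea can be sketched as below. Every function name here is hypothetical and the real script may differ substantially.

```python
import ast
from collections import deque
from typing import Dict, Iterable, List, Set


def parse_imports(source: str) -> Set[str]:
    """Return the dotted module names imported (absolutely) by some Python source."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def build_reverse_graph(sources: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each known module to the set of known modules that import it."""
    reverse: Dict[str, Set[str]] = {name: set() for name in sources}
    for name, source in sources.items():
        for imported in parse_imports(source):
            if imported in reverse:
                reverse[imported].add(name)
    return reverse


def modules_to_test(
    changed: Iterable[str], reverse: Dict[str, Set[str]], depth: int
) -> List[str]:
    """Breadth-first walk from the changed modules through their importers,
    at most `depth` hops away, keeping only modules that look like tests."""
    changed = list(changed)
    seen: Set[str] = set(changed)
    queue = deque((module, 0) for module in changed)
    while queue:
        module, distance = queue.popleft()
        if distance == depth:
            continue  # do not expand past the depth limit
        for importer in reverse.get(module, ()):
            if importer not in seen:
                seen.add(importer)
                queue.append((importer, distance + 1))
    # Assumption: test modules are recognised by a `test_` prefix on the last component.
    return sorted(m for m in seen if m.split(".")[-1].startswith("test_"))
```

A larger depth pulls in tests that depend on the change only indirectly, at the cost of a longer run; a changed test module is always selected because the walk starts from it.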

trigger:
  branches:
    include:
      - pre_pr-*
      - develop
      - main

resources:
  containers:
    - container: postgres
      image: postgres:11
      ports:
        - 5432:5432
      env:
        POSTGRES_DB: "test_ci"
        POSTGRES_HOST_AUTH_METHOD: "trust"
    - container: mysql
      image: mysql:8.0.20
      ports:
        - 3306:3306
      env:
        MYSQL_ALLOW_EMPTY_PASSWORD: "yes"
        MYSQL_DATABASE: test_ci
    - container: mssql
      image: mcr.microsoft.com/mssql/server:2019-latest
      env:
        ACCEPT_EULA: Y
        MSSQL_SA_PASSWORD: ReallyStrongPwd1234%^&*
        MSSQL_DB: test_ci
        MSSQL_PID: Developer
      ports:
        - 1433:1433

variables:
  isMain: $[eq(variables['Build.SourceBranch'], 'refs/heads/main')]
  isDevelop: $[eq(variables['Build.SourceBranch'], 'refs/heads/develop')]
  GE_USAGE_STATISTICS_URL: "https://qa.stats.greatexpectations.io/great_expectations/v1/usage_statistics"

stages:
  - stage: scope_check
    pool:
      vmImage: 'ubuntu-20.04'
    jobs:
      - job: changes
        steps:
          - task: ChangedFiles@1
            name: CheckChanges
            inputs:
              verbose: true
              rules: |
                [ContribChanged]
                contrib/**

                [ExperimentalChanged]
                contrib/experimental/**

                [DocsChanged]
                docs/**
                tests/integration/docusaurus/**
                tests/integration/fixtures/**
                tests/test_sets/**

                [GEChanged]
                great_expectations/**
                tests/**
                /*.txt
                /*.yml
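
The rule categories above overlap: a path can satisfy more than one category, and later stages only look at `GEChanged`. The sketch below illustrates that overlap in Python. It is not the `ChangedFiles@1` implementation; it assumes that a category is set when any changed path matches any of its globs, that `**` matches across directories, and that a leading `/` anchors a pattern to top-level files only. The names `RULES`, `matches` and `evaluate_rules` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Patterns copied from the `rules` input of the scope_check stage.
RULES: Dict[str, List[str]] = {
    "ContribChanged": ["contrib/**"],
    "ExperimentalChanged": ["contrib/experimental/**"],
    "DocsChanged": [
        "docs/**",
        "tests/integration/docusaurus/**",
        "tests/integration/fixtures/**",
        "tests/test_sets/**",
    ],
    "GEChanged": ["great_expectations/**", "tests/**", "/*.txt", "/*.yml"],
}


def matches(path: str, pattern: str) -> bool:
    """Glob match under the assumptions stated above."""
    if pattern.startswith("/"):
        # Assumed meaning: only files directly in the repository root.
        return "/" not in path and fnmatchcase(path, pattern[1:])
    # fnmatch's `*` also matches `/`, so `dir/**` covers nested paths.
    return fnmatchcase(path, pattern)


def evaluate_rules(
    changed_files: Iterable[str], rules: Dict[str, List[str]] = RULES
) -> Dict[str, bool]:
    """Return one boolean per category, true if any changed file matches it."""
    files = list(changed_files)
    return {
        name: any(matches(path, pattern) for path in files for pattern in patterns)
        for name, patterns in rules.items()
    }
```

Under these assumptions a change limited to `contrib/` leaves `GEChanged` false, so the lint and test jobs below would be skipped on any branch other than `main`.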

  - stage: lint
    dependsOn: scope_check
    pool:
      vmImage: 'ubuntu-latest'

    jobs:
      - job: lint
        condition: or(eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true), eq(variables.isMain, true))
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: 3.7
            displayName: 'Use Python 3.7'

          - script: |
              pip install isort[requirements]==5.4.2 flake8==3.8.3 black==21.8b0 pyupgrade==2.7.2
              EXIT_STATUS=0
              isort . --check-only --skip docs/ || EXIT_STATUS=$?
              black --check --exclude docs/ . || EXIT_STATUS=$?
              flake8 great_expectations/core || EXIT_STATUS=$?
              pyupgrade --py3-plus || EXIT_STATUS=$?
              exit $EXIT_STATUS
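
The lint script runs every checker even when an earlier one fails, and each `|| EXIT_STATUS=$?` overwrites the stored status with that of the most recent failure. The same control flow, written as a small Python sketch (the function name `run_all` is hypothetical and is not part of this PR):

```python
import subprocess
from typing import Sequence


def run_all(commands: Sequence[Sequence[str]]) -> int:
    """Run every command in order and return the exit status of the last
    one that failed, or 0 if all of them succeeded."""
    exit_status = 0
    for command in commands:
        returncode = subprocess.run(command).returncode
        if returncode != 0:
            # Mirrors `cmd || EXIT_STATUS=$?`: keep going, remember the failure.
            exit_status = returncode
    return exit_status
```

The benefit over stopping at the first failure is that a single pipeline run reports every linter's complaints at once.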

  - stage: required
    dependsOn: [scope_check, lint]
    pool:
      vmImage: 'ubuntu-18.04'

    jobs:
      - job: compatibility_matrix
        condition: or(eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true), eq(variables.isMain, true))
        variables:
          GE_pytest_opts: '--no-sqlalchemy --no-spark'
        strategy:
          matrix:
            Python36-Pandas023:
              python.version: '3.6'
              numpy.version: '1.14.1'
              pandas.version: '0.23.4'
              scipy.version: '0.19.0'
              GE_pytest_pip_opts: '--requirement requirements-dev-base.txt --constraint constraints-dev.txt'
            Python37-Pandas025:
              python.version: '3.7'
              numpy.version: '1.14.1'
              pandas.version: '0.25.3'
              # numpy 1.20 and pandas 0.25.3 do not coexist happily
              scipy.version: '0.19.0'
              GE_pytest_pip_opts: '"numpy<1.20" --requirement requirements-dev-base.txt --constraint constraints-dev.txt'
            Python38-PandasLatest:
              python.version: '3.8'
              numpy.version: 'latest'
              pandas.version: 'latest'
              scipy.version: 'latest'
              GE_pytest_pip_opts: '--requirement requirements-dev-base.txt --constraint constraints-dev.txt'

        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '$(python.version)'
            displayName: 'Use Python $(python.version)'

          - bash: python -m pip install --upgrade pip # --upgrade pip==20.2.4
            displayName: 'Update pip'

          - bash: pip install numpy
            condition: eq(variables['numpy.version'], 'latest')
            displayName: 'Install numpy latest'

          - bash: pip install pandas
            condition: eq(variables['pandas.version'], 'latest')
            displayName: 'Install pandas latest'

          - bash: pip install scipy
            condition: eq(variables['scipy.version'], 'latest')
            displayName: 'Install scipy latest'

          - bash: pip install pandas==$(pandas.version)
            condition: ne(variables['pandas.version'], 'latest')
            displayName: 'Install pandas - $(pandas.version)'

          - script: |
              pip install $(GE_pytest_pip_opts)
              pip install --requirement requirements.txt
              # Consider fragmenting *all* integration tests into separate folder and run
              pip install .
            displayName: 'Install dependencies'

          - script: |
              pip install pytest pytest-cov pytest-azurepipelines
              python scripts/determine_tests_to_run.py --depth 3 | xargs pytest $(GE_pytest_opts) --napoleon-docstrings --junitxml=junit/test-results.xml --cov=. --cov-report=xml --cov-report=html --ignore=tests/cli --ignore=tests/integration/usage_statistics
            displayName: 'pytest'

          - task: PublishTestResults@2
            condition: succeededOrFailed()
            inputs:
              testResultsFiles: '**/test-*.xml'
              testRunTitle: 'Publish test results for Python $(python.version)'

          - task: PublishCodeCoverageResults@1
            inputs:
              codeCoverageTool: Cobertura
              summaryFileLocation: '$(System.DefaultWorkingDirectory)/**/coverage.xml'
              reportDirectory: '$(System.DefaultWorkingDirectory)/**/htmlcov'

      - job: comprehensive
        condition: or(eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true), eq(variables.isMain, true))

        services:
          postgres: postgres

        variables:
          GE_pytest_opts: ''

        strategy:
          matrix:
            Python36:
              python.version: '3.6'
              pandas.version: 'latest'
              GE_pytest_pip_opts: '"pyspark<3.0.0" --requirement requirements-dev.txt --constraint constraints-dev.txt'
            Python37:
              python.version: '3.7'
              pandas.version: 'latest'
              GE_pytest_pip_opts: '"pyspark<3.0.0" --requirement requirements-dev.txt --constraint constraints-dev.txt'
            Python38:
              python.version: '3.8'
              pandas.version: 'latest'
              GE_pytest_pip_opts: '--requirement requirements-dev.txt --constraint constraints-dev.txt'
            Python39:
              python.version: '3.9'
              pandas.version: 'latest'
              GE_pytest_pip_opts: '--requirement requirements-dev.txt --constraint constraints-dev.txt'

        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '$(python.version)'
            displayName: 'Use Python $(python.version)'

          - bash: python -m pip install --upgrade pip # pip==20.2.4
            displayName: 'Update pip'

          - script: |
              sudo apt-get install -y pandoc
              pip install pypandoc
            displayName: 'Install pandoc'

          - bash: pip install pandas
            condition: eq(variables['pandas.version'], 'latest')
            displayName: 'Install pandas latest'

          - bash: pip install pandas==$(pandas.version)
            condition: ne(variables['pandas.version'], 'latest')
            displayName: 'Install pandas - $(pandas.version)'

          - script: |
              pip install --requirement requirements.txt
              echo "about to run pip install $(GE_pytest_pip_opts)"
              pip install $(GE_pytest_pip_opts)
              pip install .
            displayName: 'Install dependencies'

          - script: |
              pip install pytest pytest-cov pytest-azurepipelines
              python scripts/determine_tests_to_run.py --depth 3 | xargs pytest $(GE_pytest_opts) --napoleon-docstrings --junitxml=junit/test-results.xml --cov=. --cov-report=xml --cov-report=html --ignore=tests/cli --ignore=tests/integration/usage_statistics
            displayName: 'pytest'

          - task: PublishTestResults@2
            condition: succeededOrFailed()
            inputs:
              testResultsFiles: '**/test-*.xml'
              testRunTitle: 'Publish test results for Python $(python.version)'

          - task: PublishCodeCoverageResults@1
            inputs:
              codeCoverageTool: Cobertura
              summaryFileLocation: '$(System.DefaultWorkingDirectory)/**/coverage.xml'
              reportDirectory: '$(System.DefaultWorkingDirectory)/**/htmlcov'
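
Both `pytest` steps in this stage pipe the script's output into `xargs pytest ...`. One property of that pattern is worth keeping in mind: GNU `xargs` runs the command once even when its input is empty, unless `-r` (`--no-run-if-empty`) is given, so an empty selection would invoke `pytest` with no file arguments and fall back to its default collection. Whether `determine_tests_to_run.py` can actually print nothing is not visible in this file. The sketch below shows one way to make the empty case explicit; `build_pytest_command` is a hypothetical helper, not something this PR contains.

```python
import shlex
from typing import List, Optional


def build_pytest_command(selector_output: str, extra_opts: str = "") -> Optional[List[str]]:
    """Turn the selector's whitespace-separated output into a pytest argv.

    Returns None when no test files were selected, so the caller can skip
    the run instead of letting pytest collect the whole suite."""
    test_files = selector_output.split()
    if not test_files:
        return None
    # Same argument order as `xargs pytest <opts>`: options first, files last.
    return ["pytest", *shlex.split(extra_opts), *test_files]
```

For example, `build_pytest_command("tests/a.py\ntests/b.py\n", "--no-spark")` yields `["pytest", "--no-spark", "tests/a.py", "tests/b.py"]`, while an empty string yields `None`.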

  - stage: db_integration
    pool:
      vmImage: 'ubuntu-latest'

    dependsOn: [scope_check, lint]

    jobs:
      - job: mysql
        condition: or(eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true), eq(variables.isMain, true))

        services:
          mysql: mysql

        variables:
          python.version: '3.8'

        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '$(python.version)'
            displayName: 'Use Python $(python.version)'

          - bash: python -m pip install --upgrade pip # pip==20.2.4
            displayName: 'Update pip'

          - script: |
              printf 'Waiting for MySQL database to accept connections'
              until mysql --host=localhost --protocol=TCP --port=3306 --user=root --password='' --execute "SHOW DATABASES"; do
                printf '.'
                sleep 1;
              done;
            displayName: Wait for database to initialise

          - script: |
              echo "SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));" > mysql_setup_script.sql
              mysql --host=localhost --protocol=TCP --port=3306 --user=root --password='' --reconnect < mysql_setup_script.sql
            displayName: 'Configure mysql'

          - script: |
              pip install --requirement requirements-dev-base.txt --requirement requirements-dev-sqlalchemy.txt --constraint constraints-dev.txt
              pip install --requirement requirements.txt
              pip install .
            displayName: 'Install dependencies'

          - script: |
              pip install --requirement requirements.txt
              pip install pytest pytest-cov pytest-azurepipelines
              python scripts/determine_tests_to_run.py --depth 3 | xargs pytest --mysql --no-postgresql --no-spark --napoleon-docstrings --junitxml=junit/test-results.xml --cov=. --cov-report=xml --cov-report=html --ignore=tests/cli --ignore=tests/integration/usage_statistics
            displayName: 'pytest'

      - job: mssql
        condition: or(eq(stageDependencies.scope_check.changes.outputs['CheckChanges.GEChanged'], true), eq(variables.isMain, true))

        services:
          mssql: mssql

        variables:
          python.version: '3.8'

        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '$(python.version)'
            displayName: 'Use Python $(python.version)'

          - bash: python -m pip install --upgrade pip # pip==20.2.4
            displayName: 'Update pip'

          - script: |
              sqlcmd -U sa -P "ReallyStrongPwd1234%^&*" -Q "CREATE DATABASE test_ci;" -o create_db_output.txt

          - script: |
              pip install --requirement requirements-dev-base.txt --requirement requirements-dev-sqlalchemy.txt --constraint constraints-dev.txt
              pip install --requirement requirements.txt
              pip install .
            displayName: 'Install dependencies'

          - script: |
              pip install pytest pytest-cov pytest-azurepipelines
              python scripts/determine_tests_to_run.py --depth 3 | xargs pytest --mssql --no-postgresql --no-spark --napoleon-docstrings --junitxml=junit/test-results.xml --cov=. --cov-report=xml --cov-report=html --ignore=tests/cli --ignore=tests/integration/usage_statistics
            displayName: 'pytest'
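
The MySQL job's wait step retries once per second with no upper bound, so a service container that never comes up would stall the job until the pipeline's own timeout. A bounded variant of the same idea is sketched below in Python. It only checks that the TCP port accepts connections, which is a weaker readiness signal than the `SHOW DATABASES` query used above, and `wait_for_port` is a hypothetical helper rather than part of this PR.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True on the first successful connection and False once
    `timeout` seconds have passed without one."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The context manager closes the probe connection immediately.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller could run `wait_for_port("localhost", 3306)` and fail the step explicitly when it returns `False`.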