SEAD Bugs Import

Preview markdown files in a separate window. Markdown is formatted exactly the same as on GitHub.

Overview

Prerequisites

To build and run the application you will need:

OpenJDK 1.8
Maven 3
Spring Boot 1.5.20.RELEASE
Spring Data JPA
PostgreSQL Driver
Jackcess - Java Library for MS Access
Hibernate
H2 in-memory database for running test suite

Installation

git clone https://github.com/humlab-sead/sead_bugs_import
cd sead_bugs_import
mvn -Dmaven.test.skip=true clean
mvn -Dmaven.test.skip=true package


docker run --rm -it sead_bugs_import:latest bash

Usage

See wiki How to perform a Bugs Import for a non-technical description of how to import Bugs data into the SEAD database.

java -jar jar_file --file=mbd-file [--importers=..] [--validate-schema]

% wget -d --user-agent="Mozilla/5.0 (Windows NT x.y; rv:10.0) Gecko/20100101 Firefox/10.0"  https://www.bugscep.com/downloads/bugsdata.zip
% unzip /path/to/file.zip -d temp_for_zip_extract
% java -jar target/bugs.import-0.1-SNAPSHOT.jar --file=./data/bugsdata_20190503.mdb --importers=Bibliography,Lab

Options

--file= - string Bugs MS Access database file to import. (required)
--importers= - string Comma separated list of named importers to run, where each item is a simple type name (no namespace) without the "Importer" suffix.
--validate-schema - Validate schema (create/update) but do not run.
--no-run - Do not run.

Steps to perform a bugs import

Verify configuration
Setup target database (e.g. sead_staging_bugs)
Download and unpack latest bugs database from http://www.bugscep.com, link to file
Run Java application to do an import of downloaded bugs database.

% java -jar target/bugs.import-0.1-SNAPSHOT.jar --validate-schema
% java -jar target/bugs.import-0.1-SNAPSHOT.jar --file="./data/bugsdata_yyyymmdd.mdb"

Check flagged events/errors in import lag (ROGER - WHERE EXACTLY?)
Apply manual updates to SEAD records which cannot be fixed through corrections in BugsCEP or by altering import routines/mappings (e.g. where SEAD functionality exheeds Bugs)

Verify configuration

The application uses PostgreSQL as target database in production, and an in-memory database (H2) when running tests. Before your run this application, it would be good to check properties defined in application.properties file.

spring.datasource.driverClassName=org.postgresql.Driver
spring.datasource.url=jdbc:postgresql://host:port/sead_staging
spring.datasource.username=username
spring.datasource.password=123456

spring.jpa.database-platform=org.hibernate.dialect.PostgreSQLDialect
spring.jpa.show-sql=false
spring.jpa.hibernate.ddl-auto=none
# spring.jpa.generate-ddl=true

Set xyz to update if target database schema should be updated to conform to the applications data model. Set to none if model is correct, or if changes are handled manually outside of code (which is better). (TODO: Add service that at startup, or when given a flag, generates update ddl similar to using update).

TODO: Move DB specifics to a Spring Boot profile instead (need source code changes @Profile(profile) attribute. (spring.profile.active)

Sample logging configuration:

logging.level.org.hibernate.event.internal.EntityCopyAllowedLoggedObserver=debug
logging.level.org.hibernate.type=trace

spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.event.merge.entity_copy_observer=log

Setup target database

The application can, for test purposes, be run on an empty database owned by sead_master:

create database sead_staging_bugs_empty owner to sead_master;

When run on an empty database the spring.jpa.hibernate.ddl-auto must be set to update or create. The application will create the schemas (bugs_import and public) upon startup which is useful if you want to do a schema diff between the SEAD schema to the schema expected by the application.

Normally, though, the import should be incremental by using a previously imported target database as the current target. This feature is not tested! The three SQL-scripts that reside in the ./sql folder can be used to setup a new target database based on an existing SEAD database (including data).

Script	Note
00_bugs_setup_staging_database.psql	Create a new "sead_staging_bug"s database.
01_bugs_import_schema.psql	Create the `bugs_import` schema.
02_setup_bugs_import_value_translations.psql	Populates `bugs_import` schema tables

The commands must be executed as a superuser using e.g. psql or pgAdmin. You might first need to terminate existing processes in the source and (new) target database.

#!/bin/bash

REPOSITORY="https://raw.githubusercontent.com/humlab-sead/sead_change_control/master"

source_db="template_bugs_import"
target_db="sead_staging_bugs"

dbhost=`cat ~/.default.sead.server`
dbuser=`cat  ~/.default.sead.username`

dbexec() {
    psql -bqw -h $dbhost -U $dbuser -c "$1" $target_db
}

kickout() {
    xsql="select pg_terminate_backend(pid) from pg_stat_activity where datname = '$1' and pid <> pg_backend_pid();"
    psql -bqw -h $dbhost -U $dbuser -c "$xsql" "postgres"
}

apply_crs() {
    project=$1
    items=$2
    for CR in ${items}; do
        echo "Applying ${project}/${CR}..."
        rm -f ${CR}.sql
        wget -q -nv ${REPOSITORY}/${project}/deploy/${CR}.sql --output-document=${CR}.sql
        psql --echo-errors --quiet -h $dbhost -U $dbuser --no-password --file="${CR}.sql" $target_db
        rm -f ${CR}.sql
    done
}

BugsCRs="\
    CS_BUGS_20190503_SETUP_SCHEMA\
    CS_BUGS_20190503_ADD_TRANSLATIONS"

GeneralCRs="\
    CS_DATINGLAB_20190503_ADD_UNKNOWN_LAB\
    CS_METHOD_20190503_ADD_BUGS_METHODS\
    CS_TAXA_20190503_ADD_SPECIES_ASSOC_TYPES\
    CS_TAXA_20190503_ATTRIBUTE_TYPE_LENGTH"

kickout $target_db
dropdb --echo --if-exists --host=$dbhost --username=$dbuser --no-password "$target_db"

kickout $source_db
createdb --echo --owner=sead_master --host=$dbhost --username=$dbuser --no-password --encoding="UTF8" -T "$source_db" "$target_db"

apply_crs  "bugs" "$BugsCRs"
apply_crs  "general" "$GeneralCRs"

echo "done!"

Download latest bugs database

The latest version of the from Bugs download page e.g. using wget (or a browser):

% wget http://www.bugscep.com/downloads/bugsdata.zip

Unpack the file into a folder that can be accessed by the bugs import application. The path is specified as an command line argument to the import application.

Authors

Development using MS Visual Code

Main flow

Significant classes

Import flow

Logging JPA Queries

You can add logging of (beautified) SQL queries to standard out via the following application.properties:

spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true

Running tests

The Java Test Runner extension enables running Java tests in vscode. Note that Java by Red Hat and Debugger for Java also must be installed and enabled in the current vscode workspace. You can run individual tests by using the CodeLens Run|Debug commands above the function in the code. You can also run individual tests by rightclicking nodes in the Test Explorer's tree.

Select Java Test Runner as output window, or click Running tests... on activity bar to see output from test ryns.

Tests are also automatically run from command line when e.g. mvn package or mvn test is executed.

Name		Name	Last commit message	Last commit date
Latest commit History 249 Commits
.mvn/wrapper		.mvn/wrapper
bugsdata		bugsdata
configuration		configuration
doc		doc
docker		docker
scripts		scripts
src		src
.dockerignore		.dockerignore
.gitignore		.gitignore
README.md		README.md
mindmup.md		mindmup.md
mvnw		mvnw
mvnw.cmd		mvnw.cmd
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SEAD Bugs Import

Overview

Prerequisites

Installation

Usage

Options

Steps to perform a bugs import

Verify configuration

Setup target database

Download latest bugs database

Authors

Development using MS Visual Code

Main flow

Significant classes

Import flow

Logging JPA Queries

Running tests

Credits

License

Issues

About

Releases

Packages

Contributors 6

Languages

humlab-sead/sead_bugs_import

Folders and files

Latest commit

History

Repository files navigation

SEAD Bugs Import

Overview

Prerequisites

Installation

Usage

Options

Steps to perform a bugs import

Verify configuration

Setup target database

Download latest bugs database

Authors

Development using MS Visual Code

Main flow

Significant classes

Import flow

Logging JPA Queries

Running tests

Credits

License

Issues

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 6

Languages

Packages