DataReporter
The `DataReporter` object handles importing MoTeC log files into the Datastore for further analysis via Datamaster. Ideally, only a single machine/user uses `DataReporter` to update a central Datastore (e.g. one hosted on a network drive), with the remaining users connecting to the shared Datastore but not adding to it via `DataReporter`. This is not a strict requirement, however the
Before `DataReporter` can be used, the following must first occur:
1. Install MoTeC i2Pro.
2. Create a Google Drive folder for storing MoTeC Log Files.
3. Create an OAuth 2.0 Client Secret and ID for using the Google Drive API (see Google's instructions), using an account that has access to the Google Drive folder used for storing MoTeC Log Files.
4. Create the Master SQLite Database using `MasterDirectory.sql`. DB Browser for SQLite is a solid open source program for creating and working with SQLite databases. The path to this file is the `master_directory_path`.
5. Create a directory for storing the datasources' *.MAT files; this is the `datastore_path`. It is recommended (but not required) to place the master directory inside of the datastore folder.
6. Update `config.ini` to include:
   - `client_id`, `client_secret` from step 3
   - `master_directory_path` from step 4
   - `datastore_path` from step 5
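As a rough illustration, the four settings might look like the fragment below. Only the key names come from the list above; the flat layout (no section headers) and the angle-bracket placeholders are assumptions, so check the `config.ini` shipped with the project for the actual structure.

```ini
; Hypothetical layout: the text above does not say whether config.ini
; groups these keys under section headers, so none are shown here.
; Replace each <placeholder> with your own value.
client_id = <OAuth 2.0 client ID from step 3>
client_secret = <OAuth 2.0 client secret from step 3>
master_directory_path = <path to the SQLite file from step 4>
datastore_path = <path to the directory from step 5>
```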
Importing MoTeC Log Files can take a significant amount of time (~1000 files/hr), but once the initial import is done, checking for new files via `dr.RefreshDatastore` takes only minutes (~1000 files/min). Additionally, once imported, Datamaster can quickly process hundreds of datasources with minimal effort. Given the large amount of time required for the initial import, it is recommended that the Datastore and Master Directory be stored on a networked drive so multiple users can access the Datastore without personally running an import.
Additionally, by maintaining a shared version of the Datastore that is updated by a single user (or a limited number of users), most users can be spared having to set up `DataReporter`.
Unfortunately, accessing files over a network connection can be significantly slower than accessing files stored locally. While the time difference can be minimal over a fast connection, some users may want to keep a local copy of the Datastore and regularly check the shared version for updates. This can be done simply by changing the values of `master_directory_path` and `datastore_path` in `config.ini` to point to the local copy.
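Keeping the local copy current could be scripted along the following lines. This is only a sketch and not part of Datamaster: the function name `syncDatastore` is hypothetical, and it assumes the Datastore is a flat folder of *.MAT files. The Master Directory file would need to be copied as well, since it changes whenever new datasources are added.

```matlab
function copied = syncDatastore(sharedPath, localPath)
% Copy *.MAT files that exist in the shared Datastore but not locally.
% Hypothetical helper; assumes a flat folder of *.MAT files.
    if ~exist(localPath, 'dir')
        mkdir(localPath);
    end
    listing = dir(fullfile(sharedPath, '*.mat'));
    copied = {};
    for i = 1:numel(listing)
        target = fullfile(localPath, listing(i).name);
        if ~exist(target, 'file')
            % Only files missing from the local copy are transferred
            copyfile(fullfile(sharedPath, listing(i).name), target);
            copied{end+1} = listing(i).name; %#ok<AGROW>
        end
    end
end
```

Running it a second time right after a sync should copy nothing, which makes it cheap to call on a schedule.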
Once `DataReporter` has been set up, refreshing the Datastore is a simple matter of calling:
```matlab
dr = DataReporter;
dr.RefreshDatastore;
```
Then wait for the process to complete. Note that `DataReporter` blocks the current MATLAB session while it runs; by opening a second MATLAB session on the same (or a different) machine, Datamaster can be used to examine datasources as they are exported.
The first step of the export process is to poll the Google Drive API for a list of every file with an *.LD or *.LDX extension that the user (the account used to create the OAuth 2.0 Client Secret/ID) has access to. The API in turn returns the following information for each file:
Property Name | Description |
---|---|
`id` | The identifier Google uses internally for the file |
`name` | The filename of the file |
`md5Checksum` | The MD5 checksum of the file, used later for detecting modifications/duplicates |
`modifiedTime` | The last time the file was modified |
`webContentLink` | The URL that can be used to download the file from Google Drive |
Recall that each MoTeC Log File is really two files: one `foo.LDX` and one `foo.LD`. The next step in the export process is to match each *.LD file to its *.LDX counterpart. The problem is that some files have been duplicated and others are missing their counterpart.
Case | Description |
---|---|
Everything Matches | An *.LD File can be matched to an *.LDX file on name alone |
Duplicate File | Multiple *.LD/*.LDX files share the same MD5 hash (i.e. are copies of each other), but each can be paired to an *.LDX file based on name |
Missing File | An *.LD file has no matching *.LDX, or vice versa |
In the case of missing files, `DataReporter` will simply ignore the orphan file and move on. For duplicate files, Datamaster will pick the oldest *.LD file and the *.LDX file with a `modifiedTime` closest to the *.LD's `modifiedTime`. This procedure was designed with the intent of always picking the original/unmodified version.
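The matching rules above can be sketched as follows. This is an illustration rather than the actual DataReporter code: the function name `matchLogFiles`, the helper `parseTime`, and the output layout are hypothetical. It assumes `files` is a struct array carrying the `name`, `md5Checksum` and `modifiedTime` properties listed earlier, with `modifiedTime` as a UTC timestamp such as `2017-03-04T10:00:00.000Z`.

```matlab
function pairs = matchLogFiles(files)
% Pair each unique *.LD file with one *.LDX file (illustrative sketch).
    pairs = struct('ld', {}, 'ldx', {});
    [~, base, ext] = cellfun(@fileparts, {files.name}, 'UniformOutput', false);
    t = cellfun(@parseTime, {files.modifiedTime});
    ldIdx  = find(strcmpi(ext, '.ld'));
    ldxIdx = find(strcmpi(ext, '.ldx'));
    if isempty(ldIdx)
        return;
    end
    % *.LD files with the same MD5 checksum are copies of one another
    [~, ~, group] = unique({files(ldIdx).md5Checksum});
    group = group(:).';
    for g = 1:max(group)
        members = ldIdx(group == g);
        % Candidate *.LDX files share a base name with any copy in the group
        cand = ldxIdx(ismember(lower(base(ldxIdx)), lower(base(members))));
        if isempty(cand)
            continue;  % Missing File case: ignore the orphan
        end
        [~, k] = min(t(members));            % oldest *.LD in the group
        ld = members(k);
        [~, k] = min(abs(t(cand) - t(ld)));  % *.LDX closest in time to it
        pairs(end+1) = struct('ld', files(ld).name, ...
                              'ldx', files(cand(k)).name); %#ok<AGROW>
    end
end

function dn = parseTime(stamp)
% Convert a timestamp like 2017-03-04T10:00:00.000Z to a serial date number.
    v = sscanf(stamp, '%4d-%2d-%2dT%2d:%2d:%f');
    dn = datenum(v(1), v(2), v(3), v(4), v(5), v(6));
end
```

An *.LDX file with no *.LD counterpart is never selected as a candidate, so both kinds of orphan fall out of the result without special handling.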
Each *.LD/*.LDX file pair is then downloaded from Google Drive using its `webContentLink`. Once downloaded, the MoTeC i2Pro API is used to export the log file to a MATLAB *.MAT file. Interestingly, the i2Pro API claims to require a license to use; luckily, the features of the API that are needed by `DataReporter` can be used without a license. This may disappear in a future release of i2Pro but does work as of MoTeC i2Pro 1.1.2.475. Additionally, the *.LDX file is parsed directly using regular expressions to extract information stored in the log file details, such as the Venue and Driver.
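To give a feel for the regular-expression step, here is a minimal sketch. It assumes the details are stored in the *.LDX file as XML elements of the form `<String Id="Venue" Value="..."/>`; that layout, the function name `parseLdxDetails`, and the choice of a `containers.Map` for the result are assumptions for illustration, not taken from the DataReporter source.

```matlab
function details = parseLdxDetails(ldxText)
% Extract Id/Value pairs from the text of an *.LDX file.
% Assumes elements shaped like <String Id="Venue" Value="..."/>.
    pattern = '<String\s+Id="([^"]*)"\s+Value="([^"]*)"';
    tokens = regexp(ldxText, pattern, 'tokens');
    details = containers.Map('KeyType', 'char', 'ValueType', 'char');
    for i = 1:numel(tokens)
        % tokens{i}{1} is the Id, tokens{i}{2} is the Value
        details(tokens{i}{1}) = tokens{i}{2};
    end
end
```

A map is used rather than a struct because detail names may contain spaces, which are not valid in MATLAB field names. With the file contents read via `fileread`, a lookup is then `details('Venue')`.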
The exported *.MAT file is then reopened and the channel data is re-saved using single-precision floating point rather than MATLAB's default double precision. Doing this cuts the file size in half with minimal reduction in precision, as the ADL3 logs channel data using single-precision floating point in the first place.
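The re-save step might look something like the sketch below. The function names `resaveAsSingle` and `toSingle` are hypothetical, and the code assumes the exported file holds numeric arrays either directly or nested inside structs; it converts every double it finds, which may be broader than what DataReporter actually does.

```matlab
function resaveAsSingle(matPath)
% Reload a *.MAT file and save it back with doubles stored as singles.
    data = load(matPath);
    names = fieldnames(data);
    for i = 1:numel(names)
        data.(names{i}) = toSingle(data.(names{i}));
    end
    % Write each field of data back out as its own variable
    save(matPath, '-struct', 'data');
end

function x = toSingle(x)
% Convert doubles to single, recursing into structs; leave other types alone.
    if isa(x, 'double')
        x = single(x);
    elseif isstruct(x)
        f = fieldnames(x);
        for k = 1:numel(x)
            for j = 1:numel(f)
                x(k).(f{j}) = toSingle(x(k).(f{j}));
            end
        end
    end
end
```

A double occupies 8 bytes per element and a single 4, which is where the roughly halved file size comes from.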