This Azure Function uses the Pandas library together with Beautiful Soup to check whether any dataset has been updated on the Historic England listing website. The same check is intended for heritage datasets from Historic Scotland, although that code is still listed under To do.
The NCRONTAB expression for this repository is set to run daily at 7am:
"schedule": "0 0 7 * * *"
This is purely for testing. A less frequent schedule is recommended when the function is deployed to Azure, for example monthly (07:00 on the first day of each month):
"schedule": "0 0 7 1 1-12 *"
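Azure Functions timer triggers use six-field NCRONTAB expressions in the order second, minute, hour, day, month, day of week. As a rough illustration of how the two schedules above break down, the sketch below splits an expression into named fields. The helper name describe_ncrontab is hypothetical and is not part of this repository or of any Azure SDK; it only labels the fields and does not validate or evaluate them.

```python
# Field order used by Azure Functions NCRONTAB expressions.
FIELDS = ("second", "minute", "hour", "day", "month", "day_of_week")


def describe_ncrontab(expression: str) -> dict:
    """Label each of the six whitespace-separated NCRONTAB fields.

    Hypothetical helper for illustration only: it names the fields,
    it does not check that their values are valid.
    """
    parts = expression.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


# The daily test schedule and the suggested monthly schedule.
daily = describe_ncrontab("0 0 7 * * *")
monthly = describe_ncrontab("0 0 7 1 1-12 *")
```

Read this way, the daily schedule fixes only the time (hour 7, minute 0, second 0), while the monthly one additionally pins the day field to 1.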
- When the timer is triggered, it sends a requests.get() call to the website. On a successful response code, the request.text is passed to a Pandas dataframe and processed.
- A dataframe containing a reference to only the updated datasets is output to Blob storage as a CSV.
- The output CSV is uniquely identified by appending a DateTime to the filename:
HE_event_{Datetime}.csv
- For simplicity, an Azure Logic App then checks the Blob storage for updates and emails the results.
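
Two of the steps above, gating on the response code and naming the output file, can be sketched with the standard library alone. This is a minimal illustration and not the function's actual code: fetch_text, output_name and FakeResponse are hypothetical names, the URL is a placeholder, and the timestamp layout is an assumption because the format of {Datetime} is not specified here. The get argument stands in for requests.get so the sketch runs without network access.

```python
from datetime import datetime
from typing import Callable, Optional


def fetch_text(get: Callable, url: str) -> Optional[str]:
    """Return the page body on HTTP 200, otherwise None.

    `get` is any callable shaped like requests.get: it takes a URL and
    returns an object with `status_code` and `text` attributes.
    """
    response = get(url)
    if response.status_code != 200:
        return None
    return response.text


def output_name(now: datetime) -> str:
    """Build a blob name of the form HE_event_{Datetime}.csv.

    The %Y%m%d_%H%M%S layout is an assumption made for this sketch.
    """
    return f"HE_event_{now.strftime('%Y%m%d_%H%M%S')}.csv"


class FakeResponse:
    """Stand-in for a requests response, used only to exercise the sketch."""

    def __init__(self, status_code: int, text: str) -> None:
        self.status_code = status_code
        self.text = text


# Placeholder URL; the real listing address is not given in this README.
page = fetch_text(
    lambda url: FakeResponse(200, "<html>listing</html>"),
    "https://example.invalid/listing",
)
name = output_name(datetime(2024, 1, 2, 7, 0, 0))
```

In the deployed function, requests.get would be passed as the get argument and the current time would replace the fixed datetime. Returning None on a non-200 response lets the caller skip the Pandas processing and the Blob write for that run.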
To do:
- Update the NCRONTAB schedule
- Add Historic Scotland code
- Add CADW code