This Streamlit application provides an interface for reviewing contracts against a set of guidelines. It allows users to evaluate contract clauses, provide reasons for compliance or non-compliance, and suggest improvements.
- Display contract text with selectable paragraphs
- Review guidelines one by one
- Evaluate guideline quality and suggest improvements
- Determine if a guideline is followed, not followed, or not applicable
- Provide reasons for guideline status with quality evaluation
- Add comments with quality evaluation
- Suggest updated clause text with improvement options
- Save review results to Google Sheets
- Navigate through guidelines with progress tracking
- Python 3.6+
- Streamlit
- pandas
- streamlit-gsheets
- Clone this repository:
      git clone https://github.com/SpotDraft/DataAnnotation.git
      cd DataAnnotation
- Install the required packages:
      pip install -r requirements.txt
- Set up your Google Sheets connection:
  - Create a Google Cloud project and enable the Google Sheets API
  - Create a service account and download the JSON key
  - Share your Google Sheet with the service account email
- Create a .streamlit/secrets.toml file with your Google Sheets credentials:
      spreadsheet = "YOUR_SPREADSHEET_ID"
      [connections.gsheets]
      private_key_id = "YOUR_PRIVATE_KEY_ID"
      private_key = "YOUR_PRIVATE_KEY"
      client_email = "YOUR_CLIENT_EMAIL"
      client_id = "YOUR_CLIENT_ID"
      type = "YOUR_TYPE"
      project_id = "YOUR_PROJECT_ID"
      auth_uri = "YOUR_AUTH_URI"
      token_uri = "YOUR_TOKEN_URI"
      auth_provider_x509_cert_url = "YOUR_AUTH_PROVIDER"
      client_x509_cert_url = "YOUR_CLIENT_X509"
      universe_domain = "YOUR_DOMAIN"
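Before launching the app, it can help to confirm that the secrets contain every key from the template above. The following is a minimal sketch and not part of the application: `missing_gsheets_keys` is a hypothetical helper, and it assumes the secrets have already been loaded into a plain nested dictionary shaped like the TOML template.

```python
# Hypothetical helper (not part of the app); key names are copied from
# the secrets.toml template in this README.
REQUIRED_GSHEETS_KEYS = (
    "private_key_id", "private_key", "client_email", "client_id", "type",
    "project_id", "auth_uri", "token_uri", "auth_provider_x509_cert_url",
    "client_x509_cert_url", "universe_domain",
)


def missing_gsheets_keys(secrets):
    """Return the expected keys that are absent or empty in a secrets mapping."""
    missing = []
    # Top-level spreadsheet ID, as placed in the template.
    if not secrets.get("spreadsheet"):
        missing.append("spreadsheet")
    # Service-account fields under [connections.gsheets].
    gsheets = secrets.get("connections", {}).get("gsheets", {})
    for key in REQUIRED_GSHEETS_KEYS:
        if not gsheets.get(key):
            missing.append("connections.gsheets." + key)
    return missing
```

An empty list means every expected key is present; the check does not verify that the credentials themselves are valid.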
Run the Streamlit app:
streamlit run app.py
Navigate to the provided URL to access the Contract Review Interface.
The application expects the following worksheets in your Google Sheet:
- "Data": Contains the guidelines and contract text
- "ReviewResults": Stores the review results
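As a rough illustration of how the two worksheets might be read and written, the sketch below assumes a connection object that exposes `read(worksheet=...)` and `update(worksheet=..., data=...)`, as the `GSheetsConnection` class from streamlit-gsheets does. The function names are hypothetical, and whether app.py appends rows this way is an assumption.

```python
import pandas as pd

DATA_WORKSHEET = "Data"
RESULTS_WORKSHEET = "ReviewResults"


def load_guidelines(conn):
    """Read the input worksheet (guidelines and contract text) into a DataFrame."""
    return conn.read(worksheet=DATA_WORKSHEET)


def append_review_result(conn, result_row):
    """Append one review (a dict of column -> value) and write the sheet back."""
    existing = conn.read(worksheet=RESULTS_WORKSHEET)
    updated = pd.concat([existing, pd.DataFrame([result_row])], ignore_index=True)
    # Assumption: the whole worksheet is rewritten with the extended table.
    conn.update(worksheet=RESULTS_WORKSHEET, data=updated)
    return updated
```

In a Streamlit app the connection would typically be created with `st.connection("gsheets", type=GSheetsConnection)`, after `from streamlit_gsheets import GSheetsConnection`.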
The "Data" worksheet should have the following columns:
- id: Unique identifier for each guideline
- guideline: The text of the guideline
- contract: The full text of the contract (this will be the same for all rows)
- reason: ...
- status: ...
- comment: ...
- updated_clause_text: ...
Example:
id | guideline | contract | reason | status | comment | updated_clause_text |
---|---|---|---|---|---|---|
1 | Ensure all parties are clearly identified | Full contract text here... | counterparty is missing | NOT_FOLLOWED | Please add counterparty | Counterparty is.. |
... | ... | ... |
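A quick sanity check of the "Data" worksheet can catch layout problems before a review session starts. The sketch below is illustrative only: `check_data_worksheet` is a hypothetical helper, and it assumes the worksheet has already been loaded into a pandas DataFrame with the column names listed above.

```python
import pandas as pd

# Column names taken from the "Data" worksheet description in this README.
EXPECTED_DATA_COLUMNS = [
    "id", "guideline", "contract", "reason",
    "status", "comment", "updated_clause_text",
]


def check_data_worksheet(df):
    """Return a list of human-readable problems found in the input DataFrame."""
    problems = []
    missing = [c for c in EXPECTED_DATA_COLUMNS if c not in df.columns]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    # Each guideline should have its own identifier.
    if "id" in df.columns and df["id"].duplicated().any():
        problems.append("duplicate id values")
    # The contract text is expected to be the same on every row.
    if "contract" in df.columns and df["contract"].nunique(dropna=False) > 1:
        problems.append("contract text differs between rows")
    return problems
```

An empty list means the DataFrame matches the expected layout.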
The "ReviewResults" worksheet will store the review results with the following columns:
- id: Unique identifier for the guideline (matching the input data)
- guideline: The text of the guideline
- guideline_quality: Quality assessment of the guideline (Pending, Excellent, Good, Better, Bad)
- guideline_improvement: Suggested improvement area for the guideline
- guideline_improvement_other: Custom improvement suggestion if "Other" was selected
- status: Whether the guideline is FOLLOWED, NOT_FOLLOWED, or NOT_APPLICABLE
- reason: Explanation for the status
- reason_quality: Quality assessment of the reason (Pending, Excellent, Good, Better, Bad)
- reason_improvement_other: Custom improvement suggestion for the reason if "Other" was selected
- comment: Additional comments
- comment_quality: Quality assessment of the comment (Pending, Excellent, Good, Better, Bad)
- comment_improvement_other: Custom improvement suggestion for the comment if "Other" was selected
- selected_sources: Comma-separated list of paragraph numbers from the contract text that are relevant to this guideline
- update_clause_text: Suggested updated text for the relevant clause
- update_clause_improvement: Assessment of the updated clause text (NA, Changes too extensive, Does not meet guideline, Content inaccuracies, Other)
- update_clause_improvement_other: Custom improvement suggestion for the updated clause if "Other" was selected
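To make the row format concrete, here is a minimal sketch of assembling part of one "ReviewResults" row in plain Python. `build_review_row` is a hypothetical helper that covers only a subset of the columns. Sorting the paragraph numbers and joining them with a bare comma are assumptions, since the exact separator used by the app is not stated here.

```python
# Allowed values as listed in the column descriptions of this README.
VALID_STATUSES = ("FOLLOWED", "NOT_FOLLOWED", "NOT_APPLICABLE")
VALID_QUALITIES = ("Pending", "Excellent", "Good", "Better", "Bad")


def build_review_row(guideline_id, guideline, status, reason,
                     selected_paragraphs, reason_quality="Pending"):
    """Build a partial ReviewResults row as a dict, validating enumerated fields."""
    if status not in VALID_STATUSES:
        raise ValueError("unknown status: %r" % (status,))
    if reason_quality not in VALID_QUALITIES:
        raise ValueError("unknown quality: %r" % (reason_quality,))
    return {
        "id": guideline_id,
        "guideline": guideline,
        "status": status,
        "reason": reason,
        "reason_quality": reason_quality,
        # Assumption: paragraph numbers are sorted and joined with ",".
        "selected_sources": ",".join(str(n) for n in sorted(selected_paragraphs)),
    }
```

The remaining columns (quality and improvement fields for the guideline, comment, and updated clause) would be added to the dictionary in the same way.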
Contributions are welcome! Please feel free to submit a Pull Request.