This project scrapes Google News for news clips and articles about a specific person of interest. It uses Python with the BeautifulSoup and requests libraries to automate collecting news from the web.
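As a rough illustration of the idea, the sketch below builds a Google News search URL for a person and extracts headline/link pairs from a feed. It is not the code in this repository: it swaps in the Google News RSS search endpoint and the standard library's XML parser in place of BeautifulSoup and requests, so that it runs without installing anything. The function names, the `when:` time-window operator, the locale parameters, and the sample feed are all assumptions made for the example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_search_url(person, time_frame="48h"):
    """Build a Google News RSS search URL (hypothetical helper).

    Assumes the "when:<window>" search operator and US English locale
    parameters; adjust these for your own use.
    """
    params = {
        "q": f'"{person}" when:{time_frame}',
        "hl": "en-US",
        "gl": "US",
        "ceid": "US:en",
    }
    return "https://news.google.com/rss/search?" + urlencode(params)


def parse_items(rss_text):
    """Return a list of {"title", "link"} dicts from RSS text."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


# A tiny made-up feed so the parser can be tried offline.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Example headline</title><link>https://example.com/a</link></item>
</channel></rss>"""

if __name__ == "__main__":
    print(build_search_url("Chuck Schumer"))
    print(parse_items(SAMPLE_RSS))
```

To use it against live results, you would fetch the URL (for example with requests) and pass the response text to `parse_items`.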
To set up this project on your local machine, follow these steps:
- Clone the repository:
git clone https://github.com/ethanchwang/scrape_gnews_press_clips.git
- Navigate to the project directory:
cd scrape_gnews_press_clips
- Install required Python packages:
pip install -r requirements.txt
To start scraping Google News for news clips about the specified person, run the script with:
python clips_html.py --query chuck_schumer --area new_york --time_frame 48h --topics_to_query housing,education,labor --language spanish
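For readers curious how flags like the ones in the command above might be handled, here is a minimal `argparse` sketch. Only the flag names come from the example command; the defaults, help text, and the choice to split `--topics_to_query` on commas and turn underscores into spaces are assumptions, and the real script may behave differently.

```python
import argparse


def split_topics(value):
    """Turn "housing,education,labor" into a list of topic strings."""
    return [topic.strip() for topic in value.split(",") if topic.strip()]


def build_parser():
    # Flag names mirror the example command; everything else is assumed.
    parser = argparse.ArgumentParser(
        description="Collect Google News clips about a person (sketch)."
    )
    parser.add_argument("--query", required=True,
                        help="person to search for, e.g. chuck_schumer")
    parser.add_argument("--area", default=None,
                        help="geographic area, e.g. new_york")
    parser.add_argument("--time_frame", default="48h",
                        help="how far back to look, e.g. 48h")
    parser.add_argument("--topics_to_query", type=split_topics, default=[],
                        help="comma-separated topics, e.g. housing,education")
    parser.add_argument("--language", default="english",
                        help="language of the articles, e.g. spanish")
    return parser


def readable(name):
    """Assumed convention: underscores in arguments stand for spaces."""
    return name.replace("_", " ")


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(readable(args.query), args.time_frame, args.topics_to_query)
```

Splitting the topic list inside the parser keeps the rest of the program working with a plain Python list rather than a raw string.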
You can also specify additional command-line arguments for customization. Refer to the script’s help for more options:
python scrape_google_news.py --help
I created this tool for use in the communications office I was working in, so the output formatting is tailored to what that office asked for.