This project is a web scraper that extracts university program data from YÖK Atlas, the information system of the Turkish Council of Higher Education (YÖK).
- Extracts programs for the Mathematical (SAY), Verbal (SÖZ), and Equal-Weight (EA) score types
- Stores program information in JSON format
- Collects all data in a single file
- Collects the following information for each program:
  - Program ID
  - University name
  - Faculty/department name
  - Program name
  - Program description
  - Placement ranking
  - Placement score
  - Score type category
Requirements:
- Node.js
- npm or yarn
- Clone the project:
git clone https://github.com/yourusername/yokatlas_scraper.git
cd yokatlas_scraper
- Install dependencies:
npm install
To run the application, use the following command in your terminal:
node scrapper.js
The scraper prints the extraction status for each page as it runs. Results are saved to the yok_atlas_all_results.json file.
The extracted data is saved in the following JSON format:
[
  {
    "id": "203910699",
    "university": "KOÇ ÜNİVERSİTESİ",
    "department": "Tıp Fakültesi",
    "program": "Tıp",
    "description": "Program description",
    "rank": 24,
    "score": 555.4201,
    "category": "say"
  },
  ...
]
- id: Program ID number
- university: University name
- department: Faculty or department name
- program: Program name
- description: Program description
- rank: Placement ranking (null if not available)
- score: Placement score (null if not available)
- category: Score type (say, soz, ea)
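As an illustration of how the output file might be consumed, the sketch below loads yok_atlas_all_results.json and lists the best-ranked programs of one score type. The helper name topPrograms and its limit parameter are hypothetical and not part of the scraper. It relies only on the fields documented above and skips records whose rank is null.

```javascript
const fs = require("fs");

// Hypothetical helper (not part of the project): returns the best-ranked
// programs of one score type, skipping records whose rank is null.
function topPrograms(records, category, limit = 10) {
  return records
    .filter((r) => r.category === category && r.rank !== null)
    .sort((a, b) => a.rank - b.rank) // a lower rank number is a better placement
    .slice(0, limit);
}

// Read the scraper output only if it exists in the current directory.
const outputPath = "yok_atlas_all_results.json";
if (fs.existsSync(outputPath)) {
  const records = JSON.parse(fs.readFileSync(outputPath, "utf8"));
  for (const p of topPrograms(records, "say", 5)) {
    console.log(`${p.rank}\t${p.university}\t${p.program}`);
  }
}
```

Filtering out null ranks before sorting keeps unfilled programs from ending up at the top of the list.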
- "Dolmadı" (Not Filled) values for placement ranking and score are processed as null
- The scraper checks for the "Eski Kılavuz Kodu" (Old Guide Code) condition to extract the correct IDs
- A random delay is applied between page requests for rate limiting
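Two of the notes above can be sketched in a few lines. This is not the scraper's actual code: the function names, the 1000 to 3000 ms delay range, and the number format are assumptions. In particular, the sketch assumes that a scraped value uses either a decimal comma or a decimal point and contains no thousands separators.

```javascript
// Hypothetical helper: converts a scraped rank or score cell to a number.
// Returns null for "Dolmadı" (Not Filled), empty cells, and unparseable text.
function parsePlacementValue(text) {
  const cleaned = String(text ?? "").trim();
  if (cleaned === "" || cleaned === "Dolmadı") return null;
  // Assumption: at most one decimal separator, no thousands separators.
  const value = Number(cleaned.replace(",", "."));
  return Number.isNaN(value) ? null : value;
}

// Picks a random whole number of milliseconds in [minMs, maxMs].
function randomDelayMs(minMs, maxMs) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

// Resolves after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical page loop: scrapePage is whatever function fetches one page.
async function scrapeAllPages(pageCount, scrapePage) {
  for (let page = 1; page <= pageCount; page++) {
    await scrapePage(page);
    // Wait between pages, but not after the last one.
    if (page < pageCount) await sleep(randomDelayMs(1000, 3000));
  }
}
```

Randomizing the pause, rather than using a fixed interval, spreads requests out less predictably while still capping the request rate.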