ku-sh-24/OLYMPIC-DATA-PROJECT

Olympic Data Analysis

Live app: https://ku-sh-24-olympic-data-project-olympic-analysis-v1zr6v.streamlit.app/

This project analyzes 120 years of Olympic history data, focusing on athletes and their results. The analysis was carried out in Jupyter Notebook and is presented through a Streamlit web application developed in PyCharm.

Libraries Used

  • Streamlit
  • Pandas

Basic Framework

The analysis is structured around the following main sections:

  1. Medal Tally
  2. Overall Analysis
  3. Country-wise Analysis
  4. Athlete-wise Analysis

Data Overview

  • Time Window: The data spans 1896 to 2016, covering 120 years of Olympic history.
  • Datasets: Two datasets are available:
    • Dataset containing information about athletes and their performances.
    • Dataset containing the full names of country abbreviations.
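A minimal sketch of how the two datasets might be combined, assuming they share a country-code column. The column names `NOC` and `region`, the toy rows, and the placeholder code `ZZZ` are illustrative assumptions, not taken from the project files.

```python
import pandas as pd

# Toy stand-ins for the two datasets; column names are assumed for illustration.
athletes = pd.DataFrame({
    "Name": ["Athlete A", "Athlete B", "Athlete C"],
    "NOC": ["USA", "IND", "ZZZ"],  # "ZZZ" is a made-up code with no match
})
noc_regions = pd.DataFrame({
    "NOC": ["USA", "IND"],
    "region": ["USA", "India"],
})

# A left join keeps every athlete row, even when the code has no full name.
merged = athletes.merge(noc_regions, on="NOC", how="left")

# Rows whose code did not resolve are candidates for manual research.
unmatched = merged[merged["region"].isna()]
```

A left join is used here so that unmatched codes surface as missing values rather than silently disappearing.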

Data Preprocessing

  • Handling Historical Changes: Some countries did not exist in the past, and some underwent name changes. Special attention was given to researching and addressing these cases.
  • One-Hot Encoding: Medals were encoded using one-hot encoding for analysis purposes.
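The one-hot encoding step could look roughly like the sketch below, which uses `pandas.get_dummies`. The column names `Name` and `Medal` and the sample rows are assumptions for illustration; the project's actual code may differ.

```python
import pandas as pd

# Toy athlete rows; column names ("Name", "Medal") are assumed for illustration.
athletes = pd.DataFrame({
    "Name": ["Athlete A", "Athlete B", "Athlete C", "Athlete D"],
    "Medal": ["Gold", None, "Silver", "Bronze"],
})

# One indicator column per medal type; rows without a medal get all zeros.
medal_flags = pd.get_dummies(athletes["Medal"]).astype(int)

# Fix the column order so it does not depend on alphabetical sorting.
medal_flags = medal_flags.reindex(columns=["Gold", "Silver", "Bronze"], fill_value=0)

# Attach the indicator columns next to the original data.
encoded = pd.concat([athletes, medal_flags], axis=1)
```

With medals stored as 0/1 columns, a medal tally reduces to summing those columns per group.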

Usage

To run the analysis:

  1. Clone this repository.
  2. Ensure you have the necessary libraries installed (streamlit, pandas, etc.).
  3. Run the Streamlit app in PyCharm or any other Python environment.

Major Issues and Solutions

The main difficulty was ranking the nations, because the data is organized per athlete. Every member of a medal-winning team has their own row, so team-sport medals were counted multiple times and inflated the medal tally. To address this, rows that were duplicates on the same team, NOC region, etc., were dropped.
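The effect of the duplicate rows, and one way to remove them, can be sketched as follows. The source only says that duplicates on "team, NOC region, etc." were dropped, so the exact key columns in `event_key` and the toy rows below are assumptions.

```python
import pandas as pd

# Three hockey players share one team gold; one sprinter wins an individual silver.
rows = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Sprinter 1"],
    "Team": ["India", "India", "India", "India"],
    "NOC": ["IND", "IND", "IND", "IND"],
    "Games": ["1980 Summer"] * 4,
    "Sport": ["Hockey", "Hockey", "Hockey", "Athletics"],
    "Event": ["Men's Hockey"] * 3 + ["Men's 100 metres"],
    "Medal": ["Gold", "Gold", "Gold", "Silver"],
    "Gold": [1, 1, 1, 0],
    "Silver": [0, 0, 0, 1],
    "Bronze": [0, 0, 0, 0],
})

medal_cols = ["Gold", "Silver", "Bronze"]

# Naive tally: sums per athlete row, so the team gold is counted three times.
naive = rows.groupby("NOC")[medal_cols].sum()

# Assumed key: columns that identify one medal in one event for one team.
event_key = ["Team", "NOC", "Games", "Sport", "Event", "Medal"]
deduped = rows.drop_duplicates(subset=event_key)

# Corrected tally: one row per medal-winning team per event.
tally = deduped.groupby("NOC")[medal_cols].sum()
```

The athlete name is deliberately left out of the key; including it would make every row unique and nothing would be dropped.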

Helper functions, written in PyCharm, compute each analysis, and Streamlit methods display the results. The app branches on the option selected with the radio buttons, so each choice shows the matching view. Throughout development, the output was checked against specific cases to confirm that it matched the figures published online.
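The branching on user choices might be structured like this sketch. The function name `filter_medals`, the `"Overall"` sentinel, and the column names `Year` and `region` are hypothetical; in the app, the two arguments would come from Streamlit widgets such as `st.radio` or `st.selectbox`.

```python
import pandas as pd

def filter_medals(df, year="Overall", region="Overall"):
    """Return the rows matching the selections; "Overall" means no filter."""
    mask = pd.Series(True, index=df.index)
    if year != "Overall":
        mask &= df["Year"] == year
    if region != "Overall":
        mask &= df["region"] == region
    return df[mask]

# Toy data with assumed column names.
medals = pd.DataFrame({
    "Year": [2016, 2016, 2012],
    "region": ["India", "USA", "India"],
    "Gold": [0, 1, 0],
})

everything = filter_medals(medals)
only_2016 = filter_medals(medals, year=2016)
india_2016 = filter_medals(medals, year=2016, region="India")
```

Building a single boolean mask avoids writing a separate branch for each combination of year and region.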

Additional Notes

  • The analysis aims to provide insights into Olympic data spanning over a century, covering various aspects such as medal distribution, historical trends, and individual athlete performances.
