This project analyzes 120 years of Olympic history data, focusing on athletes and their results. The analysis was developed in Jupyter Notebook and is presented through a Streamlit web application built in PyCharm.
- Streamlit
- Pandas
The analysis is structured around the following main sections:
- Medal Tally
- Overall Analysis
- Country-wise Analysis
- Athlete-wise Analysis
- Time Window: The data spans from 1896 to 2016, covering 120 years of Olympic history.
- Datasets: Two datasets are used:
  - A dataset containing information about athletes and their performances.
  - A dataset mapping country abbreviations to their full names.
- Handling Historical Changes: Some countries did not exist in the past, and some underwent name changes. Special attention was given to researching and addressing these cases.
- One-Hot Encoding: Medals were encoded using one-hot encoding for analysis purposes.
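The dataset and encoding steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`Name`, `NOC`, `Medal`, `region`) and the sample rows are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumptions for illustration.
athletes = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "NOC": ["USA", "USA", "GER", "GER"],
    "Medal": ["Gold", None, "Silver", "Bronze"],
})
regions = pd.DataFrame({"NOC": ["USA", "GER"], "region": ["USA", "Germany"]})

# Attach the full country name to each athlete row via the abbreviation.
merged = athletes.merge(regions, on="NOC", how="left")

# One-hot encode the Medal column; rows without a medal get 0 in every column.
medal_flags = pd.get_dummies(merged["Medal"], dtype=int)
encoded = pd.concat([merged, medal_flags], axis=1)

# With 0/1 columns, a per-region tally is a plain group-and-sum.
tally = encoded.groupby("region")[["Gold", "Silver", "Bronze"]].sum()
print(tally)
```

Encoding medals as 0/1 columns means counting reduces to summation, which keeps later filtering and grouping simple.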
To run the analysis:
- Clone this repository.
- Ensure you have the necessary libraries installed (`streamlit`, `pandas`, etc.).
- Run the Streamlit app in PyCharm or any other Python environment.
The main difficulty was ranking the nations, because the data is organized athlete-wise, with one row per athlete. Medals in team sports were therefore counted once per team member, inflating the medal tally. To address this, duplicates with the same team, NOC region, etc., were dropped.
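A minimal sketch of this de-duplication is shown below. The exact set of key columns used in the project is not stated, so the `key` list and all sample values here are assumptions.

```python
import pandas as pd

# Hypothetical rows: two members of one gold-winning team plus one
# individual bronze medal. All names and values are made up.
df = pd.DataFrame({
    "Name": ["P1", "P2", "P3"],
    "Team": ["Team X", "Team X", "Team X"],
    "NOC": ["XXX", "XXX", "XXX"],
    "Games": ["2000 Summer"] * 3,
    "Year": [2000] * 3,
    "Sport": ["Hockey", "Hockey", "Wrestling"],
    "Event": ["Men's Hockey", "Men's Hockey", "Men's Freestyle"],
    "Medal": ["Gold", "Gold", "Bronze"],
})

# Assumed key: rows that agree on all of these describe the same medal.
key = ["Team", "NOC", "Games", "Year", "Sport", "Event", "Medal"]

# Keep only medal-winning rows, then keep one row per distinct medal.
medals = df.dropna(subset=["Medal"]).drop_duplicates(subset=key)

naive_gold = int((df["Medal"] == "Gold").sum())      # counts every team member
dedup_gold = int((medals["Medal"] == "Gold").sum())  # counts the team medal once
print(naive_gold, dedup_gold)
```

One caveat of this approach: if a country legitimately wins two identical medals in the same event, a key like this would merge them, so spot-checking such cases against known tallies is worthwhile.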
Functions were written in PyCharm to compute each analysis and present it using Streamlit methods. Different conditions were implemented based on the choices made in the radio buttons to enhance user interaction. The output was continuously validated against specific cases to ensure the code accurately reflected the data available on the internet.
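The choice-dependent logic might look something like the sketch below. The function name `fetch_medal_tally`, the `"Overall"` sentinel, and the column names are hypothetical; in the app, `year` and `country` would come from Streamlit widgets such as `st.radio` or `st.selectbox`, and the returned frame would be displayed with something like `st.table`.

```python
import pandas as pd

def fetch_medal_tally(medals: pd.DataFrame, year, country) -> pd.DataFrame:
    """Return medal counts filtered by the user's year and country choices."""
    view = medals
    if year != "Overall":
        view = view[view["Year"] == int(year)]
    if country != "Overall":
        view = view[view["region"] == country]

    # One country across all years: break down by year. Otherwise rank regions.
    by_year = year == "Overall" and country != "Overall"
    group_key = "Year" if by_year else "region"

    tally = view.groupby(group_key)[["Gold", "Silver", "Bronze"]].sum()
    tally["Total"] = tally.sum(axis=1)
    if not by_year:
        tally = tally.sort_values(["Gold", "Silver", "Bronze"], ascending=False)
    return tally.reset_index()

# Hypothetical, already de-duplicated and one-hot encoded medal rows.
medals = pd.DataFrame({
    "Year": [2012, 2012, 2016, 2016],
    "region": ["A", "B", "A", "B"],
    "Gold": [1, 0, 0, 1],
    "Silver": [0, 1, 1, 0],
    "Bronze": [0, 0, 0, 1],
})

overall = fetch_medal_tally(medals, "Overall", "Overall")
by_year = fetch_medal_tally(medals, "Overall", "A")
print(overall)
print(by_year)
```

Keeping the filtering in a plain function that takes the widget values as arguments makes it possible to check specific cases without launching the app.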
- The analysis aims to provide insights into Olympic data spanning over a century, covering various aspects such as medal distribution, historical trends, and individual athlete performances.