juniorcl/README.md

👋 Hi, I'm Clébio Júnior

Data Scientist with over 4 years of experience applying machine learning techniques and data analysis to solve real-world business problems with measurable impact.

Throughout my career, I have worked on projects involving the extraction and processing of structured and unstructured data, predictive modeling for credit scoring and sales forecasting, anomaly detection, and customer behavior analysis, always with a focus on turning data into actionable insights.

I have hands-on experience with libraries such as scikit-learn, spaCy, pdfplumber, pytesseract, and pandas, as well as techniques like NLP, clustering, and supervised learning.

Get my resume

Resume in Portuguese · Resume in English

Connect with me

LinkedIn · Medium · Kaggle · Gmail

👨‍💻 About me

  • 🎓 Master's degree in Natural Sciences (UENF) and Bachelor's degree in Physics (IFF)
  • 📊 Experienced in predictive modeling, clustering, and NLP
  • 🛠️ Skilled with Python, scikit-learn, spaCy, pandas, pdfplumber, pytesseract
  • 🖥️ Hands-on experience with unstructured data extraction and interactive dashboards in SAS
  • ✍️ I share technical content on Medium

💼 Professional Experience

Data Scientist | Vert Analytics (Oct/2024 – Present, Remote)

  • Developed solutions for extracting unstructured data from PDFs and images using pdfplumber, Tesseract OCR, and regex.
  • Built interactive SAS dashboards for time series analysis and anomaly detection.
  • Applied NLP (spaCy) to analyze social media comments, identifying customer concerns and dissatisfaction patterns.

Data Scientist | Datarisk (Jan/2022 – Aug/2024, Remote)

  • Built credit scoring, sales forecasting, and customer segmentation models using machine learning and clustering.
  • Developed predictive models for customer behavior (default risk, plan upgrade likelihood, job instability).
  • Delivered insights supporting strategic decision-making and risk reduction.

Data Scientist | Be.X! (Mar/2021 – Jan/2022, Remote)

  • Processed structured and unstructured data using regex and data cleaning techniques.
  • Implemented outlier detection algorithms based on business rules for risk mitigation.
  • Created ML models for delivery delay prediction, improving logistics operations.

📊 GitHub Stats

GitHub Stats
Top Langs

Pinned

  1. data-science-toolkit Public

    A repository with a set of helper functions for data science projects.

    Jupyter Notebook

  2. webscraping-comment-tripadvisor Public

    A study project that aims to build a web scraper for collecting information about comments on the site.

    Python 2

  3. transaction-fraud-detection Public

    A data science project to predict whether a transaction is a fraud or not.

    Jupyter Notebook 190 99

  4. churn-prediction Public

    A data science project focused on predicting whether a customer will churn, that is, stop using the company's service.

    Jupyter Notebook 3 4

  5. olist-delivery-forecast Public

    Jupyter Notebook 1

  6. lmaoclost/Machine-Health Public

    A study project that predicts disease using machine learning, Node.js, and data science.

    TypeScript 5