
Home
Prerequisites: 00.1-ST - 00.2-ST
Labs: 01-AIOV - YOU ARE HERE - 01.2-AILB - 02-AIOV - 02.1-AILB - 02.2-AILB - 02.3-AIOV - 03-AIOV - 03.1-AILB - 03.2-AILB - 03.3-AIOV - 04-AIOV - 04.1-AILB - 04.2-AIOV - 05-AIOV - 05.1-AILB - 05.2-AIOV - 05.3-AILB - 06-AIOV - 06.1-AILB - 06.2-AILB - 06.3-AILB - 06.4-AILB - 06.5-AILB - 06.6-AILB - 06.7-AILB - 07-AIOV - Heretics Methodology

01.1-AILB - Deep Dive

Exploiting AI - Becoming an AI Hacker

📒 AI Deep Dive

This overview is a deep dive into the inner workings of AI, from creating a dataset to having a trained and tuned AI model. This lab takes a deeper look at how and why AI works and how it is built from the ground up.

Overview

The following outlines how an AI model is, more or less, "created": an AI goes through many phases before becoming a fully interactive LLM.

Preprocessing

Preprocessing is foundational in AI model development, involving tasks like cleaning, normalization, and feature extraction to transform raw data into a suitable format for algorithms. For instance, in text datasets, this includes removing stop words, handling special characters, correcting spelling errors, and converting text to lowercase. Numeric data may undergo scaling and outlier removal. Feature extraction identifies and selects relevant attributes from the data, ensuring they are informative for the specific AI task at hand.
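Below is a minimal sketch of text preprocessing in Python. The tiny stop-word set and the `preprocess` helper are illustrative assumptions, not part of any particular framework; real pipelines typically pull stop-word lists from a library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stop-word set; production pipelines use a
# library-provided list (e.g. NLTK or spaCy).
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to"}

def preprocess(text: str) -> list[str]:
    """Clean raw text: lowercase, strip special characters, drop stop words."""
    text = text.lower()                       # normalize case
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # remove punctuation / special characters
    words = text.split()
    return [w for w in words if w not in STOP_WORDS]

print(preprocess("The Quick, Brown Fox jumps over the lazy dog!!"))
# -> ['quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog']
```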

Tokenization

Tokenization is a required step in natural language processing (NLP): it breaks text into tokens such as words, subwords, or characters. Text analysis tasks like sentiment analysis, named entity recognition, and machine translation all depend on it. Tools like NLTK, spaCy, and Hugging Face Transformers provide tokenization methods suited to different languages and tasks.
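As one example, a Hugging Face Transformers tokenizer can be used roughly as sketched below. The choice of `bert-base-uncased` and the sample sentence are assumptions for illustration; running this requires the `transformers` package and downloads the tokenizer files on first use.

```python
from transformers import AutoTokenizer

# Load a pretrained subword tokenizer (downloads on first use).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Tokenization splits text into subword units."
tokens = tokenizer.tokenize(text)               # subword tokens, e.g. pieces prefixed with "##"
ids = tokenizer.convert_tokens_to_ids(tokens)   # integer IDs the model actually consumes

print(tokens)
print(ids)
```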

Text Representation

Text representation converts tokens into machine-readable formats like vectors or matrices. Techniques such as Word2Vec, GloVe, and FastText encode semantic meaning into vector spaces, enabling algorithms to understand relationships between words. Word embeddings capture semantic relationships, such as "king" - "man" + "woman" ≈ "queen". These representations are essential for tasks like document classification, information retrieval, and semantic similarity calculations.
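A minimal Word2Vec sketch using the `gensim` library is shown below. The toy corpus is an assumption for illustration and is far too small to learn real semantics; analogies like "king" - "man" + "woman" ≈ "queen" only emerge with large pretrained embeddings (e.g. GloVe or fastText vectors).

```python
from gensim.models import Word2Vec

# Toy corpus: each document is a list of tokens (illustrative only).
corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["a", "man", "and", "a", "woman", "walk"],
]

# Train a small Word2Vec model: each word becomes a 50-dimensional vector.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, seed=1)

print(model.wv["king"][:5])   # first few components of the "king" embedding
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```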

Model Architecture

Model architecture dictates how data flows through a machine learning model. Feedforward neural networks (FNNs) process data in a straightforward manner from input to output layers. Convolutional Neural Networks (CNNs) excel in analyzing grid-like data such as images through convolutional and pooling layers. Recurrent Neural Networks (RNNs) process sequential data, making them suitable for tasks like speech recognition and time series prediction. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks address the vanishing gradient problem in RNNs, enabling longer-term dependencies. Transformers, with self-attention mechanisms, revolutionized NLP tasks by capturing global dependencies in sequences, essential for tasks like language translation and text generation.
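The simplest of these architectures, a feedforward network, can be sketched in a few lines. PyTorch is used here as one common framework (an assumption, since no specific framework is prescribed above); the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

# Minimal feedforward network (FNN): data flows straight from the
# input layer through a hidden layer to the output layer.
class FeedForward(nn.Module):
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),   # input -> hidden
            nn.ReLU(),                   # non-linearity
            nn.Linear(hidden, out_dim),  # hidden -> output (e.g. class logits)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = FeedForward(in_dim=16, hidden=32, out_dim=2)
x = torch.randn(4, 16)        # batch of 4 samples, 16 features each
print(model(x).shape)         # torch.Size([4, 2])
```

CNNs, RNNs, and Transformers follow the same pattern of stacked layers, just with convolutional, recurrent, or self-attention blocks in place of the plain linear layers.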

Model Training

Model training adjusts parameters using optimization algorithms like Gradient Descent, Stochastic Gradient Descent (SGD), or Adam. These algorithms minimize a defined loss function such as Mean Squared Error (MSE) for regression tasks or Cross-Entropy Loss for classification tasks. Training involves splitting data into training and validation sets to prevent overfitting and ensure generalizability. Hyperparameter tuning optimizes model performance by adjusting parameters like learning rate and batch size, enhancing convergence and reducing training time.
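A compact training-loop sketch, again assuming PyTorch: synthetic data stands in for a real dataset, Adam minimizes cross-entropy loss, and a held-out validation split is scored each epoch to watch for overfitting. The specific sizes and learning rate are arbitrary.

```python
import torch
import torch.nn as nn

# Synthetic data in place of a real dataset, split into train/validation.
X = torch.randn(200, 16)
y = torch.randint(0, 2, (200,))
X_train, X_val = X[:160], X[160:]
y_train, y_val = y[:160], y[160:]

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()                             # classification loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # learning rate is a hyperparameter

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)   # forward pass + loss
    loss.backward()                           # backpropagate gradients
    optimizer.step()                          # update parameters

    with torch.no_grad():                     # validation loss monitors overfitting
        val_loss = loss_fn(model(X_val), y_val)
    print(f"epoch {epoch:02d}  train {loss.item():.3f}  val {val_loss.item():.3f}")
```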

Model Evaluation

Model evaluation assesses performance using metrics like accuracy, precision, recall, and F1-score, selected based on the specific task requirements. Accuracy measures the proportion of correct predictions, while precision and recall evaluate the model's ability to correctly identify relevant instances and capture all relevant instances, respectively. F1-score, the harmonic mean of precision and recall, provides a balanced measure of a model's performance across different thresholds.
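These metrics are straightforward to compute with scikit-learn (used here as an assumed tooling choice); the labels and predictions below are made up for illustration.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical ground-truth labels and model predictions for a binary task.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))    # fraction of correct predictions
print("precision:", precision_score(y_true, y_pred))   # of predicted positives, how many were right
print("recall   :", recall_score(y_true, y_pred))      # of actual positives, how many were found
print("f1       :", f1_score(y_true, y_pred))          # harmonic mean of precision and recall
```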

Model Refinement

Model refinement improves performance through techniques like hyperparameter tuning, regularization, and ensemble methods. Hyperparameter tuning optimizes parameters not directly learned during training, such as regularization strength or dropout rates, to enhance model performance on unseen data. Regularization techniques like L1 and L2 penalties prevent overfitting by constraining model complexity. Ensemble methods combine predictions from multiple diverse models to improve accuracy and robustness, through approaches such as bagging (Bootstrap Aggregating), boosting (e.g., AdaBoost), and stacking, leveraging each model's strengths while offsetting its weaknesses.
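One way to combine two of these ideas, tuning a hyperparameter that controls L2 regularization, is a cross-validated grid search. The sketch below assumes scikit-learn and a synthetic dataset; the candidate values of `C` (the inverse regularization strength) are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic dataset stands in for real data.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Grid search over the L2 regularization strength (C is its inverse);
# 5-fold cross-validation scores each candidate on held-out folds.
grid = GridSearchCV(
    LogisticRegression(penalty="l2", max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="f1",
)
grid.fit(X, y)

print("best C:", grid.best_params_["C"])
print("best cross-validated F1:", round(grid.best_score_, 3))
```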

NEXT: 01.2-AILB

PREVIOUS: 01-AIOV