Machine Learning in Practice: Detection and Classification (AICLS)
This course provides a practical framework for building detection and classification systems using machine learning. You will cover the model lifecycle: data preparation and feature engineering, algorithm selection, evaluation metrics (precision/recall, ROC-AUC, F1), and model interpretation.
The course emphasizes avoiding common mistakes such as data leakage and flawed validation, and stresses reproducibility and safe data handling. It also covers deployment and operations (Docker/REST, monitoring and drift detection, retraining), sample notebooks and pipelines, and SOC/MLOps integration.
The course:
- Introduction & ML basics
  - Terminology: supervised/unsupervised, classification vs regression, anomaly detection
  - ML vs DL: when tree-based models suffice and when CNN/LSTM/Transformers are needed
  - Data leakage, train/test split, cross-validation: common mistakes and prevention
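To illustrate the leakage topic, here is a minimal sketch of one common mistake and its prevention: scaling statistics are learned from the training split only and then reused on the test split. The helper names (`train_test_split`, `fit_scaler`, `transform`) and the toy URL-length values are assumptions made for this example, not course material.

```python
import random
import statistics

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows with a fixed seed and split it into train and test lists."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_scaler(values):
    """Learn mean and standard deviation from the given (training) values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # avoid division by zero for constant features
    return mean, std

def transform(values, mean, std):
    """Apply previously learned statistics without re-estimating them."""
    return [(v - mean) / std for v in values]

# Hypothetical toy feature: URL lengths
url_lengths = [12.0, 15.0, 14.0, 90.0, 13.0, 16.0, 85.0, 11.0]

train, test = train_test_split(url_lengths)
mean, std = fit_scaler(train)               # statistics come from train only
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)    # test is transformed, never fitted
```

Calling `fit_scaler` on the full dataset before splitting would let test-set information influence the features, which is the kind of leakage this topic addresses.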
- Detection domains (security use-cases)
  - Phishing & spam: text/URL features, domain reputation
  - DGA/malicious domains: length, entropy, n-grams, WHOIS/TLS attributes
  - Log anomalies: outlier detection, DBSCAN, Isolation Forest
  - Data sources: Alexa/Tranco, PhishTank, internal logs, synthetic data
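As a rough illustration of the lexical features listed for DGA detection (length, entropy, n-grams), the sketch below computes them for a single domain. The function names, the `digit_ratio` feature, and the assumption that the input is a bare registered domain such as `example.com` (no subdomains) are choices made for this example only.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def char_ngrams(text, n=2):
    """All overlapping character n-grams of the text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def domain_features(domain):
    """Lexical features of the leftmost label (assumes input like 'example.com')."""
    label = domain.lower().split(".")[0]
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / len(label) if label else 0.0,
        "distinct_bigrams": len(set(char_ngrams(label, 2))),
    }

benign = domain_features("google.com")
suspicious = domain_features("xkq7vz2pfw.com")  # made-up, random-looking name
```

Random-looking names tend to show higher character entropy than dictionary-based ones, which is why entropy is often paired with length and n-gram statistics rather than used alone.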
- Feature engineering & model selection
  - Lexical/statistical features for domains/URLs, TF-IDF, hashing trick, embeddings
  - XGBoost/LightGBM, SVM, RandomForest vs simple CNN/LSTM
  - scikit-learn pipelines, preventing leakage, reproducibility
  - Hyperparameter tuning: GridSearchCV vs Optuna
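The hashing trick mentioned above can be sketched in a few lines: each token is hashed into one of a fixed number of buckets, so no vocabulary has to be stored. This is a simplified, unsigned variant written for illustration; the function names, the bucket count, and the sample text are assumptions. In practice a library implementation such as scikit-learn's `HashingVectorizer` would normally be used.

```python
import hashlib
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def hashed_features(text, n_buckets=16):
    """Map tokens to a fixed-length count vector via a stable hash."""
    vec = [0] * n_buckets
    for token in tokenize(text):
        # hashlib gives the same value across runs, unlike the built-in hash()
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_buckets
        vec[index] += 1  # colliding tokens simply share a bucket
    return vec

# Hypothetical phishing-style message
sample = "Verify your account at http://paypa1-login.example/verify"
vector = hashed_features(sample)
```

The vector length stays constant regardless of vocabulary size; the trade-off is that distinct tokens can collide in the same bucket.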
- Metrics and validation
  - Confusion matrix, ROC vs PR curve (imbalanced data)
  - F1, balanced accuracy, MCC
  - Model interpretation: SHAP/LIME
  - K-fold CV, stratification, time-based split
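To show why plain accuracy can mislead on imbalanced data, the sketch below derives precision, recall, F1, balanced accuracy, and MCC from the confusion-matrix counts. The function names and the toy labels are assumptions for illustration; undefined ratios (zero denominator) are returned as 0.0 here, which is one convention among several.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def safe_div(a, b):
    """Division that returns 0.0 when the denominator is zero."""
    return a / b if b else 0.0

def binary_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, len(y_true)),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }

# Toy imbalanced case: 2 positives, 18 negatives, model predicts "negative" for all
y_true = [1, 1] + [0] * 18
y_pred = [0] * 20
metrics = binary_metrics(y_true, y_pred)
```

In this toy case accuracy is 0.9 even though no positive is detected, while recall, F1, and MCC are 0.0 and balanced accuracy is 0.5.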
- Deployment and operations
  - Model export (pickle/joblib vs ONNX)
  - Docker + REST API (FastAPI/Flask), CI/CD pipeline
  - Quality monitoring, prediction logging, drift detection, retraining (Airflow/cron), versioning (MLflow/DVC)
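One simple way to approach drift detection is the Population Stability Index (PSI), which compares the binned distribution of a feature or score between a reference window and a current window. The sketch below is a swapped-in illustration, not necessarily the method used in the course; the function names, equal-width binning, the `eps` smoothing term, and the synthetic data are all assumptions.

```python
import math

def histogram_fractions(values, edges):
    """Fraction of values per bin; out-of-range values fall into the first or last bin."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            if v < edges[i + 1] or i == last:
                counts[i] += 1
                break
    return [c / len(values) for c in counts]

def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples, using equal-width bins fitted on the reference sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant reference
    edges = [lo + i * width for i in range(n_bins + 1)]
    ref = histogram_fractions(reference, edges)
    cur = histogram_fractions(current, edges)
    # eps keeps the logarithm finite when a bin is empty
    return sum((c - r) * math.log((c + eps) / (r + eps)) for r, c in zip(ref, cur))

# Synthetic scores: a reference window and a shifted "production" window
reference_scores = [i / 100 for i in range(100)]
shifted_scores = [x + 0.5 for x in reference_scores]

psi_same = population_stability_index(reference_scores, reference_scores)
psi_shifted = population_stability_index(reference_scores, shifted_scores)
```

Identical windows yield a PSI of zero and the value grows as the distributions diverge. A commonly quoted rule of thumb treats values above roughly 0.25 as a significant shift, but that threshold is a convention and should be validated against the data at hand before it triggers retraining.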
- Hands-on workshop
  - From CSV to classifier: data prep, feature engineering, model training
  - Metric visualization (ROC/PR), SHAP interpretation
  - Mini-deploy (local REST API) + manager-facing report
- Bonus / extensions
  - Active learning, semi-supervised approaches
  - Feature store & model registry (Feast, MLflow)
  - SIEM/SOC integration, auto-enrichment of alerts
  - MLOps basics: monitoring, governance, reproducibility
- Assumed knowledge: Basic Python skills and a basic understanding of machine learning.
- Schedule: 2 days (9:00 AM - 5:00 PM)
- Language: