Machine Learning in Practice: Detection and Classification (AICLS)

This course provides a practical framework for building detection and classification systems using machine learning. You will work through the model lifecycle: data preparation and feature engineering, algorithm selection, evaluation metrics (precision/recall, ROC-AUC, F1), and model interpretation.
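As a small illustration of the evaluation metrics named above, the following sketch computes precision, recall, and F1 from raw labels using only the Python standard library. The function name `classification_metrics` and the label lists are hypothetical and chosen for illustration; they are not taken from the course materials.

```python
def classification_metrics(y_true, y_pred):
    """Return precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced detection data, these per-class figures are usually more informative than plain accuracy, which is why the outline pairs them with PR curves.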

The course focuses on avoiding common mistakes such as data leakage and flawed validation, emphasizes reproducibility and safe data handling, and covers deployment and operations: Docker/REST, monitoring and drift detection, retraining, sample notebooks, pipelines, and SOC/MLOps integration.
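One common form of data leakage is computing preprocessing statistics on the full dataset before splitting it. The sketch below shows one way to avoid this, assuming a single numeric feature ordered by time: the scaling parameters are fitted on the training portion only and then reused for the test portion. The helper names `fit_scaler` and `transform` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn mean and standard deviation from the training data only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for a constant feature
    return mu, sigma


def transform(values, mu, sigma):
    """Standardize values with previously fitted parameters."""
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values, ordered oldest to newest.
feature = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]

# Time-based split: the earliest records train, the latest records test.
split = 4
train, test = feature[:split], feature[split:]

# Fit on train only, so no information from the test period leaks in.
mu, sigma = fit_scaler(train)
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)
```

In scikit-learn, wrapping the scaler and the model in a single pipeline achieves the same separation automatically during cross-validation.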

Customized Training (date, location, content, duration)

The course:

  • Introduction & ML basics
    1. Terminology: supervised/unsupervised, classification vs regression, anomaly detection
    2. ML vs DL: when tree-based models suffice and when CNN/LSTM/Transformers are needed
    3. Data leakage, train/test split, cross-validation — common mistakes and prevention
  • Detection domains (security use-cases)
    1. Phishing & spam: text/URL features, domain reputation
    2. DGA/malicious domains: length, entropy, n-grams, WHOIS/TLS attributes
    3. Log anomalies: outlier detection, DBSCAN, Isolation Forest
    4. Data sources: Alexa/Tranco, PhishTank, internal logs, synthetic data
  • Feature engineering & model selection
    1. Lexical/statistical features for domains/URLs, TF-IDF, hashing trick, embeddings
    2. XGBoost/LightGBM, SVM, RandomForest vs simple CNN/LSTM
    3. scikit-learn pipelines, preventing leakage, reproducibility
    4. Hyperparameter tuning: GridSearchCV vs Optuna
  • Metrics and validation
    1. Confusion matrix, ROC vs PR curve (imbalanced data)
    2. F1, balanced accuracy, MCC
    3. Model interpretation: SHAP/LIME
    4. K-fold CV, stratification, time-based split
  • Deployment and operations
    1. Model export (pickle/joblib vs ONNX)
    2. Docker + REST API (FastAPI/Flask), CI/CD pipeline
    3. Quality monitoring, prediction logging, drift detection, retraining (Airflow/cron), versioning (MLflow/DVC)
  • Hands-on workshop
    1. From CSV to classifier: data prep, feature engineering, model training
    2. Metric visualization (ROC/PR), SHAP interpretation
    3. Mini-deploy (local REST API) + manager-facing report
  • Bonus / extensions
    1. Active learning, semi-supervised approaches
    2. Feature store & model registry (Feast, MLflow)
    3. SIEM/SOC integration, auto-enrichment of alerts
    4. MLOps basics: monitoring, governance, reproducibility
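The lexical domain features mentioned in the outline (length, entropy, n-grams) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the course's reference implementation: the function names are hypothetical, and taking the first dot-separated label is a simplification that ignores subdomains and public-suffix handling.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_ngrams(text, n=2):
    """Return all overlapping character n-grams of the given length."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def domain_features(domain):
    """Build a small lexical feature dictionary for one domain name."""
    # Simplification: use only the first label, e.g. "example" in "example.com".
    label = domain.lower().split(".")[0]
    digits = sum(ch.isdigit() for ch in label)
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / len(label) if label else 0.0,
        "distinct_bigrams": len(set(char_ngrams(label, 2))),
    }


# Hypothetical inputs: a dictionary-word domain and a random-looking one.
for name in ["example.com", "x7f9q2kd81zp.net"]:
    print(name, domain_features(name))
```

Features like these would typically be fed into a tree-based classifier; random-looking, algorithmically generated names tend to show higher entropy and a higher digit ratio than human-chosen ones.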
Assumed knowledge:
Basic Python skills and a basic understanding of machine learning.
Schedule:
2 days (9:00 AM - 5:00 PM)
Language: