Python — data analysis with Polars (PYDATA1B)

Programming, Python

Do you work with data in Excel, Power Query, SQL or Pandas but hit speed limits? Polars is a modern Python library that can process millions of rows, often much faster than traditional tools. Learn how to use its performance and scalability to overcome your current bottlenecks.

In this workshop you'll learn to process large datasets with Polars in Jupyter Notebook, building on your SQL skills and your experience with table tools. You'll learn the Expressions API, optimize queries and build repeatable workflows for automation.

Location, current course term

Contact us

Customized training (date, location, content, duration)

The course:

  • Data analysis tools
    1. Problem formulation
    2. Overview of data analysis tools
    3. Why Polars — speed, efficiency, modern API
  • Jupyter Notebook
    1. Installation and startup
    2. Text and code in one document
    3. Running individual steps
    4. Documenting processing steps
  • DataFrame and basic operations
    1. Creating a DataFrame
    2. Data types in Polars
    3. Selecting columns and filtering rows
    4. Adding and transforming columns
  • Expressions API
    1. What expressions are and why they matter
    2. Column selection: pl.col(), pl.all(), pl.exclude()
    3. Conditions: pl.when().then().otherwise()
    4. Method chaining
  • Data sources
    1. Tabular formats (Excel, CSV)
    2. Database sources (SQL)
    3. Parquet format for big data
    4. Working with multiple files at once
  • Data processing
    1. Transforming tabular data
    2. Data type conversion
    3. Handling missing values
    4. Joining tables (join, concat)
  • Data aggregation
    1. Grouping and aggregation functions
    2. Multiple aggregations at once
    3. Pivot tables
  • Performance optimization
    1. Lazy evaluation — automatic query optimization
    2. When to use eager vs lazy mode
    3. Streaming for big data
    4. Visualizing the query plan
  • Error handling
    1. Reading Polars error messages
    2. Common errors: data types, missing columns
    3. Debugging and data validation
  • Outputs and presenting results
    1. Tables and charts (Matplotlib)
    2. Export to various formats
Assumed knowledge:
Basic Python (variables, loops, functions); experience with Excel, SQL, Power Query or Pandas is advantageous.
Schedule:
2 days (9:00 AM - 5:00 PM)
Course price:
392.00 € (474.32 € incl. 21% VAT)
Language: