Establish an intuition about where to focus effort and time when creating data science (DS) and machine learning (ML) pipelines, assuming a prepared and cleaned dataset is already available.
Note: the content of this educational talk is also used in presentations.
slide deck used in the presentations and video materials.
Note: these notebooks were developed with older versions of their packages (particularly scikit-learn) and have not been verified with current versions.
Demo Notebook Feature Engineering
Madelon Dataset paper
Design of experiments for the NIPS 2003 variable selection benchmark
Isabelle Guyon – July 2003
Boruta Feature Selection Heuristic
Feature Selection with the Boruta Package
Miron B. Kursa (University of Warsaw)
Witold R. Rudnicki (University of Warsaw)