Top 10 Python Functions to Automate the Steps in Data Science

The dataset can be downloaded here: https://www.kaggle.com/vetrirah/customer

Steps for Applied Machine Learning (ML) in Hackathons:

  1. Understand the Problem Statement & Import Packages and Datasets.

  2. Perform EDA (Exploratory Data Analysis) - Understanding the Datasets:

    • Explore Train and Test Data and get to know what each Column / Feature denotes.
    • Check for Imbalance of Target Column in Datasets.
    • Visualize Count Plots & Unique Values to infer from Datasets.

  3. Remove Duplicate Rows from Train Data if present.

  4. Fill/Impute Missing Values: Continuous - Mean/Median/Any Specific Value; Categorical - Others/ForwardFill/BackFill.

  5. Feature Engineering

    • Feature Selection - Selection of Most Important Existing Features.
    • Feature Creation - Creation of New Features from the Existing Features.

  6. Split Train Data into Train and Validation Data with Predictors (Independent) & Target (Dependent).

  7. Data Encoding - Label Encoding, OneHot Encoding; and Data Scaling - MinMaxScaler, StandardScaler, RobustScaler.

  8. Create a Baseline ML Model for the Multi-Class Classification Problem.

  9. Improve the ML Model, Fine-Tune with the Model Evaluation Metric - "Accuracy" and Predict the Target "Outcome".

  10. Result Submission, Check Leaderboard & Improve the "Accuracy" Score.
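
As a rough illustration of how steps 3 and 4 above can be wrapped into one reusable function, here is a minimal sketch using pandas. The function name `drop_duplicates_and_impute`, the toy DataFrame and its column names are assumptions made up for this example, not taken from the dataset; the median and the "Others" label are just one of the imputation choices listed in step 4.

```python
import pandas as pd


def drop_duplicates_and_impute(df, target=None):
    """Hypothetical helper: drop duplicate rows (step 3), then fill
    missing values (step 4) with the median for numeric columns and
    the label "Others" for all remaining columns."""
    # Work on a de-duplicated copy so the caller's DataFrame is untouched.
    out = df.drop_duplicates().reset_index(drop=True)
    for col in out.columns:
        if col == target:
            continue  # never impute the target column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna("Others")
    return out


# Toy data (made up for illustration): one duplicate row, two missing values.
toy = pd.DataFrame({
    "Age": [25.0, 25.0, None, 40.0],
    "Profession": ["Artist", "Artist", None, "Doctor"],
    "Segment": ["A", "A", "B", "C"],
})
clean = drop_duplicates_and_impute(toy, target="Segment")
print(clean)
```

In a hackathon setting, duplicates would typically be dropped from the train data only, and the fill values would be computed on train and reused for test, so that nothing from the test set leaks into preprocessing.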

1. Understand the Problem Statement & Import Packages and Datasets: