Module II·Article IV·~2 min read

Machine Learning in Products: from Idea to Production

Data and Analytics for Business

ML Project Lifecycle

1. Problem Definition: the ML task must map directly onto the business objective. For example, “reduce customer churn” becomes “predict the probability that a customer churns within the next 30 days” (binary classification).

2. Data Collection and Preparation: usually the largest share of ML project time (80% is the commonly quoted rule of thumb). Includes data cleaning, handling missing values, and feature engineering (creating features from raw data).

3. Model Selection and Training: simple models (logistic regression, decision tree) → ensembles (Random Forest, XGBoost) → neural networks. Rule: start simple, and move to a more complex model only if it measurably beats the simple one.

4. Model Evaluation: the right metric depends on the task. Classification: Accuracy, Precision, Recall, F1. Regression: RMSE, MAE. Probabilistic classification (scoring or ranking by predicted probability): AUC-ROC.

5. Deployment (MLOps): a model in production is not the same as a model in a notebook. MLOps is the set of practices for deploying and maintaining ML models: versioning, monitoring for data drift, and retraining.

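6. Monitoring: models degrade over time. Data drift (a change in the distribution of the input data) and concept drift (a change in the relationship between the inputs and the target) both require ongoing monitoring and regular retraining.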
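
Steps 4 and 6 can be made concrete with a few lines of code. The sketch below is illustrative only: it computes Precision, Recall and F1 by hand for a binary churn label, and uses the Population Stability Index (PSI) as one possible data-drift check. PSI is a technique chosen for this example, not one named in the article; the function names, bucket edges and sample values are assumptions, and the often-quoted thresholds (about 0.1 for "watch", about 0.25 for "investigate") are conventions rather than hard rules.

```python
import math


def classification_metrics(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = churned, 0 = stayed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def psi(expected, actual, edges):
    """Population Stability Index for one numeric feature.

    expected: values seen at training time
    actual:   values seen in production
    edges:    ascending bucket boundaries (hypothetical, chosen by the analyst)
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            bucket = sum(v >= e for e in edges)  # index of the bucket v falls into
            counts[bucket] += 1
        # A small floor keeps the logarithm defined for empty buckets.
        return [max(c / len(values), 1e-6) for c in counts]

    e_shares, a_shares = shares(expected), shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)  # precision = recall = f1 = 2/3

# Made-up feature values: the live data has shifted upward.
train_values = [10, 20, 30, 40, 50, 60, 70, 80]
live_values = [50, 60, 70, 80, 90, 100, 110, 120]
drift_score = psi(train_values, live_values, edges=[25, 50, 75])
```

A PSI of zero means the two samples fall into the buckets in identical proportions; the larger the value, the stronger the case for investigating the feature and retraining.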

Feature Engineering: the Art of Creating Features

Feature quality often matters more than algorithm complexity. Examples of feature engineering: from transaction date → day of the week, hour of the day, days until the end of the month; from text → TF-IDF, word embeddings; from geolocation → distance to the nearest competitor.
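
The date and geolocation examples above can be sketched with the Python standard library alone. This is a minimal illustration, not a production pipeline: the function names are hypothetical, coordinates are assumed to be (latitude, longitude) pairs in degrees, and distance is approximated with the haversine formula on a spherical Earth. Text features (TF-IDF, embeddings) are left out because they normally come from a dedicated library.

```python
import calendar
import math
from datetime import datetime


def date_features(ts: datetime) -> dict:
    """Derive calendar features from a single transaction timestamp."""
    days_in_month = calendar.monthrange(ts.year, ts.month)[1]
    return {
        "day_of_week": ts.weekday(),  # 0 = Monday ... 6 = Sunday
        "hour": ts.hour,
        "days_to_month_end": days_in_month - ts.day,
    }


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius, spherical approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def distance_to_nearest_competitor(store, competitors):
    """Smallest distance in km from a store to any competitor location."""
    return min(haversine_km(*store, *c) for c in competitors)


features = date_features(datetime(2024, 2, 27, 14, 30))
# {'day_of_week': 1, 'hour': 14, 'days_to_month_end': 2}
```

Each function turns one raw column into numeric values that a model can consume directly, which is the whole point of feature engineering.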

When ML Is NOT Needed

ML is not always the best tool. Rules of thumb: (1) if the problem can be solved with explicit rules, use rules; (2) if there is little data (fewer than about 1,000 examples for most tasks), a model is unlikely to generalize reliably; (3) if the decision does not need to be automated, use analytics instead.

Practical Assignment

An e-commerce company wants to predict whether a customer will return next month. Data: two years of purchase history for 500,000 customers. (1) Formulate the ML task (type, target variable). (2) Suggest 10 features for the model. (3) Which metric would you use for evaluation, and why? (4) How would the business use the model's results?
