Module II · Article IV
Machine Learning in Products: from Idea to Production
Data and Analytics for Business
ML Project Lifecycle
1. Problem Definition: The ML task must correspond to the business objective. “Reduce customer churn” → “predict the probability of churn in the next 30 days” (classification).
2. Data Collection and Preparation: commonly cited as taking up to 80% of ML project time. Includes data cleaning, handling missing values, and feature engineering (creating features from raw data).
3. Model Selection and Training: simple models (logistic regression, decision tree) → ensembles (Random Forest, XGBoost) → neural networks. Rule: start with the simplest model and add complexity only when it measurably improves results.
4. Model Evaluation: the metric depends on the task. Accuracy, Precision, Recall, F1 (classification); RMSE, MAE (regression); AUC-ROC (classifiers that output probabilities or scores).
5. Deployment (MLOps): a model in production is not the same as a model in a notebook. MLOps is the set of practices for deploying and maintaining ML models: versioning, monitoring data drift, retraining.
6. Monitoring: models degrade over time. Data drift (a change in the distribution of input data) and concept drift (a change in the relationship between the inputs and the target) call for ongoing monitoring and regular retraining.
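Steps 4 and 6 above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the toy churn labels, the function names, and the choice of the population stability index (PSI) as a drift score are all assumptions made for the example. The first function shows why accuracy alone can mislead when churners are rare; the second compares the distribution of one numeric feature at training time and in production.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}

def population_stability_index(reference, current, n_bins=5):
    """PSI for one numeric feature: reference (training) vs current (production)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp outliers into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

# Hypothetical data: 100 customers, 10 of whom churned.
y_true = [1] * 10 + [0] * 90
always_no = [0] * 100                            # predicts "no churn" for everyone
model = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 86   # finds 6 churners, 4 false alarms

naive = classification_metrics(y_true, always_no)
better = classification_metrics(y_true, model)
print("always 'no churn':", naive)
print("toy model:        ", better)

# Hypothetical feature values at training time and after a shift in production.
training_values = list(range(100))
production_values = [v + 30 for v in training_values]
psi_same = population_stability_index(training_values, training_values)
psi_shifted = population_stability_index(training_values, production_values)
print("PSI (no shift):", psi_same)
print("PSI (shifted): ", psi_shifted)
```

The "always no churn" baseline reaches 90% accuracy while finding no churners at all, which is why precision, recall, or F1 are usually more informative for rare events. For PSI, a commonly quoted rule of thumb treats values above roughly 0.25 as a notable shift, but any threshold should be validated against the specific feature and product.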
Feature Engineering: the Art of Creating Features
The quality of features often matters more than the complexity of the algorithm. Examples of feature engineering: from transaction date → day of the week, hour of the day, days until the end of the month; from text → TF-IDF vectors, word embeddings; from geolocation → distance to the nearest competitor.
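The transaction-date example can be sketched with the Python standard library alone. The function name, the dictionary keys, and the sample timestamp are hypothetical and chosen only for illustration.

```python
import calendar
from datetime import datetime

def date_features(ts: datetime) -> dict:
    """Derive simple calendar features from one transaction timestamp."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(ts.year, ts.month)[1]
    return {
        "day_of_week": ts.weekday(),                     # 0 = Monday ... 6 = Sunday
        "hour_of_day": ts.hour,
        "days_until_month_end": days_in_month - ts.day,  # 0 on the last day
    }

# Hypothetical transaction: Tuesday 27 February 2024, 18:45 (a leap year).
features = date_features(datetime(2024, 2, 27, 18, 45))
print(features)
```

In a real pipeline the same logic would typically be applied column-wise to a whole table of transactions, but the derived features are the same.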
When ML Is NOT Needed
ML is not always the best tool. Rules of thumb: (1) if the problem can be solved with explicit rules, use rules; (2) if there is little data (fewer than about 1,000 examples for most tasks), ML is unlikely to generalize reliably; (3) if automation is not needed, use analytics.
Practical Assignment
An e-commerce company wants to predict whether a customer will return next month. Data: purchase history for 2 years, 500,000 customers. (1) Formulate the ML task (type, target variable). (2) Suggest 10 features for the model. (3) Which metric will you use for evaluation? Why? (4) How would you use the results of the model in the business?