Business Problems & Data Science Solutions

Somsak Chanaim

International College of Digital Innovation, CMU

June 6, 2025

Business Problems

A business problem is a situation or obstacle that prevents a business from achieving its goals.

Examples include:

  • declining profits

  • falling sales

  • losing customers

  • rising costs

  • inefficient operations

If left unaddressed, these issues can affect a company’s competitiveness or long-term survival.

Why Do Businesses Need Data Science?

  • Businesses hold large volumes of data but often lack the analysis needed to turn it into insight

  • Data science enables data-driven decision-making

  • It helps reduce costs, improve efficiency, and forecast future trends

Data-Driven Business Problem Solving

This approach uses available data to analyze root causes and make strategic decisions based on evidence, rather than on intuition or experience alone.

Steps in Solving Business Problems with Data

Example: Sales of Product Group A dropped by 30% in the last quarter.

  1. Clearly Define the Problem

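    • A good problem statement should be clear, measurable, and time-bound.

    • It should be reframed as a question that data can answer, e.g., “Which customer segments, channels, or products account for the 30% drop in Product Group A sales last quarter?”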

  2. Collect Relevant Data

    • Daily/itemized sales data

    • Customer purchasing behavior

    • Promotion and competitor activity

    • Customer feedback and reviews

  3. Analyze the Data

| Technique | Use Case | Example |
|---|---|---|
| Descriptive Analytics | Describe what happened | “Sales dropped on weekdays” |
| Diagnostic Analytics | Identify the cause | “Customers aged 20–30 stopped buying” |
| Predictive Analytics | Forecast what may happen | “Sales may decline next quarter” |
| Prescriptive Analytics | Recommend actions | “Adjust pricing or launch a campaign” |

  4. Communicate Results Clearly

    • Dashboards

    • Data storytelling

    • Meeting slides focusing on impact and actionable insights

  5. Make Decisions and Monitor Outcomes

    • Test solutions (e.g., A/B testing)

    • Measure impact through KPIs

    • Apply continuous improvement cycles
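As a minimal sketch of the A/B testing step, the snippet below compares conversion rates between a control group and a test group with a two-proportion z-test. The function name and the visitor and conversion counts are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via the error function).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical numbers: control (A) vs. new campaign (B), 5,000 visitors each.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone; whether the uplift justifies the cost is still a business decision, measured through the KPIs above.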

Real-World Data Use Case

Problem: Customers Are Not Returning

Data Used:

  • Purchase history

  • Time gap between purchases

  • Customer satisfaction scores (e.g., Net Promoter Score, NPS)

Solution Approaches:

  • Analyze Customer Lifetime Value (CLV)

  • Perform segmentation and send targeted promotions

  • Build a recommendation system to encourage repeat purchases
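One way to use the "time gap between purchases" data is to flag customers whose current silence is unusually long compared with their own buying rhythm. The sketch below assumes a simple rule (silence longer than twice the customer's average gap); the threshold, customer IDs, and dates are all hypothetical.

```python
from datetime import date

def average_gap_days(purchase_dates):
    """Mean number of days between consecutive purchases (None if fewer than 2)."""
    ordered = sorted(purchase_dates)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

def is_lapsing(purchase_dates, today, factor=2.0):
    """Flag a customer whose time since last purchase exceeds factor x their usual gap.

    The factor of 2 is an arbitrary illustrative threshold, not a standard.
    """
    avg = average_gap_days(purchase_dates)
    if avg is None:
        return False  # not enough history to judge
    days_since_last = (today - max(purchase_dates)).days
    return days_since_last > factor * avg

# Hypothetical purchase histories.
history = {
    "C001": [date(2025, 1, 5), date(2025, 2, 4), date(2025, 3, 6)],
    "C002": [date(2025, 1, 10), date(2025, 1, 24), date(2025, 2, 7)],
}
today = date(2025, 4, 1)
for customer, dates in history.items():
    print(customer, average_gap_days(dates), is_lapsing(dates, today))
```

Customers flagged this way are natural candidates for the targeted promotions mentioned above.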

Examples of Business Problem Types

Private Sector

  • Marketing: Low customer retention, low conversion rates, low brand awareness

  • Operations: Overstocked inventory, demand fluctuation, slow processes

  • Finance: Poor cost control, inaccurate profit forecasts, cash flow issues

  • Human Resources: High employee turnover, low engagement, lack of tech talent

Government / Public Sector

  • Unequal access to public services: Slow complaint handling, staff shortages

  • Misallocated budget: Inability to measure policy impact

  • Lack of data for policy decisions: e.g., outdated data on low-income citizens

  • Corruption and low transparency: Weak internal auditing systems

  • Urban issues: Traffic, PM2.5 air pollution, and public health, all of which require data-driven spatial and behavioral management

Individual Level

  • Personal finance: Overspending, lack of savings, uncertainty about investments

  • Career development: Unaware of market-relevant skills, lack of career planning

  • Health & behavior: Lack of personal health data, risky habits

  • Lifelong learning: Not knowing how to upskill or which courses to choose

  • Life decisions: Choosing a house/car, planning for family life

Translating Business Problems → Data Problems

| Business Problem | Data Problem | Technique Used |
|---|---|---|
| Customer churn | Predict churn | Classification |
| Poor product sales | Forecast sales | Time Series Forecasting |
| Ineffective promotions | Segment customers | Clustering |
| Fraudulent transactions | Identify abnormal behavior | Anomaly Detection |
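To make the last row concrete, here is a minimal anomaly-detection sketch using the modified z-score (based on the median and the median absolute deviation), which is more robust to extreme values than a mean-based z-score. The 3.5 cutoff is a commonly used rule of thumb, and the transaction amounts are hypothetical.

```python
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a robust measure of spread.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against; this sketch flags nothing
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical transaction amounts (THB) for one account.
amounts = [120, 135, 110, 128, 140, 125, 2500]
print(flag_anomalies(amounts))
```

A flagged transaction is only a candidate for review, not proof of fraud.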

Key Business Metrics to Know

  • Customer Lifetime Value (CLV)

  • Customer Retention Rate

  • Return on Investment (ROI)

  • Inventory Turnover Rate

  • Click-Through Rate (CTR)

1. Customer Lifetime Value (CLV)

The total value a customer is expected to generate for the business throughout their entire relationship.

Formula:

\[ \text{CLV} = \text{Average Purchase Value} \times \text{Purchase Frequency} \times \text{Customer Lifespan} \]

Example:

  • Average purchase value = 500 THB per purchase

  • Purchase frequency = once a month → 12 purchases/year

  • Customer lifespan = 3 years

\[ \text{CLV} = 500 \times 12 \times 3 = 18{,}000 \text{ THB} \]

2. Customer Retention Rate

The percentage of existing customers who continue to use the service over a given period.

Formula:

\[ \text{Retention Rate} = \left( \frac{E - N}{S} \right) \times 100 \]

  • \(E\): Number of customers at the end of the period

  • \(N\): Number of new customers acquired

  • \(S\): Number of existing customers at the beginning of the period

Example:

  • Start of year: 1,000 customers

  • End of year: 1,200 customers (400 new)

\[ \text{Retention Rate} = \left( \frac{1{,}200 - 400}{1{,}000} \right) \times 100 = 80\% \]

3. Return on Investment (ROI)

Measures the return on an investment to evaluate its efficiency or profitability.

Formula:

\[ \text{ROI} = \left( \frac{\text{Net Profit}}{\text{Investment Cost}} \right) \times 100 \]

Example:

  • Investment: 100,000 THB

  • Net profit: 30,000 THB

\[ \text{ROI} = \left( \frac{30{,}000}{100{,}000} \right) \times 100 = 30\% \]

4. Inventory Turnover Rate

Measures how quickly inventory is sold and replenished. A higher rate means inventory spends less time in stock.

Formula:

\[ \text{Inventory Turnover} = \frac{\text{Cost of Goods Sold (COGS)}}{\text{Average Inventory}} \]

Example:

  • Annual COGS = 1,000,000 THB

  • Average inventory value = 250,000 THB

\[ \text{Inventory Turnover} = \frac{1{,}000{,}000}{250{,}000} = 4 \text{ cycles/year} \]
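A common companion figure converts turnover into the average number of days an item stays in stock. Assuming a 365-day year and the same example values:

\[ \text{Days in Inventory} = \frac{365}{\text{Inventory Turnover}} = \frac{365}{4} \approx 91 \text{ days} \]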

5. Click-Through Rate (CTR)

Measures the percentage of ad or link impressions that resulted in a click.

Formula:

\[ \text{CTR} = \left( \frac{\text{Number of Clicks}}{\text{Number of Impressions}} \right) \times 100 \]

Example:

  • Ad impressions: 10,000

  • Clicks: 300

\[ \text{CTR} = \left( \frac{300}{10{,}000} \right) \times 100 = 3\% \]
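The five formulas in this section can be collected into small helper functions. This is a minimal sketch; the function names are illustrative, and the printed values reuse the worked examples above.

```python
def clv(avg_purchase_value, purchases_per_year, lifespan_years):
    """Customer Lifetime Value in currency units."""
    return avg_purchase_value * purchases_per_year * lifespan_years

def retention_rate_pct(start_customers, end_customers, new_customers):
    """Share of starting customers still present at the end, in percent."""
    return 100 * (end_customers - new_customers) / start_customers

def roi_pct(net_profit, investment_cost):
    """Return on investment, in percent."""
    return 100 * net_profit / investment_cost

def inventory_turnover(cogs, average_inventory):
    """Number of times inventory is sold and replaced in the period."""
    return cogs / average_inventory

def ctr_pct(clicks, impressions):
    """Click-through rate, in percent."""
    return 100 * clicks / impressions

print(clv(500, 12, 3))                          # 18000
print(retention_rate_pct(1000, 1200, 400))      # 80.0
print(roi_pct(30_000, 100_000))                 # 30.0
print(inventory_turnover(1_000_000, 250_000))   # 4.0
print(ctr_pct(300, 10_000))                     # 3.0
```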

Types of Business Data

  • Behavioral data (e.g., customer actions)

  • Transaction data (e.g., purchase history)

  • Real-time logs (e.g., system or clickstream data)

  • External data (e.g., weather, social media)

Data Science Tools

  • Languages: Python, R, SQL

  • Libraries: pandas, scikit-learn, statsmodels

  • Visualization: Tableau, Power BI, Matplotlib

  • Cloud Platforms: Google Colab, Azure, AWS

Choosing the Right Model for the Problem

| Type | Example Models | When to Use |
|---|---|---|
| Classification | Logistic Regression, Random Forest | To predict categories |
| Regression | Linear Regression, XGBoost | To predict numeric values |
| Clustering | K-Means, DBSCAN | To find hidden groups |
| Recommendation | Collaborative Filtering | To recommend products |
| Time Series | ARIMA, Prophet | To forecast future trends |

Case Study Example

Problem: Sales dropped by 20% last month

  1. Investigate using Time Series Decomposition

  2. Analyze the marketing campaign → Promotion Analysis

  3. Perform Customer Segmentation

  4. Build a Sales Forecasting Model
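Time series decomposition helps tell whether a drop reflects a falling trend or just the usual seasonal pattern. The sketch below is a simplified additive decomposition using a centered moving average; it assumes an odd seasonal period (7 for daily data with a weekly cycle) and uses a synthetic series, not real sales.

```python
def decompose_additive(series, period=7):
    """Split a series into trend and seasonal parts (additive model, odd period)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles an odd period")
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full cycles of data")
    half = period // 2
    # Trend: centered moving average; undefined (None) near both ends.
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(series[t - half:t + half + 1]) / period
    # Seasonal: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    seasonal = [sum(b) / len(b) for b in buckets]
    # Center the seasonal component so it sums to zero over one cycle.
    offset = sum(seasonal) / period
    seasonal = [s - offset for s in seasonal]
    return trend, seasonal

# Synthetic example: a steadily falling trend plus a weekly pattern.
pattern = [0, -5, -5, 0, 5, 15, -10]
sales = [100 - t + pattern[t % 7] for t in range(28)]
trend, seasonal = decompose_additive(sales, period=7)
print(trend[10], seasonal)
```

If the trend component is falling, the drop is structural and worth investigating further; if only the seasonal component explains it, the drop may be expected.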

How to Present the Results?

  • Create a dashboard for clear and interactive insights

  • Use data storytelling

  • Focus on business impact, not just model accuracy

Ethics & Pitfalls

  • Data Privacy & GDPR (General Data Protection Regulation)

  • Bias in data can lead to flawed models

  • Ensure model results are explainable and understandable to stakeholders

Case Studies

Sales Forecasting

Business Problem: The company needs to forecast daily sales per branch to optimize inventory management.

Models Commonly Used

  • Time Series Models: ARIMA, SARIMA

  • Gradient Boosting: XGBoost, LightGBM

  • Deep Learning: LSTM (Long Short-Term Memory)

Workflow

  1. Collect Data: Daily sales, promotions, holidays, temperature, etc.

  2. Feature Engineering: Create new features such as lag sales, holiday dummy variables

  3. Model Selection & Training: Choose and train the model using historical data

  4. Model Evaluation: Assess using metrics like MAE, RMSE

  5. Deployment: Use the model to forecast future sales

  6. Feedback Loop: Continuously update and improve the model based on performance
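The feature engineering and evaluation steps can be sketched without any ML library. The example below builds lag features, uses a seasonal naive forecast (each day repeats the value from 7 days earlier) as a baseline, and scores it with MAE and RMSE. The sales figures are hypothetical, and the baseline stands in for the models listed above rather than replacing them.

```python
import math

def lag_features(series, lags=(1, 7)):
    """Rows of (target, {lag_k: value k days earlier}) for days with full history."""
    start = max(lags)
    return [
        (series[t], {f"lag_{k}": series[t - k] for k in lags})
        for t in range(start, len(series))
    ]

def seasonal_naive_forecast(history, horizon, season=7):
    """Forecast each future day with the value observed `season` days earlier."""
    extended = list(history)
    for _ in range(horizon):
        extended.append(extended[-season])
    return extended[len(history):]

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical daily sales for one branch: two weeks to train on, one week held out.
daily_sales = [120, 115, 130, 128, 160, 210, 190,
               125, 118, 127, 133, 158, 220, 185,
               122, 120, 131, 130, 165, 215, 195]
train, test = daily_sales[:14], daily_sales[14:]

print(lag_features(train)[0])
forecast = seasonal_naive_forecast(train, horizon=len(test))
print(f"MAE = {mae(test, forecast):.2f}, RMSE = {rmse(test, forecast):.2f}")
```

A more complex model such as XGBoost or an LSTM is worth deploying only if it beats a simple baseline like this one on held-out data.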

🎬 Netflix: Personalized Recommendation System

Problem: Users don’t know what to watch → Churn risk

Models Used

  • Collaborative Filtering: Matrix Factorization (SVD, ALS)

  • Content-Based Filtering: TF-IDF + Cosine Similarity

  • Deep Learning: Autoencoders, Neural Collaborative Filtering

Workflow

  1. Collect user viewing data (user–movie interaction logs)

  2. Build user and content profiles (user embeddings, movie features)

  3. Train models to predict preferences (e.g., probability of watching)

  4. Generate personalized recommendations

  5. A/B test to evaluate user satisfaction

  6. Improve the model based on feedback and new interactions
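As an illustration of the content-based approach (TF-IDF + cosine similarity), the sketch below recommends titles whose descriptions are most similar to one the user has watched. The catalog, titles, and descriptions are invented for the example; it is not a description of Netflix's actual system.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each title to a {term: tf-idf weight} dict built from its description."""
    tokenized = {title: text.lower().split() for title, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many descriptions each term appears.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for title, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[title] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(watched, docs, top_n=2):
    """Return the top_n titles most similar to the watched one."""
    vectors = tfidf_vectors(docs)
    scores = {
        title: cosine(vectors[watched], vec)
        for title, vec in vectors.items() if title != watched
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical catalog: title -> short description.
catalog = {
    "Space Rescue": "space crew rescue mission thriller",
    "Mars Colony": "space colony survival drama",
    "Baking Duel": "baking competition reality show",
    "Deep Orbit": "space mission thriller crew",
}
print(recommend("Space Rescue", catalog, top_n=2))
```

Content-based filtering needs no interaction history, which makes it useful for new titles; collaborative filtering complements it once viewing logs accumulate.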

☕ Starbucks: Location Analytics for New Store Placement

Problem: Where should we open a new store to ensure profitability?

Models Used:

  • Geospatial Clustering: K-Means, DBSCAN

  • Regression Models: Random Forest Regressor

  • Predictive Modeling: Gradient Boosting, Decision Trees

Workflow:

  1. Collect data: existing store locations, foot traffic, revenue, competitor density

  2. Create GIS maps: plot coordinates and connect with spatial data

  3. Cluster locations to find areas similar to high-performing stores

  4. Forecast expected sales at potential new locations

  5. Recommend high-potential areas for expansion

  6. Monitor actual performance post-launch
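The clustering step can be sketched with a plain K-Means implementation. The points below are hypothetical store coordinates in kilometres on a flat projected plane; real latitude/longitude data would need a proper projection or a geodesic distance. Starting the centroids at the first k points is a simplification that keeps the result deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on (x, y) points; centroids start at the first k points."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels

# Hypothetical store locations (km east, km north of a reference point).
stores = [(1.0, 1.2), (9.0, 8.5), (1.3, 0.8), (0.9, 1.0), (8.7, 9.1), (9.2, 8.8)]
centroids, labels = kmeans(stores, k=2)
print(labels)
print(centroids)
```

In practice the feature vector would also include foot traffic, revenue, and competitor density, scaled to comparable ranges before clustering.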

🚗 GrabCar: Dynamic Pricing Strategy

Problem:

  • Passengers complain about unexpected fare increases or excessively high prices during peak hours

  • Without price adjustments, however, drivers may reject requests. The challenge is to balance passenger satisfaction against driver incentives

Models Used:

  • Regression Models: Linear, Ridge, Lasso, Gradient Boosting

  • Reinforcement Learning: Multi-Armed Bandit, Deep Q-Network (DQN)

  • Demand Forecasting: XGBoost, LSTM (for time-dependent patterns)

  • Geospatial Analytics: Heatmaps, Clustering

Data Science Pipeline:

  1. Collect relevant data

    • Ride request volume by location

    • Trip duration, traffic density

    • Cancellation and surge pricing behavior

    • External factors: weather, holidays, events

  2. Analyze Demand & Supply

    • Detect demand surges via heatmaps and time series

    • Identify areas with driver shortages

  3. Forecast future demand

    • Predict the number of ride requests in the next 15 minutes using time series/ML models

  4. Apply Dynamic Pricing Algorithms

    • Adjust fares based on supply-demand ratio

    • Use Reinforcement Learning to find the most effective pricing

  5. Conduct A/B Testing

    • Compare fixed pricing vs. dynamic pricing groups

    • Evaluate usage, wait times, customer satisfaction

  6. Deploy & Monitor

    • Continuously update models based on seasonal, event-based, or behavioral changes
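To illustrate the multi-armed bandit idea from step 4, the sketch below uses an epsilon-greedy strategy to learn which fare multiplier earns the most per request. Everything here is simulated: the multipliers, the passenger acceptance probabilities, and the reward definition (the multiplier if the ride is accepted, zero otherwise) are assumptions for the example, not Grab's actual algorithm.

```python
import random

def epsilon_greedy_pricing(multipliers, accept_prob, rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy selection among fare multipliers.

    Returns (counts, mean_reward): how often each multiplier was tried and
    the average reward observed for it.
    """
    rng = random.Random(seed)
    n_arms = len(multipliers)
    counts = [0] * n_arms
    mean_reward = [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random multiplier
        else:
            arm = max(range(n_arms), key=lambda i: mean_reward[i])  # exploit the best so far
        # Simulated passenger response (hypothetical acceptance probabilities).
        accepted = rng.random() < accept_prob[arm]
        reward = multipliers[arm] if accepted else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        mean_reward[arm] += (reward - mean_reward[arm]) / counts[arm]
    return counts, mean_reward

multipliers = [1.0, 1.5, 2.0]
accept_prob = [0.95, 0.85, 0.30]  # assumed: higher price, fewer acceptances
counts, mean_reward = epsilon_greedy_pricing(multipliers, accept_prob)
best = multipliers[max(range(len(multipliers)), key=lambda i: mean_reward[i])]
print(counts, [round(r, 2) for r in mean_reward], best)
```

A production system would also condition on context such as location, time, and driver supply, which is where contextual bandits or a DQN come in.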

✅ Outcomes:

  • Reduced passenger wait times → Higher satisfaction

  • Maintained driver motivation → More accepted rides

  • Improved ride matching → Lower cancellation rates

  • Increased average driver earnings during peak hours

🛍️ Zara: Inventory Optimization

Problem: Fast fashion trends change rapidly → unsold inventory

Models Used:

  • Clustering: K-Means, Hierarchical Clustering

  • Classification: Random Forest, Logistic Regression

  • Forecasting: Time Series + Machine Learning

Workflow:

  1. Analyze sales data by store and product

  2. Cluster stores based on sales behavior

  3. Forecast demand per store using customer behavior patterns

  4. Align inventory with the target segment of each store

  5. Use ML to identify “low-demand” items → reduce overproduction

  6. Track weekly performance and adjust models based on actual sales

Chapter Summary

  • Understanding the business problem is the first step

  • Match the problem with the right Data Science techniques

  • Present actionable and practical results

  • “Data is valuable only when turned into action.”