Doreen Xu

Senior Data Scientist

7+ years building pricing, forecasting, and decision-support systems that drive measurable business impact across logistics and tech.

About

I'm a Senior Data Scientist who turns complex data into systems that make better decisions at scale. Over the past 7+ years, I've designed and deployed production models for dynamic pricing, demand forecasting, and revenue optimization, including work tied to 21% year-over-year revenue growth and a 14% increase in morning shipping volume in real-market experiments.

At SF Express (China's largest logistics company), I built the price elasticity and cost estimation models at the core of a company-wide dynamic pricing system spanning 470+ routes. At Ximalaya (China's #1 audio platform, 600M+ users), I led a cross-functional team to develop ML models that replaced subjective judgment with data-driven decisions in copyright acquisitions.

I hold dual Master's degrees from Georgia Tech in Computational Science & Engineering and Analytics, and a B.S. in Economics from the University of Minnesota. I'm based in Atlanta and bring a blend of rigorous quantitative training and hands-on experience shipping models that create measurable business impact.

Projects

Featured work with measurable business impact

E-commerce Demand Forecasting

Independent Project

21% YoY Revenue Growth

Developed a seasonal demand forecasting model to eliminate recurring peak-season stockouts. Built an automated competitor price monitoring system using the Amazon API to inform dynamic pricing decisions.

Time Series | Forecasting | Python | API Integration

Dynamic Pricing System

SF Express — $30B+ Revenue

470+ Routes Optimized

Led development of price elasticity and cost estimation models as core components of a company-wide dynamic pricing system. Validated through real-market experiments that achieved a 14% increase in morning volume.

Price Elasticity | XGBoost | A/B Testing | Hive

Copyright Value Estimation

Ximalaya — 600M+ Users

<20% MAPE Achieved

Led cross-functional team to develop and deploy a CatBoost-based copyright value estimation model, enabling data-driven investment decisions that previously relied on subjective judgment.

CatBoost | Team Leadership | Cross-functional

Customs Declaration Value Recommendation

SF Express

2 Countries Deployed

Applied NLP (Bag-of-Words) and Keras quantile regression on historical customs data to recommend declaration value ranges. Deployed to Japan and South Korea order channels.

NLP | Keras | Quantile Regression

Fresh Grocery Demand Forecasting

SF Express — Consulting

~30% MAPE (11.4M records)

Built SKU-level demand forecasts three days ahead (t+3) for a fresh grocery chain across 5 stores, using an RF + CatBoost ensemble with engineered calendar, weather, and sales distribution features.

Random Forest | CatBoost | Feature Engineering

Experience

Senior Data Scientist

Ximalaya

China's #1 audio platform — 600M+ users, comparable to Spotify/Audible

Jul 2021 — Sep 2023
Shanghai, China
  • Led cross-functional team (1 PM, 1 engineer, 2 data analysts) to deploy CatBoost-based copyright value estimation model (MAPE < 20%)
  • Built business dashboards with role-specific KPI views serving genre managers through senior leadership
  • Designed cross-departmental album grading system for standardized data-driven traffic allocation

Big Data Analysis & Mining Engineer

SF Express — SF Technology Co., Ltd

China's FedEx: largest logistics company, listed on the Shenzhen Stock Exchange, $30B+ revenue

Jun 2018 — Jul 2021
Shenzhen, China
  • Led development of price elasticity and cost estimation models for company-wide dynamic pricing across 470+ origin-destination pairs
  • Designed data infrastructure using Hive and Tableau with DWS-layer warehouse tables and dashboards
  • Built Python simulation to evaluate delivery timeliness improvement for new product launch

Data Scientist

Big Data Analytics Trading Inc

Feb 2017 — May 2018
Atlanta, GA
  • Built Python scraping pipelines and price monitoring system with ETL and API integrations for Amazon arbitrage

Skills

Machine Learning

Linear Regression | Log-linear | Quantile Regression | XGBoost | LightGBM | CatBoost | Random Forest | Clustering | Time Series Analysis

Analytics & Optimization

Demand Forecasting | Revenue Optimization | Price Elasticity Modeling | Dynamic Pricing | A/B Testing

Data & Engineering

Python | Pandas | NumPy | Scikit-learn | SQL | Hive | Tableau | Git | Docker | Spark | AWS | ETL/Data Pipelines

Publication

"Dynamic Pricing Strategy for Logistics Revenue Management using Data Mining Technology."

ISCSIC 2020, ACM

Education

Georgia Institute of Technology

M.S. in Computational Science & Engineering, M.S. in Analytics

GPA: 3.5/4.0 | Atlanta, GA | 2016

University of Minnesota, Twin Cities

B.S. in Economics with Distinction

GPA: 3.8/4.0 | Minneapolis, MN | 2013

Let's Connect

I'm open to Senior Data Scientist, ML Engineer, and analytics roles in Atlanta or remote. Let's talk about how I can help your team make better data-driven decisions.

Atlanta, GA