Lecture title: Fast-Slow Dual-Agent Deep Reinforcement Learning for Dynamic Pricing and Replenishment
Speaker: Associate Professor Peng Yijie (彭一杰)
Host: Professor Li Yongjian (李勇建)
Time: 10:00, September 6, 2024
Venue: A501-3
Abstract:
We study dynamic pricing and replenishment problems in which the two decisions are made at different frequencies. Our analysis first focuses on a single-product scenario that accounts for competitor strategies and market conditions. Unlike traditional demand assumptions, demand here is discrete and follows a Poisson distribution whose parameter is a function of price, which complicates the analysis of the problem's structural properties. We prove that the single-period profit function is concave with respect to product price and inventory within their respective domains. The model is subsequently enhanced by integrating a decision-tree-based machine learning approach trained on real market data that includes product attributes, customer behavior, and temporal data. Using a two-timescale stochastic approximation scheme, we address the discrepancy in decision frequencies between pricing and replenishment and ensure convergence to the limit points of the corresponding ordinary differential equations. We further enhance our methodology by incorporating deep reinforcement learning techniques. Numerical results from both single-product and multiple-product implementations validate the effectiveness of our methods.
About the speaker:
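Peng Yijie (彭一杰) is an associate professor and doctoral supervisor at the Guanghua School of Management, Peking University. He is executive director of the Center for Multi-Agent and Social Intelligence at the Institute for Artificial Intelligence, Peking University; director of the Multi-Agent and Decision Intelligence Laboratory at the Peking University Wuhan Institute for Artificial Intelligence; and director of the Multi-Agent and Industrial Intelligence Laboratory at the Advanced Institute of Information Technology, Peking University. He received his bachelor's degree from the School of Mathematics and Statistics, Wuhan University, and his Ph.D. from the School of Management, Fudan University. He worked as a postdoctoral researcher at the University of Maryland and as an assistant professor at George Mason University in the United States. His main research interests include simulation modeling and optimization, financial engineering and risk management, artificial intelligence, and healthcare. He has led projects funded by the Excellent Young Scientists Fund, the Original Exploration Program, and the Distinguished Young Scholars Fund, among others. He has published papers in high-quality journals such as Operations Research, INFORMS Journal on Computing, and IEEE Transactions on Automatic Control, as well as at top artificial intelligence conferences, and has received the INFORMS Outstanding Simulation Publication Award and a second prize in the Ministry of Education's 9th Outstanding Scientific Research Achievement Awards for Institutions of Higher Education. He currently serves as an associate editor of Asia-Pacific Journal of Operational Research and Journal of Systems Science and Information, an area editor of 《系统管理学报》 (Journal of Systems & Management), vice president of the Beijing Operations Research Society, vice president of the Fintech and Big Data Branch of the National Society for Industrial Statistics Teaching and Research, and a council member of the Society of Management Science and Engineering.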