BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//IORA - Institute of Operations Research and Analytics - ECPv6.15.11//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://iora.nus.edu.sg
X-WR-CALDESC:Events for IORA - Institute of Operations Research and Analytics
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Asia/Singapore
BEGIN:STANDARD
TZOFFSETFROM:+0800
TZOFFSETTO:+0800
TZNAME:+08
DTSTART:20250101T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Asia/Singapore:20260604T100000
DTEND;TZID=Asia/Singapore:20260604T113000
DTSTAMP:20260603T041211
CREATED:20260602T031622Z
LAST-MODIFIED:20260602T031622Z
UID:27613-1780567200-1780572600@iora.nus.edu.sg
SUMMARY:DAO-ISEM-IORA Seminar Series: Jiheng Zhang
DESCRIPTION:Name of Speaker\n\n\nJiheng Zhang \n\n\n\n\nSchedule \n\n\n4 Jun 2026\, 10am – 11.30am \n (60 min talk + 30 min Q&A)\n\n\n\nVenue \n\n\nBIZ1-0302\n\n\n\nLink to register \n(via Zoom)\n\nhttps://nus-sg.zoom.us/meeting/register/wv9WK1alQ6SrTblAvPtglw\n\n\n\n\nTitle\n\n\nAI for OR and OR for AI\n\n\n\n\nAbstract \n\n\n\nThis talk explores the interplay between Operations Research (OR) and Large Language Models (LLMs) through two complementary research directions. \n\n\nIn the first part\, we study the scheduling of LLM inference workloads across large GPU clusters. LLM inference involves two phases: a compute-intensive prefill phase that processes user input\, and a memory-bound decode phase that generates output tokens. When these phases share GPU resources\, prefill tasks throttle concurrent decodes\, creating state-dependent contention that is further complicated by workload heterogeneity across applications. We formulate this as a multiclass many-server queueing network with state-dependent service rates\, grounded in empirical iteration-time measurements. We analyze the fluid approximation and solve steady-state linear programs that characterize optimal resource allocation. We design gate-and-route policies that regulate prefill admission and decode routing\, and prove their asymptotic optimality in the many-GPU limit. We further extend the framework to incorporate service level indicators such as latency and fairness. Numerical experiments demonstrate that our policies outperform standard serving heuristics. \n\n\nIn the second part\, we consider the reverse direction: using LLMs to automate OR analysis. Formulating optimization models from natural language and generating executable solver code could reduce reliance on scarce expert knowledge\, but LLMs suffer from probabilistic inconsistency and existing methods face a data efficiency dilemma. 
 We propose OR-R1\, a framework that integrates supervised fine-tuning with Test-Time Group Relative Policy Optimization (TGRPO). TGRPO extracts reliable training signals from unlabeled data by generating multiple candidate solutions and treating the solver-verified consensus as the ground truth. We provide theoretical guarantees for gradient convergence and show that this voting-based proxy consistently maximizes true solution accuracy. OR-R1 requires only 10% of the training data used by prior methods\, yet achieves state-of-the-art accuracy across eight OR benchmarks. \n\n\n\n\n\nAbout the Speaker\n\n\nJiheng Zhang is a Professor in the Department of Industrial Engineering and Decision Analytics at HKUST\, where he also holds a joint appointment in the Department of Mathematics. His research interests include stochastic modeling and optimization\, statistical learning\, numerical methods\, and algorithms\, with applications in operations management\, large communication networks\, and financial technology. He serves as an associate editor for several top journals\, including Operations Research\, Stochastic Systems\, and Probability in the Engineering and Informational Sciences. Since 2018\, he has been the Director of the EPI-One Lab\, leading various applied projects with industry partners such as Huawei and Webank. He holds several patents in areas such as large-scale production planning and blockchain consensus mechanism design. He earned his Ph.D. in Operations Research from the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology in 2009. He also holds an M.S. in Mathematics from Ohio State University and a B.S. in Mathematics from Nanjing University.
URL:https://iora.nus.edu.sg/events/dao-isem-iora-seminar-series-jiheng-zhang/
CATEGORIES:IORA Seminar Series
END:VEVENT
END:VCALENDAR