
AnestheSim
Gymnasium Environments for Reinforcement Learning in Automated Anesthesia Drug Dosing
Hass Dhia -- Smart Technology Investments Research Institute
AnestheSim provides three Gymnasium-compatible reinforcement learning (RL) environments for automated anesthesia drug dosing: (1) propofol infusion control, using the Marsh three-compartment pharmacokinetic model with a Hill pharmacodynamic model for bispectral index (BIS) prediction; (2) remifentanil effect-site concentration targeting, using the Minto model; and (3) combined propofol-remifentanil anesthesia management, using the Greco synergistic interaction surface. The package includes configurable difficulty tiers with patient variability and surgical stimulation events, heuristic target-controlled infusion (TCI) clinical baselines, PPO agents, and a benchmark suite spanning three difficulty levels. Key finding: pharmacokinetic timescale, not task complexity, is the primary determinant of RL sample efficiency in drug dosing control. The project ships with 109 tests and is MIT licensed.
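To make the propofol setup described above more concrete, here is a minimal sketch of a three-compartment pharmacokinetic model with an effect-site compartment, followed by a Hill (sigmoid Emax) mapping from effect-site concentration to BIS. It is not AnestheSim's implementation: the class `ThreeCompartmentPK`, the function `hill_bis`, the forward-Euler integration, and every numeric value are assumptions made for illustration. The rate constants are figures often quoted for the Marsh model, and the Hill parameters (`e0`, `emax`, `ce50`, `gamma`) are placeholders; neither set is taken from the package.

```python
from dataclasses import dataclass


@dataclass
class ThreeCompartmentPK:
    """Three-compartment mammillary PK model plus an effect-site compartment.

    Hypothetical helper for illustration. Drug amounts a1..a3 are in mg,
    v1 is in litres, and all rate constants are in 1/min.
    """
    v1: float
    k10: float
    k12: float
    k21: float
    k13: float
    k31: float
    ke0: float
    a1: float = 0.0  # central compartment amount (mg)
    a2: float = 0.0  # fast peripheral compartment amount (mg)
    a3: float = 0.0  # slow peripheral compartment amount (mg)
    ce: float = 0.0  # effect-site concentration (mg/L, i.e. ug/mL)

    def step(self, infusion_mg_per_min: float, dt_min: float) -> float:
        """Advance the state by one forward-Euler step and return ce."""
        c1 = self.a1 / self.v1  # plasma concentration before the update
        da1 = (infusion_mg_per_min
               - (self.k10 + self.k12 + self.k13) * self.a1
               + self.k21 * self.a2
               + self.k31 * self.a3)
        da2 = self.k12 * self.a1 - self.k21 * self.a2
        da3 = self.k13 * self.a1 - self.k31 * self.a3
        dce = self.ke0 * (c1 - self.ce)
        self.a1 += da1 * dt_min
        self.a2 += da2 * dt_min
        self.a3 += da3 * dt_min
        self.ce += dce * dt_min
        return self.ce


def hill_bis(ce: float, e0: float = 100.0, emax: float = 100.0,
             ce50: float = 3.0, gamma: float = 2.0) -> float:
    """Sigmoid Emax (Hill) model: BIS falls from e0 as ce rises.

    Default parameters are illustrative assumptions, not fitted values.
    """
    effect = ce ** gamma / (ce50 ** gamma + ce ** gamma)
    return e0 - emax * effect


# Illustrative 70 kg patient; the constants below are assumptions.
pk = ThreeCompartmentPK(v1=0.228 * 70, k10=0.119, k12=0.112, k21=0.055,
                        k13=0.0419, k31=0.0033, ke0=0.26)
for _ in range(600):  # 10 minutes simulated at 1-second resolution
    ce = pk.step(infusion_mg_per_min=10.0, dt_min=1.0 / 60.0)
bis = hill_bis(ce)
```

In an RL setting, one call to `step` would correspond to the dynamics inside a single environment transition, with the infusion rate as the action and the predicted BIS as part of the observation. Forward Euler is used here only for brevity; a small time step relative to the fastest rate constant keeps it stable.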







