
HemoSim
Gymnasium Environments for Reinforcement Learning in Hemostasis and Anticoagulation Management
Hass Dhia - Smart Technology Investments Research Institute
HemoSim provides four Gymnasium-compatible RL environments for hemostasis and anticoagulation management: warfarin dose titration with CYP2C9/VKORC1 pharmacogenomics (Hamberg 2007); heparin infusion optimization with aPTT monitoring (Raschke 1993); direct oral anticoagulant (DOAC) selection for atrial fibrillation (RE-LY, ROCKET-AF, and ARISTOTLE trials); and disseminated intravascular coagulation (DIC) management with multi-component blood product therapy (ISTH DIC scoring). It implements an 8-state reduced coagulation cascade ODE model derived from Hockin et al. (2002), population PK/PD models with pharmacogenomic patient variability, and clinical protocol baselines (IWPC, Raschke nomogram). Key finding: in DOAC management, PPO achieves an 83.4% improvement over clinical baselines, with learned drug-dose optimization outperforming guideline-based selection. 142 tests, MIT licensed.
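To make the "Gymnasium-compatible" interaction pattern concrete, the sketch below shows the reset/step loop that such environments expose, together with a fixed rule-based policy of the kind a clinical protocol baseline represents. It is an illustration only and is not HemoSim code: `ToyInfusionEnv`, `threshold_policy`, `run_episode`, the three infusion rates, and the first-order dynamics are all hypothetical, in arbitrary units, and have no clinical meaning. Only the return shapes follow the Gymnasium convention, where `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`.

```python
import random


class ToyInfusionEnv:
    """Hypothetical stand-in that mimics Gymnasium's reset/step conventions.

    Not part of HemoSim. The state is a single drug level in arbitrary units
    that decays by a fixed fraction each step and rises with the infusion rate.
    """

    RATES = (0.0, 1.0, 2.0)  # made-up infusion per step for actions 0, 1, 2

    def __init__(self, target=5.0, decay=0.8, max_steps=24):
        self.target = target
        self.decay = decay
        self.max_steps = max_steps
        self.level = 0.0
        self.t = 0

    def reset(self, seed=None):
        # Seeded initial level stands in for patient-to-patient variability.
        rng = random.Random(seed)
        self.level = rng.uniform(0.0, 2.0)
        self.t = 0
        return self.level, {}

    def step(self, action):
        # Toy first-order dynamics: decay the level, then add the infusion.
        self.level = self.decay * self.level + self.RATES[action]
        self.t += 1
        reward = -abs(self.level - self.target)  # penalize distance from target
        terminated = False                       # no terminal event in this toy
        truncated = self.t >= self.max_steps     # fixed-length episode
        return self.level, reward, terminated, truncated, {}


def threshold_policy(level, target=5.0):
    """Toy rule-based baseline: raise, hold, or stop based on a target band."""
    if level < target - 1.0:
        return 2
    if level > target + 1.0:
        return 0
    return 1


def run_episode(env, policy, seed=0):
    """Roll out one episode and return the total reward."""
    obs, _ = env.reset(seed=seed)
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total


if __name__ == "__main__":
    env = ToyInfusionEnv()
    print("rule-based return:", run_episode(env, threshold_policy))
    print("no-infusion return:", run_episode(env, lambda level: 0))
```

A learned policy such as PPO would plug into the same loop in place of `threshold_policy`, which is what makes a comparison of returns against a protocol baseline possible on identical episodes.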












