
Research Program

Reinforcement Learning

Open-source simulation platforms for RL in scientific applications. Every project ships code, data, and a paper.

Publications

8 publications in Reinforcement Learning

2026 | Reinforcement Learning | Anesthesiology | Drug Dosing | Open Source

AnestheSim

Gymnasium Environments for Reinforcement Learning in Automated Anesthesia Drug Dosing

Hass Dhia -- Smart Technology Investments Research Institute

Three Gymnasium-compatible RL environments for automated anesthesia drug dosing: propofol infusion control via the Marsh three-compartment pharmacokinetic model with Hill pharmacodynamic BIS prediction, remifentanil effect-site concentration targeting via the Minto model, and combined propofol-remifentanil anesthesia management using the Greco synergistic interaction surface. Includes configurable difficulty tiers with patient variability and surgical stimulation events, heuristic TCI clinical baselines, PPO RL agents, and a benchmark suite across three difficulty levels. Key finding: pharmacokinetic timescale, not task complexity, is the primary determinant of RL sample efficiency in drug dosing control. 109 tests, MIT licensed.
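The Hill pharmacodynamic step named above can be illustrated with a minimal sketch. It assumes the common sigmoid Emax form, BIS = E0 - Emax * Ce^gamma / (EC50^gamma + Ce^gamma); the function name `hill_bis` and every parameter value are illustrative assumptions, not taken from the AnestheSim code or paper.

```python
def hill_bis(ce, e0=100.0, emax=100.0, ec50=3.0, gamma=2.0):
    """Map effect-site concentration ce to a BIS value with a sigmoid Emax (Hill) curve.

    All defaults are placeholder values chosen for readability, not fitted
    patient parameters.
    """
    if ce < 0:
        raise ValueError("concentration must be non-negative")
    effect = emax * ce**gamma / (ec50**gamma + ce**gamma)
    return e0 - effect


# BIS falls from the awake baseline e0 as concentration rises,
# reaching e0 - emax / 2 exactly at ce == ec50.
for ce in (0.0, 1.5, 3.0, 6.0):
    print(f"Ce = {ce:.1f} -> BIS = {hill_bis(ce):.1f}")
```

In a full environment this mapping would sit downstream of the compartmental pharmacokinetic model, which supplies the effect-site concentration at each step.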

2026 | Reinforcement Learning | Brain-Computer Interfaces | Neuroscience | Open Source

NeuroSim

A Gymnasium Platform for Reinforcement Learning in Brain-Computer Interfaces

Hass Dhia — Smart Technology Investments Research Institute

Gymnasium-compatible RL environment suite for brain-computer interfaces with three environments modeling motor imagery decoding, intracortical cursor control, and P300 speller navigation. Includes pluggable signal models (electrode drift, fatigue, co-adaptation, noise), a conditional VAE neural surrogate, CSP+LDA classical baseline, PPO RL baseline, and a five-tier benchmark suite. 158 tests, MIT licensed.
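One way to picture "pluggable" signal models is as small callables chained over a clean signal. The sketch below is an assumption about the general pattern only: the names `electrode_drift`, `gaussian_noise`, and `run_pipeline`, and the linear drift law, are hypothetical and do not describe NeuroSim's actual interfaces.

```python
import random


def electrode_drift(rate=0.001):
    """Hypothetical drift model: gain decays linearly with step index t, floored at 0."""
    def apply(sample, t, rng):
        return sample * max(0.0, 1.0 - rate * t)
    return apply


def gaussian_noise(sigma=0.1):
    """Hypothetical additive noise model using the shared random generator."""
    def apply(sample, t, rng):
        return sample + rng.gauss(0.0, sigma)
    return apply


def run_pipeline(clean, models, seed=0):
    """Apply each model in order to every sample; a fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    out = []
    for t, sample in enumerate(clean):
        for model in models:
            sample = model(sample, t, rng)
        out.append(sample)
    return out


degraded = run_pipeline([1.0] * 5, [electrode_drift(0.1), gaussian_noise(0.05)])
print(degraded)
```

Keeping each effect behind the same call signature means a benchmark tier could add or remove effects by editing the list passed to `run_pipeline`.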

2026 | Reinforcement Learning | Medical Robotics | Simulation | Open Source

VascularSim

A Gymnasium Platform for Microrobot Navigation in Patient-Derived Vascular Networks

Hass Dhia — Smart Technology Investments Research Institute

Open-source simulation platform providing a complete stack for training RL agents to navigate blood vessel graphs: TubeTK data ingestion, three Gymnasium environments with physics-based observations (VascularNav, FlowAwareNav, MagneticNav), analytical hemodynamics and magnetic field models, a neural flow surrogate, PPO baseline agents, and a benchmark suite across 5 difficulty tiers. 139 tests, MIT licensed.
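The listing does not say which analytical hemodynamics model is used. As a hedged illustration, the sketch below assumes steady laminar Hagen-Poiseuille flow in a rigid cylindrical segment, a standard closed-form choice for vessel graphs; the function names and the default viscosity (roughly that of whole blood) are assumptions, not VascularSim's API.

```python
import math


def poiseuille_resistance(radius_m, length_m, viscosity_pa_s=3.5e-3):
    """Hydraulic resistance R = 8 * mu * L / (pi * r^4) of one vessel segment."""
    if radius_m <= 0 or length_m <= 0:
        raise ValueError("radius and length must be positive")
    return 8.0 * viscosity_pa_s * length_m / (math.pi * radius_m**4)


def segment_flow(pressure_drop_pa, radius_m, length_m, viscosity_pa_s=3.5e-3):
    """Volumetric flow Q = delta_P / R in m^3/s."""
    return pressure_drop_pa / poiseuille_resistance(radius_m, length_m, viscosity_pa_s)


def mean_velocity(flow_m3_s, radius_m):
    """Cross-section-averaged velocity in m/s."""
    return flow_m3_s / (math.pi * radius_m**2)


# Example: a 1 mm radius, 5 cm long segment with a 100 Pa pressure drop.
q = segment_flow(100.0, 1e-3, 0.05)
print(f"flow = {q:.3e} m^3/s, mean velocity = {mean_velocity(q, 1e-3):.3f} m/s")
```

The fourth-power dependence on radius is why narrow branches dominate resistance in a vessel graph: halving the radius raises resistance sixteenfold.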

2026 | Reinforcement Learning | Drug Discovery | Peptide Engineering | Open Source

PeptideGym

Gymnasium-Compatible RL Environments for Therapeutic Peptide Design

Hass Dhia — Smart Technology Investments Research Institute

Three Gymnasium-compatible RL environments for therapeutic peptide design: antimicrobial peptides (AMP), cyclic peptides, and T-cell epitopes. Includes heuristic biophysical scoring models, PPO and random baseline agents, reward shaping analysis revealing mode collapse boundaries, and a benchmark suite. First systematic demonstration that per-step reward shaping magnitude determines whether RL agents learn meaningful peptide sequences or degenerate to single-residue exploitation. 125 tests, MIT licensed.
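Single-residue exploitation is easy to spot in generated sequences. The sketch below shows one possible diagnostic, the Shannon entropy of residue composition; it is an illustrative assumption, not the analysis used in the PeptideGym paper, and the names `composition_entropy` and `looks_collapsed` and the 1-bit threshold are hypothetical.

```python
import math
from collections import Counter


def composition_entropy(sequence):
    """Shannon entropy (bits) of the amino-acid composition of a sequence."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    counts = Counter(sequence)
    total = len(sequence)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def looks_collapsed(sequence, threshold_bits=1.0):
    """Flag sequences dominated by one or two residues (threshold is an arbitrary choice)."""
    return composition_entropy(sequence) < threshold_bits


for seq in ("KKKKKKKKKK", "KLAKLAKKLA", "ACDEFGHIKLMNPQRSTVWY"):
    print(seq, round(composition_entropy(seq), 2), looks_collapsed(seq))
```

A homopolymer scores 0 bits, while a sequence that uses all 20 standard residues equally scores log2(20), about 4.32 bits.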

2026 | Reinforcement Learning | Radiation Therapy | Oncology | Open Source

OncoSim

Gymnasium Environments for Reinforcement Learning in Radiation Therapy Treatment Planning

Hass Dhia — Smart Technology Investments Research Institute

Three Gymnasium-compatible RL environments for radiation therapy treatment planning: beam angle optimization, dose fractionation scheduling, and adaptive replanning. Includes analytical pencil beam dose calculation, linear-quadratic cell survival, TCP/NTCP radiobiological models, configurable difficulty tiers, and baseline agents (random, heuristic, PPO). PPO achieves 11.7x improvement on beam selection and 15.4x on adaptive replanning over clinical heuristics. 141 tests, MIT licensed.
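The linear-quadratic cell survival model mentioned above has a compact closed form, S = exp(-n * (alpha * d + beta * d^2)) for n fractions of dose d. The sketch below is a minimal illustration assuming complete repair between fractions and no repopulation; the function names and the alpha and beta values are placeholders, not OncoSim's parameters.

```python
import math


def surviving_fraction(dose_per_fraction, n_fractions, alpha=0.3, beta=0.03):
    """LQ surviving fraction after n equal fractions (alpha in 1/Gy, beta in 1/Gy^2)."""
    d = dose_per_fraction
    return math.exp(-n_fractions * (alpha * d + beta * d * d))


def biologically_effective_dose(dose_per_fraction, n_fractions, alpha_beta=10.0):
    """BED = n * d * (1 + d / (alpha/beta)), in Gy."""
    d = dose_per_fraction
    return n_fractions * d * (1.0 + d / alpha_beta)


# Conventional schedule: 30 fractions of 2 Gy with alpha/beta = 10 Gy.
print(biologically_effective_dose(2.0, 30))  # 72.0
print(surviving_fraction(2.0, 30))
```

The two quantities are linked by BED = -ln(S) / alpha, which is what lets a fractionation-scheduling reward compare schedules with different fraction sizes on one scale.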

2026 | Reinforcement Learning | Glucose Management | Diabetes | Open Source

GlucoSim

Gymnasium Environments for Reinforcement Learning in Glucose Management

Hass Dhia — Smart Technology Investments Research Institute

Three Gymnasium-compatible RL environments for Type 1 diabetes glucose management: basal rate optimization, meal bolus dosing, and full closed-loop insulin delivery. Includes the Bergman minimal glucose-insulin model, Dalla Man gut absorption dynamics, a CGM sensor noise model, 30 virtual patients across three age groups, configurable difficulty tiers, heuristic clinical baselines, PPO RL agents, and a five-tier benchmark suite. Key finding: composite reward functions with safety constraints are necessary to differentiate learned policies from naive baselines in glucose management RL. 117 tests, MIT licensed.
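For orientation, the Bergman minimal model couples plasma glucose G to a remote insulin action X: dG/dt = -p1 * (G - Gb) - X * G + Ra and dX/dt = -p2 * X + p3 * (I - Ib). The sketch below integrates it with a forward Euler step. It omits gut absorption dynamics and sensor noise, treats the glucose appearance rate `ra` as a direct input, and uses placeholder parameter values; none of the names or numbers come from GlucoSim.

```python
def bergman_step(g, x, insulin, ra=0.0, dt=1.0,
                 gb=100.0, ib=10.0, p1=0.028, p2=0.025, p3=1.3e-5):
    """Advance glucose g (mg/dL) and insulin action x (1/min) by dt minutes.

    insulin is plasma insulin (same units as basal ib); ra is glucose
    appearance in mg/dL/min. Parameters are illustrative, not patient-fitted.
    """
    dg = -p1 * (g - gb) - x * g + ra
    dx = -p2 * x + p3 * (insulin - ib)
    return g + dt * dg, x + dt * dx


# Start hyperglycemic at basal insulin: glucose relaxes toward the basal level gb.
g, x = 180.0, 0.0
for minute in range(300):
    g, x = bergman_step(g, x, insulin=10.0)
print(f"glucose after 300 min: {g:.1f} mg/dL")
```

An RL environment built on such a model would typically expose the insulin input as the action and derive the reward from how long glucose stays inside a target range.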

2026 | Reinforcement Learning | Mechanical Ventilation | Critical Care | Open Source

VentiSim

Gymnasium Environments for Reinforcement Learning in Mechanical Ventilation

Hass Dhia -- Smart Technology Investments Research Institute

Three Gymnasium-compatible RL environments for mechanical ventilation: tidal volume control via inspiratory pressure adjustment, PEEP optimization for oxygenation, and full ventilator parameter management for ARDS patients. Implements a single-compartment lung mechanics model coupled with a simplified gas exchange model, configurable difficulty tiers with patient variability and disease progression, heuristic clinical baselines, and PPO agents. Key finding: PPO improvement over baselines scales monotonically with action dimensionality, from 11.8% in 1D to 65.0% in 4D control. 230 tests, MIT licensed.
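A single-compartment lung model treats the respiratory system as one resistance R in series with one compliance C, so airway pressure is P = V / C + R * flow + PEEP. Under a constant inspiratory pressure delta_P above PEEP, delivered volume follows V(t) = C * delta_P * (1 - exp(-t / (R * C))). The sketch below evaluates that expression; the function name and default values are illustrative assumptions, not VentiSim's implementation.

```python
import math


def tidal_volume(delta_p, t_insp, compliance=0.05, resistance=10.0):
    """Volume (L) delivered after t_insp seconds of constant driving pressure.

    delta_p: inspiratory pressure above PEEP (cmH2O)
    compliance: L/cmH2O; resistance: cmH2O/(L/s). Defaults are placeholders.
    """
    if t_insp < 0:
        raise ValueError("inspiratory time must be non-negative")
    tau = resistance * compliance  # time constant in seconds
    return compliance * delta_p * (1.0 - math.exp(-t_insp / tau))


# With tau = 0.5 s, one second of inspiration delivers about 86% of C * delta_P.
for dp in (5.0, 10.0, 15.0):
    print(f"delta_P = {dp:.0f} cmH2O -> VT = {tidal_volume(dp, 1.0) * 1000:.0f} mL")
```

Because volume is linear in driving pressure for fixed R, C, and inspiratory time, the 1D tidal-volume task reduces to tracking a setpoint through a first-order lag, which may help explain why heuristics are hard to beat there.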

2026 | Reinforcement Learning | Cardiac Electrophysiology | Drug Dosing | Open Source

CardioSim

Gymnasium Environments for Reinforcement Learning in Cardiac Electrophysiology

Hass Dhia -- Smart Technology Investments Research Institute

Three Gymnasium-compatible RL environments for cardiac electrophysiology: pacemaker rate optimization via the FitzHugh-Nagumo model with a cardiac conduction system simulator, antiarrhythmic drug dosing using the FitzHugh-Nagumo model with single-compartment PK/PD dynamics, and defibrillation timing via the Aliev-Panfilov model with probabilistic shock success. Includes configurable difficulty tiers, heuristic clinical baselines, and PPO agents. Environments span a difficulty spectrum from learnable (drug dosing, pacing) to open-challenge (defibrillation timing). 134 tests, MIT licensed.
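The FitzHugh-Nagumo model reduces excitable-cell dynamics to a fast voltage-like variable v and a slow recovery variable w: dv/dt = v - v^3/3 - w + I and dw/dt = eps * (v + a - b * w). The sketch below integrates it with forward Euler using the textbook parameter set (a = 0.7, b = 0.8, eps = 0.08); the function names, step size, and stimulus values are illustrative assumptions and do not reflect how CardioSim couples the model to pacing or drug effects.

```python
def fhn_derivatives(v, w, i_ext=0.0, a=0.7, b=0.8, eps=0.08):
    """Right-hand side of the FitzHugh-Nagumo equations (dimensionless units)."""
    dv = v - v**3 / 3.0 - w + i_ext
    dw = eps * (v + a - b * w)
    return dv, dw


def simulate(i_ext, steps=20000, dt=0.01, v=-1.2, w=-0.6):
    """Forward-Euler trace of v under a constant stimulus current i_ext."""
    trace = []
    for _ in range(steps):
        dv, dw = fhn_derivatives(v, w, i_ext)
        v += dt * dv
        w += dt * dw
        trace.append(v)
    return trace


resting = simulate(i_ext=0.0)   # settles to the stable rest state
spiking = simulate(i_ext=0.5)   # sustained oscillation (repetitive firing)
print(f"rest: max v = {max(resting[10000:]):.2f}")
print(f"stimulated: v range = [{min(spiking[10000:]):.2f}, {max(spiking[10000:]):.2f}]")
```

Without stimulus the state stays near rest, while a sufficiently large constant current pushes the system onto a limit cycle; a pacing or drug-dosing task can be framed as steering between such regimes.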

8 Publications

17 Domains

8 Open-Source Repos

8 Published Packages

Collaborate with us

We welcome research collaborations, dataset contributions, and open-source partnerships across any discipline. Reach out to discuss.