MEng Robotics at UIUC, building learning-based systems for humanoids, autonomous vehicles, and mobile robots.
Terrain-aware locomotion for a 29-DOF Unitree G1, trained with PPO in Isaac Lab under a stand→walk→DR curriculum. A CNN encodes height scans alongside proprioception into the actor-critic, with domain randomization across mass, friction, PD gains, and pushes.
Real-world deployments, novel algorithms, and sim-to-real systems across humanoids, AVs, and mobile robots. Featured projects showcase the deepest work.

Conversational navigation on a Unitree Go1 + Jetson Orin: a fine-tuned NaVILA (8B VLA) emits mid-level language actions (forward / turn / stop) from RGB streams. Added a new "ask user" action token so the policy defers under ambiguous instructions, asks a GPT-4o-generated clarification question, and resumes after the operator answers via a dual-variant instruction amender. Diagnosed that NaVILA expects an 8-frame linspace memory bank rather than a sliding window; correcting this raised offline accuracy by ~51 points. LoRA fine-tuned over teleop + 4 DAgger iterations, with ~1 Hz offboard inference and 8-bit quantization on an RTX workstation; the ask-user checkpoint is the first to fire the action correctly on the live robot.
A conversational extension of NaVILA on a Unitree Go1 quadruped: the vision-language-action policy defers under ambiguous instructions, asks the operator a clarifying question, and resumes - all in the language modality it already speaks.
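A minimal sketch of the difference between the two frame-sampling schemes mentioned above. The function names are illustrative, not NaVILA's API, and whether the current frame counts toward the 8 is an assumption here.

```python
import numpy as np

def linspace_frame_indices(num_frames_seen, bank_size=8):
    """Pick `bank_size` indices spread evenly over the whole episode history."""
    if num_frames_seen <= 0:
        return []
    # Evenly spaced positions from the first frame to the latest one.
    idx = np.linspace(0, num_frames_seen - 1, num=bank_size)
    return [int(round(i)) for i in idx]

def sliding_window_indices(num_frames_seen, bank_size=8):
    """For comparison: only the most recent `bank_size` frames."""
    start = max(0, num_frames_seen - bank_size)
    return list(range(start, num_frames_seen))
```

With 100 frames seen, the linspace bank spans the episode from frame 0 to frame 99, while the sliding window only covers frames 92 to 99.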

Precision USB-A insertion on a UR5e: a CV port localizer + IK reaches a pre-insert pose, then a SAC residual policy layers joint-delta corrections using F/T feedback during contact. Trained in MuJoCo with domain randomization over friction, mass, port pose, and F/T sensor noise; the F/T-noise DR was the load-bearing factor for clean sim-to-real - the policy transferred zero-shot to the real UR5e and consistently completes the insertion. Also benchmarked an OpenVLA-based learned base controller against the classical CV + IK stack.
Precision USB-A insertion on a UR5e under tight clearance, learned via residual reinforcement learning layered on top of a classical CV + IK base controller.
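A minimal sketch of how a residual action can be layered on the base controller's joint target. The [-1, 1] action range and the `max_delta_rad` bound are assumptions for illustration, not the values used on the UR5e.

```python
import numpy as np

def compose_joint_command(q_base, residual_action, max_delta_rad=0.01):
    """Add a bounded learned correction to the base controller's joint target."""
    q_base = np.asarray(q_base, dtype=float)
    # Policy outputs are assumed to lie in [-1, 1]; clip, then scale to a small joint delta.
    action = np.clip(np.asarray(residual_action, dtype=float), -1.0, 1.0)
    return q_base + action * max_delta_rad
```

Bounding the delta keeps the learned policy a small correction around the CV + IK pose rather than a replacement for it.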

AutoShield successor on the UIUC Polaris GEM e4. A MID-style diffusion Transformer (DDPM-trained, DDIM 10-step inference, joint multi-agent cross-attention) predicts pedestrian trajectories over the AutoShield LiDAR + RGB-D tracker, running live on-vehicle. An MPPI motion planner (K=600 rollouts, joint steering+acceleration) and a text-promptable goal-selection module (YOLO-World / LangSAM open-vocabulary detection, LiDAR-fused to an MPPI goal pose) were each validated in simulation.
AutoShield successor on the UIUC Polaris GEM e4: a diffusion-based pedestrian predictor, an MPPI motion planner, and a text-promptable goal-selection module.
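A minimal sketch of the generic MPPI control update, assuming the rollout costs have already been computed. The `temperature` value and array layout are illustrative and not taken from the on-vehicle planner.

```python
import numpy as np

def mppi_update(nominal, noise, costs, temperature=1.0):
    """One MPPI step: blend sampled control perturbations by exponentiated cost.

    nominal: (T, 2) steering/acceleration sequence
    noise:   (K, T, 2) perturbation added to `nominal` for each rollout
    costs:   (K,) total cost of each rollout
    """
    costs = np.asarray(costs, dtype=float)
    # Subtract the minimum cost for numerical stability before exponentiating.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # Cost-weighted average of the perturbations, added back to the nominal plan.
    return nominal + np.tensordot(weights, noise, axes=(0, 0))
```

For K=600 rollouts, `noise` would have shape (600, T, 2); low-cost rollouts dominate the weighted average.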

Open-vocabulary 6D tracking extending FoundationPose with Moondream2 VLM scene analysis, SAM-3 text-prompted segmentation, and hierarchical mesh acquisition (GT → Objaverse-XL retrieval → TripoSR generation). Composite scoring (IoU + Depth + Silhouette) selects the best proxy mesh; the pipeline achieves 100% ADD-S AUC on texture-rich objects and supports language-driven dynamic target switching mid-task on YCB-Video.
Open-vocabulary 6D pose tracking driven by natural-language prompts, extending FoundationPose.

Replaced separate EfficientNet + BERT encoders with a frozen unified CLIP backbone (128-dim adapter projection) in the HumanVLA teacher-student pipeline, resolving catastrophic forgetting observed when fine-tuning visual encoders. Behavior Cloning + DAgger distillation across 615 episodes in 4 HITR room types; 62.1% success on unseen tasks (+1.9% over baseline, −4.4% placement error).
A language-guided humanoid policy that navigates HITR rooms and rearranges objects from natural-language instructions.

Modular ROS 2 autonomy stack on the UIUC Polaris GEM e4: Ouster OS1-128 LiDAR → DBSCAN clustering + EMA tracking fused with OAK-D LR RGB-D via YOLOv11 detection (weighted 0.8/0.2 distance, 0.3/0.7 bearing). Forward-simulated TTC drives a 3-state safety FSM; Stanley lateral + PID longitudinal control with PACMod2 hard-brake override. 91% field success.
Pedestrian-intent prediction and safety-filtered driving on the UIUC Polaris GEM e4. Owned the pedestrian stack in a 4-person team.
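A minimal sketch of the weighted fusion and a time-to-collision check. It assumes the weight pairs are ordered (LiDAR, camera), replaces the forward simulation with a constant-velocity closed form, and uses hypothetical state names and thresholds.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    distance_m: float
    bearing_rad: float

def fuse(lidar, camera):
    """Weighted blend of two detections; assumes small forward-facing bearings (no wrap-around)."""
    return Detection(
        distance_m=0.8 * lidar.distance_m + 0.2 * camera.distance_m,
        bearing_rad=0.3 * lidar.bearing_rad + 0.7 * camera.bearing_rad,
    )

def safety_state(distance_m, closing_speed_mps, slow_ttc_s=4.0, brake_ttc_s=2.0):
    """Map a constant-velocity time-to-collision onto three hypothetical states."""
    if closing_speed_mps <= 0.0:
        return "CRUISE"  # pedestrian is not getting closer
    ttc_s = distance_m / closing_speed_mps
    if ttc_s < brake_ttc_s:
        return "BRAKE"
    if ttc_s < slow_ttc_s:
        return "SLOW"
    return "CRUISE"
```

The weighting leans on LiDAR for range and on the camera for bearing, which matches the usual strengths of each sensor.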

Two-tier PPO hierarchy for dual Unitree G1 humanoids in IsaacLab: low-level [256,256] MLP drives 29-DOF joint-position-delta actions through a 7-phase standing→omnidirectional curriculum (30M steps, 256 envs); high-level [64,64] navigation policy converges agents from 10 m to 0.5 m with 95% success. 25+ shaped reward terms enforce gait symmetry, contact penalties, and energy regularization.
Two coordinated Unitree G1 humanoids walking toward each other under a hierarchical PPO policy in IsaacLab.

Full-stack embodied AI on a physical Booster K1 humanoid: Gemini VLM semantic goal planning with ROS 2 Nav2 autonomous navigation and AR collaboration through Snap AR Spectacles.
Embodied AI on a physical Booster K1 humanoid, paired with a Snap AR Spectacles co-pilot.

Led a 7-person team. Kalman-filtered admittance controller maintaining <150 N brush-to-glass contact force. Full ROS 2 stack on Jetson Orin Nano with SMACH FSM. Validated in live Changi Airport deployment.
AGV that climbs to 2 m and cleans curved glass at Changi Airport. SUTD capstone, 7-person team lead.
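A minimal sketch of a Kalman-filtered admittance loop on one axis, assuming a random-walk force model and a damping-only admittance law. All gains, the force target, and the sign convention (positive velocity pushes toward the glass) are hypothetical.

```python
class ScalarKalman:
    """1-D Kalman filter with a random-walk model for the contact force."""

    def __init__(self, process_var=1.0, meas_var=25.0, x0=0.0, p0=100.0):
        self.q, self.r, self.x, self.p = process_var, meas_var, x0, p0

    def update(self, measurement):
        self.p += self.q                      # predict: uncertainty grows
        gain = self.p / (self.p + self.r)     # Kalman gain
        self.x += gain * (measurement - self.x)
        self.p *= 1.0 - gain
        return self.x

def admittance_velocity(force_n, target_n, damping_ns_per_m=2000.0, v_max_mps=0.05):
    """Velocity command that drives the filtered force toward the target."""
    v = (target_n - force_n) / damping_ns_per_m
    return max(-v_max_mps, min(v_max_mps, v))
```

Each control cycle would filter the raw force reading first and then feed the estimate to `admittance_velocity`, so sensor noise does not turn directly into brush motion.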

Novel PP variant incorporating bicycle-model lateral error decomposition (cross-track slippage + heading error) into the control law. DIAPP adds a dynamics correction term c_d·e_lat,total to the steering output; RDIAPP extends this with RPP-style curvature-regulated velocity. At 70 km/h on an S-track, RDIAPP achieves 1.79 m max CTE vs. PP's 30.26 m (diverges); on a U-track, 2.49 m RMS CTE vs. 31.6 m for PP.
Novel Pure Pursuit variant that doesn't diverge on consecutive high-speed curves. SUTD SHARP thesis.
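A minimal sketch of the DIAPP steering law as described: the standard Pure Pursuit steering angle plus the correction term c_d·e_lat,total. How e_lat,total is computed from the bicycle model is not reproduced here, so it is passed in directly; the sign convention is an assumption.

```python
import math

def pure_pursuit_steer(alpha_rad, lookahead_m, wheelbase_m):
    """Classic Pure Pursuit: delta = atan(2 * L * sin(alpha) / l_d)."""
    return math.atan2(2.0 * wheelbase_m * math.sin(alpha_rad), lookahead_m)

def diapp_steer(alpha_rad, lookahead_m, wheelbase_m, e_lat_total_m, c_d):
    """Pure Pursuit steering plus the dynamics correction c_d * e_lat_total."""
    return pure_pursuit_steer(alpha_rad, lookahead_m, wheelbase_m) + c_d * e_lat_total_m
```

With `c_d = 0` the law reduces to plain Pure Pursuit, which makes the correction easy to ablate.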
Autonomous lane-following on Jetson Xavier + ZED2: dual-filter perception (Gaussian-weighted adaptive grayscale + HLS color space) with bitwise AND fusion targeting 8% pixel density, vanishing-point radial-scan histogram rejecting racetrack markings, and sliding-window 2nd-order polynomial fit. Custom proportional lateral controller with distance-weighted steering (f(c_y) = C/(c_y − h_y + ε)). Completed full 12-min autonomous laps across variable lighting conditions.
Outdoor autonomous lane-following on a scaled 4WD race car, robust across changing light.
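A minimal sketch of the two numeric pieces named above: a 2nd-order polynomial lane fit and the weighting f(c_y) = C/(c_y − h_y + ε). Reading c_y as a lane-centroid row and h_y as the horizon row is an assumption, as are the default constants.

```python
import numpy as np

def fit_lane(ys, xs):
    """Fit x = a*y^2 + b*y + c through lane pixels; returns [a, b, c]."""
    return np.polyfit(ys, xs, deg=2)

def steering_weight(c_y, h_y, C=1.0, eps=1e-6):
    """f(c_y) = C / (c_y - h_y + eps); grows as c_y approaches h_y from below in the image."""
    return C / (c_y - h_y + eps)
```

Fitting x as a function of y (rather than y of x) keeps near-vertical lane lines well-conditioned in image coordinates.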

Autonomous unknown-maze exploration on TurtleBot3 (LattePanda): frontier-based exploration with DFS cluster grouping and centroid-based goal selection, A* costmap planner with B-spline path smoothing, Pure Pursuit trajectory follower, and fine-tuned YOLOv11n with a novel spatial deduplication tracker projecting bounding-box centroids into the map frame via LiDAR range lookup for unique instance counting.
Autonomous maze exploration with object instance counting on a TurtleBot3.
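A minimal sketch of spatial deduplication for instance counting, assuming each detection has already been reduced to a bearing (from the bounding-box centroid) and a LiDAR range. The class name and `merge_radius_m` are hypothetical.

```python
import math

def project_to_map(robot_x, robot_y, robot_yaw, bearing_rad, range_m):
    """Turn a bearing + range in the robot frame into a map-frame point."""
    angle = robot_yaw + bearing_rad
    return (robot_x + range_m * math.cos(angle),
            robot_y + range_m * math.sin(angle))

class InstanceCounter:
    """Counts an object once, even if it is detected from several viewpoints."""

    def __init__(self, merge_radius_m=0.5):
        self.merge_radius_m = merge_radius_m
        self.instances = []  # list of (label, x, y) in the map frame

    def observe(self, label, x, y):
        """Return True if this is a new instance, False if it matches a known one."""
        for known_label, kx, ky in self.instances:
            if known_label == label and math.hypot(x - kx, y - ky) < self.merge_radius_m:
                return False
        self.instances.append((label, x, y))
        return True
```

The count of unique objects is then `len(counter.instances)`, independent of how many frames each object appeared in.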

Software Lead. Cascaded PID for 6-DOF underwater stability with IMU/depth fusion via state-space control. Real-time CV for autonomous target acquisition. SAUVC 2023 Finalist.
Autonomous underwater vehicle for the SAUVC 2023 competition. Software Lead, SAUVC 2023 Finalist.

Low-cost quadcopter with off-the-shelf components. Integrated propulsion, power distribution, and flight electronics for stable teleoperation.
Personal build: a low-cost teleoperable quadcopter from off-the-shelf parts.

Custom PCB with motor control, dynamic air-cooling, slope-based actuation, and anti-roll basket mechanism for incline stability.
Self-stabilizing electric grocery trolley.

CPG-based motion control for a soft batoid robot with sinusoidal X/Pitch/Roll propulsion and strain-gauge obstacle detection.
CPG-driven motion control for a soft batoid-style underwater robot.

Low-cost EM leak detector with Bluetooth connectivity for non-invasive monitoring of variable-length fluid pipes.
Low-cost electromagnetic leak detector for variable-length fluid pipes.

Angsana seed-inspired autorotating craft with single-flap motor trajectory control and aerodynamic calibration for controlled descent.
Autorotating aerial craft inspired by the Angsana seed.

8-bar lifting mechanism with novel intake for cube-frame collection; Best Design Award at Asia-Pacific Robotics Championship.
Competition robot for cube-frame collection.

Benchmarked 7 monocular/stereo VO algorithms (DSO, SVO, CNN-SVO, DF-VO, TartanVO, ORB-SLAM3, DROID-SLAM) on Oxford, Munich, and Singapore rain datasets via ATE. Our DROID-SLAM + CGRP + Heuristic variant achieved the lowest stereo long-range ATE; DF-VO was the best monocular method under 500 m. Published at IEEE CASE 2023.
VO benchmarking under rain across 7 algorithms and 3 cities.
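A minimal sketch of the ATE metric as an RMSE over position errors. It assumes the estimated and ground-truth trajectories are already time-associated and aligned to a common frame; the alignment step used in the paper is not reproduced here.

```python
import numpy as np

def ate_rmse(est_xyz, gt_xyz):
    """Root-mean-square translational error between matched trajectory points."""
    est = np.asarray(est_xyz, dtype=float)
    gt = np.asarray(gt_xyz, dtype=float)
    # Per-pose Euclidean distance between estimate and ground truth.
    errors = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))
```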

Benchmarked 9 nonlinear feature families (MMSE, RQA, DFA, entropies, FD, Hjorth, Hurst, LLE, LZC) across 7,500 focal/non-focal EEG signals. LS-SVM (polynomial-3, 10-fold CV) achieved 87.93% accuracy / 89.97% sensitivity. Published in FGCS, Elsevier 2019.
Nonlinear feature benchmarking for focal vs. non-focal EEG classification on the Bern-Barcelona database.
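A minimal sketch of one of the listed feature families, the Hjorth parameters, using their standard definitions on a discrete signal. This is not the paper's feature-extraction code.

```python
import numpy as np

def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D signal."""
    x = np.asarray(signal, dtype=float)
    dx = np.diff(x)     # first difference approximates the derivative
    ddx = np.diff(dx)   # second difference
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    # Complexity compares the mobility of the derivative to that of the signal.
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity
```

A pure sine wave has complexity close to 1, which makes a convenient sanity check.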

CNN-RNN time-series model predicting Cat 1 Lightning Risk from weather-station features: rainfall, wind speed, temperature, humidity, and wind direction.
Time-series forecasting of Category 1 Lightning Risk warnings from weather-station signals.

BiLSTM + DistilBERT classifier for IMDA movie-review sentiment, combining contextual transformer embeddings with sequential modeling. ~96% accuracy, beating classical NLP baselines.
Sentiment classifier on IMDA movie reviews.

HMMs with first- and second-order Viterbi decoding for entity recognition and sentiment tagging in informal news text.
Sequence labelling on informal news text with Hidden Markov Models.
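A minimal sketch of first-order Viterbi decoding in log space, assuming integer-coded observations and a non-empty sequence. The second-order variant and the project's smoothing choices are not shown.

```python
import numpy as np

def viterbi(obs, log_start, log_trans, log_emit):
    """Most likely state path.

    obs:       list of observation indices (length T >= 1)
    log_start: (S,) log initial probabilities
    log_trans: (S, S) log transition probabilities, indexed [prev, next]
    log_emit:  (S, V) log emission probabilities
    """
    T, S = len(obs), len(log_start)
    score = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_trans   # rows: prev state, cols: next state
        back[t] = cand.argmax(axis=0)              # best predecessor for each state
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    # Trace back from the best final state.
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```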
Production deployments, real-vehicle autonomy, and simulation at scale - from 200+ AMR fleets to port-side AV trials.



All content respects applicable NDAs and Codes of Conduct - no confidential information or internal source code developed during these engagements is shown.
Full-stack robotics - from perception and planning to learning and deployment.
A focused academic path through engineering, robotics, and CS.



Peer-reviewed contributions to robotics and biomedical AI.
Comprehensive evaluation of 7 VO algorithms (DSO, SVO, CNN-SVO, DF-VO, TartanVO, ORB-SLAM3, DROID-SLAM) on monocular and stereo setups across Oxford RobotCar, 4Seasons (Munich), and Singapore heavy-rain datasets. Proposed DROID-SLAM + CGRP + Heuristic variant achieving lowest stereo ATE for long-range rain localization; DF-VO identified as best monocular approach for <500 m.
First systematic comparison of 9 nonlinear feature families (52 features, all p < 0.01) for focal vs. non-focal EEG classification on the full 7,500-signal Bern-Barcelona database. LS-SVM (polynomial-3, 10-fold CV) achieved 87.93% accuracy / 89.97% sensitivity. MMSE identified as top-ranked discriminator; proposed recurrence, bispectrum, and cumulant plots for visual class separation. DOI: 10.1016/j.future.2018.08.044