Mihir Kulkarni
I am a Robotics & AI Software Engineer at Prime Vision, where I build and deploy ML and Computer Vision systems powering a fleet of 900+ autonomous mobile robots across 40+ US distribution centers and postal facilities.

I hold a Master of Science in Robotics Engineering from Worcester Polytechnic Institute (May 2024), with a focus on perception, computer vision, and deep learning. Previously, I interned at MathWorks on embedded C++ code generation for Simulink, and at IIT Bombay's ARMS-Lab on multi-drone simulation. At WPI, I worked on 3D perception projects including Structure from Motion, NeRF, semantic point cloud segmentation, and feature matching research under Prof. Ziming Zhang in the VISlab.

Education

Worcester Polytechnic Institute
Master of Science in Robotics Engineering
Worcester Polytechnic Institute, Massachusetts, USA
(Aug 2022  -  May 2024)
  • CGPA - 3.88/4.0
  • Specialization  -  Perception, CV, Deep Learning
Bachelor of Technology, Electronics and Telecommunication Engineering
College of Engineering Pune, India  
(Aug 2018  -  May 2022)
  • CGPA - 8.05/10.0

Experience

Prime Vision
Robotics & AI Software Engineer
Arlington, VA (June 2024  -  Present)
Keywords: Python, ML, Computer Vision, OCR, Docker, Kubernetes, NVIDIA Triton, Kafka, Azure, CI/CD
Autonomous Robotic Sorting: Built and deployed production ML and Computer Vision systems (OCR, visual perception) powering a fleet of 900+ autonomous mobile robots (AMRs) across 40+ US distribution centers and postal facilities, working across the full stack from model development to real-world edge deployment.
Developed Python backend services and automated data and CI/CD pipelines (FastAPI, PostgreSQL, S3, Azure Databricks, Kafka, GitHub Actions) for dataset management, model training workflows, and continuous deployment at scale. Containerized and deployed GPU-accelerated inference with Docker on Kubernetes, serving low-latency predictions via NVIDIA Triton on Linux edge hardware (Dell NativeEdge).
MathWorks
Software Intern in the EDG Group, mentored by Zijad Galijasevic
Natick, MA (May 2023 - Aug 2023)
C++ code generation support for Xilinx Zynq SoC boards and embedded Linux boards: Enabled C++ code generation support for the Xilinx Zynq SoC Blockset on the Embedded Processor Modelling team. Added C++ support for Linux-based embedded development boards to improve development tools for robotics applications.
Indian Institute of Technology, Bombay - ARMS-Lab
Research Intern Supervised by Prof. Arpita Sinha  
Mumbai, India (June 2021 - Aug 2021)
Simulating Trochoidal Patterns using Multiple Drones in Gazebo: Simulated trochoidal patterns in Gazebo using multiple drones for surveillance of hilly and steep regions. Implemented a generalized consensus strategy for single-integrator kinematic agents for precise control of the drones. Built the simulation with the PX4 and MAVROS packages in ROS and verified the results in MATLAB.
Binary Robotics
Project Intern
Pune, India (Oct 2020  -  Dec 2020)
Keywords: ROS, SLAM, Navigation
Mobile Robots: Worked on mobile robots that navigate autonomously through ICUs to assist hospital staff. Worked on Simultaneous Localization and Mapping (SLAM) using a 2D LiDAR, implementing Hector SLAM with the Adaptive Monte Carlo Localization (amcl) and move_base packages. Implemented the ROS Navigation stack on a Raspberry Pi 4 with ROS running on top of Ubuntu 18.04.
Robotics and Automation Laboratory, COEP
Undergraduate Researcher
Pune, India (Oct 2020  -  Jan 2021)
Keywords: STM32, Arduino, Altium Designer, OpenCV, PyTorch
Led a team of 15 students in ABU Robocon with team COEP for 3 years. Responsible for programming, perception, sensor fusion, circuit and PCB design, and robot testing.

Projects

AutoPano

Stitched multiple images into a panorama using both classical image processing and deep learning methods. Implemented homography estimation, RANSAC, Adaptive Non-Maximal Suppression, corner detection, and feature matching. Implemented a supervised deep learning model using the HomographyNet CNN architecture in PyTorch.
Deep Learning based Robotic Grasping of unknown objects

Developed an end-to-end pipeline for optimal grasping of objects of variable shape and orientation using vision. Implemented VGG16 and ResNet50-based architectures with Transfer Learning in PyTorch, achieving >90% success rate on a custom 3D-printed 5-DoF robotic arm with MoveIt! and KinectV2 depth camera.
V-SLAM Implementation and object detection using Kinect V2

Worked on an RTAB-Map implementation using the Kinect V2 depth camera and simulated the same setup with a TurtleBot3 in Gazebo. Performed CNN-based object detection using the YOLOv3-Tiny framework in combination with map generation from RTAB-Map.
Point Cloud Feature Detection for Visual Grasping

Designed an algorithm to generate a convolution mask for optimal grasp estimation of multiple objects. Performed point cloud segmentation and projected the segmented point cloud to a 2D image. Convolved the generated mask over the 2D image to obtain the grasp parameters.
Structure from Motion (SFM) and NeRF

Reconstructed a 3D scene from a finite set of images using triangulation, camera pose estimation, non-linear PnP, and bundle adjustment. Employed NeRF for photo-realistic visualization of complex scenes.
LiDAR Semantics

Applied Point-to-Point and Point-to-Plane ICP to stitch multiple LiDAR point clouds from the KITTI dataset. Performed semantic segmentation on RGB images and projected the semantic labels onto the Point Cloud.
Feature Matching Comparison: SuperGlue vs. LoFTR

Evaluated and contrasted two state-of-the-art feature matching models, SuperGlue and LoFTR, across various scenarios to benchmark accuracy and robustness.
Planning & Perception for Autonomous Vehicles in Urban Environments (CARLA)

Implemented a conformal lattice-based local planner and Haar Cascade classifier for pedestrian avoidance using the CARLA Python API in a simulated urban driving environment.

Publications

Deep Learning based end-to-end Grasping Pipeline on a low-cost 5-DOF Robotic arm
M. Kulkarni, P. Junare, M. Deshmukh, and P. Bartakke
Visual SLAM Combined with Object Detection for Autonomous Indoor Navigation Using Kinect V2 and ROS
M. Kulkarni, P. Junare, M. Deshmukh, and P. P. Rege