Dhiraj Singh

Backend/DevOps Engineer
Hi, I'm Dhiraj Singh, a 22-year-old based in Bangalore with a B.Tech in Computer Science and Engineering. I'm a former state-level spelling bee champion and a hackathon winner.
I previously worked, in a paid role, as a Backend/DevOps Engineer at Heva AI, an IIT-incubated health tech startup [2] that has been recognized in news coverage [1].

Outside of work, I read and write technical blogs about challenges I've faced, work through cloud labs to keep my skills sharp, and have been contacted several times (around six) by Google Cloud and AWS teams for platform feedback.
I've also taken part in several in-depth, paid ($150/hr) sessions with their UI/UX teams and product managers (PMs).

On top of that, I'm a big enthusiast of microservices, APIs, and databases; I can read about them all day.

Experience

Heva AI (IIT Incubated)
Aug 2024 — Sep 2025
Backend/DevOps Engineer
  • Downloaded 17,000+ EDF files (25–500 MB each) from remote servers within hours using parallelized Python scripts (see the sketch after this list).
  • Migrated 23 servers and 77 buckets from US East 1 to Asia South East to reduce runtime latency and per-request cost.
  • Optimized workloads on the largest available CPU instances (c2-highcpu-288) and GPU nodes (8× H100 80 GB).
  • Managed 6 TB of PNG images with consistent naming conventions, tagging, and storage classes for quick, cost-effective access.
  • Recommended automation tooling for both AWS and GCP environments to the ML and SDE teams, improving efficiency across 500+ services.
  • Developed APIs that sustained 15,000 requests per second for model interactions.
  • Designed and maintained 80+ automation pipelines using Jenkins and GitHub Actions.
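
A minimal sketch of the kind of parallel download script behind the EDF bullet above; the host URL, manifest file, and worker count are placeholder assumptions, not the actual values used at Heva AI.

```python
"""Sketch: parallel EDF downloader (placeholder host, manifest, and paths)."""
import concurrent.futures
import pathlib

import requests

BASE_URL = "https://edf-server.example.com/files"  # placeholder host
DEST = pathlib.Path("edf_downloads")
DEST.mkdir(exist_ok=True)

def download(file_name: str) -> str:
    """Stream one 25-500 MB EDF file to disk in 1 MB chunks to keep memory flat."""
    target = DEST / file_name
    with requests.get(f"{BASE_URL}/{file_name}", stream=True, timeout=120) as resp:
        resp.raise_for_status()
        with open(target, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=1024 * 1024):
                fh.write(chunk)
    return file_name

if __name__ == "__main__":
    with open("edf_manifest.txt") as manifest:  # placeholder manifest file
        file_names = [line.strip() for line in manifest if line.strip()]
    # A thread pool keeps many large transfers in flight at once, which is what
    # turns a 17,000-file pull into a job measured in hours rather than days.
    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
        for done in pool.map(download, file_names):
            print("downloaded", done)
```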
Delvion
June 2024 — Aug 2024
Backend Engineer
  • Processed 80M+ image frames from 500,000+ videos, storing 6 TB+ of data in Amazon S3 with lifecycle policies, storage tiering, and Transfer Acceleration for cost efficiency (see the sketch after this list).
  • Deployed and integrated containerized ML services using Docker and Kubernetes, and implemented CI/CD workflows with GitHub Actions to ensure fast, reliable delivery in iterative client-facing cycles.
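
A minimal boto3 sketch of the lifecycle and Transfer Acceleration setup referenced in the first bullet; the bucket name, prefix, and transition windows are placeholders rather than the actual Delvion configuration.

```python
"""Sketch: S3 lifecycle tiering + Transfer Acceleration (placeholder bucket/prefix)."""
import boto3

s3 = boto3.client("s3")
BUCKET = "example-frames-bucket"  # placeholder bucket name

# Tier frame data down to cheaper storage classes as it ages.
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-frame-data",
                "Filter": {"Prefix": "frames/"},  # placeholder prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)

# Speed up long-haul uploads of new frame batches.
s3.put_bucket_accelerate_configuration(
    Bucket=BUCKET,
    AccelerateConfiguration={"Status": "Enabled"},
)
```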

Projects

Dynamic ML Model Lifecycle Manager
Aug 2025
Built a Dynamic ML Model Lifecycle Manager that handles model upload and loading, versioning, health checks, and usage tracking, and swaps old models for new ones based on requests-per-second (RPS) triggers. Deployed on AWS using Terraform for infrastructure provisioning, Docker for containerizing and shipping models, and API Gateway with CloudWatch for monitoring. The entire codebase is written in Python, with Flask APIs for model communication.
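
A minimal Flask sketch of the RPS-triggered swap idea from the description above; the version names, threshold, and in-memory counter are illustrative assumptions, and the actual project's loading and monitoring logic is more involved.

```python
"""Sketch: swap the serving model version when sustained RPS crosses a trigger."""
import time
from collections import deque

from flask import Flask, jsonify

app = Flask(__name__)

RPS_TRIGGER = 100            # placeholder threshold
ACTIVE_VERSION = "v1"        # placeholder version names
_request_times = deque(maxlen=5000)

def current_rps() -> int:
    """Count requests seen in the last second from a rolling timestamp window."""
    now = time.time()
    return sum(1 for t in _request_times if now - t <= 1.0)

@app.route("/predict")
def predict():
    global ACTIVE_VERSION
    _request_times.append(time.time())
    # When traffic crosses the trigger, promote the newer model version.
    if current_rps() > RPS_TRIGGER and ACTIVE_VERSION != "v2":
        ACTIVE_VERSION = "v2"  # the real manager reloads weights and health-checks here
    return jsonify(model_version=ACTIVE_VERSION)

if __name__ == "__main__":
    app.run(port=8080)
```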
View on GitHub
Chaos Testing Setup with Integrated Monitoring
Sep 2025
Built an OTT backend in a microservice architecture, with a user service backed by a PostgreSQL database and a separate video service communicating via APIs. Injected controlled failures into each service, such as pod crashes, memory exhaustion, and CPU stress, while watching recovery through integrated monitoring. Deployed on GCP using Kubernetes and Docker with best practices such as labeling, a staging environment, CPU and memory limits, HPA, and a well-structured folder layout.
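
A minimal sketch of the pod-crash experiment using the official Kubernetes Python client; the namespace and label selector are placeholders, and the real setup also covers memory and CPU fault injection.

```python
"""Sketch: kill a random pod of one service and let the Deployment/HPA recover it."""
import random

from kubernetes import client, config

config.load_kube_config()             # use load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

NAMESPACE = "ott-staging"             # placeholder namespace
LABEL_SELECTOR = "app=video-service"  # placeholder service label

pods = v1.list_namespaced_pod(NAMESPACE, label_selector=LABEL_SELECTOR).items
victim = random.choice(pods)
print("deleting pod:", victim.metadata.name)
v1.delete_namespaced_pod(victim.metadata.name, NAMESPACE)
# The Deployment's replica count and the HPA should bring a replacement up,
# and the monitoring stack records how long recovery takes.
```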
View on GitHub

Tech Blog

[Article Title]

[Brief description of your article - 2-3 sentences explaining what the article covers and what readers will learn]

Read on Medium