Sarath Pillai

DevOps/Site Reliability Engineer
Hyderabad

Summary

Tech-savvy DevOps/SRE Engineer with a solid background in Linux, automation, and public clouds (AWS, GCP, Azure, OpenStack). An engineer at heart, with in-depth knowledge of and experience in architecting large-scale, scalable, microservice-based containerized environments (Kubernetes, ECS, Nomad). Last but not least, a humble perennial learner who loves to read, document, write, and share [My Blog: https://www.slashroot.in]

Overview

9 years of professional experience

Work History

Senior Cloud Infrastructure Engineer

Kraken Digital Asset Exchange
Remote
06.2023 - Current
  • Workflow architecture for delivering a large-scale microservices stack
    on a container orchestration platform using GitLab, Nomad, and Consul.
  • Implement large-scale preview environments that include all microservices, for proper code review and testing before deployment to production: a dev environment with the entire stack per merge request.
  • Wrote a custom Terraform provider for Nomad deployments, as the community one lacked several features (for example, until recently the community provider did not scale containers back when they did not match the state file).
  • Wrote Slack bots for managing alerts (acknowledge, escalate, wake up the on-call engineer), sending cloud cost reports, etc.
  • Design & Implement a large-scale Nomad orchestration cluster in the
    cloud & on-premises.
  • Design & Implement a cross-region backup and restore mechanism for
    all data stores used by the application (SQL, NoSQL, search
    data, log data).
  • Implement Elastic scaling functionality for all microservices across
    datacentres and public clouds, based on application specific custom
    metrics.
  • Implement & Manage blockchain network nodes (Ethereum, Bitcoin,
    etc.) for applications to query against.
  • Implement an end-to-end flow for delivering applications to different
    environments using Terraform IaC.
  • Implement monitoring as a service for application developers to
    consume.

Senior DevOps Engineer

IMIMobile
06.2021 - 06.2023
  • Wrote a custom API for provisioning infrastructure based on requests from customers. The stack was Golang, API gateway, Docker for packaging, and ECS for orchestration.
  • Worked on implementing a high-traffic Elasticsearch cluster with more than 300TB of data for search and queries.
  • Wrote multiple application-specific metric exporters in Golang for time series metrics.
  • Wrote a fully automated deployment solution for Azure VNet, NSG, private and public zones, and Azure Application Gateway using Terraform.
  • Automated deployment of microservices applications on top of AKS (Azure Kubernetes Service) using Terraform.
  • Architected a self-service platform in AWS for onboarding applications that follow different modes of deployment (serverless, ECS-based container applications, Kubernetes-based EKS applications)
  • Migrated more than 100 applications to container-based deployment in large AWS ECS clusters using Terraform
  • Implemented a container image delivery pipeline that auto-promotes images from lower-level environments in the cloud to higher-level environments.
  • Wrote modules for the CI/CD process for API Gateway, with API definitions in AWS Cloud
  • Wrote a Python-based backend API for creating custom infrastructure on demand for different customers.
  • Wrote a generic ECS Terraform module that can accommodate any number of applications, as specified by the developers.
  • Implemented a monorepo architecture: a single Git repository with any number of applications, and logic to build/deploy each one individually.
  • Implemented test-driven infrastructure deployment for multiple AWS accounts across different teams.
  • Implemented end-to-end infrastructure and deployment automation for a serverless application (AWS Lambda + API Gateway)

Integration Engineer

Facebook, Inc.
06.2020 - 06.2021
  • Integrating Wi-Fi network solutions, Facebook software and internet.org core features with ISPs' infrastructure
  • Manage the technical relationship with the ISP partners, provide technical support and handle service outages
  • Automate Facebook's core components at partner sites using Chef.
  • Create and plot metrics for data analysis.
  • Lay out network architecture to set up and integrate the Wi-Fi solution for Facebook at partner locations.
  • Own back-end services like our Hadoop data warehouses, front-end services like Chat and News Feed, infrastructure components like our Memcache infrastructure, and everything in between
  • Write and review code, develop documentation and capacity plans, and debug problems.
  • Build and release features to different geographic regions.

Lead Operations Engineer

Sony Pictures Entertainment
06.2018 - 06.2020
  • Designed and automated a Mesos-based, highly available cluster manager for running resource-intensive tasks in a SoftLayer datacenter.
  • Built an entire stack of Mesos (for cluster management), Marathon (for running regular tasks in Mesos), and Docker (for running containers inside Mesos slave servers)
  • Built an automated service discovery implementation using Consul
  • Implemented and maintained a Spark cluster for fast data crunching in the analytics platform.
  • Built multiple core application Docker images using Dockerfile automation.
  • Built and maintained a private Docker registry for storing confidential images that are part of different application clusters.
  • Built a full build and deployment pipeline using Jenkins and Docker images
  • Maintained an Elasticsearch cluster for text indexing and processing, all in the Mesos/Marathon stack.
  • Automated all server fleets using custom Chef recipes for Mesos masters and Mesos slaves.
  • Built a completely automated and secure deployment of OpenVPN with two-factor authentication for customers and administrators to log in to the environment.
  • Designed a Gluster-based storage system to meet the high-volume storage requirement of huge log files for the analytics platform

System Administrator

Media.net
06.2016 - 06.2018
  • Established network specifications and analyzed workflow, access, information and security requirements.
  • Deployed and maintained Linux systems and application software in multiple clusters across 4 data centers, totaling approximately 1000 servers, mostly Dell PowerEdge.
  • Deployed a robust monitoring infrastructure with the enterprise-class monitoring tool Nagios
  • Plotted various graphs with tools like Cacti to get an overview of the infrastructure
  • Deployed configurations and modifications to thousands of servers with the configuration management tool Puppet
  • Maintained connectivity between multiple data centers by building a robust IPsec tunnel-based network.
  • Load balanced multiple web clusters serving millions of requests per day with the F5 BIG-IP load balancer.
  • Managed code under version control using Git
  • Identified and mitigated DDoS (Distributed Denial of Service) attacks by coordinating with our DDoS mitigation partner.

Education

Bachelor of Engineering - Electrical, Electronics And Communications Engineering

Rajasthan Technical University
Rajasthan, India
04.2001 -

Skills

    AWS Cloud
