
Sai Arun

Hyderabad

Summary

DevOps/Build/Release/Infrastructure and Site Reliability Engineer with 11+ years of experience in deployments, infrastructure management, and solving complex problems, with strong troubleshooting skills.

A senior software engineer responsible for building, deploying, monitoring, and maintaining microservices applications, along with infrastructure provisioning and management. Experienced in migrating microservices between AWS and GCP. An ingenious SRE on the Tech-Ops Infra team, responsible for configuring, installing, maintaining, monitoring, and troubleshooting websites, domains, Linux application servers, and databases for numerous clients across the globe. Played a key role in supporting development and deployment operations in different environments, ensuring the reliability, scalability, and availability of applications on various cloud providers, and providing inputs for effective capacity planning and cost optimization of resources. Experienced in DevOps/Agile operations processes and tools (code review, build and release automation, environment, service, incident, and change management). An AWS Solutions Architect Associate and Predix Cloud Platform certified professional with a good understanding and working knowledge of container orchestration technologies. Implemented CI/CD for deploying applications to cloud environments. Experienced in the full Software Development Life Cycle (SDLC), Agile testing methodologies, and the Software Test Life Cycle (STLC). A good team player and collaborator who communicates clearly and concisely, both orally and in writing.

Overview

12 years of professional experience

6 Certifications

Work History

Senior Software Engineer

Dun & Bradstreet
05.2022 - Current

  • A senior software engineer on the DevOps team, responsible for building, deploying, and monitoring microservice applications for five products.
  • Created and maintained CI/CD through Harness and infrastructure through Terraform.
  • Played a prominent role in the AWS-to-GCP migration of 20+ microservices, along with the underlying resources, tools, methods, and architecture involved.
  • Set up and maintained GitHub branching, PR rules, auto-merge actions, checks, webhooks, and AD groups, and worked on designing branching strategies.
  • Coordinated more than 40 production releases, based on Agile methodology with three-week sprints.
  • Created and modified Harness CI/CD pipelines to build and deploy microservice applications on AWS and GCP, and helped other teams adopt Harness.
  • Added and modified environment and secret variables of microservice applications and deployed them through Terraform.
  • Created, modified, and maintained secrets through the SOPS encryption mechanism.
  • Worked on infrastructure tagging requirements for various products through Terraform as part of maintenance and compliance.
  • Worked on restoring, optimizing, and creating RDS Oracle and PostgreSQL clusters, as well as Cloud SQL PostgreSQL and MySQL clusters.
  • Integrated Splunk Observability with ECS services as part of monitoring microservice applications.
  • Used kubectl for day-to-day operations such as monitoring, restarting, deleting, and checking logs of pods and deployments.
  • Created the complete infrastructure for Cloud Storage buckets as part of the GCP migration.
  • Implemented an end-to-end workflow to migrate all data from S3 to GCS buckets using IAM, SQS, S3, Pub/Sub, GCS, and Storage Transfer Service.
  • Worked on creating, configuring, and modifying Helm charts for deploying microservice applications on GKE.

Site Reliability Architect

Global Logic, Client (Adobe)
02.2021 - Current
  • A Site Reliability Architect on the TechOps Infra team, responsible for configuring, installing, maintaining, monitoring, and troubleshooting websites, domains, application servers, and databases.
  • Ensured performance, availability, and scalability of Adobe-hosted applications.
  • Understood and implemented IP, URL, and phishing-allow whitelisting inside the instance configuration XML file and the apache-conf XML file, and restarted application and Apache servers.
  • Configured subdomain delegation (t, m, res, and MX) to help clients onboard their subdomains onto Adobe-hosted servers.
  • Ensured data security for domains by planning and creating valid CSRs along with private keys using the OpenSSL CLI as part of SSL certificate installation on subdomains, and installed SSL/TLS certificates on subdomains hosted on AWS, Azure, Apache, and the data center.
  • Created, modified, and updated DNS records (A, TXT, CNAME) on DC and Route 53 hosted zones, and was involved in domain migration.
  • Maintained application stability by scaling configuration values such as max connections, JavaScript max memory, and poll delay inside the server configuration XML file per server requirements.
  • Managed workflow optimization through reloads and restarts of workflows and services such as MTA, pipelined, tracking, web, and other Adobe application servers, as part of troubleshooting and maintaining the servers.
  • Captured TCP dumps for network log analysis and further issue tracking and troubleshooting.
  • Performed database vacuums as a scheduled activity to remove accumulated bloat on PostgreSQL database servers.
  • Monitored and retrieved EC2 and RDS instance metrics, such as CPU utilization, read and write I/O operations, DB connections, and memory utilization, to improve instance health for growing applications.
  • Participated in fixing and updating SPF, DKIM, and DMARC records of subdomains as part of domain validation.
  • Created and configured ELBs and ALBs as part of SSL certificate installation and renewals.
  • Involved in designing the scalability of applications and underlying infrastructure, such as the size and type of EC2 and RDS instances, per requirements.
  • Collected snapshots of RDS instances as part of DB data backup and restoration activities.
  • Monitored logs, processes, network, and memory using Linux utilities.
  • Created external SFTP accounts per client requests for seamless transfer of files between servers.
  • Created, configured, and validated SFTP private/public keys to establish connectivity for users.
  • Performed SFTP IP whitelisting using AWS VPC security groups to ensure access is restricted to authenticated users.
  • Added and modified usernames and passwords for different application accounts, such as admin and SFTP, on backend Linux servers.
  • Retrieved information about databases and tables on PostgreSQL database instances.
  • Checked for long-running and blocking queries and for running and paused workflows, and killed processes as needed to achieve instance stability.
  • Monitored and collected web, workflow, MTA, and other application server logs for further analysis, and used Splunk for the same over longer time periods.
  • Used Jira for tracking, logging, and resolving issues, and collaborated with Customer Care, R&D, and Product Engineering teams.
  • Used Confluence for documentation, Slack for collaboration, and PagerDuty for alert management.

Infrastructure Technology Specialist

Cognizant Technology Solutions, US Bank
01.2020 - 02.2021
  • Used the Rancher UI to monitor Kubernetes clusters and manage pods and microservice applications.
  • Used kubectl to view, manage, and troubleshoot the Istio service mesh (virtualservice.yaml, destinationrule.yaml, deployment.yaml) and workloads (pods) on top of Rancher.
  • Experienced in using Istio as a service mesh for networking, service discovery, traffic routing, and metrics.
  • Used Kiali for observability, visualization, and Istio configuration.
  • Worked with GitLab as SCM and CloudBees Jenkins to deploy container applications on top of Kubernetes clusters.
  • Modified the Dockerfile and Jenkinsfile to ensure deployments were successful.
  • Configured sets of YAML files, such as Gateway, VirtualService, Service, Deployment, and DestinationRule, for traffic management and deployment of microservice applications on Kubernetes.
  • Worked with various teams to analyze and troubleshoot issues encountered by workloads in the DEV, IT, and UAT environments on Kubernetes.
  • Collected scan reports of open threats and vulnerabilities using security scanning tools like Black Duck and Fortify.

Infrastructure Technology Specialist

Cognizant Technology Solutions, CapitalOne
02.2019 - 02.2020
  • Chennai, Tamil Nadu.
  • Created a standalone infrastructure environment using docker-compose.yaml, including application microservices and other services such as Kafka, ZooKeeper, Consul, and MongoDB, used by the dev team for testing functionality locally.
  • Used GitHub for source code and version control management.
  • Used Marathon for container management (start, stop, config changes) and Mesos for container log management.
  • Modified the Dockerfile to containerize application microservices and worked with the dev team to deploy those apps to the cluster successfully.
  • As part of ECS cluster creation, used Terraform parameter files, Lambdas, and S3 events, and ran tasks on the ECS cluster.
  • Used CFT templates for creation and rehydration of AWS infrastructure.
  • Worked on migrating recipes written in Chef to shell scripts, CloudFormation templates, and S3 buckets.
  • Used Ansible for provisioning the AWS environment, such as Auto Scaling groups (launch configurations), and fetched installables from S3 buckets.
  • Worked on Amazon Web Services such as IAM, S3, EC2 (EBS), CFT, CloudWatch, SQS, Auto Scaling, and load balancers.
  • Performed create, read, update, and delete operations on S3 buckets and folders containing data files.
  • Experienced in using Consul for service discovery and as a key/value store for dynamic configuration of services.
  • Experienced in using monitoring tools such as CloudWatch and Grafana for visualizations, determining metrics such as throughput, CPU utilization, network traffic, and custom metrics like memory.
  • Monitored data files using tools like Robo 3T connected to MongoDB, and used Splunk and ELK for data file and log monitoring.
  • Experienced in monitoring production infrastructure and debugging Docker container logs using Docker commands to troubleshoot day-to-day production infrastructure issues.
  • Experienced in change management, incident management, problem management, and documentation.
  • Involved in production on-call support, receiving PagerDuty alerts configured for various communication channels such as Slack and email.

AWS DevOps Engineer

CapitalOne
08.2018 - 12.2018
  • Plano, Texas.
  • Experienced in rehydration of AWS Windows instances for every release of new AMIs.
  • Worked on installation and configuration of Tableau Server on a 5-node cluster (Windows EC2 instances).
  • Used CloudFormation templates to bring up the stack and infrastructure needed for the 5-node cluster (1 primary and 4 worker nodes).
  • Worked on configuring security group inbound and outbound rules to onboard various data sources (Redshift, Snowflake, Postgres, etc.) onto Tableau.
  • Worked in the Circle of Excellence for Tableau Servers running on Dev, QA, and Prod AWS infrastructure.
  • Modified CFT JSON templates involving Windows PowerShell scripts as part of user data, and other intrinsic functions such as Join, for automation and installation of Tableau Server.
  • Used Classic and Application Load Balancers to route traffic and monitor health checks for the Tableau application.
  • Used Tableau scripts to create and enable Task Scheduler jobs that automate copying server backups to S3 buckets every three hours.
  • Attached volumes from snapshots to new Tableau API server instances during every rehydration.
  • Created and maintained S3 buckets and folders with server-side encryption for the setup files, latest backups, and installer files required for Tableau Servers.
  • Experienced in using monitoring tools such as Prometheus along with Grafana dashboards for visualizations, determining metrics such as throughput, CPU utilization, network traffic, and custom metrics like memory.
  • Involved in production on-call support, receiving PagerDuty alerts configured for various communication channels such as Slack and email.
  • Used Route 53 to route traffic between ELBs in the East and West regions, which are mapped to the CNAME of the Tableau Server.
  • Participated in failover and failback across two regions as part of resiliency testing.

DevOps Engineer

GE Digital
10.2016 - 06.2018
  • Experienced in deploying apps on Cloud Foundry through CI/CD tools like Jenkins.
  • Worked with the DevOps team on spinning up EC2 instances and creating IAM users and roles for end users.
  • Used Application Load Balancers to route traffic to EC2 instances under target groups, monitored health checks through CloudWatch for cloud applications and S3 buckets, and monitored network traffic through VPC Flow Logs (IaaS).
  • Worked with a dedicated DevOps team to use Chef for installations with the help of Chef resources, recipes, and cookbooks, uploaded them using knife, and maintained servers using chef-client.
  • Experienced in using application monitoring tools such as Nagios and New Relic to determine application performance and metrics such as throughput, error rate, and page load times.
  • Experienced in running applications on Cloud Foundry (PaaS).
  • Knowledge of scripting languages such as Shell, JavaScript, Node.js, and Python.
  • Used the package managers npm and Bower to install front-end and back-end dependencies, and gulp, grunt, and dist to compile individual units.
  • Experienced in monitoring and verifying logs using tools and services such as Kibana/Logstash.
  • Modified Node.js and JavaScript scripts to automate tasks like deploying apps to the cloud across various revisions of apps showcased on the portal.
  • Installed plugins and integrated Jenkins with build automation tools such as Maven and Ant for packaging and compilation (deployable artifacts (JARs) from source code).
  • Automated tasks through cron, and backed up and restored data in Linux.
  • Managed VPN users and Active Directory services by setting passwords, unlocking accounts, etc.
  • Used Google Analytics for tracking apps and generating usage reports for apps and the portal.
  • Expertise in asset and time-series data ingestion for APM (a GE Digital product) using data loaders, Postman, and other API tools.

Linux Admin

Red Hat
07.2012 - 05.2014
  • Conqsys, Noida, Uttar Pradesh.
  • Installed packages using YUM and RPM on Red Hat and CentOS, with knowledge of apt-get and dpkg usage on Debian and Ubuntu.
  • Created file systems and logical volumes, and performed file system management and troubleshooting.
  • Handled file system installation and administration, along with file and directory changes, and provided the necessary permissions on Red Hat 5.x and 6.x.
  • Experienced in virtualization technologies like VirtualBox for installing various flavors of Linux.
  • Created disk space for storage and disaster recovery when needed, and performed various OS and package installations on Windows and Linux servers.

Education

Bachelor of Technology - Electrical, Electronics and Communications Engineering

JNTU University
01.2005 - 09.2008

Master of Science - Engineering, Electronics and Communications

Oklahoma Christian University
08.2014 - 04.2016

Skills

Config File Formats - HTML, XML, JSON, and YAML

Scripting - Shell scripting, Python, Groovy, JavaScript

Databases - PostgreSQL, MongoDB, MySQL, Oracle

Container Orchestration - Docker, Kubernetes, Istio, Rancher, Marathon

CI/CD - Jenkins, Harness

Version Control - GitHub

Infrastructure & Config Management - Terraform & Ansible

Cloud Computing - AWS (EC2, IAM, S3, CFT, CloudWatch, ECS), Cloud Foundry (Predix Cloud), GCP (GKE, GCS, Storage Transfer, Cloud SQL, Cloud Armor, Ingress)

Build Automation - Maven, Ant, grunt, gulp, npm, make

Office Tools - Microsoft Office and Outlook

Application Servers - WebLogic, Tomcat, IIS, and XAMPP

Messaging Queues - RabbitMQ, Kafka

Monitoring Tools - APM, CloudWatch, New Relic, Prometheus, Kibana, Splunk, Kiali

Identity & Access - Active Directory, SAML, OpenID Connect

APIs - REST, SOAP

Operating Systems - MS Windows XP/2000/2007, Windows 7 and 10, macOS, Linux, UNIX

IDEs & Dev Tools - Eclipse, STS

SDLC & Testing - Agile methodologies, JUnit, manual testing, Jira, ServiceNow, Confluence, Rally

Accomplishments

  • Certification number - UC-b8860e37-6eb0-464c-aef8-30937681c814.


Certification

AWS Certified Solutions Architect - Associate

Timeline

Senior Software Engineer

Dun & Bradstreet
05.2022 - Current

Site Reliability Architect

Global Logic, Client (Adobe)
02.2021 - Current

Infrastructure Technology Specialist

Cognizant Technology Solutions, US Bank
01.2020 - 02.2021

Infrastructure Technology Specialist

Cognizant Technology Solutions, CapitalOne
02.2019 - 02.2020

AWS DevOps Engineer

CapitalOne
08.2018 - 12.2018

DevOps Engineer

GE Digital
10.2016 - 06.2018

Master of Science - Engineering, Electronics and Communications

Oklahoma Christian University
08.2014 - 04.2016

Linux Admin

Red Hat
07.2012 - 05.2014

Bachelor of Technology - Electrical, Electronics and Communications Engineering

JNTU University
01.2005 - 09.2008

AWS Certified Solutions Architect - Associate

Certification number - M89XFDT2E2QEQJSP.
Predix Certified Developer, Certification number - PDX0616C10174.
Certificate of completion - Docker - Introducing Docker Essentials, Containers, and More.