Dynamic HPC Engineer with a proven track record at Cloud Vertex Technologies, adept at optimizing workflows and enhancing system performance. Skilled in AWS services and Linux administration, I excel in proactive problem-solving and fostering stakeholder relationships, ensuring project success and minimal downtime. Committed to delivering innovative solutions that drive efficiency.
Overview
8 years of professional experience
1 Certification
Work History
HPC Engineer
Cloud Vertex Technologies India Private Limited
05.2024 - Current
Achieved successful project outcomes by maintaining accurate documentation and meeting strict deadlines.
Optimized engineering processes by implementing innovative solutions and streamlining workflow.
Implemented an AWS solution to simplify file transfers from a remote SFTP server to S3.
Upgraded the Nextflow application to the latest version, accounting for breaking changes and warnings.
Enabled user access to Amazon SageMaker and helped users run ML pipelines.
Collaborated with various teams to reduce application downtime.
Helped end users automate CCP4 job submission by creating a custom Python script that parses input from Excel and submits the job.
Used AWS CodePipeline to deploy AWS ParallelCluster.
Used AWS CloudFormation to deploy AWS PCS (Parallel Computing Service).
Enabled external vendors to access internal S3 buckets by applying suitable policies.
Implemented an AWS solution to automatically stop and start EC2 instances on a defined schedule.
Troubleshot and resolved various application issues.
Specialist
HCL Technologies
11.2021 - 05.2024
Collaborated with cross-functional teams to achieve project goals on time and within budget.
Improved customer satisfaction rates through proactive problem-solving and efficient complaint resolution.
Followed all company policies and procedures to deliver quality work.
Applied security patches across the entire cluster using BCM (Bright Cluster Manager) by updating and pushing patched software images.
Renewed and verified the BCM licence.
Monitored application licence usage using LSF RTM.
Updated the MATLAB, Mathematica, and STAR-CCM+ licence managers.
Installed scientific simulation applications such as Mathematica, Monolix, COMSOL, MATLAB, and STAR-CCM+ through EasyBuild.
Took part in upgrading Slurm packages using BCM.
Managed Slurm queues and partitions.
Generated Slurm cluster usage reports and job efficiency reports using “sacct”.
Updated GPFS client packages after security patching.
Added nodes to and removed nodes from the GPFS cluster.
Installed and configured the Splunk forwarder to view node health metrics.
Mitigated security vulnerabilities as per Qualys scan recommendations.
Cloud Engineer
Cloud Vertex Technologies India Private Limited
06.2020 - 11.2021
Used metrics to monitor application and infrastructure performance.
Reduced server downtime by proactively monitoring cloud resources and addressing potential issues before they escalated.
Identified, analyzed and resolved infrastructure vulnerabilities and application deployment issues.
Assisted in migration projects from on-premises data centers to cloud environments, ensuring minimal disruption to business operations.
Enhanced cloud infrastructure efficiency by implementing advanced automation techniques and tools.
Provided technical support to internal stakeholders, diagnosing and resolving complex issues related to the organization's cloud environment.
Evaluated new cloud technologies and recommended solutions that aligned with organizational goals and objectives.
Monitored the status of HPC jobs.
Executed HPC operational tasks assigned via the ServiceNow ticketing tool.
Deployed on-demand HPC clusters using AWS ParallelCluster.
Deployed and managed HPC applications.
Applied Linux and Windows security patches periodically on AWS EC2 instances.
Monitored the health of cloud resources using New Relic and CloudWatch.
Created and maintained a custom Python script to dynamically add AWS EBS volumes to compute nodes during HPC job submission.
Took part in validating data transfers from an AWS S3 source to an AWS S3 destination, and from a local mount point to an AWS S3 destination, using custom Python and shell scripts.
HPC System Administrator
Concept Information Technologies India Pvt. Ltd.
01.2019 - 06.2020
Reduced downtime by proactively identifying and resolving potential issues through thorough system monitoring.
Established effective communication channels between IT support staff and end-users, leading to improved issue resolution times overall.
Simplified troubleshooting processes by creating detailed documentation for system configurations, procedures, and best practices.
Checked the status of MPI, LSF, and Slurm parallel jobs.
Troubleshot HPC job errors related to applications such as Ansys Fluent, Nfasant, and Altair Feko.
Monitored SAN storage health using Lenovo ThinkSystem Storage Manager.
Verified that the Lustre parallel file system was mounted and accessible on all nodes.
Created Lustre file system quotas for users.
Monitored and managed cluster data backups using Veritas Backup Exec.
Created a custom shell script to check Lustre file system mount points and mount them if not mounted.
Created a custom shell script that shuts down the cluster in sequential order when the temperature rises to a threshold value following a power loss.
Linux System Administrator
DHII Health Tech Pvt. Ltd
06.2017 - 01.2019
Implemented monitoring tools for real-time analysis of system performance, allowing for proactive identification of potential issues before they impacted users.
Improved system performance by optimizing Linux server configurations, implementing efficient backup processes, and performing regular maintenance tasks.
Provided hands-on support for end-users experiencing issues with Linux-based systems or applications, facilitating quick resolutions and minimal disruptions to productivity.
Proactively monitored and managed all compute, network, and storage infrastructure to maintain 24x7 availability.
Checked the status of LSF and PBS parallel jobs.
Created and serviced administrator and user accounts on Linux-based systems.
Verified that the IBM GPFS parallel file system was mounted and accessible on all nodes.
Troubleshot hardware and OS-level issues.
Modified LSF queues as required.
Monitored cluster health using the Ganglia monitoring tool.
Managed clusters using the IBM IMM and HP CMU cluster management tools.
Education
Master of Technology - Embedded Systems
Sreyas Institute of Engineering And Technology
Hyderabad, India
04.2001 -
Bachelor of Technology - Electronics And Communications Engineering
Guru Nanak Institute of Technology
Hyderabad, India
04.2001 -
Board of Intermediate Education - Mathematics Physics Chemistry
Sri Chaitanya Junior College
Hyderabad, India
04.2001 -
Board of Secondary Education - English Mathematics Physics Chemistry
Accomplishments
Developed a Bash script that compares files in FSx for Lustre against the corresponding S3 bucket and collects files present only in Lustre.
Reduced costs by introducing the AWS Instance Scheduler solution to automate starting and stopping EC2 instances.
Simplified file transfers from external sources to internal S3.
Created a custom shell script that shuts down the cluster in sequential order when the temperature rises to a threshold value following a power loss.
Automated application job submission using a custom Python script.
Certification
RHCSA