
GIRIJA SANKAR PANDA

Hyderabad

Summary

Experienced Solution Architect and Data Engineer with 13 years of experience designing and implementing scalable data architectures across multi-cloud environments, including AWS, GCP, and Azure. Demonstrated expertise in building and optimizing ETL pipelines, data lakes, and real-time streaming solutions using Python, PySpark, and Terraform. Proficient in leveraging cloud-native services such as AWS S3, Azure Blob Storage, Athena, EMR, Snowflake, BigQuery, and Google Cloud Storage to deliver efficient and cost-effective solutions. Adept at driving performance optimization, automation, and end-to-end data integration across large-scale systems through strong data engineering skills. Skilled in Kubernetes, Docker, and microservices architecture to ensure highly available, secure, and resilient infrastructure. Extensive experience in banking, health care, and Industry 4.0 initiatives in the manufacturing sector.

Overview

13 years of professional experience

Work History

Solution Architect (Data Engineering/ML) GCP

Micron Technology SMAI
Hyderabad, India
08.2022 - Current
  • Advanced Data Pipelines: Architected and optimized complex data pipelines utilizing Databricks for distributed data processing and Snowflake for scalable data warehousing. Implemented efficient ETL processes and optimized data transformation workflows, resulting in a 50% reduction in query execution times and a significant enhancement in data analysis capabilities. Employed Delta Lake's capabilities for ACID transactions and schema enforcement, ensuring high-quality and reliable data pipelines.
  • Delta Lake Architecture: Designed and implemented robust Delta Lake architectures within Databricks, managing data across different stages (Bronze, Silver, and Gold layers). Developed ingestion processes to handle raw data (Bronze layer), refined and cleansed data (Silver layer), and produced analytics-ready data (Gold layer). This multi-layer architecture facilitated efficient data storage, high-performance querying, and effective data management, enhancing downstream analytics and reporting.
  • Data as a Product: Adopted a data-as-a-product approach within Snowflake by creating and managing high-value data products. Defined clear data product specifications, established data governance practices, and implemented robust access controls. This approach improved data accessibility, ensured consistency, and provided tailored data solutions for various business units, driving data-driven decision-making and enhancing operational efficiency.
  • Job Orchestration with Airflow: Configured and managed job orchestration using Apache Airflow to automate and schedule complex data workflows. Implemented DAGs (Directed Acyclic Graphs) for orchestrating data ingestion, processing, and transformation tasks involving Delta Lake. Ensured reliable execution of data pipelines, streamlined data operations, and facilitated timely and accurate data processing by integrating Airflow with Databricks and Delta Lake environments.
  • Snowflake: Implemented RBAC in Snowflake; hands-on experience working with Snowpipe and Snowflake Tasks.
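The Bronze-to-Silver-to-Gold flow described above can be sketched as follows. This is a minimal illustration only, using plain Python in place of Databricks and Delta Lake tables; the sensor field names and cleansing rules are hypothetical, not taken from the actual pipelines.

```python
# Hedged sketch of a medallion (Bronze -> Silver -> Gold) flow.
# Field names ("sensor_id", "reading") are hypothetical examples.

def to_silver(bronze_rows):
    """Cleanse raw Bronze records: drop rows missing a sensor_id, cast readings."""
    silver = []
    for row in bronze_rows:
        if not row.get("sensor_id"):
            continue  # reject malformed raw records at the Silver boundary
        silver.append({
            "sensor_id": row["sensor_id"],
            "reading": float(row["reading"]),
        })
    return silver

def to_gold(silver_rows):
    """Aggregate cleansed Silver records into analytics-ready Gold averages."""
    totals = {}
    for row in silver_rows:
        count, total = totals.get(row["sensor_id"], (0, 0.0))
        totals[row["sensor_id"]] = (count + 1, total + row["reading"])
    return {sid: total / count for sid, (count, total) in totals.items()}

bronze = [
    {"sensor_id": "s1", "reading": "10.0"},
    {"sensor_id": "s1", "reading": "20.0"},
    {"sensor_id": None, "reading": "99.0"},  # malformed: dropped at Silver
    {"sensor_id": "s2", "reading": "5.0"},
]
gold = to_gold(to_silver(bronze))
print(gold)  # {'s1': 15.0, 's2': 5.0}
```

In a real Delta Lake deployment each function would instead write a versioned Delta table, with schema enforcement rejecting the malformed rows.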




Data Engineering Lead / AWS Solutions Architect

EPAM Systems, Inc.: Novartis
Hyderabad, India
08.2021 - 08.2022
  • Improved AWS infrastructure efficiency by designing and implementing scalable solutions using Terraform to automate infrastructure deployment
  • Engineered and maintained ETL pipelines using PySpark, processing large-scale data from various sources and loading it into AWS S3 and Snowflake for analytical processing, achieving 99.9% data availability. Integrated AWS Athena to query data stored in S3, further optimizing data access for ad-hoc analysis.
  • Implemented real-time data streaming solutions using Apache Kafka and Spark Streaming, enabling timely analysis and visualization of streaming data for operational insights
  • Collaborated with AWS architects to optimize storage and compute resources on S3 and EMR clusters, ensuring cost-effective and scalable data processing capabilities, which led to a 20% reduction in overall operational costs.
  • Led training sessions for junior data engineers, providing mentorship on best practices, coding standards, and the effective use of Databricks, PySpark, and cloud services such as Snowflake and AWS Athena.
  • Implemented data governance and quality checks, establishing best practices to maintain high-quality data within the data lake architecture, leading to a 25% improvement in data accuracy and compliance with industry regulations.
  • Conducted performance tuning and optimization of Spark jobs on Databricks, reducing processing time by 30% and lowering operational costs. Utilized AWS Athena for cost-effective querying and analysis of large datasets stored in S3.
  • Designed and deployed cloud-based infrastructure, using Terraform for provisioning, reducing overall costs while increasing performance and scalability. Streamlined data lake and Snowflake integration for more efficient data processing pipelines.
  • Introduced agile methodologies to improve project management efficiency within the team, resulting in faster project delivery cycles and improved collaboration across teams.
  • Automated repetitive tasks using Python scripts, saving time and resources for higher-priority projects, particularly around Snowflake and AWS infrastructure management.
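One detail behind the S3/Athena bullets above is worth sketching: Athena queries S3 data cheaply only when the objects are laid out in partitioned key paths. The sketch below shows one common Hive-style layout; the bucket and dataset names are hypothetical, not the actual Novartis paths.

```python
# Hedged sketch: Hive-style date partitioning of ETL output in S3,
# so Athena can prune partitions instead of scanning the whole dataset.
# Bucket and dataset names are hypothetical.

from datetime import date

def s3_partition_key(bucket, dataset, run_date, filename):
    """Build a Hive-style partitioned S3 key (year=/month=/day=)."""
    return (
        f"s3://{bucket}/{dataset}/"
        f"year={run_date.year}/month={run_date.month:02d}/day={run_date.day:02d}/"
        f"{filename}"
    )

key = s3_partition_key("analytics-lake", "sales", date(2022, 3, 7), "part-0000.parquet")
print(key)
# s3://analytics-lake/sales/year=2022/month=03/day=07/part-0000.parquet
```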


Lead Software Engineer

Accenture: Zurich Insurance
Hyderabad, India
11.2017 - 08.2021
  • Led the development of end-to-end data integration solutions using Azure Data Factory, Databricks, and Azure Blob Storage, ensuring efficient data movement and transformation across both cloud and on-premises environments.
  • Designed and implemented data pipelines for diverse data sources, enabling seamless data flow and real-time analytics for business intelligence purposes.
  • Exposed database tables via FastAPI, creating RESTful endpoints to serve data for various consumers and applications. This enabled secure and efficient access to structured data, facilitating real-time analytics and decision-making.
  • Containerized the FastAPI application using Docker to ensure portability and consistency across development and production environments.
  • Deployed the containerized FastAPI application to Kubernetes, automating the deployment, scaling, and management of the microservices, leading to increased reliability, high availability, and simplified infrastructure management.
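Behind a FastAPI endpoint of the kind described above there is typically a pure pagination/serialization helper that shapes database rows into a JSON-ready envelope. The sketch below shows that pattern in plain Python; the field names and page size are hypothetical, not the actual Zurich Insurance API.

```python
# Hedged sketch: a pure helper that could back a paginated REST endpoint,
# turning query results into a JSON-serializable response envelope.
# Field names and page size are hypothetical.

def paginate(rows, page, page_size=2):
    """Return one page of rows plus paging metadata, as a JSON-serializable dict."""
    start = (page - 1) * page_size
    return {
        "page": page,
        "total": len(rows),
        "items": rows[start:start + page_size],
    }

rows = [{"id": i} for i in range(1, 6)]
print(paginate(rows, page=2))
# {'page': 2, 'total': 5, 'items': [{'id': 3}, {'id': 4}]}
```

Keeping this logic separate from the web framework makes it trivially unit-testable and portable across the Docker/Kubernetes deployments mentioned above.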


Senior Software Engineer

Accenture: BNP Paribas
Chennai, India
11.2014 - 11.2017


  • Wrote and automated shell scripts to streamline and monitor the movement of data from Hadoop Distributed File System (HDFS) to PostgreSQL, reducing manual intervention by 40%.
  • Used PostgreSQL to store and manage relational datasets for downstream applications, ensuring data integrity and consistency.
  • Managed version control for all data engineering projects using Git, ensuring collaborative development and proper versioning of codebase.
  • Conducted performance tuning and query optimization in Hive and PostgreSQL, reducing query execution times by 25%.
  • Integrated Sqoop for data transfer between PostgreSQL and Hadoop, improving data ingestion processes and ensuring real-time availability for analytics.
  • Collaborated with data analysts and business intelligence teams to provide clean, transformed data for insights and decision-making.
  • Automated routine tasks and data validation processes using shell scripting, improving overall operational efficiency.
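A typical validation step in the HDFS-to-PostgreSQL automation described above is row-count reconciliation after each transfer. The sketch below illustrates that check in plain Python (the originals were shell scripts); table names and counts are hypothetical.

```python
# Hedged sketch: reconcile per-table row counts between an HDFS export
# and the PostgreSQL target, flagging any mismatch for operator review.
# Table names and counts are hypothetical.

def reconcile(hdfs_counts, pg_counts):
    """Return tables whose PostgreSQL row count differs from the HDFS export."""
    mismatches = {}
    for table, expected in hdfs_counts.items():
        actual = pg_counts.get(table, 0)
        if actual != expected:
            mismatches[table] = (expected, actual)
    return mismatches

hdfs = {"trades": 1000, "positions": 250}
pg = {"trades": 1000, "positions": 248}
print(reconcile(hdfs, pg))  # {'positions': (250, 248)}
```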

Programmer Analyst

Cognizant Technologies: FDMS
Kolkata, India
12.2011 - 11.2014
  • Developed and maintained COBOL programs integrated with DB2 for high-volume transaction processing in the FastData Merchant Service.
  • Created and optimized shell scripts to automate repetitive tasks, enhancing system efficiency and reducing manual intervention.
  • Designed and managed DB2 databases, performing data extraction, transformation, and loading (ETL) processes to ensure accurate and timely data availability.
  • Collaborated with development teams to support merchant service applications, ensuring seamless integration with mainframe systems.
  • Conducted performance tuning and query optimization for DB2 to minimize response times and maximize system throughput.
  • Led database migration projects to upgrade and maintain legacy systems, ensuring data integrity and system stability throughout transitions.
  • Wrote and tested JCL scripts for batch processing, ensuring efficient execution of data processing jobs across the mainframe environment.
  • Provided production support, troubleshooting system issues, and implementing fixes to minimize downtime and improve system reliability.
  • Documented processes, configurations, and system changes for operational and audit purposes, ensuring compliance with internal and regulatory standards.

Education

Bachelor of Information Technology & Computer

College of Engineering

Languages

English (Native): Bilingual or Proficient (C2)
Hindi (Intermediate): Bilingual or Proficient (C2)
French (Intermediate): Intermediate (B1)
