Results-oriented Data Engineer with 3+ years of experience building scalable data pipelines and orchestrating workflows using Azure Data Factory and Azure Databricks. Skilled in developing modern cloud-based data lakehouse architectures with Databricks, Delta Lake, Apache Spark, PySpark, and SQL. Demonstrated success in optimizing ETL pipelines and improving data quality and governance. Deep knowledge of Delta Lake features, including ACID transactions, schema enforcement, and time travel, to support both real-time and batch analytics for informed enterprise decision-making.
Overview
3 years of professional experience
1 Certification
Work History
Data Engineer
Modak Analytics, Syngenta
10.2023 - Current
Engineered real-time data ingestion pipelines using SnapLogic to land raw data in an external AWS S3 bucket as Parquet files, which were then processed in Databricks (see the ingestion sketch following this role).
Designed and implemented high-performance ETL pipelines on Azure Databricks, processing structured and semi-structured data from multiple enterprise systems.
Implemented a robust incremental data loading mechanism using Delta Lake within the Medallion Architecture, applying incremental loads across 40+ tables to preserve data integrity and significantly reduce processing times (see the merge sketch following this role).
Created and optimized PySpark jobs in Databricks to process and transform large datasets for downstream analytics.
Developed a scalable Azure-based data lakehouse architecture using Databricks Delta Lake to support ACID transactions, schema evolution, and time travel for reliable data versioning.
Implemented comprehensive data quality frameworks within Databricks to ensure accuracy, consistency, and integrity across both batch and streaming pipelines (see the data-quality sketch following this role).
Created detailed documentation for the entire data engineering process, ensuring easy knowledge transfer and onboarding for future projects.
Developed and conducted training sessions for team members, empowering them with the skills needed to effectively utilize the implemented data engineering solutions.
Collaborated in Agile development processes, including sprint planning, backlog grooming, and daily stand-ups, to ensure the timely delivery of projects.
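Ingestion sketch (illustrative): a minimal PySpark batch read of SnapLogic-landed Parquet files from S3 into a bronze Delta table; the bucket path and table names are placeholders, not the actual project values.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read SnapLogic-landed Parquet files from the (hypothetical) S3 landing path.
raw_df = spark.read.format("parquet").load("s3://example-landing-bucket/snaplogic/raw/")

# Append into a bronze-layer Delta table (placeholder name).
raw_df.write.format("delta").mode("append").saveAsTable("bronze.raw_events")
```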
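Merge sketch (illustrative): a minimal Delta Lake upsert from bronze to silver, the pattern behind the incremental loads; the table and key names are assumptions for illustration.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incoming batch from the bronze layer (placeholder table name).
updates = spark.read.table("bronze.raw_events")

# Target silver table (placeholder name); MERGE provides ACID upserts.
silver = DeltaTable.forName(spark, "silver.events")

(
    silver.alias("t")
    .merge(updates.alias("s"), "t.event_id = s.event_id")  # assumed business key
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```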
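Data-quality sketch (illustrative): a lightweight completeness/uniqueness gate of the kind such a framework runs; the rules and table name are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("silver.events")  # placeholder table

# Completeness: the key column must not contain nulls.
null_keys = df.filter(F.col("event_id").isNull()).count()
# Uniqueness: the key column must not contain duplicates.
dupe_keys = df.groupBy("event_id").count().filter(F.col("count") > 1).count()

if null_keys or dupe_keys:
    raise ValueError(f"DQ check failed: {null_keys} null keys, {dupe_keys} duplicates")
```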
Data Engineer
Modak Analytics, Abbvie
04.2022 - 09.2023
Designed and implemented a comprehensive Azure Data Factory solution for ETL processes, utilizing SQL, Data Factory pipelines, Databricks, and PySpark to handle the extraction, transformation, and loading of data from diverse sources.
Implemented security best practices by leveraging Azure Key Vault for storing and managing sensitive information such as credentials and secrets (see the secret-scope sketch following this role).
Incorporated Logic Apps to automate email notifications, enhancing system monitoring and alerting capabilities for timely issue resolution.
Developed and optimized ETL pipelines in Databricks using PySpark and SQL to process and store data.
Developed robust batch processing workflows utilizing Delta Lake’s ACID transactions, schema enforcement, and time travel features to maintain high data quality and governance (see the time-travel sketch following this role).
Designed scalable PySpark/SQL jobs to efficiently ingest, transform, and load data into Delta Lake, enabling seamless analytics and reporting.
Reduced ETL runtime by 50% through advanced partitioning, Spark tuning, and file compaction (see the compaction sketch following this role).
Managed CI/CD using Git and GitLab for seamless deployment of Databricks workflows and code.
Collaborated with cross-functional teams to gather business requirements and translate them into effective data engineering solutions.
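Secret-scope sketch (illustrative): reading a credential from an Azure Key Vault-backed Databricks secret scope; the scope, key, and connection details are placeholders, and `spark`/`dbutils` are notebook-provided globals.

```python
# Fetch the password from a Key Vault-backed secret scope (placeholder names).
jdbc_password = dbutils.secrets.get(scope="kv-etl-scope", key="sql-dw-password")

# Use it for a JDBC extract (hypothetical server, database, and table).
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://example.database.windows.net:1433;database=dw")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", jdbc_password)
    .load()
)
```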
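Time-travel sketch (illustrative): querying an earlier Delta table version for reproducible batch reruns; the table name, path, and version number are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Reprocess against a pinned historical snapshot of the table.
snapshot = spark.sql("SELECT * FROM silver.events VERSION AS OF 42")

# Equivalent path-based read, pinning by version (placeholder path).
snapshot_by_path = (
    spark.read.format("delta")
    .option("versionAsOf", 42)
    .load("/mnt/delta/silver/events")
)
```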
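Compaction sketch (illustrative): the partitioning-plus-OPTIMIZE pattern behind the runtime reduction; the table, partition column, and Z-ORDER column are assumptions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("bronze.raw_events")  # placeholder upstream data

# Write partitioned by a commonly filtered column to enable partition pruning.
(
    df.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")      # assumed filter column
    .saveAsTable("silver.events")   # placeholder table
)

# Compact small files and co-locate rows on a high-cardinality join key.
spark.sql("OPTIMIZE silver.events ZORDER BY (customer_id)")
```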
Education
Bachelor of Technology (B.Tech) - Information Technology
Gokaraju Rangaraju Institute of Engineering And Technology
Hyderabad
05.2022
Skills
Platforms:
Azure, Azure Databricks (ADB), Azure Data Lake Gen2 (ADLS), Azure SQL Data Warehouse (DWH), SnapLogic
Languages:
PySpark, Python, SQL
Technologies:
Apache Spark, Apache Iceberg, Apache Kafka, Delta Lake, Spark SQL, Parquet File Format