PABBA SANKAR

Lead Data Engineer

Hyderabad

Summary

Experienced Data Engineer specializing in designing and implementing scalable data pipelines and solutions using AWS, PySpark, and Python. Proven track record in building robust ETL frameworks, real-time and batch data ingestion pipelines, and managing large-scale data processing through distributed systems. Proficient in containerizing applications with Docker and deploying data workflows on cloud-native infrastructure. Skilled in handling diverse data types, including semi-structured and unstructured text data, with practical experience in developing predictive analytics models and conducting text mining for NLP-based applications. Collaborative team player ensuring data availability, quality, and actionable insights for business stakeholders.

Overview

years of professional experience

4013

years of post-secondary education

Work History

Sr. Technical Lead

LTIMindtree

08.2022 - Current

Asst. Manager-Analytics

Concentrix Daksh Services India Pvt Ltd

08.2017 - 08.2022

Sr. Data Engineer

IL&FS Technologies Pvt Ltd

07.2016 - 08.2017

Sr. Software Engineer

Shopalyst Technologies

05.2016 - 07.2016

Lead Engineer

HCL Technologies

01.2013 - 04.2016

Education

M.C.A. -

Holy Mary Institute of Technology And Science

B.Sc. - M.P.C.

J.R.N. Rajasthan Vidyapeeth University

12th -

Board of Intermediate

10th -

Board of School Secondary Education

Skills

Spark
Hive
Kafka
Python
R
Django
MySQL
PyCharm
RStudio
R Shiny
Logistic Regression
SVM
Random Forests
Decision Tree
Topic Summarization
Topic Mining
SQL
PySpark
Hadoop Ecosystem

HBase
AWS
S3
ECR
CloudWatch
EC2
EMR
Athena
Lambda
Spark SQL
Databricks
Shell script
Git
Jenkins
Docker
Airflow
Bitbucket
Jira

Accomplishments

Filed a patent on defect classification and association in a software development environment.
Wrote technical blogs for HCL on call volume prediction and time series analysis using machine learning algorithms.

Technical Knowledge

Apache Hadoop, HDFS, Spark, Hive, Kafka, Python, R, Java/J2SE, Coginiti Pro, SAS-EG, Django, Servlets, JSP, HTML, JavaScript, XML, Tomcat 9, MySQL, PyCharm, Spyder, RStudio, Eclipse, R Shiny, Power BI, Tableau

Key Projects Handled

S&P Global, DI Engine, Sr. Technical Lead
Technologies: SQL, Spark SQL, Hive, PySpark, Python, pandas, Hadoop Ecosystem (HDFS, Hive, Oozie, Kafka, HBase), Agile (Value-Driven Delivery), AWS (S3, ECR, CloudWatch, EC2, EMR, Athena, Lambda), Databricks, PostgreSQL, Shell script, PyCharm, IntelliJ, Git, Jenkins, Docker, Airflow
12
Architected and implemented an extensible, production-grade data ingestion and transformation framework that processes API response datasets stored in AWS S3 in CSV, nested JSON, and Excel formats. Parsing capabilities include flattening complex JSON hierarchies, selective tab and cell extraction from Excel, and data reshaping through pivoting and unpivoting. These are followed by a suite of transformations: standardized date-time formatting, expression-based column derivations, string operations, type validation, joins, unions, window functions, and generation of unique identifiers. The application supports both real-time (live) and large-scale historical data pipelines and is designed for operational efficiency, scalability, and adaptability across business domains. Curated datasets are persisted as Delta tables on external AWS S3 storage, using Delta Lake's ACID compliance, schema evolution, partitioning, and time-travel capabilities to enable high-performance analytics and trusted data consumption across the organization.
Aetna, CNX Payment Integrity, Sr. Data Engineer
Technologies: PySpark, Spark SQL, Hive, Databricks, Oozie, Jira, Bitbucket, Python, R, RStudio, HDFS
8
Performed exploratory data analysis for a US-based healthcare client using big data tools (Hive, Hadoop, Azure Databricks). CNX Payment Integrity offers a comprehensive suite of payment integrity solutions. The Payment Integrity Enterprise suite is designed to give analysts, investigators, managers, policy makers, and stakeholders insights that help health and human services agencies address fraud, abuse, waste, and overpayment. It includes packaged and reusable analytics models for industry-standard solutions, and ACAS (Automated Claim Auditing System), which analyses historical claims and transactions to identify potential savings from overpayment recovery and reduced administrative costs.
Tufts, CNX Payment Integrity, Sr. Data Engineer
Technologies: PySpark, Spark SQL, MySQL, Azure Data Lake, Databricks, Oozie, Jira, GitHub, Azure Data Factory
4
Performed exploratory data analysis for a US-based healthcare client using big data tools (Hive, Hadoop, Azure Databricks). CNX Payment Integrity offers a comprehensive suite of payment integrity solutions. The Payment Integrity Enterprise suite is designed to give analysts, investigators, managers, policy makers, and stakeholders insights that help health and human services agencies address fraud, abuse, waste, and overpayment. It includes packaged and reusable analytics models for industry-standard solutions, and ACAS (Automated Claim Auditing System), which analyses historical claims and transactions to identify potential savings from overpayment recovery and reduced administrative costs.
Internal Project (Product), Concentrix Insights Platform, Sr. Data Engineer
Technologies: Python, J2EE, Django, HTML, JavaScript, PySpark, Power BI
4
Developed an enterprise-wide self-service analytics platform that brings data, analytics, tools, and processes under one roof for customer deliverables. The secure, automated platform mines and analyses data to give a 360-degree view, with embedded and reusable models that let analysts focus on driving business value and insights.

Personal Information

Date of Birth: 03/16/85
