
ANWES PRADHAN

Sr DATA ENGINEER
Hyderabad

Summary

A self-motivated professional who believes in growth and is open to new ideas, looking for a role where my skills and expertise are fully utilized and where I can prove to be an asset to the company.

  • 8+ years of overall IT experience in data warehousing solutions using Spark, Google Cloud Platform (GCP), PySpark, Big Data Hadoop ETL, Python, Informatica, Oracle and RDBMS
  • 5+ years of exclusive experience as a Data Engineer in PySpark, Spark, Hadoop and Big Data ecosystem components such as HDFS, Hive, Sqoop, NiFi and NoSQL
  • 2+ years on Google Cloud (GCP) infrastructure
  • Extensively drive data processing through Spark SQL and Spark Core
  • Experience in BigQuery, Dataflow, Dataproc, GCS, Cloud SQL, Cloud Functions, Cloud Scheduler and Cloud Storage
  • Expertise in SQL, ETL, Informatica, Oracle, Snowflake and NoSQL; knowledge of Spark Streaming
  • Experienced in creating data pipelines and loading large sets (GBs to TBs) of structured, semi-structured and unstructured data into file systems such as GCS and HDFS using Hadoop/Sqoop, into RDBMS using Informatica, and into databases using the Oracle External Loader
  • Extensively worked on extracting, transforming and loading data from application source systems, log servers, RDBMS, flat files and Excel
  • Strong in Hive QL, SQL, Spark SQL and Spark Core
  • Understanding of end-to-end data warehousing concepts and ETL processing from source to Data Mart/Data Lake
  • Hands-on experience in performance optimization and tuning, identifying and resolving bottlenecks at various levels of data pipelines and transformations
  • Involved in the complete System Development Life Cycle (SDLC)
  • Experience in unit testing (Hadoop, ETL and DB testing) and data validation
  • Knowledge of scheduling tools such as ActiveBatch, Airflow and Informatica Scheduler
  • Excellent communication, documentation and presentation skills

Overview

9 years of professional experience
1 Certification

Work History

Sr Cloud Data Engineer

Micron India Pvt. Ltd.
11.2019 - Current
  • Responsible for data extraction from all heterogeneous sources (RDBMS, files, HDFS, GCS, servers) into the data warehouse layer and for creating data mart tables on top of them
  • Responsible for creating data analytics pipelines using GCS and BigQuery, scheduled through Cloud Scheduler, Pub/Sub and Cloud Functions
  • Responsible for creating and managing data pipelines from source to target using PySpark and Python scripts, with Dataproc and NiFi as ETL
  • Manage archival data in GCS buckets for year-over-year analysis
  • Responsible for all activities related to the development, implementation and support of Big Data processing using Hive and BigQuery tables
  • Created tables and views in Snowflake, BigQuery and Hive and loaded the data into the data mart layer for user analysis
  • Developed Spark scripts in Python as per requirements
  • Used Spark to load JSON data, create schema RDDs and load them into Hive and BigQuery tables, and handled structured data with Spark SQL (see the sketch after this list)
  • Responsible for creating DataFrame and Dataset objects for Spark SQL processing
  • Support automated Python jobs and scripts
  • Worked on data cleansing and standardization in the ETL tool, applying transformations, joins and pre-aggregations before storing the data in HDFS and GCS
  • Regularly involved in data analysis by comparing the source RDBMS against BigQuery or Hive
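
For illustration only, a minimal PySpark sketch of the kind of JSON-to-Hive/BigQuery load described above; the bucket, dataset and table names are hypothetical placeholders, and the BigQuery write assumes the spark-bigquery connector is available on the cluster.

    # A minimal sketch, not the project's actual job: read JSON from GCS, shape it
    # with Spark SQL, and load the result into Hive and BigQuery tables.
    # Bucket, database and table names below are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("json_to_hive_and_bq")
             .enableHiveSupport()
             .getOrCreate())

    # Load semi-structured JSON into a DataFrame; the schema is inferred.
    events = spark.read.json("gs://example-bucket/raw/events/*.json")

    # Handle the structured part with Spark SQL before loading.
    events.createOrReplaceTempView("events")
    daily = spark.sql("""
        SELECT event_date, tool_id, COUNT(*) AS event_count
        FROM events
        GROUP BY event_date, tool_id
    """)

    # Load into a Hive table in the data mart layer.
    daily.write.mode("overwrite").saveAsTable("datamart.daily_tool_events")

    # Load the same result into a BigQuery table (assumes the spark-bigquery
    # connector is on the classpath; uses an indirect write via a temp bucket).
    (daily.write.format("bigquery")
          .option("table", "example_project.datamart.daily_tool_events")
          .option("temporaryGcsBucket", "example-temp-bucket")
          .mode("overwrite")
          .save())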

Software Developer

LiquidHub India Pvt. Ltd.
03.2018 - 10.2019

Software Engineer

GrayMatter Software Services Pvt. Ltd
02.2015 - 02.2018

Education

Master of Science -

College of Engineering And Technology
Bhubaneswar
04.2001 -

Skills

Oracle 11g (XE)

Academic Credentials

Gandhi Institution for Technological Advancement, BBSR; Industrial Engineering & Management from CET, BBSR, with 8.8 CGPA

Projects Handled

  • Micron Smart Manufacturing Data Analysis | Micron | November 2019 - Present | Team size: 15 | Role: Data Engineer
    Micron, the fourth-largest semiconductor manufacturer in the world, is a US-based company. The goal of the project is to analyse the manufacturing data from each fabrication unit. As Big Data ETL developers, we are responsible for providing the data engineering solution for the data sources and data systems so that productivity can be improved. Through the SMAI project, Micron successfully implemented SIGMA on the current data set, and data behaviour can now be predicted with up to 99% accuracy.
    Responsibilities: as listed under Work History (Sr Cloud Data Engineer, Micron India Pvt. Ltd.) above.
  • Shire & Baxalta Data Analysis | Shire, USA | March 2018 - October 2019 | Team size: 12 | Role: Big Data ETL Developer
    Shire is a global healthcare company addressing the evolving needs of patients worldwide. As a Big Data team, we are responsible for the accuracy and completeness of the provided data and for performing patient data audits to find customer trends in the market, through which the future sales of Shire's healthcare products can be predicted.
    - Responsible for configuring Hadoop ecosystem components using the CDH 6 distribution with Hadoop 2.6 and Spark 2.2.0
    - Responsible for loading data into the Enterprise Data Warehouse (LH ODS) using the Hadoop ecosystem and Informatica Big Data tools, then creating DataFrame and Dataset objects for Spark SQL processing
    - Responsible for migrating table schemas and definitions from Oracle and MySQL to the HDFS metastore for Hive
    - Implemented partitioning, dynamic partitioning and bucketing in Hive for efficient data access (see the partitioning sketch at the end of this section)
    - Worked with different file formats such as ORC, RC and SequenceFile
    - Involved in performance tuning, query optimization and compression techniques
    - Developed Sqoop scripts to move data between Hive, Oracle and MySQL databases
    - Worked on data mastering and on identifying redundant data in the ODS
    - Interacted with business analysts, data administrators, the reporting team and end users
    - Understood the business requirements and ensured estimated development timelines were met
    - Designed and developed end-to-end data flows through Hadoop and ETL, storing the results in the database for reporting
    - Performed quality control checks on the data before loading it to the staging layer
    - Involved in the development, bug resolution, and operations and support phases as per team requirements
  • Schneider | Schneider Electric, France | July 2017 - February 2018 | Team size: 10 | Role: ETL Developer with Hadoop
    This project covers the details of the entire manufacturing execution. The ETL team is responsible for handling and processing the data and providing it to the BI team for reports, which help business leaders identify focus areas such as DPMO (Defects per Million Opportunities) analysis and FPY (First Pass Yield) analysis, along with PY, RY and MDR ppm metrics.
    - Responsible for fetching the client's manufacturing data from the Schneider environment and ensuring a clean data load to HDFS for batch processing
    - Extensively used Informatica client tools: PowerCenter Designer, Workflow Manager, Workflow Monitor and Repository Manager
    - Imported all product-specific data and event logs into the data warehouse using Sqoop and the Oracle Loader component from relational databases such as Oracle and Teradata
    - Responsible for writing and running Hive queries for daily and monthly reports
    - Experienced with compression techniques such as GZip, BZip2 and Snappy
    - Created sessions, tasks, workflows and worklets using Workflow Manager
    - Involved in performance tuning and query optimization
    - Developed Slowly Changing Dimension mappings for Type 1 and Type 2 SCDs (see the SCD Type 2 sketch at the end of this section)
  • MTN | Cellular Telecommunications, Nigeria | Team size: 8 | Role: ETL Informatica Developer
    - Worked as an ETL, Hadoop and Oracle developer on the project
    - Implemented partitioning and bucketing in Hive for efficient data access
    - Utilized Apache Hadoop ecosystem tools such as HDFS and Hive to analyse large datasets
    - Extensively worked with various lookup caches: static, dynamic and persistent
    - Implemented Change Data Capture logic to handle frequently changing data sets from the source system
    - Worked on Informatica optimization techniques and applied them at the session level to reduce workflow running time
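
The Hive partitioning and bucketing mentioned in the Shire project can be sketched as follows. The original tables were defined with Hive DDL; this version expresses the same layout through Spark's DataFrameWriter, and the database, table and column names are hypothetical placeholders.

    # A minimal sketch of Hive-style partitioning and bucketing, expressed through
    # Spark's DataFrameWriter rather than the project's Hive DDL.
    # Database, table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("partition_bucket_sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Assumed staging table with columns: patient_id, event_type, event_ts, load_date.
    raw = spark.table("staging.patient_audit_raw")

    # Partition by load_date (coarse pruning on date filters) and bucket by patient_id
    # (pre-hashed layout that helps joins and sampling on that key).
    (raw.write
        .mode("overwrite")
        .format("orc")
        .partitionBy("load_date")
        .bucketBy(16, "patient_id")
        .sortBy("patient_id")
        .saveAsTable("ods.patient_audit"))

    # A query that filters on the partition column reads only the matching partitions.
    spark.sql("""
        SELECT patient_id, COUNT(*) AS events
        FROM ods.patient_audit
        WHERE load_date = '2019-06-01'
        GROUP BY patient_id
    """).show()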
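
The Slowly Changing Dimension mappings in the Schneider project were built in Informatica; purely to illustrate the Type 2 logic, here is a compact PySpark sketch. Table and column names are hypothetical, and handling of brand-new keys is omitted for brevity.

    # Type 2 SCD logic illustrated in PySpark (the project itself used Informatica
    # mappings). All table and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("scd2_sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Existing dimension: customer_id, address, start_date, end_date, is_current.
    dim = spark.table("dw.customer_dim")
    # Latest source snapshot: customer_id, address.
    stg = spark.table("staging.customer_feed")

    current = dim.filter(F.col("is_current") == 1)

    # Keys whose tracked attribute changed in the new snapshot.
    changed_keys = (stg.join(current, "customer_id")
                       .where(stg["address"] != current["address"])
                       .select(stg["customer_id"]))

    # Close out (expire) the current version of every changed key.
    expired = (current.join(changed_keys, "customer_id", "left_semi")
                      .withColumn("end_date", F.current_date())
                      .withColumn("is_current", F.lit(0)))

    # Open a new current version carrying the changed attribute values.
    opened = (stg.join(changed_keys, "customer_id", "left_semi")
                 .withColumn("start_date", F.current_date())
                 .withColumn("end_date", F.lit(None).cast("date"))
                 .withColumn("is_current", F.lit(1)))

    # Past versions and unchanged current rows are carried over as they are.
    history = dim.filter(F.col("is_current") == 0)
    unaffected = current.join(changed_keys, "customer_id", "left_anti")

    result = history.unionByName(unaffected).unionByName(expired).unionByName(opened)
    result.write.mode("overwrite").saveAsTable("dw.customer_dim_scd2")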

Professional Highlights

  • Big Data Developer, Micron India Pvt. Ltd., Nov 2019 - Present
  • Software Developer, LiquidHub India Pvt. Ltd. (Capgemini), Mar 2018 - Oct 2019
  • Software Engineer, GrayMatter Software Services Pvt. Ltd., Feb 2015 - Feb 2018

Personal Information

  • Gender: Male
  • Nationality: Indian

Languages

English, Hindi and Oriya

Certification

Google Cloud Data Engineer, 06-2024
