Kolisetty Sasiram

Associate Consultant

Andhra Pradesh, India · 4 yrs 7 mos experience

Key Highlights

  • Led migration of 100+ workflows to Databricks.
  • Achieved a 70% increase in authorized access to sensitive data.
  • Engineered data pipelines saving $100K annually.

Skills

Core Skills

Data Engineering · Big Data

Other Skills

Analytical Skills · Apache Spark · Arduino · Arduino IDE · Azure Data Factory · Azure Data Lake · Azure Databricks · Azure Key Vault · C (Programming Language) · Cascading Style Sheets (CSS) · Data Governance · Data Ingestion · Data Management · Data Modeling · Data Pipeline

About

I'm Kolisetty Sasiram, a passionate Data Engineer with a proven track record of leveraging Big Data technologies to drive business outcomes. With over 4 years of experience, I specialize in transforming raw data into valuable insights and building robust data platforms that enhance operational efficiency and decision-making. My technical skillset includes Azure Data Factory, Databricks, dbt, Hadoop, Apache Spark, Scala, PySpark, Hive, SQL, and NoSQL, and I am dedicated to continuously learning and staying ahead of evolving data engineering trends.

In my current role as a Lead Consultant at Genpact, I architect and implement scalable, secure, and high-performance data engineering solutions for large enterprise environments:

  • Developed a Model-Controller Framework that streamlines complex ETL workflows.
  • Designed a dynamic DET framework for granular permissions management, ensuring compliance and governance across enterprise data landscapes.
  • Architected and optimized Databricks workflows aligned with Medallion architecture principles, achieving a 70% increase in authorized access to sensitive data and a 30% boost in data processing speeds.
  • Engineered Azure Data Factory (ADF) pipelines, improving data throughput by 40% and delivering $100K in annual cost savings.
  • Tuned PySpark jobs, leading to a 40% improvement in processing efficiency.
  • Developed PySpark-based data cleaning pipelines, enhancing data accuracy by 15%.

Beyond the technical aspects, I am passionate about building secure, reliable, and high-performance data platforms that enable organizations to unlock the full potential of their data. I thrive in challenging environments where innovation, performance tuning, and governance are top priorities. My goal is to continue evolving as a data engineering leader, driving cutting-edge data solutions that deliver measurable business impact and enable data-driven transformation across industries. If you're looking to collaborate on high-impact data engineering initiatives or need someone to help transform your organization's data into actionable insights, feel free to connect. I'm always open to meaningful conversations and new opportunities!

Experience

Genpact

Lead Consultant

Aug 2024 – Present · 1 yr 7 mos · Hyderabad, Telangana, India · On-site

  • Led the migration of 100+ Informatica workflows to Databricks notebooks and workflows, improving scalability and long-term maintainability.
  • Designed and implemented a metadata-driven orchestration framework, streamlining pipeline deployment and reducing manual intervention.
  • Optimized Informatica logic into efficient Spark SQL and PySpark implementations, achieving ~3x faster execution times and improving pipeline reliability.
  • Implemented scalable processing workflows to refine and enrich data sourced from upstream ingestion frameworks for analytical and reporting use cases.
  • Improved ETL performance, driving ~15% savings on Databricks compute costs through advanced query optimization and better resource allocation.
  • Spearheaded data validation and reconciliation, ensuring high data accuracy and significantly reducing post-migration data discrepancies.
  • Scaled pipelines to process terabytes of data daily, enhancing reliability and reducing data latency.
  • Collaborated cross-functionally to streamline data ingestion, processing, and reporting, improving SLA adherence and operational efficiency.
  • Provided technical leadership in the conversion of complex Informatica mappings and optimization of existing logic, fostering a culture of performance and engineering excellence.
Databricks · Azure Data Factory · ETL · Data Validation · Data Processing · Data Engineering +1
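The post-migration validation and reconciliation described above can be sketched in plain Python (no PySpark dependency; the tables and the count-plus-checksum scheme are illustrative assumptions, not the actual framework):

```python
from hashlib import md5

def table_fingerprint(rows):
    """Order-insensitive fingerprint of a table extract: row count plus
    an XOR of per-row MD5 digests, so row order does not matter."""
    acc = 0
    for row in rows:
        # Serialize each row deterministically (sorted keys) before hashing.
        digest = md5(repr(sorted(row.items())).encode()).hexdigest()
        acc ^= int(digest, 16)
    return len(rows), acc

def reconcile(source_rows, target_rows):
    """Compare row count and content fingerprint of two extracts."""
    src = table_fingerprint(source_rows)
    tgt = table_fingerprint(target_rows)
    return {"count_match": src[0] == tgt[0], "content_match": src == tgt}

# Hypothetical extracts, e.g. from the Informatica source and the Databricks target
source = [{"id": 1, "amt": 10.0}, {"id": 2, "amt": 7.5}]
target = [{"id": 2, "amt": 7.5}, {"id": 1, "amt": 10.0}]  # same rows, different order
print(reconcile(source, target))  # both checks pass despite row order
```

In a real migration the same comparison would be run on aggregates computed inside each engine rather than on full extracts, but the count-and-checksum idea is the same.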

Celebal Technologies

Big Data Consultant

Oct 2022 – Aug 2024 · 1 yr 10 mos · Hyderabad, Telangana, India · Hybrid

  • Developed a DET (Data Entitlement) framework facilitating schema- and table-level permissions management, automating time-bound access permissions for Unity Catalog schemas, tables, and views.
  • Architected and developed Databricks workflows adhering to Medallion architecture principles, implementing seamless data ingestion and processing workflows.
  • Engineered processes within the customer-serving layer for seamless data delivery, incorporating functionalities such as Data Download, Data Preview, and Data Visualization. These enhancements led to a 70% increase in authorized access to sensitive data through user entitlement-based access controls.
  • Spearheaded the migration of Talend workflows to Azure Cloud, optimizing code logic within Databricks, resulting in a 30% improvement in processing speed and a 20% reduction in operational costs.
  • Designed and implemented ADF pipelines for efficient data ingestion from diverse sources, achieving a 40% increase in data throughput and saving approximately $100,000 annually in infrastructure costs through optimized flow orchestration with Data Factory pipelines.
  • Tuned and optimized PySpark jobs for enhanced efficiency, aligning tools with business use cases, and led performance tuning initiatives that delivered a 40% improvement in data processing speed and a 30% reduction in pipeline costs while ensuring data integrity and accuracy.
  • Engineered PySpark data cleaning pipelines, boosting data accuracy by 15% by addressing missing values, eliminating duplicates, and standardizing formats. Also established a regulatory data quality framework for compliance.
Databricks · Azure Data Factory · Data Ingestion · Data Processing · Data Quality · Big Data +1
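A time-bound, table-level grant of the kind the DET framework automates could look like the sketch below. The catalog/schema names are made up, and while the GRANT/REVOKE statements follow Unity Catalog's SQL syntax, the real framework's interface is an assumption here:

```python
from datetime import datetime, timedelta, timezone

def build_grant(principal, securable, privilege="SELECT", hours=24):
    """Return the GRANT statement, its matching REVOKE, and the expiry
    timestamp a scheduler would use to enforce time-bound access."""
    grant_sql = f"GRANT {privilege} ON TABLE {securable} TO `{principal}`"
    revoke_sql = f"REVOKE {privilege} ON TABLE {securable} FROM `{principal}`"
    expires_at = datetime.now(timezone.utc) + timedelta(hours=hours)
    return grant_sql, revoke_sql, expires_at

grant, revoke, expiry = build_grant("analyst@example.com", "main.sales.orders")
print(grant)  # GRANT SELECT ON TABLE main.sales.orders TO `analyst@example.com`
```

A scheduled job would execute `grant` immediately and `revoke` once `expiry` passes, which is the essence of automated time-bound entitlements.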

Tata Consultancy Services

System Engineer

Aug 2021 – Oct 2022 · 1 yr 2 mos · Hyderabad

  • Imported and exported data between Relational Database Systems and HDFS using Sqoop; compiled, cleaned, and manipulated data for proper handling.
  • Created multiple Hive tables, implementing partitioning, bucketing, and other optimization techniques in HiveQL for efficient data access.
  • Implemented Hive optimized joins to gather data from different sources and run ad-hoc queries on top of them.
  • Increased data processing efficiency by approximately 30% using Hive optimization techniques, helping reduce project costs.
  • Implemented Spark jobs using PySpark, utilizing the Spark Structured APIs for faster data processing.
  • Profiled and optimized existing Spark pipelines to handle growing data requirements, reducing resource usage by 40% and speeding up processing by 3x.
  • Conducted performance tuning on SQL queries, improving data retrieval by 20%.
Hive · Spark · SQL · Data Processing · Big Data
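The idea behind Hive's optimized (map-side) join, mentioned above, is to build an in-memory hash table from the small side and stream the large side past it. A minimal plain-Python sketch, with made-up dimension and fact tables standing in for Hive data:

```python
def map_side_join(small_rows, large_rows, key):
    """Hash join: index the small table in memory, then probe it once
    per row of the large table, mirroring Hive's map join optimization."""
    index = {}
    for row in small_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in large_rows:
        for match in index.get(row[key], []):
            joined.append({**match, **row})  # merge matching rows
    return joined

# Hypothetical dimension (small) and fact (large) tables
dims = [{"dept_id": 1, "dept": "Sales"}, {"dept_id": 2, "dept": "Ops"}]
facts = [{"dept_id": 1, "amount": 100}, {"dept_id": 2, "amount": 50},
         {"dept_id": 1, "amount": 75}]
print(map_side_join(dims, facts, "dept_id"))
```

Avoiding the shuffle-and-sort that a reduce-side join needs is what makes this fast when one table fits in memory.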

Education

KL University

Bachelor of Technology - BTech

Jun 2017 – May 2021

Sri Chaitanya junior kalasala

Intermediate - MPC

Apr 2015 – Apr 2017

S S S Mokshith High School

SSC
