Sanjay Aswani

AI Researcher

Mumbai, Maharashtra, India · 15 yrs 7 mos experience

Most Likely To Switch · Highly Stable

Key Highlights

12 years of experience in data engineering.
Expert in building scalable distributed systems.
Proficient in Apache Spark and Hadoop stack.

Stackforce AI infers this person is a Data Engineering expert with a focus on scalable distributed systems in the SaaS industry.

Contact

sanjayaswaniec@gmail.com +81081445807 LinkedIn

Skills

Core Skills

Data Engineering, Distributed Systems, Business Intelligence

Other Skills

Aerospike, Amazon EC2, Amazon Elastic MapReduce (EMR), Analytics, Apache Airflow, Apache Druid, Apache Flink, Apache Oozie, Apache Pinot, Apache Spark, Apache Sqoop, Big Data, Cassandra, Data Analytics, Data Investigations

About

12 years of hands-on experience in:
- Architecting large-scale distributed systems
- Designing, implementing & managing low-latency, high-throughput, near-real-time distributed data pipelines, data warehouses and analytics platforms on top of terabytes of data

Specialised in building scalable systems with:
- Apache Spark
- Hadoop stack (HDFS, Hive & HBase)
- SQL
- NoSQL stores (Cassandra, Aerospike, Redis & Elasticsearch)
- Stream technologies (Flink, Spark Streaming & Kafka Streams)
- Programming languages (Python, Java, Scala & SQL)
- Kafka
- Apache Airflow
- OLAP systems (Apache Pinot & Apache Druid)
- Cloud services (AWS and GCP)
- RDBMS (MS SQL Server, MySQL and PostgreSQL)
- Data warehousing
- Data modelling
- Business intelligence (Power BI, Tableau and Superset)

Experience

Total Experience: 15 yrs 7 mos

Average Tenure: 5 yrs 2 mos

Current Experience: 10 yrs 3 mos

Media.net

Sr. Lead Data Engineer

Feb 2016 – Present · 10 yrs 3 mos · Mumbai Area, India · On-site

I design and develop end-to-end data pipelines to ingest, process and transform structured and semi-structured data (500 million, ~100 GB per day) from various sources, and implement data quality check services and data validation processes.
I design and tune scalable data pipelines using core components like HDFS, Apache Spark, Hive and Kafka.
I own the analytics platforms built with stream technologies such as Apache Flink and Spark Streaming, with Aerospike, Cassandra and HBase as the storage layer and Kafka as the messaging queue.
I have designed and developed multiple microservice applications on various distributed platforms using Java Spring Boot.
I have set up and maintained multiple OLAP platforms such as Apache Pinot and Apache Druid for internal admin teams and customer reporting, and tuned these platforms to get the most performance from an optimal infrastructure investment.

Apache Spark, Hadoop stack, HDFS, Hive, Kafka, Aerospike, +10

Directi

Sr. Data Engineer

Dec 2011 – Jan 2016 · 4 yrs 1 mo · Mumbai, Maharashtra, India · On-site

Developed and worked across multiple tiers of data-intensive, high-performance client-facing and backend applications.
Worked on big data solutions, business intelligence, data investigations, process optimisation, analytics and reporting.
Implemented MSBI platform solutions to develop and deploy ETL, analytics and reporting on MS SQL Server using its Database Engine, Analysis Services, Integration Services and Reporting Services components.
Implemented a large-volume data warehouse and ETL processes to load and provide real-time data for business analysis.

MSBI, ETL, Business Intelligence, Data Investigations, Analytics, Reporting, +2