Arif Khan

Founder

Seattle, WA, USA · 23 yrs 11 mos experience

Key Highlights

  • Founder of innovative data storytelling platform.
  • Expert in big data architecture and cloud solutions.
  • Led cross-team efforts at Microsoft for scalable systems.
Stackforce AI infers this person is a Big Data Architect with expertise in SaaS solutions and cloud computing.

Skills

Core Skills

Big Data · Cloud Computing

Other Skills

Google BigQuery · AWS Data Pipeline · Hive · Elasticsearch · Azure Cosmos DB · COSMOS · Azure Kusto · Microsoft SQL Server · Microsoft Azure · Android · PostgreSQL · NoSQL · Enterprise Software · Test Driven Development · Hadoop

About

Founder & CEO @ Zinzu | ex-Microsoft, Seattle, WA. The hardest part of data is understanding the story it’s trying to tell. Without that story, every decision becomes a guessing game. Each pattern in the data is a different story waiting to be uncovered. Finding these stories should be as simple as searching for a phrase in a document. On a mission to solve two real problems: 1) Build a platform that extracts stories from raw data. 2) Let people express data needs as naturally as they think, with no technical skills required.

Experience

Total Experience: 23 yrs 11 mos
Average Tenure: 3 yrs 8 mos
Current Experience: 1 yr 7 mos

Zinzu

Founder & CEO @ Zinzu | On a Mission to Turn Scattered Data into Stories

Oct 2024 - Present · 1 yr 7 mos · Seattle, Washington, United States · On-site

  • Understanding user behavior from raw data is one of the hardest problems in analytics.
  • Dashboards show numbers, not stories. SQL shows tables, not context.
  • Without the story, teams are guessing.
  • Every sequence of events has a narrative.
  • The challenge is extracting it at scale, across millions of users and billions of events.
  • I’m building Zinzu to solve this:
  • 1) Automatically extract the real story hidden inside raw event streams.
  • 2) Let anyone ask complex behavioral questions as naturally as they think, with no technical skills required.
  • Check out our preview @ https://zinzu.io
Google BigQuery · AWS Data Pipeline · Hive · Elasticsearch · Azure Cosmos DB · COSMOS · +41 more

Resonate

Principal Engineering Leader

Dec 2023 - Jun 2024 · 6 mos · Remote

LiveRamp

Staff Engineering Leader / Big Data Architect

Mar 2022 - Dec 2023 · 1 yr 9 mos · Seattle, Washington, United States

  • Tech lead for the Segments delivery platform.
  • Learned the legacy code running as microservices on Kubernetes (GKE) and Hadoop jobs on GCP Dataproc clusters with little to no help, then shared that knowledge with the rest of the team and improved the team's velocity.
  • Analyzed large volumes of service calls to identify cache inefficiency, and worked with the partner team to change calling patterns for efficient use of pre-processed data.
  • Designed and led an effort with two other developers to implement automatic routing across different GCP Dataproc clusters. This let us increase the number of preemptible nodes, reducing cost by 20% and improving SLAs by 40%.
  • Data in our large datasets is serialized as Thrift objects that need custom deserializers, which made troubleshooting painful. Designed and led an effort with another developer to build a deserializer for large datasets that writes output to Google BigQuery in a human-readable format, enabling fast turnaround on data quality issues.
  • Analyzed audience datasets for unused segments, then designed and implemented a self-learning, auto-adjusting service to cut unused data, further improving data filtering performance.
  • Worked with the SRE team to identify missing Datadog alerts and improved monitoring by adding required alerts and reducing noisy ones.
  • Simplified ZooKeeper’s znode structure for coordination between microservices.
Java · PostgreSQL · Big Data · Hadoop · MapReduce · Cadence · +8 more

Open source

Open source contribution

Jan 2021 - Mar 2023 · 2 yrs 2 mos · Seattle, Washington, United States

  • Open source contribution (personal project in my free time): Simply Index Logger (SIL).
  • Conceived and initiated the project, built a team of contributors, and designed and developed v1 of the product.
  • The code is open source and open to contributions from other developers.
  • SIL is developed in Java.
  • Design and source code are available at https://bitbucket.org/akindexedlogger/indexedlogger/wiki/Home

Microsoft

2 roles

Big Data Architect

Promoted

Apr 2017 - Mar 2022 · 4 yrs 11 mos · Redmond

  • Built and scaled a behavior-understanding system capable of processing petabytes of data on clusters with tens of thousands of nodes.
  • Led a cross-team effort across the org to build a platform that performs event sequencing using state machine models in a generic way. This was a complex effort involving engineers, data scientists, and PMs, with regular communication to the leadership team. Successfully led, designed, and delivered a scalable platform.
  • Led an effort across research, data science, and other teams to evaluate various anomaly detection algorithms on Windows telemetry data, and successfully delivered a cross-group feature.
  • Worked closely with teams in the Windows org on a regular basis to identify gaps in the event processing system, and to triage and prioritize feature requests with management and PMs.
  • Mentored new hires and actively participated in interviewing candidates.
  • Core member of the architectural review team in our org.
  • Designed and developed a backend data processing system using Java MapReduce, Impala on Azure, and Cosmos for the anomaly detection framework.
  • Worked with various teams in the Office org to improve survey performance.
  • Led a v-team of data engineers and data scientists to analyze various sampling strategies for reducing data volume, and successfully implemented a sampling strategy for the org.
  • Designed the data storage layer for the experimentation system in COSMOS (internal map-reduce big data system) to enable backfilling and reduce storage in line with GDPR guidelines.
  • Led a cross-team effort to implement GDPR policies on all data assets.
  • Led an effort to develop the export process that fulfills GDPR DSRs (data subject requests). This was a cross-team effort between onsite and offshore teams.
  • Analyzed data and implemented an auto-tagging process to separate non-private data from private data.
Java · C# · COSMOS · Microsoft SQL Server · Big Data · MapReduce · +5 more

Senior Software Design Engineer

Jan 2006 - Apr 2015 · 9 yrs 3 mos · Greater Seattle Area

  • Joined Bing Ads when the group was a startup and worked on various iterations of the ad campaign and editorial systems until the Yahoo and Bing ad systems were merged.
  • Primary owner of the ads editorial system; designed and developed a scalable platform to facilitate manual verification of ads and keywords, serving editors in different locations.
  • Member of a team that delivered a scalable ads platform on a tight schedule during the crucial Yahoo and Microsoft search engine merger.
  • Worked as a mentor and technical lead to offshore teams in India and China.
  • Travelled to Bangalore, India to train new engineers who moved from Yahoo to Microsoft on Bing’s advertisement platform.
  • Architected a system to scale and distribute processing of Windows app store telemetry data using consistent hashing. Designed it so that new servers can be added and accounts moved to them with no row copy of data.
  • Designed and developed a caching solution that improved throughput of telemetry data processing by more than 40%; this started as a side project and was later productionized.
  • Successfully led an effort to make the Windows Reliability Telemetry system stable and able to run with minimal intervention during the crucial release of Windows 8.
  • Developed an auto-scalable Windows service (similar to MapReduce) for the Xbox supply chain to distribute processing of enterprise relational data across multiple servers and bulk-transfer the reduced data to a persistent store, improving performance by more than 50%.
  • Designed and developed the editorial processing system for Bing Ads using functional partitioning and queueing methodologies, improving performance by 60% and reducing the cost of customer support by 20%.
  • Led a team of 3 developers to build a B2B interface with the external digital marketing provider “ExactTarget” to deliver emails targeted at Xbox users.

Pointinside

Principal Architect

Apr 2015 - Apr 2017 · 2 yrs · Greater Seattle Area

  • Architected an analytics platform for retail partners using big data and NoSQL solutions on the AWS cloud.
  • Go-to person and big data architect across the company.
  • Developed a scalable solution to process data on AWS EMR clusters and load the processed data into an Elasticsearch cluster to provide querying capabilities.
  • Built an automated solution to onboard new partners, covering various services (DNS (Dynect), SFTP, setting up infrastructure on the AWS cloud, and monitoring).
  • Worked as an architect for the search team to reduce data load times to Solr by designing a system that provides incremental loads to Solr in combination with EMR processing.
  • Developed a prototype of a cloud-based, self-serve analytics platform for retail partners that can be used by their in-house data stores and UIs.
  • Designed and built a framework to streamline data loads to the Elasticsearch cluster.
  • Performance-tuned Elasticsearch indices and data load processes.
Hadoop · MapReduce · Hive · Amazon Web Services (AWS) · Apache Pig · AWS Data Pipeline · +3 more

Consultant at Verizon, Dallas, TX

Software Engineer

Jan 2002 - Dec 2005 · 3 yrs 11 mos
