Varun Mehrotra

SRE (Site Reliability Engineer)

Lehi, Utah, United States · 17 yrs 4 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • 17+ years of experience in Software Engineering and Site Reliability Engineering.
  • Expertise in building and operating secure, reliable services.
  • Strong problem-solving skills with a focus on cost-saving methods.
Stackforce AI infers this person is a SaaS Site Reliability Engineer with extensive experience in automation and cloud services.

Skills

Core Skills

Site Reliability Engineering, Devops, Adobe Experience Manager, Content Management Systems, Web Application Development

Other Skills

AEM, AWS, Adobe Connect, Adobe Experience Manager (AEM), Amazon Web Services (AWS), Apache, Apache Sling, Architecture, Architectures, Automation, Azure, CMS, CQ5, CRX, Chef

About

Accomplished Software Engineer / Operations / Site Reliability Engineer with 17+ years of experience, looking to apply what I have learned, my adaptive problem-solving skills, and my determination in close collaboration with others. I am most interested in opportunities in Software Engineering, Site Reliability Engineering, and DevOps, where I can help design, develop, and deploy secure and reliable services. Taking ownership, holding to the highest standards, solving problems through deep dives while keeping the big picture in mind, and finding cost-saving methods while building trust have helped me deliver all my projects successfully and made me an exceptional team member.

Experience

Total Experience: 17 yrs 4 mos
Average Tenure: 5 yrs 9 mos
Current Experience: 14 yrs 1 mo

Adobe

6 roles

Lead Site Reliability Engineer (Document Cloud)

Feb 2024 - Present · 2 yrs 4 mos

Senior Site Reliability Engineer (Document Cloud)

Jun 2021 - Feb 2024 · 2 yrs 8 mos

  • Senior Site Reliability Engineer helping build and operate services such as Adobe Sign. Adobe Sign is the fastest, easiest way to get contracts signed and filed. Unlike virtual fax or e-signature software, we offer an end-to-end solution that automates the entire contracting process, from the request for signature to the distribution and filing of the final agreement. We instantly show customers what’s out for signature and what is signed, when, and by whom.
  • Job Duties include:
  • Strong programming skills, particularly with Python.
  • Experience implementing Chef, Docker, Kubernetes, etc. in AWS and Azure
  • Enforce security controls including PCI-DSS, HIPAA, and SOC2.
  • Deliver infrastructure as code, automated wherever possible, for resources like DNS, log management, and code deployments
  • Participate in on-call pager rotation
  • Participate in the incident management process and serve as a war room manager
  • Assist in the creation and refinement of operational documentation
  • Manage our uptime and performance using service level indicators and objectives
  • Familiarity with Prometheus, Cortex, Grafana, New Relic, and Splunk
  • Our current stack: Java, Apache, Tomcat, Memcached, Qpid, Kubernetes and MySQL on Linux
  • Blue/Green deploys via Jenkins CI/CD pipelines and stack builder automation for infrastructure.
Python, Chef, Docker, Kubernetes, AWS, Azure, +13

Senior Site Reliability Engineer

Promoted

Sep 2017 - Jun 2021 · 3 yrs 9 mos

  • 1) Manage Hosted Adobe Products - Adobe Connect and Adobe Experience Manager Infrastructure and ensure reliability and high availability
  • 2) Deploy, maintain and monitor services and subsystems for products:
  • Adobe Connect (300+ servers), SQL Server DB servers, Adobe Media servers, Tomcat/Java Web App Servers, Universal Voice servers, and AEM Event Servers
  • 3) Security Adherence - Regular patching and emergency patching of systems, Implement a full set of Controls as outlined by the Common Control Framework
  • 4) DevOps - Participate in, and influence when necessary, engineering and PLT meetings; Jira (bugs) for engineering; enable engineering by providing reliable monitoring data and logs; automate builds and deployments. Technologies used daily: Windows, Linux (CentOS), Python, Salt, AEM, Java, SQL Server, MySQL, Node.js, Tomcat, Memcached, AWS, Azure, F5, Avi, Jira, Splunk, Nagios, Icinga, AD, DNS, SMTP, New Relic, PagerDuty, Slack, Jenkins, Artifactory, Git
  • 5) Automation and internal tools development - Use Salt and Python to build full automation and deployment tools; develop internal dashboards; identify redundant tasks and write utilities that automate them with minimal user intervention (Python, PHP, shell scripting, Jenkins, Artifactory, Vault, Terraform)
  • 6) Incident response and postmortems - Join war rooms and actively help mitigate incidents and bring the service back online in a timely manner; engage with other teams and escalate as necessary; hold incident RCA calls to discuss the problem and the steps that could prevent a recurrence
  • 7) Agile - Participate in Sprint planning, Participate in Sprint review meetings to optimize the process
  • 8) Escalations and On Call - Handle escalations from Support or Engineering, Troubleshoot and resolve technical issues, Ensure there are runbooks for all new alerts going to the GOC, Diagnose and solve problems and escalate when necessary
Adobe Connect, Adobe Experience Manager, SQL Server, Tomcat, Java, Security Controls, +8

Computer Scientist

Sep 2017 - Aug 2019 · 1 yr 11 mos

  • Manage Adobe Products - Adobe Connect, Adobe Primetime and Adobe Experience Manager - Smart Content Service Infrastructure and ensure reliability
  • Deploy, maintain and monitor services and subsystems for products:
  • A) Adobe Connect (300+ servers)
  • B) Adobe Primetime (500+ servers)
  • C) Smart Content Service (50+)
  • Security Adherence, DevOps, Incident Responses
  • Automation and internal tools development
  • A) Use Salt and Python development to build full automation and deployment tools.
  • B) Develop internal dashboards and applications to assist our engineering and support partners.
  • C) Identify redundant tasks and write utilities that automate them with minimal user intervention (Python, PHP, shell scripting)
  • Escalations and On Call
  • A) Handle escalations from Support or Engineering
  • B) Troubleshoot and resolve technical issues
  • C) Accept calls and Jira tickets from the Global Operations Center while on call
  • D) On-Call = 7 days x 12 hours a day in a rotation with other team members
  • E) Ensure there are runbooks for all new alerts going to the GOC
  • F) Diagnose and solve problems and escalate when necessary.
Adobe Connect, Adobe Experience Manager, Salt, Python, Automation, Incident Response, +3

Senior Enterprise Product Consultant - AEM

Promoted

Nov 2014 - Sep 2017 · 2 yrs 10 mos

  • Worked as a Senior Enterprise Support Engineer at Adobe Systems on the AEM (Adobe Experience Manager) platform. Certified Developer, Advanced Developer, and Architect for Adobe Experience Manager
  • 1) First point of escalation for complex issues and concerns relating to technical aspects of AEM.
  • 2) Provide timely responses and resolutions to complex technical issues and product inquiries.
  • 3) Set customers up for success and help ensure projects stay on track
  • 4) Drive customer experience improvements through timely service reviews
  • 5) Record and document all customer issues within established process guidelines
  • 6) Troubleshoot and qualify cases before escalating to Engineering
  • 7) Provide on-site assistance as needed to resolve product issues (minimal) and give proactive guidance to designated contacts to meet client milestone objectives
  • 8) Product content creation (KB articles, whitepapers, forum participation, AEM readiness)
  • 9) Contributed to product bug fixes and Adobe consulting service tools.
AEM, Technical Support, Customer Success, Product Content Creation, Adobe Experience Manager

AEM Technical Consultant

May 2012 - Nov 2014 · 2 yrs 6 mos

  • Worked as a CS engineer at Adobe Systems on Day content management products: Communique CQ5.x, CRX 1.x and 2.x, CRXDE, DAM, ADEP, WEM, Apache Jackrabbit, Apache Sling, and OSGi
  • Provided Adobe CQ5 technical support to clients for all kinds of product-related issues, e.g. architecture, infrastructure, functionality, development, integration, and migration
  • Analyzed critical issues to provide RCAs and took corrective measures to avoid recurrence of similar issues
  • Identified areas that required patches and upgrades to fix vulnerabilities
  • Improved the customer experience by blogging, tweeting, and sharing knowledge base articles and important fixes and releases
CQ5, CRX, DAM, Apache Sling, Integration, RCA, +1

Helios & Matheson

Senior Software Engineer

Nov 2011 - Apr 2012 · 5 mos · Noida Area, India

  • Onsite at Adobe Systems, working as a Technical Consultant for Adobe CQ5.

Tata Consultancy Services

Systems Engineer

Dec 2008 - Oct 2011 · 2 yrs 10 mos · Lucknow Area, India

  • Roles and responsibilities of Configuration Management, Deployment Lead, Application Maintenance and Support.
  • Web Application Development on JSF framework
  • Framework Code Integration with Application Code
  • ADS Authentication
  • Deployment scripts to deploy the WebLogic EAR
  • Build XML script to build the EAR
  • Application Env Support - Development/UAT/Perf/Testing/Prod
  • Load Monitoring
  • WPS Server and Adapter Migration
  • SAML Implementation for Single Sign On
  • Build Release and deployment in Production
  • Performance Testing of Application
  • Security Testing of Application
JSF, WebLogic, SAML, Performance Testing, Security Testing, Web Application Development

Education

SRMS College of Engg. & Tech, Bareilly

Bachelor of Technology (B.Tech.) — Computer Science

Jul 2004 - Jun 2008

UPKSS Public School

Foundation degree — Mathematics and Computer Science

Jan 1989 - Jan 2003
