Shailendra Kumar — Co-Founder

Seasoned senior-level DevOps/SRE professional who favors challenging situations that call for technical problem-solving skills in service of organizational goals and values. Currently helping teams adopt chaos engineering practices.

Specialities:

Virtualization and Cloud:
- AWS: EC2, RDS, EBS, etc.
- OpenStack (open source and Mirantis): Cinder, Swift, Nova, Neutron, Horizon, Glance, Sahara (Elastic Map Reduce)
- KVM, Xen, Vagrant, and Oracle VirtualBox
- Docker, containers, CoreOS, fleet, etcd, Kubernetes

Deployment/Config Management Tools:
- Ansible, Salt

Version Control: Git

Programming Languages and Scripting:
- Java, Groovy, Shell, Python, and Perl

Storage:
- HP 2000
- GlusterFS, LVM

Operating Systems:
- CentOS 6.x, CentOS 7.x, and Ubuntu

Administration:
- Linux, storage, and Hadoop administration (vanilla and CDH)

CI & CD:
- Jenkins, CloudBees CD/RO

Monitoring:
- Zabbix, Nagios, Grafana, Prometheus and Alertmanager, New Relic, Splunk

Ticketing:
- Remedy, ITSM, Jira, Salesforce, Change Management

ePortfolio: https://eportfolio.greatlearning.in/shailendra-kumar

I have gained expertise in designing, building, and maintaining large-scale, performant, and resilient systems and infrastructure. I have worked on a wide array of projects, including deprecating legacy systems, migrating to new architectures, and creating new architectures from the ground up. In my current role, I am also responsible for close communication between technology, product, and business teams, sprint planning and retrospectives, code reviews, building delivery plans, mentoring engineers, and participating in the hiring process.

I can be reached at emailtoshailendra@gmail.com

Stackforce AI infers this person is a DevOps/SRE expert with a focus on cloud infrastructure and automation.

Location: Noida, Uttar Pradesh, India

Experience: 15 yrs 1 mo

Skills

Automation
Cloud Computing
Hadoop Administration
Data Management
Support Engineering
Software Development

Career Highlights

Expert in chaos engineering practices.
Proficient in cloud infrastructure management.
Strong background in automation and scripting.

Work Experience

Reliability System LLP

Co-Founder (3 yrs 1 mo)

Adobe

Computer Scientist II (SDE IV)/SRE (9 mos)

Computer Scientist (SDE III) (4 yrs 11 mos)

Guavus

Lead Engineer (3 yrs)

Amazon

Support Engineer III (1 yr 1 mo)

CGI

Associate Software Engineer (2 yrs 3 mos)

Education

UCLA PGP Pro at UCLA Anderson School of Management

Post Graduate Program in Cloud Computing at Great Lakes Institute of Management Gurgaon

Bachelor of Technology (B.Tech.) at UP Technical University - NIET Greater Noida

Shailendra Kumar

Co-Founder

Noida, Uttar Pradesh, India · 15 yrs 1 mo experience

Highly Stable

Contact

emailtoshailendra@gmail.com LinkedIn

Skills

Core Skills

Automation, Cloud Computing, Hadoop Administration, Data Management, Support Engineering, Software Development

Other Skills

ANT, AWS, Amazon EKS, Amazon S3, Ansible, Bash, Cloud Computing IaaS, Database connectivity, Databases, Django, Docker, FastAPI, Flask, Git, Grafana

About

Experience

Total Experience: 15 yrs 1 mo

Average Tenure: 3 yrs

Current Experience: 3 yrs 1 mo

Reliability System LLP

Co-Founder

May 2023 – Present · 3 yrs 1 mo · Noida, Uttar Pradesh, India · On-site

Led the growth and expansion of Reliability System LLP through strategic planning, sales, and marketing efforts.
Developed and implemented innovative marketing strategies to increase brand visibility.
Successfully scaled the company by hiring and training new talent to support business objectives.

Adobe

2 roles

Computer Scientist II (SDE IV)/SRE

Aug 2022 – May 2023 · 9 mos

Computer Scientist (SDE III)

Sep 2017 – Aug 2022 · 4 yrs 11 mos

Writing Shell and Python scripts for automation.
Writing Terraform templates and modules to create and manage resources in AWS.
Working with configuration management tools such as SaltStack.
Managing several backend microservices (onboarding, creating pipelines, implementing monitoring, etc.).
Leading chaos engineering efforts across teams.
Implementing cost optimization measures for AWS resources.
Deploying and managing services in Kubernetes.
Managing compliance and security of applications and infrastructure.
Preparing, updating, and testing disaster recovery procedures.
Conducting daily standups, sprint planning, and review meetings for multiple scrum teams.
Onboarding and decommissioning projects.

Shell, Python, Terraform, SaltStack, Kubernetes, AWS, +2

Guavus

Lead Engineer

Sep 2014 – Sep 2017 · 3 yrs · Gurgaon, Haryana, India

Deployment of Guavus Analytics at Tier 1/Tier 2 ISPs in the US.
On-site support for the critical upgrade from Hadoop to Hadoop YARN.
Hadoop software installation and port configuration.
Configuring NameNode high availability.
Hadoop cluster software patching and upgrades.
Database connectivity for the Hadoop cluster.
HDFS management and monitoring.
HDFS support and maintenance.
Cluster maintenance, including creation, addition, and removal of DataNodes or NameNodes.
Managing and reviewing Hadoop/Oozie log files.
Configuring, running, and troubleshooting Oozie MR jobs.
Module-wise data validation covering raw flow received, data dropped, annotated data, and data showing up on the UI.
Performance testing in the staging environment to verify system performance, UI performance, and module-wise functionality.
Modularizing data integrity and data validation practices.
UAT with customers.
Interacting with sales, solutions architects, and customers during E2E deployment activities.
Working closely with Product Development, Product Managers, and other stakeholders on bugs and issues that require deep expertise.
Performing deep dives into both systemic and latent reliability issues; partnering with software and systems engineers across the organization to produce and roll out fixes.
Customer-facing role: interacting with customers and resolving the issues they raise.
Writing MOPs, SOPs, and DRPs.
Applying in-depth analysis to Hadoop-based workloads and project-based work, designing solutions to issues, and evaluating their effectiveness.

Hadoop, Oozie, Database connectivity, HDFS management, Hadoop Administration, Data Management

Amazon

Support Engineer III

Aug 2013 – Sep 2014 · 1 yr 1 mo · Hyderabad Area, India

I write Unix shell scripts, perform root-cause analysis for business-critical issues, and implement solutions for them.
I work with the following technologies:
1. Unix shell scripts, Perl, SQL
2. Java, JSP, and web services
3. Cloud computing
Version control systems: Git, Perforce
Build tool: ANT
Development tool: Eclipse
Role:
I am part of the Fulfilment Center software team, which is responsible for handling different services for Amazon warehouses. These services are mission critical: any issue with them can have a huge impact on customer orders and therefore on the business. We need to make sure that these services are available 24x7.
My current roles and responsibilities are as follows:
Handling tickets cut for any service-related issues and deep diving into each issue to find the root cause.
Troubleshooting to fix the issue once the root cause is identified.
Involving other teams to get the issue fixed when it is related to them.
Joining the conference call for any high-severity issue and coordinating with different teams to identify the issue.
Writing shell scripts for creating internal reports.
Writing tools to reduce operational burden.
Performing deployments for different services with active monitoring to keep watch on the services' health.
Escalating issues to the developers of the services when any anomalies are found during deployment.
Interviewing candidates for different openings related to my domain.
Hardware planning and EC2 migration.
Creating and maintaining Auto Scaling groups and maintaining Availability Zone redundancy.
Planning and designing fault-tolerant, high-availability clusters.
Handling bad datacenter and bad rack distribution risks.
Performing regular infrastructure audits.

Unix Shell Scripts, Perl, SQL, Java, Web services, Support Engineering, +1

CGI

Associate Software Engineer

Apr 2011 – Jul 2013 · 2 yrs 3 mos · Bangalore

I wrote Unix shell scripts to automate manual work. I just loved writing Unix shell scripts.
I worked on the following projects at CGI.
Network Mediation by MetaSolv:
Project description: This is a customized application built on top of the MetaSolv application (a mediation product from Oracle). In this project we processed HSPA data received from switches (network elements). After processing the CDRs, we created DB files for bulk loading and BSI files (billing files), and sent them to downstream counterparts such as Amdocs.
Roles & Responsibility:
Wrote custom scripts to pull files from the switches and, after processing, send them downstream.
Installed the application, created the node manager and node chains, and set up the application on new servers.
Wrote and maintained scripts in Unix shell and PL/SQL to bulk load the files and send bulk load reports by email.
Designed a custom alerting system using Unix shell scripts and Perl scripts.
Created reports using scripts.
OMC 6.0 Upgrade:
Roles & Responsibility:
Prepared the build using ANT.
Deployed the application in the performance test environment.
Participated in the performance test result review.
Set up bulk load jobs on the database server.
Participated in the parallel run.
Deployed the application in production.
Participated in post-deployment activities.
Message Process System:
Roles & Responsibility:
Worked as a developer to turn the requirements into an application.
Designed custom scripts and tools for fetching files from switches, parsing them, and sending them for processing.
DRP (Disaster Recovery Process):
Roles & Responsibility:
Sketched the DRP plan according to the situation.
Tested the prepared plan against different possible situations.
I enjoy automating manual processes and setting up alerting systems with automated scripts and cron jobs to help the application team identify issues in time. I deployed various CRs in other projects on the floor.