Karthik Kumar

SRE (Site Reliability Engineer)

United Kingdom · 18 yrs 5 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • Expert in automation and infrastructure management.
  • Proven track record in datacenter provisioning.
  • Strong experience with monitoring and CI/CD systems.
Stackforce AI infers this person is a SaaS Infrastructure Engineer with strong automation and monitoring expertise.

Skills

Core Skills

Automation, Configuration Management, Infrastructure Management, Monitoring

Other Skills

AWS, Amazon Web Services (AWS), Bash, Cassandra, Chef, Cluster, DHCP, DNS, Data Center, DevOps, Disaster Recovery, Domain Name System (DNS), F5 BigIP, Hadoop, High Availability

Experience

Total Experience: 18 yrs 5 mos
Average Tenure: 2 yrs 10 mos
Current Experience: 7 yrs 5 mos

Confluent

SRE

Jan 2019 - Present · 7 yrs 5 mos · London, United Kingdom

SAP

SRE (Altiscale acquired)

Nov 2016 - Sep 2018 · 1 yr 10 mos · Bengaluru Area, India

  • Systems and network engineer

Altiscale

SRE

May 2015 - Sep 2018 · 3 yrs 4 mos

Walmart Labs

Advanced Systems Engineer

Sep 2013 - May 2015 · 1 yr 8 mos · Bangalore

  • Chef
      • Writing wrapper cookbooks and cookbooks for Solaris
      • Planning, organizing and implementing cookbooks and roles
      • Setting up Chef environments for testing cookbooks and moving them to production
  • Logstash
      • Setting up clustered Logstash, Elasticsearch and Redis, all behind nginx
      • Using syslog and rsyslog to forward logs, with Chef cookbooks to manage syslog/rsyslog
  • Rundeck
      • Setting up clustered Rundeck and managing users
      • Creating jobs for:
          a. App releases
          b. System / firmware upgrades
  • Infoblox
      • Managing DNS entries and zones
      • IPAM
      • Using the Infoblox Python API to integrate with provisioning systems and other tools
  • Development
      • Designing and developing custom tools with Python, Cassandra, InfluxDB, jQuery and Bootstrap
      • Dashboard for Nagios monitoring status
  • Administration
      • Linux and Solaris
      • SAN
      • DNS, DHCP, LDAP
Chef, Logstash, Rundeck, Infoblox, Linux, Solaris, +2

Salesforce

Operations engineer (Dimdim acquired)

Jan 2011 - Aug 2013 · 2 yrs 7 mos · Hyderabad, India

  • Continuous Integration
      • A fully automated system to poll the Perforce SCM, then build, test and deploy using Jenkins and Rundeck
  • Nagios
      • Setup, monitoring, weekly analysis and capacity planning
      • Writing custom plugins for monitoring the Terracotta JVM
  • Splunk
      • Setting up clustered Splunk and forwarding logs
      • Gathering requirements from Dev, PMs and the Support team, and creating Splunk dashboards for:
          • Troubleshooting
          • Analyzing trends in app usage, response time and so on
  • Load Balancer
      • Configuring load balancer VIPs, pools, nodes and SNAT, and setting up monitors
      • Using the F5 Python API to configure the load balancer programmatically
  • Server Provisioning
      • Automating the provisioning of hundreds of servers using Razor, based on policies hooked up to DHCP and DNS (Infoblox)
      • Managing applications with Puppet after OS installation
  • Deploying, troubleshooting and monitoring big data components including Hadoop, HBase, coprocessor, otsdb and sauron. [Beginner]
Nagios, Splunk, Load Balancer, Server Provisioning, Monitoring, Automation

Dimdim

Member of Technical Staff. (Operations)

Dec 2009 - Jan 2011 · 1 yr 1 mo

  • Provisioning Dimdim’s new datacenter from scratch
      • Planning a brand new datacenter along with datacenter architects
      • Kickstarting, network setup and various other services such as Kerberos, LDAP, DNS, DHCP, PXE, FTP, SNMP, NFS, yum, logrotate, cron, NTP and SMTP
  • Managing legacy AWS machines (DR and backups)
      • Setting up DR AWS instances cost-effectively
      • Backing up LDAP and MySQL dumps to Amazon S3
  • Writing tools (using Perl, Bash, awk) for:
      • Datacenter utilization and performance analysis
      • JVM and garbage collection monitoring
      • Log analysis
  • Controltier (deployment automation, Bash)
      • Implemented deployment automation at Dimdim
      • Configuring various products’ dependencies and deployment flow
      • Zero downtime by integrating with the load balancer
      • Implemented Continuous Integration
  • Data migration across datacenters (Perl)
      • Dimdim moved its user management to LDAP. We were responsible for migrating the data from MySQL to LDAP.
      • Release-specific data migrations
  • Implementing Java clustering with Terracotta
  • MySQL and LDAP administration
      • Troubleshooting
      • Daily backups
      • Disaster recovery
      • Clustering
  • Nginx
      • Managing nginx as a reverse proxy and as a web server for static content
  • Monitoring, troubleshooting, system administration and network management
AWS, Perl, Bash, Nginx, Infrastructure Management, Automation

Computer Sciences Corporation

Associate Software Engineer

Jul 2007 - Nov 2009 · 2 yrs 4 mos

  • Web Developer.

Education

Anna University Chennai

B.E — Computer Science

Jan 2003 - Jan 2007
