Paul Whitted

Product Manager

23 yrs 7 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • 20 years of experience in technology and healthcare sectors.
  • Expert in incident and crisis management.
  • Proven track record in building high-performing technical teams.
Stackforce AI infers this person is a seasoned leader in Infrastructure and Operations within the Technology and eCommerce sectors.

Skills

Core Skills

Technical Leadership · Incident Management · IT Operations · High Availability · Systems Management · Network Operations · Technical Support · Network Management · Deployment Management · Monitoring · System Management · Network Security · Customer Support

Other Skills

Crisis Management · Traffic Shaping · Windows · Apache · Linux · Multi Team Management · eCommerce Applications · Datacenter Infrastructure · 24x7 team management · Performance Management · Team Management · Troubleshooting · Network Monitoring · Data Center · Servers

About

QUALIFIED BY 20 years of technical experience in the Technology, Healthcare and Education business sectors. A successful background in building and growing high-performing technical teams that deliver value to the business. A technical, hands-on leader specializing in keeping large-scale, high-transaction production websites up and running.

EXPERTISE • Incident Management • Crisis Management • Technical Leadership • Traffic Shaping • Monitoring • Windows • Apache • Nagios • Linux • Multi Team Management • eCommerce Applications • Datacenter Infrastructure

Experience

23 yrs 7 mos
Total Experience
3 yrs 2 mos
Average Tenure
8 yrs 9 mos
Current Experience

Facebook

Manager - Production Engineering

Sep 2017 - Present · 8 yrs 9 mos · Menlo Park, California

  • Support and lead managers and engineers working on Facebook's products and services, at different layers of the stack, on challenges related to scalability, reliability, performance and efficiency of systems
  • Understand and contribute to technical architectures, capacity plans, tooling needs, automation plans, product launch plans and create comprehensive plans for prioritizing technical and resourcing challenges
  • Drive technical architecture discussions, even on subjects you haven't had direct experience working with
  • Develop lasting partnerships with product management, program management, network engineering, software engineering and other related groups to build and improve our ever-growing large-scale distributed infrastructure and product environment
  • Empower engineers to develop their careers, matching their strengths with projects tailored to their skill levels, long-term skill development, personalities, and work styles
  • Help build and enrich an inclusive work environment comprised of people from diverse backgrounds
  • Assess employee performance on an ongoing basis, address under-performance, and recognize and promote performance
  • Work closely with dedicated recruiting staff to expand the team including interviewing candidates, participating in conferences/events, and on-boarding new employees
  • Balance the need to “keep things running” with allocating time to long-term, high-impact projects
Incident Management · Crisis Management · Technical Leadership · Traffic Shaping · Monitoring · Windows · +5

Formation Data Systems

Advisor

Apr 2015 - Apr 2017 · 2 yrs

  • Participate on the technical advisory council helping Formation Data Systems execute on their vision for what Enterprise Storage should be.

Netflix

Manager Enterprise Operations & Support

Feb 2015 - Feb 2017 · 2 yrs · Los Gatos

  • Responsible for leading a 24/7 team in our Los Gatos headquarters while overseeing the Ops support for our Corporate Infrastructure, Billing and Payments real-time APIs, Lab Engineering, Media Pipeline and Enterprise Platform teams.
  • Recruited and developed a high-performance team to support and enhance IT Operations.
  • Responsible for scheduling and maintaining the right balance of staffing for the 24/7 engineering support team.
  • Develop, monitor, and assess Enterprise Ops platforms, programs and services.
  • Engage the software engineers to understand upcoming features and coordinate appropriate monitoring/insight/alerting prior to deployment.
  • Own monitoring & alert configuration to detect, triage and resolve issues quickly.
  • Enable team to take charge of outages, lead calls until they are resolved, and make sure the root cause has been found and fixed, while closing the case as the lead on an Incident Review.
  • Articulate and implement technical support without unnecessary processes and procedures.
24x7 team management · IT Operations · Monitoring · Incident Management

eBay

3 roles

Sr. Manager - Site Engineering Center

Promoted

Apr 2012 - Nov 2014 · 2 yrs 7 mos

  • Responsible for attaining high availability (99.94% goal), performance and reliability of the eBay Marketplaces website (ebay.com). Managing the operations of large-scale systems to maintain high uptime and availability. Manage multiple teams across two states as well as one internationally (magento.com), 24x7x365. My teams are charged with the management of the front-end infrastructure of eBay.com and its subsidiaries as well as the incident management and resolution of all major site issues.
  • Serve as a proactive mentor, teacher and problem solver
  • 24x7 incident and systems management
  • Executive Duty Officer rotation – Senior executive communication and PR liaison.
  • Systems include all production databases (Oracle, MySQL, Mongo, Cassandra), Unix/Linux servers, Hadoop systems, VMs, Windows, Siebel, network assets and load balancers: tens of thousands of systems in total.
  • Solely responsible for the Time To Restore KPI, which had a goal of less than 15 minutes.
High Availability · Performance Management · Incident Management · Systems Management

Manager Site Operations

Oct 2011 - Apr 2012 · 6 mos

  • I managed the Technical Duty Officer team (10 directs) who oversaw incident management for eBay.com as well as driving resolution on site issues.
  • eBay's Site Engineering Center is responsible for attaining high availability, performance and reliability of the eBay Marketplaces website. The SE-center manages operations on large-scale systems to maintain high uptime. Responsible for 24x7 Incident and Systems Management.
  • Systems include all production databases (Oracle, MySQL, Mongo, Cassandra), Unix/Linux servers, Hadoop systems, VMs, Windows, Siebel, network assets, load balancers, etc.
Incident Management · Systems Management · Team Management

Principal Technical Duty Officer

Aug 2010 - Oct 2011 · 1 yr 2 mos

  • I was responsible for the resolution of all site-impacting events as well as incident management during events. I served as the technical leader in the room and the decision point for all repair paths and actions during a crisis. Keeping the impact to customers to a minimum and ensuring the uptime of the site to protect the business was my specialty.
Incident Management · Crisis Management · Technical Leadership

Akamai Technologies

Network Operations Engineer

Feb 2010 - Aug 2010 · 6 mos

  • Responsible for ensuring uptime and performance of the Akamai ADS Network. Offered tier 3 support (last line) for the platform ensuring the availability to serve customers and the business on a 24x7x365 basis. Accountable for all technical issues related to the vertical internal business unit, which consisted of the entire ADS technology stack.
  • Responsible for physical hardware spread across multiple Datacenters and regions / zones.
  • Responsible for the network and the application itself, processing over 60 million transactions every hour, 24x7.
  • Responded to all alerts generated by the alerting platform, Nagios, as well as application-specific performance alerts
  • Managed the internal ticket queue as well as project updates for the department
  • Utilized thousands of Dell servers as well as proprietary hardware across a Cisco-powered network, with a custom application running on a custom Akamai build of CentOS 4.6 with Apache/Tomcat.
Network Operations · Technical Support · Monitoring

VMware

VNoc Engineer - Contract

Dec 2009 - Feb 2010 · 2 mos

  • Responsible for transitioning the VMware NOC from Palo Alto to Broomfield, CO. The project was delivered on time and on budget with no disruption to the business. A strong and capable team was created and trained in the new Colorado location.
  • Responsible for identifying, troubleshooting and resolving IT production network, system and application related problems reported through Zenoss or CA Spectrum monitoring systems, trouble tickets or through the service desk.
  • Rapidly responded to all alerts, restoring or escalating service/application issues per established SLAs
  • Facilitated IT conference calls during production outages, providing accurate written and verbal communication in a timely and consistent manner
  • Tracked all planned and unplanned outages through trouble ticketing systems and reporting tools
  • Participated in change management and post mortem process to ensure operational consistency and service improvement
  • Executed operational processes based on run book and SOP documentation
  • Configured and maintained network/systems management tools
Network Management · Troubleshooting · Incident Management

Netflix

NOC Engineer

Jan 2007 - Jun 2009 · 2 yrs 5 mos

  • Responsible for the successful deployment of all business applications, from the website code to the major database code updates and all other applications that were business critical. Worked as part of the 24x7x365 IT Operations team within the Network Operations Center.
  • Managed a highly visible, high traffic website and ensured site availability and escalation paths for critical issues
  • Delivered on a bi-weekly code deployment project with a 15-minute downtime window, which limited downtime to highly used services
  • Utilized BigBrother monitoring system and other homegrown monitors for alerting and proactive resolution of issues
  • Performed troubleshooting and resolution across a wide spectrum of systems, including Linux/Unix and Windows servers and the applications running on these systems
  • Experienced with Tomcat, Apache, IIS and SharePoint technologies
  • Organized and facilitated the weekly planning meeting for deployments and set up the run sheets to ensure smooth rollouts to internal and external customers
  • Provided ongoing guidance and training to employees inside and outside of the department
Deployment Management · Monitoring · Troubleshooting

Sutter Health

Systems Support Tech

Jan 2004 - Jan 2006 · 2 yrs · Santa Cruz

  • Responsible for daily system management and operation of all LANs, WANs and telecommunication software and hardware.
  • Configured, maintained and troubleshot end user workstations, departmental servers and phone systems
  • Rapidly responded to all incoming email and phone support requests
  • Managed all network assets
  • Prepared and updated operational documentation and FAQs that are used by technical staff to perform tasks
  • Maintained and increased network security with regard to HIPAA by utilizing multiple tools
  • Performed daily operational tasks and troubleshooting of VMS Unix system
  • Performed monthly and emergency patch management per security incidents
System Management · Network Security · Troubleshooting

Navisite

NOC Specialist / Supervisor / CSE

Feb 2000 - Aug 2003 · 3 yrs 6 mos

  • Responsible for resolving all customer issues whether it was software, hardware or user related. Answered incoming phone calls and emails from highly technical engineers, customers and partners. As a supervisor, I oversaw up to twelve employees, making sure the nine defined roles we had were adequately staffed and properly monitored. I worked with a broad range of technologies including but not limited to:
  • IIS and Apache web servers, Cisco hardware (local directors, switches, routers), firewalls, an assortment of monitoring tools (HPOV, BMC Patrol, Netcool, ISM URL monitoring) as well as the Remedy ticketing program.
  • Monitored a network of 5,000 servers (NT, Unix and Linux) and various storage arrays
  • Monitored networking devices such as routers, hubs and switches
  • Monitored all access to datacenter
  • Responsible for the management and resolution of all emergency situations that arose
  • Defined NOC roles and built rapport with customers
  • Handled customer issues escalated by my staff
  • Resolved any internal problems with regard to day-to-day operations of the datacenter
Customer Support · Network Monitoring · Incident Management
