Paul Whitted

Product Manager

23 yrs 7 mos experience

Most Likely To Switch · Highly Stable

Key Highlights

20 years of experience in technology and healthcare sectors.
Expert in incident and crisis management.
Proven track record in building high-performing technical teams.

Stackforce AI infers this person is a seasoned leader in Infrastructure and Operations within the Technology and eCommerce sectors.

Skills

Core Skills

Technical Leadership • Incident Management • IT Operations • High Availability • Systems Management • Network Operations • Technical Support • Network Management • Deployment Management • Monitoring • System Management • Network Security • Customer Support

Other Skills

Crisis Management • Traffic Shaping • Windows • Apache • Linux • Multi Team Management • eCommerce Applications • Datacenter Infrastructure • 24x7 team management • Performance Management • Team Management • Troubleshooting • Network Monitoring • Data Center • Servers

About

QUALIFIED BY 20 years of technical experience in the Technology, Healthcare and Education business sectors. A successful background in building and growing high-performing technical teams that deliver value to the business. A technical, hands-on leader specializing in keeping large-scale, high-transaction production websites up and running.

EXPERTISE • Incident Management • Crisis Management • Technical Leadership • Traffic Shaping • Monitoring • Windows • Apache • Nagios • Linux • Multi Team Management • eCommerce Applications • Datacenter Infrastructure

Experience

23 yrs 7 mos

Total Experience

3 yrs 2 mos

Average Tenure

8 yrs 9 mos

Current Experience

Facebook

Manager - Production Engineering

Sep 2017 – Present · 8 yrs 9 mos · Menlo Park, California

Support and lead managers and engineers working on Facebook's products and services, at different layers of the stack, on challenges related to scalability, reliability, performance and efficiency of systems
Understand and contribute to technical architectures, capacity plans, tooling needs, automation plans, product launch plans and create comprehensive plans for prioritizing technical and resourcing challenges
Drive technical architecture discussions, even on subjects you haven't had direct experience working with
Develop lasting partnerships with product management, program management, network engineering, software engineering and other related groups to build and improve our ever-growing large-scale distributed infrastructure and product environment
Empower engineers to develop their careers, matching their strengths with projects tailored to their skill levels, long-term skill development, personalities, and work styles
Help build and enrich an inclusive work environment comprised of people from diverse backgrounds
Assess employee performance on an ongoing basis, address under-performance, and recognize and promote performance
Work closely with dedicated recruiting staff to expand the team including interviewing candidates, participating in conferences/events, and on-boarding new employees
Balance the need to “keep things running” with allocating time to long-term, high-impact projects

Incident Management • Crisis Management • Technical Leadership • Traffic Shaping • Monitoring • Windows • +5

Formation Data Systems

Advisor

Apr 2015 – Apr 2017 · 2 yrs

Participated on the technical advisory council, helping Formation Data Systems execute on its vision for what enterprise storage should be.

Netflix

Manager Enterprise Operations & Support

Feb 2015 – Feb 2017 · 2 yrs · Los Gatos

Responsible for leading a 24/7 team in our Los Gatos headquarters while overseeing Ops support for our Corporate Infrastructure, Billing and Payments real-time APIs, Lab Engineering, Media Pipeline and Enterprise Platform teams.
Recruited and developed a high-performance team to support and enhance IT Operations.
Scheduled and maintained the right balance of staffing for the 24/7 engineering support team.
Developed, monitored and assessed Enterprise Ops platforms, programs and services.
Engaged software engineers to understand upcoming features and coordinated appropriate monitoring, insight and alerting prior to deployment.
Owned monitoring and alert configuration to detect, triage and resolve issues quickly.
Enabled the team to take charge of outages, lead calls until resolution, ensure the root cause was found and fixed, and close each case as the lead on an Incident Review.
Articulated and implemented technical support without unnecessary processes and procedures.

24x7 team management • IT Operations • Monitoring • Incident Management

eBay

3 roles

Sr. Manager - Site Engineering Center

Promoted

Apr 2012 – Nov 2014 · 2 yrs 7 mos

Responsible for attaining high availability (99.94% goal), performance and reliability of the eBay Marketplaces website (ebay.com). Managed the operations of large-scale systems to maintain high uptime and availability. Managed multiple teams across two states, plus one international team (magento.com), 24x7x365. My teams were charged with managing the front-end infrastructure of eBay.com and its subsidiaries, as well as incident management and resolution of all major site issues.
Serve as a proactive mentor, teacher and problem solver
24x7 incident and systems management
Executive Duty Officer rotation – Senior executive communication and PR liaison.
Systems included all production databases (Oracle, MySQL, Mongo, Cassandra), Unix/Linux servers, Hadoop systems, VMs, Windows, Siebel, network assets and load balancers, totaling tens of thousands of systems.
Solely responsible for the Time To Restore KPI, which had a goal of less than 15 minutes.

High Availability • Performance Management • Incident Management • Systems Management

Manager Site Operations

Oct 2011 – Apr 2012 · 6 mos

I managed the Technical Duty Officer team (10 direct reports), which oversaw incident management for eBay.com and drove resolution of site issues.
eBay's Site Engineering Center is responsible for attaining high availability, performance and reliability of the eBay Marketplaces website. The SE-center manages operations on large-scale systems to maintain high uptime. Responsible for 24x7 incident and systems management.
Systems included all production databases (Oracle, MySQL, Mongo, Cassandra), Unix/Linux servers, Hadoop systems, VMs, Windows, Siebel, network assets, load balancers, etc.

Incident Management • Systems Management • Team Management

Principal Technical Duty Officer

Aug 2010 – Oct 2011 · 1 yr 2 mos

I was responsible for the resolution of all site-impacting events as well as incident management during those events. I served as the technical leader in the room and the decision point for all repair paths and actions during a crisis. My specialty was keeping customer impact to a minimum and ensuring site uptime to protect the business.

Incident Management • Crisis Management • Technical Leadership

Akamai Technologies

Network Operations Engineer

Feb 2010 – Aug 2010 · 6 mos

Responsible for ensuring uptime and performance of the Akamai ADS Network. Offered tier 3 support (last line) for the platform ensuring the availability to serve customers and the business on a 24x7x365 basis. Accountable for all technical issues related to the vertical internal business unit, which consisted of the entire ADS technology stack.
Responsible for physical hardware spread across multiple Datacenters and regions / zones.
Responsible for the network and the application itself, processing over 60 million transactions every hour, 24x7.
Responded to all alerts generated by the alerting platform, Nagios, as well as application-specific performance alerts
Managed the internal ticket queue as well as project updates for the department
Utilized thousands of Dell servers as well as proprietary hardware across a Cisco-powered network, with a custom application running on a custom Akamai build of CentOS 4.6 with Apache/Tomcat.

Network Operations • Technical Support • Monitoring

VMware

VNoc Engineer - Contract

Dec 2009 – Feb 2010 · 2 mos

Responsible for transitioning the VMware NOC from Palo Alto to Broomfield, CO. The project was delivered on time and on budget with no disruption to the business. A strong and capable team was created and trained in the new Colorado location.
Responsible for identifying, troubleshooting and resolving IT production network, system and application related problems reported through Zenoss or CA Spectrum monitoring systems, trouble tickets or through the service desk.
Rapidly responded to all alerts, restoring or escalating service/application issues per established SLAs
Facilitated IT conference calls during production outages, providing accurate written and verbal communication in a timely and consistent manner
Tracked all planned and unplanned outages through trouble ticketing systems and reporting tools
Participated in the change management and postmortem processes to ensure operational consistency and service improvement
Executed operational processes based on runbook and SOP documentation
Configured and maintained network/systems management tools

Network Management • Troubleshooting • Incident Management

Netflix

NOC Engineer

Jan 2007 – Jun 2009 · 2 yrs 5 mos

Responsible for the successful deployment of all business applications, from the website code to the major database code updates and all other applications that were business critical. Worked as part of the 24x7x365 IT Operations team within the Network Operations Center.
Managed a highly visible, high traffic website and ensured site availability and escalation paths for critical issues
Delivered a bi-weekly code deployment project within a 15-minute downtime window, limiting downtime for highly used services
Utilized BigBrother monitoring system and other homegrown monitors for alerting and proactive resolution of issues
Performed troubleshooting and resolution across a wide spectrum of systems, including Linux/Unix and Windows servers and the applications running on them
Experienced with Tomcat, Apache, IIS and SharePoint technologies
Organized and facilitated weekly planning meetings for deployments and set up the run sheets to ensure smooth rollouts to internal and external customers
Provided ongoing guidance and training to employees inside and outside of the department

Deployment Management • Monitoring • Troubleshooting

Sutter Health

Systems Support Tech

Jan 2004 – Jan 2006 · 2 yrs · Santa Cruz

Responsible for daily system management and operation of all LANs, WANs and telecommunication software and hardware.
Configured, maintained and troubleshot end-user workstations, departmental servers and phone systems
Rapidly responded to all incoming email and phone support requests
Managed all network assets
Prepared and updated operational documentation and FAQs used by technical staff to perform tasks
Maintained and increased network security with regard to HIPAA by utilizing multiple tools
Performed daily operational tasks and troubleshooting of VMS Unix system
Performed monthly and emergency patch management per security incidents

System Management • Network Security • Troubleshooting

Navisite

NOC Specialist / Supervisor / CSE

Feb 2000 – Aug 2003 · 3 yrs 6 mos

Responsible for resolving all customer issues, whether software, hardware or user related. Answered incoming phone calls and emails from highly technical engineers, customers and partners. As a supervisor, I oversaw up to twelve employees, making sure our nine defined roles were adequately staffed and properly monitored. I worked with a broad range of technologies, including but not limited to:
IIS and Apache web servers, Cisco hardware (local directors, switches, routers), firewalls, an assortment of monitoring tools (HPOV, BMC Patrol, Netcool, ISM URL monitoring) and the Remedy ticketing system.
Monitored a network of 5,000 servers (NT, Unix and Linux) and various storage arrays
Monitored networking devices such as routers, hubs and switches
Monitored all access to the datacenter
Responsible for the management and resolution of all emergency situations that arose
Defined NOC roles and built rapport with customers
Handled customer issues escalated by my staff
Resolved any internal problems with regard to day-to-day operations of the datacenter

Customer Support • Network Monitoring • Incident Management