David Ponessa

SRE (Site Reliability Engineer)

Amsterdam, North Holland, Netherlands · 19 yrs 5 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • Expert in Site Reliability Engineering and DevOps practices.
  • Proficient in large scale data processing and cloud infrastructure.
  • Strong background in Linux systems administration and automation.
Stackforce AI infers this person is a SaaS Infrastructure Engineer with a strong focus on Site Reliability Engineering.

Skills

Core Skills

Site Reliability Engineering, Apache Flink

Other Skills

AIX, AWS, Amazon Web Services (AWS), Ansible, Apache Kafka, Backup Management, Bash scripting, Big Data Processing, CI/CD, Cloud Computing, DNS, Docker, Domain Name System (DNS), Elastic Stack (ELK), ElasticSearch

About

I have worked as a Linux systems administrator for a long time, and I quickly moved into DevOps Engineer/SRE roles, striving to design systems with high levels of observability, scalability and availability. I have become comfortable coding in Java, and I sometimes use Python as a Swiss Army knife. I prefer all things open source, always. The best solutions are secure, elegant, and simple. Horizontal scaling should always be preferred, and development and testing environments must be dynamic in nature. Every IT solution needs to be prepared to be reinvented in the not-so-long term. I have always been a physics/astronomy nerd, and I also meet the "geek" classification quite easily. Throughout my professional career I have discovered that I am good at managing small groups of people and, in my experience, no team that qualifies as "great" grows much bigger than a dozen or so people.

Experience

Booking.com

3 roles

Senior Site Reliability Engineer

Promoted

Aug 2022 - Present · 3 yrs 7 mos

  • Responsible for architecture, design, development and operations of a large-scale data processing platform built to deliver security as a service. This involves dealing with large volumes of data in flight and at rest (gigabytes per second in flight, petabytes at rest), and intertwined systems that perform data filtering, transformations, async and sync data enrichment, and aggregations at processing time, using Apache Flink, Apache Kafka, Elasticsearch-Logstash-Kibana, and HDFS/GCS/S3. In doing so I work extensively with Java in Flink and in predecessor custom code for consumer/producer applications, and I develop a framework that lets end users (data analysts, data scientists, etc.) implement transformations and aggregations without needing to know how to code.
  • I am also responsible for ensuring observability and reliability of the entire service and its components, using a variety of automation tools such as Puppet, Ansible, Terraform, GitLab, and custom scripting in shell or Python, as well as for designing CI/CD pipelines to improve velocity and for the overall design of fast, idempotent, reliable delivery.
  • The underlying platform consists of a mix of public cloud and SaaS, with most of the actual processing layer sitting on Kubernetes, where orchestration is a mix of Kubernetes (deployments, services, CRDs), Helm and in-house orchestration solutions.
Apache Flink, Site Reliability Engineering, Java, Elasticsearch, Ansible, Terraform, +2

Site Reliability Engineer

Feb 2021 - Aug 2022 · 1 yr 6 mos

  • Responsible for architecture, design, development and operations of a large-scale data processing platform built to deliver security as a service. This involves dealing with large volumes of data in flight and at rest (gigabytes per second in flight, petabytes at rest), and intertwined systems that perform data filtering, transformations, async and sync data enrichment, and aggregations at processing time, using Apache Flink, Apache Kafka, Elasticsearch-Logstash-Kibana, and HDFS/GCS/S3. In doing so I work extensively with Java in Flink and in predecessor custom code for consumer/producer applications, and I develop a framework that lets end users (data analysts, data scientists, etc.) implement transformations and aggregations without needing to know how to code.
  • I am also responsible for ensuring observability and reliability of the entire service and its components, using a variety of automation tools such as Puppet, Ansible, Terraform, GitLab, and custom scripting in shell or Python, as well as for designing CI/CD pipelines to improve velocity and for the overall design of fast, idempotent, reliable delivery.
  • The underlying platform consists of a mix of public cloud (Dataflow, BigQuery) and bare metal (petabyte-scale data in Elasticsearch plus indexers), with most of the actual processing layer sitting on Kubernetes, where orchestration is a mix of Kubernetes (deployments, services, CRDs), Helm and in-house orchestration solutions.
Apache Flink, Site Reliability Engineering, Java, Elasticsearch, Ansible, Terraform, +2

Linux Systems Engineer

May 2018 - Feb 2021 · 2 yrs 9 mos

  • Responsible for architecture, design, development and operations of a large-scale data processing platform built to deliver security as a service. This involves dealing with large volumes of data in flight and at rest (gigabytes per second in flight, petabytes at rest), and intertwined systems that perform data filtering, transformations, async and sync data enrichment, and aggregations at processing time, using Apache Flink, Apache Kafka, Elasticsearch-Logstash-Kibana, and HDFS/GCS/S3. In doing so I work extensively with Java in Flink and in predecessor custom code for consumer/producer applications, and I develop a framework that lets end users (data analysts, data scientists, etc.) implement transformations and aggregations without needing to know how to code.
  • I am also responsible for ensuring observability and reliability of the entire service and its components, using a variety of automation tools such as Puppet, Ansible, Terraform, GitLab, and custom scripting in shell or Python, as well as for designing CI/CD pipelines to improve velocity and for the overall design of fast, idempotent, reliable delivery.
  • The underlying platform consists of a mix of public cloud (Dataflow, BigQuery) and bare metal (petabyte-scale data in Elasticsearch plus indexers), with most of the actual processing layer sitting on Kubernetes, where orchestration is a mix of Kubernetes (deployments, services, CRDs), Helm and in-house orchestration solutions.
Apache Flink, Site Reliability Engineering, Java, Elasticsearch, Ansible, Terraform, +2

Doctor.com

DevOps

Mar 2017 - Apr 2018 · 1 yr 1 mo · Argentina

  • Managed the entire IT infrastructure supporting Doctor.com services in a full DevOps position.
  • Handled the full lifecycle of EC2, RDS, Elasticsearch and ECS instances supporting all applications, mostly LAMP stack, in separate environments.
  • Architecture, design and implementation of self-managed Kubernetes to start migrating to microservices.
  • Design and implementation of CI/CD pipelines for the complete business logic, using Jenkins.
Site Reliability Engineering, Kubernetes, CI/CD, Jenkins, AWS, Elasticsearch

Atos

Technical Supervisor

Jul 2015 - Apr 2018 · 2 yrs 9 mos

  • Senior Linux and Solaris systems administration with a focus on: storage management, security compliance, deployment of new systems through Kickstart/JumpStart technology (Cobbler/Foreman), backup management and configuration, networking (TCP/IP, NFS, clustering), incident management, change management and RCA investigation (ITIL process), architecture testing and planning for general *nix platforms, procedure design, testing and implementation with a focus on developing proven, supported work instructions for continuous improvement, disaster recovery support planning and execution, private cloud experience and management, and virtualization (VMware/KVM/Xen).
  • Developed knowledge in AWS, Puppet, Docker and up-and-coming DevOps technologies. Built infrastructure solutions based on Red Hat Enterprise Virtualization, Ansible, and Satellite 6/Foreman.
  • Wrote Ansible roles for configuration automation and deployment.
  • Managed and monitored all installed systems and infrastructure to ensure the highest level of availability.
  • Installed, configured, tested and maintained operating systems, application software and system management tools.
  • Defined enterprise processes and best practices and tailored enterprise processes for applications.
  • Monitored and tested application performance to identify potential bottlenecks, develop solutions, and collaborate with developers on solution implementation.
  • Wrote and maintained custom scripts to increase system efficiency and performance time.
  • Designed and implemented system security and data assurance.
  • Provided 2nd and 3rd level technical support and troubleshooting to internal and external clients.
  • Created ample procedure documentation for newly adopted technologies.
Site Reliability Engineering, Linux, Solaris, Ansible, Puppet, Docker

ACS, a Xerox Company

Infrastructure Analyst Senior

Nov 2011 - Jun 2015 · 3 yrs 7 mos · Argentina

  • Senior Linux and Solaris systems administration
  • Storage Management
  • Security Management
  • Server building
  • Print queue administration
  • Monitoring/Auditing/Performance Tools installation and configuration
  • Backup management and configuration
  • Networking management
  • Incident management
  • Change Management
  • RCA investigation
  • Architecture testing and planning for general *nix platforms
  • Procedure design, testing and implementation with a focus on developing proven, supported work instructions for Business As Usual processes
  • Disaster Recovery support, planning and execution
  • Simple cloud management
  • VMware operation
Linux, Solaris, Networking, Backup Management, Security Management, Site Reliability Engineering

IBM Global Business Services

Unix Systems Administrator

Jun 2010 - Nov 2011 · 1 yr 5 mos · Argentina

  • System administration of HP-UX, AIX, Solaris and Linux OSes, with a focus on monitoring and administration software tools (Tivoli, IBM Director, HP-SIM, Sun MC and others).
  • Tasks performed:
  • Linux (RHEL, SLES, Debian), Solaris 8/9/10, AIX, HP-UX administration.
  • Tivoli Endpoint and Tivoli Management Region administration.
  • Tivoli Usage and Accounting Manager (TUAM) administration.
  • Tivoli Security Compliance Manager (TSCM) administration.
  • Tivoli Application Dependency Discovery Manager (TADDM) administration.
  • Server Resource Monitoring (SRM) administration.
  • IBM Systems Director 6.2 administration.
  • HP Systems Insight Manager administration.
  • Sun Management Center administration.
  • Project deployment and management.
Linux, Solaris, Networking, Backup Management, Security Management, Site Reliability Engineering

HP Enterprise Services

MidRange Coverage Systems Administrator

Sep 2009 - Jun 2010 · 9 mos · Argentina

  • First-level support of the midrange server spectrum and application facilities. Management and support of 9000+ *nix and Wintel servers.
  • Daily work included:
  • Unicenter Operation.
  • Linux (RHEL) and Solaris administration.
  • Network troubleshooting.
  • Incident Management.
  • Change Installation and Management.
  • Technical Team Management.
Linux, Solaris, HP-UX, AIX, Monitoring Tools

Universidad Nacional de Córdoba

Teaching Assistant - 2nd category

Mar 2007 - Feb 2009 · 1 yr 11 mos

  • Assistant teacher in Mathematical Analysis and General Physics.
  • Academic consultant and lab assistant.
  • Taught:
  • Calculus and Algebra
  • Classical Mechanics
  • Thermodynamics
  • Electrodynamics
Linux, Networking, Web Server Administration

Grupo de Teoría de la Materia Condensada - FaMAF - UNC

Systems and Network Administrator

Sep 2006 - Sep 2009 · 3 yrs

  • Maintenance and administration of the systems and network for the group's computing systems and workstations.
  • These included:
  • Architecture design.
  • Installation, administration and maintenance of GNU/Linux (Red Hat and Debian).
  • Storage administration and planning.
  • Backup administration.
  • Network security and administration.
  • Web Server deployment and administration.
  • Mail Server deployment and administration.
  • Performance and capacity planning, as well as enhancement.
Linux, Solaris, Networking

Education

Universidad Nacional de Córdoba

Licentiate in Physics (Lic. en Física, unfinished)

Jan 2002 - Jan 2007

Universidad Empresarial 'Siglo 21'

Bachelor’s Degree (unfinished) — Computer Science

Jan 2016 - Jan 2018

Instituto Diocesano Monseñor Miguel Angel Aleman

Bachiller Mercantil (commercial high school diploma)

Jan 1996 - Jan 2000
