Senior Site Reliability Engineer @ Verizon - Irvine, CA

Job Overview

12 days ago

Senior Site Reliability Engineer

Verizon - Irvine, CA

When you join Verizon

Verizon is one of the world’s leading providers of technology and communications services, transforming the way we connect across the globe. We’re a diverse network of people driven by our shared ambition to shape a better future. Here, we have the ability to learn and grow at the speed of technology, and the space to create within every role. Together, we are moving the world forward – and you can too. Dream it. Build it. Do it here.

What you’ll be doing...

We are looking to hire a Senior Site Reliability Engineer for the VCG GTS Organization. In this role, you will lead a cross-functional team that develops the SRE Continuous Integration and Deployment framework, practice all tenets of SRE, and provide the vision and technical leadership needed to execute best-in-class middleware engineering practices that improve application reliability. You will help execute on our vision for Site Reliability Engineering (SRE) by determining how systems relate to one another and, using a breadth of tools, building the CI/CD framework and automation that improve reliability for customers. You will also develop Source Code Management, centralizing and automating the configuration management process.

  • Developing and maintaining pipeline configurations.
  • Providing CI/CD environment build automation in a containerized environment that uses technologies such as EKS, Docker, Artifactory, Python, Shell, Ansible, and GitLab.
  • Performing implementation, configuration and ongoing performance enhancements for ELK/EFK Logging platform in the on-prem & AWS environments.
  • Delivering solutions towards automating, optimizing and supporting mission critical deployments in AWS, leveraging configuration management, CI/CD, and DevOps processes.
  • Developing and implementing the next-gen monitoring solution for the enterprise based on SRE principles and practices.
  • Developing an effective data-driven approach for monitoring and alerting that enables the SRE team to maintain high availability and deliver a high quality of service.
  • Developing a log monitoring framework based on exceptions in access logs, server logs and platform logs.
  • Analyzing data to understand customer experience and usage patterns to identify gaps in current monitoring.
  • Working with SRE and dev engineers to fine-tune alert thresholds and increase alert effectiveness through event correlation and pattern recognition.
  • Developing and onboarding new monitoring features and capabilities for critical metrics, and transitioning them to operations.
  • Defining standards, guidelines and templates for operational and business dashboards and metrics alerting.
  • Developing a robust alerting system that can identify problematic anomalies and minimize false alarms.
  • Performing sustainable incident response.

Where you’ll be working…

In this hybrid role, you'll have a defined work location that includes work from home and assigned office days set by your manager.

What we’re looking for...

You’re a technical expert with solid credentials and a remarkable ability to find, break down, and troubleshoot problems. You work effectively with a wide range of internal and external stakeholders, and you’re great at partnering with the groups you support. You constantly look for ways to make our integration processes and deployments better, and you’re a natural mentor to junior engineers looking to develop their technical skills.

You’ll need to have:

  • Bachelor’s degree or four or more years of work experience.
  • Four or more years of relevant work experience.

Even better if you have one or more of the following:

  • Bachelor's degree in Computer Science.
  • Knowledge of SRE practices and principles to build resilient systems and to provide business continuity.
  • Experience in Application Performance Management, Containerized Workloads, Logging, Alerting, Configuration Management Tools, API Management Tools, Cloud, and Source Code Management.
  • Experience in creating rich Grafana, Kibana, and New Relic visualizations and dashboards that provide key metric monitoring information to users and support staff.
  • Experience in developing a log monitoring framework based on exception parsing.
  • Experience in installing, configuring, and maintaining the Elasticsearch, Logstash, and Kibana logging platform.
  • Experience in working with version control systems such as Git, Subversion, or Mercurial, and in creating Jenkins pipelines to build code artifacts and deploy the code.
  • Experience in automation and ability to code or script at an advanced level.
  • Experience in Systems Architecture; in-depth knowledge of SRE, IT Operations, and Cloud; and coding and scripting experience with Java, JavaScript, Python, Ansible, and CloudFormation.

If Verizon and this role sound like a fit for you, we encourage you to apply even if you don’t meet every “even better” qualification listed above.

Compensation

Our benefits are designed to help you move forward in your career, and in areas of your life outside of Verizon. From health and wellness benefits, short-term incentives, a 401(k) Savings Plan, stock incentive programs, paid time off, parental leave, adoption assistance, and tuition assistance, plus other incentives, we've got you covered with our award-winning total rewards package. For part-timers, your coverage will vary as you may be eligible for some of these benefits depending on your individual circumstances.

If you are hired into a California work location, the compensation range for this position is between $99,000 and $185,000 based on a full-time schedule. The salary will vary depending on your location and confirmed job-related skills and experience. This is an incentive-based position with the potential to earn more. For part-time roles, your compensation will be adjusted to reflect your hours.

Equal Employment Opportunity

We're proud to be an equal opportunity employer - and celebrate our employees' differences, including race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, and Veteran status. At Verizon, we know that diversity makes us stronger. We are committed to a collaborative, inclusive environment that encourages authenticity and fosters a sense of belonging. We strive for everyone to feel valued, connected, and empowered to reach their potential and contribute their best. Check out our diversity and inclusion page to learn more.

Similar Jobs

Software Engineer, Site Reliability - Undergraduate/Graduate Internship

AvidXchange

Burbank, CA

.NET, SQL, Azure, PowerShell, JavaScript, and HTML. Interns will be allocated time each week to collect as a team and discuss experiences, ideas, and learnings.

Sr Site Reliability Engineer - Observability - Open to Remote

First American Financial Corporation

Santa Ana, CA

Collaborate with development and operations team to ensure availability and reliability of the application and infrastructure.

Senior Site Reliability Engineer (Remote/Hybrid)- Santa Ana, CA

FutureRecruit

Santa Ana, CA

Strong working knowledge of cloud services and architecture. (Architectures, micro-services, high availability) with proficiency in installation, maintenance…

Senior Site Reliability Engineer

Infinity Consulting Solutions, Inc.

Santa Ana, CA

Full Time Opportunity (W2 only). Measure and monitor availability and overall system and environment health. Make recommendations to improve service.

Sr. Site Reliability Engineer

Supernal

Irvine, CA

Think about systems – their edge cases, failure modes and life cycles – and how to improve the long-term reliability, and scalability of our infrastructure.

IT Architect III - Site Reliability Engineer (Hybrid Work Schedule)

Inland Empire Health Plans

Rancho Cucamonga, CA

IEHP is on a journey to adopt Production Engineering as a key part of this transformation is to create a state-of-the-art IT production operations process.

Site Reliability Engineer (Application Software)

SpaceX

Hawthorne, CA

5+ years of DevOps, site reliability engineering, or system administration experience. Bachelor's degree in computer science, information systems, or…

Senior Site Reliability Engineer

Cox Corporate Services

Irvine, CA

Improve predictability and reliability of software releases, workflows, and operating software. We also look to instill core SRE practices into the engineering…

Dev Ops / Site Reliability Engineer - Hybrid

AEG Worldwide

Los Angeles, CA

In the SRE role you will be working directly with Developers, QA, Infrastructure Engineers, Security and Compliance, Account Management and Incident Management…

Senior Site Reliability Engineer

Circle

Los Angeles, CA

Circle is looking for a Senior Site Reliability Engineer who will design, build and maintain Circle's infrastructure estate to meet the growing worldwide…

Dev Ops / Site Reliability Engineer - Hybrid

AXS

Los Angeles, CA

In the SRE role you will be working directly with Developers, QA, Infrastructure Engineers, Security and Compliance, Account Management and Incident Management…

Senior Systems Engineer, Site Reliability Engineering, Google Cloud

Google

Los Angeles, CA

Scale systems sustainably through mechanisms like automation and evolve systems by driving changes that improve reliability and velocity.

Senior Site Reliability Engineer

Verizon

Irvine, CA

Help execute on our vision for Site Reliability Engineering (SRE), determining how each system relates to each other and using a breadth of tools, build CICD…

Site Reliability Engineer

The Aerospace Corporation

El Segundo, CA

You will be part of the Spacelift Telemetry Acquisition and Reporting System (STARS) team providing mission assurance support for National Security Space (NSS)…

Kubernetes SRE (Site Reliability Engineer)

EPCVIP, Inc

Los Angeles, CA

Site reliability engineer creates a bridge between development and operations by applying a software engineering mindset to system administration topics.

Staff Site Reliability Engineer

PlayStation Global

San Diego, CA

Prior successful experience as a systems performance or site/systems reliability engineer. Manage availability, latency, scalability, and efficiency of Shared…

Senior Site Reliability Engineer

GoFundMe

San Diego, CA

The successful reliability engineer effectively guides incident responses, helps identify root causes and provides recommendations or solutions to mitigate and…

Site Reliability Engineer - Linux, DevOps, Python

Blu Omega

Culver City, CA

Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.

Senior Site Reliability Engineer, Americas

Canonical - Jobs

San Diego, CA

Our site reliability engineers bring Python software-engineering skills and rigour to the operations domain. A wide range of engineering disciplines and career…

Senior Site Reliability Engineer

Origis Energy

San Diego, CA

2 years of experience in a cutting-edge site reliability role utilizing container orchestration technologies & infrastructure automation (Kubernetes, Docker,…

Senior Site Reliability Engineer, Americas

Canonical - Jobs

San Bernardino, CA

Our site reliability engineers bring Python software-engineering skills and rigour to the operations domain. A wide range of engineering disciplines and career…

Senior Site Reliability Engineer, Americas

Canonical - Jobs

Los Angeles, CA

Our site reliability engineers bring Python software-engineering skills and rigour to the operations domain. A wide range of engineering disciplines and career…

Principal DevOps Engineer

Thales

Irvine, CA

Thales people architect solutions that enable two-thirds of planes to take off and land safely. Collaborate with IT and Engineering organizations to define and…

Software DevOps Engineer

Virgin Orbit

Long Beach, CA

The ideal candidate will have a blend of experience with site reliability engineering, database administration, and security coordination.