Site Reliability Engineer @ LogDNA - Texas

3 days ago

At LogDNA you'll help us build a fast and modern log management platform that offers the flexibility of an amazing developer experience with the trust of enterprise-grade infrastructure. We strive to help developers pinpoint production issues by aggregating all system and application logs into one platform. Today, LogDNA is used by over 3,000 teams including IBM, OpenAI, Instacart, and Lime Bike. We're building a future where developers don't have to dread the tools they use at work, starting with log management. We've achieved 300% year-over-year revenue growth in the last year, and we're just getting started.

We're Y-Combinator alumni, venture-backed by Emergence Capital (Salesforce, Box, and Zoom) and Initialized Capital (Reddit, Coinbase, and Patreon). Our team comes from a wide variety of backgrounds and experiences, having worked on products at Heroku, Facebook, WhatsApp, Udacity, and Ripple, among others.

Our team is responsible for keeping LogDNA's systems running smoothly 24x7x365, leveraging our mixture of specialties. We are currently looking for a passionate and motivated engineer who is enthusiastic about joining a distributed team and shares our commitment to growing our platform together. A successful candidate should be an energetic self-starter with a passion for continuous improvement and a desire to positively impact a growing, venture-backed Y-Combinator alum start-up.

For this role, we are looking for people across a wide range of experience levels, and final leveling may be tailored to fit the experience of the chosen candidate.

Responsibilities:

  • Participate in an on-call rotation to respond to surprises
  • Support our support engineers on customer incidents
  • Engage with your distributed teammates
  • Manage our infrastructure with Terraform, Ansible, and Kubernetes
  • Focus monitoring and alerting improvements on early detection and reduced pager fatigue
  • Continuously improve our tooling and pipelines for building and maintaining our products
  • Work to make production as boring as possible

Must-Have Qualifications:

  • Be comfortable at a command line interface
  • Know how to use a version control system such as git
  • Understand the fundamentals of configuration management
  • Desire to improve collaboration and be able to communicate asynchronously
  • Passion for identifying and reducing toil through simplification and automation
  • Willingness to create and update documentation to facilitate learning for yourself and the team
  • Strong experience in a programming or scripting language and the ability to carry that experience over to new ones

Nice-to-haves:

  • Kubernetes
  • Datastores (Elasticsearch, Redis, MongoDB)
  • Coding (JavaScript, Python, Go, Rust)
  • Cloud (AWS, Azure, GKE)
  • Linux (EL, Ubuntu, CoreOS, Alpine)
  • CI/CD tooling (CircleCI, Jenkins, Travis CI)
  • Configuration management tools (e.g., Chef, SaltStack, Puppet, Ansible)