Senior Site Reliability Engineer

Granicus
Full-time
United States (Remote)
$70,000 - $80,000
Posted 6 months ago

Job Description

Granicus is seeking an experienced Senior Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of its services. The role leads infrastructure build-out and maintenance, automates processes, and implements site reliability best practices.

Responsibilities

  • Provide on-call production support
  • Work on customer and internal engineering tickets
  • Work on SRE backlog items
  • Monitor and maintain systems
  • Automate processes
  • Assist in incident management
  • Participate in system improvements
  • Collaborate with software engineers
  • Create and maintain documentation
  • Assist in capacity planning
  • Implement and adhere to security best practices

Requirements

  • 5+ years of experience in site reliability engineering or a similar role
  • Experience supporting AI/ML infrastructure
  • Expertise in Linux/Unix systems and cloud platforms (AWS)
  • Proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++)
  • Familiarity with AI/ML operations
  • Experience with the ELK Stack
  • Experience with configuration management tools
  • Exposure to AI/ML toolchains
  • Relevant certifications are a plus

Benefits

  • No benefits