Site Reliability Engineer Job at iVedha Inc., Los Angeles, CA

  • iVedha Inc.
  • Los Angeles, CA

Job Description

Site Reliability Engineer (SRE)

 

Position Overview:

We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with strong expertise in Python, advanced proficiency in Azure-based infrastructure, and significant experience in Customer Reliability Engineering (CRE) and Automation. The ideal candidate will have 3 to 5 years of experience in SRE or related fields and a proven ability to design, deploy, and maintain scalable, reliable, and high-performing cloud solutions. This role focuses on driving system reliability, leveraging automation to optimize operations, and delivering robust solutions for complex infrastructure challenges.

 

Key Responsibilities:

 

Design & Plan-

  • Design and implement comprehensive Elastic (ELK stack) solutions, including Elasticsearch, Logstash, and Kibana.
  • Analyze and document requirements to improve existing infrastructure through automation ("Infrastructure as Code") and seamless Azure cloud integration.
  • Develop and document architectural designs for scalable Azure solutions, tailored to customer requirements.

 

Build & Deploy-

  • Build robust CI/CD pipelines (Azure DevOps, Jenkins, ArgoCD) to support efficient code deployment and reusable automation workflows.
  • Advance scripting and automation frameworks using Python, Bash, and Painless scripting languages.
  • Manage, troubleshoot, and enhance Kubernetes clusters, including Azure Kubernetes Service (AKS) environments.
  • Deploy production-ready Elasticsearch clusters on-premises and in Kubernetes clusters.

 

Operate & Support-

  • Proactively monitor systems using tools like Azure Monitor, Elastic Observability, and Application Insights, ensuring high availability and performance.
  • Develop self-healing mechanisms and automated scaling for distributed systems to reduce downtime and improve reliability.
  • Lead incident response processes, conduct root cause analysis, and drive post-mortem discussions to prevent recurring issues.
  • Collaborate with security teams to implement and maintain best practices for system security and compliance.

 

Automation-

  • Develop robust automation scripts for repetitive operational workflows, configuration management, and deployment pipelines using tools such as Ansible, Terraform, and Helm.
  • Drive enhancements in infrastructure automation to enable seamless deployments and self-service capabilities for engineering teams.

 

Collaboration & Customer Engagement-

  • Partner with cross-functional teams (engineering, operations, and product) to design systems with reliability and performance in mind.
  • Work closely with customers to address specific reliability challenges and ensure tailored Azure-based solutions meet their operational needs.
  • Foster a DevOps culture and champion best practices across teams.

 

Qualifications:

 

Experience-

  • 5+ years of hands-on experience as an SRE or SRE Automation Engineer.
  • Proven expertise in designing, deploying, and managing Azure cloud infrastructure and services.
  • Significant experience in Elastic stack (ELK), including managing Elasticsearch clusters, Logstash pipelines, and Kibana visualizations.
  • Advanced proficiency in Python scripting and automation for large-scale systems.
  • Strong knowledge of Kubernetes cluster management, including AKS.
  • Demonstrated experience building CI/CD pipelines and deploying applications in distributed environments.
  • Working knowledge of containerization tools like Docker and orchestration technologies.

 

Technical Skills-

  • Azure Expertise: Azure Kubernetes Service (AKS), Azure DevOps, Application Insights, Log Analytics, and Azure security best practices.
  • Automation Tools: Proficiency with Ansible, Terraform, Helm, and ArgoCD.
  • Scripting: Python (advanced), Bash, Painless scripting for Elasticsearch pipelines.
  • Monitoring: Elastic Observability, Grafana, and Azure-native tools.
  • Networking: Understanding of virtual networks, firewalls, and RBAC in cloud environments.
  • Security: Familiarity with OAuth, SAML, and secure deployment methodologies.
  • Knowledge of highly scalable systems, RESTful APIs, and caching mechanisms.

 

Soft Skills-

  • Strong problem-solving and troubleshooting skills for complex distributed systems.
  • Excellent communication and collaboration skills, including the ability to liaise between technical teams and non-technical stakeholders.
  • Customer-focused approach, with a track record of designing solutions that meet client-specific reliability requirements.
  • Proactive, self-motivated, and committed to continuous learning and improvement.

 

Education-

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).

 

Preferred Qualifications-

  • Working knowledge of Elastic Cloud on Kubernetes (ECK).
  • Certification in Microsoft Azure or Kubernetes.
  • Experience implementing GitOps methodologies for deployment automation.

Job Tags

Remote job
