Site Reliability Engineer/DevOps (Analytics AI)

The Analytics Team develops services that process huge amounts of data in real time. The system is fault-tolerant and low-latency, built on a microservice architecture, and operates 24/7 with 99.999% availability.

We are now looking for enthusiastic professionals to join our team. The technology stack includes Scala, Apache Kafka, Cassandra, ClickHouse, Docker, Kubernetes, and Prometheus. The services are hosted on Google Cloud Platform and AWS.

Responsibilities:

  • Actively participate in all stages of the project: design, development, testing, deployment, and maintenance

  • Design and implement deployment scenarios, and deliver the product to the Lab, Stage, and PRO environments

  • Design and develop the infrastructure

  • Design monitoring metrics, alerts and dashboards

  • Monitor and maintain services in production (participating in on-call shifts)

  • Customize the CI/CD process (Lab, Stage, PRO)

  • Deploy new releases to the production environment

Requirements:

  • Solid knowledge and strong experience in production support activities

  • Understanding of SRE principles and DevOps practices

  • At least 2-3 years of experience with GNU/Linux-based operating systems

  • Basic knowledge of TCP/IP networking

  • Experience with Google Cloud Platform, AWS, Docker, Kubernetes

  • Knowledge of Terraform

  • Experience with GitLab CI/CD

  • Real automation experience (Python, Bash, Golang)

  • Understanding of the basic principles of high availability (HA) and distributed systems design

  • Effective communication skills (active listening, friendliness, confidence, sharing feedback, respect)

  • English - Intermediate (B1)

Personal:

  • Team player

  • Fast learner

  • Documentation culture

Would be a strong plus:

  • CI/CD automation experience

  • Experience with Apache Kafka, Cassandra, ClickHouse

  • Experience with GitOps tooling (ArgoCD/FluxCD)

  • MS Azure/GCP cloud experience

  • JVM tuning and troubleshooting experience

  • Experience in web server administration (Nginx)

  • Good knowledge of modern monitoring systems (Prometheus, Grafana, VictoriaMetrics)

  • Experience with Helm, Helm chart customization

We offer:

  • Well-coordinated professional team

  • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and great opportunities for self-realization and professional and career growth

  • Additional Health and Life Insurance Package

  • Employee Assistance Program

  • 25 vacation days

  • This role requires on-site presence at our office 4 days a week to support effective collaboration and teamwork.

Write to us at jobs@jettycloud.com or send a message to our recruiters
