Prometheus & Grafana Engineer

  • Full-time
  • Remote

AI Summary ✨ – quick analysis of the offer

SME/L3 Engineer - Grafana and Prometheus, responsible for maintaining, optimizing, and developing monitoring and observability systems based on Grafana and Prometheus. Tasks include administering these tools, building custom dashboards, designing alerting rules, analyzing data, and working with the infrastructure team to ensure the system's scalability and optimization.

Offer Details

Job Description:

Environment:

The provider uses Prometheus to monitor cloud-native systems such as Kubernetes. Kubernetes components expose their metrics in the Prometheus format natively, and Prometheus is the de facto standard across the cloud-native ecosystem. The data is visualized with Grafana and made available in dashboards.

We are looking for an experienced SME/L3 Engineer with a deep understanding of Grafana and Prometheus to join our team. In this role, you will be responsible for maintaining, optimizing, and advancing our monitoring and observability systems. Your expertise will be critical in ensuring the reliability, performance, and scalability of our infrastructure. You will own the overall health, availability, and configuration of the Grafana and Prometheus solutions.

Key responsibilities:

  1. Grafana and Prometheus Administration:
    • Configure, maintain, and scale Grafana and Prometheus instances.
    • Develop and implement custom dashboards for monitoring key metrics.
    • Troubleshoot issues, ensure data accuracy, and optimize query performance.
  2. Monitoring and Alerting:
    • Design and manage alerting rules for proactive issue identification and resolution.
    • Continuously improve and expand monitoring coverage to meet evolving needs.
    • Collaborate with teams to define alert thresholds and escalation procedures.
  3. Data Analysis and Visualization:
    • Analyze metrics data to identify performance bottlenecks and areas for improvement.
    • Create meaningful visualizations and reports to provide insights for stakeholders.
    • Contribute to the enhancement of data retention and archiving strategies.
  4. Scaling and Optimization:
    • Collaborate with the infrastructure team to ensure seamless integration and scalability of Grafana and Prometheus.
    • Fine-tune configurations to achieve optimal resource utilization and performance.

Requirements:

  • Proven experience as an L3 Engineer specializing in Grafana and Prometheus administration.
  • Proficiency in creating custom Grafana dashboards and queries.
  • Strong understanding of monitoring best practices, alerting, and data analysis.
  • Knowledge of time-series databases and storage strategies.
  • Scripting and automation skills for efficient system management.

Technologies and Skills

Grafana
Kubernetes
Prometheus