Site Reliability Engineer
DataArt Ltd

Job description

DataArt's SRE Center of Competence successfully develops and provides SRE expertise and solutions for our clients.

We are looking for a Senior SRE specialist who will join our team and provide consulting services to our clients and delivery teams.
The Senior SRE expert will participate in sales/pre-sales and discovery, provide consultancy and architecture reviews, and supervise projects at all stages of development.
We offer an opportunity to grow professionally: lead initiatives, expand your SRE skills and technologies, mentor colleagues, and participate in R&D or PoCs.


Responsibilities

• Collect and analyze metrics, traces, and logs from the environment and the application
• Take part in system design consulting, platform management, and capacity planning
• Analyze requirements and support them from an SRE perspective
• Help prioritize feature development and reliability improvements based on the current state of the system
• Partner with development teams to improve services through rigorous testing and release procedures


Requirements

• Programming skills in at least one modern programming language
• Experience with containerized environments (Docker, Kubernetes)
• Experience managing code, databases, and infrastructure (networking, operating systems, storage)
• Experience with monitoring frameworks (Grafana, Kibana, Prometheus)
• Experience with IaC and related tools (e.g. Terraform, CloudFormation)
• Experience with modern CI/CD (e.g. GitHub Actions)
• Experience with a major cloud provider (e.g. AWS, GCP, Azure)
• SRE experience within a service development team, covering support, troubleshooting, and log analysis, to meet service availability and observability goals
• Experience maintaining Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
• Good spoken and written English, great communication skills
• Teamwork experience

Nice to have

• Strong knowledge of a scripting language (e.g. Python, Bash)
• Experience with OpenStack
• Strong Linux or Windows system-level analysis capabilities
• Experience optimizing cloud costs and reducing system resource usage by setting clear requirements for efficiency and capacity planning
• Experience with a variety of SaaS operations tools such as uptime, Dynatrace, and PagerDuty
• Experience improving documentation on site reliability measures, either in application documentation or in runbooks, explaining the issues encountered and the solutions implemented
• Experience negotiating within a team or across teams

What we offer

Professional Development:
— Experienced colleagues who are ready to share knowledge;
— The ability to switch projects and technology stacks, and to try out different roles;
— More than 150 workplaces for advanced training;
— Study and practise English: courses and communication with colleagues and clients from different countries;
— Support for speakers who present at conferences and technology community meetings;
• Health insurance;
• The ability to focus on your work: a lack of bureaucracy and micromanagement, and convenient corporate services;
• Friendly atmosphere, concern for the comfort of specialists, contemporary office space;
• Flexible schedule (with mandatory core hours) and the ability to work remotely upon agreement with colleagues.
