At a glance
Location: US-MN-Maple Grove
Posted: 1/1/2020
Closing: 1/31/2020
Degree: Not Specified
Employment Type: Full-Time
Experience: Not Specified
Site Reliability Engineer
Data Recognition Corporation
Job description

Data Recognition Corporation 
Maple Grove, MN

Please, no agencies

Company cannot provide sponsorship for this position


This position works with development teams to take responsibility for service availability and performance in production environments. It also serves as a subject matter expert on the capabilities and limits of multi-data center production infrastructure and implements quality and release standards.

Responsibilities

  • Define best practices promoting service reliability and fault-tolerance.
  • Collaborate with development teams to implement best practices.
  • Design and implement innovations that improve service reliability, infrastructure resiliency and security, and data availability.
  • Serve as a subject matter expert on all matters related to service operations. Troubleshoot and provide root cause analysis for issues spanning code, network, database, and systems components.
  • Develop and automate emergency recovery procedures, deployment schedules, post-maintenance validation, and other operational activities.
  • Collaborate with Product and Software Development teams to define Service Level Agreements (SLAs), Service Level Objectives (SLOs), and Service Level Indicators (SLIs).
  • Collect SLI metrics and establish monitoring based on SLO thresholds and other product requirements.
  • Develop product-specific reliability requirements to support SLOs.
  • Ensure the infrastructure meets performance and capacity requirements.
  • Understand application dependencies to review interaction, monitoring and alerting, and dependency reliability in order to meet SLOs.
  • Ensure service availability during software upgrades and maintenance to infrastructure, databases, and dependencies.
  • Provide technical leadership and mentoring to other members of the SRE team.
  • Complete documentation according to approved methodology.
  • Coordinate and participate in development and review meetings.

Qualifications

  • 5+ years of software development, automation or infrastructure as code experience.
  • Experience analyzing application performance problems.
  • Experience with Unix/Linux and/or Windows operating system administration and networking architecture.
  • Experience with monitoring systems: Elastic Stack, Splunk, New Relic, or similar.
  • Experience with source control and continuous integration tools (GitHub and Jenkins preferred).
  • Exposure to issue tracking systems such as JIRA and ServiceNow.
  • Experience using automated testing and performance tools (JMeter, BlazeMeter, Selenium Grid, Protractor, or similar).
  • Familiarity with Agile development methodologies, including Scrum.
  • Highly motivated with strong communication, analytical and technical skills along with the ability to work both independently and as a member of a team.

Preferred Qualifications

  • Experience working with cloud-based systems and technologies (AWS, Azure).
  • Cloud infrastructure as code experience (Terraform, CloudFormation).
  • Experience with configuration management tools (Ansible, Chef, Puppet, Salt).
  • Experience with multiple database technologies (MS SQL, MySQL, Postgres, MongoDB, DynamoDB, Oracle).
  • Demonstrated ability with scripting and programming languages (Java, JavaScript, Angular, Node.js, CoffeeScript, TypeScript).
  • Web service testing and an understanding of microservices.
  • Knowledge of performance, load, and stress testing practices.
  • Experience building CI/CD tools (Jenkins preferred) for a production application in an enterprise environment.

DRC retains the right to change or assign other duties to this position.

https://www.datarecognitioncorp.com/career-opportunities/
Data Recognition Corporation is an Affirmative Action/Equal Opportunity Employer M/F/D/V