Job Information
Volpara Health Site Reliability Engineer in Aurora, Colorado
Here at Volpara, we're on a mission to save families from cancer. By combining our ground-breaking Breast Health Platform with empowered patients, we are unleashing a revolution in cancer detection. You can check out more about what we do here: https://youtu.be/YoWsuV64uAI?si=iqzcN7PAiXoogwSg
If you are based in the US Central or Mountain time zones and willing to work regularly scheduled weekends and on-call, then read on to find out more about this fantastic opportunity to join our growing company making big changes in the world.
Our Site Reliability Engineers (SREs for short) are embedded as engineers in a product pod/squad, championing the SRE capability by using their expertise in cloud technologies, IT operations, and software automation to ensure that production services are up to date, secure, available, and performant and that service incidents are resolved quickly. When this role is successful, service level objectives and agreements are fulfilled, product updates are deployed quickly and reliably, and installation and support teams have the knowledge and tools to easily help customers succeed with our products.
What does that mean you will be doing?
Develop, maintain, and support CI/CD processes and environments to ensure frequent and timely production releases.
Perform and monitor production system deployment as required by each product pod's release cycles, possibly during evenings or weekends and as frequently as every 2-week sprint.
Develop and maintain disaster recovery procedures and perform required annual tests. Perform periodic monitoring and auditing of data backups and replications required for the DR process.
Participate in regular standups in the assigned product pod.
Work with other pod members to continually improve system monitoring, issue detection, and rapid service restoration with a focus on maintaining and improving the reliability and availability of all customer-facing production systems.
Investigate and develop tools and processes and integrate them to improve the reliability, stability, efficiency, security, and cost of our products & services.
Aid and guide Installation Engineering and Customer Support teams to enable them to install, configure, and support our products efficiently and effectively.
Use automation to reduce manual, repetitive SRE and product-pod-related work as much as possible.
Manage Software-as-a-Service on behalf of customers, including provisioning, deployment, configuration, ongoing maintenance, monitoring, and alerting.
When training is complete, serve on a 24/7 on-call roster to address any critical customer issues after hours.
Create and maintain documentation of required processes so that it can be used effectively for knowledge transfer and as evidence for audit and compliance purposes.
Maintain SRE-related source code, adhering to quali