Site Reliability Engineer

We are looking for a Site Reliability Engineer (SRE) to join our team, working at the most integral levels of the operation of our core product and its intersection with public cloud (e.g., Azure and AWS).

Toronto, ON


MedStack is an out-of-the-box privacy compliance solution built specifically for the needs of the digital health industry. We are eliminating one of the biggest barriers currently impeding successful healthcare innovation and helping digital health innovators scale faster, more easily and more affordably. Our customers are building cloud healthcare solutions for telemedicine, mental health, patient engagement, smart medical devices, chronic condition management, healthcare billing, genomics research, teledentistry and more.

We’ve been named a TechCrunch Top Pick, one of Canada’s Top 20 early-stage technology companies by CIX, and one of the Top 50 digital health disruptors in Canada by PwC and CB Insights. We are a remote-first team that embraces big problems and is driven to make healthcare better.



Overview: <p><strong>We strive to make healthcare better for all by making it easier for digital health innovators to succeed.</strong></p> <p><strong>Who we are</strong></p> <p>MedStack is on a mission to transform the process of healthcare innovation. Our company is an award-winning startup backed by prominent VCs and is a successful graduate of 500 Startups and the Creative Destruction Lab. We've been named a TechCrunch Disrupt Top Pick for Cybersecurity, one of Canada's Top 20 early-stage technology companies by CIX, and one of the Top 50 digital health disruptors in Canada by PwC and CB Insights.</p> <p>Our business is a subscription-based platform, but we are also an ecosystem, a community that champions startups and entrepreneurship. We embrace automation, collaboration, startup empowerment, and creative problem-solving. Our customers work with us and with each other to tackle the world's biggest problem: making healthcare more efficient, accessible, economical and effective, at a time when the world needs it more than ever before.</p> <p><strong>Who we need</strong></p> <p>We are looking for a Site Reliability Engineer (SRE) to join our team, working at the most integral levels of the operation of our core product and its intersection with public cloud (e.g., Azure and AWS). We need someone who has conceived of and built custom software automation to manage cloud resources for deployment, and is eager to level up the deployment, operations, and scalability of our platform. This is an opportunity to channel your passion for highly resilient, secure, performant technology into our MedStack Control platform, which supports hundreds of segregated environments.</p> <p>This is a remote position and we welcome applicants from across Canada.</p> <p><strong>Our technology</strong></p> <p>We run a DevOps culture and use Docker, Ansible and Elasticsearch.
We use both Azure and AWS, and we are flexible with languages and tools, using whatever is most effective for the work at hand.</p> <p><strong>What&rsquo;s in it for you</strong></p> <p>The exciting project work. You will have the opportunity to build and manage a scalable platform that serves digital health companies focused on making healthcare and healthtech more affordable and accessible to the masses. You will make an impact in the digital health landscape by ensuring MedStack Control&rsquo;s core technology has a reputation for resilience, scalability and reliability, increasing our market reach and ultimately the reach of our clients.</p>
Responsibilities: <p><strong>What you will do</strong></p> <ul> <li>Write software automation and configure tools to manage the clusters of our platform as cattle (not pets). Introduce new tooling integrations for deployments (e.g., Ansible Tower, OAuth, HashiCorp Vault, Kubernetes).</li> <li>Handle operational issues and identify opportunities for automation and operational improvements. For example, manage our Elasticsearch implementation.</li> <li>Develop features to support hierarchical access control, and be the first point of contact for IDS escalations from our Security Operations Centre.</li> <li>Identify risks and boundaries in infrastructure deployment, prioritize improvements to support scale, and advocate for changes and features in the platform to create more autonomous systems.</li> </ul>
Requirements: <p><strong>What you bring:</strong></p> <ul> <li><em>The demonstrable experience.&nbsp;</em>You have a programming background that you&rsquo;ve leveraged as you&rsquo;ve moved into SRE functions, focusing on infrastructure scalability, operability, and reliability; and you are a champion of the DevOps figure-8 model. You are customer-focused and calm under pressure, debugging production issues in a planned and systematic way. In addition to knowledge of SRE best practices, you have hands-on experience in SRE languages and tools (e.g., Docker, Kubernetes, Ansible, Terraform, Elastic Stack (ELK), CNCF projects).</li> </ul> <ul> <li><em>Examples of work you may be well versed in</em>: <ul> <li>Writing and managing deployment code</li> <li>Building deployment automation</li> <li>Implementing monitoring improvements</li> <li>Maintaining distributed systems</li> <li>Running Kubernetes in High Availability (HA) environments</li> <li>Fleet management, including maintenance activities, health monitoring, and incident resolution</li> <li>Lifecycle management of system resources and data</li> </ul> </li> </ul> <ul> <li><em>You are passionate about technology with purpose and making an impact.&nbsp;</em>You love working on large-scale systems in a startup environment. You have an appreciation for diversity of thought and experience. You are comfortable using a range of digital tools to communicate problems and solutions to your teammates, fostering remote working relationships. You have a creative portfolio demonstrating your work and a life outside of work. Whatever it is, you do it with dedication and you constantly challenge yourself and absorb new information.</li> </ul> <p><strong>How you will make a difference:</strong></p> <ul> <li><em>Architect and develop.</em>&nbsp;You will build scalable systems, using automation and pushing changes that improve reliability and usability.
You will work closely across teams to build and implement solutions adhering to the technical specifications and aligning with the roadmap and reference architecture.</li> <li><em>Support.</em>&nbsp;You will be part of the team that maintains the platform, ensuring 24/7 accessibility and reliability, and measuring and monitoring availability, latency, and overall system health. You will set and maintain high standards around incident response practices and policies. Inspired by the philosophy of Site Reliability Engineering, we provide customers with access to 24/7 support; as a result, you will be part of an on-call rotation schedule. On average, you can expect one after-hours call per one-week shift.</li> <li><em>Share knowledge.</em>&nbsp;You will train other team members on technologies and processes, and drive education and knowledge transfer of design patterns, technical practices, and relevant technologies and tools.</li> <li><em>Implement best practices.</em>&nbsp;You will support the investigation and adoption of new and emerging cloud architecture practices and technologies, creating opportunities for improvement and actively participating in discussions and initiatives to improve our technical practices and competitive position.</li> </ul> <strong>Why we think you'd like it here</strong><br /> <p>MedStack is a remote-first company. And while that may not be a distinction now, it's the way we have always operated. This translates to having the processes, technology, and understanding in place to nurture a collaborative remote work environment with an exceptional work-life balance.</p> <p>We meet (virtually) as a team at the end of every week to celebrate our wins, acknowledge milestones, showcase what we've each accomplished, tackle challenges, and learn from each other.</p> <p>Diversity and inclusion are not mere words on paper to us.
We have stringent guidelines and bold objectives.</p> <p>We offer competitive compensation, generous benefits and vacation packages, and the choice to partake in the company's employee stock option plan. There is no better time to come into an early-stage award-winning company with a strong brand that is redefining an industry.</p> <p><strong>Join us.</strong></p> <p>We are a company with a developer, entrepreneur and innovator mindset and extremely high specificity, accuracy, and quality standards. Our core product is digital security infrastructure. Everything from our team processes to our brand reflects a culture of facts-first, entrepreneurship vs. the status quo, celebrating our customers' success before ours, and collaboration and openness.&nbsp;</p> <p>If you have 70% of the qualifications we are looking for, share our way of thinking and want to play an integral role in impacting health tech and health care, apply to express your interest.&nbsp;</p> <p>MedStack welcomes and encourages applications from people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process.</p>