Description
Senior SysOps Engineer
Location: Remote, USA
Employment Type: Full-Time
Compensation: $120,000.00 - $149,000.00 + Benefits/Variable Comp (range applies to US candidates only and may vary based on experience).
Benefits Offered: Vision, Medical, Life, Dental, 401K
Summary
The Senior SysOps Engineer plays a critical role in implementing, supporting, securing, and continuously improving customer cloud environments. This role serves as a technical leader within Cloud Operations, providing hands-on expertise while mentoring and guiding other Cloud Engineers to elevate team capability, consistency, and operational excellence. In addition to deep technical execution, this position requires strong analytical skills to understand business processes, translate customer requirements into scalable cloud solutions, and resolve complex production issues. The employee collaborates with internal teams, leadership, and customers globally, and plays a key role in driving automation, standardization, and best practices across the cloud platform. A passion for cloud technology, operational excellence, and developing others is critical for success in this role.
Primary Duties and Responsibilities
Deliver automated and scalable solutions that yield operational efficiencies for Cloud Operations engineers.
Drive operational outcomes aligned to department SLA, cost, and automation effectiveness targets.
Lead the orchestration of customer environment implementations and migrations.
Perform and oversee Platform and AI Services environment upgrades and maintenance.
Serve as a senior escalation point for complex issues, leading root cause analysis and driving long-term remediation.
Provide technical guidance and mentorship for early- and mid-career engineers.
Create, review, and update operational workloads and standard operating procedures.
Consistently deliver high-quality solutions and implementations on time.
Monitor customer environments through Azure, telemetry services, alerting, and logging to ensure proactive issue detection and resolution.
Identify opportunities for process optimization and automation to improve reliability, reduce manual effort, and increase operational efficiency.
Participate in on-call rotations and lead incident response and post-incident reviews.
Lead incident communications for complex events, authoring post-incident reviews and driving remediation plans across teams.
Perform Disaster Recovery testing and contribute to compliance readiness.
Foster cross-functional collaboration with Platform, Cloud, and Customer Support Teams.
Participate in testing and implementation of pre-released products and services.
Provide exceptional customer service through effective communication, technical expertise, and ownership of customer outcomes.
Secondary Responsibilities
Participate in and influence architectural discussions to ensure solutions are designed for operational success, security, and scalability.
Automate provisioning, monitoring, and scaling of cloud environments.
Contribute to internal cloud and business initiatives to advance OneStream's cloud strategy.
Support audit readiness, security improvements, compliance changes, and risk mitigation efforts.
Assist with internal projects as needed.
Required Education and Experience
BS/BA in Computer Science, Engineering, or relevant field, or equivalent professional experience.
7-10 years of experience in cloud operations, infrastructure engineering, or a related role, with increasing responsibility.
Strong experience in the analysis, implementation, and operation of enterprise IT systems and cloud systems.
Deep understanding of cloud infrastructure, operating systems, networking, and security concepts.
Strong knowledge of Infrastructure-as-Code concepts and tooling (Terraform, CloudFormation templates, Bicep, or ARM templates).
Proficiency with AI-assisted development tools (e.g., Claude, Cursor, GitHub Copilot), including the ability to critically evaluate, validate, and operationalize generated outputs.
Experience with infrastructure automation using PowerShell, Terraform, REST APIs, etc., and familiarity with supporting tools such as ARM/Bicep templates, Kusto Query Language (KQL), and Azure Policy.
Solid experience with IT Compliance requirements, control-driven policy and process design, and development of operational and security guardrails.
Proven ability to work independently, prioritize effectively, and manage complex technical initiatives.
Excellent written and verbal communication skills, with the ability to explain technical concepts to diverse audiences.
Certifications: Microsoft Azure AZ-104 or demonstrated equivalent Azure administration competency.
Preferred Education and Experience
Certifications: Microsoft Azure AZ-204, AZ-305, AZ-400, AZ-500, DP-300, and SC-100.
7-10 years of hands-on experience with:
Microsoft Azure, AWS, or Google Cloud.
Azure Kubernetes Service (AKS) with container-based deployment orchestration or other platforms such as OpenShift, GKE, or EKS.
Identity and access management using Microsoft Entra ID and Okta with a deep understanding of modern authentication protocols (OpenID Connect, SAML, OAuth 2.0) and zero-trust security concepts (MFA/SSO).
Experience with high-availability application architectures using traffic management and load-balancing strategies to ensure resilience, fault tolerance, and minimal service disruption.
Working knowledge of various cryptographic algorithms and protocols (IPsec, TLS, SSH, AES).
Familiarity with Site Reliability Engineering (SRE) tools such as Dynatrace or New Relic and self-healing observability concepts.
Knowledge, Skills, and Abilities
Strong customer service orientation with a focus on reliability and trust.
Advanced troubleshooting and problem-solving skills.
Ability to manage multiple priorities in a fast-paced, production-focused environment.
Demonstrated... For full info follow application link.