The Senior Network & Computer Systems Administrator is responsible for the design, implementation, administration, and ongoing maintenance of multiple enterprise monitoring platforms supporting a complex, distributed IT environment. The role performs advanced technical and administrative tasks across server operating systems, network monitoring tools, and application performance monitoring tools from vendors including Microsoft, Broadcom, Dynatrace, and Nexthink.
This role collaborates with customers and stakeholders at all organizational levels to ensure that monitoring solutions are comprehensive, customer-focused, and aligned with business and mission needs. The successful candidate will be able to translate technical monitoring capabilities into meaningful, actionable information for non-technical users, deliver individual and group training, and produce high-quality technical and policy documentation.
This position aligns with Cayuse's core values of Innovation, Excellence, Collaboration, Adaptability, and Integrity by fostering technical solutions that meet customer needs, promoting teamwork, and prioritizing quality in deliverables.
Perform day-to-day administration, configuration, and optimization of enterprise monitoring tools, including but not limited to:
Microsoft System Center Operations Manager (SCOM)
Broadcom DX NetOps Spectrum
Broadcom DX NetOps Network Flow Analysis
Broadcom DX NetOps Performance Manager
Dynatrace (planned/ongoing implementation)
Nexthink (planned/ongoing implementation)
Install, upgrade, patch, and maintain monitoring application software on a variety of operating systems and platforms (physical and virtual).
Design, implement, and tune monitoring policies, thresholds, alerts, dashboards, and reports to support operational, performance, and capacity management needs.
Integrate monitoring tools with related systems (e.g., ticketing/ITSM platforms, logging, notification systems, CMDB) as required.
Maintain and administer Windows Server 2022 and Red Hat Enterprise Linux (RHEL) 8 and 9 systems supporting monitoring tools in both physical and virtualized environments.
Perform OS-level configuration, hardening, troubleshooting, and performance tuning to support highly available and resilient monitoring services.
Coordinate with server and virtualization teams to ensure appropriate resource allocation, backup/recovery configuration, and change management for monitoring infrastructure.
Apply a solid understanding of WAN/LAN concepts and data circuits (e.g., T1, MPLS, VPN/Enterprise VPN, SONET, Ethernet, Fiber) to design and support effective network monitoring solutions.
Utilize SNMP v3 (traps and polling), flow technologies, and other protocols to collect network performance and availability data.
Perform bandwidth and traffic analysis to identify trends, bottlenecks, and potential issues impacting service delivery.
Work with network engineering teams to interpret monitoring results, assist in root cause analysis, and recommend corrective actions.
Interface with customers, stakeholders, and leadership at all levels of the organization to gather requirements, explain monitoring capabilities, and present findings or recommendations.
Communicate complex technical concepts in clear, non-technical language tailored to the audience's level of understanding.
Provide individual and group training on the enterprise monitoring toolset, dashboards, reports, and standard operating procedures.
Maintain a strong customer-service orientation, demonstrating professionalism, responsiveness, and effective relationship management.
Follow established processes and procedures for incident, request, change, and problem management related to enterprise monitoring services.
Maintain call and interaction quality standards, including adherence to work schedules, SLAs, and documented support procedures.
Document policies, procedures, configurations, and issues in the ticketing system and internal knowledge base to ensure repeatable, consistent support.
Develop and maintain advanced reporting (including monthly and ad hoc reports) using tools such as Microsoft Excel, SQL/MySQL queries, and built-in reporting modules within monitoring platforms.
Analyze monitoring data to identify opportunities to improve service levels, proactively detect issues, and better support the needs of internal clients.
Work both independently and as part of a cross-functional team to deliver high-quality monitoring solutions and support.
Participate in a rotating on-call schedule to provide after-hours support for critical monitoring platforms and related incidents.
Contribute to continuous improvement efforts, including tool evaluation, process refinement, and the adoption of emerging best practices in monitoring and observability.
Other duties as assigned.
The qualifications and skills listed below provide a general overview of the requirements for this position. Because the contract is anticipated and the client has not yet finalized a task order, this list should not be considered all-encompassing. Additional qualifications, certifications, skills, or experience specific to the client's requirements may be identified and requested upon award of the task order. Candidates should demonstrate flexibility and a willingness to adapt to evolving responsibilities as outlined by the client.
Hands-on experience administering and maintaining:
Windows Server 2022
Red Hat Enterprise Linux (RHEL) 8 and/or 9
Demonstrated experience installing, upgrading, configuring, and supporting enterprise monitoring tools (e.g., SCOM, Broadcom DX NetOps suite, Dynatrace, Nexthink or similar).
Strong understanding of WAN/LAN networking concepts, including:
Data circuits such as T1, MPLS, VPN/Enterprise VPN, SONET, Ethernet, Fiber
Core concepts of routing, switching, and network performance metrics
Experience working in environments with defined processes (e.g., ITIL-based incident, problem, and change management) and adhering to documented standards and procedures.
Proven ability to communicate effectively with non-technical customers, including translating complex technical issues into clear, actionable information.
Demonstrated ability to work independently with minimal supervision as well as collaboratively within a team.
Experience supporting customers in a service-oriented environment, with strong client relationship skills.
Experience with the installation, configuration, upgrade, and day-to-day administration of:
Microsoft System Center Operations Manager (SCOM)
Broadcom DX NetOps Spectrum
Broadcom DX NetOps Network Flow Analysis
Broadcom DX NetOps Performance Manager
Dynatrace (implementation and/or operation preferred)
Nexthink (implementation and/or operation preferred)
Experience developing monitoring rules, alerts, dashboards, and reports within these tools.
Experience providing user training and creating user guides, runbooks, and technical documentation for monitoring platforms.
Familiarity with Cisco IOS, network device configuration files, networking hardware, and core networking concepts.
Strong understanding and practical experience with:
SNMP v3 traps and polling
Bandwidth and network flow analysis
Advanced system and network troubleshooting skills across multi-tier, distributed environments.
Proficiency with Microsoft PowerShell for automation, configuration, and data collection related to monitoring systems.
Working knowledge of SQL and MySQL for querying monitoring databases and supporting custom reports.
Advanced Microsoft Excel skills, including the use of formu