IT Problem Management Lead
Date: 16 Nov 2024
Location: Braddell, SG
Company: Network For Electronic Transfers (S)
Banking Computer Services Pte Ltd (BCS) is a wholly owned subsidiary of NETS and an entity within the NETS Group. It manages and operates clearing and payment infrastructure for the Singapore Automated Clearing House, including Fast And Secure Transfers (FAST), Inter-bank GIRO (IBG), and Cheque Truncation System (CTS), and provides services for PayNow and SGQR Central Repository.
Position Summary
The ITSM team ensures BCS ITIL processes are operationally relevant and optimised with sufficient controls, providing Technology Teams with a framework to operate and deliver IT services to our customers.
The Technology Problem Manager is responsible for identifying, analyzing, and mitigating underlying causes of Major Incidents and recurring IT incidents or issues. The role ensures long-term stability of the IT environment by preventing incidents from recurring, improving system performance, and minimizing disruptions.
Key Responsibilities
- Proactively identify potential problems by analyzing incident trends and service data.
- Lead root cause analysis (RCA) efforts for recurring and major incidents to prevent future occurrences.
- Document the outcome of RCA meetings, ensuring investigations cover all contributing factors and the possibility of systemic issues.
- Document all root cause analyses, action plans, and resolutions in a way that can be easily understood by technical and non-technical audiences.
- Share findings and updates on issues with relevant stakeholders to improve service reliability.
- Maintain and improve the problem management process, ensuring it aligns with ITIL best practices.
- Maintain and manage a Known Error Database (KEDB), ensuring all known errors are accurately documented and accessible to relevant teams.
- Track and monitor known errors to ensure they are being addressed in a timely manner, minimizing their impact on operations.
- Work with incident and change management teams to update or retire known errors as fixes are applied.
- Ensure that known errors and workarounds are shared with relevant teams to enhance incident resolution efficiency.
- Identify risks that could cause potential disruptions and work to mitigate them through proactive problem management.
- Track and report all problem management related KPIs in a timely manner.
Requirements
- Minimum 5 years of working experience managing problem investigations in a complex technological environment supporting real-time transactions
- ITIL v3/v4 Foundation Certification required; higher ITIL certifications preferred
- Certifications in relevant technologies (e.g., AWS, Azure, Cisco) are a plus
- Strong communication and writing skills, with the ability to articulate complex matters in a concise manner
- Strong understanding of IT infrastructure components (servers, databases, networking) and their interdependencies
- Proven record in driving complex root cause analysis and incident resolution in multi-vendor and multi-platform environments
- Familiarity with monitoring and alerting tools (e.g., ELK or SolarWinds)
- Self-motivated and able to work independently
- Strong problem-solving skills with the ability to analyze complex technical issues and effectively perform root cause analysis
Banking Computer Services Pte Ltd (a subsidiary of Network for Electronic Transfers (Singapore) Pte Ltd)