Job Description

Responsibilities

Deployment & Automation

Implement and maintain CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, and Jenkins.
Automate infrastructure provisioning and management using Infrastructure-as-Code (IaC) with Terraform, CloudFormation, or AWS CDK.
Develop robust automation scripts and self-service tooling to minimize toil and enhance operational efficiency.

Capacity, Performance & Cost Optimization

Lead and implement operational cost optimization initiatives across cloud infrastructure and data platforms.
Configure, maintain, and tune auto-scaling policies and performance thresholds.
Develop and execute resiliency test plans and support performance testing efforts.

Incident Management & SRE Principles

Serve as a production on-call responder, employing strong troubleshooting skills to quickly resolve complex incidents.
Apply ITIL framework concepts and use ITSM tools (e.g., ServiceNow) for incident and change management.
Write high-quality Root Cause Analysis (RCA) documentation and knowledge articles to prevent recurrence.
Implement and enforce SRE principles, including the definition and tracking of Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets.

Observability & Monitoring

Manage and use advanced observability platforms (Dynatrace preferred; also AppDynamics, ELK, etc.).
Implement distributed tracing with accurate context propagation across data services and applications.
Optimize monitoring queries and configure actionable dashboards, alerts, and anomaly detectors using tools like Dynatrace and Kibana.

Data Analytics Platform Reliability

Ensure the reliability of Databricks clusters and data pipelines, including performance tuning and access control.
Maintain Informatica workflow orchestration, connector reliability, and error handling for critical data flows.
Manage Power BI gateway health and access control, and ensure reliable data refresh processes.

Security & Compliance

Manage service accounts, access permissions, and roles following the principle of least privilege.
Create, deploy, and manage digital certificates and TLS/SSL configurations.
Execute effective remediation tasks and respond to security incidents as part of the operational team.

Qualifications

Education & Experience

Bachelor's degree in Computer Science, Engineering, or a related technical field.
2 to 4 years of hands-on experience in a DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure role.
Practical, working experience with major cloud platforms, specifically AWS and Azure.

Technical Skills

Mid-level proficiency in Python or another scripting or programming language (e.g., Bash, Go) for automation tasks.
Mid-level proficiency with Configuration Management tools, including Ansible.
Strong knowledge of containerization technologies (Docker, Kubernetes/ECS).
Solid understanding of Linux systems and networking fundamentals (TCP/IP, DNS, Load Balancing).
Working knowledge of relational, cloud-native (e.g., AWS RDS), and NoSQL database technologies.
Direct hands-on experience supporting and maintaining data platforms like Databricks, Informatica, or Power BI is highly desirable.

Professional Attributes

Excellent written and verbal communication skills, with a proven ability to document complex systems.
Demonstrated ability to work independently, manage shifting priorities, and drive initiatives to completion.
Availability for on-call duty and for work outside standard business hours as required to support a 24/7 production environment.

Job Tags

Work experience placement, Shift work

Site Reliability Engineer (SRE) Job at ShiftPixy Resources Inc, Washington DC