Site Reliability Engineer
Description
Embankment ⚡️
Accuracy is everything. Embankment delivers tech-powered fund services to Europe’s alternative fund managers, built on a proprietary software platform and operated by our service organization.
Backed by leading investors and trusted by 250+ funds across Europe, we combine deep financial expertise with modern technology to turn complexity into clarity.
We’re 60+ employees across Copenhagen and our offices in Aarhus, Stockholm, Oslo and Luxembourg.
Team and role 🎒
Our platform is mission-critical for both internal and external users. Reliability, correctness, and controlled change are non-negotiable. In tech/product we are 10+ people across Copenhagen and Aarhus.
We collaborate day-to-day on Slack, Notion and GitHub. We are invested in using coding agents such as Claude, Codex and Augment.
We’re now looking for a Site Reliability Engineer to take ownership of our production infrastructure and help us scale safely. This is a full-time, permanent role, and you can work from either our Aarhus or Copenhagen office. The base salary range starts at EUR 80,000, plus pension and benefits.
Stack & infrastructure 📟
- TypeScript across frontend and backend
- React, Node.js
- Google Cloud Platform (GCP)
- Terraform for infrastructure as code
- Kubernetes as our core runtime
- Flux for GitOps-based deployment
- Apollo (federated graph)
- Postgres + TypeORM, Redis
- Gemini, Claude, GPT for automation and product features
Your responsibilities 📝
- Own the reliability of our production systems end-to-end
- Design, provision, and maintain GCP infrastructure using Terraform
- Operate and evolve our Kubernetes platform and GitOps workflows (Flux)
- Define and improve observability: metrics, logs, traces, alerts, and SLOs
- Lead incident response and postmortems; turn failures into durable fixes
- Partner with product engineers to improve reliability at the code level:
  - safer deploys and rollbacks
  - resilience patterns
  - background jobs and async workflows
  - LLM-related failure modes
- Help set standards for security, access control, and operational hygiene
- Depending on interest and capacity, the role may also contribute to automation and standards around internal tooling (e.g. identity, device provisioning), but this is not a helpdesk role
Who we’re after 👋🏻
- Strong experience running production systems on cloud infrastructure
- Hands-on with Kubernetes and infrastructure as code (Terraform)
- Comfortable being on-call and owning incidents when they happen
- You think in terms of failure modes, not just happy paths
- You can read and write application code and enjoy collaborating with product engineers
- Pragmatic, calm under pressure, and biased toward automation
- You care about making systems understandable for the next person
Experience with GCP, GitOps, or TypeScript-based backends is a plus, but not required if you bring strong fundamentals.