Source Job

Engineer

FAL
$180,000–$250,000/yr
US

  • Build and maintain a Python fleet tracking system that manages the full lifecycle of servers.
  • Build server management tooling that automates provisioning, health checks, GPU diagnostics, recovery and alerting.
  • Create and maintain metrics, dashboards, and alerting for hardware health across the fleet.

Python Linux Ansible Terraform Troubleshooting

20 jobs similar to Engineer

Jobs ranked by similarity.

Global

  • Provide production support on a shift according to the team on-call roster.
  • Work on tickets raised by customers and internal engineering/implementation teams while not on call for production support.
  • Continuously monitor the health and performance of our services, systems, and infrastructure.

Granicus builds and maintains technology that is transforming the Govtech industry by bringing governments and their constituents together. They serve 5,500 federal, state, and local government agencies and more than 300 million citizen subscribers, and are known for being one of the best companies to work for.

US

  • Own and operate end-to-end infrastructure for backend services, frontend systems and databases.
  • Build and maintain reliable deployment workflows including CI/CD pipelines and rollback procedures.
  • Improve system-wide observability through metrics, logging, alerting, and monitoring to ensure uptime.

Jito Labs builds a high-performance trading terminal on Solana. They are a lean, high-output team building something that sits at the intersection of execution quality, user experience, and on-chain infrastructure.

Global

  • Build and maintain our host provisioning stack to bring new bare metal online quickly and confidently.
  • Evolve our homegrown orchestration engine to manage clusters, containers, and VMs.
  • Build out internal observability and alerting so we catch fleet problems before customers feel them.

Railway's core mission is to make software engineers higher leverage. They provide powerful tools so engineers can spend less time setting up and more time doing. The team is small, with high ownership, and they are passionate about being exceptional.

$29,000–$36,000/yr
India

  • Design, build, and maintain scalable, reliable systems on GCP.
  • Develop automation for infrastructure provisioning using Terraform, Ansible, or Deployment Manager.
  • Manage incident response, conduct postmortems, and implement improvements to reduce recurrence.

SupplyHouse.com is an industry-leading e-commerce company specializing in HVAC, plumbing, heating, and electrical supplies since 2004. They value every individual team member and cultivate a community where people come first with Generosity, Respect, Innovation, Teamwork, and GRIT.

Global

  • Create, deploy, and manage high performing servers.
  • Deliver millions of requests globally with sub-second latency.
  • Shape something from the core, without legacy infrastructure.

Entefy is working to create the fastest data syncing experience ever built. They hold their data syncing, consistency and uptime to the highest standards and are looking for someone to manage high performing servers.

US Global

  • Performing day-to-day operational/DevOps tasks on Wikimedia’s public facing infrastructure.
  • Implementing and utilizing configuration management and deployment tools.
  • Leading continuous improvement by automating the installation, configuration and maintenance of services on our platform.

The Wikimedia Foundation operates Wikipedia and other Wikimedia free knowledge projects with the vision of a world where every single human can freely share in the sum of all knowledge. As a charitable, not-for-profit organization, it relies on donations and has staff members based in 40+ countries.

Unlimited PTO

  • Assess and improve visibility by identifying gaps in dashboards, metrics, and logs.
  • Refine alerts and dashboards for critical services to catch issues earlier.
  • Automate routine checks and monitoring tasks to free up engineers.

PlayOn is where high school sports come to life through platforms like GoFan, NFHS Network, and MaxPreps. As a growth-stage company backed by KKR, we build the technology that powers high school athletics from ticketing and streaming to fundraising and merchandise.

$140,000–$230,000/yr
US

  • Collaborate with Engineering, Product, and Operations to manage a global fleet of tens of thousands of media players and smart speakers.
  • Build tools in Bash and Golang for fleet management and investigate network issues, collaborating with customer Network Engineers.
  • Refine observability pipelines and processes to ensure efficient monitoring and support for distributed device management.

QSIC is a technology company that reinvents in-store audio by using audio, AI, and creativity to drive growth for retailers and brands. With team members in Australia, the US, and Mexico, they power thousands of stores across three continents, reaching over 100 million shoppers monthly, and received Series B funding in 2025.

US Unlimited PTO

  • Maintain, improve, and extend an AI platform already running in production.
  • Handle a mix of backend development, data pipelines, DevOps, and infrastructure work.
  • Translate business and product requirements into technical decisions independently.

Provectus is an AI consultancy and solutions provider that helps businesses adopt AI technologies, offering development and integration services. The job posting does not mention company size, but the company seems to foster a flexible, autonomous, and tech-forward culture.

Ireland

  • Design, build, and deploy production systems with a focus on scalability, reliability, observability, and performance.
  • Develop and maintain comprehensive automation solutions to eliminate toil and streamline operational efficiency.
  • Proactively monitor production systems and implement automated incident response mechanisms to minimise downtime.

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. The company is well-established and profitable with over $8 billion in revenue and values diversity and inclusivity.

$180,000–$250,000/yr
US

  • Build our core Python/Rust platform: request routing, AI workload orchestration, scheduling, GPU autoscaling, large-scale file storage, queueing, etc.
  • Produce forward designs for platform evolution as we scale to 100x current traffic and need to provide low latency across the world.
  • Leverage AI to an extreme level to automate the mundane parts of building complex but reliable systems.

Fal is building the infrastructure, tools, and model access to move from AI idea to production. They aim to be the unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products.

$145,000–$250,000/yr
US Unlimited PTO

  • Construct infrastructure as code, developing and enforcing best practice across configurations while preventing drift between Terraform configurations and infrastructure deployments.

SentiLink provides innovative identity and risk solutions, empowering institutions and individuals to transact with confidence. They are building the future of identity verification in the United States, replacing a clunky, ineffective, and expensive status quo with solutions that are 10x faster, smarter, and more accurate.

Canada

  • Define, drive, design, and build/ship end-to-end solutions that solve real customer problems.
  • Contribute to the end-to-end AI/ML software development lifecycle, ensuring reproducible research.
  • Drive architecture, design, and delivery of advanced ML systems in the Product R&D team.

Kinaxis is a global leader in modern supply chain orchestration. Known for its AI-infused platform and transparency across end-to-end supply chains, Kinaxis helps customers make faster, better decisions. The company has over 2000 employees worldwide and is recognized with Top Employer awards.

$180,000–$300,000/yr
US 20w maternity 12w paternity

  • Act as a trusted advisor to clients, providing technical expertise and guidance throughout engagements.
  • Conduct PoCs, workshops, presentations, and training sessions on GPU cloud technologies and best practices.
  • Collaborate with clients to understand their business requirements and develop solution architectures.

Lavendo partners with startups and high‑growth companies to help them hire top‑tier sales, go‑to-market, and technical talent. They are an equal opportunity workplace and consider all qualified applicants without regard to race, color, religion, national origin, age, sex, marital status, ancestry, disability, genetic information, veteran or military status, gender identity or expression, sexual orientation, or any other characteristic protected by law.

US Unlimited PTO

  • Lead onboarding end-to-end and extend it with additional use cases.
  • Own a small portfolio of customer accounts and act as a trusted technical partner all year.
  • Provide technical support and communicate crisply with customers throughout.

OpsMill is building the next generation of infrastructure data management, focusing on helping automation teams unify data and scale automation reliably. As a commercial open-source company, they are practitioners who understand the real-world challenges of scaling infrastructure automation.

Global

  • Travel frequently (up to 75%) to military installations to support system fielding, integration, and training.
  • Partner with operators and soldiers to tailor system configurations to mission needs.
  • Troubleshoot complex system and network issues and drive resolution to completion.

Research Innovations, Inc. (RII) is dedicated to breaking through the status quo with transformative technology. They build advanced software solutions for government and military missions, applying agile development and user-centered design to solve complex, mission-critical problems.

SRE

Fal
$180,000–$250,000/yr
US

  • Own and operate our Kubernetes infrastructure.
  • Build and maintain CI/CD pipelines and deployment infrastructure.
  • Leverage AI to automate analysis and resolution of production issues.

Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.

  • Maintain the reliability and performance of customer environments remotely, supporting Mirantis OpenStack/k0s layers.
  • Diagnose and resolve system-level issues, requiring hands-on Linux administration experience.
  • Troubleshoot customer environments based on Linux, OpenStack, Kubernetes, networking, and other cloud technologies; detect, report, and resolve issues.

Mirantis helps enterprises move to the cloud on their terms, delivering a true cloud experience on any infrastructure, powered by Kubernetes. They serve many of the world’s leading enterprises and value openness, collaboration, risk-taking, and continuous growth.

$120,000–$140,000/yr
Europe

  • Design and deliver backend features in Python for large-scale enterprise SaaS.
  • Contribute to technical excellence through testing, CI/CD, and secure design.
  • Provide technical guidance and architectural input across backend initiatives.

Arch Systems empowers discrete manufacturing facilities with deep data insights that enable optimal efficiency, precise KPIs, and proactive decision-making. The company works with leading manufacturers to integrate and optimise their data for actionable intelligence, fueling productivity and operational excellence.

India

  • Design, deploy, and manage Kubernetes-based platforms in production.
  • Implement and manage automation frameworks for infrastructure provisioning and operations.
  • Administer and optimize VMware environments (vSphere, ESXi, vCenter).

EPlus believes technology is a people business and delivers solutions that make a real difference. Their team is passionate, skilled, and driven, valuing collaboration, innovation, and extraordinary results, and is dedicated to fostering, cultivating, and preserving a culture that represents diversity and enables inclusion.