Senior Platform Engineer

 

Description:

As a Senior Platform Engineer on the Harrison Platform team, you will work in a small team of Solutions Architects and DevOps engineers to support and deliver components of the Harrison Platform.

The Harrison Platform is "the machine that builds the machine". It is a common toolset for building AI-as-a-Medical-Device solutions; an MLOps platform, if you will. It is used by our ventures to accelerate, enhance and simplify their model development. A key component of the Platform is our physical Machine Learning Training Cluster, which is based on NVIDIA DGX A100 systems and can also burst into AWS. Your role will focus on managing the cluster and associated AWS infrastructure, as well as the Kubernetes-based software stack it hosts.

 

What you'll do:

  • Develop and deploy the infrastructure stacks for MLOps platform related services using Terraform and Ansible,
  • Help develop and deploy new features and improvements to the cluster, both in the physical DataCentre and within AWS,
  • Perform maintenance tasks such as software upgrades,
  • Manage support requests that come in from our AI Engineers,
  • Write end user technical documentation,
  • Liaise with vendors as required on support issues,
  • Develop, update and improve a variety of open-source Terraform modules for venture and internal consumption,
  • Identify potential areas of improvement in both the technical and operational aspects of the Platform team, and
  • Occasionally visit the DataCentre in Sydney should the need arise (note that being based in Sydney is not a requirement).

 

What we're looking for:

  • Linux administration skills,
  • Kubernetes knowledge and experience,
  • AWS (or major cloud vendor) experience,
  • Familiarity or experience with Infrastructure as Code tooling such as Terraform or Ansible,
  • Bash and Python scripting skills,
  • Working experience with GitHub Actions workflows,
  • Familiarity with physical DataCentre environments, ideally with hands-on DataCentre / physical infrastructure experience,
  • Experience with storage systems,
  • Prometheus stack experience, and
  • Working knowledge of TCP/IP networking.

Organization: Harrison.ai
Industry: Engineering Jobs
Occupational Category: Senior Platform Engineer
Job Location: Sydney, Australia
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Intermediate
Experience: 2 Years
Posted at: 2024-07-20 10:15 am
Expires on: 2024-11-17