Description:
The pipelines, which handle 1bn+ rows (30TB+), must be designed for speed: data flowing from clients into the data warehouse (Azure Databricks) feeds GenAI workloads, with a target response time of 6 seconds. The warehouse has already been established but may need fine-tuning as the business scales.
Skills required:
- End-to-end experience building pipelines similar to the above, within a start-up or scale-up environment - the client currently uses PySpark and ADF (Azure Data Factory) for ETL but is open to hearing about any other ETL tools you can recommend (an illustrative PySpark sketch follows this list)
- Ability to work independently, as you will be responsible for everything from data discovery to solution design, troubleshooting and data engineering
- Demonstrated experience in cloud data warehousing and cloud-to-cloud transformations, ideally within Azure/Databricks, though other cloud platforms are acceptable for a technology-agnostic candidate
- Solid knowledge of emerging tools, technologies and trends within data engineering, with a passion for discovering them and introducing them to the business
- Expert-level Python and SQL knowledge; a software engineering or data science background would be highly regarded
- Desired but not essential: in-store retail or e-commerce experience within Australia
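
For illustration only, the ETL work described above might resemble the following minimal PySpark sketch: reading raw client data and landing it in a Delta table on Databricks. All paths, column names and table names are hypothetical placeholders, not the client's actual environment.

    # Minimal PySpark ETL sketch: read raw client data, clean it, and land it
    # in a Delta table on Databricks. All paths, columns and table names are
    # hypothetical placeholders, not the client's actual environment.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("client-etl-sketch").getOrCreate()

    # Read incoming client events from a (placeholder) landing zone.
    raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/events/")

    # Example transformation: parse the event timestamp and de-duplicate.
    cleaned = (
        raw.withColumn("event_ts", F.to_timestamp("event_ts"))
           .dropDuplicates(["event_id"])
    )

    # Append to a Delta table for downstream GenAI consumption.
    cleaned.write.format("delta").mode("append").saveAsTable("warehouse.events_clean")

Delta is assumed as the target format only because it is the Databricks default; at 1bn+ rows, incremental ingestion (e.g. Databricks Auto Loader) and sensible partitioning would matter at least as much as the transformation logic itself.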