Lead Software Engineer – Data

Mississauga, Ontario | Full Time | On-site


Egen is a fast-growing and entrepreneurial company with a data-first mindset. We bring together the best engineering talent working with the most advanced technology platforms, including Google Cloud and Salesforce, to help clients drive action and impact through data and insights. We are committed to being a place where the best people choose to work so they can apply their engineering and technology expertise to envision what is next for how data and platforms can change the world for the better. We are dedicated to learning, thrive on solving tough problems, and continually innovate to achieve fast, effective results.

Our Data Engineering teams build scalable data pipelines using Python, Spark, and cloud services (GCP and AWS). The pipelines we build typically integrate with technologies such as Kafka, Storm, and Elasticsearch. We are working on a continuous deployment pipeline that leverages rapid on-demand releases. Our developers work in an agile process to efficiently deliver high-value applications and product packages.

As a Lead Engineer at Egen, you will leverage Spark and GCP (preferred) to architect and implement cloud-native data pipelines and infrastructure to enable analytics and machine learning on rich datasets.  

Required Experience:

  • Built and run resilient data pipelines in production, and implemented ETL/ELT to load a multi-terabyte enterprise data warehouse.
  • Implemented analytics applications using multiple database technologies, such as relational, multidimensional (OLAP), key-value, document, or graph.
  • Defined data contracts and wrote specifications, including REST APIs.
  • Transformed data between data models and formats using current PySpark practices.
  • Built cloud-native applications and supporting technologies, patterns, and practices, including cloud services, Docker, CI/CD, DevOps, and microservices.
  • Planned and designed artifacts that describe software architectures involving multiple systems and technologies.
  • Worked in agile environments and are comfortable iterating quickly.

Nice to haves (but not required):

  • GCP expertise is preferred, but we will consider AWS.
  • Experience moving trained machine learning models into production data pipelines.
  • Experience in biotech, genomics, clinical research or precision medicine.
  • Expert knowledge of relational database modeling concepts, SQL skills, proficiency in query performance tuning, and desire to share knowledge with others.