Senior Scientific Data Engineer (Institutional Informatics Team - Joint Genome Institute)

Details

Posted: 2026-06-10

Location: Berkeley, California

Berkeley Lab's (LBNL) Joint Genome Institute (JGI) has an opening for a Senior Scientific Data Engineer to join the Institutional Informatics Team!

In this exciting role, you will provide technical expertise supporting scientific data generation and computing systems that enable laboratory operations, data analysis, and project management. You will analyze complex scientific and operational requirements and translate them into scalable, user-focused system solutions and functional specifications. You will support the design, implementation, and continuous improvement of core platforms, including laboratory workflow orchestration, genomic data generation, metadata management, status tracking, the Laboratory Information Management System (LIMS), and the Data Warehouse/Data Lakehouse.

This position has an anticipated start date of August 3, 2026.

What You Will Do:

Translate complex scientific, operational, and user requirements into functional specifications, system designs, and implementation plans for system enhancements, integrations, and shared data platforms.
Design, develop, deploy, and support core systems, APIs, and workflows--including the Laboratory Information Management System (LIMS), Data Warehouse/Data Lakehouse, and Proposal/Project Management platforms--that enable genomic data generation, metadata management, and laboratory operations.
Lead the resolution of complex technical challenges, drive system improvements, and ensure production platforms are reliable, scalable, interoperable, and high-performing.
Promote engineering best practices through technical reviews, documentation, mentorship, and continuous improvement of development and operational processes.

What We Are Looking For:

A Bachelor's degree (or equivalent knowledge/training) in Computer Science or a related field and a minimum of 8 years of related professional experience, or an equivalent combination of education and professional experience. Related experience means developing, integrating, deploying, and supporting production software applications and data systems that enable metadata management, workflow orchestration, data lifecycle operations, and broad user access to scientific and operational data.
Experience with data engineering and event-driven technologies such as Airflow, Kafka, or related tools.
Experience working with various database and data storage technologies including relational databases, object storage platforms, and systems supporting structured, semi-structured, and large-scale datasets.
Strong knowledge of software and data engineering fundamentals supporting large-scale production systems, including system design, APIs, testing methodologies, concurrency, reliability, scalability, interoperability, and performance optimization.
Proficiency in Python and experience with one or more additional programming languages.
Experience using AI-assisted development tools, with demonstrated sound judgment in evaluating and validating generated code for production suitability.
Demonstrated ability to provide technical leadership in shaping system architecture and technical direction across cross-functional engineering groups.
Excellent communication skills, including experience organizing and presenting complex technical information to internal teams and stakeholders.
Demonstrated experience collaborating with stakeholders to understand project goals and translate complex scientific, operational, and user requirements into automated systems, technical specifications, and implementation plans.
