The primary purpose of the Data Scientist position is to design, implement, and maintain the process for building automated image interpretation tools and extracting tumor measurements in support of the TMI objective. This activity is an important sub-component of the overall function of TMI and requires a combination of computational skill and subject-matter (imaging) expertise.
Ideal candidates will have machine learning experience with medical imaging. Candidates will be asked to share examples of this experience.
This individual will be working with internal and external teams developing specific image analysis algorithms and will coordinate these efforts in a fashion that supports scientific/technical evaluation and integration into the broader TMI effort.
This individual will have demonstrated experience with programming languages and scripting methods (Python, MATLAB, C++, CUDA, Bash, and/or SQL), machine learning / deep learning methods, data analytics, and medical image analysis. Candidates with experience in common open-source scientific computing libraries such as PyTorch and TensorFlow are preferred. The ideal candidate will have strong computational and analytical skills, particularly in deep learning, and will be motivated by solving challenging medical research problems for patient benefit. Experience identifying opportunities to streamline and optimize code and imaging pipeline processes is a plus. An interest in continuously and independently exploring and learning new technologies and solutions beyond the candidate's current knowledge base is also required.
Technical Expertise
Tools/models development and management:
Working independently and with researchers in analyzing, defining, and resolving analytical problems and bugs.
Participating in discussion and implementation of machine learning model management solutions.
Evaluating existing algorithms/tools and developing new and user-friendly routines in an automated fashion that can be deployed to the broader TMI effort.
Maintaining knowledge of cutting-edge machine learning approaches and technologies and implementing these where appropriate.
Maintaining high code quality and ensuring code is thoroughly and consistently tested before it is deployed to end users.
Organizing data and publishing code with documentation, in line with departmental standards.
Providing support for existing software systems as they evolve.
Analytical Thinking
Computational programming skills:
Providing analysis of data, design, and feasibility of proposed solutions.
Developing solutions to ensure data flow through different platforms within the Context Engine and performing testing and documentation.
Designing and developing automated module testing routines (unit testing, integration testing, etc.) and defining version control procedures.
Identifying and evaluating publicly available, pre-developed containers and models to enrich the TMI container library.
Oral and Written Communication
Team support and guidance:
Transferring knowledge, expertise, and methodologies by proactively providing technical assistance to researchers and peers.
Serving as primary technical contact for members of the TMI automation team, receiving and reviewing requests, and assisting researchers in analyzing a wide variety of clinical data and in evaluating and interpreting the results.
Presenting results and progress in project meetings as well as external meetings, workshops, conferences, etc.
Communicating and assisting cooperatively and effectively with leaders, peers, end users and support teams when required.
Other duties as assigned
Education Required: Bachelor's degree in Biomedical Engineering, Electrical Engineering, Computer Engineering, Physics, Applied Mathematics, Science, Engineering, Computer Science, Statistics, Computational Biology, or related field.
Experience Required: Three years of experience in scientific software development/analysis. With Master's degree, one year of experience required. With PhD, no experience required.
Preferred Experience: Experience with common open-source scientific computing/machine learning libraries (e.g., PyTorch / TensorFlow), containerization, and cloud-native technologies (Docker & Kubernetes) is preferred.
Knowledge of version control protocols, automated test frameworks, and high-performance computing is highly desired.
It is the policy of The University of Texas MD Anderson Cancer Center to provide equal employment opportunity without regard to race, color, religion, age, national origin, sex, gender, sexual orientation, gender identity/expression, disability, protected veteran status, genetic information, or any other basis protected by institutional policy or by federal, state or local laws unless such distinction is required by law. http://www.mdanderson.org/about-us/legal-and-policy/legal-statements/eeo-affirmative-action.html
Additional Information
Requisition ID: 165119
Employment Status: Full-Time
Employee Status: Regular
Work Week: Days
Minimum Salary: US Dollar (USD) 91,000
Midpoint Salary: US Dollar (USD) 113,500
Maximum Salary: US Dollar (USD) 136,000
FLSA: exempt and not eligible for overtime pay
Fund Type: Soft
Work Location: Hybrid Onsite/Remote
Pivotal Position: Yes
Referral Bonus Available?: Yes
Relocation Assistance Available?: Yes
Science Jobs: No