My zeal for using mathematical modeling and algorithms to describe biological systems inspired me to pursue a Ph.D. in Bioinformatics at the University of Tehran in 2009. After defining my thesis topic, I moved to Toronto to study under the supervision of Prof. Gary Bader at the Donnelly Center (CCBR), University of Toronto, Canada. During my time at CCBR, I developed a new method for detecting biomarkers and critical molecular changes during prostate cancer progression. The paper was published in the journal Computers in Biology and Medicine. I defended my thesis in September 2014 and started a new position as a postdoctoral research fellow at the Institute for Research in Fundamental Sciences (IPM) in October 2014. In this position, I continued my research and developed a new algorithm for assigning types to interactions in gene regulatory networks inferred from microarray data. The paper was published in the journal Algorithms for Molecular Biology in 2015 and was chosen as one of the top reading papers at the RECOMB/ISCB Conference on Regulatory and Systems Genomics with DREAM Challenges, Philadelphia, USA, in 2015.
During my Ph.D. and first postdoc, I became familiar with artificial intelligence (AI) and machine learning techniques, which I believe are changing our world more profoundly every day. I read many papers, completed many online courses, and applied for another postdoctoral position in the US to pursue my dream of applying AI to medical science. I was accepted as a postdoctoral associate at Weill Cornell Medical College and moved to New York in 2017. Here, I found an ideal opportunity to work in one of the top research groups in the field of clinical image analysis. I therefore defined a new project that applied my skills in data mining and AI to the analysis of pathological images.
My first paper on artificial intelligence was published in EBioMedicine in January 2018. In this paper, I used several computational methods based on convolutional neural networks (CNNs) to build a stand-alone pipeline that effectively classifies histopathology images across different types of cancer. In particular, I demonstrated the utility of my pipeline for discriminating between two subtypes of lung cancer, adenocarcinoma and squamous cell carcinoma. My pipeline achieved an average area under the curve (AUC) of 0.89 for discrimination of lung cancer subtypes on whole-slide images obtained from The Cancer Genome Atlas. My method yields cutting-edge sensitivity on the challenging task of detecting various tumor classes in histopathology slides, thereby reducing the false-negative rate. This pipeline (Figure 1) is complementary to other clinical evaluation methods and is intended to improve pathologists’ knowledge of the disease and to improve treatment strategies. The pipeline does not require any prior knowledge of image color normalization from the user, which allows pathologists and medical technicians to apply this approach without extensive knowledge of optimization or mathematical tools. My pipeline and related documentation are freely available at https://github.com/ih-lab/CNN_Smoothie.
Figure 1: This flowchart demonstrates the CNN_Smoothie pipeline, which includes extracting data, training and evaluation of CNN algorithms, and prediction of various classes.
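To make the AUC metric reported above concrete, the following is a minimal sketch of how an AUC can be computed from classifier scores. It is not the CNN_Smoothie code: the function name roc_auc and the toy labels and scores are assumptions made purely for illustration, and the numbers are not results from the paper. The sketch uses the rank-based definition of AUC, namely the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy, made-up data (hypothetical; NOT taken from the paper):
# label 1 = one subtype, label 0 = the other; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported value should be read.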
I extended my research to developing methodologies for the search, exploration, and analysis of large compendia that include not only gene expression data and pathological images, but also many additional data types, such as human blastocyst images and MRI images. To this end, I defined a new project for assessing human blastocyst quality and predicting implantation success using deep learning methods.
Visual morphology assessment is routinely used for evaluating embryo quality and selecting human blastocysts for transfer after in vitro fertilization (IVF). However, the assessment produces different results between embryologists, and as a result, the success rate of IVF remains low. To overcome uncertainties in embryo quality, multiple embryos are often implanted, resulting in undesired multiple pregnancies and associated complications. Unlike other imaging fields, human embryology and IVF have not yet leveraged artificial intelligence for unbiased, automated embryo assessment. Therefore, I postulated that an AI approach trained on thousands of embryos can reliably predict embryo quality without human intervention. To test this hypothesis, I trained a deep neural network to help select the highest-quality embryos, based on a large collection of de-identified human embryo time-lapse images from a single, high-volume fertility center.
My deep neural network (DNN), called STORK, predicts blastocyst quality with >98% accuracy, generalizes well to images from other clinics, and outperforms individual embryologists. Using clinical data for 2,182 embryos, I then created a decision tree that integrates embryo quality and patient age to identify scenarios associated with increased or decreased chance of pregnancy. This data-driven analysis shows that the chance of pregnancy based on individual embryos varies from 13.8% to 66.3%, depending on automated embryo quality assessment and patient age. In conclusion, my AI-driven approach (Figure 2) provides a novel, reproducible way to assess embryo quality and uncovers new, potentially personalized strategies for selecting embryos that maximize the likelihood of pregnancy and minimize multiple pregnancies. I hope this system will be adopted in several hospital units, where it could significantly reduce confusion and delays related to patient medication. My discovery related to the automatic assessment of embryo quality is now protected by a CTL patent, and the results of this project have been published on bioRxiv.
Figure 2: This flowchart demonstrates the design and assessment phases of STORK.
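The idea of combining embryo quality and patient age can be illustrated with a small sketch. It is not the decision tree fitted in the study: instead of learning splits from data, it uses a plain stratification with fixed, hypothetical age cutoffs (35 and 40), hypothetical quality labels ("good" and "poor"), and a handful of made-up records. The function names age_band and pregnancy_rate_by_group are likewise assumptions for illustration only. Each (quality, age band) group plays the role of one leaf of a shallow tree, and the observed pregnancy rate is computed per group.

```python
from collections import defaultdict


def age_band(age, cutoffs=(35, 40)):
    """Map a patient age to a band label. The cutoffs are illustrative
    assumptions, not thresholds reported in the study."""
    low, high = cutoffs
    if age < low:
        return f"<{low}"
    if age < high:
        return f"{low}-{high - 1}"
    return f">={high}"


def pregnancy_rate_by_group(records):
    """Group (quality, age, pregnant) records by (quality, age band) and
    return the observed pregnancy rate in each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [pregnancies, total]
    for quality, age, pregnant in records:
        key = (quality, age_band(age))
        counts[key][0] += int(pregnant)
        counts[key][1] += 1
    return {key: preg / total for key, (preg, total) in counts.items()}


# Made-up records (hypothetical; NOT clinical data from the study):
# (predicted embryo quality, patient age, pregnancy outcome)
records = [
    ("good", 32, True), ("good", 33, True), ("good", 34, False),
    ("good", 41, True), ("good", 42, False),
    ("poor", 31, True), ("poor", 33, False),
    ("poor", 43, False), ("poor", 44, False),
]
rates = pregnancy_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(group, f"{rate:.2f}")
```

A fitted decision tree would choose its own split points from the data; the fixed bands here only show how the two variables jointly define the scenarios being compared.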
In addition to my work on analyzing embryo images, I have also worked on other collaborative projects dealing with many types of data. In a new project, I am working with digital MRI images to analyze prostate cancer grades using deep convolutional neural networks.
Prostate cancer (PC) is the most commonly diagnosed cancer in adult male populations. The aggressiveness of tumors, as well as the different disease subtypes, remain key challenges. Early detection of and intervention in aggressive prostate cancer can help improve the survival rate, while accurate diagnosis of prostate diseases helps prevent over-treatment. Prostate cancer diagnosis is currently based on histologic examination of tissue obtained via needle biopsy. Although biopsy usually provides a correct and definitive diagnosis of prostate cancer, patients undergoing prostate biopsy frequently experience minor complications, including hematospermia, hematuria, and rectal bleeding. About 2-3% of men develop a devastating infectious complication, with its associated mortality. Screening for prostate cancer based on clinical images can help identify the disease at an earlier stage, before symptoms appear, and ideally even predict high-grade or metastatic disease.
My goal in the first step of this project is to introduce an alternative, automated computational technique that distinguishes aggressive prostate cancer from non-aggressive variations, independent of biopsy, using multiple MRI data sets. In the next step, I will extend the method to detect various prostate disease conditions, including benign prostatic hyperplasia, prostatitis, and prostate cancer at different stages. My technique is based on AI and employs publicly available datasets as well as data generated by collaborators at Weill Cornell Medicine for training and testing. My method offers an opportunity to establish a database and an automated framework for the classification and analysis of MRI images (Figure 3) that, in combination with radiologists’ expertise, can lead to better diagnosis and treatment planning for patients in the future. This project is still in progress, and I hope the results will be published in 2019.
Figure 3: The figure shows two MRI images for (a) Gleason score ≥7 (aggressive) and (b) Gleason score ≤6 (non-aggressive) and their corresponding pathological images.
I am currently working with Dr. Iman Hajirasouliha, assistant professor of Computational Genomics at the Institute for Computational Biomedicine, Weill Cornell Medical School. I am also working with Dr. Olivier Elemento (director of the Caryl and Israel Englander Institute for Precision Medicine, WCMC), Dr. Bilal Chughtai (Urologist in Obstetrics and Gynecology, WCMC), and Dr. Nikica Zaninovic and Dr. Zev Rosenwaks from the Ronald O. Perelman and Claudia Cohen Center for Reproductive Medicine, WCMC. In future work, in collaboration with these senior faculty members, I intend to extend the project by integrating genomic information with clinical images, so that AI-based algorithms can be trained on both images and other clinical information such as chromosomal abnormalities, and to characterize structural variations such as gene fusions in whole-genome sequencing (WGS) and RNA-seq data.
My research addresses a fundamental need: AI that can be applied to medical science. Given my scientific background in machine learning, along with my mentoring and teamwork experience, I am confident that my work will continue to positively influence the field of health care. The methodological directions described above are only part of my future work. My collaborative efforts will direct me towards open problems to which I can apply my theoretical and mathematical skills in the development of useful and novel methodologies.