I am an assistant professor in the Department of Computer Science at Texas State University, and an adjunct assistant professor in the Department of Cellular and Structural Biology at the University of Texas Health Science Center - San Antonio. I am the Principal Investigator of the Oncinfo Lab. I received my Ph.D. from the University of British Columbia (UBC) in 2011. Prior to my current position, I was a researcher at the British Columbia Cancer Agency in Canada, and also in the Department of Genome Sciences at the University of Washington. My old homepage is out of date, but it archives my previous research. See my updated CV for a summary of my professional activities.
I am a bioinformatician and computational biologist, and my primary interest is developing new machine learning algorithms for analyzing complex, large, and rich biological data. I have experience analyzing data produced by a wide range of technologies, including next-generation sequencing, ChIP-seq, and flow cytometry. I am interested in combining techniques from a variety of fields, such as spectral graph theory, feature extraction, Bayesian statistics, dynamic Bayesian networks, and deep learning, to develop novel algorithms that can recognize patterns in biological or clinical data that state-of-the-art methodologies miss. I am eager to collaborate across disciplines with smart biologists and clinicians to define and accomplish scientific projects, with the goal of making discoveries in areas such as molecular and cell biology, cancer diagnosis and prognosis, phylogenetics, genomics, and epigenomics.
Postdoctoral and graduate student positions are available in the Oncinfo Lab. Interested individuals should be highly motivated to perform cutting-edge research in bioinformatics and computational biology. Qualification criteria include, but are not limited to, professional expertise in programming and scripting, some basic knowledge of molecular biology or a willingness to quickly obtain that background, and prior experience in comprehending scientific text and in technical writing. Exposure to interdisciplinary research or familiarity with R is a plus. See this advertisement for more details [pdf].
The primary means of communication between me and students who enroll in my classes is Piazza. Please sign up using the provided link to get access to general announcements, homework, exams, slides, etc. Students are encouraged to ask questions via Piazza and to read the discussions initiated by other students; they are discouraged from sending emails to me directly.
- Spring 2017: Formal Languages & Foundations of Computer Science II (sign up here).
- Fall 2016: Formal Languages & Foundations of Computer Science II.
- Spring 2016: Formal Languages.
- Fall 2015: Formal Languages.
- Spring 2015: Formal Languages & Foundations of Computer Science II.
- Fall 2014: Formal Languages.
I am currently leading a couple of projects defined in collaboration with academia and industry. My partners in academia include the Karsan lab at the British Columbia Cancer Agency in Canada, and the Noble lab at the University of Washington. I also have ongoing collaborations with R&D departments of top pharmaceutical companies such as Pfizer and Novartis. There are more details on my current scientific projects on the Oncinfo website.
Most of my papers are accessible from my Google Scholar Profile.
- H. Zare et al.,
Inferring Clonal Composition from Multiple Sections of a Breast Cancer,
PLoS Computational Biology 2014, 10(7).
"We proposed a generative model for Next Generation Sequencing data derived from multiple subsections of a single tumor, and we described an EM procedure for estimating the clonal genotypes, number of clones, and relative frequencies using this model."
Most of the code and algorithms I have developed for scientific computing and data analysis are publicly available, including:
- Pigengene; an R package that provides an efficient way to infer biological signatures from gene expression profiles based on gene network analysis.
- Clomial; an R package for fitting binomial distributions to data obtained from next-generation sequencing of multiple samples from the same tumor. The trained parameters can be interpreted to infer the clonal structure of the tumor.
- SamSPECTRAL; an R package which implements my enhancement to spectral clustering for fast analysis of large data sets. It is capable of clustering flow cytometry data with 100K events in minutes.
- FeaLect; an R package which implements my novel scoring scheme useful for feature selection. It is a general purpose tool with applications in bioinformatics.
The best way to reach me is by sending an email to me at
I will be happy to meet colleagues and students during my office hours, Mondays and Wednesdays 8:30am-11:00am in Comal 307B. Please send me an email to set up an appointment before your visit.