Ilya Goldberg, Ph.D., Staff Scientist
Facility Head, Image Informatics and Computational Biology Unit
Dr. Goldberg received his Ph.D. in Biochemistry and Cell Biology from the Johns Hopkins University School of Medicine in 1996. Following postdoctoral training in crystallography and virology at Harvard University, and image informatics at MIT, he joined the NIA in 2001. While at MIT, he founded the Open Microscopy Environment (OME) together with Drs. Peter Sorger and Jason Swedlow. The aims of OME are to provide open information interchange formats and open-source software infrastructure for the scientific imaging community. Currently, the IICBU continues to develop software and standards for OME, new approaches to pattern recognition in images, and new technology for image-based high-throughput screening. All of this technology development drives the central theme of the IICBU: how cell and tissue morphology relate to cellular and organismal state.

Image Informatics and Computational Biology Unit (IICBU): This program is designed to develop technology for quantitative imaging assays for studying age-related processes at the cellular, tissue, and organism level.
1. Software and Standards for the Open Microscopy Environment (OME): OME is an open-source software project to implement image informatics infrastructure capable of analyzing, managing and organizing images and related information on a large scale (10^5 to 10^8 images per system) (1). This is a collaborative project between four academic groups: the NIA IICBU; Jason R. Swedlow, University of Dundee; Peter Sorger, Massachusetts Institute of Technology; and Kevin Eliceiri and John White, University of Wisconsin-Madison. The project currently comprises several hundred source files and nearly a half-million lines of code in Perl, C, Java, HTML, XML, and MATLAB. This ongoing project is used by many groups beyond the four collaborators, has active email lists for developers and users, produces at least one stable release per year, and has a live public code-base that receives a dozen commits per day. More information about OME, its history, architecture and technical documentation is available on its web-site at
Currently, the IICBU is involved in four aspects of the OME project: 1) curating the OME XML file format, which has gained acceptance by manufacturers of microscopy software and equipment; 2) implementing public image repositories based on OME that are cross-referenced with public genomics and other "omics" data sources; 3) developing end-user tools that work with OME's data model; and 4) maintaining and validating the OME Analysis Engine in machine vision and pattern recognition applications.
2. Quantitative Morphometrics for Determining Cellular and Organismal State: Automated image analysis can be divided into two broad categories: model-based and model-free. In traditional model-based systems, a model of what is being imaged is constructed manually and used as the basis for reporting quantitative information (e.g., an algorithm for finding "blobs" in an image that reports their size, shape, signal intensity, etc.). The main advantage of the model-based approach is that one controls the aspects of the image that will be considered (e.g., the algorithm and parameters for finding the "blobs"). A different approach is needed in situations where the model cannot be easily defined, or is completely unknown.
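As an illustration of the model-based idea, the sketch below finds "blobs" by thresholding and flood fill, then reports the size and mean intensity of each. It is a minimal example under stated assumptions, not the unit's software: the function name `find_blobs`, the fixed threshold, and the 4-connected neighborhood are all hypothetical choices made for illustration.

```python
from collections import deque


def find_blobs(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`.

    `image` is a list of equal-length rows of numbers (hypothetical format).
    Returns one dict per blob with its pixel count and mean intensity.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append(image[y][x])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append({"size": len(pixels),
                          "mean_intensity": sum(pixels) / len(pixels)})
    return blobs


# Toy image with two bright regions on a dark background.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 7],
    [0, 0, 0, 0, 7],
]
blobs = find_blobs(image, threshold=5)
```

The threshold and the connectivity rule are exactly the kind of hand-chosen parameters that give the model-based approach its control, and that must be re-tuned for each new kind of image.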
Model-free systems make no assumptions about an underlying model and perform the analysis after training with a relevant set of images. Thus, a model-free pattern recognition approach is more widely applicable than a model-based approach. It treats all images equivalently, and performs the same operations whether grading lymphomas, determining sub-cellular localization, sub-typing pollen grains, etc. Each image is reduced to a set of "signatures" (also called "features" in machine learning, or "image descriptors" in machine vision). Each signature is a numeric value produced by an algorithm sensitive to a specific type of image content, and can be thought of as a sensor for a specific image characteristic (various textures, intensity statistics, distribution of objects, etc.). A large collection of signatures (>1200 in our case) ensures that there is a sufficient variety of sensors available for many kinds of images. Because the vast majority of these signatures are irrelevant to a given imaging problem, all signatures with weak discrimination power are eliminated in a systematic, automated way. The reduced set of signatures for a particular problem is then used to train standard network-based classifiers.
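The elimination of weak signatures can be sketched as a ranking step. The text does not say which discrimination measure is used, so the Fisher-style ratio below (variance of the class means divided by the mean within-class variance) is an assumption, as are the function names and the toy signature values.

```python
from statistics import mean, pvariance


def fisher_score(values_by_class):
    """Variance of the class means over the mean within-class variance."""
    between = pvariance([mean(v) for v in values_by_class.values()])
    within = mean(pvariance(v) for v in values_by_class.values())
    if within == 0:
        # No spread inside any class: perfectly separating or useless.
        return float("inf") if between > 0 else 0.0
    return between / within


def select_signatures(samples, labels, keep):
    """Return the names of the `keep` highest-scoring signatures.

    `samples` is a list of dicts mapping signature name -> numeric value,
    one dict per image; `labels` gives the class of each image.
    """
    scores = {}
    for name in samples[0]:
        by_class = {}
        for sample, label in zip(samples, labels):
            by_class.setdefault(label, []).append(sample[name])
        scores[name] = fisher_score(by_class)
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Toy data: only "mean_intensity" differs between the two classes.
samples = [
    {"mean_intensity": 10.0, "texture": 0.5, "noise": 3.0},
    {"mean_intensity": 12.0, "texture": 0.6, "noise": 1.0},
    {"mean_intensity": 30.0, "texture": 0.5, "noise": 1.0},
    {"mean_intensity": 32.0, "texture": 0.6, "noise": 3.0},
]
labels = ["A", "A", "B", "B"]
selected = select_signatures(samples, labels, keep=1)
```

In this toy case the "texture" and "noise" signatures have identical class means, so they score zero and are dropped, while "mean_intensity" is retained for training the classifier.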
We have validated the generality and accuracy of this approach in over a dozen different imaging problems. Of particular interest in our Institute has been applying the model-free approach to study the aging process quantitatively and objectively. This required, as a prerequisite, a technique for continuous classification. We developed and validated two different approaches to solve this problem using images of C. elegans body wall muscle and non-invasive imaging of the C. elegans pharynx terminal bulb. In both cases, a machine-built continuous classifier was able to report a value that correlated with the known age of the worm, and was able to correctly interpolate ages that were not used in training the classifier.
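One simple way to obtain a continuous value from a classifier trained on discrete age classes is to weight each trained age by the probability the classifier assigns to it. The text does not describe the two approaches that were actually developed, so this probability-weighted average is only an illustrative assumption, and `continuous_value` is a hypothetical helper.

```python
def continuous_value(class_probabilities):
    """Probability-weighted average of the trained class values.

    `class_probabilities` maps a trained class value (e.g. age in days)
    to the classifier's probability, or any non-negative score, for it.
    """
    total = sum(class_probabilities.values())
    if total <= 0:
        raise ValueError("at least one class must have a positive score")
    weighted = sum(value * p for value, p in class_probabilities.items())
    return weighted / total


# Classifier trained on worms aged 2, 6 and 10 days (toy numbers).
# An image scoring mostly between day 6 and day 10 gets an in-between age.
estimated_age = continuous_value({2: 0.1, 6: 0.5, 10: 0.4})
```

Because the result can fall between the trained classes, an estimate of this kind can interpolate ages that were never presented during training.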
Continuous classification is being used to determine if tissue morphology at an early age is correlated with total life span. Ongoing experiments use non-invasive imaging of the worm's pharynx terminal bulb (the worm's eating organ), then track each individual worm to determine its total life span. The images are then grouped into "short-lived," "medium-lived" and "long-lived" depending on the life span. Given a sufficient sample size, the ability to train a classifier indicates that a correlation exists between early morphology and total life span on an individual basis. Preliminary results indicate that this may indeed be the case. If more data reinforces these results, follow-up experiments will use this classifier to segregate a genetically identical population into groups of long and short-lived worms at an early age. These sub-populations can then be compared using microarrays to determine the genes that are expressed differently in the two populations, and that may be exerting control over or are influenced by the aging rate.
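The grouping of images by life span can be sketched as follows. The text does not say how the boundaries between groups are chosen, so splitting the ranked worms into equal thirds is an assumption; the worm identifiers and life spans are made-up toy values.

```python
def group_by_life_span(life_spans):
    """Assign each worm to a life-span group by rank terciles.

    `life_spans` maps a worm identifier to its total life span in days.
    Returns a dict mapping each identifier to a group name.
    """
    ranked = sorted(life_spans, key=life_spans.get)
    n = len(ranked)
    groups = {}
    for rank, worm in enumerate(ranked):
        if rank < n / 3:
            groups[worm] = "short-lived"
        elif rank < 2 * n / 3:
            groups[worm] = "medium-lived"
        else:
            groups[worm] = "long-lived"
    return groups


# Toy life spans (days) for six individually tracked worms.
life_spans = {"w1": 12, "w2": 21, "w3": 15, "w4": 25, "w5": 18, "w6": 10}
groups = group_by_life_span(life_spans)
```

The early-age images of each worm would then inherit that worm's group label and serve as the training classes for the classifier.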
As another application, a high-content screening platform is being developed to make image-based morphological screens cheaper and easier. This platform is based on microarray technology to print RNA-interference (RNAi) or gene-expression constructs at a high density on microscope slides. Once the slide is printed with 2000-5000 different gene-specific constructs, cells are plated on the entire slide. Cells that land on a printed "spot" will be altered relative to their neighbors depending on what was printed on that spot - either a single gene will be knocked-down if RNAi was printed, or a single gene will be over-expressed if an expression construct was printed. Current preliminary results indicate that we can print RNAi at adequate densities, grow cells on these slides, and observe predicted phenotypes depending on what was printed. Knocking down a required gene results in a "hole" in the continuous lawn of cells wherever this gene's RNAi was printed on the slide. These control experiments are being followed up with a full-scale screen looking for genes whose expression levels affect nuclear and mitochondrial morphology.
Finally, we have embarked on a project to evaluate pattern recognition approaches and quantitative morphometry in machine-assisted medical diagnosis. This project is a collaboration with Dr. Elaine Jaffe at the National Cancer Institute, a world-renowned expert in classifying human lymphoma. Currently, we are performing control experiments to determine how well our machine classifiers can reproduce diagnoses of three "classic" types of lymphoma: follicular, mantle-cell, and lymphocytic.

Contact Information:
Laboratory of Genetics
Biomedical Research Center, room 10B119
251 Bayview Boulevard, Suite 100
Baltimore, MD 21224-6825

Phone 410-558-8688
Fax 410-558-8331

Updated: Saturday October 20, 2012