Tumor classification: molecular analysis meets Aristotle
© Berman 2004
Received: 15 November 2003
Accepted: 17 March 2004
Published: 17 March 2004
Traditionally, tumors have been classified by their morphologic appearances. Unfortunately, tumors with similar histologic features often follow different clinical courses or respond differently to chemotherapy. Limitations in the clinical utility of morphology-based tumor classifications have prompted a search for a new tumor classification based on molecular analysis. Gene expression array data and proteomic data from tumor samples will provide complex data that is unobtainable from morphologic examination alone. The growing question facing cancer researchers is, "How can we successfully integrate the molecular, morphologic and clinical characteristics of human cancer to produce a helpful tumor classification?"
Current efforts to classify cancers based on molecular features ignore lessons learned from millennia of experience in biological classification. A tumor classification must include every type of tumor and must provide a unique place for each tumor within the classification. Groups within a classification inherit the properties of their ancestors and impart properties to their descendants. A classification was prepared grouping tumors according to their histogenetic development. The classification is simple (reducing the complexity of information received from the molecular analysis of tumors), comprehensive (providing a place for every tumor of man), and consistent with recent attempts to characterize tumors by cytogenetic and molecular features. The clinical and research value of this historical approach to tumor classification is discussed.
This manuscript reviews tumor classification and provides a new and comprehensive classification for neoplasia that preserves traditional nomenclature while incorporating information derived from the molecular analysis of tumors. The classification is provided as an open access XML document that can be used by cancer researchers to relate tumor classes with heterogeneous experimental and clinical tumor databases.
In January 1999, the U.S. National Cancer Institute (NCI) issued a challenge to the scientific community "to harness the power of comprehensive molecular analysis technologies to make the classification of tumors vastly more informative. This challenge is intended to lay the groundwork for changing the basis of tumor classification from morphological to molecular characteristics." Not surprisingly, this has resulted in lively debate over the relative value of morphologic and molecular classifications.
A classification is an organization of everything in a domain by hierarchical groups, according to features generalizable to the members of the groups. Four terms with distinctly different meanings have been used interchangeably with "classification," leading to considerable confusion among pathologists and cancer researchers. These terms are: identification, discrimination, taxonomy, and ontology. Identification (also known as diagnosing or naming) is the act of placing something into its correct slot within an existing classification. Discrimination is finding features that separate members of a group according to expected variations in group behavior. Examples of discrimination are grading and staging, which involve reporting additional morphologic features (grading) or clinical behavior (staging) that help predict a particular tumor's clinical course or response to therapy. A taxonomy is a complete listing of all the members of a classification. In the case of neoplasia, a taxonomy would be the complete listing of all the different named tumors. An ontology is a rule-based grouping of some portion of a taxonomy. Ontologies support queries and logical inferences pertaining to the [ontologic] group members.
Much of the current work in the molecular classification of tumors is actually discriminant analysis disguised as classification. In a typical gene expression array study, the researcher will look at a group of tumors of a specific type. Cluster analysis of the gene expression array values will help separate the tumors into groups with common expression patterns. Some of these groupings will prove to have a specific biologic feature (e.g. increased tendency to metastasize, higher response to a chemotherapeutic agent, lengthened survival) [4–8]. The groupings seldom qualify as new classes if they merely represent variations in the expected biology of a type of tumor. Variant groups are disqualified as classes if it can be shown that a tumor of a certain type may progress from one variant group to another variant group over time (e.g. slow-growing variant at one stage in development and fast-growing variant at another stage). A key concept in a classification is that the members of one class cannot transform into the members of another class (i.e. a colon carcinoma does not transform into a colon lymphoma).
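The disqualification rule described above (a variant group is not a class if a tumor can move from one group to another over time) can be illustrated with a minimal sketch. The observation tuples, group labels and function name are hypothetical and are not drawn from any study cited here.

```python
from collections import defaultdict

def groups_with_transitions(observations):
    """Return (earlier_group, later_group) pairs that some tumor moved between.

    observations: iterable of (tumor_id, time_point, group) tuples.
    """
    groups_over_time = defaultdict(list)
    # Sort by tumor and then by time so each tumor's groups are in time order.
    for tumor_id, _, group in sorted(observations, key=lambda o: (o[0], o[1])):
        groups_over_time[tumor_id].append(group)

    transitions = set()
    for groups in groups_over_time.values():
        for earlier, later in zip(groups, groups[1:]):
            if earlier != later:
                transitions.add((earlier, later))
    return sorted(transitions)

# Made-up follow-up data: tumor-1 changes group, tumor-2 does not.
observations = [
    ("tumor-1", 0, "slow-growing"),
    ("tumor-1", 1, "fast-growing"),
    ("tumor-2", 0, "slow-growing"),
    ("tumor-2", 1, "slow-growing"),
]
transitions = groups_with_transitions(observations)
```

Any pair returned here marks two groups that behave as variants of one tumor type rather than as separate classes.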
In the author's opinion, common misuses of the term "classification" form the greatest impediment to progress in the field of cancer genetics. It is impossible to create a molecular classification of tumors based solely on the separation of tumors by variations of molecular markers. Clustering by variation only identifies differences among tumors and is not sufficient to establish a classification. Classification is the process of showing that certain differences reliably distinguish the members of a group from the members of all other groups, and that these differences apply to the group's hierarchical descendants. Therefore, the data that come from the molecular analysis of tumors can be considered a first step in the process of tumor classification.
Who actually uses tumor classifications? The author considers himself an example of someone who needs to have a comprehensive tumor classification. As Program Director for Pathology Informatics within the National Cancer Institute's (NCI's) Cancer Diagnosis Program, the author is responsible for developing research initiatives that assemble and organize large amounts of pathology data. These efforts need to interoperate with other NCI programs, including the NCI Center for Bioinformatics, which has created tools for linking experimental data, pathology data and clinical trial data. A tumor classification serves as the "key" data structure that links the names of tumors to tumor-related data held in NCI's different clinical and experimental databases. A good classification should help drive down the complexity of enormous databases and help us discover relationships among different data elements by assembling data under sensible group hierarchies.
Classifications are important because class properties are shared among the members of a class, and because members of a class inherit the properties of their ancestors. Simply knowing the class of a bacterium can provide a microbiologist with deep insights relating to the expected growth conditions for the organism and the types of antibiotics that may be effective against the organism.
A classification can be thought of as the encapsulation of all knowledge related to a domain. In a modern classification, the elements of the classification (classes and instances) serve as annotation keys and are capable of relating all data to the classification, regardless of the location of the data. Using classed tumor names [from a standard nomenclature], a search of gene expression array databases might locate gene array data specific to the class.
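As a minimal sketch of how classed tumor names might act as annotation keys, the example below groups records from two separate sources under a shared tumor class. The record layout, dataset identifiers and class labels are assumptions made for illustration; the labels follow the prose of this article and are not the element names of the supplemental XML file.

```python
from collections import defaultdict

# Hypothetical mapping from tumor name to class (labels follow the prose).
TUMOR_CLASS = {
    "mesothelioma": "mesoderm",
    "synovial sarcoma": "mesoderm",
    "colon adenocarcinoma": "endoderm/ectoderm",
}

# Two made-up record sets standing in for separate databases.
expression_records = [
    {"tumor": "mesothelioma", "dataset": "array-001"},
    {"tumor": "colon adenocarcinoma", "dataset": "array-002"},
]
clinical_records = [
    {"tumor": "synovial sarcoma", "dataset": "trial-A"},
    {"tumor": "colon adenocarcinoma", "dataset": "trial-B"},
]

def records_by_class(*sources):
    """Group dataset identifiers from any number of sources by tumor class."""
    grouped = defaultdict(list)
    for source in sources:
        for record in source:
            tumor_class = TUMOR_CLASS.get(record["tumor"])
            if tumor_class is not None:  # skip names that are not yet classed
                grouped[tumor_class].append(record["dataset"])
    return dict(grouped)

linked = records_by_class(expression_records, clinical_records)
```

Because the class, rather than the tumor name, is the key, data on mesothelioma and synovial sarcoma end up side by side even though neither source mentions both tumors.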
In the last few years, efforts have begun to characterize tumors based on molecular pathways that will serve as targets for new, non-toxic chemotherapeutic agents. There have been early successes with Gleevec in tumors sensitive to the inhibition of tyrosine kinases (gastrointestinal stromal tumors (GIST) and chronic myelogenous leukemia). Both of these tumors derive from non-endodermal/ectodermal embryonic layers, suggesting that molecular pathways (hence targets for chemotherapy) may be class-dependent.
Classifications are created piecemeal for specific sites or organ systems. Nobody has published a comprehensive classification, although comprehensive taxonomies have been attempted.
Classifications are often based on medical disciplines, rather than on any biologic principles (e.g. classification of dermatologic tumors).
A given tumor will appear redundantly when subclassifications are merged.
No tumor classification has been prepared in a standard format designed to exchange, merge or analyze heterogeneous biological data.
The most widely used authoritative resources are the World Health Organisation classifications, which list the tumors that occur at different body sites. The problem with an organ system approach to classification is that every organ contains organ-specific and organ non-specific cell types. The brain, for instance, contains connective tissue and lymphoid tissue, and therefore is prone to tumors of connective tissue and lymphoid tissue. A listing of tumors that occur in the brain must include: osteocartilaginous tumors, lipoma, fibrous histiocytoma, hemangiopericytoma, rhabdomyosarcoma, melanoma, lymphoma and myeloma, among others. These same tumors will be included again and again in every site-specific classification. Although each term may occur only once in each site-specific classification, the same lesion may occur a virtually limitless number of times when the site classifications are combined into a comprehensive classification of tumors.
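The redundancy that appears when site-specific lists are combined can be shown with a small sketch. The site lists below are hypothetical and heavily abbreviated; they are not taken from the World Health Organisation classifications.

```python
from collections import Counter

# Hypothetical, heavily abbreviated site-specific tumor lists.
site_lists = {
    "brain": ["glioblastoma", "lipoma", "lymphoma", "melanoma"],
    "skin": ["squamous cell carcinoma", "lipoma", "lymphoma", "melanoma"],
    "stomach": ["adenocarcinoma", "lipoma", "lymphoma"],
}

def merged_occurrences(lists_by_site):
    """Count how many site lists each tumor name appears in after merging."""
    return Counter(name for names in lists_by_site.values() for name in names)

counts = merged_occurrences(site_lists)
# Names that would occupy more than one slot in the merged listing.
repeated = sorted(name for name, n in counts.items() if n > 1)
```

With only three sites, three tumors already appear more than once; every added site list repeats the same organ non-specific tumors again.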
Although cancer taxonomies are different from classifications, they usefully provide all the instances of tumors that must be grouped within a classification. Excellent tumor taxonomies are now publicly available at no cost.
The UMLS Metathesaurus is a collection of medical terms collected from about 100 different nomenclatures. It is the richest source of term synonymy in existence and has a comprehensive and detailed set of neoplasia-related terms. The UMLS Metathesaurus is available from the U.S. National Library of Medicine at: http://www.nlm.nih.gov/research/umls/
MESH is a curated hierarchical listing of medical terms and includes a detailed and comprehensive tumor nomenclature. MESH is provided by the U.S. National Library of Medicine at: http://www.nlm.nih.gov/mesh/filelist.html
The ICD-O is prepared by the World Health Organization. It is in its third version and is available at no cost in the U.S. via state central cancer registries. Non-U.S. facilities should contact their health services for information. Additional information can be obtained at the NCI's Surveillance Epidemiology and End Results (SEER) website at: http://training.seer.cancer.gov/module_icdo3/icdo3_home.html
The National Cancer Institute curates a collection of terms related to cancer. The thesaurus can be freely downloaded from: ftp://ftp1.nci.nih.gov/pub/cacore/EVS/
A classification is a hierarchical grouping, with each group defined by the greatest number of taxa (informative features) that can apply to every instance of the group.
Every instance must fit into the classification, and every instance and group must have exactly one slot in the classification.
Instances and groups are separable from other instances and groups by taxa.
Every classification must be constantly tested and restructured (groups and instances) as needed.
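The second rule above (every instance fits, and fits in exactly one slot) lends itself to a mechanical check. The following is a minimal sketch, assuming the classification is held as a mapping from class name to member tumors; the draft classification shown is deliberately flawed and is hypothetical.

```python
def check_slots(classification, taxonomy):
    """Return (unplaced, duplicated) tumor names.

    classification: mapping of class name -> list of member tumor names.
    taxonomy: iterable of every tumor name that must be placed.
    """
    placements = {}
    for class_name, members in classification.items():
        for tumor in members:
            placements.setdefault(tumor, []).append(class_name)
    unplaced = sorted(t for t in taxonomy if t not in placements)
    duplicated = sorted(t for t, classes in placements.items() if len(classes) > 1)
    return unplaced, duplicated

# Hypothetical draft that breaks the rule in two ways.
taxonomy = ["mesothelioma", "synovial sarcoma", "seminoma", "renal cell carcinoma"]
draft = {
    "coelomic cavity": ["mesothelioma", "synovial sarcoma"],
    "soft tissue": ["synovial sarcoma"],  # second slot for the same tumor
    "germ cell": ["seminoma"],
}
unplaced, duplicated = check_slots(draft, taxonomy)
```

A draft passes only when both returned lists are empty.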
Organs may have multiple embryologic derivatives (e.g. skin contains tissues of ectodermal, neuroectodermal and mesodermal lineages), but any given cell has only one lineage. This means that a histogenetic classification can assign any tumor to a unique position within the classification.
For the most part, tumors have a cell developmental stage. For instance, blastomas are thought to arise from a cell type that precedes organ differentiation. Squamous cell carcinomas of skin are tumors that have features of cell type that developed from one of the embryonic layers (ectoderm). An embryologic approach permits us to assign a cell type and a developmental stage to tumors.
A classification based on developmental histogenesis is relevant to the behavior of tumors. Ectoderm and endoderm-derived tumors metastasize via lymphatics. Mesenchyme-derived tumors tend to metastasize by hematogenous spread.
A classification based on developmental histogenesis is consistent with modern molecular analysis of tumors. Mesenchyme-derived tumors tend to be characterized by simple fusion genes. Ectoderm and endoderm-derived tumors tend to be genetically unstable and cannot be characterized by a single genetic abnormality. Primitive blastomas share similar markers regardless of the organ of origin.
The complete classification schema is available as an annotated supplemental XML document with this article [see Additional file 1]. An abbreviated classification, containing the developmental divisions, is described below.
The classification assigns tumors to the different stages in human development, ignoring embryologic categories that are not associated with tumors and combining embryologic categories that do not serve to distinguish tumors of a given type.
The first division of the classification divides tumors by embryonic lineage (from which the body develops) or trophoblastic (extra-embryonic) lineage. From the embryonic class come the primitive cells (also called totipotent cells) that precede the differentiated cells. The primitive cells give rise to several subclasses of tumors.
Primitive tumors that remain uncommitted. This would include the Ewing family of tumors, which includes Ewing's tumor, peripheral neuroectodermal tumors (PNET), and intra-abdominal desmoplastic small-round-cell tumor. The tumor cells may show certain features of differentiation, but the cells are not equivalent to any specific differentiated cell-type, and seem to occupy a primitive developmental state that precedes the development of embryonic layers.
Primitive tumors that differentiate. This group can be divided into primitive tumors with multi-lineage differentiation (as in teratomas) and primitive tumors with restricted differentiation (fewer than three embryonic layers, such as pancreatoblastoma and hepatoblastoma).
The concepts of germ cells and germinal cells are often confusing to pathologists, who use these terms somewhat differently than embryologists. For pathologists, germ cells are the specialized cells that give rise to ova (in the female) and sperm (in the male). The process of differentiation of ova and sperm is no different than the process of differentiation for any other cell type, and the most frequently occurring tumors in this lineage are dysgerminomas (in females) and seminomas (in males). Germ cells are differentiated cells and should not be classified with totipotent, primordial, uncommitted or primitive cells. The term germinal cells is probably best avoided altogether. It is sometimes used to mean totipotent, sometimes as a synonym for germ cells and sometimes to indicate lineage from one of the early germ layers. All three meanings are unrelated, and the term "germinal" is confusing when applied to tumor classification. In the classification, "germ cell" is intended to mean only one thing: the cell lineage that gives rise to differentiated ova or sperm.
In the classification, the classes of endodermal and ectodermal derived tumors are combined. This is done simply because there seem to be no biologic, clinical, morphologic, or molecular differences among the tumors derived from either of these germ layers. This division contains most of the commonly occurring tumors of man. Among the tumors of the endoderm/ectoderm division, there may be some value in subdividing these by functional cell type. The classification separates the epithelial tumors that arise from ectoderm/endoderm surface epithelium (which would include squamous cell carcinoma of skin or bronchus or esophagus, and adenocarcinoma of colon) from tumors that arise from the ectodermal/endodermal organ epithelium (breast carcinoma, salivary gland carcinoma, pancreatic carcinoma, hepatocellular carcinoma) and tumors arising from ectoderm/endoderm endocrine epithelium (thyroid papillary and follicular carcinoma and pituitary adenoma).
The class of mesodermal tumors is particularly confusing to the non-embryologist because it contains all sarcomas, as well as tumors derived from specialized mesodermal epithelium. In human development, the mesoderm is the embryonic layer separating the ectoderm and the endoderm. It gives rise to all of the connective tissue of the body (i.e. the mesenchyme). The mesenchymal tissues are usually divided into the soft tissues (deriving from muscle, fibrous tissue, and vascular tissue), hard tissues (i.e. bone and cartilage) and hematopoietic tissue (which would include all lymphomas and leukemias regardless of their tissue of origin). In addition to the mesenchyme, the mesoderm is capable of creating lumina or cavities lined by specialized mesodermal epithelium. The coelomic cavities become the pleura, peritoneum, pericardium, and joint spaces, lined by mesothelium and synovium. These give rise to mesothelioma and to synovial sarcoma, two tumors that are morphologically similar, composed of both epithelial cells and spindle cells.
A very specialized coelomic lining cell covers the gonads. These cells are morphologically, clinically and genetically distinct from the other coelomic lining cells and are assigned their own subclass, which includes papillary serous carcinoma of ovary.
The coelom is also capable of forming epithelial-lined ducts. These ducts (such as the paramesonephric duct) give rise to the Fallopian tubes and uterus. All uterine and cervical cancers fall in the class of mesodermal, coelomic-ductal tumors. The mesodermal lineage of these tumors is consistent with the variety of mixed epithelial and non-epithelial tumors arising from the uterus (endometrial carcinoma, carcinosarcoma of endometrium, heterologous mixed mesodermal tumors of uterus).
A specialized mesoderm develops subjacent to coelomic cavities. Specialized sub-coelomic mesoderm tissues give rise to the gonads, the adrenal cortex and to the kidneys. Tumors derived from these specialized mesodermal tissues are assigned their own subclasses. The mesoderm that develops as gonadal stroma gives rise to the sex cord-stromal tumors of the ovary and testis (e.g. granulosa cell tumor, thecal tumors, Sertoli-Leydig tumors, sex cord tumor with annular tubules). The mesoderm that develops as adrenal cortex gives rise to cortical adenomas and carcinomas. The mesoderm that develops as kidney gives rise to renal cell carcinoma and all of its variants.
Cells deriving from neuroectoderm account for brain tumors (neural tube) and tumors derived from the neural crest (peripheral nervous system tumors, some neuroendocrine tumors and melanocytic tumors). True blastic brain tumors (tumors derived from cells that have not demonstrated neuroectodermal differentiation, e.g. medulloblastoma) would not be classified among tumors deriving from neuroectoderm. Not every tumor with a blastoma suffix is derived from primitive cells. Some blastomas are highly anaplastic versions of tumors that derive from differentiated cells. Glioblastoma is a good example. These tumors are closely related to high grade astrocytomas, and they would be classed as tumors of neural tube parenchyma.
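The divisions described in the preceding paragraphs can be sketched as a parent map from which a tumor's lineage is read by walking upward. This is an illustrative reading of the prose only: the class labels, the root node "tumor", and the placement of the primitive, germ cell and germ-layer classes as siblings under "embryonic" are assumptions, and the supplemental XML file may be organized differently.

```python
# Each class points to its parent; None marks the root.
PARENT = {
    "tumor": None,
    "trophoblastic": "tumor",
    "embryonic": "tumor",
    "primitive": "embryonic",
    "germ cell": "embryonic",
    "endoderm/ectoderm": "embryonic",
    "mesoderm": "embryonic",
    "neuroectoderm": "embryonic",
    "surface epithelium": "endoderm/ectoderm",
    "organ epithelium": "endoderm/ectoderm",
    "endocrine epithelium": "endoderm/ectoderm",
    "mesenchyme": "mesoderm",
    "coelomic cavity": "mesoderm",
    "coelomic gonadal": "mesoderm",
    "coelomic duct": "mesoderm",
    "sub-coelomic": "mesoderm",
    "neural tube": "neuroectoderm",
    "neural crest": "neuroectoderm",
}

# A few placements stated in the text.
TUMOR_CLASS = {
    "mesothelioma": "coelomic cavity",
    "glioblastoma": "neural tube",
    "colon adenocarcinoma": "surface epithelium",
}

def lineage(tumor):
    """Return the classes of a tumor from the root down to its own class."""
    path = []
    node = TUMOR_CLASS[tumor]
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]

glioblastoma_lineage = lineage("glioblastoma")
```

Since each class has a single parent, every tumor has exactly one lineage, and any property attached to an ancestor class can be looked up for the tumor along that path.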
Each tumor occurs only once in the classification.
The classification is comprehensive (i.e. every tumor of man can be placed somewhere within the classification).
The classification is simple. One of the purposes of a classification is to drive down the complexity that exists when the domain taxonomy is large. The entire classification is described by under 40 classifiers.
Other tumor classifications divide tumors by medical specialty (e.g. dermatologic neoplasms, hematologic neoplasms, thoracic neoplasms, etc.). This classification is based on biologic principles. The classification uses a feature from developmental biology to capture the most important genomic dichotomy in tumor biology: the separation of tumors with simple and characteristic genetic abnormalities from tumors with genetic instability.
The classification has "competence." In the field of informatics, competence is the ability to answer questions related to the instances of a data group.
The classification is represented as an XML document.
It is easy to add subdivisions to the classification. This is important, as the molecular analysis of tumors is likely to provide new taxa.
It is easy to move subdivisions of the classification. Classifications are hypothetical re-creations of reality and must be changed as information is accrued.
The classification is easily understood by developmental biologists. Developmental biologists are major participants in post-genomic science and need to have tools to relate basic research with clinical exigencies.
The classification is compatible with modern theories of the "stem cell" origin of tumors.
The classification does not invalidate existing diagnoses found in pathology reports. The medicolegal importance of this feature cannot be exaggerated. This relieves pathologists from reviewing all their prior cases and re-diagnosing them in conformance with a new classification.
The classification is an open access document that can be used or criticized freely by the biomedical community.
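Two of the properties listed above, the ease of adding and of moving subdivisions, follow from the XML representation. The sketch below uses Python's standard xml.etree.ElementTree on a small hypothetical fragment; the element names are invented for illustration and are not those of the supplemental file. The kidney class starts out deliberately misplaced so that it can be moved.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the real file's element names may differ.
doc = ET.fromstring(
    "<tumor_classification>"
    "<mesoderm><coelomic_cavity/><mesenchyme/></mesoderm>"
    "<endoderm_ectoderm><surface_epithelium/><kidney/></endoderm_ectoderm>"
    "</tumor_classification>"
)

def add_subdivision(root, parent_tag, new_tag):
    """Append an empty child class under the first element named parent_tag."""
    parent = root.find(f".//{parent_tag}")
    return ET.SubElement(parent, new_tag)

def move_subdivision(root, tag, new_parent_tag):
    """Detach the class named tag and re-attach it under new_parent_tag."""
    for parent in root.iter():
        child = parent.find(tag)  # direct children only
        if child is not None:
            parent.remove(child)
            root.find(f".//{new_parent_tag}").append(child)
            return child
    raise ValueError(f"class not found: {tag}")

add_subdivision(doc, "endoderm_ectoderm", "organ_epithelium")
move_subdivision(doc, "kidney", "mesoderm")

mesoderm_children = [child.tag for child in doc.find("mesoderm")]
endo_ecto_children = [child.tag for child in doc.find("endoderm_ectoderm")]
```

Moving an element carries its whole subtree with it, so a restructured class keeps its descendants without further edits.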
In general, creating a coherent classification is an intellectually demanding process. Aristotle was the first great classifier. Observing that dolphins have a placenta, he reasoned that dolphins are mammals, not fish. This insight was greeted with almost uniform derision for nearly two thousand years. The fortunes of taxonomists have barely improved in the interim. Gould complains that taxonomy is portrayed as the dullest of all fields. "But classifications are not passive ordering devices in a world objectively divided into obvious categories. Taxonomies are human decisions imposed upon nature – theories about the causes of nature's order." In a recent letter to Nature, Thiele and Yeates comment that research funds go into high profile projects (like the human genome project) but miss classification projects. Classification projects are never-ending because class assignments are tentative and subject to continual testing and improvement.
The highest levels of the classification are the primitive tumors (which include the teratomas and the primitive blastic tumors), tumors of endoderm/ectoderm lineage (containing the overwhelming majority of human cancers), tumors of mesodermal lineage (including all sarcomas) and tumors of neuroectodermal lineage.
The most important value of this classification is the disengagement of tumor type and tumor place of origin. A primitive blastoma may occur in the bone or the brain or the lung, but it is classified along with the other primitive blastomas regardless of location. This permits tumors with similar molecular profiles to be classified according to biological attributes rather than anatomic location (e.g. Ewing's tumor family).
The separation of endodermal/ectodermal from mesodermal tumors is one of the most successful categorizations in tumor biology. The sarcomas tend to have simple cytogenetic and molecular markers (typically translocations leading to gene fusions). Ectodermal/endodermal tumors tend to have complex cytogenetic abnormalities and genetic instability. The sarcomas behave quite differently from endodermal/ectodermal tumors. Sarcomas metastasize via the blood vessels, and the endodermal/ectodermal tumors metastasize through the lymphatics. The morphologies of the tumor classes are different. Sarcomas tend to have a spindle cell appearance while endodermal/ectodermal cells have an epithelial appearance. It seems very likely that the functional molecular pathways responsible for the malignant phenotype in sarcomas will be different from the pathways followed for endodermal/ectodermal tumors, and that this will result in fundamentally different approaches to finding therapeutic targets against molecular pathways in these tumor classes.
The current classification replaces the morphologic and arbitrary dichotomy of epithelial and non-epithelial neoplasms with the histogenetically definitive concepts of ectodermal/endodermal and non-ectodermal/endodermal neoplasms. The class of ectodermal/endodermal tumors contains most of the neoplasms that are morphologically epithelial. However, a variety of epithelial tumors derive from mesoderm (e.g., renal cell carcinoma, adrenal cortical adenoma and papillary carcinomas of ovary), or neuroectoderm (e.g. melanoma, medullary carcinoma of thyroid). It would seem that these mesodermal and neuroectodermal epithelial tumors may display class behavior independent of their epithelial morphology.
The classification clarifies many of the genetic and molecular oddities in the field of tumor biology. It distinguishes endocrine tumors based on histogenesis, not by function. Adrenal glands should be thought of as two different glands (medulla and cortex) containing two endocrine cell-types, each with its own embryonic lineage. The medulla derives from neuroectoderm and is sometimes associated with genetic syndromes that involve neuroectoderm-classed tumors. The best example is multiple endocrine adenomatosis type 2a (MEN 2a), characterized by the combined presence of pheochromocytoma and medullary thyroid carcinoma. Both tumors have a neuroectodermal endocrine lineage, the former arising from the adrenal medulla and the latter arising from neural crest C-cells (calcitonin-producing cells) that migrate to the thyroid gland. MEN 2 is characterized by ret gene mutations. A variety of related genetic syndromes are characterized by pheochromocytomas, medullary thyroid tumors and other neural crest-derivative tumors and carry the ret gene mutation [20, 21].
The adrenal cortex is mesodermal in lineage and produces adrenal cortical steroid-producing cells, strikingly similar to the steroid cell tumors derived from ovarian mesoderm. The occurrence of adrenal cortical tumors in multiple endocrine adenomatosis type 1 (MEN 1) is somewhat anomalous because all the other endocrine tumors associated with this syndrome are of endoderm endocrine lineage. These include pancreatic islet cell tumors, pituitary tumors and parathyroid tumors. Recent evidence suggests that the adrenal cortical adenomas occur secondarily in response to ACTH hypersecretion and are genetically distinct from the endoderm-derived adenomas seen in MEN 1.
Germ cell tumors, in the present classification, are placed adjacent to, but separate from, the teratomas and embryonal carcinomas in a developmental stage prior to the development of the embryonic layers (ectoderm, endoderm and mesoderm). This is a departure from classifications of ovarian tumors that include germ cell tumors and ovarian teratomas in the same class. It is the author's opinion that germ cells are different from totipotent embryonic cells. The pure germ cell tumors are seminomas and dysgerminomas. When germ cell tumors are found mixed with teratomas, one can infer a transformation between the different cell types (i.e. germ cells giving rise to totipotent embryonic cells or vice versa).
The ovary, in the current classification, has three anlagen: germ cell, coelomic gonadal and sub_coelomic_gonadal. The appendages of the ovary are derived from coelomic_ducts (paramesonephric or mesonephric). None of the ovary is derived from endoderm or ectoderm. This explains the strangeness and the diversity of tumors arising in, on, or adjacent to the ovary.
Mesotheliomas and synovial sarcomas are placed into the class of tumors with mesodermal, coelomic cavity lineage. This classification emphasizes their similar histogenesis and similar morphology as biphasic epithelioid/spindle tumors. Both tumors have similar histochemical features, producing hyaluronic acid and chondroitin sulfate, the lubricants of coelomic cavities. These two tumors, however, have different cytogenetic features. Synovial sarcomas are characterized by syt-ssx fusion transcripts, while mesotheliomas have complex cytogenetic abnormalities. It is interesting that synovial sarcomas can arise from many different soft tissue locations, including pericardium and pleura (coelomic cavities) [25–28]. This finding suggests that mesotheliomas and synovial sarcomas are closely related.
Renal tumors are separated from the class of ectodermal/endodermal tumors. The histogenetic anlage for the kidney is metanephric mesoderm. Renal tumors clearly belong in a different class than endodermal/ectodermal tumors. Interestingly, a subset of renal tumors is characterized by a fusion gene (PRCC fused to the TFE3 transcription factor gene). Fusion gene neoplasia is a characteristic absent from tumors of ectodermal/endodermal lineage.
The histogenesis of uterine tumors has always presented a special intellectual challenge. This classification of the uterine tumors offers a radical departure from the classic separation of tumors into epithelial and mesenchymal origins. The uterus, like the kidney, has a purely mesodermal lineage, with no contribution from endoderm or ectoderm (the common lineage for mucosal lining cells). Specifically, the uterus is formed from a duct that forms within the mesoderm (the paramesonephric duct). This duct gives rise to the endometrial epithelium as well as the underlying stroma. Consequently, tumors of endometrial and stromal cells share the same classification (sub-coelomic ductal). Like the kidney, this classification ignores morphologic differences (epithelial versus mesenchymal) and creates a grouping in concordance with the observed mixed epithelial/stromal manifestations of some uterine tumors.
The largest class of tumors is the ectoderm/endoderm class. This class includes the leading causes of cancer death in man (bronchogenic carcinoma, colon adenocarcinoma, breast carcinoma and prostate carcinoma) and the most frequently occurring (though usually non-lethal) tumors of man (squamous cell carcinoma of skin and basal cell carcinoma of skin). Because so many lesions fall into this one category, it might seem that the classification lacks sufficient complexity.
The human body can be envisioned as a topological donut. It is covered by lining cells, with the donut hole lined by endoderm and the donut outer surface lined by ectoderm. The donut pastry would be the mesoderm. Virtually all exposure to toxic and carcinogenic chemicals is via the surface (ectodermal skin or donut surface) or through our aero-digestive tract (endodermally derived lungs and alimentary tract or donut hole-lining cells). Since ectoderm and endoderm are the cells that are most exposed to carcinogens, it is not surprising that most human cancer falls under the class of ectodermal/endodermal tumors. Likewise, since exposure via the ectoderm/endoderm is a lifelong process, it is not surprising that ectodermal/endodermal tumors tend to increase in incidence with age, occurring disproportionately among the elderly. If further research reveals new taxa that can usefully separate tumors of endodermal and ectodermal lineage, the current classification would accommodate the change. The combined endodermal/ectodermal class would remain intact because tumors of either lineage have features in common that distinguish them from mesodermal or primitive tumors. Each member of the class of endodermal/ectodermal tumors would be assigned a specific subclass.
How do we know if the classification is correct? The correctness of a classification is tested by adding new feature information that characterizes classes and instances. When newly added group features extend beyond the class or fail to extend to descendant classes or fail to extend to all the members of a class, the classification needs to be modified. Testing the classification is a never-ending but worthwhile process.
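The test described above can be expressed as a small check: a feature proposed for a class should hold for every member of that class and its descendant classes, and for no tumor outside them. The sketch below is a minimal illustration under that reading; the class labels, the toy feature data and the function name are hypothetical.

```python
def feature_conflicts(members_by_class, descendants, class_name, tumors_with_feature):
    """Report where a proposed class feature fails.

    members_by_class: class name -> list of tumors placed directly in it.
    descendants: class name -> list of all its descendant class names.
    Returns tumors inside the class that lack the feature, and tumors
    outside the class that have it.
    """
    in_class = set(members_by_class.get(class_name, []))
    for sub in descendants.get(class_name, []):
        in_class |= set(members_by_class.get(sub, []))
    has_feature = set(tumors_with_feature)
    return {
        "missing_within_class": sorted(in_class - has_feature),
        "found_outside_class": sorted(has_feature - in_class),
    }

# Toy data: a fusion gene proposed as a defining feature of one class.
members_by_class = {
    "coelomic cavity": ["mesothelioma", "synovial sarcoma"],
    "sub-coelomic": ["renal tumor with PRCC-TFE3 fusion"],
}
descendants = {"mesoderm": ["coelomic cavity", "sub-coelomic"]}
report = feature_conflicts(
    members_by_class,
    descendants,
    "coelomic cavity",
    ["synovial sarcoma", "renal tumor with PRCC-TFE3 fusion"],
)
```

A non-empty entry in either list signals that the feature does not define the class as drawn, so either the feature or the class boundary needs to be reconsidered.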
Mayr equates classifications with macrotaxonomies. The taxonomy is the list of objects included in the macrotaxonomy. The macrotaxonomy is the scaffold into which the members of the taxonomy might fit. Provided with this article is a "filled-in" classification, using approximately 55,000 neoplasia terms extracted from the NCI-Thesaurus [see Additional file 2].
A tumor classification is proposed that groups each tumor according to embryonic lineage. This classification is prepared as an XML file designed to accommodate additional attributes (e.g. genomic, proteomic, clinical information). The classification is comprehensive (can include all neoplastic entities) and parsimonious (each entity has one lineage). This classification blends traditional concepts of tumor nomenclature with post-genomic concepts of neoplastic development. The classification structure and the classification with taxonomic annotation (approximately 55,000 terms) are available as supplemental files with this article.
This work was conducted at the NIH as part of the author's customary work activities, and no specific financial support was received for this work.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.