Prematurity, ventricular septal defect and dysmorphisms are independent predictors of pathogenic copy number variants: a retrospective study on array-CGH results and phenotypical features of 293 children with neurodevelopmental disorders and/or multiple congenital anomalies

Background Since 2010, array-CGH (aCGH) has been the first-tier test in the diagnostic approach of children with neurodevelopmental disorders (NDD) or multiple congenital anomalies (MCA) of unknown origin. Its broad application led to the detection of numerous variants of uncertain clinical significance (VOUS). How to appropriately interpret aCGH results represents a challenge for the clinician. Method We present a retrospective study on 293 patients with age range 1 month - 29 years (median 7 years) with NDD and/or MCA and/or dysmorphisms, investigated through aCGH between 2005 and 2016. The aim of the study was to analyze clinical and molecular cytogenetic data in order to identify what elements could be useful to interpret unknown or poorly described aberrations. Comparison of phenotype and cytogenetic characteristics through univariate analysis and multivariate logistic regression was performed. Results Copy number variations (CNVs) with a frequency < 1% were detected in 225 patients of the total sample, while 68 patients presented only variants with higher frequency (heterozygous deletions or amplification) and were considered to have negative aCGH. Proved pathogenic CNVs were detected in 70 patients (20.6%). Delayed psychomotor development, intellectual disability, intrauterine growth retardation (IUGR), prematurity, congenital heart disease, cerebral malformations and dysmorphisms correlated to reported pathogenic CNVs. Prematurity, ventricular septal defect and dysmorphisms remained significant predictors of pathogenic CNVs in the multivariate logistic model whereas abnormal EEG and limb dysmorphisms were mainly detected in the group with likely pathogenic VOUS. A flow-chart regarding the care for patients with NDD and/or MCA and/or dysmorphisms and the interpretation of aCGH has been made on the basis of the data inferred from this study and literature. 
Conclusion Our work contributes to make the investigative process of CNVs more informative and suggests possible directions in aCGH interpretation and phenotype correlation. Electronic supplementary material The online version of this article (10.1186/s13052-018-0467-z) contains supplementary material, which is available to authorized users.


Background
In the last 10-15 years, the advent of high-resolution microarray technologies has revealed that cryptic chromosomal deletions and duplications, commonly defined as copy number variations (CNVs), are at the origin of a wide variety of clinical manifestations, including neurodevelopmental disorders (NDD), multiple congenital anomalies (MCA) and dysmorphic features [1].
Over time, array-based Comparative Genomic Hybridization (aCGH) has increased our knowledge about microdeletions and microduplications and a large number of novel syndromes have been characterized [2], through a "reverse dysmorphology" method [3].
Since 2010, aCGH has been the first-tier test in the diagnostic approach of children with unexplained developmental disorders or congenital anomalies [4], with a diagnostic yield of about 15% [5,6].
The advances in molecular methodology and the broader application of aCGH led to the detection of novel pathogenic CNVs but also of numerous variants of uncertain clinical significance (VOUS).
How to appropriately interpret results of aCGH represents a challenge for the clinician especially when information found in genetic databases or scientific literature is not enough [7][8][9][10]. The clinical significance of CNVs has important implications on patient management and on family counseling, even in terms of reproductive health [11][12][13].
The aim of the study was to analyze clinical and molecular cytogenetic data of a sample of 339 patients with NDD/MCA in order to identify whether and which of these elements could be useful to interpret unknown or poorly described rearrangements. We also set out to establish whether some core features (NDD, dysmorphisms, MCA, epilepsy) are more probably linked to pathogenic or likely pathogenic variants when isolated and in what possible combination.
Finally, we delineated a diagnostic flow-chart based on our results that could help the clinician in aCGH interpretation and the management of patients.

Methods
We present a retrospective study on 339 patients evaluated at the Clinical Genetics Unit of Arcispedale Santa Maria Nuova, AUSL-IRCCS of Reggio Emilia. Inclusion criteria were the presence of unexplained NDD and/or MCA and/or dysmorphisms. For all patients we collected individual informed consent for the present study.
Patients were investigated through aCGH between 2005 and 2016, after signing the appropriate informed consent to genetic testing. Since 2012 aCGH has been systematically performed using 8x60K oligochips with a resolution of 100 Kb, whereas before 2012 the analysis was carried out using different platforms and resolutions [Additional file 1: Table S1].
All data about family/clinical history and physical/dysmorphological evaluation of patients were retrospectively extracted from clinical reports. The clinical features included: family history, pre-perinatal history, neuropsychiatric evaluation, auxological parameters, minor dysmorphisms, organ malformations, neurological assessment, sensory deficits and/or anomalies of sensory organs, skeletal anomalies, joint anomalies, skin anomalies, hematologic or endocrinological diseases.
Regarding CNVs, we considered the nature of the rearrangement (deletion/duplication), the presence of multiple rearrangements, the gene content (total number of genes, disease genes, protein-encoding genes) and the presence of interrupted genes.

Statistical analysis
Descriptive statistics were used to present the data. Distribution of continuous data was assessed by the Kolmogorov-Smirnov test. Categorical data, such as phenotype characteristics or somatic problems, were presented as frequencies (%) of the total number of patients tested, while continuous data (number of genes and CNV size) were presented as median with interquartile range (IQR). Comparison of phenotype characteristics between two groups with negative aCGH or CNVs and between three groups of clinical significance (pathogenic, likely pathogenic and likely benign CNVs) was done using Pearson's chi-square test or Fisher's exact test. Continuous data were compared between two groups using the Mann-Whitney U test and among three groups using the Kruskal-Wallis test. Post-hoc analysis was applied with Bonferroni correction for all multiple comparisons. Spearman's correlation analysis was applied to assess the association between chromosome size and the number of CNVs per chromosome. Separate logistic regression analyses were done to assess independent predictors of positive microarray and pathogenic CNVs, respectively. Variables that showed a difference at the p < 0.1 level in univariate analysis were entered into a multivariate logistic regression model and backward stepwise selection of variables was performed. Odds ratios (OR) with 95% confidence intervals (CI) were computed and the Hosmer-Lemeshow goodness-of-fit test was performed to assess overall model fit. Measures of discrimination (Nagelkerke R² and area under the receiver operating characteristic curve, ROC area) were calculated for all regression models.
All statistical tests were two-sided and were performed at a 5% significance level. SPSS software (version 20.0; SPSS Inc., Chicago, IL, USA) was used for the statistical analysis.
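As an illustration of how an odds ratio and its 95% confidence interval are read, the following minimal sketch computes a crude (unadjusted) OR with a Wald interval from a 2x2 table. It is not the multivariate SPSS model used in the study: the function name `odds_ratio_ci` and the counts in the example are hypothetical and are not taken from our data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval for a 2x2 table.

    a: feature present, pathogenic CNV
    b: feature present, no pathogenic CNV
    c: feature absent, pathogenic CNV
    d: feature absent, no pathogenic CNV
    z: normal quantile (1.96 for a two-sided 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT study data)
or_value, ci_low, ci_high = odds_ratio_ci(20, 30, 50, 193)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 corresponds to significance at the 5% level for the crude association; adjusted ORs from a multivariate model can differ from this crude value.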

Clinical data
Of the total sample of 339 patients enrolled, 240 (70.8%) presented a genomic rearrangement (CNV), while 99 (29.2%) received a negative aCGH result. Within these two groups, some patients (15 and 31 respectively) subsequently received a different molecular or clinical diagnosis. Therefore, the final sample consisted of 293 patients (225 patients with CNVs and 68 control patients with negative aCGH) (Fig. 1).
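The derivation of the final sample can be checked with a few lines of arithmetic. The sketch below only restates the counts given above; the variable names are illustrative.

```python
# Counts as reported in the text
enrolled = 339
with_cnv, negative = 240, 99
# Patients who later received a different molecular or clinical diagnosis
excluded_cnv, excluded_negative = 15, 31

final_cnv = with_cnv - excluded_cnv
final_negative = negative - excluded_negative
final_sample = final_cnv + final_negative


def pct(n, total):
    """Percentage rounded to one decimal."""
    return round(100 * n / total, 1)


print(final_cnv, final_negative, final_sample)
print(pct(with_cnv, enrolled), pct(negative, enrolled))
```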
The sample comprised 169 males (57.5%) and 124 females (42.3%) with an average age at the time of the test of 7 years (range 1 month - 29 years). The majority of patients underwent aCGH in the first years of life (38.5%) or at primary school age (27.6%).
Neurodevelopmental disorders (such as psychomotor developmental delay, intellectual disability, autism spectrum disorder, attention deficit and hyperactivity disorder) were the main indication for aCGH. Psychomotor developmental delay, in particular with impairment of language (78.5%), and intellectual disability (66.4%) were the most frequently observed. Isolated NDD accounted for 28.3% of requests, while NDD in combination with other features (MCA and/or dysmorphisms and/or epilepsy) accounted for 55.6%. As regards the severity of intellectual disability, 31.1% of the patients had mild ID, 5% of subjects with severe intellectual disability had developmental delay and language impairment, and 14% of subjects presented with autism spectrum disorder.

Molecular cytogenetic data
Among the 225 patients with CNVs, 70 (31.1%) showed pathogenic CNVs and 155 (68.8%) carried VOUS (105 likely benign and 50 likely pathogenic). Of the pathogenic CNVs, 27 were associated with known syndromes, 26 were new microdeletions or microduplications containing at least one gene whose haploinsufficiency or amplification correlates with known pathogenic conditions, and 17 were rare de novo CNVs or other chromosomal imbalances (Table 1).
In the total group of patients with chromosomal rearrangements (225), 153 (68%) presented a single CNV (73 deletions and 75 duplications), while the remaining cases had multiple rearrangements, up to 5 CNVs in a single patient. Therefore, the total number of CNVs detected was 323: 81 pathogenic, 72 VOUS likely pathogenic and 170 VOUS likely benign [Additional file 1: Table S1].
The distribution of CNVs on chromosomes did not appear to be linked to chromosome size or gene density. Notably, we observed a greater concentration of rearrangements on chromosome X (10.8%) and chromosome 1 (9.6%), but the number of CNVs did not positively correlate with the size of the chromosome (r² = 0.29). There was no correlation between the number of CNVs and the gene density of the chromosomes (r² = 0.006) (Fig. 3a-c).
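For readers unfamiliar with rank correlation, the sketch below shows how Spearman's coefficient can be computed from per-chromosome values. It is a plain-Python illustration, not the SPSS procedure used in the study; the chromosome sizes are rounded illustrative figures and the CNV counts are hypothetical, so the result does not reproduce the values reported above.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only (NOT study data)
chromosome_size_mb = [248, 242, 198, 190, 181, 156]
cnv_count = [31, 18, 12, 20, 9, 35]
rho = spearman(chromosome_size_mb, cnv_count)
print(f"rho = {rho:.3f}")
```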

Data analysis
Comparing the group of patients with pathogenic CNVs and VOUS, a significantly higher frequency (p < 0.01) of delayed psychomotor development, intellectual disability, IUGR, prematurity, congenital heart disease, cerebral malformations and dysmorphisms was detected in the pathogenic CNVs group. In addition, a significant difference with p < 0.05 for absence of speech and anomalies of the interventricular septum was found. Somatic overgrowth and autism spectrum disorders were the only two features for which a statistically significant difference (p < 0.05) was found in favor of VOUS.
[Table 1, extracted fragment: unbalanced translocation [t(7;9), t(9;10), 2 t(10;16), t(8...]. Footnote: del deletion, dup duplication, mos mosaicism, UPD uniparental disomy. The number of patients for each chromosomal anomaly is indicated within parentheses.]
Clinical features showing statistically significant differences among patients with pathogenic CNVs, likely pathogenic CNVs and likely benign CNVs are reported in Fig. 4 [complete description in Additional file 4: Table S4].
Lastly, we compared patients with likely pathogenic VOUS [Additional file 5: Table S5] to the group with negative aCGH (comprising likely benign VOUS plus controls). We detected statistically significant differences in favor of likely pathogenic CNVs for abnormal EEG and for limb dysmorphisms [Additional file 6: Table S6].
Multivariate logistic regression analysis of independent predictors of pathogenic CNV was consequently conducted and results are shown in Table 2. As regards molecular cytogenetic characteristics, both the size of the rearrangements and the number of contained genes, protein coding genes and disease genes were statistically significant (p < 0.0001) for pathogenic CNVs in the comparison of the three groups (pathogenic, likely pathogenic and likely benign). The number of interrupted genes did not emerge as a significant element. As far as inheritance is concerned, de novo CNVs were significantly more represented in the group of pathogenic CNVs; comparing the type of aberration, we found a greater percentage of deletions and fewer duplications in pathogenic CNVs, with statistical significance versus likely benign CNVs (Table 3).
In addition, for those patients (n = 16) who had a single CNV of uncertain significance and not containing any known protein coding genes, we performed an in silico prediction of the noncoding elements and of the possible modification of topologically associating domains (TADs) by consulting the 3D Genome Browser [23] [Additional file 7: Table S7]. This analysis provided some useful insights on CNVs previously dismissed as non-significant, suggesting a novel functional approach that might be included in the current interpretation guidelines.
Then we analyzed the core features (NDD/Dysmorphisms/MCA/Epilepsy) for which aCGH was performed in each patient, as either isolated or associated elements. Comparing these core features with the results of aCGH, we observed that it was more likely to find an abnormal rearrangement when NDD were associated with other features rather than isolated. In patients with NDD alone we observed a statistically significant presence of negative aCGH (p = 0.0003) compared to presence of CNVs, while in patients with a combination of NDD and dysmorphisms we found a statistically significant presence of CNVs (p = 0.0358) [Additional file 8: Table S8]. Specifically, in patients with isolated NDD and presence of CNVs we observed likely benign CNVs more frequently than pathogenic or likely pathogenic CNVs (p < 0.00001); whereas in subjects with NDD associated with dysmorphism, pathogenic CNVs were more likely to be detected (p = 0.0042) ( Table 4).  Regarding NDD reported in our sample, individuals with moderate to severe ID were around 20% (59/293) of total patients [Additional file 2: Table S2] and the rate of pathogenic CNVs in this group was 28.8% (17/59). Accordingly, next generation sequencing analysis identified single nucleotide variants in 39% of patients with severe intellectual disability while causative CNVs in only 21% of them [24].

Suggested diagnostic flow-chart
A flow-chart (Fig. 5) regarding the care for patients with NDD and/or MCA and/or dysmorphisms has been made on the basis of the data inferred from this study and their comparison with the literature [14][15][16][17][18][19][20]. Our purpose is to make the investigative process of CNVs more informative and to suggest possible guide elements in aCGH interpretation. It is of primary

Discussion
In the last 10-15 years, aCGH has been a revolutionary tool in identifying genomic aberrations in the broad spectrum of pediatric population with neurodevelopmental disorders and/or multiple congenital anomalies [4,25], modifying the management of these patients and their families [6,13].
In the literature, the detection rate of pathogenic CNVs in these patients ranges from 5 to 20% (on average 15%), depending on the preselection of patients and on the technical characteristics of the instrument used [26,27].
In our study of 339 patients with NDD and/or MCA tested by aCGH, the detection rate of pathogenic CNVs was 20.6% (70/339). This result could be representative of an appropriate selection of the patients who underwent this genetic test in our clinical unit.
In our sample, 55.6% (163/293) of patients presented with NDD associated with MCA and/or dysmorphisms and/or epilepsy, while 28.3% (83/293) had an isolated NDD. The residual percentage of patients had malformative, dysmorphic or neurological characteristics, isolated or in combination with each other [Additional file 7: Table S7]. The detection of pathogenic CNVs in case of isolated NDD was extremely low (2/293, 0.68%), while it reached 19.1% (56/293) in cases of NDD associated with other clinical elements. In particular, we found pathogenic CNVs in 8.5% (25/293) of subjects with NDD and dysmorphisms (Table 4).
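The percentages quoted in this and the previous paragraph use two different denominators (339 enrolled patients and 293 patients in the final sample). The short sketch below recomputes them from the reported counts; the helper name `rate` and the dictionary labels are illustrative.

```python
def rate(numerator, denominator, digits=1):
    """Percentage rounded to the given number of decimals."""
    return round(100 * numerator / denominator, digits)


ENROLLED = 339
FINAL_SAMPLE = 293

rates = {
    "pathogenic CNVs among enrolled patients": rate(70, ENROLLED),
    "NDD with other features": rate(163, FINAL_SAMPLE),
    "isolated NDD": rate(83, FINAL_SAMPLE),
    "pathogenic CNVs, isolated NDD": rate(2, FINAL_SAMPLE, digits=2),
    "pathogenic CNVs, NDD with other features": rate(56, FINAL_SAMPLE),
    "pathogenic CNVs, NDD with dysmorphisms": rate(25, FINAL_SAMPLE),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```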
In recent years, wide use of aCGH in patients with NDD and/or MCA has led to the detection of an impressive number of VOUS.
The interpretation of aCGH results became a crucial topic in clinical practice, in diagnostic, prognostic and ethical terms. Thus, recently, several studies focused on identifying specific clinical or phenotype variables that could be associated with the detection of pathogenic CNVs (Table 5).
Moreover, prematurity, dysmorphisms and interventricular septal defect emerged as independent predictors of pathogenic CNVs.
Prematurity has not been previously reported as a predictor of pathogenic CNVs. Premature subjects who survive the neonatal period are characterized by a high risk of developing NDD. In light of these data, it could be interesting to consider prematurity as a phenotypic feature within the syndromic frame caused by pathogenic CNVs.
Fig. 5 Flow-chart in patients with NDD and/or MCA and/or Dysmorphisms. The first step is the collection of appropriate family and clinical history and physical/dysmorphological evaluation. If the patient has a recognizable syndrome, we have to confirm it with specific genetic tests. Otherwise, except for other possible neurological or metabolic implications, we will proceed by considering aCGH (in case of male subjects with ID, it would be appropriate to consider the molecular survey for Fragile X syndrome). The blood draw should always be done on the trio in order to perform aCGH on the parents' samples if the child's result is anomalous. If aCGH detects CNVs, they will be carefully interpreted. Some CNVs can be classified as pathogenic because they are linked to known syndromes or to "new microdeletion/microduplication syndromes". If CNVs are less known or poorly described they have an uncertain clinical significance (VOUS): we suggest some variables that might be useful in distinguishing likely pathogenic from likely benign CNVs (continuous box). Additionally, the presence of some phenotypic variables, as well as the analysis of non-coding regions, could be useful in classifying VOUS as likely pathogenic (dashed box) [* Phenotypic variables significant for pathogenic CNVs: developmental delay, ID, prematurity, IUGR, dysmorphisms, congenital heart disease, hypotonia, cerebral malformations; Phenotypic variables significant for likely pathogenic CNVs: abnormal EEG, hand and lower limb dysmorphisms; Independent predictive factors for pathogenic CNVs: prematurity, ventricular septal defect, dysmorphisms]. In the case of normal chromosomal pattern or likely benign CNVs, it will be necessary to re-evaluate the patient. If the clinical features are strongly suggestive of a genetic/syndromic condition, further genetic investigations will be carried out. These may include targeted sequencing, exome sequencing and, in selected cases, genome sequencing. Otherwise clinical follow up should be implemented in the event that evocative elements could emerge over time recommending future genetic investigations.
VSD, as well as congenital heart anomalies in general, have already been described as linked to pathogenic CNVs [19]. In our study, VSD assumed an independent predictive value for pathogenic CNVs. These data appear to be supported by recent literature, which reports significant CNVs in 16.9% of patients with VSD [28].
Finally, dysmorphisms play an important predictive role for pathogenic CNVs and present a statistically significant association with pathogenic CNVs when considered in association with NDD.
Regarding molecular cytogenetic characteristics, we found that pathogenicity is significantly correlated with the larger size of aberrations, the greater number of total/ protein-coding/disease genes located within, the de novo mode of inheritance and the deletion type of variants. These elements have already been described [10,29]. We also detected abnormal EEG, hand dysmorphisms and lower limb dysmorphisms as more frequent variables in likely pathogenic CNVs versus likely benign CNVs plus negative aCGH. Clinically, these data may have modest impact, but comparisons between these two groups of patients had not been previously described in literature.
Moreover, the presence of disrupted genes and the study of gene function compared to patient phenotype could provide important clues for the interpretation of CNVs. Likewise, the analysis of TADs appears to have a predictive value, also for the evaluation of likely benign VOUS (Fig. 5) [Additional file 7: Table S7].
We are aware that the molecular cytogenetic and phenotypic elements identified may not assume an absolute value in the interpretation of results, but they could contribute to the framework needed for the clinician in discerning between likely pathogenic and likely benign VOUS (Fig. 5).
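The branching logic of the flow-chart in Fig. 5 can be summarized as a small decision function. This is a loose sketch for orientation only, not a clinical tool: the class `Case`, its field names and the result labels are hypothetical, and the interpretive steps (phenotype, inheritance, gene content, non-coding regions) are reduced to a single text label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """Hypothetical summary of one patient; field names are illustrative."""
    recognizable_syndrome: bool
    # None until aCGH is done; otherwise one of:
    # "normal", "pathogenic", "vous_likely_pathogenic", "vous_likely_benign"
    acgh_result: Optional[str] = None
    strongly_suggestive_phenotype: bool = False


def next_step(case: Case) -> str:
    """Return the next step suggested by the flow-chart for this case."""
    if case.recognizable_syndrome:
        return "confirm with the specific genetic test"
    if case.acgh_result is None:
        return "consider aCGH, drawing blood from the trio"
    if case.acgh_result in ("pathogenic", "vous_likely_pathogenic"):
        return "interpret the CNV against phenotype, inheritance and gene content"
    if case.acgh_result in ("normal", "vous_likely_benign"):
        # Re-evaluate the patient before deciding on further testing
        if case.strongly_suggestive_phenotype:
            return "further genetic investigations (targeted, exome or genome sequencing)"
        return "clinical follow-up"
    raise ValueError(f"unknown aCGH result: {case.acgh_result!r}")


print(next_step(Case(recognizable_syndrome=False)))
print(next_step(Case(False, "vous_likely_benign", strongly_suggestive_phenotype=True)))
```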
The main limitation of the study is that we did not use standardized criteria in classifying patients' CNVs (pathogenic, likely pathogenic and likely benign), because there are no specific references in the literature. The study was retrospective and patients were divided into the three categories of significance on the basis of their clinical reports. In any case, CNV interpretation has been the result of careful analysis of scientific literature, genetic databases and phenotype evaluation.

Conclusions
In our retrospective analysis, we observed a detection rate of pathogenic CNVs at the upper limits of what was reported in literature [26,27]. It could be the result of a careful selection of patients that underwent aCGH in our clinical unit.
In patients with NDD, prematurity is usually considered as an environmental risk factor. In our study, the detection of prematurity as an independent predictor of pathogenic CNVs suggests that sometimes this feature can rather be considered as a main part of the underlying genetic disorder.
Dysmorphisms, especially if associated with NDD, seem to have a predictive significance for pathogenic aberrations.
We detected several elements related to pathogenic CNVs and some related to likely pathogenic CNVs that could be helpful in the interpretation of aCGH results, even though we acknowledge they may not assume an absolute significance in the interpretative process. This necessarily requires a combination of several factors, such as scientific literature, genetic databases, molecular cytogenetic characteristics, detailed patient anamnesis and phenotype evaluation.
However, it is necessary to emphasize the importance of a meticulous description of the phenotypic features of patients with pathogenic CNVs, both to contribute to scientific sharing of data and to facilitate accurate interpretation of aCGH results [17].
The purpose of the study was to improve diagnostic accuracy, with a positive impact on patients' clinical management, prognosis, follow-up and genetic counseling.