Multivariate Statistical Interpretation of Airborne Diseases using Principal Components Analysis

Keywords: airborne communicable diseases, dimensionality reduction, principal component analysis


February 6, 2025
February 7, 2025


Accurate and timely determination of the relationships among communicable diseases is crucial for taking precautionary measures to control and prevent the transmission of infectious diseases. Such information can guide governments in making policies that support effective patient treatment. This study employed principal components analysis (PCA) to explore and interpret the relationships among three airborne communicable diseases, tuberculosis (TB), chickenpox, and measles, based on a medical dataset obtained from Bibiani Government Hospital, Ghana. The Kaiser criterion was used to determine a suitable number of principal components (PCs) to retain in the statistical analysis, and a scree plot was used to confirm that number. Projections of the diseases onto the PC planes were then used to interpret the relationships among the diseases. The results showed that the first (PC1), second (PC2), and third (PC3) principal components performed well in the disease interpretation, explaining 46.70994%, 30.61631%, and 22.67376% of the total variation in the dataset, respectively. There were also marked correlations among the diseases with respect to the PCs. Because only a limited number of diseases was considered, this study serves as a preliminary investigation of PCA as a versatile and promising multivariate statistical technique that public health experts and policymakers can rely on to interpret relationships among diseases and make informed decisions.
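The workflow summarized in the abstract (PCA, the Kaiser criterion, and the percentage of variation explained per component) can be illustrated with a minimal Python sketch. This is not the authors' code or data: the case counts below are synthetic, the function name `pca_from_correlation` is hypothetical, and running PCA on the correlation matrix is an assumption about the setup, made because the Kaiser criterion (retain components with eigenvalue greater than 1) is normally applied to standardized variables.

```python
import numpy as np


def pca_from_correlation(X):
    """PCA on standardized variables; X has shape (n_samples, n_variables)."""
    # Correlation matrix of the columns (equivalent to the covariance
    # matrix of the standardized variables).
    R = np.corrcoef(X, rowvar=False)
    # eigh is for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]  # re-sort to descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Percentage of total variation explained by each component.
    explained = 100.0 * eigvals / eigvals.sum()
    # Loadings: correlation between each original variable and each PC.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, explained, loadings


# Synthetic monthly case counts for three diseases (illustrative only,
# NOT the hospital dataset used in the study).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=[40, 15, 10], size=(60, 3)).astype(float)

eigvals, explained, loadings = pca_from_correlation(counts)

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
kaiser_keep = int((eigvals > 1.0).sum())

for i, (ev, pct) in enumerate(zip(eigvals, explained), start=1):
    print(f"PC{i}: eigenvalue={ev:.4f}, explained={pct:.2f}%")
print("Components retained by the Kaiser criterion:", kaiser_keep)
```

With three variables there are exactly three components, so their explained percentages always sum to 100%. Plotting `eigvals` against the component index gives the scree plot, and the first two columns of `loadings` give the coordinates for projecting the diseases onto the PC1-PC2 plane.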