DEEP FEATURE WISE ATTENTION BASED CONVOLUTIONAL NEURAL NETWORK FOR COVID-19 DETECTION USING LUNG CT SCAN IMAGES

This study aims to identify Covid-19 from a lung CT (Computed Tomography) scan image dataset using effective DL (Deep Learning) based algorithms. Although several clinical procedures and imaging modalities exist to diagnose Covid-19, these methods are time-consuming and their predictions are sometimes incorrect. Concurrently, AI (Artificial Intelligence) based DL models have gained attention in this area owing to their innate capability for efficient learning. Although conventional systems have attempted to improve prediction, they fall short in accuracy. Moreover, conventional systems have not fully utilized attention models for Covid-19 detection. This research intends to resolve these pitfalls of Covid-19 detection methods with a deep feature wise attention based Convolutional Neural Network. For this purpose, the data are pre-processed by image resizing, and a Residual Descriptor with Conv-BAM (Convolutional Block Attention Module) is employed to obtain refined features from spatial and channel wise attention modules. The obtained features are used to improve the classification of cases as Covid positive or negative. The performance of the proposed system is assessed with regard to accuracy, precision, recall and F1-score. The proposed method achieved an accuracy of 97.82%. This DL based model can be used as a supplementary tool in the diagnosis of Covid-19 alongside other diagnostic methods.


Introduction
Covid-19 has had a great impact on people's lives and has evolved into a public health issue, with 49 countries affected as of 16 March 2023 according to the Centers for Disease Control and Prevention (Covid-19, 2023). The deceptive nature of Covid-19 has caused individuals to panic. Its transmission and mortality rates are also high in contrast to other viral outbreaks, as stated in (Kollias, 2022). Accordingly, 86 million people were affected by this highly transmissible infectious disease, and it took the lives of 1.86 million people. The disease-associated death count has increased sharply. Hence, detection has become an immediate need to save lives (Bhatnagar, 2021). A triad of approaches is available to diagnose Covid-19, from screening to clinical decisions in large populations. There is an increasing need for molecular tests that specifically detect nucleic acids. RT-PCR is considered the standard diagnostic technique owing to its specificity and sensitivity. The pandemic has driven the use of RT-PCR because of the exponential increase in cases. For individuals suspected of having Covid-19, detection in the initial phase is crucial. To accomplish this, scientists have utilized point-of-care (POC) tests to diagnose SARS-CoV-2 infection, as mentioned by (Mercaldo, 2023). These techniques must pave the way for rapid and accurate diagnosis of Covid-19 in patients. Although POC tests are effective, user friendly and able to return results rapidly, certain limitations exist: the procedures can be tedious and time-consuming, and individuals are affected when the prediction is incorrect.
Technological advancement can avoid such mistakes and increase the accuracy of diagnosis. Accordingly, (Kumar, A review of modern technologies for tackling COVID-19 pandemic, 2020) concentrates on technological advances to abate the impact of the disease. Technology for diagnosing Covid-19 continues to grow and improve. In (Afif, 2023), the proposed model yields better results than the state of the art, attaining an accuracy of 96.23%.
Similarly, Generative Adversarial Networks (GANs) combined with a CNN approach have been utilized to diagnose Covid-19 from CT scan images. An enhanced CNN model that is highly efficient on the datasets is proposed to detect Covid-19 from CT scans. The suggested method relies on synthetic generation as well as image augmentation, after which a CNN model is built for the datasets. The improved CNN (Fan, 2022), which relies on augmented data, is evaluated on two public datasets. The proposed model outperforms the classic CNN model. The experimental results show that enhanced CNN models, as claimed by (Krishnaswamy Rangarajan, 2022), perform better than the classic CNN approach considered by (Sushma Jaiswal, 2023). Certain problems arise in predicting Covid-19 from CT scans. To address them, the study by (He X. Y., 2020) builds a dataset comprising hundreds of CT scans and then enhances the DL approach to attain better diagnostic accuracy when only a limited number of CT scan images are available. The Self-Trans method is proposed, which integrates transfer learning and self-supervised learning synergistically. This combination yields unbiased feature representations and reduces the risk of overfitting. The experiments show the superior performance of the Self-Trans method, which achieves an accuracy of 94% even though the CT training data are limited.
DL methods play a prominent role in biomedical science. They also assist in forecasting and classifying diseases, notably Covid-19. Among deep learning architectures, the Convolutional Neural Network (CNN) (Polsinelli, 2020), or ConvNet, is utilized to process CT images and to classify Covid-19. (Cifci, 2020) aims to diagnose Covid-19 using Computed Tomography (CT) scans. The CNN is able to predict Covid-19 at an early stage. Clinical diagnosis is carried out before pathogenic testing. The two DL networks proposed in that study are Inception-V4 and AlexNet, which are used for prognostic and diagnostic analysis. The DL architectures used are more effective at image classification than traditional CNN methods, as stated by (Pathan, 2021). False-negative predictions are reduced when AlexNet is utilized. Within Artificial Intelligence (AI), DL leads in diagnostic accuracy, particularly in the automated detection of lung diseases. The main aim of the CNN used in (Alshazly, 2021) is to examine the diagnostic accuracy achievable from CT scans with weak patient labels, which paves the way for rapid operation. The 3D CNN proposed by (Zheng, 2020) achieved better accuracy than the deep learning assisted methods used in (Alom, 2020).
CT scans are the predominant tool for determining whether patients are Covid-19 positive, as concluded by (He X. W., 2020). The deep learning architectures in common use are independent in terms of data. Further, in (Gozes, 2020), the algorithms operate as a pipeline comprising several image processing steps: lung segmentation, 2D classification and fine grain localization. Pre-processing is used in the proposed model to improve prediction accuracy. The first step localizes the lung region in the chest CT scan. A 2D ROI classification then distinguishes normal from abnormal patients. With enhanced deep learning methods for image analysis, Covid-19 and non-Covid-19 cases are classified with an accuracy of 94.8%. For automated detection of Covid-19, a deep learning assisted spatiotemporal information fusion model is utilized. The method enables auxiliary diagnosis of Covid-19 with effective accuracy, as stated by (Li T. W., 2021). CT scans are largely effective in diagnosing Covid-19. Binary classification achieves an accuracy of 86.9%, as claimed by (Shambhu, Binary classification of covid-19 ct images using cnn: Covid diagnosis using ct., 2021). A deep learning architecture comprising 13 segmentation models is utilized to diagnose Covid-19 accurately. A total of 1800 CT scan images are used to examine the effectiveness of the deep learning assisted LungINseg, as discussed by (Kumar, A review of modern technologies for tackling COVID-19 pandemic., 2020) (Pydala, 2023).

Problem Identification
Deep learning techniques for diagnosing Covid-19 face several problems. The problems found in current studies are examined and presented in this paper. In the medical domain, network design and the training process should be enhanced to achieve adequate accuracy in diagnosing Covid-19. Several studies have used 3D segmentation with Deep Neural Networks, which needs further improvement to yield precise results, as claimed by (Zheng, 2020). Dataset size is a main matter of concern: deep learning models must be designed to classify Covid-19 on massive datasets. Computational cost is a major issue in deep learning models and must be reduced, as suggested by (Addepalli Lavanya, 2023) (Maloth, 2012). Parameters play a significant role in determining the prediction accuracy of the algorithm. If any of the parameters is changed, there is a high risk of lowering the accuracy, as mentioned by (Shambhu, Binary classification of covid-19 ct images using cnn: Covid diagnosis using ct., 2021). Accuracy must therefore receive more attention in order to enhance the performance of the model.

Research Methods
The study aims to identify Covid-19 using effective DL based algorithms. Although conventional research has attempted this, it has hardly utilized attention mechanisms to classify lung CT images. Traditional works have also failed to capture cross spatial and cross channel inter-associations within multiple scopes. To resolve this, the present study proposes spatial and channel attention based feature refinement for detecting Covid-19 from lung CT scan images. It follows the sequential procedure depicted in figure.1, wherein a CT scan image dataset is considered. Data augmentation is used in this study; it is especially valuable in medical-imaging applications, where little data may be available for training. For CT images of non-Covid-19 and Covid-19 cases, augmentation can be beneficial for several reasons:
• Enhanced robustness: by generating new images that vary slightly from the actual ones, the model becomes robust to input variation. This is especially significant when CT images have been taken under varied circumstances or with different scanners.
• Enhanced generalization: CT images can show significant variations in image quality, including resolution, noise level and contrast. When the data are augmented with variations in these factors, the model can learn to generalize better to new, unseen data.
• Data balance: in several medical imaging applications (including non-Covid-19 and Covid-19 classification), the number of images per class may be imbalanced. Augmentation can be used to balance the dataset by creating additional minority class images.
• Averting overfitting: with dataset augmentation, models are less likely to overfit the training data. Overfitting happens when a model becomes specialized to the training data and therefore performs poorly on new, unseen data.
Overall, the merits of augmenting CT images include enhanced performance, better generalization and improved dataset balance. All these merits are significant in medical imaging applications, where accurate diagnosis is vital for patient care.
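The augmentation benefits listed above can be illustrated with a minimal Python sketch. The paper does not state which transformations were applied, so the random horizontal flip and mild Gaussian noise used here are assumptions, and the function names `augment` and `balance_with_augmentation` are hypothetical. The arrays are random stand-ins, not CT data.

```python
import numpy as np

def augment(image, rng, noise_std=0.01):
    """Return a slightly varied copy of a grayscale image with values in [0, 1]."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                      # assumed transform: horizontal flip
    out = out + rng.normal(0.0, noise_std, size=out.shape)  # assumed: mild noise
    return np.clip(out, 0.0, 1.0)

def balance_with_augmentation(minority, n_extra, seed=0):
    """Create n_extra augmented images drawn from the minority-class images."""
    rng = np.random.default_rng(seed)
    picks = rng.integers(0, len(minority), size=n_extra)
    return [augment(minority[i], rng) for i in picks]

# Toy stand-ins for minority-class CT slices (random arrays, not real data).
toy_rng = np.random.default_rng(1)
minority = [toy_rng.random((8, 8)) for _ in range(3)]
extra = balance_with_augmentation(minority, n_extra=5)
```

Adding the generated images to the minority class narrows the class gap, which is the data-balance use described in the list.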
The proposed system consists of sequential processes. Pre-processing is undertaken first to enhance image quality so that the image can be assessed optimally. In the pre-processing phase, undesired distortions are suppressed and features that are crucial for further processing are enhanced. In the spatial and channel wise attention model, the pre-processed image, with its pixels adjusted, is input to a CNN (Convolutional Neural Network) that extracts high level feature maps. The subsequent phase is the attention module, which comprises two attention modules, namely spatial and channel attention. Taking the high level feature maps from the initial phase as input, the attention model suppresses noise features and enhances pathological features. These modules capture spatial and channel inter-relationships over multiple scopes by using multiple convolutions to produce spatial and channel wise feature maps (as shown in figure.2). The attention module obtains attention weights by using large convolutions to capture dispersed features and small convolutions to capture concentrated features. The final phase is classification, which is based on the obtained features. In this stage, the two attention based feature maps from the second phase are combined as the input of the third phase. They are then classified by FC (Fully Connected) and Softmax layers. The overall proposed work is validated with performance metrics to confirm its efficacy. Generally, in a CNN, the neurons within a layer can be viewed as organized into a collection of K-D matrices termed feature maps. The model comprises two parallel attention modules: spatial and channel attention.
These modules capture spatial and channel inter-associations over multiple scopes, using multiple convolutions to obtain spatial wise and channel wise feature maps. After this, the descriptor obtained with residual weights from the spatial and channel wise features is weighted by the attention weights and then fed into the FC network to perform effective classification.

Spatial and Channel Wise Attention Model
Attention models used in DL offer several advantages. The attention mechanism allows a DL model to concentrate on the crucial parts of the input so as to produce better output representations. In addition, attention helps researchers interpret a DL model in terms of human perception. Considering this, the present study adopts Conv-BAM (Convolutional-Block Attention Module). The model is represented by equation.1, equation.2 and equation.3,
$$\mathrm{Feat}^{*} = \mathrm{Combine}_c(\mathrm{Feat}_1, \mathrm{Feat}_2) \tag{1}$$
$$\mathrm{Feat}_1 = \mathrm{Mul}_c(\mathrm{Atten}_c(\mathrm{FM}), \mathrm{FM}) \tag{2}$$
$$\mathrm{Feat}_2 = \mathrm{Mul}_s(\mathrm{Atten}_s(\mathrm{FM}), \mathrm{FM}) \tag{3}$$
In equation.1, equation.2 and equation.3, $\mathrm{Feat}_1$ and $\mathrm{Feat}_2$ indicate the channel wise and spatial wise attention based feature maps. $\mathrm{Combine}_c$ denotes the concatenation of the two attention based feature maps ($\mathrm{Feat}_1$, $\mathrm{Feat}_2$). Moreover, $\mathrm{Feat}^{*} \in \mathbb{R}^{H \times W \times 2C}$ is the overall attention based feature map. Further, $\mathrm{Atten}_c(\mathrm{FM}) \in \mathbb{R}^{1 \times 1 \times C}$ indicates the channel based attention weights determined from the feature map $\mathrm{FM} \in \mathbb{R}^{H \times W \times C}$. $\mathrm{Mul}_c$ denotes the channel wise multiplication of FM by the $1 \times 1 \times C$ weight coefficients in $\mathrm{Atten}_c(\mathrm{FM})$. Further, $\mathrm{Atten}_s(\mathrm{FM}) \in \mathbb{R}^{H \times W \times 1}$ indicates the spatial wise attention weights obtained from the input feature map $\mathrm{FM} \in \mathbb{R}^{H \times W \times C}$. Likewise, $\mathrm{Mul}_s$ is a spatial, element wise multiplication of FM by the $H \times W \times 1$ spatial weight coefficients in $\mathrm{Atten}_s(\mathrm{FM})$.
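The combination of the channel wise and spatial wise feature maps described for equation.1 to equation.3 can be sketched in Python as follows. This is an illustrative sketch only: the function name `conv_bam_combine` is hypothetical, the attention weights are fixed constants rather than learned values, and broadcasting is assumed to stand in for Mul c and Mul s.

```python
import numpy as np

def conv_bam_combine(fm, atten_c, atten_s):
    """fm: (H, W, C); atten_c: (1, 1, C); atten_s: (H, W, 1)."""
    feat1 = fm * atten_c   # Mul_c: each channel scaled by its weight (broadcast over H, W)
    feat2 = fm * atten_s   # Mul_s: each location scaled by its weight (broadcast over C)
    return np.concatenate([feat1, feat2], axis=-1)  # Combine_c: shape (H, W, 2C)

# Toy inputs with constant weights, chosen only to make the shapes visible.
fm = np.ones((7, 7, 4))
atten_c = np.full((1, 1, 4), 0.5)
atten_s = np.full((7, 7, 1), 0.25)
feat_star = conv_bam_combine(fm, atten_c, atten_s)
```

Concatenation along the last axis doubles the channel count, which matches the stated size of Feat * (H * W * 2C).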

CWAM (Channel Wise Attention Based Module)
CWAM extracts the relationships amongst the feature planes produced by the various convolutions within a feature map. Cross channel associations differ across ranges, so CWAM is designed to combine the cross channel associations over varied ranges. The resulting high level Feature Map (FM) satisfies $\mathrm{FM} \in \mathbb{R}^{7 \times 7 \times 512}$. The values of the individual feature map dimensions are considered along with the influence of the cross channel range (K) on the cross channel association, where K = 1, 2, 3 represents the filter size. To take full advantage of the associations amongst the feature planes produced by the individual convolutional kernels, global spatial information is first squeezed into a channel descriptor. That is, global average-pooling is applied to each plane of the high level feature map, as given by equation.4,
$$\mathrm{out}_{cd} = \frac{1}{H \times W} \sum_{x=1}^{H} \sum_{y=1}^{W} \mathrm{Feat}_c(x, y) \tag{4}$$
wherein $\mathrm{out}_{cd}$ indicates the channel descriptor obtained by squeezing, $H \times W$ indicates the spatial dimensions of the feature map, and $\mathrm{Feat}_c(x, y)$ indicates the individual pixel point $(x, y)$ in the feature plane. Relationships across the channels are computed through three sets of coefficient matrices. The CWAM weights are then computed for the channel descriptor $\mathrm{out}_{cd} \in \mathbb{R}^{1 \times 1 \times C}$. CWAM is learned as per equation.5.
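A minimal Python sketch of the channel wise attention described above follows. The squeeze step implements global average-pooling as described for equation.4. The excitation step is an assumption: it is modelled as three 1-D convolutions across the channel axis (filter sizes 1, 2 and 3) whose outputs are averaged and passed through a sigmoid, mirroring the form of the spatial attention in equation.14. The function name `cwam` and the random weights are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cwam(fm, kernels):
    """fm: (H, W, C); kernels: three 1-D weight vectors of length 1, 2 and 3."""
    out_cd = fm.mean(axis=(0, 1))                  # squeeze: one value per channel
    # Assumed cross-channel step: same-length 1-D convolution for each range K.
    responses = [np.convolve(out_cd, w, mode="same") for w in kernels]
    atten_c = sigmoid(np.mean(responses, axis=0))  # average the three ranges, then sigmoid
    feat1 = fm * atten_c.reshape(1, 1, -1)         # channel-wise multiplication
    return atten_c, feat1

rng = np.random.default_rng(0)
fm = rng.random((7, 7, 512))                       # size stated for the high level FM
kernels = [rng.normal(size=k) for k in (1, 2, 3)]  # random stand-ins for learned weights
atten_c, feat1 = cwam(fm, kernels)
```

Each channel receives a single weight between 0 and 1, so informative feature planes can be emphasized and the others attenuated.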

SWAM (Spatial Wise Attention Based Module)
SWAM captures the associations amongst pixels or receptive areas by including the internal spatial associations of the feature maps generated by each convolutional kernel at a specific channel. In particular, SWAM takes the feature maps produced by the topmost feature extraction layer and derives a spatial attention map through an attention model based on multiple kernel sizes. CWAM and SWAM support one another and can better capture the influence and interactions between the two attention based models (Rudra Kumar, 2022). SWAM includes average pooling functions that differ from the prevailing ones in that SWAM averages the individual pixels along the channel dimension. The average pooling function is given by equation.13,
$$\mathrm{out}_{sd} = \frac{1}{\mathrm{Chann}} \sum_{k=1}^{\mathrm{Chann}} \mathrm{Feat}_s(k) \tag{13}$$
In equation.13, $\mathrm{out}_{sd}$ indicates the spatial descriptor obtained by squeezing. Further, Chann denotes the number of channels within the feature map (Feat), and $\mathrm{Feat}_s(k)$ denotes the local pixel of channel $k$ at a particular spatial location. Three coefficient matrices of different sizes are used to compute the relationships across spatial locations, and thereby to compute SWAM for the spatial descriptor $\mathrm{out}_{sd} \in \mathbb{R}^{H \times W \times 1}$. SWAM is learned as per equation.14,
$$\mathrm{Atten}_s = \mathrm{sigmoid}\left(\frac{\mathrm{Weight}_s^{k=1}\,\mathrm{out}_{sd} + \mathrm{Weight}_s^{k=2}\,\mathrm{out}_{sd} + \mathrm{Weight}_s^{k=3}\,\mathrm{out}_{sd}}{3}\right) \tag{14}$$
In equation.14, $\mathrm{Weight}_s^{k=1}$, $\mathrm{Weight}_s^{k=2}$ and $\mathrm{Weight}_s^{k=3}$ indicate the coefficient matrices used for computing the spatial attention, with k = 1, 2, 3. The coefficient matrix for k = 1 is given by equation.15.
The above process is executed using convolution so that it can be updated through end-to-end neural network training. Initially, the feature information at each spatial point of the high level feature map from the initial phase is squeezed into a representative descriptor. Subsequently, all the descriptors are aggregated into a spatial descriptor ($\mathrm{Atten}_2 \in \mathbb{R}^{H \times W \times 1}$). Following this, convolutional operations are used to learn the associations amongst pixels and local areas, so that the network is able to learn associations across space. As inter-associations differ across cross-space sizes, the convolution kernel sizes are selected as (1,1), (2,2) and (3,3), so that associations are learned over cross-space regions of different sizes. Subsequently, the spatial descriptors are obtained as $\mathrm{SAtten}_1 = f^{2D}_{1 \times 1}(\mathrm{Atten}_1) \in \mathbb{R}^{H \times W \times 1}$, $\mathrm{SAtten}_2 = f^{2D}_{2 \times 2}(\mathrm{Atten}_2) \in \mathbb{R}^{H \times W \times 2}$ and $\mathrm{SAtten}_3 = f^{2D}_{3 \times 3}(\mathrm{Atten}_3) \in \mathbb{R}^{H \times W \times 3}$. The final SWAM weights are calculated by a sigmoid function after taking the average of the three descriptors. Lastly, spatial wise multiplication is performed between the obtained attention weights and the original high level feature map to obtain the attention weighted feature map in the spatial dimensions. The complete computation for SWAM is given in equation.16, equation.17 and equation.18,
$$\mathrm{Atten}_2 = \mathrm{AvgPool}_s(F) \tag{16}$$
$$\mathrm{Atten}_s = \mathrm{sigmoid}\left(\mathrm{mean}\left(f^{2D}_{1 \times 1}(\mathrm{Atten}_1);\ f^{2D}_{2 \times 2}(\mathrm{Atten}_2);\ f^{2D}_{3 \times 3}(\mathrm{Atten}_3)\right)\right) \tag{17}$$
$$\mathrm{Feat}_2 = \mathrm{Mul}_s(\mathrm{Atten}_s, \mathrm{Feat}) \tag{18}$$
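The SWAM computation described above (channel averaging, convolutions with kernel sizes 1, 2 and 3, averaging, sigmoid, then spatial multiplication) can be sketched in Python as below. This is a sketch under stated assumptions: the helper `conv2d_same` and the function `swam` are hypothetical names, zero padding is assumed so that every convolution keeps the H * W size, all three convolved maps are assumed to have a single channel so they can be averaged, and the random kernels stand in for learned weights.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 2-D cross-correlation that keeps the H x W size (assumed padding)."""
    k = w.shape[0]
    lo = (k - 1) // 2
    hi = k - 1 - lo
    padded = np.pad(x, ((lo, hi), (lo, hi)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * w)
    return out

def swam(fm, kernels):
    """fm: (H, W, C); kernels: square weight matrices of size 1, 2 and 3."""
    out_sd = fm.mean(axis=2)                                # average along channels
    maps = [conv2d_same(out_sd, w) for w in kernels]        # one map per kernel size
    atten_s = 1.0 / (1.0 + np.exp(-np.mean(maps, axis=0)))  # mean, then sigmoid
    feat2 = fm * atten_s[:, :, None]                        # spatial-wise multiplication
    return atten_s, feat2

rng = np.random.default_rng(0)
fm = rng.random((7, 7, 16))
kernels = [rng.normal(size=(k, k)) for k in (1, 2, 3)]      # stand-ins for learned weights
atten_s, feat2 = swam(fm, kernels)
```

The small kernel responds to concentrated features and the larger kernels to more dispersed ones, which is the stated motivation for mixing kernel sizes.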

Residual Attention with Convolutional-Block Attention Based Module
The present study uses residual attention to recover crucial information. Furthermore, to provide high quality images at multiple time stages without re-encoding the image, a residual connection is used that alters the image content relative to the previously observed image. Figure.3 shows the Conv-BAM model, which comprises SWAM and CWAM. When a feature map (FM) is taken as input, Conv-BAM computes two FMs through CWAM and SWAM. CNNs have revolutionized image classification. Progress in hardware and algorithms has made it possible to add more layers to deep CNNs. As network depth increases, training becomes harder owing to the vanishing gradient issue. Networks with many layers become unstable as the gradient approaches zero in the initial layers: with each additional layer, the gradient becomes smaller and less significant. The vanishing gradient therefore degrades network performance, and adding more layers worsens the issue. To solve the vanishing gradient issue, residual connections are used with Conv-BAM. These connections merge a layer's output with the previous layer's input, which ensures that gradient values do not vanish abruptly. Generally, a DL model attempts to learn a mapping function $H(x)$ from input $x$ to output $y$, as given in equation.20,
$$H(x) = y \tag{20}$$
In the residual block, rather than learning the direct mapping, the network learns the difference between the mapping and the actual input, as given by equation.21,
$$F(x) = H(x) - x \tag{21}$$
Re-arranging gives $H(x) = F(x) + x$. Thus, residual connections learn the residual $F(x)$ with $x$ as input and $H(x)$ as the actual result. This approach helps when increasing the depth of a neural network.
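The effect of a residual connection on the gradient can be checked with a small numerical example in Python. It is a toy scalar sketch, not the network used in the study: `tiny` is a hypothetical layer whose own gradient is close to zero, and the gradient is estimated by central finite differences.

```python
def residual_block(x, residual_fn):
    """Return H(x) = F(x) + x, where residual_fn plays the role of F."""
    return residual_fn(x) + x

def numeric_grad(fn, x, eps=1e-6):
    """Central finite-difference estimate of d fn / d x."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)

tiny = lambda x: 1e-4 * x   # toy layer with a nearly vanishing gradient

plain_grad = numeric_grad(tiny, 2.0)                              # about 1e-4
skip_grad = numeric_grad(lambda x: residual_block(x, tiny), 2.0)  # about 1 + 1e-4
```

Because the skip path contributes a derivative of 1, the gradient through the block stays close to 1 even when the gradient of F alone is nearly zero.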
The traditional residual attention model contains an up-sampling layer, and model performance can be enhanced by improving the up-sampling layers. The attention block is therefore extended by integrating dual up-sampling layers. The encoder and decoder are followed by dual convolutional layers and a sigmoid function (as shown in figure.4). The model shown in figure.4 consists of a channel based attention module, a global max-pooling module, an average-pooling module and a shared MLP (Multi-Layer Perceptron) comprising a three dimensional neural network. The FM is passed through max-pooling and average-pooling functions and then through the shared MLP. The outputs of the shared MLP are added element wise, and the weight is obtained after a sigmoid function. Subsequently, the refined features are considered for spatial and channel wise attention. They are compressed by max-pooling and average-pooling along the channel dimension. The two resulting FMs are merged along the channel axis to obtain a two-channel FM. This is then reduced to a single channel before the sigmoid function is applied to generate the resulting weight. The overall output of the module is given in equation.22,
$$\mathrm{atten}_{model}(x) = (1 + H(x)) * T(x) \tag{22}$$

Fig. 4. Residual Descriptor with Convolutional Block Attention Module
The proposed residual descriptor with Conv-BAM is constructed from multiple stacks of a fundamental unit module. Blocks are stacked to avert overfitting and attain ideal performance. Attention is designed to highlight significant feature maps, as Covid-19 infections can be clearly visible in CT images. The residual descriptor obtains high level features and provides the input to the attention block. The attention model extracts specialized low level features from the residual input: the image is partitioned into high level features, from which certain low level features are extracted. Attention and residual layers are stacked alternately three times. The residual layer extracts high level features from an input image, which are then passed to the attention block that extracts low level features. These features are fed as input to the subsequent residual connection. The module functions as a combined bottom-up and top-down approach. In this case, the top-down approach provides dense features, while the bottom-up approach provides low resolution feature maps. The final phase of the proposed module adds the extracted features to the original input of the block so as to perform effective classification.

Results & Discussion
Results procured through the execution of the proposed system are discussed in this section with description about the considered dataset, performance metrics, EDA (Exploratory Data Analysis), experimental results, performance and comparative analysis.

Dataset Description
The present research uses the Covid CT dataset. As CT scans are reliable for fast and accurate testing and screening of Covid-19, this study considers a publicly accessible Covid-19 CT dataset. Image augmentation is used to enlarge the dataset by creating new images that are variations of the actual images. The dataset contains 746 actual images, with 275 positive cases and 471 negative cases. Augmentation adds 162 images, giving 908 images in total. In this way, the present work fosters the study and development of DL based models that predict whether an individual is infected with Covid-19 through the analysis of his or her CT scans. The dataset is available from https://www.kaggle.com/datasets/luisblanche/covidct?select=CT_NonCOVID

Performance Metrics
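The metrics used to assess the performance of the proposed work are discussed in this section.
A) Accuracy
It indicates the proportion of correct classifications overall and is given by equation.23,
$$\mathrm{Accuracy} = \frac{Tr_{-ve} + Tr_{+ve}}{Tr_{-ve} + Fa_{-ve} + Tr_{+ve} + Fa_{+ve}} \tag{23}$$
where $Tr_{+ve}$ and $Tr_{-ve}$ are the true positives and true negatives, and $Fa_{+ve}$ and $Fa_{-ve}$ are the false positives and false negatives.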
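The accuracy formula of equation.23 can be computed from confusion matrix counts as in the Python sketch below. Precision, recall and F1-score are included with their standard textbook definitions, since their equations do not appear in this section. The counts are hypothetical and are not results from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Equation 23: correct predictions divided by all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Standard definition: share of predicted positives that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Standard definition: share of actual positives that are found."""
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    """Standard definition: harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical counts for illustration only.
tp, tn, fp, fn = 45, 50, 3, 2
acc = accuracy(tp, tn, fp, fn)   # 95 correct out of 100
prec = precision(tp, fp)         # 45 / 48
rec = recall(tp, fn)             # 45 / 47
f1 = f1_score(tp, fp, fn)
```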

EDA (Exploratory Data Analysis)
EDA is the crucial step of performing an initial analysis of the data to expose patterns, spot anomalies, test hypotheses and check assumptions using graphical representations and summary statistics. The Covid-19 and Non-Covid-19 images are shown in figure.5. The target count plot for Covid and Non-Covid cases is shown in figure.6, wherein Non-Covid-19 cases outnumber Covid-19 cases.

Experimental Results
The outcomes obtained by executing the proposed system are discussed in this section (as shown in figure.7). Grad-CAM is presented here; it uses the gradients of the classification score with respect to the convolutional feature maps to find the parts of the input image that most influence the classification score. Areas where this gradient is large are the areas on which the overall score depends most. Figure.7a exposes Non-Covid-19 cases, Figure.7b represents the Non-Covid-19 cases, and Figure.7c denotes the Covid-19 cases.
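The Grad-CAM computation described above can be sketched in Python once the activations of the last convolutional layer and the gradients of the class score with respect to them are available. How those two arrays are obtained depends on the deep learning framework and is not shown here. The function name `grad_cam` and the toy arrays are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """activations, gradients: (H, W, K) arrays taken from the last conv layer."""
    weights = gradients.mean(axis=(0, 1))        # one importance weight per feature map
    cam = (activations * weights).sum(axis=2)    # weighted sum over the K feature maps
    cam = np.maximum(cam, 0.0)                   # ReLU keeps only positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam       # scale to [0, 1] for display

# Toy example: channel 0 supports the class, channel 1 opposes it.
activations = np.zeros((3, 3, 2))
activations[1, 1, 0] = 2.0
activations[0, 0, 1] = 1.0
gradients = np.zeros((3, 3, 2))
gradients[..., 0] = 1.0
gradients[..., 1] = -1.0
heat = grad_cam(activations, gradients)
```

In the toy example only the centre location lights up, because it is the only location where a positively weighted feature map is active.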

Performance Analysis
The performance of the proposed system is assessed with a confusion matrix and a loss curve, and the results are presented in this section. First, a confusion matrix is constructed, which is a table that summarizes and visualizes classifier performance. It provides information about the errors made by the classifier and the kinds of error that occur, and it shows where the classifier confuses one class with another when making predictions. The confusion matrix is given in figure.8, and the loss curve is shown in figure.9. Although some errors exist, the correct classification rate is higher than the misclassification rate, which reveals the efficacy of the proposed system.

Internal Comparison
The proposed system has been assessed to determine its effectiveness in classifying Covid and Non-Covid cases. The respective outcomes are discussed in this section (as shown in table-1, with the equivalent graphical depiction in figure.10). As shown in figure.10, the accuracy of the proposed system is 0.9789 for Non-Covid cases and 0.9774 for Covid cases. Furthermore, the precision, F1-score and recall of the proposed system are also found to be better. Additionally, the overall performance has been assessed, and the corresponding results are shown in table-2, with the equivalent graphical depiction in figure.11. As shown in figure.11, the overall accuracy of the proposed system is 97.82%, while precision, recall and F1-score are each 96%. These outcomes confirm the efficacy of the proposed system.

Comparative Analysis
The proposed system has been validated by comparing its accuracy with that of traditional works. The results are shown in table-3, with the corresponding graphical depiction in figure.12. As shown in figure.13, among the existing methods, random initialization achieves 83% accuracy, TL achieves 87.1% and TL+CSSL achieves 89.1%, whereas the proposed method achieves a higher accuracy of 97.82%. Likewise, the F1-score of random initialization is 83.2% and that of TL+CSSL is 89.6%, whereas the proposed system achieves a higher F1-score of 96%. The comparative analysis shows that the proposed system performs better than the existing systems. The present work uses spatial and channel attention based feature refinement for detecting Covid-19. Moreover, residual connections with Conv-BAM are employed to minimize overfitting. These advantages enable the proposed system to show better outcomes.

[Table: Performance Analysis (columns: Accuracy, F1-score)]

Conclusion
The study aimed to detect Covid-19 from a lung CT scan image dataset and to resolve the drawbacks of conventional works with regard to prediction rate. To accomplish this, the study used a Residual Descriptor with Conv-BAM to obtain refined features from SWAM and CWAM and thereby improve the classification rate. The proposed method was assessed with regard to accuracy, recall, F-measure and precision to demonstrate its efficacy over existing works for detecting Covid-19. The outcomes showed the better performance of the proposed system, with 97.82% accuracy. The model performs well for binary classification, determining the presence or absence of Covid-19 infection. However, Covid-19 occurs in several degrees of severity, and identifying them is valuable in medical practice for choosing effective treatments and monitoring recovery. Like any other study, this study has certain limitations. First, this DL based Covid detection method is data driven and depends on the patterns and features of the training data; biases in the training data can affect the results, which limits generalizability. Second, the outcome of the model is difficult for clinicians to interpret, which reduces clinicians' confidence in its outputs. Third, the model is trained on a specific set of data, which may lead to overfitting. Hence, in future, the method can be extended to address stage wise analysis of Covid-19 infection and to integrate data from additional sources and modalities.