Higher-order tensor decomposition based scalp-to-intracranial EEG projection for detection of interictal epileptiform discharges

Objective. Interictal epileptiform discharges (IEDs) occur between two seizures onsets. IEDs are mainly captured by intracranial recordings and are often invisible over the scalp. This study proposes a model based on tensor factorization to map the time-frequency (TF) features of scalp EEG (sEEG) to the TF features of intracranial EEG (iEEG) in order to detect IEDs from over the scalp with high sensitivity. Approach. Continuous wavelet transform is employed to extract the TF features. Time, frequency, and channel modes of IED segments from iEEG recordings are concatenated into a four-way tensor. Tucker and CANDECOMP/PARAFAC decomposition techniques are employed to decompose the tensor into temporal, spectral, spatial, and segmental factors. Finally, TF features of both IED and non-IED segments from scalp recordings are projected onto the temporal components for classification. Main results. The model performance is obtained in two different approaches: within- and between-subject classification approaches. Our proposed method is compared with four other methods, namely a tensor-based spatial component analysis method, TF-based method, linear regression mapping model, and asymmetric–symmetric autoencoder mapping model followed by convolutional neural networks. Our proposed method outperforms all these methods in both within- and between-subject classification approaches by respectively achieving 84.2% and 72.6% accuracy values. Significance. The findings show that mapping sEEG to iEEG improves the performance of the scalp-based IED detection model. Furthermore, the tensor-based mapping model outperforms the autoencoder- and regression-based mapping models.


Introduction
Epilepsy is a chronic brain disease characterized by epileptic seizures occurring due to excessive discharges of a group (or groups) of neurons in the cerebral cortex or hippocampus. It is associated with interictal epileptiform discharges (IEDs) occurred between two seizure onsets, which can be captured by electroencephalography (EEG) [1]. IEDs are transient activities distinguishable from the background activity. They consist of spikes, lasting 20 to 70 ms; sharp waves, lasting 70 to 200 ms; or polyspikes, a sequence of two or more spikes, which are variably followed by slow waves, lasting longer than 125 ms [2]. In terms of frequency range, the IEDs mostly consist of higher frequency components because of having very short duration. Bao et al [3] showed that the IEDs/spikes may occur at the frequency range of 15 to 50 Hz. Therefore, lower than 4 Hz frequencies are filtered in most IED studies [4][5][6]. Some studies investigate higher frequency rates of gamma band in the IED detection studies [5][6][7][8].
IED detection can establish a guideline for preictal state monitoring, seizure prediction, treatment and surgical planning. Seizure prediction alleviates taking regular anticonvulsants and fall injury. Now, detecting IEDs from the scalp EEG (sEEG) up to a sufficient accuracy has raised hopes for developing a new direction for more concise seizure monitoring and prediction.
The foramen ovale (FO) electrodes are used for recording intracranial EEG (iEEG) signals from patients suffering from temporal lobe epilepsy [9]. Recording via FO electrodes is a semi-invasive recording technique and an alternative to invasive recording techniques in the presurgical evaluation of patients with temporal lobe epilepsy [10]. The FO electrodes are bilaterally introduced via anatomical holes into the ambient cistern [11] and provide an opportunity to simultaneously record scalp and mesial temporal structures without disruption to brain coverings [12].
The sEEG recordings suffer from low sensitivity in capturing the IED signatures. Most IEDs are generated from deep sources [13][14][15]. Therefore, they are attenuated by being away from the source location and the resistance of the skull. Studies conducted on simultaneous sEEG and iEEG have proven that a large proportion of IEDs is invisible over the scalp [14,16]. Nayak et al reported that only 9% of IEDs are observable on the sEEG recordings [14]. In [16], the authors detected 22% of IEDs over the scalp using 19 scalp EEG channels. On the other hand, detection of IEDs from sEEG recordings is of paramount importance because of being a noninvasive technique and easy to record. Therefore, detecting scalp-invisible IEDs from over the scalp motivates a new direction in the IED detection and seizure prediction spheres.
Most of previously developed algorithms either detect IEDs from the iEEG recordings [17][18][19] or detect only scalp-visible IEDs from the sEEG recordings [20,21]. There are very few methods developed to detect both scalp-visible and scalp-invisible IEDs from over the scalp [22][23][24]. Spyrou et al developed a model based on time-frequency (TF) features to detect both scalp-visible and scalp-invisible IEDs from the sEEG signals [24]. The authors utilized concurrent sEEG and iEEG recordings in their study. The iEEG was used to annotate IEDs which were finally detected from the concurrent sEEG. The concurrent sEEG and iEEG recordings means that sEEG and iEEG are recorded simultaneously from the same subject.
Most of IED detection systems are designed based on features extracted from each channel individually [25]. In [26,27], a template matching method was employed for IED/spike detection. In template matching, firstly, an IED template is manually selected from the EEG recordings as the ground truth. Then, if the cross-correlation of a waveform with the template exceeds a predefined threshold value, the waveform is recognized as an IED segment. The mimetic technique is another channel-based method in which the EEG waveforms are usually decomposed into two half waves; then, the distinctive characteristics of IEDs such as amplitude, duration, height, sharpness, and slope of left and right half-wave spikes, usually used by neurologists for identifying IEDs, are extracted and used for automatically detecting the IEDs [28][29][30][31]. The major drawback of this particular method is that it ignores the IED spike shape variations across subjects, ages, and even trials. TF representation, separating the main components of each channel from background activity, is another used popular method in IED detection studies. The methods used for extracting the TF features include spectrogram [17,24], wavelet [32][33][34], Fourier [35], and Hilbert [36] transforms. However, some common features are shared among channels, trials, and subjects. Therefore, considering different EEG diversities in the analysis significantly boosts the IED detection systems performance.
Multi-way analysis (by means of tensor factorization), as a hot topic in signal processing, has been used in seizure and IED detection studies [22,23,[37][38][39][40]. In multi-way analysis, many aspects of data (e.g. time, space, frequency, trial, and subject) are integrated in an algorithm to more accurately detect, track and/or localize the brain-elicited signal sources. For the first time, Spyrou et al [37] applied tensor decomposition to detect IEDs from iEEG recordings by constructing a four-way tensor of time, frequency, channel, and trials and decomposing the tensor using Tucker decomposition (TD). In [23], the authors reduced the dimension of features by tensor factorization in order to detect IEDs from the concurrent sEEG and iEEG recordings. The iEEG was used for IED labeling and the sEEG for IED detection. The spectrogram method was applied to extract TF features. Then, a four-way tensor with the dimensions of channel, time, frequency, and trial (both IED and non-IED segments) was constructed. Next, TD was employed to decompose the tensor into the core tensor, spatial, temporal, and spectral factors. Finally, the test data was projected onto the factor matrices for IED detection. By exploiting small number of component, the number of features was reduced from 1260 to 64 without any reduction in the average classification accuracy. Quite recently, a model was proposed to detect IEDs from the sEEG recordings based on CAN-DECOMP/PARAFAC decomposition (CPD) and the probability of a waveform being an IED [22]. The IEDs are similar to other ordinary cortical activities and artifacts such as eye blink artifacts. Thus, their labeling involves uncertainty, meaning that the clinicians are not always sure whether the waveform is an IED or not. In [22], we incorporated the uncertainty in IED labeling in an IED detection system. Using this algorithm, all IEDs are concatenated into a threeway tensor-time, channel, and IED segment-and a probability tensor based on the probability of a waveform being an IED is constructed. After decomposing the data and probability tensors simultaneously using a weighted CPD algorithm [41] into temporal, spatial, and segmental factors, both IED and non-IED segments are projected into the spatial factors to extract the most discriminative features. Finally, the TF features of the projected segments are computed and used for classification.
The iEEG recordings enjoy high temporal resolution and hence high sensitivity in detecting IEDs/spikes. Therefore, if we map the sEEG to iEEG recordings by developing an effective projection model, the sensitivity of sEEG signals in the identification of IEDs can significantly increase. In a braincomputer-interface application, Kaur et al modeled the sEEG using a linear combination of iEEG via ordinary least-squares regression [42]. Spyrou et al trained a dictionary and a mapping function to map the sEEG to iEEG to increase the quality of data [43]. Quite recently, Antoniades et al developed a deep learning architecture using autoencoders (AEs) and convolutional neural networks (CNNs) to map the sEEG to iEEG to detect IEDs from concurrent sEEG and iEEG recordings [44]. Their model significantly outperformed the model in which only sEEG signals were employed for IED detection. These studies have motivated us to develop a model to map the TF features of sEEG to the TF features of iEEG recordings to detect IEDs from concurrent sEEG and iEEG recordings. Our proposed method is based on tensor factorization.
Tensor factorization has been used in mapping models in image and video processing domains [45][46][47]. Macêdo mapped a photographed expression performed by a given subject onto the photograph of another person's face by employing TD [45]. In [46], the authors mapped the video recorded performance of one person to facial animation of another. Wang et al [47] used tensor decomposition to model and synthesize the facial expressions of new persons. They extracted facial features by employing principle component analysis and concatenated them onto a threeway tensor of subject, expression, and feature, A ∈ R I×J×K , where I and J are respectively the number of subjects and facial expressions, and K indicates the dimension of the facial features. TD is employed to decompose the tensor into the core tensor and matrix factors: Then, two tensors are defined for the expression and person modes: Now, by having an expression of a new person, his or her other expressions can be synthesized through mapping the features of a given expression to the extracted expression tensor in (1). Similarly, if an unknown expression of a known person is given, it can be mapped onto the extracted person tensor in (2) to synthesize the same expression of other persons.
Here, we propose a model to map the TF features of sEEG to those of iEEG signals via tensor factorization techniques. The concurrent sEEG and iEEG recordings are utilized in this study. At first, both sEEG and iEEG recordings are decomposed by continuous wavelet transform (CWT) to obtain the TF features. All IED segments of iEEG recordings from the training dataset are concatenated into a four-way tensor of time, frequency, channel, and IED segment. The tensor is then decomposed into temporal, spectral, spatial, and segmental factors using TD and CPD. Finally, both IED and non-IED segments of sEEG from both training and test datasets are projected onto the temporal components to extract the discriminative features for IED detection. Two types of classifiers, namely decision tree ensemble (DTE) and K-nearest neighbors (KNN), are then applied for classification and the results are compared.
The rest of the paper is structured as follows. The mapping method is described in section 2. The experimental matters such as dataset description, preprocessing, and competing methods are provided in section 3. The results and discussion are respectively presented in sections 4 and 5. Section 6 concludes our work.

Proposed framework
The IED segments may share some temporal, spatial, and spectral features with each other. Meanwhile, non-IED segments can be non-epileptic spikes or normal cortical activities and thus it is assumed that there is no common feature among them. In [20], the authors show that when only IEDs are concatenated into a tensor the method provides significantly better performance as compared to when both IEDs and non-IEDs are included in the concatenation. Therefore, we are interested in concatenating only the IED segments into a tensor. The proposed method has three stages: (a) concatenating the TF features of intracranial IEDs into a tensor, (b) decomposing the constructed tensor, and (c) mapping the TF features of sEEG to those of iEEG recordings.

Concatenating the TF features of intracranial IEDs into a tensor
Suppose our training dataset consists of N IED segments from the concurrent iEEG recordings, X n ∈ R Lt×M fo , n = {1, · · · , N}, where L t is the number of time samples and M fo corresponds to the number of FO channels. CWT is applied to X n and the magnitude of wavelet coefficients is calculated as TF features. Then, the TF features of IEDs are concatenated into a four-way tensor, X ∈ R L×F×M fo ×N , where L and F respectively correspond to the number of temporal and spectral features. In CWT, complex Morlet wavelet is used as mother wavelet. Wavelet toolbox from Matlab 2018b is utilized to calculate CWT. In Matlab wavelet toolbox, the minimum and maximum scales are determined automatically based on the energy spread of the wavelet in frequency and time.

Decomposing the constructed tensor
In the second step, the constructed tensor is decomposed into temporal, spectral, spatial, and segmental factors. TD and CPD, which are the most popular tensor decomposition techniques, are employed in this study.

CPD based tensor decomposing
CPD decomposes the tensor into the sum of rank-one tensors. In the alternating least-squares (ALS)-based CPD algorithm, the factor matrices are exploited one by one. That is, the least-squared problem is solved for one of the matrices while holding other matrices fixed. However, ALS is unable to capture the underlying multilinear structure in the data. For solving this problem, Acar et al proposed a CPD algorithm based on a gradient optimization approach to find all factor matrices simultaneously [41]. Here, this gradientbased algorithm is used for optimizing the CPD problem.
Considering a four-way tensor, the CPD algorithm aims to find components such that where a ∈ R L , b ∈ R F , c ∈ R M fo , and d ∈ R N for r = 1, · · · , R are respectively temporal, spectral, spatial, and segmental vector factors. Note that R is the number of components and the symbol • indicates vector outer product. According to the 'Kruskal operator' [48], providing shorthand notation for sum of the outer products of the columns of a set of matrices, (3) can be modified to where A ∈ R L×R and B ∈ R F×R correspond respectively to the temporal and spectral factors, and C ∈ R M fo ×R and D ∈ R N×R are respectively the spatial and segmental factors. The problem of calculating the factor matrices in (4) can be formulated as a least-squares optimization problem: The gradient of the objective function j can be easily achieved by computing the partial derivatives with respect to each factor A, B, C, and D. The reader is referred to [41] for more details. After achieving the derivatives, any first-order optimization method can be applied. We employ a generic nonlinear conjugate gradient method like [41].

TD based tensor decomposition
TD enjoys more flexibility than CPD. Unlike CPD in which the core-tensor is diagonal, the core-tensor does not need to be diagonal in TD, while the factors are orthogonal matrices. We employ the higherorder orthogonal iteration algorithm proposed by De Lathauwer et al [49] to solve the TD problem. Suppose we are given the four-way IED tensor X ∈ R L×F×M fo ×N . The aim is to find a tensorX ∈ R L×F×M fo ×N , having rank 1 (X ) = R 1 , rank 2 (X ) = R 2 , rank 3 (X ) = R 3 , and rank 4 (X ) = R 4 , that minimizes the least-squares cost function The tensorX can be decomposed tō , all with orthonormal columns, are respectively temporal, spectral, spatial, and segmental factor matrices. G ∈ R R1 ×R2 ×R3 ×R4 is the core tensor and its entries illustrate the level of interaction between the factor matrices. Note that × i indicates the ith mode product. The reader is referred to [49] for more details.

Mapping the TF features of sEEG to those of iEEG recordings
Suppose our training and test datasets consist of K IED and non-IED segments from sEEG recordings, Y c ∈ R L×Msc , c = {1, · · · , C}, where M sc indicates the number of scalp channels. CWT is employed to decompose the segments. The magnitude of wavelet coefficients is obtained as TF features. Then, the TF features of each segment are concatenated into a three-way tensor, Y c ∈ R L×F×Msc , c ∈ C. Finally, Y c are projected onto the temporal factors obtained using CPD and TD, respectively referred to mapping scalp to intracranial recordings via CPD (MStI-CDP) and via TD (MStI-TD). Here, the temporal components can provide the most discriminative features since the IED signatures are captured by intracranial and scalp electrodes at the same time. Meanwhile, the spatial distributions of intracranial and scalp signals are generally different. Thus, projecting the segments onto the spatial components is meaningless.

MStI-CDP
The TF features of IED and non-IED segments extracted from sEEG recordings, Y c , are projected onto the temporal factors achieved using CPD, A, as follows: where Z c ∈ R R×F×Msc for c = {1, · · · , C} is the projected segment of {Y c , c ∈ C} and (.) ⊤ indicates transpose of a matrix. Now, the projected segments, Z c , are used for classification. Figure 1 shows the schematic of the proposed MStI-CPD model.

MStI-TD
After obtaining the temporal factors using TD,Ā, the TF features of IED and non-IED segments extracted from sEEG recordings, Y c , are projected onto them as follows:Z whereZ c ∈ R R1 ×F×Msc for c = {1, · · · , C} indicates the projected segment of {Y c , c ∈ C}. Finally, the projected segments,Z c , are employed to classify IEDs and non-IEDs. The schematic of the proposed MStI-TD model is presented in figure 1.

Dataset
The sEEG and iEEG signals of 18 patients suffering from mesial temporal lobe epilepsy were simultaneously recorded in the Department of Clinical Neurophysiology at the Maudsley and Kings College Hospitals. The signals were recorded at a sampling rate of 200 Hz. A bandpass filter with cutoff frequency of 0.3 and 70 Hz was used during recording. Twenty standard chloride silver cup electrodes were employed according to the 'Maudsley' electrode placement system [50,51] to record the sEEG signals. This system is essentially similar to the 10-20 system. But the mid-temporal, posterior-temporal and occipital electrodes in the Maudsley system are approximately 20 mm lower than in the 10-20 system [52]. The advantage of this system as compared to the standard 10-20 system is that it provides more extensive and adequate coverage of the lower part of the cerebral convexity adapting itself to cranial asymmetries [53]. For recording iEEG, two flexible bundles of six electrodes, a total of 12 electrodes, were inserted bilaterally via anatomical holes to lie next to mesial temporal structures. Both sEEG and iEEG signals were recorded as a common reference to Pz. A period of 20-min concurrent recordings of each epileptic subject is analyzed in this study.

IED scoring
The iEEG was used as the ground truth for scoring the IEDs by an expert epileptologist who scored the IEDs based on the morphology and spatial distribution of the observed waveforms and put them in different groups. In our previous work [24], the scoring method has been completely described. Briefly, each IED is classified into one of the following groups: (a) scalp-invisible IED (b) scalp-visible IED by considering the concurrent iEEG (c) scalp-visible IED without considering the concurrent iEEG.
For classification, all three groups of IEDs fall within the same IED class. An IED segment from each group and a non-IED segment are illustrated in figure 2. In the scalp-invisible IED segment, there is no sign of epileptiform discharges in the scalp channels, while the FO channels capture the IED waveforms. In the 'scalp-visible IED by considering the concurrent iEEG' , a weak IED waveform can be detected in the scalp channels by referencing to the concurrent iEEG. In the 'scalp-visible IED without considering the concurrent iEEG' , the epileptiform discharges are individually observable from the scalp channels.

Preprocessing
The IED spikes have high frequency components due to their sharpness. Therefore, filtering low frequency components such as delta does not affect the characteristics of epileptic discharges [4][5][6]. Here, a Butterworth high-pass filter with a cutting frequency of 4 Hz and the order of 6 was employed to remove DC shift, baseline fluctuations, and eye blink artifacts. In addition, a notch filter with notch frequency of 50 Hz was applied to eliminate the power line interference. The filters were applied to both sEEG and iEEG recordings.
For removing artifacts, a spatial filter is applied to only sEEG recordings. The average amplitude of sEEGs from four electrodes-namely F3, F4, F7, and F8-is calculated and subtracted from all the scalp signals. These electrodes are placed over the mesial temporal lobes, where the IEDs originate. Empirically, we found that these are the most effective electrodes in this application for re-referencing.

IED and non-IED segmentation
For classification, the sEEG and iEEG signals are segmented into either IED or non-IED segments. The segment length should be long enough to include the entire cycle of epileptiform activities and short enough to avoid the influence of background and undesired information. IEDs were selected with different lengths in previous studies, for example 250 ms in [19,54], 325 ms in [37,44], 500 ms in [55,56], one second in [57,58], and three seconds in [21]. Here, 160 ms before and 320 ms after the peaks marked as the onset of IED waveforms are selected as IED segments. That is, the length of each IED segment is 480 ms (96 samples). Non-IED segments with the

Decomposing the tensor using CPD or TD
In decomposing a tensor, the number of components should be large enough to avoid missing important information and small enough to avoid undesired information. In CPD, the numbers of components are selected to be between 3 and 8. Then, the optimized number of components is automatically obtained by a nested cross validation technique. The tensor X is decomposed into temporal, A ∈ R 96 ×R ; spectral, B ∈ R 37 ×R ; spatial, C ∈ R 12 ×R ; and segmental features, D ∈ R N×R . In TD, the numbers of temporal, spectral, and spatial components are selected to be between 3 and 8, and a nested cross-validation method is used to find the optimum number of components. However, the number of segmental components is selected to be the same as the number of segments R 4 = N. The TD algorithm is employed to decompose the tensor X into the core tensor, G ∈ R R1 ×R2 ×R3 ×N , and the factor matrices-temporal,Ā ∈ R 96 ×R1 ; spectral, B ∈ R 37 ×R2 ; spatial,C ∈ R 12 ×R3 ; and segmental features,D ∈ R N×N .

Mapping sEEG to iEEG through MStI-CPD and MStI-TD
For mapping the sEEG to iEEG, the TF features of all IED and non-IED segments from both the training and test datasets of sEEG recordings-Y c ∈ R 96 ×18 , c = {1, · · · , C}-are calculated. CWT with Complex Morlet wavelet is applied for extracting the TF features. Then, the TF features of each segment are concatenated into a three-way tensor, Y c ∈ R 96 ×37 ×18 , c ∈ C.
In MStI-CPD, Y c is projected onto the temporal factors obtained using CPD, A, through equation (8), which gives Z c ∈ R R×37 ×18 for c = {1, · · · , C}. Finally, Z c is vectorized and used for classification.

Feature selection
The number of extracted features depends on the number of temporal factors, selected to be between 3 and 8 by nested cross validation. Thus, the minimum number of features is 3 × 37 × 18 = 1998, and their maximum 8 × 37 × 18 = 5328. However, the number of features is relatively high for applying directly to a classifier. Therefore, we employ a feature selection method to find the most discriminative features.
Minimum redundancy-maximum relevance (MRMR) approach is used as a feature selection method [59]. In MRMR, the mutual information is calculated to measure the level of similarity between features. Those features that are mutually maximally dissimilar (their mutual Euclidean distances are maximized, or their pairwise correlations are minimized) are selected as discriminative features.

Classification and cross-validation
DTE (with bagging technique) and KNN are employed as classifiers. Two approaches of withinand between-subject classifications are used for evaluation. In the within-subject approach, IEDs and non-IEDs of each subject are classified separately, meaning that the training and test data come from the same subject. In this approach, k-fold (k = 5) cross-validation is employed for classification, that is, 4 folds are used for training and the rest for testing. On the other hand, in the between-subject approach, IEDs and non-IEDs of all the subjects are classified together. Leave-one-subject-out cross-validation is used in this approach. That is, the IED and non-IED segments of 17 out of 18 subjects are used for training the model, and the IEDs and non-IEDs of the left subject are used as the test data. This process is repeated for all subjects one-by-one. For finding the hyperparameters, including the optimized number of neighbors in KNN and the number of components in tensor decomposition techniques, nested crossvalidation is used. In the nested cross-validation, a subset of training data is employed for training the model and the rest are used for validation to optimize the hyperparameters. In previous tenor-based EEG studies, the optimized number of neighbors for a KNN classifier [22] and the optimized number of components for a tenor factorization method [22] were determined by cross-validation. It should be noted that the IEDs from the training dataset contribute to constructing the mapping model. Accuracy (ACC), sensitivity (SEN), specificity (SPEC), and F1 score (F1-S) are obtained as evaluation criteria as follows: where TP indicates the number of IED samples classified correctly in the IED class, TN is the number of non-IED samples recognized correctly as non-IED samples, FP corresponds to the number of non-IED samples detected incorrectly as IED samples, and FN corresponds to the number of IED samples categorized wrongly in the non-IED class. Accuracy shows the percentage of detection of IED and non-IED samples, sensitivity illustrates the ability of the classifiers in detecting the IED samples, and specificity shows the ability of the classifiers in detecting non-IED samples.

Competing approaches
Our proposed method is compared with the following four methods.
• SCA: In [22], a tensor-based method, namely spatial component analysis (SCA), has been proposed. In SCA, all IEDs from sEEG are concatenated into a three-way tensor. Temporal, spatial, and segmental factors are obtained by employing CPD. Finally, all IEDs and non-IEDs are projected onto the spatial components. In this method, only sEEG recordings are used in both training and test datasets. • TF: In the TF-based method [24], the TF features are extracted using the spectrogram method and used as the classification features for IED detection. In this work, an individual classifier is trained for each training subject.

Results
In our proposed method, two classifiers, namely KNN and DTE, are employed to classify the IEDs and non-IEDs. The results obtained using KNN and DTE are respectively illustrated in tables 3 and 2. The effective subset of features is selected based on the MRMR feature selection method. Figure 3 shows the performances of classifiers in the between-subject approach versus the number of selected features. The DTE classifier gives the best accuracy respectively through the first 100 and 80 features using MStI-TD and MStI-CPD methods. On the other hand, the KNN classifier achieves the best accuracy through the first 80 features in both MStI-TD and MStI-CPD methods. Table 2 shows the obtained results from MStI-TD and MStI-CPD using the DTE classifier. The performance of each subject, the mean of performance across subjects, and the standard error (SE) of the performance are illustrated. In the between-subject classification approach, the obtained accuracy values vary among subjects from 47.3% to 92.1% in both MStI-TD and MStI-CPD. However, the average accuracy values across subjects achieved using MStI-TD and MStI-CPD are respectively 72.6% and 72.7% with the SE of around 2.7% and 3.1%, which is promising as the accuracy value includes those of scalp-invisible spikes-which are over 80% of the total spikes-too. The obtained results from the within-subject classification approach are shown in the table in parentheses. MStI-TD provides the accuracy values of higher than 80% for most of subjects. In average, it achieves 84.2% accuracy, 80.5% sensitivity, 87.9% specificity, and 0.82 F1-score. The performance of MStI-CPD is comparable with MStI-TD and it classifies IED and non-IEDs with 81.3% accuracy and 75.5% sensitivity.
Overall, the performance of the KNN classifier shown in table 3 is relatively lower than that of the DTE classifier. It provides the accuracy of around 70% in the between-subject classification approach and around 80% in the within-subject classification approach (shown in parentheses) using both MStI-TD and MStI-CPD methods. However, the KNN classifier detects the IEDs with 62.4% and 62.0% sensitivity values using respectively MStI-TD and MStI-CPD in the between-subject approach. It also achieves 73.1% sensitivity value using both MStI-TD and MStI-CPD in the within-subject approach. Table 2. The performance of our proposed MStI-TD and MStI-CPD models obtained using the DTE classifier. SE shows the standard error of the performance. The numbers in parentheses are the within-subject classification performance. ACC, SEN, and SPEC are presented in percent (%). Recall that the accuracy value includes those of scalp-invisible spikes, which are over 80% of the total spikes, too.

MStI-TD
MStI-CPD   In subject 1, non of IEDs (SEN) is detected and both classifiers are biased to the non-IED class in both mapping models. In other words, most of the EEG segments are recognized as the non-IED segments. In subjects 3, 4, and 6, the KNN classifier is biased to the non-IED class as well. The value of sensitivity is less than 20% for these subjects, while that of specificity is 100%. The DTE classifier does not achieve high sensitivity value for these subjects as well. It can be concluded that these subjects may generate IEDs with different morphologies as compared to other subjects.
In the projection procedure, the segments are projected onto the temporal factors. When the IED morphologies of a test subject is different from those of the training subjects, the projection cannot give discriminative features and thus the classifier fails to detect the IEDs and non-IEDs. The number of segments does not affect the results. In the within-subject approach, for all the subjects, from those with less than 20 IED segments to those with more than 200 IED segments, a high performance is achieved. Conversely, in the betweensubject approach, where the number of segments in the training dataset is not significantly different among the subjects due to using leave-one-subjectout cross-validation, the performance differs for different subjects. Therefore, it can be concluded that the number of segments has no effect on the model performance.
The proposed mapping models have been compared with SCA, TF, LR, and ASAE-CNN. The obtained results are illustrated in table 4. The results of our proposed methods are reported based on the DTE classifier. The results of ASAE-CNN and LR are based on the results obtained in [44]. The performance of SCA and TF are based on [22] and [24], respectively. TF, LR, and ASAE-CNN studies reported the sensitivity, specificity, and F1-score only in the between-subject classification approach. All these studies [22,24,44] used the same dataset that we use for evaluating our methods. Our proposed mapping models outperform others. Among the compared methods, ASAE-CNN achieves the best accuracy of 68% which respectively is 4% less than that of our proposed MStI-TD and MStI-CPD models in the between-subject approach. ASAE-CNN and LR map the sEEG to iEEG. However, LR achieved 62% accuracy which is 3% less than the accuracy of TF, while the TF model is based on only sEEG recordings. SCA is biased to the non-IED class and provides a low sensitivity value of 43% in the between-subject classification approach.
Though our proposed method is biased to the non-IED class in very few subjects, there is a good balance between sensitivity and specificity. We project the IED and non-IED segments onto the temporal factors. Therefore, the bias occurs when the IED morphology of the test subject is different from those of the training subjects, which is rare. On the other hand, in SCA, the IED and non-IED segments are projected onto the spatial components. Because of spatial projection, SCA depends on the IED source location. Therefore, when the location of IED sources for the test subject differs from that of the training subjects, the system tends to be biased to the non-IED class. As it can be seen from table 4, SCA performs significantly better in the within-subject approach (the accuracy of 78% and sensitivity of 70%) as compared to the between-subject approach (the accuracy and sensitivity values of 62% and 43%, respectively). This difference between the performances of within-and between-subject approaches indicates the fact that SCA is sensitive to the epileptiform source location and is not necessarily suitable for employing in a between-subject classification approach.

Discussion
IED identification can establish a guideline for preictal state monitoring, seizure prediction, treatment, and surgical planning. However, the sensitivity of scalp recordings is low in capturing epileptiform discharges, and around 30% to 40% of patients considered for epilepsy surgery require intracranial recordings [60]. On the other hand, intracranial recordings are invasive and have side effects. Therefore, ameliorating the sensitivity of sEEG to detect IEDs motivates developing a new direction for more concise seizure prediction.
In some studies, the IEDs are detected with high sensitivity from sEEG signals [20,61]. Thanh et al proposed a tensor-based method to detect epileptic and non-epileptic spikes [20]. Their model achieved a sensitivity value of 83%. In [61], the authors developed a method based on deep neural networks. The model detects the epileptiform discharges with 81% sensitivity, while it suffers from low specificity of 46%. However, in both studies, scalp recordings are utilized for scoring IEDs. The main disadvantage of studies in which sEEG signals are used as ground truth for labeling IEDs is that they miss the scalp-invisible IEDs or spikes which often contribute to more than 80% of the IEDs by default.
Among the compared methods, TF and SCA detect IEDs from the concurrent sEEG recordings. That is, in these methods, the iEEG recordings are used as ground truth for labeling both scalp-visible and scalp-invisible IEDs, while only sEEG recordings are employed to detect IEDs. There is an appropriate balance between the sensitivity and specificity values in TF, although they are not really high. On the other hand, SCA suffers from low sensitivity, though it achieves high specificity. SCA is based on spatial components. Different subjects may have different epileptiform source locations. When the data from different subjects are combined, the spatial factors are corrupted and consequently the model performance deteriorates. Therefore, SCA fails to detect IEDs in the between-subject classification approach, though it is a powerful method for the within-subject classification approach and enable to classify the IEDs and non-IEDs with high accuracy.
LR and ASAE-CNN map the sEEG to iEEG recordings. The performance of LR is worse than all other methods. ASAE-CNN achieves the highest accuracy and sensitivity values among the competing methods in the between-subject approach, 68% and 67% respectively. Meanwhile, it provides a higher accuracy of 73% in the within-subject approach. However, when the training and test datasets come from different subjects (between-subject classification) and have different distributions, ASAE-CNN cannot be successful since the neural networks are highly sensitive to the data distribution.
One of advantages of our proposed tensor-based method is that it enjoys flexibility in terms of TF feature extraction and tensor decomposition methods. Here, the magnitude of continuous wavelet coefficients are extracted as TF features. Other TF methods, such as discrete wavelet, spectrogram, or synchrosqueezed transform highly used in biomedical signal processing [24,62,63], can be employed to extract the TF features. In addition, CPD and TD are applied here for decomposing the tensor into its factor matrices. Other algorithms, for example nonnegative-or sparse-based tensor decomposition methods, can be employed to decompose the tensor into the factor matrices. Here, the performances of both mapping methods are comparable. Therefore, it cannot be concluded that which one is superior to another. Since the magnitude of continuous wavelet coefficients used for classification are non-negative, non-negative-based tensor decomposition may give a better performance. In our future work, we aim to consider this point.
The main limitation of our proposed method is that it depends on the morphologies and shapes of IEDs. In our mapping model, the projection is performed onto the temporal factors that can vary if the IED morphologies significantly differ. Therefore, the performance of the model may be adversely affected for a new subject with an IED of very different morphology and shape.
Intracranial recordings are necessary for training the mapping-based methods. This can be a limitation for such methods. However, these methods can detect IEDs from new subjects without any need for the intracranial signals. Furthermore, we need to have some recorded iEEG signals from the training subjects to score the scalp-invisible IEDs.

Conclusion
Here, we propose a method based on TF and tensor decomposition to design a mapping algorithm for projecting sEEG to iEEG. At first, the TF features of intracranial IEDs from the training set are extracted and concatenated into a four-way tensor with time, frequency, channel, and IED segment slabs. Then, two tensor factorization algorithms, CPD and TD, are employed to decompose the tensor into factor matrices. In the mapping procedure, the TF features of scalp IEDs and non-IEDs from both training and test sets are computed and projected onto the temporal factors obtained from CPD and TD, respectively called MStI-CPD and MStI-TD. The results of other four methods, namely SCA [22], TF [24], LR [42], and ASAE-CNN [44], are reported here for comparison. The methods are validated in withinand between-subject classification approaches. MStI-CPD and MStI-TD outperform the competing methods in both approaches and achieve respectively the accuracy values of around 84.2% and 72.6% in within-and between-subject classifications. Meanwhile, among the competing methods, ASAE-CNN obtains the highest accuracy value of 68% in the between-subject classification approach. SCA and LR provide the worse accuracy value of 62%. These findings show that when the training and test data come from different subjects, ASAE-CNN and SCA cannot perform favorably because of being sensitive respectively to the data distribution and epileptiform source location (described in detail in section 5). However, the proposed method does not suffer from these limitations. Firstly, unlike ASAE-CNN, tensor factorization is not highly sensitive to the data distribution. Secondly, unlike SCA in which the data is projected onto the spatial factors, in the proposed MStI-CPD and MStI-TD methods the data is projected onto the temporal factors which is independent form IED source location.

Data availability statement
The data generated and/or analysed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.