TY - JOUR
T1 - Synthetic random environmental time series generation with similarity control, preserving original signal's statistical characteristics
AU - Aloni, Ofek
AU - Perelman, Gal
AU - Fishbain, Barak
N1 - Publisher Copyright: © 2024
PY - 2025/2
Y1 - 2025/2
N2 - Synthetic datasets are widely used in applications such as missing data imputation, simulations, training data-driven models, and system robustness analysis. Typically based on historical data, these datasets need to represent specific system behaviors while being diverse enough to challenge the system with a broad range of inputs. This paper introduces a method that uses the discrete Fourier transform to generate synthetic time series whose statistical moments are similar to those of any given signal. The method allows control over the similarity level between the original and synthetic signals. An analytical proof shows that the method preserves the first two statistical moments and the autocorrelation function of the input signal. The method is compared with ARMA, GAN, and CoSMoS methods using various environmental datasets with different temporal resolutions and domains, demonstrating its generality and flexibility. A Python library implementing the method is available as open-source software.
AB - Synthetic datasets are widely used in applications such as missing data imputation, simulations, training data-driven models, and system robustness analysis. Typically based on historical data, these datasets need to represent specific system behaviors while being diverse enough to challenge the system with a broad range of inputs. This paper introduces a method that uses the discrete Fourier transform to generate synthetic time series whose statistical moments are similar to those of any given signal. The method allows control over the similarity level between the original and synthetic signals. An analytical proof shows that the method preserves the first two statistical moments and the autocorrelation function of the input signal. The method is compared with ARMA, GAN, and CoSMoS methods using various environmental datasets with different temporal resolutions and domains, demonstrating its generality and flexibility. A Python library implementing the method is available as open-source software.
KW - Air pollution
KW - Environmental simulations
KW - Fourier transform
KW - Sea waves
KW - Synthetic data generation
KW - Time series analysis
KW - Urban water demand
KW - Wind analysis
UR - http://www.scopus.com/inward/record.url?scp=85211127973&partnerID=8YFLogxK
U2 - 10.1016/j.envsoft.2024.106283
DO - 10.1016/j.envsoft.2024.106283
M3 - Article
AN - SCOPUS:85211127973
SN - 1364-8152
VL - 185
JO - Environmental Modelling and Software
JF - Environmental Modelling and Software
M1 - 106283
ER -