快速傅立叶变换fftMeteorology students hardly experience smooth and expeditious data analysis. When comes to results, they oftentimes plunge to nebulous insights and vague conclusions. Despite how ...
Meteorology students hardly experience smooth and expeditious data analysis. When comes to results, they oftentimes plunge to nebulous insights and vague conclusions. Despite how clearly the method used, meteorology students shouldn’t take the idea of being creative in handling the data for granted. The quality of results totally depends on how creative they scramble the data and extract its insights. Hence the better result quality comes the more beneficial explanation. For the practical instance, when harnessing the time series meteorological dataset on analysis, many students trivialize the existence of another possible domain on the dataset. Many students don’t treat the time series data as a signal or wave that related to many parameters such as frequency and period. In this post, I’m going to show an uncomplicated example on how we use other possible domain of time series data -which is frequency- on analyzing meteorology phenomenon encompassing its practical step-by-step methods using python programming language.
First things first, prepare the data that going to be analyzed. For example, I use 37-years daily precipitation from CHIRPS which is rainfall estimates from rain gauge and satellite observations. The data has been cropped at the specific BMKG Station in Bandung, named Cemara Weather Station. Since CHIRPS data more complete than BMKG data, I personally prefer to use CHIRPS rather than use BMKG data despite the considerable discrepancy in the data when compared.
Fig 1 shows the daily precipitation (mm) from 1st January of 1981 until 31st December of 2018. We might already have known that Bandung or any other region located on Java island has two seasons (dry and rainy). The peak of rainy season in Bandung normally happens in December-January-February (DJF) and the trough of rainy season -which also called dry season- in Bandung normally happens in June-July-August. Much of research proved that seasonality weather in Bandung (and mostly western Indonesia) is affected by Asia-Australia monsoon. As we know the trend of rain seasonality, we suppose to be able to depict it as a wave or signal since it has the peak and trough. For the better overview, instead of figure the data as a panel fraught of raw points connected to each other, let’s turn it to monthly precipitation.
Now we have a better overview of rain seasonality in Bandung. This wavy pattern transpires one time a year and some years may have shifted of peak and trough of the rainy season. Notwithstanding, the pattern remains.
How if there were any else possibly periodically pattern occur in a year? Semiannual pattern or quarter-annually pattern? It could be and it should have been known by meteorology students as they recognize that rain pattern in Indonesia is very complex and affected by up to global-scale events and influenced by various topographic features.
Fast Fourier Transform (FFT) is one of the most useful tools and widely used in signal processing. FFT has contributed to many problems solving observations in astronomy, physics, and chemistry. In meteorology, FFT has been utilized for many kinds of research and most of it is related to climate data analysis. Climate data tend to have a long period of observation which reflects scientist’s definition that climate normal as an average over a recent 30-year period.
FFT is a very complicated mathematical equation, and to be honest, I’ve never fully understood how’s the FFT equation works. To see more about FFT and how it works, check this out (A gentle introduction to the FFT). In this article, I want to introduce how to use scipy.fft library in python to decompose seasonality patterns in 37-year precipitation data and get yourself (and myself) to gradually adept with python data analysis library.
Use pandas library to read csv to see the data as data frame like this:
The data has 13,879 rows represents daily precipitation (mm) from 1981–2018 and has no null values. The annual seasonality in Fig. 1 can be seen by a naked eye. However, it’s not enough to present the existence of seasonality only by subjective plain sight towards a graph, thus we need more palpable images that could indicate the seasonality by exact number.
In this step, we start to use FFT to transform the precipitation data which only has time domain to frequency domain. Declare dt = 1 as long as we want to analyze the data on daily basis over a 37-year period of time. The FFT result represents by F.
Bear in mind that the time series precipitation data is a combination of many frequency waves which has each wave parameters and amplify one another. By transforming the data to frequency domain, we could get each set of waves with certain frequencies that build-up the data.
Next, assign data length to n and calculate sample frequency with fftfreq function in which requires n and dt as arguments. It returns float array w contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). Don’t forget to set synthetic data variable which represents the time array with an increasing value from 1 until n (data length). Then, set a new variable that portrays the number of periodicity types of the signal whereby the length is half of the original data. After that, select only indices for elements that correspond to positive frequencies. To know the details about why we do this step, check this thread: Why is FFT “mirrored”?
The graph called periodogram. The difference between the two graphs above is on the x-axis. The upper graph shows periodogram with frequency as an x-axis, on the contrary, the lower graph uses 1/frequency (period) as x-axis. I found that the lower graph is way more intuitive and comprehensible since the period’s unit is the same as the data time unit which is a day. The y-axis is the amplitude of each periodicity/frequency that build-up the data. From the lower graph, it can be seen that from the 37-years length of data, the highest contributions are ranging between lower period sub-signal for exact < 1000 days of a period. Let’s point out that range.
By limiting the x-axis range, we get the clear-cut of the periodicity ranges which have a significant amplitude in forming the original data. Without being examined, 365-days periodicity has the highest amplitude as I mentioned earlier. It clearly represents the most common and well-known of Bandung rain seasonality (annual seasonality). The second highest amplitude is on 183-days (~6-month, semi-annual seasonality) and the third is on 122-days (~4-month, quarter-annual seasonality). It turns out that Bandung rain seasonality is not only transpired annually, but it also has semi-annual and quarter-annual patterns, nevertheless, the amplitude is not quite high which means that the patterns are less frequent to exist over the time scope of the data.
Meteorologically, these patterns could happen inflicted by various events such as local disturbances until global variabilities like El Nino-La Nina or Madden-Julian Oscillation (MJO) effects. We should need more researches to be able to answer the cause of these available patterns of Bandung rainfall seasonality. Python programming language with its user-friendly scientific package has been bringing us to advance the data analysis regarding many sectors. This simple use case in meteorology maybe not as complex as other use cases, but it worth knowing and worth sharing. After using FFT and knowing the hidden patterns of the data, there are so many analysis tools which also practical to meteorological science use cases, like filtering (low-pass, band-pass, high-pass filtering) and many more. Hope this article is pertinent to your study field and could help you to get more profound data analysis.
关于傅立叶变换，无论是书本还是在网上可以很容易找到关于傅立叶变换的描述，但是大都是些故弄玄虚的文章，太过抽象，尽是一些让人看了就望而生畏的公式的罗列，让人很难能够从感性上得到理解，最近，我偶尔从网上看到一个关于数字信号处理的电子书籍，是一个叫Steven W. Smith, Ph.D.外国人写的，写得非常浅显，里面有七章由浅入深地专门讲述关于离散信号的傅立叶变换，虽然是英文文档，我还是硬着头皮看完了有关傅立叶变换的有关内容，看了有茅塞顿开的感觉，在此把我从中得到的理解拿出来跟大家分享，希望很多被傅立叶变换迷惑的朋友能够得到一点启发，这电子书籍是免费的，有兴趣的朋友也可以从网上下载下来看一下，URL地址是：
让我们先看看为什么会有傅立叶变换？傅立叶是一位法国数学家和物理学家的名字，英语原名是Jean Baptiste Joseph Fourier(1768-1830), Fourier对热传递很感兴趣，于1807年在法国科学学会上发表了一篇论文，运用正弦曲线来描述温度分布，论文里有个在当时具有争议性的决断：任何连续周期信号可以由一组适当的正弦曲线组合而成。当时审查这个论文的人，其中有两位是历史上著名的数学家拉格朗日(Joseph
Louis Lagrange, 1736-1813)和拉普拉斯(Pierre Simon de Laplace,1749-1827)，当拉普拉斯和其它审查者投票通过并要发表这个论文时，拉格朗日坚决反对，在近50年的时间里，拉格朗日坚持认为傅立叶的方法无法表示带有棱角的信号，如在方波中出现非连续变化斜率。法国科学学会屈服于拉格朗日的威望，拒绝了傅立叶的工作，幸运的是，傅立叶还有其它事情可忙，他参加了政治运动，随拿破仑远征埃及，法国大革命后因会被推上断头台而一直在逃避。直到拉格朗日死后15年这个论文才被发表出来。