Thursday, January 3, 2013

Water Level Analysis

The water level sensor has been steadily collecting data for more than 4 years now. Well, not so steadily. There are small gaps in the data due to things like internet connectivity issues, power outages, code bugs, you name it. Then there are larger data gaps, usually due to a physical malfunction of the sensor. That's not to say it was poorly set up or is poorly maintained; in fact, it has lived through a good number of storms.

As mentioned in the previous post about the build details, two sets of data are held in the online database. One is high temporal resolution data from only the most recent few days, and the other is coarser data for as long as the project has been going. For this post I will be showing this second data set and hopefully pointing out some interesting features. No build details or motivation, just good ol' analysis.


Without any processing, this is what it looks like for the first 1000 days. After this, it's mostly gaps in the data due to my own negligence. But even in this 'nicer' part, there are some obvious flaws. The first is the large gap around day 200. I'll get back to that one. Next is the spiderweb of lines going back and forth through time near day 400. At some point around that time there was a bug in determining the timestamp to associate with a data point, so data was being sent around in time. A quick sweep through the data to remove non-sequential data points would probably clear that up. Next is some anomalous data near days 800 and 1000 showing the water level suddenly dropping to a constant -2.7 feet. After conversion, that is the level returned if the ultrasonic sensor consistently reads the maximum value it can. I'll be honest, I don't remember why that happened. Oh well!
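For anyone curious what that "quick sweep" might look like, here is a minimal sketch. It is not the code behind these plots. The function name `clean_series`, the tolerance `tol`, and the assumption that the data sit in two plain arrays (time in days, level in feet) are all mine. It drops any point whose timestamp is not later than everything before it, and also masks readings stuck at the -2.7 foot saturation value.

```python
import numpy as np

def clean_series(t, level, bad_level=-2.7, tol=0.05):
    """Drop out-of-order samples and saturated readings (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    level = np.asarray(level, dtype=float)

    # Largest timestamp seen strictly before each sample (-inf for the first one).
    prev_max = np.concatenate(([-np.inf], np.maximum.accumulate(t)[:-1]))

    # Keep only samples that move forward in time.
    keep = t > prev_max

    # Drop readings pinned at the sensor's maximum-range value (assumed -2.7 ft).
    keep &= np.abs(level - bad_level) > tol

    return t[keep], level[keep]
```

One caveat with this approach: a single bad timestamp far in the future would cause every later good point to be thrown away until real time catches up, so it only makes sense if the bug sent points backward in time, as it appears to have here.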

Now that we've pointed out all of the flaws, let's look at the clean parts.
The first couple of days of data sure look clean. The most obvious trend is the daily tides, which cause about a foot of variation every day.
More normal daily variations, but suddenly we hit the large gap from the first plot. Even before the gap, there is a long trend of rising water level. Right around the day 175 marker on this plot, Hurricane Ike made landfall a few miles away. The damage to pretty much everything on the island was pretty bad, so getting the water level sensor back up and running was not the number one priority. And so, large data gap.

But back to the cleaner parts of the data, we can also look at tide patterns and how storms play into the mix.
This is the longest portion of uninterrupted data I have, so I'll spend some time analyzing it. As before, we see the daily tide variation causing a foot of variation, but there is something else on top of this signal causing low amplitude nodes every 14 days or so. These low amplitude tides are called neap tides, while the large tides that occur in between are spring tides (this wiki article explains it more). Near day 125 is a storm that rolled through and messed up the nice tide pattern from the weeks before. As seen in both this storm and the hurricane from before, it's possible to predict incoming storms by a day or so just by keeping track of the water level.

The daily and fortnightly tidal variations are pretty obvious in these time-series plots, but periodic features such as these should also be visible when the data are viewed as a function of frequency. The neap/spring tide pattern mimics a beat, so we might expect to see two peaks of power separated by a frequency splitting corresponding to 14 days. To look for this, I apodized the data from the previous plot using a reasonable-looking Gaussian, took a Fourier Transform, and plotted the logarithm of the magnitude.
The whole spectrum, log power. Clearly I record too many data points per day.
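The apodize-then-transform step described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact processing used for the plot: it assumes evenly spaced samples, subtracts the mean first, and uses a Gaussian whose width is a made-up fraction of the record length (`sigma_fraction`). The function name `log_spectrum` is hypothetical.

```python
import numpy as np

def log_spectrum(level, dt_seconds, sigma_fraction=0.2):
    """Gaussian-windowed FFT of an evenly sampled series (hypothetical helper)."""
    level = np.asarray(level, dtype=float)
    n = level.size

    # Assumption: remove the mean so the zero-frequency bin does not dominate.
    detrended = level - level.mean()

    # Gaussian window centered on the middle of the record.
    # sigma_fraction is an assumed width, not the value used in the post.
    idx = np.arange(n)
    sigma = sigma_fraction * n
    window = np.exp(-0.5 * ((idx - (n - 1) / 2) / sigma) ** 2)

    # Real-input FFT and the matching frequency axis in Hz.
    spectrum = np.fft.rfft(detrended * window)
    freqs = np.fft.rfftfreq(n, d=dt_seconds)

    # Small offset avoids taking the log of zero.
    log_mag = np.log10(np.abs(spectrum) + 1e-12)
    return freqs, log_mag
```

The window tapers the ends of the record toward zero, which reduces the leakage that a hard start and stop would smear across the spectrum. The price is slightly broader peaks, which matters when looking for a splitting as small as the one discussed below.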

Looking at the low frequency power, there are two significant peaks located at 11 uHz and 22 uHz. These frequencies correspond to 25.3 hours and 12.6 hours, respectively. Not quite the frequency splitting I was looking for, but nice to see the daily tides showing up strong. Now at this point in my analysis, I nearly gave up thinking the frequency splitting was hidden under the noise. Maybe it takes more than just a few weeks of clean data to measure this subtle sun-moon interaction.
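The conversion from peak frequency to period is just a reciprocal with a couple of unit changes. A quick check of the numbers quoted above (the helper name `period_hours` is made up for this example):

```python
def period_hours(freq_uhz):
    """Convert a frequency in microhertz to a period in hours."""
    seconds = 1.0 / (freq_uhz * 1e-6)
    return seconds / 3600.0

for f in (11.0, 22.0):
    print(f"{f:.0f} uHz -> {period_hours(f):.1f} hours")
```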

On a whim, I decided to check out the two small peaks on top of the left larger peak. Considering the scale of noise in the rest of the spectrum, I wouldn't really want to assign much meaning to these. But maybe... By eye, the two peaks have frequencies 10.76 uHz and 11.57 uHz, so a difference of 0.81 uHz, which is a period of... 14.3 days! Exactly what was expected. Unfortunately, the scientist in me is saying this is just a cute coincidence due to noise and we shouldn't take this seriously. Still, kind of cool to see.
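The reason the difference is the number to look at: two sinusoids at nearby frequencies f1 and f2 add up to a fast oscillation whose envelope repeats every 1/|f2 - f1|. A small sketch of that arithmetic, using the by-eye peak values from above (the function name `beat_period_days` is just for illustration):

```python
def beat_period_days(f1_uhz, f2_uhz):
    """Envelope period, in days, of two tones given in microhertz."""
    split_hz = abs(f2_uhz - f1_uhz) * 1e-6
    return 1.0 / split_hz / 86400.0

print(f"{beat_period_days(10.76, 11.57):.1f} days")
```

Since the peak positions were read off the plot by eye, the result is only as good as those two numbers.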