What's happening at home?

Enough with the purely random time series. Let’s dive into generating time series from a model simulation (still random, but with some recognizable structure). For example, let’s generate a set of time series from fictional sensors in a fictional apartment.

This apartment we are simulating has five different sensors:

  1. Temperature: in degrees Celsius
  2. Humidity: percentage humidity
  3. CO2: parts per million (ppm) of CO2 (rises with the presence of humans)
  4. Light: light strength in lux
  5. Motion: whether motion is detected
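
As a rough illustration of what "model-based" could mean here, the sketch below simulates only two of the five sensors: motion, and CO2 that rises while someone is present. Everything specific in it is an assumption made for the example and not taken from the post: the function name `simulate_motion_and_co2`, the 420 ppm baseline, the 15 ppm rise per step, and the 5% decay rate.

```python
import numpy as np

def simulate_motion_and_co2(n_steps, p_motion=0.3, seed=None):
    """Toy model: CO2 rises while motion is detected, else decays to baseline.

    All constants are illustrative assumptions, not calibrated values.
    """
    rng = np.random.default_rng(seed)
    baseline_ppm = 420.0   # assumed level for an empty room
    rise_ppm = 15.0        # assumed increase per step with motion
    decay_rate = 0.05      # assumed fraction of the excess lost per step

    # Motion is an independent coin flip at each time step.
    motion = rng.random(n_steps) < p_motion

    co2 = np.empty(n_steps)
    level = baseline_ppm
    for i in range(n_steps):
        if motion[i]:
            level += rise_ppm
        else:
            level -= decay_rate * (level - baseline_ppm)
        co2[i] = level
    return motion, co2

motion, co2 = simulate_motion_and_co2(500, seed=1)
```

The point of the sketch is the coupling: the CO2 series is still random, but its shape is explained by the motion series, which is what makes simulated data recognizable.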

Inverse Fourier for Repeating Pattern

The Fourier transform decomposes a signal into a sum of weighted sine waves. The rationale behind the method is that slow-moving sine waves capture the general trend of the time series, whereas fast-moving sine waves capture its details.

Why decompose a time series into sine waves? Noise shows up in the details, so removing the fastest-moving sine waves corresponds to removing noise. Denoising is an example of an operation that is easy to express on sine waves, but very difficult on the original time series.
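
To make the denoising idea concrete, here is a minimal sketch using NumPy's real FFT. The function name `lowpass_denoise` and the `keep` parameter are made up for this example, and zeroing every component above a cutoff is only one simple way to drop the fast sine waves.

```python
import numpy as np

def lowpass_denoise(values, keep):
    """Keep the `keep` slowest frequency components and drop the rest."""
    spectrum = np.fft.rfft(values)   # decompose into sine-wave weights
    spectrum[keep:] = 0              # remove the fast-moving components
    return np.fft.irfft(spectrum, n=len(values))

# A slow wave (2 cycles) with a fast, small wave (40 cycles) on top of it.
t = np.arange(200) / 200
slow = np.sin(2 * np.pi * 2 * t)
fast = 0.3 * np.sin(2 * np.pi * 40 * t)
cleaned = lowpass_denoise(slow + fast, keep=5)
```

In this constructed case the fast component sits entirely above the cutoff, so `cleaned` matches `slow` up to floating-point error. Real noise is spread over many frequencies, so the result there is only an approximation.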

In this blog post, we will use the inverse of the Fourier transform: starting from a set of sine waves, we recompose the time series. The contribution of each sine wave in the set is chosen randomly.
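
A minimal sketch of that recomposition, assuming NumPy and a uniformly random amplitude and phase for each sine wave. The name `random_periodic_series` and its parameters are hypothetical, and the post itself may pick the weights differently.

```python
import numpy as np

def random_periodic_series(n_points, n_waves=5, seed=None):
    """Build one period of a series from the `n_waves` slowest sine waves."""
    if not 0 < n_waves < n_points // 2:
        raise ValueError("n_waves must be between 1 and n_points // 2 - 1")
    rng = np.random.default_rng(seed)

    # One complex weight per frequency; index 0 is the constant offset.
    spectrum = np.zeros(n_points // 2 + 1, dtype=complex)
    amplitudes = rng.uniform(0.0, 1.0, size=n_waves)
    phases = rng.uniform(0.0, 2 * np.pi, size=n_waves)
    spectrum[1:n_waves + 1] = amplitudes * np.exp(1j * phases)

    # The inverse real FFT sums the weighted sine waves back into a series.
    return np.fft.irfft(spectrum, n=n_points)

period = random_periodic_series(256, n_waves=5, seed=7)
repeating = np.tile(period, 3)  # three repetitions of the same pattern
```

Because every sine wave used completes a whole number of cycles within `n_points`, the series joins up with itself, which is why tiling it yields a repeating pattern without jumps at the seams.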

Walking Randomly

In the previous blog post, I wrote about generating random time series data. It was a first taste of time series and of generating data with Python. In this post, I want to add historical context to the time series data. A (Gaussian) random walk takes a random step at each time step: the step is drawn from a normal distribution and added to the value of the previous point. In this way, each value carries the history of all the steps before it.
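
The definition above maps almost directly onto a cumulative sum. The following is a small sketch with NumPy; the function name `gaussian_random_walk` and its parameters are chosen for this example only.

```python
import numpy as np

def gaussian_random_walk(n_steps, step_std=1.0, start=0.0, seed=None):
    """Return a walk where each point is the previous point plus a normal step."""
    rng = np.random.default_rng(seed)
    # Draw every step up front from a zero-mean normal distribution.
    steps = rng.normal(loc=0.0, scale=step_std, size=n_steps)
    # The running sum adds each step to the value of the previous point.
    return start + np.cumsum(steps)

walk = gaussian_random_walk(1000, step_std=0.5, seed=42)
```

Passing a `seed` makes the walk reproducible, which is handy when the generated data is used as test input.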

Generating Test Data

I’m writing this blog to learn about time series, programming in Python and Rust, and database architecture. In this post, I want to get started with generating time series data.

What is a time series? Simply put, a time series is a sequence of data points indexed in time order. You encounter time series data every day — think of sensor measurements, sales data, stock prices, and weather forecasts.

The goal is to get a feel for what time series data looks like, what types of time series there are, what some properties are, and how to generate realistic looking fake data. Having test data with known properties will be extremely useful for testing and benchmarking time series databases.

Welcome to my blog

Hi!

I’m Joris, a programmer and database enthusiast. I’m currently a Staff Software Engineer at a company that builds time series analytics software.

Why am I starting this blog? Because of the evolution of database technologies and the new possibilities they bring. In this space, I’ll explore topics related to database architecture, time series, query languages, composable data systems, and programming in general.

The recent emergence of technologies like Apache Arrow & DataFusion, DuckDB, and the data lakehouse architecture has made building custom databases more accessible than ever. These innovations are reshaping the data landscape, and I’m excited to delve into them and share my findings with you.

Everything on this blog represents my personal views, evolving over time. I hope my writing inspires you to explore these topics further and perhaps even challenge my perspectives. Feel free to reach out with your thoughts, questions, or suggestions—I’d love to hear from you!
