Specialized database that efficiently stores and retrieves time-stamped data
Inverted Index
Distributed, cloud-native time-series database
LSM-like engine
FDAP stack
How it works for time series
Arrow: columnar memory format for flat and hierarchical data
DataFusion: LLVM-like Infrastructure for Databases
Order is important!
Automatic clustering based on historical queries
Open as a "normal" table (single point hotspot)
Open as a "virtual" table
Aggregation Pushdown
SELECT
    time_bucket(`timestamp`, '5 min') AS `timestamp`,
    SUM(`value`) AS `value_sum`
FROM
    `table`
WHERE
    `timestamp` >= '2023-12-15 07:17:00'
    AND `timestamp` < '2023-12-15 08:17:00'
    AND `col2` IN ('T')
GROUP BY
    time_bucket(`timestamp`, '5 min')
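As a rough illustration of what the time_bucket call in this query computes, the sketch below floors a timestamp to the start of its bucket. It assumes timestamps are integer milliseconds since the epoch and that the bucket width is passed in milliseconds; the function name and signature are hypothetical and are not the database's actual implementation.

```rust
/// Floor a millisecond timestamp to the start of its bucket.
/// Hypothetical helper for illustration; the real engine's signature may differ.
/// `rem_euclid` keeps the result correct for timestamps before the epoch.
fn time_bucket(timestamp_ms: i64, bucket_ms: i64) -> i64 {
    assert!(bucket_ms > 0, "bucket width must be positive");
    timestamp_ms - timestamp_ms.rem_euclid(bucket_ms)
}
```

With a 5-minute bucket (300_000 ms), every timestamp in the half-open range [0, 300_000) maps to 0, so all rows in that range fall into the same GROUP BY key.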
Simplified physical plan
pub enum AggregateMode {
    /// Partial aggregate that can be applied in parallel across input partitions
    Partial,
    /// Final aggregate that produces a single partition of output
    Final,
    // ....
}
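To make the Partial and Final modes concrete, here is a minimal standalone sketch of a two-phase SUM over 5-minute buckets. It does not use DataFusion itself: the Row and PartialState types and both function names are assumptions made for illustration, and timestamps are assumed to be integer milliseconds. The idea is that the partial phase can run next to the data (the pushdown), while the final phase only merges small per-bucket states.

```rust
use std::collections::BTreeMap;

/// One (timestamp in ms, value) sample; a hypothetical row shape for illustration.
type Row = (i64, f64);

/// Partial aggregate state: bucket start (ms) -> running sum.
type PartialState = BTreeMap<i64, f64>;

/// Partial phase: runs independently on each input partition,
/// so it can execute in parallel, close to where the data lives.
fn partial_aggregate(rows: &[Row], bucket_ms: i64) -> PartialState {
    let mut state = PartialState::new();
    for &(ts, value) in rows {
        // Floor the timestamp to the start of its bucket.
        let bucket = ts - ts.rem_euclid(bucket_ms);
        *state.entry(bucket).or_insert(0.0) += value;
    }
    state
}

/// Final phase: merges all partial states into a single output partition.
/// SUM merges by adding the partial sums for each bucket.
fn final_aggregate(partials: Vec<PartialState>) -> PartialState {
    let mut merged = PartialState::new();
    for partial in partials {
        for (bucket, sum) in partial {
            *merged.entry(bucket).or_insert(0.0) += sum;
        }
    }
    merged
}
```

Only the per-bucket sums cross the partition boundary, not the raw rows, which is where the saving from pushing the aggregation down comes from. Aggregates such as AVG would need a richer partial state (for example sum and count) to merge correctly.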