The real-time feature platform for machine learning

Every feature, right on thyme.

Define ML features once in Python. Thyme compiles them to a high-throughput Rust streaming engine — real-time serving, point-in-time correct training, zero skew between the two.

Developer Experience

Define your features in idiomatic Python

Powerful data engineering workflows, without the infrastructure headaches. Powered by Rust.

Time-windowed aggregations (1m, 24h, 7d) run on a continuous Rust streaming engine. Values update within milliseconds of new events arriving: a Kappa-style architecture in which a single streaming path keeps every feature fresh.

features.py
from datetime import datetime

from thyme import *

# Source: raw transaction events, keyed by user and ordered by event time.
@source(name="transactions")
class Transaction:
    user_id: str = field(key=True)
    ts: datetime = field(timestamp=True)
    amount: float

# Dataset: per-user average spend over two trailing windows.
@dataset(index=True)
class UserSpend:
    user_id: str = field(key=True)
    avg_24h: float
    avg_7d: float

    # Pipeline: group transactions by user and average `amount` per window.
    @pipeline(version=1)
    @inputs(Transaction)
    def compute(cls, t):
        return t.groupby("user_id").aggregate(
            avg_24h=Avg(of="amount", window="24h"),
            avg_7d=Avg(of="amount", window="7d"),
        )

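To make the windowed aggregation concrete, here is a minimal pure-Python sketch of what a trailing-window average such as Avg(of="amount", window="24h") computes for a single key. It is not Thyme's engine or API: the WindowedAvg class and its methods are hypothetical names used only for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class WindowedAvg:
    """Trailing-window average for one key (illustrative, not Thyme's API)."""

    def __init__(self, window: timedelta):
        self.window = window
        self.events = deque()  # (timestamp, amount) pairs, oldest first
        self.total = 0.0

    def add(self, ts: datetime, amount: float) -> None:
        # Assumption: events arrive in non-decreasing timestamp order.
        self.events.append((ts, amount))
        self.total += amount

    def value(self, now: datetime):
        # Evict events that have aged out of the window (ts <= now - window).
        cutoff = now - self.window
        while self.events and self.events[0][0] <= cutoff:
            _, old_amount = self.events.popleft()
            self.total -= old_amount
        return self.total / len(self.events) if self.events else None


avg_24h = WindowedAvg(timedelta(hours=24))
t0 = datetime(2024, 1, 1, 12, 0)
avg_24h.add(t0, 10.0)
avg_24h.add(t0 + timedelta(hours=20), 30.0)

fresh = avg_24h.value(t0 + timedelta(hours=21))  # both events still in window
later = avg_24h.value(t0 + timedelta(hours=25))  # the first event has expired
```

The same events give a different answer depending on when you ask, which is why the value has to be maintained continuously rather than recomputed on a schedule.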
The Problem

ML infrastructure is painful

Every team building real-time ML hits the same wall. Training features and serving features drift apart, and accuracy quietly erodes in production.

Training/serving skew

Offline metrics look great. Production accuracy drops within weeks — not because the model is wrong, but because the features it sees in production are computed differently than the features it trained on.

Diverging pipelines

Batch jobs (Spark, dbt) compute training features. Streaming systems (Flink, microservices) compute serving features. A bug fix in one doesn't propagate to the other. The logic drifts.
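A small sketch of how this drift can look in practice. Both functions below claim to compute a "24h average", but one uses the calendar day (a common batch shortcut) and the other a trailing 24-hour window. The function names and sample events are made up for illustration.

```python
from datetime import datetime, timedelta

# Made-up (timestamp, amount) events for one user.
events = [
    (datetime(2024, 1, 1, 23, 0), 100.0),
    (datetime(2024, 1, 2, 1, 0), 20.0),
    (datetime(2024, 1, 2, 9, 0), 40.0),
]


def batch_avg_24h(events, as_of):
    """Batch-style: average over the calendar day of `as_of`."""
    day = as_of.date()
    vals = [a for ts, a in events if ts.date() == day and ts <= as_of]
    return sum(vals) / len(vals) if vals else None


def streaming_avg_24h(events, as_of):
    """Streaming-style: average over the trailing 24 hours."""
    start = as_of - timedelta(hours=24)
    vals = [a for ts, a in events if start < ts <= as_of]
    return sum(vals) / len(vals) if vals else None


as_of = datetime(2024, 1, 2, 10, 0)
train_value = batch_avg_24h(events, as_of)      # sees only the two Jan 2 events
serve_value = streaming_avg_24h(events, as_of)  # also sees the late Jan 1 event
```

Same feature name, same user, same moment, two different numbers. Neither function is buggy on its own; the skew comes from having two of them.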

Silent accuracy drops

Batch pipelines run on schedules: hourly, daily. A user's last transaction was 4 minutes ago, but your model sees yesterday's aggregate. You're serving predictions on stale data, and accuracy drops without any error to flag it.

Thyme runs one pipeline. Training and serving read the same state — skew is structurally impossible, not a convention you enforce in review.

Features

Everything your ML pipeline needs

From feature computation to serving, Thyme handles the entire lifecycle so your team can focus on building great models.

Rust-Powered Engine

Features defined in Python are compiled to a high-throughput Rust streaming engine. Real-time aggregations with millisecond freshness.

Time-Travel Queries

Point-in-time correct feature retrieval for training. Query any feature exactly as it was known at any past moment.
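As a rough illustration of a point-in-time lookup, the sketch below keeps a timestamped history of one feature and returns the latest value known at or before a given moment. FeatureHistory is a hypothetical helper, not part of Thyme, and it assumes values are recorded in timestamp order.

```python
from bisect import bisect_right
from datetime import datetime


class FeatureHistory:
    """Append-only history of one feature for one key (illustrative only)."""

    def __init__(self):
        self.timestamps = []  # kept sorted by construction
        self.values = []

    def record(self, ts: datetime, value: float) -> None:
        # Assumption: values are recorded in increasing timestamp order.
        self.timestamps.append(ts)
        self.values.append(value)

    def as_of(self, ts: datetime):
        # Latest value with timestamp <= ts; None if nothing was known yet.
        i = bisect_right(self.timestamps, ts)
        return self.values[i - 1] if i else None


history = FeatureHistory()
history.record(datetime(2024, 1, 1), 12.0)
history.record(datetime(2024, 1, 5), 18.0)

# A training label observed on Jan 3 must only see the Jan 1 value.
training_value = history.as_of(datetime(2024, 1, 3))
```

Using the Jan 5 value for a Jan 3 label would leak future information into training, which is the failure a point-in-time lookup is meant to rule out.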

Zero Training/Serving Skew

One definition, two modes. The same feature logic runs in both streaming aggregation and offline point-in-time lookups — no divergence, no silent accuracy drops.

Datasets, Pipelines & Extractors

Composable abstractions: datasets define event streams, pipelines apply windowed aggregations, and extractors compute derived features on read.
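The split between stored aggregates and read-time derivation can be sketched in plain Python. Everything here is a stand-in: UserSpendRow, the user_spend dict, and spend_ratio are hypothetical names, not Thyme's extractor API.

```python
from dataclasses import dataclass


@dataclass
class UserSpendRow:
    """Stand-in for one row of an aggregated dataset (hypothetical)."""
    user_id: str
    avg_24h: float
    avg_7d: float


# Stand-in for the aggregated dataset that a pipeline keeps up to date.
user_spend = {"u1": UserSpendRow("u1", avg_24h=90.0, avg_7d=30.0)}


def spend_ratio(user_id: str) -> float:
    """Extractor-style function: derives a feature when it is read."""
    row = user_spend.get(user_id)
    if row is None or row.avg_7d == 0:
        return 0.0  # default when the user has no usable history
    return row.avg_24h / row.avg_7d


ratio = spend_ratio("u1")
```

The ratio is never stored. It is computed from the current aggregates each time it is requested, so it is always as fresh as the values it reads.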

Exactly-Once Semantics

Distributed leasing, checkpointing, and replay logs ensure exactly-once processing with no data loss or duplication.
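One common way to get exactly-once effects from checkpointing plus a replay log is to snapshot state together with the last applied offset, then skip already-applied offsets on replay. The sketch below shows only that idea on a single process; it does not cover distributed leasing, and CountingProcessor is a hypothetical class, not a description of Thyme's internals.

```python
class CountingProcessor:
    """Counts events per key; state and offset are checkpointed together."""

    def __init__(self):
        self.counts = {}
        self.last_offset = -1  # highest offset already applied

    def process(self, offset: int, key: str) -> bool:
        # Anything at or below the checkpointed offset was applied before
        # the crash, so replaying it must be a no-op.
        if offset <= self.last_offset:
            return False
        self.counts[key] = self.counts.get(key, 0) + 1
        self.last_offset = offset
        return True

    def checkpoint(self) -> dict:
        # Snapshot state and offset as one unit.
        return {"counts": dict(self.counts), "last_offset": self.last_offset}

    @classmethod
    def restore(cls, snapshot: dict) -> "CountingProcessor":
        p = cls()
        p.counts = dict(snapshot["counts"])
        p.last_offset = snapshot["last_offset"]
        return p


log = [(0, "u1"), (1, "u2"), (2, "u1")]  # replay log of (offset, key)

p = CountingProcessor()
for offset, key in log[:2]:
    p.process(offset, key)
snapshot = p.checkpoint()

# Simulate a crash, restore from the snapshot, and replay the whole log.
recovered = CountingProcessor.restore(snapshot)
for offset, key in log:
    recovered.process(offset, key)
```

After recovery, each event has been counted once even though the first two were delivered twice.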

Declarative, Not Operational

No Kafka consumers to manage, no state stores to tune, no checkpoint recovery to handle. You own the feature logic — Thyme owns the infrastructure.

Architecture

Two paths, one definition

A streaming write path keeps features fresh; a query-time read path composes them for your model. Both paths share the same event-time-keyed state, so training and serving cannot drift.

WRITE PATH (continuous ingestion)
Source (streaming: Kafka · Kinesis; polling: Postgres · Iceberg · S3) → Raw Dataset (event-time keyed) → Pipeline (Sum · Count · Avg · Min · Max) → Aggregated Dataset (shared state; event-time · exactly-once)

READ PATH (on query)
HTTP → Query Server → Extractor (composes features) → Featureset → Response (online · point-in-time)

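The two-path idea can be reduced to a toy model: one function folds events into shared state, another answers queries from that same state. This is a sketch under heavy simplification (in-memory, single process, no windows), and SharedState, write_path, and read_path are hypothetical names rather than Thyme components.

```python
from collections import defaultdict


class SharedState:
    """Per-key running sum and count (stand-in for the aggregated dataset)."""

    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)


def write_path(state: SharedState, user_id: str, amount: float) -> None:
    """Ingestion side: fold each new event into the shared state."""
    state.sums[user_id] += amount
    state.counts[user_id] += 1


def read_path(state: SharedState, user_id: str) -> dict:
    """Query side: compose a feature response from the shared state."""
    n = state.counts.get(user_id, 0)
    avg = state.sums[user_id] / n if n else None
    return {"user_id": user_id, "txn_count": n, "avg_amount": avg}


state = SharedState()
write_path(state, "u1", 10.0)
write_path(state, "u1", 30.0)
response = read_path(state, "u1")
```

Because there is only one place where the aggregate is maintained, there is no second implementation for the read side to disagree with.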
Performance

Built for simplicity and speed

Thyme compiles Python feature definitions to a Rust streaming engine. Low latency, zero skew, and a three-command deployment workflow.

Milliseconds

P99 Online Latency

1

Definition for Online & Offline

0

Training/Serving Skew

3

Commands to Deploy

features.py
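from datetime import datetime

from thyme import *

# Assumes a Transaction source is already defined in this file.
@dataset(index=True)
class UserStats:
    user_id: str = field(key=True)
    ts: datetime = field(timestamp=True)
    avg_spend_7d: float

    # Pipeline: average `amount` per user over a trailing 7-day window.
    @pipeline(version=1)
    @inputs(Transaction)
    def compute(cls, t):
        return t.groupby("user_id").aggregate(
            avg_spend_7d=Avg(of="amount", window="7d")
        )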

Define features in Python. Deploy with thyme commit. Serve in milliseconds.


It's about thyme you upgraded your feature platform

Join the teams shipping ML features faster with Thyme. Get up and running in minutes, not months.