Research Engineer (PhD) - Database Internals / Special Projects

About Hotdata, Inc

We’re an early-stage, VC-backed startup founded by three experienced engineers and product leaders. We’re building an agent-first product at the intersection of data systems, developer platforms, and agentic AI. We combine modern data systems, vector indexing, and RAG techniques to give agents instant, intelligent access to enterprise data without the complexity of traditional data pipelines. Our small team is solving hard distributed systems challenges with cutting-edge technologies (DataFusion, Arrow, Rust, cloud-native infra). If you want to shape the foundation of a new compute layer for AI agents, this is the place.

Description

Company: Hotdata, Inc.

Location: Remote or Bay Area

Type: Full-time

About Hotdata

Hotdata is building the next generation of the data layer for AI systems.

For the past decade, data infrastructure has been optimized for dashboards and analytics. But when agents are both the creators and the consumers of databases, data systems need new primitives.

Hotdata is building the infrastructure layer that enables agent-native applications to store, query, and reason over structured data efficiently. Our work sits at the intersection of database internals, distributed systems, query engines, and AI infrastructure.

We are looking for a research-oriented engineer with deep systems curiosity to explore new ideas in data systems and help translate them into working prototypes.

The Role

You will work directly with the founders on special projects exploring the future of database architecture, including:

New query engine architectures
AI-native data systems
Incremental computation and streaming data
Query planning and execution optimization
Vectorized execution and memory layouts
Agent-driven data workflows

Your work will start as research prototypes and often evolve into core components of the platform.

What You Will Work On

Examples of problems we are actively exploring:

Next-generation query planning and optimization
Systems built on Apache Arrow-style columnar memory models
Incremental and reactive query engines
New approaches to dataflow execution
Efficient state management for agents
Hybrid analytical/operational storage engines
Distributed query processing
Rust-based data infrastructure

This role sits close to database internals, not application development.

Responsibilities

Research and prototype new architectures for data systems
Build experimental query engines or execution frameworks
Work on low-level systems components in Rust or C++
Explore novel ideas in query planning, execution, and storage
Publish technical insights through internal papers, blogs, or talks
Translate research ideas into production-grade infrastructure
Collaborate with engineers building the Hotdata platform

What We’re Looking For

Required

PhD in Computer Science, Distributed Systems, Databases, or related field
Deep understanding of database internals or query engines
Strong systems programming experience (Rust, C++, or similar)
Experience working with at least one of: query engines, compilers, distributed systems, or storage engines
Strong research and prototyping ability

Bonus

Experience with Apache Arrow, DataFusion, DuckDB, Velox, or Spark
Contributions to open source data infrastructure
Experience with vectorized query execution
Background in query optimization
Research publications in data systems conferences (SIGMOD, VLDB, CIDR, etc.)

What Makes This Role Unique

Work directly on the core architecture of a new data system
Small team where research ideas ship into production
Opportunity to influence the future of AI data infrastructure
Freedom to pursue deep technical ideas

Ideal Candidates

You might be a:

PhD student finishing research in databases or distributed systems
Systems engineer who has worked on query engines or storage engines
Open source contributor to data infrastructure projects
Researcher interested in building real systems

Technologies We Care About

Examples of systems and technologies relevant to this work include:

Rust
Apache Arrow
DataFusion
Parquet
DuckDB
Query optimizers
Distributed data systems

Why Join Hotdata

Most database companies optimize existing systems. We believe the rise of AI agents fundamentally changes the data layer. Hotdata is building the infrastructure for that shift. If you enjoy working at the boundary of research and real systems, we'd love to talk.

Salary

$10,000 - $20,000 per month

