Hotdata, Inc · Research Engineer (PhD) - Database Internals / Special Projects · San Francisco, CA · Full time

Research Engineer (PhD) - Database Internals / Special Projects

About Hotdata, Inc

We’re an early-stage, VC-backed startup at the intersection of data systems and agentic AI. Founded by three experienced engineers and product leaders, we’re building an agent-first product that spans data systems and developer platforms. We combine modern data systems, vector indexing, and RAG techniques to give agents instant, intelligent access to enterprise data - without the complexity of traditional data pipelines. We’re solving hard distributed systems challenges with cutting-edge technologies (DataFusion, Arrow, Rust, cloud-native infra). If you want to shape the foundation of a new compute layer for AI agents, this is the place.

Description

Company: Hotdata, Inc.

Location: Remote or Bay Area

Type: Full-time


About Hotdata

Hotdata is building the next generation of the data layer for AI systems.

For the past decade, data infrastructure has been optimized for dashboards and analytics. But when agents are the creators and consumers of databases, data systems need new primitives.


Hotdata is building the infrastructure layer that enables agent-native applications to store, query, and reason over structured data efficiently. Our work sits at the intersection of database internals, distributed systems, query engines, and AI infrastructure.


We are looking for a research-oriented engineer with deep systems curiosity to explore new ideas in data systems and help translate them into working prototypes.


The Role

You will work directly with the founders on special projects exploring the future of database architecture, including:

  • New query engine architectures
  • AI-native data systems
  • Incremental computation and streaming data
  • Query planning and execution optimization
  • Vectorized execution and memory layouts
  • Agent-driven data workflows

Your work will start as research prototypes and often evolve into core components of the platform.


What You Will Work On

Examples of problems we are actively exploring:

  • Next-generation query planning and optimization
  • Systems built on Apache Arrow-style columnar memory models
  • Incremental and reactive query engines
  • New approaches to dataflow execution
  • Efficient state management for agents
  • Hybrid analytical/operational storage engines
  • Distributed query processing
  • Rust-based data infrastructure

This role sits close to database internals, not application development.


Responsibilities

  • Research and prototype new architectures for data systems
  • Build experimental query engines or execution frameworks
  • Work on low-level systems components in Rust or C++
  • Explore novel ideas in query planning, execution, and storage
  • Publish technical insights through internal papers, blogs, or talks
  • Translate research ideas into production-grade infrastructure
  • Collaborate with engineers building the Hotdata platform



What We’re Looking For

Required

  • PhD in Computer Science, Distributed Systems, Databases, or a related field
  • Deep understanding of database internals or query engines
  • Strong systems programming experience (Rust, C++, or similar)
  • Experience working with at least one of:
      • query engines
      • compilers
      • distributed systems
      • storage engines
  • Strong research and prototyping ability

Bonus

  • Experience with Apache Arrow, DataFusion, DuckDB, Velox, or Spark
  • Contributions to open source data infrastructure
  • Experience with vectorized query execution
  • Background in query optimization
  • Research publications in data systems conferences (SIGMOD, VLDB, CIDR, etc.)

What Makes This Role Unique

  • Work directly on the core architecture of a new data system
  • Small team where research ideas ship into production
  • Opportunity to influence the future of AI data infrastructure
  • Freedom to pursue deep technical ideas


Ideal Candidates

You might be:

  • A PhD student finishing research in databases or distributed systems
  • A systems engineer who has worked on query engines or storage engines
  • An open source contributor to data infrastructure projects
  • A researcher interested in building real systems


Technologies We Care About

Examples of systems and technologies relevant to this work include:

  • Rust
  • Apache Arrow
  • DataFusion
  • Parquet
  • DuckDB
  • Query optimizers
  • Distributed data systems


Why Join Hotdata

Most database companies optimize existing systems. We believe the rise of AI agents fundamentally changes the data layer. Hotdata is building the infrastructure for that shift. If you enjoy working at the boundary of research and real systems, we'd love to talk.



Salary

$10,000 - $20,000 per month