Hyperparam is building a new kind of frontend application for working with massive LLM datasets.
Hyperparam’s goal is to build a tool so efficient that one engineer can curate large ML datasets single-handedly. We believe the way to accomplish this is twofold: 1) a highly scalable and interactive frontend experience that enables exploration and curation of massive ML datasets in the browser, and 2) dataset-scale inference that uses models to reflect back on their own training set to assist with curation. By combining these levers, we aim to make LLM dataset curation orders of magnitude more efficient than current approaches. By creating the highest-quality datasets, we will enable the creation of the world’s most capable models.
We're building complex technical infrastructure, not a typical web app. This is a frontend-only stack: there is no backend; everything is JavaScript-driven. If you enjoy reading V8's source code, diving into browser specs, or pushing the browser to its limits, we'd like to talk.
You can see examples of our open-source work here:
This is a hybrid, in-person opportunity in Seattle at a seed-stage startup. You would be one of the very first employees, working side-by-side with an experienced team building a new kind of dataset curation tool. The role requires the intense work ethic, dedication, creativity, and independence that an early-stage startup demands. For the right candidate, this is a unique opportunity to help build a company from the earliest idea stage through to a product used by real customers.
Responsibilities:
- Develop and maintain a highly interactive web application for visualizing and manipulating large-scale ML datasets.
- Stream and process multi-gigabyte datasets directly in the browser.
- Implement efficient client-side data structures for real-time filtering and aggregation of millions of records.
- Use Parquet and other modern data formats to stream data to the browser.
- Implement advanced data visualizations using D3, Vega, or similar libraries.
- Leverage web workers and other multithreading techniques to enhance application responsiveness.
- Collaborate closely with machine learning engineers and data scientists to integrate machine learning functionalities.
- Contribute to architectural decisions to improve scalability and maintainability.
- Ensure cross-browser compatibility and responsiveness of the application.
We’re looking for:
- Deep understanding of JavaScript internals and browser performance characteristics.
- Strong grasp of memory management and garbage collection in JavaScript.
- Experience with web workers and WebAssembly (wasm).
- Experience profiling and optimizing the performance of complex web applications.
- Experience building complex and performant React.js apps and components.
- Proficiency in data visualization libraries such as D3 and Vega.
- Solid understanding of how modern browsers work under the hood.
Additional Skills:
- Familiarity with machine learning concepts and handling large datasets.
- Excellent problem-solving abilities and attention to detail.
- Ability to operate independently in a small startup environment.
- Passion for staying current with emerging technologies and best practices.
What We Offer:
- Get in on the ground level of a funded seed-stage startup.
- Work side-by-side with experienced entrepreneurs who care deeply about advancing AI.
- A collaborative and close-knit work environment with a small team of highly motivated engineers, located in-person in Seattle.
- Competitive salary, equity, and comprehensive benefits package.
- Opportunity to work on groundbreaking projects in the ML and data visualization space.
The ideal candidate will have a deep curiosity about browser internals, enjoy diving into technical specifications, and get excited about pushing the boundaries of what's possible in web applications.