Clinetic, Inc. Data Engineer Remote · Full time

Data Engineer

Description

Clinetic is putting health data in motion for good! We are a health software and technology company helping unleash the potential of electronic health record systems for research, evidence generation, and new care delivery models. We are passionate about developing creative solutions to modernize the way health care is delivered and how clinical research is performed to ultimately improve the health and lives of patients. We are actively seeking new people to join our data-driven, collaborative, and solution-oriented team.


What You'll Do

  • Develop deep expertise in healthcare data
  • Explore and map EHR datasets to common data models
  • Work with product managers, engineers, and customer data analytics teams
  • Create a cutting-edge distributed analytics platform that aggregates data from multiple sources
  • Build robust data pipelines that empower researchers to conduct clinical trials and disease surveillance in real time


Ideal Candidate

  • You use a combination of persistence, research, problem-solving skills, and experience to overcome obstacles
  • You take pride in your work. You are attentive to detail, but also flexible.
  • You are available for and responsive to questions. You are professional and collegial in your communications.
  • You like being the person that others rely on.
  • You quickly learn new technologies as needed and recognize that you are engaged in timely, business-critical tasks.
  • You are transparent in what you do. You discuss, document, and commit your work as needed.
  • You care about good data models, abstractions, and readable code.
  • You enjoy optimizing high-throughput applications and ensuring data quality.
  • You enjoy working in an Agile environment and welcome constructive feedback.
  • You approach problems with a product development mindset.


Requirements

  • Strong proficiency in Apache Spark, including Spark SQL, DataFrame API, and Spark Streaming.
  • Proficiency in working with large-scale relational database management systems (RDBMS) such as PostgreSQL and SQL Server.
  • Deep understanding of SQL optimization techniques, query execution plans, and indexing strategies.
  • Experience with query profiling, identifying bottlenecks, and fine-tuning queries to enhance efficiency.
  • Solid understanding of distributed computing concepts and data processing frameworks.
  • Experience with data modeling, ETL/ELT processes, and data integration techniques.
  • Proficiency in programming languages such as Scala, Java, or Python.
  • Hands-on experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and related data services.
  • Proven ability to take ownership and work independently.


Nice to Have

  • Experience working with NoSQL data stores like Elasticsearch or MongoDB
  • Experience working with healthcare data
  • Experience with Kubernetes
  • Experience with machine learning
  • Experience with synthetic data generation


Benefits

This is a full-time position based in Durham, NC, one of the highest-ranked cities in the country for growth, entrepreneurship, affordability, dining, and entertainment. As a rapidly growing startup, we offer a robust benefits package including the following:

  • Competitive compensation
  • Flexible work schedule
  • Health care plan
  • Retirement plan
  • Unlimited PTO