Softek Enterprises LLC · Data Developer · Remote · Full time

Softek Enterprises LLC seeks an experienced Data Developer to build robust, scalable data pipelines for the IRS. This hands-on technical role involves transforming and loading data from legacy mainframe systems and modern sources into cloud-based analytics platforms, directly supporting the IRS's mission-critical operations.

About Softek Enterprises LLC

Softek is a Minority-Owned Small Business that has provided technology solutions and consulting to government clients since 2007. We are a leading provider of information technology, consulting, and business process services. Our focus areas include cloud computing, web content management, Big Data, Health IT, custom application development, business automation, integration solutions, and IT managed support, and we deliver accelerated results for organizations across industries.

Description

Key Responsibilities

  • Pipeline Development: Design, build, and deploy data pipelines using Databricks, Informatica, and AWS services
  • ETL Implementation: Develop complex Extract, Transform, Load processes handling both historical and incremental data
  • Code Development: Write efficient PL/SQL, Python, and Scala code for data transformations and processing
  • Legacy Integration: Convert existing Greenplum and Oracle stored procedures to Databricks-compatible code
  • Performance Tuning: Optimize pipeline performance for large-scale data processing and real-time requirements
  • Testing: Develop comprehensive testing strategies including unit, integration, performance, and Section 508 compliance testing
  • Monitoring: Implement data quality checks, error handling, and pipeline monitoring solutions
  • Documentation: Create technical documentation, deployment scripts, and operational procedures
  • Maintenance: Provide ongoing support, troubleshooting, and enhancement of existing pipelines

Required Qualifications

  • Experience: Minimum 4 years developing data pipelines with demonstrated expertise in ETL/ELT processes
  • Project Portfolio: Experience with at least 2 projects involving data transformation and loading into Databricks or similar platforms
  • Programming Skills: Proficient in SQL, Python, Scala, and PL/SQL, with the ability to optimize complex queries
  • Platform Experience: Hands-on experience with Databricks, AWS data services, and cloud-based ETL tools
  • Database Expertise: Strong background with both relational and NoSQL databases
  • Education: Bachelor's degree in Computer Science, Information Technology, or related technical field

Preferred Qualifications

  • Databricks Expertise: Databricks Certified Developer or equivalent hands-on experience
  • AWS Knowledge: AWS Data Engineer or Analytics certifications
  • Government Sector: Experience with federal data systems, compliance requirements, and security protocols
  • DevOps Skills: Knowledge of CI/CD pipelines, Infrastructure as Code, and automated deployment practices

Technical Environment

  • Development Platforms: Databricks Workspace, AWS EMR, Informatica PowerCenter
  • Programming Languages: Python, Scala, SQL, PL/SQL, Shell scripting
  • Data Sources: Mainframe DB2, Oracle, PostgreSQL, flat files, APIs
  • Target Systems: Databricks Delta Lake, AWS Redshift, DynamoDB, S3