Softek Enterprises LLC · Data Developer · Remote · Full time

Softek Enterprises LLC seeks an experienced Data Developer to build robust, scalable data pipelines for the IRS. This hands-on technical role involves transforming and loading data from legacy mainframe systems and modern sources into cloud-based analytics platforms, directly supporting the IRS's mission-critical operations.

Key Responsibilities

  • Pipeline Development: Design, build, and deploy data pipelines using Databricks, Informatica, and AWS services
  • ETL Implementation: Develop complex Extract, Transform, Load processes handling both historical and incremental data
  • Code Development: Write efficient PL/SQL, Python, and Scala code for data transformations and processing
  • Legacy Integration: Convert existing Greenplum and Oracle stored procedures to Databricks-compatible code
  • Performance Tuning: Optimize pipeline performance for large-scale data processing and real-time requirements
  • Testing: Develop comprehensive testing strategies including unit, integration, performance, and Section 508 compliance testing
  • Monitoring: Implement data quality checks, error handling, and pipeline monitoring solutions
  • Documentation: Create technical documentation, deployment scripts, and operational procedures
  • Maintenance: Provide ongoing support, troubleshooting, and enhancement of existing pipelines
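To illustrate the data-quality checks and error handling called out above, the sketch below shows a minimal batch-validation gate in plain Python. All field names, thresholds, and the record layout are hypothetical examples, not details of the actual IRS pipelines; in practice this logic would run inside a Databricks/PySpark job.

```python
"""Minimal sketch of a data-quality gate for a pipeline load step.

Field names and thresholds are hypothetical; real pipelines would
apply the same pattern to Spark DataFrames rather than lists of dicts.
"""

def validate_batch(rows,
                   required_fields=("taxpayer_id", "tax_year", "amount"),
                   max_reject_rate=0.01):
    """Split rows into (valid, rejected); fail loudly if too many are bad."""
    valid, rejected = [], []
    for row in rows:
        # A record is valid only if every required field is present and non-null.
        if all(row.get(f) is not None for f in required_fields):
            valid.append(row)
        else:
            rejected.append(row)
    total = len(valid) + len(rejected)
    if total and len(rejected) / total > max_reject_rate:
        # Fail the load rather than silently dropping records downstream.
        raise ValueError(
            f"reject rate {len(rejected)}/{total} exceeds {max_reject_rate}"
        )
    return valid, rejected
```

Rejected rows would typically be written to a quarantine table for troubleshooting, and the reject-rate metric fed into pipeline monitoring.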

Required Qualifications

  • Experience: Minimum 4 years developing data pipelines with demonstrated expertise in ETL/ELT processes
  • Project Portfolio: Experience with at least 2 projects involving data transformation and loading into Databricks or similar platforms
  • Programming Skills: Proficient in SQL, Python, Scala, and PL/SQL with ability to optimize complex queries
  • Platform Experience: Hands-on experience with Databricks, AWS data services, and cloud-based ETL tools
  • Database Expertise: Strong background with both relational and NoSQL databases
  • Education: Bachelor's degree in Computer Science, Information Technology, or related technical field

Preferred Qualifications

  • Databricks Expertise: Databricks Certified Developer or equivalent hands-on experience
  • AWS Knowledge: AWS Data Engineer or Analytics certifications
  • Government Sector: Experience with federal data systems, compliance requirements, and security protocols
  • DevOps Skills: Knowledge of CI/CD pipelines, Infrastructure as Code, and automated deployment practices

Technical Environment

  • Development Platforms: Databricks Workspace, Amazon EMR, Informatica PowerCenter
  • Programming Languages: Python, Scala, SQL, PL/SQL, Shell scripting
  • Data Sources: Mainframe DB2, Oracle, PostgreSQL, flat files, APIs
  • Target Systems: Databricks Delta Lake, Amazon Redshift, DynamoDB, S3
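The historical-plus-incremental loading described in the responsibilities centers on an upsert ("merge into") pattern. As a hedged sketch, the plain-Python function below mirrors the semantics of a Delta Lake MERGE on a key column; the key and field names are hypothetical, and in Databricks this would be expressed as a Delta `MERGE INTO` or `DeltaTable.merge` operation rather than dictionary manipulation.

```python
def merge_incremental(target, updates, key="record_id"):
    """Upsert incremental records into a target keyed by `key`.

    Mirrors MERGE semantics: rows whose key matches an existing target
    row are updated in place; unmatched rows are inserted. `target` and
    `updates` are lists of dicts; all names here are illustrative.
    """
    index = {row[key]: dict(row) for row in target}
    for row in updates:
        # Merge updated fields over the existing row, or insert a new one.
        index[row[key]] = {**index.get(row[key], {}), **row}
    return list(index.values())
```

The same pattern handles both the one-time historical backfill (an empty target) and recurring incremental loads, which is why it is the usual backbone of Delta Lake ingestion jobs.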