Job Summary
We are seeking a highly skilled and forward-thinking Big Data Engineer to join our healthcare data team. This role encompasses the end-to-end design, development, and management of large-scale data systems tailored for healthcare analytics. The ideal candidate will be responsible for architecting and maintaining robust, scalable, and secure data pipelines that support critical decision-making across the organization. This position requires deep technical expertise in modern Big Data tools, real-time and batch data integration, and a strong understanding of data governance and compliance in healthcare environments.
Knowledge/Skills/Abilities
Architect and implement scalable, high-performance Big Data solutions that support structured and unstructured data from diverse sources.
Build and manage batch and real-time data ingestion/extraction pipelines using tools like Kafka, Spark Streaming, and Talend.
Develop reusable and efficient ETL frameworks using Python/Scala for high-volume data transformation and movement.
Design and optimize data models to support analytical and operational use cases, including healthcare claims and utilization data.
Collaborate with cross-functional teams, including data scientists, analysts, and business partners, to translate requirements into robust data products.
Deploy, monitor, and troubleshoot Hadoop-based infrastructure using tools such as Cloudera Manager, Ambari, and ZooKeeper.
Enforce data quality, security, and compliance standards using tools such as Kerberos, Ranger, and Sentry.
Implement web services and APIs (REST/SOAP) to enable seamless integration with applications and visualization platforms.
Contribute to data governance initiatives, including metadata management, lineage tracking, and quality assurance.
Job Qualifications
Required Qualifications
Minimum 3 years of hands-on experience in Big Data engineering, data integration, and pipeline development.
Proficiency in Python, Java, or Scala for data transformation and system scripting.
Expertise in Big Data tools: Spark, Hive, Impala, Presto, Phoenix, Kylin, and Hadoop (HDFS, YARN).
Experience building real-time stream-processing systems using Kafka, Storm, or Spark Streaming.
Strong knowledge of NoSQL and distributed SQL databases such as HBase and MemSQL, and of traditional RDBMS including PostgreSQL, Oracle, and SQL Server.
Skilled in ETL design and development using tools such as Talend or Informatica.
Demonstrated experience deploying and monitoring big data infrastructure with Ambari, Cloudera Manager, and ZooKeeper.
Solid understanding of data warehousing, data validation, data quality checks, metadata management, and governance.
Preferred Qualifications
5+ years of progressive experience in Big Data engineering or analytics.
Prior experience working in the healthcare industry and familiarity with clinical, claims, or care management data.
Experience with cloud platforms (AWS, Azure) and containerization tools (Docker, Kubernetes).
Technical Environment
Big Data Ecosystem: Hadoop, Spark, Hive, Kafka, Presto, Impala, Phoenix, Kylin, ZooKeeper
Streaming & Messaging: Kafka, Spark Streaming, Storm
ETL & Integration: Talend, Informatica, Python/Scala-based ETL
Programming Languages: Python, Java, Scala, SQL
Databases: HBase, MemSQL, PostgreSQL, Oracle, SQL Server
Cloud & DevOps: AWS, Azure, Docker, Kubernetes, Git
Security & Governance: Kerberos, Ranger, Sentry, Metadata Management
Monitoring Tools: Ambari, Cloudera Manager
APIs: REST, SOAP
To all current Molina employees: If you are interested in applying for this position, please apply through the intranet job listing.
Molina Healthcare offers a competitive benefits and compensation package. Molina Healthcare is an Equal Opportunity Employer (EOE) M/F/D/V.
Pay Range: $77,969 - $155,508 / ANNUAL
*Actual compensation may vary from posting based on geographic location, work experience, education and/or skill level.