Spring 2026 – Data Engineering Internship

November 4, 2025

Job Description

About Us

We’re continuing to build a healthcare accreditation platform that is revolutionizing how our clients and new hospitals manage compliance, quality improvement, and regulatory processes. Our platform combines cutting-edge technology with deep healthcare domain expertise to solve real problems for healthcare organizations nationwide.

The Opportunity

Our goal is for interns to become full-time employees; therefore, you will be given full-time responsibilities from day one. In addition, you will be working in a high-velocity growth startup and will be required to move fast. You’ll work directly with our engineering team on a production healthcare platform, gaining hands-on experience with enterprise-grade systems while making real contributions that impact our product and customers.

Compensation Structure: The base position is unpaid; however, qualified candidates may receive upfront equity compensation based on their experience level and demonstrated capabilities. We evaluate each applicant individually and offer equity packages commensurate with their potential contribution.

What You’ll Build

  • Entity Resolution Systems: Healthcare facility lookup and matching components (see the sketch after this list)
  • Data Processing Pipelines: ETL workflows for compliance tracking
  • Analytics: Data refresh triggers and dashboard feed systems
  • ML Pipeline Integration: Data infrastructure supporting our ML team’s models
  • Quality Monitoring: Data validation and monitoring systems for healthcare data
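
To give a flavor of the entity resolution work, here is a minimal, illustrative sketch of fuzzy facility-name matching using only Python’s standard library. The normalization rules and similarity threshold are assumptions for the example; a production matcher would use richer signals (identifiers, addresses, blocking) and a proper matching library.

```python
# Toy facility matcher (illustrative only). Uses difflib from the standard
# library; the suffix list and similarity threshold are assumptions.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Lowercase and strip common suffixes so near-duplicates compare cleanly."""
    name = name.lower().strip()
    for suffix in (" hospital", " medical center", " health system", ", inc."):
        name = name.removesuffix(suffix)  # Python 3.9+
    return name

def best_match(query: str, candidates: list[str], threshold: float = 0.85):
    """Return the most similar known facility name, or None if nothing is close."""
    scored = [
        (SequenceMatcher(None, normalize(query), normalize(c)).ratio(), c)
        for c in candidates
    ]
    score, match = max(scored)
    return (match, score) if score >= threshold else (None, score)

if __name__ == "__main__":
    known = ["St. Mary's Medical Center", "Riverside Community Hospital"]
    # -> ("St. Mary's Medical Center", ~0.95)
    print(best_match("St Mary's Medical Center", known))
```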

Key Responsibilities

Data Pipeline Development:

  • Build AWS Lambda functions that process data from APIs and other healthcare data sources (see the sketch after this list)
  • Develop ETL pipelines using Python and SQL for healthcare compliance data
  • Implement data validation and quality checks for sensitive healthcare information
  • Create automated data processing workflows for accreditation documents
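
As a rough sketch of the Lambda-based processing and validation described above; the bucket layout, field names, and validation rules are assumptions for the example, not our actual schema.

```python
# Hypothetical Lambda handler: validates compliance records landed in S3 and
# writes the clean subset to a "validated/" prefix. All names are illustrative.
import json
import boto3

s3 = boto3.client("s3")

REQUIRED_FIELDS = {"facility_id", "standard_id", "status", "reported_at"}

def validate(record: dict) -> list[str]:
    """Return a list of problems found in one compliance record."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS - record.keys()]
    status = record.get("status")
    if status is not None and status not in {"compliant", "non_compliant", "pending"}:
        problems.append(f"unknown status: {status}")
    return problems

def handler(event, context):
    """Triggered by an S3 put event on the raw/ prefix."""
    for entry in event.get("Records", []):
        bucket = entry["s3"]["bucket"]["name"]
        key = entry["s3"]["object"]["key"]

        records = json.loads(s3.get_object(Bucket=bucket, Key=key)["Body"].read())
        clean = [r for r in records if not validate(r)]

        s3.put_object(
            Bucket=bucket,
            Key=key.replace("raw/", "validated/", 1),
            Body=json.dumps(clean).encode("utf-8"),
        )
    return {"statusCode": 200, "processed": len(event.get("Records", []))}
```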

Database & Storage:

  • Work with MongoDB Atlas and S3
  • Design and optimize SQL database schemas
  • Implement an S3-based data lake architecture with Bronze/Silver/Gold zones (see the sketch after this list)
  • Build caching systems using Redis for performance optimization
  • Configure and use Amazon Athena for interactive analytics and querying
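
The Bronze/Silver/Gold layering mentioned above is, roughly: raw landed data (Bronze), cleaned and conformed data (Silver), and aggregated, analytics-ready tables (Gold). Below is a minimal sketch of a Bronze-to-Silver promotion step; the bucket name, prefixes, and cleaning rule are made up for the example.

```python
# Minimal Bronze -> Silver promotion sketch. Bucket, prefixes, and the cleaning
# rule are assumptions; a real pipeline would add partitioning, schema checks,
# and idempotency.
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "example-compliance-lake"  # hypothetical bucket name

def promote_to_silver(bronze_key: str) -> str:
    """Read a raw JSON object from the bronze/ prefix, clean it, write to silver/."""
    raw = json.loads(s3.get_object(Bucket=BUCKET, Key=bronze_key)["Body"].read())

    cleaned = [
        {**r, "facility_id": str(r["facility_id"]).strip().upper()}
        for r in raw
        if r.get("facility_id")          # drop records with no facility identifier
    ]

    silver_key = bronze_key.replace("bronze/", "silver/", 1)
    s3.put_object(Bucket=BUCKET, Key=silver_key,
                  Body=json.dumps(cleaned).encode("utf-8"))
    return silver_key
```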

Analytics Support:

  • Create data feeds for executive dashboards and KPI tracking
  • Build healthcare-specific analytics and benchmarking data pipelines, using Athena for queries (see the sketch after this list)
  • Support batch processing systems for hospital quality metrics
  • Collaborate with our ML team to integrate predictive models into data workflows
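
For instance, a dashboard feed might be a scheduled Athena query whose results land in S3 for the BI layer to pick up. The sketch below uses boto3’s Athena client; the database name, table, columns, and output location are assumptions for the example.

```python
# Illustrative sketch of kicking off a scheduled Athena query that feeds a KPI
# dashboard. Database, table, and S3 locations are assumed names.
import time
import boto3

athena = boto3.client("athena")

QUERY = """
SELECT facility_id,
       AVG(CASE WHEN status = 'compliant' THEN 1.0 ELSE 0.0 END) AS compliance_rate
FROM compliance_events
WHERE reported_at >= date_add('day', -30, current_date)
GROUP BY facility_id
"""

def run_kpi_query() -> str:
    """Start the query and block until it finishes; return the execution id."""
    execution = athena.start_query_execution(
        QueryString=QUERY,
        QueryExecutionContext={"Database": "analytics"},          # assumed name
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    query_id = execution["QueryExecutionId"]

    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)
    return query_id
```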

Healthcare Compliance:

  • Implement HIPAA-compliant data processing and audit trail systems
  • Build data governance and documentation standards
  • Create automated monitoring and alerting for data quality issues (see the sketch after this list)
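
As one illustration of automated data quality monitoring, a batch job can emit simple quality metrics to CloudWatch, where alarms notify the team when thresholds are breached. The metric names and checks are assumptions for the example; only counts and rates are published, never patient-level data.

```python
# Sketch of data quality monitoring: compute simple checks for a processed
# batch and push them to CloudWatch as custom metrics (names are illustrative).
import boto3

cloudwatch = boto3.client("cloudwatch")

def report_quality(records: list[dict]) -> None:
    """Emit volume and null-rate metrics for a processed batch of records."""
    total = len(records)
    missing_ids = sum(1 for r in records if not r.get("facility_id"))
    null_rate = (missing_ids / total) if total else 1.0

    cloudwatch.put_metric_data(
        Namespace="DataQuality/Compliance",
        MetricData=[
            {"MetricName": "RecordCount", "Value": float(total), "Unit": "Count"},
            {"MetricName": "FacilityIdNullRate", "Value": null_rate, "Unit": "None"},
        ],
    )
```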

Required Qualifications

Technical Skills:

  • 1-3+ years with Python and SQL: Data processing, scripting, and database querying
  • 1-2 years of AWS experience: Knowledge of Lambda, S3, or other cloud data services
  • Database Knowledge: Experience with SQL databases and/or NoSQL (MongoDB preferred)
  • ETL/Data Processing: Understanding of data pipeline concepts and batch processing

Data Engineering Fundamentals:

  • Experience with data transformation and cleaning
  • Understanding of data warehousing and lake concepts
  • Basic knowledge of data quality and validation techniques
  • Familiarity with version control (Git) and collaborative development

Nice to Have:

  • Healthcare or regulated industry experience
  • Experience with Spark, Delta Lake, or distributed computing
  • Knowledge of data visualization and BI tools
  • Understanding of ML pipeline integration
  • HIPAA compliance knowledge

Technical Stack

Data Processing:

  • Languages: Python, SQL
  • Cloud: AWS (Lambda, S3, EMR, Athena)
  • Databases: MongoDB Atlas, PostgreSQL, Redis
  • Processing: Spark, Delta Lake, batch and stream processing

Integration & ML:

  • ML Integration: SageMaker, MLflow (working with our ML team)
  • Analytics: Athena for scheduled/triggered queries
  • Monitoring: CloudWatch, automated data quality checks

Our Hiring Process

We believe in a transparent and thorough selection process that respects your time while ensuring mutual fit:

1. Initial Screening Call: We’ll discuss your background, experience, and career goals, while providing an overview of the role and our team culture.

2. Technical Challenge: You’ll receive a real-world technical challenge to complete within a specified timeframe. We encourage you to leverage all available resources, including AI tools, documentation, and libraries, just as you would in a production environment. This reflects how we actually work and allows you to showcase your problem-solving approach.

3. Technical Interview: We’ll have an in-depth discussion about your solution and explore related technical concepts. You should be prepared to walk through every aspect of your submission, explaining architectural decisions, code logic, trade-offs, and potential improvements. Whether you wrote a specific code section manually or generated it with AI assistance, you must demonstrate complete ownership and understanding of the entire codebase. This is a production-level assessment: we expect you to discuss, debug, and defend your work as if it were going live tomorrow.

We’re looking for engineers who can think critically, adapt their approach, and truly understand the systems they build—not just those who can generate code.

Ready to apply? We look forward to hearing from you!

MedLaunch is an equal opportunity employer committed to diversity and inclusion.