Senior Data Engineer, DPD Team (Remote, International)

A bit about us:

PulsePoint is a leading healthcare ad technology company that uses real-world data in real time to optimize campaign performance and revolutionize health decision-making. Leveraging proprietary datasets and methodology, PulsePoint targets healthcare professionals and patients with an unprecedented level of accuracy, delivering unparalleled results to the clients we serve. The company is now a part of Internet Brands, a KKR portfolio company and owner of WebMD Health Corp.

Sr. Data Engineer

The PulsePoint Data Engineering team plays a key role in a technology company that is experiencing exponential growth. Our data pipeline processes over 80 billion impressions a day (> 20 TB of data, 200 TB uncompressed). This data is used to generate reports, update budgets, and drive our optimization engines. We do all this while running against tight SLAs and providing stats and reports as close to real time as possible.

The most exciting part about working at PulsePoint is the enormous potential for personal and professional growth. We are always seeking new and better tools to help us meet challenges, such as adopting proven open-source technologies to make our data infrastructure more nimble, scalable, and robust. Some of the cutting-edge technologies we have recently implemented are Kafka, Spark Streaming, Presto, Airflow, and Kubernetes.
What you'll be doing:
• Design, build, and maintain reliable and scalable enterprise-level distributed transactional data processing systems for scaling the existing business and supporting new business initiatives
• Optimize jobs to utilize Kafka, Hadoop, Presto, Spark, and Kubernetes resources in the most efficient way
• Monitor and provide transparency into data quality across systems (accuracy, consistency, completeness, etc.)
• Increase accessibility and effectiveness of data (work with analysts, data scientists, and developers to build/deploy tools and datasets that fit their use cases)
• Collaborate within a small team with diverse technology backgrounds
• Provide mentorship and guidance to junior team members

Team Responsibilities:
• Ingest, validate, and process internal and third-party data
• Create, maintain, and monitor data flows in Python, Spark, Hive, SQL, and Presto for consistency, accuracy, and lag time
• Maintain and enhance the framework for jobs (primarily aggregate jobs in Spark and Hive)
• Create different consumers for data in Kafka using Spark Streaming for near-real-time aggregation
• Tools evaluation
• Backups/Retention/High Availability/Capacity Planning
• Review/Approval - DDL for database, Hive Framework jobs, and Spark Streaming to make sure they meet our standards

Technologies We Use:
• Python - primary repo language
• Airflow/Luigi - for job scheduling
• Docker - packaged container image with all dependencies
• Graphite - for monitoring data flows
• Hive - SQL data warehouse layer for data in HDFS
• Kafka - distributed commit log storage
• Kubernetes - distributed cluster resource manager
• Presto/Trino - fast parallel data warehouse and data federation layer
• Spark Streaming - near-real-time aggregation
• SQL Server - reliable OLTP RDBMS
• Apache Iceberg
• GCP - BigQuery for performance, Looker for dashboards

Requirements:
• 8+ years of data engineering experience
• Strong skills in and current experience with SQL and Python
• Strong recent Spark experience (3+ years)
• Experience working in on-prem environments
• Hadoop and Hive experience
• Experience in Scala/Java is a plus (polyglot programmer preferred!)
• Proficiency in Linux
• Strong understanding of RDBMS and query optimization
• Passion for engineering and computer science around data
• East Coast U.S. hours 9am-6pm EST; you can work fully remotely
• Notice period of 2 months or less
• Knowledge of and exposure to distributed production systems, e.g., Hadoop
• Knowledge of and exposure to cloud migration (AWS/GCP/Azure) is a plus

Location:
• We can hire as FTE in the U.S., UK, and Netherlands
• We can hire as long-term contractor (independent or B2B) in most other countries

Selection Process:
1) CodeSignal Online Assessment
2) Initial Screen (30 mins)
3) Hiring Manager Interview (45 mins)
4) Tech Challenge
5) Interview with Sr. Data Engineer (60 mins)
6) Team Interviews (90 mins + 3 x 45 mins) + SVP of Engineering (30 mins)
7) WebMD Sr. Director, DBA (30 mins)

Note that LeetCode-style live coding challenges will be involved in the process.

WebMD and its affiliates are an Equal Opportunity/Affirmative Action employer and do not discriminate on the basis of race, ancestry, color, religion, sex, gender, age, marital status, sexual orientation, gender identity, national origin, medical condition, disability, veteran status, or any other basis protected by law.
