The Data Engineer works as part of the technical data management team, supporting data scientists and analytics developers who build interactive information systems and run queries and algorithms against distributed data assets for predictive analytics, machine learning, and data mining purposes.
In many cases, data engineers also work with business units and development departments to deliver data aggregations to executives, business analysts, and other consumers of more traditional types of data, supporting ongoing operations and downstream data entities, and building with an eye toward transfer of support (training/documentation).
• Works on multiple projects as a technical team member or lead, driving user story analysis and elaboration as well as the design and development of data- and analytics-centric applications and foundational data assets.
• Codes, tests, and documents new or modified data systems to create robust, scalable applications for data analytics, and works collaboratively with our data scientists to enable their efforts.
• Can work seamlessly across HDFS and columnar data systems (Spark/Redshift) as well as traditional row-based environments (Oracle/SQL Server).
• Proficient in ETL tools such as Informatica PowerCenter, writing stored procedures, or creating Lambda functions; familiar with newer ETL offerings such as AWS Glue.
• Proficient with the AWS Command Line Interface (CLI), PowerShell, or other shell scripting tools.
• Familiar with AWS technologies such as EMR, Lambda, S3, KMS, and DynamoDB.
Top Skills Details:
1. PySpark or Python (5-7 yrs): take big data, run it through, and set it up in the specific way assigned, helping to prepare big data for the data scientists. Will use PySpark to inspect and debug what's going on and identify issues. Overall 'slicing and dicing' of data.
2. AWS experience: specifically, knowledge of moving big data onto S3 (AWS storage) and using SageMaker to deploy machine learning models in the cloud.
Specific AWS technologies: EMR, Lambda, S3, RDS, SageMaker, and EC2 components.
3. Creating data pipelines: will need to create data pipelines from various storage systems, running Spark on EMR to limit the bottlenecking and under-provisioning problems of the on-prem environment.
Additional Skills & Qualifications:
4. Logical reasoning: these engineers need to retrieve data from many different areas (sometimes it's in Oracle, Salesforce, a SQL database, someone else's S3 bucket, etc.), so they need to apply logical reasoning about how to make that happen in the best way possible.
5. Technologies: Java, SQL, shell scripting.
6. Excellent communication skills: will work with multiple teams, needs to communicate effectively with all members, and should act as a mentor for team members.
7. Healthcare industry experience or knowledge is preferred
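The multi-source retrieval described in item 4 often comes down to a simple dispatch pattern: pick the right reader for each source system. The sketch below is purely illustrative; the reader functions are hypothetical stubs standing in for real JDBC, S3, and Salesforce connectors, not an actual library API.

```python
# Hypothetical stubs; a real implementation would call JDBC drivers,
# boto3, or a Salesforce client rather than returning tagged strings.
def read_jdbc(location: str) -> str:
    return f"jdbc:{location}"

def read_s3(location: str) -> str:
    return f"s3:{location}"

def read_salesforce(location: str) -> str:
    return f"sfdc:{location}"

# Map each known source system to the reader that handles it.
READERS = {
    "oracle": read_jdbc,
    "sqlserver": read_jdbc,
    "s3": read_s3,
    "salesforce": read_salesforce,
}

def extract(source_type: str, location: str) -> str:
    """Route a retrieval request to the appropriate reader."""
    try:
        reader = READERS[source_type]
    except KeyError:
        raise ValueError(f"no reader registered for {source_type!r}")
    return reader(location)
```

Keeping the routing in one table makes it obvious where a new source system plugs in, which is most of the "logical reasoning" the role calls for.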
We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.
The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information, or any characteristic protected by law.