We are hiring a Data Engineer for data migration projects, primarily using AWS, Informatica IDMC, Databricks, and Tableau.
Roles and responsibilities
Design and architect data storage solutions, including databases, data lakes, and warehouses, using AWS services such as Amazon S3, Amazon RDS, Amazon Redshift, and Amazon DynamoDB, along with Databricks' Delta Lake
Integrate Informatica IDMC for metadata management and data cataloging
Integrate data from various sources, both internal and external, into AWS and Databricks environments, ensuring data consistency and quality, while leveraging Informatica IDMC for data integration, transformation, and governance
Develop ETL (Extract, Transform, Load) processes to cleanse, transform, and enrich data for analytical use, using Databricks' Spark capabilities and Informatica PowerCenter/IDMC for data transformation and quality
Monitor and optimize data processing and query performance in both AWS and Databricks environments, making necessary adjustments to meet performance and scalability requirements.
Use Informatica PowerCenter/IDMC to optimize data workflows
Implement security best practices and data encryption methods to protect sensitive data in both AWS and Databricks, while ensuring compliance with data privacy regulations.
Employ Informatica IDMC for data governance and compliance
Implement automation for routine tasks, such as data ingestion, transformation, and monitoring, using AWS services like AWS Step Functions, AWS Lambda, Databricks Jobs, and Informatica IDMC for workflow automation
Maintain clear and comprehensive documentation of data infrastructure, pipelines, and configurations in both AWS and Databricks environments, with metadata management facilitated by Informatica IDMC
Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and deliver appropriate solutions across AWS, Databricks, and Informatica IDMC
Identify and resolve data-related issues and provide support to ensure data availability and integrity across AWS, Databricks, and Informatica PowerCenter/IDMC environments
Optimize AWS, Databricks, and Informatica resource usage to control costs while meeting performance and scalability requirements
Create, manage, and optimize data visualization processes using tools such as Tableau, OAS, or Power BI
Stay up to date with AWS, Databricks, and Informatica PowerCenter/IDMC services, as well as data engineering best practices, to recommend and implement new technologies and techniques
Requirements
Bachelor's or master's degree in computer science, data engineering, or a related field
Minimum 8 years of experience in data engineering, with expertise in AWS services, Databricks, and/or Informatica PowerCenter/IDMC
Proficiency in programming languages such as Python, Java, or Scala for building data pipelines
Ability to evaluate potential technical solutions and make recommendations to resolve data issues, especially in assessing the performance of complex data transformations and long-running data processes
Strong knowledge of SQL and NoSQL databases
Familiarity with data modeling and schema design
Excellent problem-solving and analytical skills
Strong communication and collaboration skills
AWS certifications (e.g., AWS Certified Data Analytics - Specialty), Databricks certifications, and Informatica certifications are a plus
Preferred Skills
Experience with big data technologies like Apache Spark and Hadoop on Databricks
Knowledge of data governance and data cataloging tools, especially Informatica IDMC
Familiarity with data visualization tools like Tableau or Power BI
Knowledge of containerization and orchestration tools like Docker and Kubernetes
Understanding of DevOps principles for managing and deploying data pipelines
Experience with version control systems (e.g., Git) and CI/CD pipelines