
Data Engineer - Consultant

India

Apply now

About Data Engineering:

Data engineering involves the development of solutions for the collection, transformation, storage, and management of data to support data-driven decision making and enable efficient data analysis by end users. It focuses on the technical aspects of data processing, integration, and delivery to ensure that data is accurate, reliable, and accessible in a timely manner, as well as on the scalability, cost-effectiveness, security, and supportability of the solution. Data engineering encompasses multiple toolsets and architectural concepts across on-premises and cloud stacks, including but not limited to data warehouses, data lakes, lakehouses, and data meshes, and covers the extraction, ingestion, and synchronization of structured and unstructured data across the data ecosystem. It also includes the organization and orchestration of data processing, as well as its performance optimization.

 

Duties: 

  • Designing and optimizing data storage architectures, including OneLake, data lakes, data warehouses, serverless platforms, and distributed file systems. Implementing techniques like partitioning, compression, or indexing to optimize data storage and retrieval. Identifying and resolving bottlenecks, tuning queries, and implementing caching strategies to enhance data retrieval speed and overall system efficiency. 
  • Building data pipelines using a wide set of Azure data platforms and Spark/Databricks to ingest data from various sources such as databases, APIs, or streaming platforms. Integrating and transforming data to ensure its compatibility with the target data model or format. 
  • Implementing data models, with guidance, that support efficient data storage, retrieval, and analysis. Collaborating with data scientists and analysts to understand their requirements and provide them with well-structured and optimized data for analysis and modeling purposes. 
  • Utilizing frameworks like Spark to perform distributed computing tasks, such as parallel processing, distributed data processing, or machine learning algorithms. 
  • Establishing data governance practices to maintain data integrity, quality, and consistency. 
  • Identifying and resolving issues related to data processing, storage, or infrastructure. Monitoring system performance, identifying anomalies, and conducting root cause analysis to ensure smooth and uninterrupted data operations. 
  • Collaborating with cross-functional teams including data scientists, analysts, and business stakeholders to understand their requirements and provide technical solutions. Communicating complex technical concepts to non-technical stakeholders in a clear and concise manner. 
  • Working independently and taking responsibility for delivering a solution. 
  • Working within Agile and Scrum development methodologies. 
  • Staying updated with emerging technologies, tools, and techniques in the field of big data engineering. Exploring and recommending new technologies to enhance data processing, storage, and analysis capabilities. 
  • Upholding the company's core values. 

  

Requirements: 

 

  • A bachelor's or master's degree in Computer Science, Information Systems, or a related field is typically required.
  • 3+ years of experience as a Data Engineer or in a similar role. 
  • Proficiency in MS Fabric, Azure Data Factory, Azure Synapse Analytics, Azure Databricks. 
  • Experience with Lakehouse, OneLake, data pipelines, Real-Time Analytics, data warehouses, Power BI integration, semantic models, Spark jobs, notebooks, and Dataflow Gen1 and Gen2. 
  • Strong understanding of Delta Lake, Parquet, and distributed data systems. 
  • Strong programming skills in Python, PySpark, Scala, and Spark SQL/T-SQL for data transformations. 
  • Strong knowledge of source/version control along with CI/CD is a plus. 
  • Proficiency in data integration techniques, ETL processes, and data pipeline architectures. 
  • Solid understanding of data processing techniques such as batch processing, real-time streaming, and data integration. 
  • Knowledge of data warehousing concepts and technologies such as Synapse, Redshift, Snowflake, or BigQuery is nice to have.
  • Experience in data modelling techniques and database optimization. Knowledge of query optimization, indexing, and performance tuning is necessary for efficient data retrieval and processing. 
  • Understanding of data security best practices and experience implementing data governance policies. Familiarity with data privacy regulations and compliance standards is a plus. 
  • Strong problem-solving abilities to identify and resolve issues related to data processing, storage, or infrastructure. Analytical mindset to analyze and interpret complex datasets for meaningful insights. 
  • Knowledge of data orchestration tools is beneficial. 
  • Knowledge of Power BI is beneficial. 
  • Experience designing and writing integration and unit tests is beneficial. 
  • Excellent communication skills to effectively collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders. Ability to convey technical concepts to non-technical stakeholders in a clear and concise manner. 
  • Certifications are an advantage: 
    • Databricks certifications 
    • DP-600 (Microsoft Certified: Fabric Analytics Engineer Associate) 
    • DP-700 (Microsoft Certified: Fabric Data Engineer Associate) 
    • DP-203 (Microsoft Certified: Azure Data Engineer Associate) 

 

Why join us: 

  • Stable employment. On the market since 2008, with 1300+ talents currently on board across 7 global sites.
  • 100% remote.
  • Flexibility regarding working hours.
  • Full-time position.
  • Comprehensive online onboarding program with a “Buddy” from day 1.
  • Cooperation with top-tier engineers and experts.
  • Unlimited access to the Udemy learning platform from day 1.
  • Certificate training programs. Lingarians earn 500+ technology certificates yearly.
  • Upskilling support. Capability development programs, Competency Centers, knowledge sharing sessions, community webinars, 110+ training opportunities yearly.
  • Grow as the company grows. 76% of our managers are internal promotions.
  • A diverse, inclusive, and values-driven community.
  • Autonomy to choose the way you work. We trust your ideas.
  • Create our community together. Refer your friends to receive bonuses.
  • Activities to support your well-being and health.
  • Plenty of opportunities to donate to charities and support the environment.
