Mid Data Engineer
Description
We leverage the power of a global crowd to provide high-quality data for artificial intelligence for some of the world’s biggest companies. We’re instrumental to the progression and development of AI and we’re proud to be involved in an industry that is changing the world. From a personal point of view, we’re a group of big thinkers, high achievers and creative problem solvers who bond over software engineering, data science, and strong coffee. We like online gaming, running marathons, and team activities. We celebrate authenticity and diversity and we’re invested in what we do. Our mission? World domination, obviously!
Responsibilities
- Design and implement scalable PySpark-based data pipelines to process (clean, validate, package, and deliver) multimodal AI training datasets (e.g., text, audio, video, images).
- Develop ETL pipelines to supply data for analytical dashboards in the Operations areas.
- Set software engineering tools, platforms, and best practices while performing trade-off analysis to best match engineering, product, and project constraints and expectations.
- Operate data pipelines to ingest data from multiple sources and deliver it to different destinations.
- Help the Product Manager and stakeholders structure, break down, and prioritize the product roadmap into backlog items.
- Collaborate with other software engineering teams such as SREs and DevOps to achieve team goals.
- Work with Software Engineering teams to integrate the Data Platform with other tools and platforms.
Who are we looking for?
Do you have the drive to work in an innovative and ambitious environment? We’re looking for someone with a determined and proactive mindset, inspired and passionate to help us achieve our goals. Our successful candidate is a strong critical thinker, reliable and transparent, with an ability to learn and communicate. We are looking for someone special to contribute to our unique culture.
Qualifications
- BSc or MSc in Computer Science or a related field.
- 3 to 5 years of experience.
- Experience in PySpark-based data pipelines and software quality best practices.
- Experience with Azure services such as Synapse Analytics (PySpark Jobs, Pipelines, and Notebooks), ADLS, Power BI, DevOps, and SQL and NoSQL databases.
- Solid understanding of data-related architectures, concepts, technologies, and processes (e.g., Medallion, Data Lake, Data Lakehouse, Data Warehouse, ETL).
- Comfortable with evaluating and applying software design and architectural patterns/principles.
- Knowledge of RESTful APIs based on FastAPI, from provider and consumer perspectives.
- Problem-solving skills.
- Proficient in both written and spoken English.
Benefits
- Flexible working schedule and hybrid model. Manage your schedule and work from one of our modern office spaces or from home.
- Excellent career development opportunities in a high growth company. Accomplish your career goals with a supported career path.
- Culture of feedback and continuous improvement. AI is fast-paced, so we keep up with tech trends and value feedback.
- An international and diverse team. More than 30 nationalities at our 3 locations, with language classes.
- Continuous training opportunities. Workshops, Udemy access, and formal development opportunities.
- We love to have fun together. Fun activities and team events are part of our culture.
About Us
Defined.ai offers a platform with multiple data delivery options that leverages machine learning technology and human intelligence to deliver quality-guaranteed training data for AI systems. The platform offers self-service and fully customizable solutions that enable AI products to reach market quicker. Defined.ai has raised a total of $63.6M in funding over 4 rounds. Our value proposition is quality, privacy, speed and scale, covering more than 50 languages. With strong expertise in speech and natural language processing, we have been serving AI companies and Fortune 500 companies since day one. Defined.ai was founded in Seattle and has offices in Lisbon.
Privacy Notice: defined.ai/candidate-privacy-statement
Detailed information about the job offer
Company: Defined.ai
Location: Lisbon, Portugal
Published: 12.10.2025