Data Scientist

Date Posted

January 10, 2025

Location

Remote

Job Description

Are you a Data Scientist based in Greece and eager to consider a remote project?
How about a project at a Data Analytics & Publishing company?

General information

  • Duration: 6-month contract
  • No. of working hours: 36 hours per week
  • Location: Remote
  • Contract type: Freelance
  • Other: You should reside / be located in Greece

What is the project about?
Our client is seeking to add a new team member with a data background. They are a diverse team of NLP and machine learning experts, taxonomy experts and scientific content experts in the biology and chemistry domains. The client mainly develops best-in-class enrichment pipelines for its business and partners.

You will be responsible for building, testing, and maintaining our NLP solutions using (gen)AI technology. You will work throughout the whole life cycle of data science projects: design, implementation, production and beyond. You will deliver efficient and production-ready Python code. You will collaborate with the technology team to deploy and productionize our data science pipelines.
You will work in a team of 24 data scientists, located in Amsterdam, Greece and India. The team mostly develops automation pipelines, which are used to extract data from text articles and feed it into the 3 databases. The team is also exploring GenAI, with tasks such as embedding and vector database search. They use LLMs mainly for entity and relationship extraction.

Responsibilities

  • Data collection, data analysis, model development, defining quality metrics, quality assessment of models and regular presentations to stakeholders.
  • Creating production-ready Python packages for each component of data science pipelines (such as pre-processing and model inference) and their deployment together with the technology team
  • Integration of data science components and end-to-end quality assessment.
  • Keeping our data science pipelines robust against model drift and ensuring continuous output quality; development of needed tools and strategies for maintenance such as automatic model re-training.
  • Establishing a reporting process for pipeline performance and an automatic re-training strategy for the existing pipelines
  • Leading and managing projects with a team of data scientists and independently executing entire small-scale projects
  • Consistently communicating team goals and milestone achievements to internal stakeholders

 
Requirements

  • At least 4 years of relevant applied experience and an MSc/MTech in computer science, data science, artificial intelligence, mathematics, statistics, bioinformatics or another quantitative field, or at least 5 years of relevant experience. A PhD in the field is a plus. International working/education experience is a plus!
  • Strong hands-on knowledge of Python; able to write unit tests and production-ready code adhering to best practices and object-oriented programming principles.
  • Data processing, cleaning, and analysis skills: experience with Pandas, NumPy, Matplotlib, SciPy
  • Hands-on machine learning experience in classification, regression, clustering, and text mining. You have a good understanding of Neural Networks, Random Forests, Logistic Regression, SVM, K-Means etc., and are a confident user of Scikit-learn, PyTorch and/or TensorFlow.
  • Proven experience in NLP using BERT models or their variants, in general for extraction tasks.
  • Experience in using LLMs for extraction or evaluations, and RAG infrastructure.
  • Experience or affinity with vector databases and embedding models is a plus.
  • Very good communication and presentation skills, in particular proven ability to convey data science concepts effectively to non-technical audiences.
  • Proven experience in managing projects and communicating with stakeholders.
  • Willingness to learn, analytical thinking, problem-solving skills; ability to translate complex requirements into practical solutions.
  • Experience with Git, basic DevOps and CI/CD skills, cloud computing (AWS, Azure), OpenSearch, Databricks.
  • Interest and willingness to gain experience in MLOps and data science productionization.

Does this role spark your interest? Then please send me your most recent resume and contact details, so that we can discuss this vacancy in more detail by phone!

You can check other job opportunities on our website: Jobs – Magno IT
 

Contact

Debby de Groot

Email

debby@magno-it.nl
