NLP Data Scientist

Job Type: Fixed Term

Job Duration: 5 months

UK-based applicants only

Who we are

We are a growing data science consultancy based in London, and we are looking for a passionate NLP data scientist to work on the development of our regulatory technology product.

What are the project objectives

The objectives of the project are to apply state-of-the-art Natural Language Processing (NLP) technology to the UK regulatory corpus, and to use Artificial Intelligence (AI) to develop legal text classification and understanding models that assist businesses with regulatory compliance.

What you’ll do

  • Programming primarily in Python using Jupyter notebooks, Python data science libraries (Pandas, SciPy, NumPy, NLTK, Gensim, etc.) and language models such as BERT;
  • Developing text classification models using supervised and unsupervised machine learning;
  • Developing topic models using LDA/LSI/NMF and similar approaches;
  • Developing Named Entity Recognition models;
  • Developing extractive and abstractive text summarisation models;
  • Participating in the planning, feasibility assessment, proof-of-concept and reporting phases;
  • Working through iterative development, validation and testing phases;
  • Brainstorming and discussions with lead data scientists, data engineers and product managers. 

Who you’ll be

Minimum Requirements:

  • Successful completion of studies in Computer Science, Engineering or a related subject;
  • 2 or more years of programming experience;
  • Proficiency in Python and Jupyter notebooks;
  • Proficiency using data science libraries such as Pandas, SciPy, NumPy, scikit-learn, and similar;
  • Experience with extraction and use of text vector representations such as count vectors, TF-IDF, Gensim Word2Vec, fastText, BERT, and others;
  • Experience in development, validation and testing of Python functions using pytest or a similar library;
  • Experience with text classification, topic modelling, text summarisation or other text related model design, development, validation and optimisation;
  • Ability to solve real-world problems with independent research while being able to work in a team. 

Preferred Qualifications:

  • Advanced R&D experience using state-of-the-art NLP sentence vectorisers such as BERT;
  • Advanced knowledge of text summarisation, particularly abstractive text summarisation;
  • Advanced real-world experience in applying topic modelling techniques such as LDA/LSI/NMF;
  • Advanced real-world experience in text classification and clustering;
  • Advanced knowledge of supervised and unsupervised machine learning in NLP;
  • Experience in independent design and development of NLP methodologies;
  • Experience in the RegTech field or in applying NLP to legal texts.

Where you’ll work

We are currently based at Huckletree, which is located near Old Street Station in Shoreditch. Huckletree is a global community for tech entrepreneurs and startups, where you are surrounded by passionate and amazing people. There are many great talks and events, monthly drinks and other socialising activities.

During the project time frame (November 2021 to March 2022), we expect work to be mostly from home, although in-person discussions or meetings in London may be required.

How you’ll progress

  • We are recruiting you for a 5-month fixed-term Phase 1 project, which we aim to follow with a 12-month Phase 2 if Phase 1 is successful;
  • Your development is as important to us as it is to you. You’ll be rewarded for hard work here, with support to get better at what you do;
  • We work in a fast-paced project environment where change is constant. If you’re up for the challenge, you’ll have opportunities to try new things and broaden your skills quickly through exposure to the executive team and new experiences.

What you’ll get

  • 5-month fixed-term employment
  • 12 days of holiday (excluding bank holidays) over the 5 months
  • Pension and insurance
  • Flexible working hours and work-from-home opportunities

We recruit based on excellence, and believe that diversity is vital to success. We have zero tolerance for bullying, harassment or any other behaviour that stifles innovation and collaboration.

If you are interested, or if you know of anyone who is interested, please don’t hesitate to contact Jiaxin or use the contact form. We would love to have a chat with you! 🙂
