Paris NLP Season 4 Meetup #3

We would first like to thank JobTeaser for hosting Meetup #3 of Season 4, and our speakers for their very interesting presentations.


• Thomas Belhalfaoui, Lead Data Scientist @ JobTeaser

Siamese CNN for jobs-candidate matching: learning document embeddings with triplet loss.

At JobTeaser, we are the official career center of more than 500 schools and universities throughout Europe, where we can multipost companies' job offers.
Our mission: help students and recent graduates find their dream job. Among the tools we develop is a recommender that suggests job offers of interest to our users.

For this purpose, we build a Siamese Convolutional Neural Network that takes job offer and student resume texts as inputs and yields job and resume embeddings in a shared Euclidean space. Recommendation then simply amounts to finding the nearest neighbors.
We train the network with a triplet loss on historical application feedback.
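
To make the two ideas above concrete, here is a minimal NumPy sketch of a triplet loss and of nearest-neighbor recommendation. It is only an illustration and not the JobTeaser implementation: the function names, the margin value, the use of squared Euclidean distance, and the choice of a resume as anchor with an applied-to offer as positive and another offer as negative are all assumptions, and the embeddings are toy vectors rather than outputs of a trained network.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on squared Euclidean distances, averaged over the batch.

    Each argument is an array of shape (batch, dim). The margin of 1.0 is an
    arbitrary illustrative value, not one taken from the talk.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

def nearest_jobs(resume_vec, job_matrix, k=3):
    """Return the indices of the k job embeddings closest to resume_vec."""
    dists = np.linalg.norm(job_matrix - resume_vec, axis=1)
    return np.argsort(dists)[:k]

# Toy embeddings (hypothetical): one resume, one offer the student applied to,
# and one offer assumed to be irrelevant.
resume = np.array([[0.0, 0.0]])
applied_job = np.array([[0.1, 0.0]])
other_job = np.array([[2.0, 0.0]])

# The positive is already much closer than the negative, so the loss is zero.
loss_good = triplet_loss(resume, applied_job, other_job)

# Swapping the roles gives a positive loss, which training would push down.
loss_bad = triplet_loss(resume, other_job, applied_job)

# Recommendation: rank all job embeddings by distance to the resume.
jobs = np.array([[2.0, 0.0], [0.1, 0.0], [0.0, 1.0]])
top2 = nearest_jobs(resume[0], jobs, k=2)
```

In a Siamese setup the same idea applies with the embeddings produced by the network branches, and the loss is minimized by gradient descent so that offers a student applied to end up closer to the resume than other offers.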

Slides JobTeaser (Siamese CNN job-candidate matching)


• Djamé Seddah, Associate Professor in CS @ Inria

Sesame street-based naming schemes must fade out, long live CamemBERT et le French fromage!

As cliché as it sounds, pretrained language models are now ubiquitous in Natural Language Processing, the most prominent arguably being BERT (Devlin et al., 2018). Many works have shown that BERT-based models can capture meaningful syntactic information using nothing but raw data for training (e.g., Jawahar et al., 2019), and this ability is probably one of the reasons for their success.

Until very recently, most available models were trained either on English data or on the concatenation of data in multiple languages. In this talk, we'll present the results of a work that investigates the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating our language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks. We show that the use of web-crawled data is preferable to the use of Wikipedia data. More surprisingly, we show that a relatively small web-crawled dataset (a few gigabytes) leads to results that are as good as those obtained using datasets two orders of magnitude larger. Our best-performing model, CamemBERT, reaches or improves the state of the art in all four downstream tasks.

Presented by Djamé Seddah, joint work with Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, and Benoît Sagot.

Slides camemBERT

Video of the talks

