We would first like to thank MeilleursAgents for hosting this meetup, our three speakers for their very interesting presentations, and the participants for once again turning out in such large numbers for this session.
You can find the slides of our three speakers below:
• Syrielle Montariol, LIMSI, CNRS
Word usage, meaning and connotation change over time, echoing various aspects of the evolution of society (cultural, technological…). For example, the word “Katrina”, originally associated with female given names, moved closer to disaster vocabulary after Hurricane Katrina struck in August 2005.
Diachronic word embeddings are used to capture such change in an unsupervised way. They are useful for linguistic research on the evolution of languages, and also for standard NLP tasks applied to corpora spanning long time ranges.
In this talk, I will introduce a selection of methods to train time-varying word embeddings and to evaluate them, placing greater emphasis on probabilistic word embedding models.
• Pierre Pakey and Dimitri Lozeve, Destygo
If data beats models, why not build models that produce data? Vast quantities of realistic labeled data will always make the difference in machine learning optimization problems. At Destygo, we automatically leverage the interactions between users and our conversational AI agents to produce vast quantities of labeled data and train our natural language understanding algorithms in a reinforcement learning framework. We will present the outline of our self-learning pipeline, its relation to the state-of-the-art literature, and the specificities of the NLP domain. Finally, we will focus on the network responsible for choosing whether or not to try something new, which is one of the important pieces of the process.
• Julien Perez, Machine Learning and Optimization group, Naver Labs Europe
Over the last five years, differentiable programming and deep learning have become the de facto standard for a vast set of decision problems in data science. Three factors have enabled this rapid evolution. First, the availability and systematic collection of data have made it possible to gather and leverage large quantities of traces of intelligent behavior. Second, the development of standardized frameworks has dramatically accelerated the development of differentiable programming and its applications to the major modalities of the digital world: image, text, and sound. Third, the availability of powerful and affordable computational infrastructure has enabled this new step toward machine intelligence. Beyond these encouraging results, new limits have arisen and need to be addressed. Automatic common-sense acquisition and reasoning capabilities are two of the frontiers that the major machine learning research labs are now working on. In this context, human language has once again become a medium of choice for such research. In this talk, we will take a natural language understanding task, machine reading, as a medium to illustrate the problem and describe the research progress proposed throughout the machine reading project. First, we will describe several of the limitations that current decision models exhibit. Secondly, we will speak of adversarial learning and how such an approach makes learning more robust. Thirdly, we will explore several differentiable transformations that aim at moving toward these goals. Finally, we will discuss ReviewQA, a machine reading corpus built on human-generated hotel reviews, which aims at encouraging research around these questions.