## Group "Softskills"

### Softskills seminar (M2 only) (TPT-DATAAI941)

**Go to course webpage.**

*This course is taught by Fabian Suchanek.*

Students learn how to give good presentations, and present scientific papers. This is an obligatory course of the M2 DataAI. https://moodle.r2.enst.fr/moodle/course/view.php?id=94

## Group "Ethics"

### AI Ethics (TPT-DATAAI951)

*This course is taught by Maxwell Winston, Sophie Chabridon, Ada Diaconescu, Fabian Suchanek.*

Algorithmic fairness, ethical issues, privacy and security

## Group "Logics"

### Logics and Symbolic AI (TPT-IA301)

**Go to course webpage.**

*This course is taught by Isabelle Bloch & Natalia Diaz.*

This course aims at providing the bases of symbolic AI, along with a few selected advanced topics. It includes courses on formal logics, ontologies, symbolic learning, typical AI topics such as revision, merging, etc., with illustrations on preference modeling and image understanding.

*If you take this course you cannot take* Logic & Knowledge representation

### Logic & Knowledge representation (TPT-SD206)

**Go to course webpage.**

*This course is taught by J.-L. Dessalles.*

Prolog (recursion, backtracking, unification); formal logic (propositions, predicates, proof by refutation); natural language processing (DCG, parsing through unification); symbolic machine learning (symbolic induction, minimum complexity); knowledge representation; problem solving

*If you take this course you cannot take* Logics and Symbolic AI

## Group "Databases"

### Database management systems (X-INF553)

**Go to course webpage.**

*This course is taught by Ioana Manolescu.*

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming

*If you take this course you cannot take* Databases

### Databases (TPT-SD202)

**Go to course webpage.**

*This course is taught by Maroua Bahri.*

Relational databases: ER modeling, SQL, query execution, query optimization, schema refinement, application programming

*If you take this course you cannot take* Database management systems

## Group "Machine Learning"

### Machine Learning: Shallow & Deep Learning (TPT-DATAAI902)

**Go to course webpage.**

*This course is taught by Mounim El Yacoubi.*

Deep learning/neural networks, hidden Markov models (HMM), Restricted Boltzmann Machines, Unsupervised Learning, Supervised Learning, data analysis (PCA, LDA), Support Vector Machines (SVM), Decision Trees, Transfer Learning, Adversarial Models, Deep Reinforcement Learning

*If you take this course you cannot take* Machine & Deep Learning Introduction or Machine Learning

### Machine & Deep Learning Introduction (X-INF554)

*This course is taught by M. Vazirgiannis.*

The Machine Learning Pipeline, Data Preprocessing and Exploration, Feature Selection/Engineering & Dimensionality reduction, Supervised Learning, Deep and Reinforcement Learning, Unsupervised Learning

*If you take this course you cannot take* Machine Learning: Shallow & Deep Learning or Machine Learning

### Machine Learning (TPT-DATAAI901)

**Go to course webpage.**

*This course is taught by Filippo Miatto.*

This is an introductory course that will set the basis for the more advanced courses in the second period. Learning goals:

- apply probability theory to the description of ML algorithms
- describe the various frameworks of algorithmic learning (exact, PAC, agnostic)
- analyze the bias-variance tradeoff in light of the information bottleneck principle and double-descent phenomena
- explain how unsupervised learning algorithms work (k-means, expectation-maximization)
- explain how supervised learning algorithms work (regressions, SVM, deep learning)

*If you take this course you cannot take* Machine Learning: Shallow & Deep Learning or Machine & Deep Learning Introduction

## Group "Big Data Systems"

### Systems for Big Data (X-INF583)

*This course is taught by Angelos Anadiotis / Yanlei Diao.*

The course follows X-INF553 and covers data warehouses, OLAP, parallel databases, MapReduce, Spark, concurrency control, memory layouts and execution models. The course takes a systems-oriented approach and first explains the effect of different parts of the computer architecture on query execution. Then, it moves to the fundamentals of parallelisation, including threads, processes and low-level synchronisation, before turning to the coordination of transactions, both in scale-up and scale-out settings. After synchronisation, the course focuses on data analytics, starting from low-level issues like memory layouts and their interaction with query operators, and finally moves towards well-established scale-out platforms like Hadoop and Spark.

*If you take this course you cannot take* Architectures for Big Data or Big data infrastructures or Big Data Processing

### Architectures for Big Data (TPT-DATAAI921)

**Go to course webpage.**

*This course is taught by Ioana Manolescu.*

Mediator systems, P2P systems, structured data management in massively parallel settings

*If you take this course you cannot take* Systems for Big Data or Big data infrastructures

### Big data infrastructures (TSP-CSC5003-1)

*This course is taught by Bruno Defude.*

Big data infrastructure (Hadoop, MapReduce, NoSQL storage solutions, Spark); scalable machine learning (Spark)

*If you take this course you cannot take* Systems for Big Data or Architectures for Big Data or Big Data Processing

### Big Data Processing (TPT-DATAAI922)

**Go to course webpage.**

*This course is taught by Louis Jachiet.*

This module presents the basics of architectures and algorithms for big data processing at a very large scale. It covers MapReduce, Apache Spark, and the Lambda and Kappa architectures.

*If you take this course you cannot take* Systems for Big Data or Big data infrastructures

## Group "Fully optional courses"

### Probabilistic Models and Machine Learning (TSP-IA304)

*This course is taught by Wojciech Pieczynski.*

Bayes networks, hidden Markov models, theory of evidence, segmentation, filtering, smoothing. Examples of applications to image, finance, digital communications.

### Learning for robotics (ENSTA - IA305)

*This course is taught by S.M. Nguyen.*

Learning methods used in robotics and applications to human / robot interaction, demonstration learning or autonomous learning.

### Semantic Networks (CSC5003-2)

*This course is taught by Amel Bouzeghoub.*

Semantic networks, logic (predicate logic, description logic, ...), reasoning, ontologies; discovering Semantic Web languages (RDF, RDFS, OWL, SPARQL); lab sessions (Protégé, Jena)

### Self-Organising Multi-Agent Systems (TPT-DATAAI961)

*This course is taught by Ada Diaconescu.*

Self-adaptation, self-organisation, autonomic control, multi-agent systems: architectures, design patterns, service-oriented platforms, practical project based on smart-home simulator

### Machine Learning in High Dimension (TPT-IA317)

**Go to course webpage.**

*This course is taught by Thomas Bonald, Robert Gower.*

Sparse data, dimensionality reduction, sketching & projections techniques, nearest-neighbor methods

### Constraint programming (ENSTA-IA302)

**Go to course webpage.**

*This course is taught by Julien Alexandre dit Sandretto.*

Constraint programming for solving discrete or continuous problems. Over-constrained problems and explanations

### Graph Mining (TPT-SD212)

**Go to course webpage.**

*This course is taught by Thomas Bonald.*

Graph analysis, sparse data, clustering, PageRank, classification, graph embedding, spectral methods, diffusion methods

### Efficient resolution of logical models (ENSTA-IA303)

*This course is taught by Alexandre Chapoutot.*

In AI or in Software Verification, logical formulas play a crucial role in representing knowledge or modeling a system. This course will present the main algorithms used to check the satisfiability or the non-satisfiability of formulas of Boolean logics. Extensions of these algorithms to deal with more expressive logics will also be presented. Two applications in AI (Logical Knowledge-based agent) and in Software Verification will be presented to illustrate the use of logical formulation. For example, tasks such as path planning, task planning or bounded model checking will be used to illustrate theoretical notions and practical implementation of algorithms.

### Text Mining and NLP (X-INF582)

*This course is taught by M. Vazirgiannis/Buscaldi.*

Text preprocessing and Information Retrieval, graph-of-words, keyword extraction, Text categorization, topic modeling, supervised document classification, Word and document embeddings, unsupervised document classification with the Word Mover's Distance, Advanced deep learning architectures for NLP seq to seq tasks (HAN, ELMO, BERT/Transformer...), Lexical statistics and n-gram models, Sequence Labeling: Named Entity Recognition, POS-tagging, Introduction to Parsing, elements of Machine Translation, Semantics - Knowledge Bases, Relation Extraction

### Machine Learning for Text Mining (TPT-SD-TSIA214)

*This course is taught by Chloé Clavel.*

Text mining is an evolving and challenging domain. For example, a lot of effort has recently been dedicated to the development of methods able to analyze opinion data available on the social Web. The first objective of this course is to tackle the different methods of language processing and machine learning underlying text and opinion mining. During this course, the students will acquire theoretical and technical skills in advanced machine learning methods and natural language processing. This course is designed for students who will be attending classes and labs. The techniques and concepts that will be studied include:

- natural language pre-processing: tokenization, part-of-speech tagging, document representation and word embedding techniques
- natural language resources: lexicons, WordNet and FrameNet
- text clustering and text categorization: advanced machine learning methods such as deep learning, hidden Markov models, etc.

### Basics of image processing and analysis (TPT-DATAAI965)

*This course is taught by Pietro Gori.*

- Specificities of image data and image acquisition
- Image sampling and quantization
- Filtering and morphological image processing
- Noise reduction and restoration
- Segmentation
- Image transformation and registration

### Data Stream Mining (TPT-DATAAI962)

*This course is taught by Albert Bifet & Jesse Read.*

Data stream mining, or Real-Time Analytics, relies on and develops new incremental algorithms that process streams under strict resource limitations. This course focuses on, and extends, the methods implemented in open-source tools such as MOA and Apache SAMOA.

### Data Visualization (X-INF552)

**Go to course webpage.**

*This course is taught by Emmanuel Pietriga (INRIA).*

This course first gives an overview of the field of data visualization. It then discusses fundamental principles of human visual perception, focusing on how they help inform the design of visualizations. The following sessions focus on visualization techniques for specific data structures, and discuss them in depth from both design and implementation perspectives, including: multi-variate data, hierarchical structures, networks, time-series, statistical data and geographical data. All exercises are based on Web technologies, including the D3 software library (Data-Driven Documents) and the Vega-lite interactive graphics grammar. While positioned at different levels of abstraction, both enable developers to create a wide range of interactive, Web-based visualizations that run on a variety of platforms, ranging from desktop workstations to mobile devices.

### Reinforcement Learning (TPT-IA318)

**Go to course webpage.**

*This course is taught by Thomas Bonald, Claire Vernade.*

Reinforcement learning: bandit algorithms, Q-learning, deep Q-learning, Monte-Carlo tree search.

### Navigation for autonomous systems (TPT-DATAAI963)

**Go to course webpage.**

*This course is taught by D. Filliat.*

We will give an overview of algorithmic aspects of Mobile Robotics and autonomous vehicles. We will cover the most common robotics platforms and sensors (vision, 3D ultrasound, accelerometers, odometry) and the various navigation components: control, obstacle avoidance, localization, mapping (SLAM) and planning, along with filtering (Kalman filter, particle filtering, etc.) and optimisation techniques used in these areas.

### Graph mining and Clustering (TPT-MITRO209)

**Go to course webpage.**

*This course is taught by Mauro Sozio.*

The course will cover both the theoretical and practical aspects of data mining. In particular, it will focus on clustering, graph mining, community detection, and fully dynamic data mining algorithms. Students will be asked to design algorithms with provable guarantees and to efficiently implement one of the algorithms presented in the course.

### Multimodal Dialogue (TPT-IA315)

*This course is taught by C. Clavel, G. Varni.*

Introduction to multimodal human-agent dialogue: emotion recognition, gesture recognition, speech synthesis, multimodal dialogue systems, interaction analysis.

This course provides students with foundational conceptual knowledge, methodologies, and tools for designing, implementing, and evaluating intelligent machines able to engage users in a multimodal dialogue. This requires students to know and apply computational methods for capturing, representing, and automatically analyzing the behavior of the users, and for generating the behavior of the machines. At the end of the course, the student will:

- understand the principles of multimodal communication and its open challenges;
- know and understand the motivations for using multimodality for designing intelligent machines;
- know and understand computational methods for managing the dialogue through the following communication modalities: speech, movement, and facial expressions;
- know and understand the foundations of conversational analysis.

### Kernel Machines (TPT-IA326)

**Go to course webpage.**

*This course is taught by Florence d'Alché, Pavlo Mozharovskyi.*

This course gives an advanced and modern presentation of kernel machines and related tools in light of recent results in Machine Learning. The course requires familiarity with the basics of Statistical Machine Learning and with notions of convex programming.

1. Notions on Kernels and Reproducing Kernel Hilbert Space Theory
2. Kernel machines for regression, classification and dimensionality reduction
3. Kernel Machines for complex output prediction
4. Scaling up kernel machines
5. Relationship between kernel machines and neural networks

### Image understanding (TPT-AIC-DK922)

**Go to course webpage.**

*This course is taught by Isabelle Bloch (TP), Florence Tupin (TP), Antoine Manzanera (ENSTA), David Filliat (ENSTA).*

Structural approaches for image understanding, with examples in medical imaging, remote sensing, robotic vision, and video. The methods taught include knowledge-based approaches, graphs, spatial ontologies, information fusion, and high-level recognition.

### Programming with GPU for Deep Learning (TPT-IA307)

*This course is taught by Elisabeth Brunet, Goran Frehse.*

This course gives an introduction to GPU programming techniques used for deep learning. Starting from the ground up with basic matrix operations, students will develop code to implement classifiers based on gradient descent. Programs are written in C and use the CUDA API from Nvidia to access the GPU.

### Knowledge Base Construction (TPT-DATAAI964)

*This course is taught by Fabian Suchanek.*

This course will discuss the automated construction of large knowledge bases. For this, we will cover the basics of knowledge representation, natural language processing (POS tagging, dependency parsing), information extraction (fact extraction, named entity recognition), and rule mining and disambiguation. We will see both classical/symbolic methods and deep learning methods for these tasks. https://suchanek.name/work/teaching/kbc-2020/

### Emergence in Complex Systems (TPT-AthensTPT-09)

**Go to course webpage.**

*This course is taught by J.-L. Dessalles.*

The course will cover several social phenomena, including: collective decision, the cocktail party effect, scale-free social networks, the hawk-dove dilemma, cooperation in insect societies, emergence of segregationism, altruism, the "tragedy of the commons", the "green-beard" effect, social coordination, suicide "for the group", honest communication, charity and competitive helping. Several theoretical models will be studied, including preferential attachment, kin selection, the Prisoner’s dilemma, the handicap principle, and social signaling. Several of these models derive from applying Game Theory to social dilemmas. Content: Emergence, swarm intelligence, genetic algorithms, genetic programming, morphogenesis, emergence of sociality

### Cognitive approach to NLP (TPT-SD-213)

**Go to course webpage.**

*This course is taught by Jean-Louis Dessalles.*

Processing language is one of the most important and most challenging issues of Artificial Intelligence. NLP comes in two flavours. Many current approaches to language processing are based on large collections of texts. Statistics and machine learning provide quite good predictions about syntax, meaning and intentions. By contrast, symbolic approaches to NLP give priority to the analysis of structures and to exact computation. Often inspired by cognitive analyses, symbolic NLP takes the word "processing" literally: the ultimate goal is to reproduce computations that human individuals are supposed to perform when talking relevantly. Content: Basic parsing methods. Knowledge representation – Meaning representation – Procedural semantics – Aspect. Relevance, argumentation.

### Algorithmic information and artificial intelligence (TPT-IA703)

**Go to course webpage.**

*This course is taught by J.-L. Dessalles.*

Algorithmic information is a great conceptual tool to understand Artificial Intelligence. It describes what AI actually does, and it can help make optimal choices. Machine learning, decision making, randomness, probability, anomaly, analogy, interest, even the very act of understanding: all these things make sense in the light of Algorithmic Information Theory. Content: Kolmogorov complexity applied to ML, to AI problems (meaning similarity, relevant descriptions), to maths (alg. probability, randomness, Gödel th.), to cognitive science (relevance, interest, aesthetics...)

### Mining of Large Datasets (TPT-SD201)

**Go to course webpage.**

*This course is taught by Mauro Sozio, Tiphaine Viard.*

The course will provide an introduction to data mining and will cover the following topics: clustering, decision trees, ranking, association rules, recommendation systems, introduction to MapReduce and Spark. Students will work on a project where they will implement some of the previously mentioned algorithms in Python or in Spark.

### Image mining and content-based retrieval (TPT-AIC-DK921)

**Go to course webpage.**

*This course is taught by Antoine Manzanera (ENSTA), David Filliat (ENSTA), Isabelle Bloch (TP), Henri Maître (TP).*

This course deals with visual data (images and videos), and covers image representation, processing and indexing for content-based retrieval purposes.

- It starts from image data and their different models, from mathematical and algorithmic viewpoints, by exploring the different models: frequency-, discrete-, or set-based, differential, or statistical...
- It presents segmentation and feature extraction techniques, i.e. how to reduce the representation support, and what local and global representations can be used to describe the image content.
- Practical Work #1 deals with salient point detection, description and matching.
- Approximately one third of the course is dedicated to classification, detection and image recognition techniques based on machine learning, using CNN (one session) and other unsupervised and supervised techniques (one session).
- Practical Work #2 deals with image classification using CNN.
- One session is dedicated to a significant use case: satellite image mining.
- One session is on video analysis and the importance of motion in video mining, with an emphasis on object tracking methods.
- Practical Work #3 is on object tracking in videos.

The practical works use Python, OpenCV and PyTorch. The evaluation is based on the 3 reports on the practical works (weight 0.6) and a theoretical exam (weight 0.4).

### Advanced Machine Learning and Autonomous Agents (X-INF581-1)

**Go to course webpage.**

*This course is taught by Jesse Read.*

Multi-label and structured-output learning, probabilistic graphical models, modern neural-network architectures, machine learning for graphs, Monte Carlo methods, Introduction to reinforcement learning.

### Topological Data Analysis (X-INF556)

**Go to course webpage.**

*This course is taught by Steve Oudot.*

Objectives: Topological Data Analysis is an emerging trend in exploratory data analysis and data mining. It has attracted growing interest and had some notable successes in recent years. The idea is to use topological tools to tackle challenging data sets, in particular data sets for which the observations lie on or close to nontrivial geometric structures that can fool classical techniques. Topological methods are indeed able to extract useful information about these geometric structures from the data, and to exploit that information to enhance the analysis pipeline. The objective of this course is to familiarize the students with this new topic lying at the confluence of pure mathematics, applied mathematics, and computer science. Emphasis is put on the methods and on their theoretical guarantees. Meanwhile, the lab sessions focus on challenging data sets, primarily multimedia data sets such as collections of images or 3D shapes.

Content: The course is divided into nine lectures and nine exercise or lab sessions. These cover the main mathematical concepts and algorithmic tools involved in topological data analysis. The topics covered include: dimensionality reduction and its limitations, hierarchical versus density-based clustering, simplicial and singular homology, persistence theory, topological inference for data exploration, topological signatures for data classification, Reeb graphs and Mapper.

Suggested readings:

- Gunnar Carlsson. Topology and Data, Bulletin of the American Mathematical Society
- Herbert Edelsbrunner and John Harer, Computational Topology: An Introduction, AMS press