The projects below are available for student projects, Bachelor and/or Master thesis. Please get in contact.

**1. Task Recommender System for Gliding**

**Description.** Gliding is both a recreational and competitive sport. On
good days, glider pilots can be in the air up to 8 hours or more, during
which they can cover significant distances (800km and more). Most glider
pilots upload their competitive flights to an online platform Onlinecontest.org, where flights are daily listed and ranked using
points that are based on the covered distance and the performance of the
aircraft.

To achieve high points on a given day, glider pilots
have to carefully choose a task (flight route) that, given their plane,
skills and the weather conditions, allows them to cover the maximal
distance. All of weather conditions, skills and plane can make a huge
difference, as overestimating the weather conditions may lead to not
completing the task, and as it is not uncommon that experienced pilots travel twice or more the distance that beginners travel.

The goal of this project is to develop a prototype of a gliding task recommendation system, which takes into account the factors mentioned above. A collaboration with the Aeroclub Bolzano allows to get feedback from domain experts.

**Prerequisites.** Opportunity for a student with some knowledge of recommender systems basics, or interest in spatial data analysis.

**2. Heuristics for an NP-hard Problem**

**Description.** Finding the largest subset of a partition of a bipartite graph that has at most n neighbours is an NP-complete problem [1] with practical applications in data management. And in contrast to the related problem of set-cover [2], a greedy approach can be arbitrarily worse than the optimal solution. The goal of this thesis is threefold:

- To construct problem instances where the optimal solution is known,
- To evaluate how well a greedy approach performs in practice,
- To devise heuristical methods that perform better.

**Prerequisites.** Opportunity for tackling a truly new problem for a student with basic programming skills (any language) and a good knowledge of data structures and algorithms.

**References**:

**3. Management of a Data Annotation Project**

**Description.** Wikidata [1] is a Wikipedia-style community project for structured knowledge. Completeness of data collected on Wikidata is highly varying and subjective. For instance, for US presidents, children are usually recorded, whereas for Physics Nobel Prize winners usually not.

The goal of this project is to use the Amazon Mechanical Turk crowdsourcing platform to annotate a small portion of Wikidata with

(1) completeness information relative to the information available on Wikipedia, and

(2) with subjective assessment of the completeness of the data.

This dataset will then be provide a basis for machine learning techniques for predicting the completeness of unlabelled objects (not part of this project).

**Prerequisites**. Opportunity to gain experience in managing data annotation projects for a student with some project experience.

**References**:

**4. Implementation of a Relational Algebra**

**Description.** In [1], the pattern algebra was presented as a way to compute the completeness assertions that hold for the answers of select-project-join-queries, given completeness patterns for database tables. The pattern algebra was defined for the operations selection by constant, selection by variable, projection and equijoin, and each pattern algebra operation is very similar to the corresponding relational algebra operation. In particular, it is possible to translate pattern algebra queries to relational algebra queries.

The goal of this project is to implement a compiler that takes as input a select-project-join-query, expressed as single-block SQL query, and outputs the corresponding pattern algebra query, again in SQL. Java code for translating between single-block SQL queries and algebra queries is already available, thus, the main task would be to implement the translation from relational algebra to the pattern algebra.

**Directions.** For students with interest in theory, the project may include a theoretical analysis of the translation, in particular, under which conditions the translation yields a query of polynomial size. For students with interest in quantitative methods, the project may include an experimental evaluation of the performance of pattern algebra queries, for instance using queries from the TPC-H benchmark.

**References**:

- [1] Identifying the Extent of Completeness of Query Answers over Partially Complete Databases, Simon Razniewski, Flip Korn, Werner Nutt, Divesh Srivastava, SIGMOD 2015