Summer Internships 2021 of the Department of Computer Science

The Department of Computer Science offers over 20 salaried internships in multiple research areas for summer 2021. The application deadline for these positions was Monday, 1 February 2021. The decisions on these positions have now been made, and all applicants will be notified of them by the end of March.

These internships are primarily aimed at computer science and data science students. In some groups or projects, there may also be internships for students of mathematics and statistics. Note that writing a Master's thesis during or after the internship is only possible for computer science and data science students at the University of Helsinki.

The internships typically span three (3) months between May and September. The exact start and end dates will be decided in individual negotiations. The salary of a summer intern depends on the phase of their studies (the number of credits), and it is usually slightly more than 2 000 euros per month.

Applying for an internship

To apply for an internship, use the following electronic form: https://elomake.helsinki.fi/lomakkeet/109438/lomake.html?rinnakkaislomake=CS_summer_jobs_2021 (the form was closed by the application deadline).

Applicants must upload a study transcript (a list of passed exams and courses) and, optionally, a one-page curriculum vitae with the application form. 

Important Dates

  • February 1st, 2021: Deadline for applications
  • February 2nd – 28th, 2021: Possible interviews 
  • During March 2021: Notification of decisions
  • In August 2021: Summer Interns' Seminar

Questions?

Send any questions related to the application process to Pirjo Moen (pirjo.moen@helsinki.fi). More information on the positions themselves is available from the contact persons named below.

The available summer internship positions in 2021 are the following:

In this position, you will be developing and implementing machine learning methods useful for data science, scientific computing and artificial intelligence. The methods developed in our group exploit a probabilistic representation of uncertainty, like humans do, to gather the data that are most informative for the task at hand ("active-sampling"). This approach makes our algorithms particularly robust and efficient. This project can be extended to a Master's thesis.

Required background: machine learning, mathematics, programming skills.

In this project we will study the application of these methods in physical domains, in collaboration with domain experts in meteorology and/or atmospheric physics and chemistry, with a focus on the understandability and interpretability of the methods. The project will be done in collaboration with the Institute for Atmospheric and Earth System Research (INAR). In this position, an interest and background in natural sciences are considered an advantage.

The Constraint Reasoning and Optimization Group has intern openings for summer 2021. Interns will engage in cutting-edge research guided by senior researchers in the group. Topics include automated reasoning and optimization techniques for NP-hard real-world problems (ranging from theoretical analysis to practical algorithm development, implementation, and empirical studies), and symbolic techniques for formally verified and explainable AI.

Differentially private machine learning studies learning methods that can operate while guaranteeing privacy of the data subjects. We apply differential privacy in the context of various modern machine learning methods, including Bayesian methods, deep learning and federated learning.
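
As a rough illustration of the basic idea, the sketch below releases a count with Laplace noise, a standard textbook building block of differential privacy. It is not taken from the project itself; the function name private_count, its parameters, and the example data are assumptions made for illustration only.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of matching records (Laplace mechanism).

    Hypothetical helper for illustration. A counting query changes by at
    most 1 when one person's record is added or removed (sensitivity 1),
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponential samples with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Example with made-up ages: how many people are at least 40?
ages = [23, 35, 41, 29, 52, 61, 38]
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0)
```

A smaller epsilon means more noise and stronger privacy. The learning methods mentioned above (Bayesian methods, deep learning, federated learning) require considerably more machinery than this single query.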

The main focus of the project is on the use of GPU computing to accelerate Big Data processing using the Apache Spark distributed computing framework. The job can alternatively be extended to a Master's thesis position on the topic. The summer job is in the context of the Academy of Finland project "Design and Verification Methods for Massively Parallel Distributed Systems (DeVeMaPa)". In the project we will develop methodology for the design and verification of massively parallel heterogeneous computing. We need new methods to support the massive increase in the amount of parallelism at all levels of the hardware/software stack. Such massive increases in parallelism will make some currently used programming paradigms infeasible, and thus new methods need to be devised to cope with industrial Big Data use cases. These methods must also be accompanied by solid theoretical foundations, allowing for the development of the automated testing and verification tools that are required to validate the parallelization runtime software before production deployment. A key challenge is the seamless integration of heterogeneous computing with GPUs: how can it all be handled in a unified Big Data programming framework?

Graphs are a natural computational model for various problems in bioinformatics involving high-throughput sequencing data. We are currently working on two such applications: (1) pan-genome graphs, representing not only a single reference genome but also all mutations observed in a population, and (2) assembly graphs used to assemble viral quasi-species (e.g. SARS-CoV-2). In this project you will develop graph algorithms, and implement and test them on simulated and real biological data, for one of these two application areas. This project can be extended into a Master's thesis after the internship.

The task of the summer trainee is to help improve the home exercises and design an automated exam for the upcoming Autumn 2021 MOOC course on Big Data Platforms. The research group has already designed and run the course once, in Autumn 2020. The main task of the trainee is to improve the existing systems so that they scale to larger student volumes, especially by automating the exam system and improving the scalability of the home exercises. The main requirements are good programming skills and an interest in learning the latest Big Data technologies with the help of the other members of the research group supervising the project.

We offer an MSc thesis position that can be started during the summer. The position has an option to continue the work as a PhD student. The student will develop machine learning approaches for analyzing brain imaging data (EEG and fNIRS) in the context of new types of brain-computer interfacing. The work is conducted as part of an Academy of Finland funded project and involves access to unique brain imaging data and the possibility to test and develop the methods in interaction with both machine learning and neuroscience experts. The student should have experience and basic knowledge in machine learning, neural networks, and the practical implementation of these in Python (TensorFlow or a similar framework). An interest in human cognition and brain-computer interfaces is an advantage.

Ultrasound can be used for non-invasive monitoring of what is inside a structure, but direct physical modeling of ultrasound propagation is slow and difficult for all but very simple cases. We develop physically motivated machine learning models for ultrasound propagation in complex environments, based on Bayesian models (e.g. Gaussian processes) and deep learning, to be used for building smart sensing networks for industrial equipment. This is done based on both computational simulations and laboratory experiments conducted by our collaborators at the Physics department.

We are looking for an intern to develop machine learning models for various tasks in this general domain, and potentially to develop real-time control for the lasers used for inducing the ultrasound. An ideal candidate has completed a few machine learning courses (e.g. Bayesian Machine Learning, Deep Learning, or Advanced Course in Machine Learning) with good grades before summer. Studies in physics, signal processing and statistics are considered a strength, and physics and statistics students are also encouraged to apply. The topic is suitable for an MSc thesis.

We develop models for describing and predicting how players behave in mobile games, by combining cognitive theory of human behavior with probabilistic machine learning methods. This allows, for example, understanding risk-taking behavior based on limited observations, simulating how a specific individual would play a game, and even identifying possible actions that could be carried out to increase engagement with the game. The project is carried out in collaboration with HCI researchers from Aalto University and a Finnish game company providing access to data collected from real players.

We are looking for an intern to help with various data processing tasks, as well as to take part in developing the computational models together with others already working on the project. The exact scope of tasks can be determined flexibly, based on the applicant's qualifications. We are looking for applicants who are interested in machine learning, cognitive psychology, economic decision making, HCI, and/or games research. The topic is suitable for an MSc thesis.

The genome of an organism can be investigated with DNA sequencing. DNA sequencing breaks the genome into small fragments and reports the nucleotide sequence of these fragments, i.e. substrings of the genome. The Nanopore MinION sequencing device is about the size of a mobile phone and allows sequencing in the field. However, while DNA sequencing can be performed with these portable devices in the field, the analysis of the data currently requires large data centers.

In this project, you will experiment with implementing some DNA sequencing analysis algorithms on portable devices. Possible methods to implement include, for example, sequencing error correction and de Bruijn graph construction. The exact scope of the project will be agreed with the summer worker. The topic can be tailored so that it is suitable for a Master's thesis.

Programming skills and knowledge of algorithms and data structures are needed. Knowledge of biology or bioinformatics is beneficial, but not necessary.
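
For readers unfamiliar with the de Bruijn graph construction mentioned above, the following is a minimal in-memory sketch, assuming error-free reads given as plain strings. The function name de_bruijn_graph and the toy reads are hypothetical; an implementation aimed at portable devices would need far more compact data structures.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph from sequencing reads.

    Nodes are (k-1)-mers. Every k-mer occurring in a read contributes an
    edge from its (k-1)-length prefix to its (k-1)-length suffix.
    Returns a dict mapping each node to a sorted list of its successors.
    """
    successors = defaultdict(set)
    for read in reads:
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            successors[kmer[:-1]].add(kmer[1:])
    return {node: sorted(nexts) for node, nexts in successors.items()}


# Two overlapping toy reads (substrings of the same made-up genome).
graph = de_bruijn_graph(["ACGTAC", "GTACG"], k=3)
# graph == {"AC": ["CG"], "CG": ["GT"], "GT": ["TA"], "TA": ["AC"]}
```

Because overlapping reads share k-mers, they collapse onto the same nodes and edges, which is what makes the graph useful for assembly.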

Using health data for analytics and research holds great promise, but the sensitive nature of the data requires additional precautions to ensure privacy. In this project you will participate in developing tools for understanding the privacy risks in the use of health data and/or tools for facilitating the privacy-preserving use of such data.

The detection of similar gene sequences across a diverse range of genomes (gene homology detection) is a crucial analytical step to annotate and derive the molecular function of proteins. The classical solution involves dynamic programming algorithms that compute the edit distance between two protein sequences. However, such protein alignment algorithms usually return only *one* optimal alignment, whereas in reality there might be *many* optimal and *sub-optimal* alignments, with no way of deciding which one is the true one. In this project you will develop dynamic programming algorithms that "reliably" address these two issues, and test them on biological data. This project can be extended into a Master's thesis after the internship.
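
To make the "many optimal alignments" point concrete, here is a small sketch of the classical edit-distance dynamic program, extended to count how many distinct optimal alignments exist. It assumes unit costs for insertions, deletions and substitutions, which is a simplification (real protein alignment typically uses substitution matrices and gap penalties), and the function name edit_distance_with_count is hypothetical.

```python
def edit_distance_with_count(a, b):
    """Return (edit distance, number of optimal alignments) of a and b.

    dist[i][j] is the unit-cost edit distance between a[:i] and b[:j];
    ways[i][j] counts the distinct optimal alignments of those prefixes.
    """
    n, m = len(a), len(b)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    ways = [[0] * (m + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0:  # delete a[i-1]
                candidates.append((dist[i - 1][j] + 1, ways[i - 1][j]))
            if j > 0:  # insert b[j-1]
                candidates.append((dist[i][j - 1] + 1, ways[i][j - 1]))
            if i > 0 and j > 0:  # match or substitute
                cost = 0 if a[i - 1] == b[j - 1] else 1
                candidates.append((dist[i - 1][j - 1] + cost, ways[i - 1][j - 1]))
            best = min(d for d, _ in candidates)
            dist[i][j] = best
            # Sum over every predecessor that achieves the optimum.
            ways[i][j] = sum(w for d, w in candidates if d == best)
    return dist[n][m], ways[n][m]


print(edit_distance_with_count("AB", "BA"))  # (2, 3)
```

Even for these two-letter strings there are three different alignments with the same optimal cost, and a standard traceback would report only one of them.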

This task focuses on developing and evaluating methods for detecting temporal semantic shifts in large collections of text. A change in word meaning or word usage over time may indicate crucial events. For example, "home office" or "vaccine" are used in completely different contexts in 2020 than before. Usually, a change in word usage is more subtle but can still be detected automatically from text. This task has many real-world applications, one of which is news monitoring, which is relevant for the ongoing project in our group (Embeddia, http://embeddia.eu/).

The intern will run computational experiments using large collections of news data, in Finnish and other languages. The working instruments will include word and document embeddings, topic modeling, and recursive neural networks. The intern should have good programming skills and basic knowledge of machine learning. Interest in NLP is a plus. The topic is suitable for an MSc thesis.

We are looking for multiple interns to work on tools and techniques for the efficient development and operation of machine learning systems. For machine learning systems to work in practice, new ways are needed to ensure their correct and efficient operation as well as their smooth development and maintenance. In particular, testing of AI systems and continuous integration (CI/CD) are the focus of our new research project. The work involves implementing research prototypes to try out ideas and performing measurements. We can flexibly tailor the work to match the applicant's profile. Applicants are expected to have good coding skills. Experience in machine learning and software engineering is useful.

The positions are related to the following projects:

The internships can be extended after the summer into MSc thesis worker or part-time research assistant positions.

We are looking for talented students with a strong background and interest in mathematics or theoretical computer science. The research problem is to find the worst-case size bound for a graph pattern matching query. An important previous work on relation size bounds is https://arxiv.org/pdf/1711.03860.pdf. The applicant is expected to write a Master's thesis based on this topic.

Helsinki Institute for Information Technology (HIIT) can support a limited number of summer trainee positions within its four focus areas, which are Artificial Intelligence, Computational Health, Cybersecurity and Data Science. The focus area of Artificial Intelligence is linked to the Finnish Center for Artificial Intelligence (FCAI). However, the AI projects supported in this summer trainee call span all of AI, and need not be linked to the FCAI programs or highlights.

There are no pre-defined positions here, but you can choose a focus area if you would like to sign up as a potential candidate and discuss project ideas with the focus area PIs. The individual positions listed in the summer internship call are given preference over the focus area slots, so it is advisable to choose one of those positions if any of them interests you.

Helsinki Institute for Information Technology (HIIT) can support a limited number of summer trainee positions within its four focus areas, which are Artificial Intelligence, Computational Health, Cybersecurity and Data Science. The focus area of Computational Health aims to develop theory and computational methods for complex data and systems underpinning data-driven medicine and healthcare, to develop efficient and accurate next-generation analytics tools to tackle key problems in genomic analysis, computational metabolomics, and drug resistance and network pharmacology, and to bridge the gap to clinical practice through collaboration with top medical research groups and hospitals.

There are no pre-defined positions here, but you can choose a focus area if you would like to sign up as a potential candidate and discuss project ideas with the focus area PIs. The individual positions listed in the summer internship call are given preference over the focus area slots, so it is advisable to choose one of those positions if any of them interests you.

Helsinki Institute for Information Technology (HIIT) can support a limited number of summer trainee positions within its four focus areas, which are Artificial Intelligence, Computational Health, Cybersecurity and Data Science. The focus area of Cybersecurity is linked to the Helsinki-Aalto Institute for Cybersecurity (HAIC), which is an initiative for excellence in cybersecurity research and education. Topics of interest include (mobile) platform security, machine learning and security, 5G security, applied cryptography, security protocol engineering, network security, security for ubiquitous computing, protocol analysis, formal verification, foundations of cryptography, white-box cryptography, blockchains and consensus, and stylometry and security.

There are no pre-defined positions here, but you can choose a focus area if you would like to sign up as a potential candidate and discuss project ideas with the focus area PIs. The individual positions listed in the summer internship call are given preference over the focus area slots, so it is advisable to choose one of those positions if any of them interests you.

Helsinki Institute for Information Technology (HIIT) can support a limited number of summer trainee positions within its four focus areas, which are Artificial Intelligence, Computational Health, Cybersecurity and Data Science. The focus area of Data Science is linked to the Helsinki Centre for Data Science (HiDATA). The aim is to solve significant societal and industrial challenges related to data analysis.

There are no pre-defined positions here, but you can choose a focus area if you would like to sign up as a potential candidate and discuss project ideas with the focus area PIs. The individual positions listed in the summer internship call are given preference over the focus area slots, so it is advisable to choose one of those positions if any of them interests you.