DMS Lab

Data Management & Systems
University of Copenhagen (DIKU)

Overview

The DMS Lab conducts research in emerging areas that pose new challenges in data management. The lab is part of the APL Section, which reflects our interest in bringing together software engineering and data management. Projects range from the design of spatial databases, including map visualizations and generalizations, to main memory transactional databases, games and simulations, multidimensional indexing, information retrieval and data integration. We are keen on validating our work experimentally: we love writing code, which is not to say that our love for the blackboard is in any way diminished. :-)

When conducting our work we usually resort to one or more of the following:

  • Abstractions & Languages
  • Combinatorial Optimization
  • Data layout, Indexing and Data structures
  • System Implementation & Design
  • Statistics & Prediction
  • Parallelism & Distribution

People

Faculty

PhD students

Alumni

Projects

Current Projects

Online Transaction Processing Clouds

With changing architectural and application trends, we are revisiting the design of online transaction processing (OLTP) databases. In this project, we are investigating the design tradeoffs of OLTP databases in light of the demands of evolving online workloads. We are studying novel approaches for main memory transaction processing clouds that allow ease of administration, high resource utilization and a flexible programming model. We are currently building a prototype system that demonstrates these properties.

Open Geodata Serving

In collaboration with the Danish Geodata Agency, we have explored new approaches to cook and serve geodata to the public on the Web. A main challenge in cartography is that producing high-quality maps over complex shapes requires the craft of human expertise. However, given the explosion in geospatial data, the pressure for high-productivity cartography tools is increasing at a fast pace. Our work has explored how to create a new class of declarative cartography tools. Our language CVL, the Cartographic Visualization Language, can be processed entirely within a spatial DBMS, opening up exciting opportunities for automatic optimization and scalability. In a separate line of work, we have also analyzed production logs of map-serving web services. These logs reveal strong spatial and temporal concentration patterns, which can be exploited for more efficient caching.

Behavioral Simulations and Computer Games

In collaboration with the Cornell Database Group, we have worked on a new scripting platform for games and agent-based simulations. Our recent work in this project has focused on iterated spatial join techniques optimized for main memory, as well as on communication optimizations, especially for latency, in cloud environments. We have also explored techniques for automatic parallelization of large-scale behavioral simulations, as well as efficient checkpoint-recovery techniques for Massively Multiplayer Online Games (MMOs).

Past Projects

Multidimensional Indexing and Large Main Memories

We have also studied index structures for either read-intensive or write-intensive workloads. For the first class of workloads, we have studied experimentally, together with collaborators from Saarland University and ETH Zurich, the performance of one specific index structure, the Dwarf index. For the second class of workloads, we have studied how to answer queries over collections of moving objects, e.g., for vehicle tracking or spatial agent-based simulations. The problem is challenging because these applications have very high update rates that result from continuous movement. Our technique, MOVIES, is based on frequently rebuilding index snapshots in main memory. Using data partitioning over multiple nodes in a small cluster, we have scaled MOVIES up to 100 million moving objects over the road network of Germany, while keeping snapshot latencies below a few seconds.
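As a rough illustration of the snapshot-rebuild idea described above, the sketch below keeps the latest position of each object in a plain dictionary and periodically rebuilds a read-only grid from it. It is not the MOVIES implementation: the class name, the uniform grid layout, and all method names are hypothetical, chosen only to show why updates stay cheap and why queries may see slightly stale positions until the next rebuild.

```python
from collections import defaultdict


class SnapshotGridIndex:
    """Toy snapshot-rebuild index over 2D points (illustrative only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.positions = {}  # latest known position per object id
        self.snapshot = {}   # read-only grid: cell -> list of (id, x, y)

    def update(self, obj_id, x, y):
        # An update only overwrites the latest position;
        # the current snapshot is left untouched.
        self.positions[obj_id] = (x, y)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def rebuild(self):
        # Build a fresh grid from scratch and swap it in as the new snapshot.
        grid = defaultdict(list)
        for obj_id, (x, y) in self.positions.items():
            grid[self._cell(x, y)].append((obj_id, x, y))
        self.snapshot = dict(grid)

    def range_query(self, x_min, y_min, x_max, y_max):
        # Queries read only the snapshot, so results reflect the state
        # as of the last rebuild, not the most recent updates.
        cx_min, cy_min = self._cell(x_min, y_min)
        cx_max, cy_max = self._cell(x_max, y_max)
        hits = []
        for cx in range(cx_min, cx_max + 1):
            for cy in range(cy_min, cy_max + 1):
                for obj_id, x, y in self.snapshot.get((cx, cy), ()):
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append(obj_id)
        return sorted(hits)
```

In this sketch, the time between two calls to rebuild() bounds how stale a query answer can be, which loosely corresponds to the snapshot latency mentioned above; partitioning across nodes is not modeled.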

Dataspaces and Personal Information Management

In early work at the ETH Zurich Systems Group, we co-designed the iMeMex Dataspace Management System, a hybrid information integration architecture that allows users to transition from search to data integration in a pay-as-you-go fashion. Unlike a traditional relational DBMS, iMeMex does not take full control of the data, but offers services over one's complex personal dataspace. We have explored several interesting themes in the design of iMeMex, such as the definition of a unified data model for personal information, a novel technique based on mapping hints (called trails) to increase the level of integration of personal information over time, and search over graphs of user data created by view definitions.

Publications

Below we list our international peer-reviewed publications in journals and conferences (workshops excluded).

Courses at DIKU

Bachelor Courses

Master Courses

Research seminars and reading groups