Interactive and perceptually accurate visualization of multidimensional data.

Emerging Big Data (data-intensive) science and engineering workflows require new data analysis tools to help researchers move from data to insight. Since the highest-bandwidth input to the human brain is through the visual system, it makes sense that new data analysis tools harness the power of data visualization. New research is needed to unlock the potential of visualization in this context. Specific research questions include: (1) How can new computer graphics rendering algorithms be best utilized to turn data into a visual form that scientists can understand? (2) To what extent can we validate that the human visual system can correctly interpret these visual representations of data? (3) How can we build effective interactive data visualizations that support exploratory data analysis through interactive querying? (4) To what extent can emerging user interface hardware (e.g., multi-touch displays, 3D depth cameras, optical tracking of gestures, voice input) be utilized to make interactive data visualizations more effective for working with today’s massive and complex real-world datasets?

Large-scale machine learning for data-driven discovery.

Large-scale machine learning problems demand scalable algorithms to extract patterns. We investigate novel mathematical algorithms to solve large-scale optimization problems that arise in machine learning and graph analysis. Machine learning in the context of extremely large datasets requires distribution of data and/or computation. We use proximal splitting methods to split a general machine learning problem into separate intermediate steps, each of which can be easily optimized or solved in closed form. The result is a class of first-order methods capable of converging to the general optimal solution. These methods scale to very large problem sizes, but their convergence rate can vary widely in ways that are hard to predict. Using spectral methods, we are able to speed up convergence substantially. The goal of this project is to gain a better understanding of the convergence behavior and to use this understanding to construct accelerated algorithms with more consistent convergence properties. This will allow the application of machine learning techniques to a much wider class of problems.
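
As a rough illustration of the splitting idea described above, the sketch below applies forward-backward (proximal gradient) splitting to a small lasso problem. It is not the project's algorithm: the lasso objective, the toy data, the fixed step size, and all function names are assumptions chosen for illustration, and the spectral acceleration mentioned above is not included. Each iteration alternates a gradient step on the smooth least-squares term with a soft-thresholding step, which is the closed-form proximal operator of the L1 penalty.

```python
# Forward-backward (proximal gradient) splitting for a toy lasso problem:
#   minimize 0.5 * ||A x - b||^2 + lam * ||x||_1
# Illustrative sketch only; problem, data, and names are assumptions.

def soft_threshold(v, t):
    """Closed-form proximal operator of t * |.| applied to a scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def objective(A, b, x, lam):
    """Value of the full (smooth + nonsmooth) objective at x."""
    residual = [p - q for p, q in zip(matvec(A, x), b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)

def proximal_gradient(A, b, lam, step, iterations=500):
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        # Step 1 (smooth part): gradient of the least-squares term.
        residual = [p - q for p, q in zip(matvec(A, x), b)]
        grad = rmatvec(A, residual)
        # Step 2 (nonsmooth part): soft-thresholding, solved in closed form.
        x = [soft_threshold(xi - step * g, step * lam) for xi, g in zip(x, grad)]
    return x

# Toy data. The step size is below 1/L, where L = 3 is the largest
# eigenvalue of A^T A for this particular matrix.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 0.0, 1.0]
x_hat = proximal_gradient(A, b, lam=0.1, step=0.3)
```

For this toy problem the iterates settle near x = (0.95, 0), with the second coordinate driven exactly to zero by the thresholding step. With a fixed step size, the number of iterations needed depends strongly on the conditioning of A, which is one simple way to see why convergence rates can be hard to predict.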

Enhancing spatial perception and presence in immersive virtual environments.

Immersive virtual reality technology offers tremendous potential for fundamental and transformative advances in education, training, rehabilitation, architectural design, psychotherapy, and a wide range of other application areas. To help realize this potential, our lab is addressing key challenges in data acquisition and model building, 3D self-representation and body tracking, spatially aware locomotion, and multi-user interaction, in collaboration with colleagues from the Department of Architecture in the College of Design at UMN.

Big Tensor Mining: Theory, Scalable Algorithms, and Applications

Given triplets of facts (subject-verb-object), like ('Washington' 'is the capital of' 'USA'), can we find patterns, new objects, new verbs, and anomalies? Can we correlate with brain scans, to discover which parts of the brain get activated, say, by tool-like nouns ('hammer'), or action-like verbs ('run')? Similarly, given a who-recommends-what recommendation system, with rich side information (e.g., user-product review texts), can we better explain recommendations and detect clusters and anomalies? Our project is designed to develop a unified coupled tensor factorization framework to systematically mine such datasets by focusing on two broad research thrusts: (i) new theory and methods for big sparse tensor and coupled tensor/matrix factorization, and (ii) developing scalable methods and algorithms based on map-reduce and multi-core processing that will allow these methods to operate on terabyte- and petabyte-scale datasets.
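
To make the triplet-to-tensor idea above concrete, the following sketch stores subject-verb-object facts as a sparse three-way tensor and extracts a single rank-one component with a higher-order power iteration. This is a deliberately small stand-in, not the project's coupled factorization framework: the extra triplets beyond the 'Washington' example, the function names, and the choice of a rank-one model are all assumptions made for illustration.

```python
import math

def index_terms(terms):
    """Map each distinct term to an integer index, in first-seen order."""
    lookup = {}
    for term in terms:
        lookup.setdefault(term, len(lookup))
    return lookup

def build_sparse_tensor(triplets):
    """Store (subject, verb, object) counts as {(i, j, k): value}."""
    subjects = index_terms(s for s, _, _ in triplets)
    verbs = index_terms(v for _, v, _ in triplets)
    objects = index_terms(o for _, _, o in triplets)
    entries = {}
    for s, v, o in triplets:
        key = (subjects[s], verbs[v], objects[o])
        entries[key] = entries.get(key, 0.0) + 1.0
    return entries, (subjects, verbs, objects)

def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

def rank_one_factorization(entries, shape, iterations=20):
    """Higher-order power iteration touching only the nonzero entries."""
    a = normalize([1.0] * shape[0])
    b = normalize([1.0] * shape[1])
    c = normalize([1.0] * shape[2])
    for _ in range(iterations):
        new_a = [0.0] * shape[0]
        for (i, j, k), val in entries.items():
            new_a[i] += val * b[j] * c[k]
        a = normalize(new_a)
        new_b = [0.0] * shape[1]
        for (i, j, k), val in entries.items():
            new_b[j] += val * a[i] * c[k]
        b = normalize(new_b)
        new_c = [0.0] * shape[2]
        for (i, j, k), val in entries.items():
            new_c[k] += val * a[i] * b[j]
        c = normalize(new_c)
    weight = sum(val * a[i] * b[j] * c[k] for (i, j, k), val in entries.items())
    return weight, a, b, c

# Hypothetical toy facts; only the first one appears in the text above.
triplets = [
    ("Washington", "is the capital of", "USA"),
    ("Paris", "is the capital of", "France"),
    ("Ottawa", "is the capital of", "Canada"),
    ("hammer", "is a", "tool"),
]
entries, (subjects, verbs, objects) = build_sparse_tensor(triplets)
shape = (len(subjects), len(verbs), len(objects))
weight, a, b, c = rank_one_factorization(entries, shape)
# Verb with the largest loading in the extracted component.
top_verb = max(verbs, key=lambda v: b[verbs[v]])
```

Because every update loops only over the stored nonzero entries, the cost per iteration grows with the number of observed facts rather than with the full tensor size, which is the property that makes sparse formulations attractive at scale. A full factorization would extract several components and, in the coupled setting, share factors with side-information matrices.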

Learning Data Analytics: Providing Actionable Insights to Increase College Student Success

National studies report that the average six-year graduation rate across higher-education institutions has been around 59% for over 15 years, while fewer than half of college graduates finish within four years. This has high human, monetary, and societal costs in terms of workforce development, economic activity, and national productivity. Our project will develop new computational methods to analyze large and diverse types of education and learning-related data to improve undergraduate education by addressing three critical issues: (a) discovering successful academic pathways for students; (b) improving pedagogy for instructors; and (c) enhancing student retention and persistence for institutions.

Analyzing the Earth system using graph-based approaches.

Climate change is a defining environmental challenge facing our planet as rising temperatures and the transformation of global ecosystems are placing unprecedented strains on natural resources, man-made infrastructure, and our society. Responding to this challenge is difficult because of great uncertainty in the projections of future climate. Among the primary tools used to study the climate system are general circulation models (GCMs), which are composed of interacting subsystems that implement a number of fundamental physical, chemical, and/or biological processes. As GCMs continue to grow in complexity and the simulations are run at higher spatial and temporal resolutions, it is increasingly difficult to analyze the data they produce. The rapid gains in computational power and storage capacity have resulted in a data deluge for climate science. Thus, there is a strong need for transformative advances in the tools and methodologies for model evaluation. This project aims to address the need for better understanding of interactions within the climate system and for improved model evaluation by developing a tool set for analyzing graphs that represent observed or simulated climate data.
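
One common way to represent climate data as a graph, shown here only as an assumed example, is a correlation network: each node is a location (such as a grid cell), and an edge connects two locations whose time series are strongly correlated. The sketch below builds such a graph from toy data. The cell names, the anomaly values, and the 0.9 threshold are hypothetical, and the project's actual graph construction may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def build_correlation_graph(series, threshold):
    """Connect two locations when |correlation| meets the threshold."""
    names = sorted(series)
    graph = {name: set() for name in names}
    for idx, u in enumerate(names):
        for v in names[idx + 1:]:
            if abs(pearson(series[u], series[v])) >= threshold:
                graph[u].add(v)
                graph[v].add(u)
    return graph

def degrees(graph):
    """Number of neighbors per node, a simple measure of connectivity."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

# Hypothetical temperature anomalies at three grid cells over five time steps.
series = {
    "cell_a": [0.1, 0.3, 0.2, 0.5, 0.4],
    "cell_b": [0.2, 0.6, 0.4, 1.0, 0.8],
    "cell_c": [0.5, 0.1, 0.4, 0.2, 0.3],
}
graph = build_correlation_graph(series, threshold=0.9)
node_degrees = degrees(graph)
```

Building the same graph from observations and from model output, then comparing properties such as node degree, is one simple way a graph representation could support model evaluation. The all-pairs loop here is quadratic in the number of locations, so high-resolution data would need a more scalable construction.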

Online Social Support in Health and Wellness Contexts

People find support, advice, and a community in online spaces that focus on specific health conditions. We have been working with two large online communities where such support is readily exchanged. The first community focuses on enabling peer support in recovery from substance use disorders (e.g., alcoholism). The second community focuses on empowering the friends and family members of patients with serious health conditions (e.g., cancer) to provide emotional and instrumental support. We have significant data sets from both communities, as well as smaller sets of self-report questionnaire and qualitative data from their members.

Students will be asked to contribute to ongoing projects that investigate one or more of the following questions: What characterizes active membership in these communities? How do members give and receive support in these communities? How can these communities better support their members? Students may use a variety of mixed-methods approaches in answering these questions. Students with experience and interest in any of the following methods are particularly encouraged to apply: Natural Language Processing, Content Analysis and Grounded Theory, Machine Learning, Data Mining, and Social Network Analysis.

All potential students are advised to review Dr. Yarosh's "Work with Me" policies and follow the application process described there.