Data quality challenges with missing values and mixed types in joint sequence analysis

Publication Type

Conference Paper

Date Published

12/2017

Authors

Lazar, Alina, Ling Jin, C Anna Spurlock, Annika Todd-Blick, Kesheng Wu, Alex Sim

DOI

10.1109/BigData.2017.8258222

Abstract

The goal of this paper is to investigate the impact of missing values in categorical time series sequences on common data analysis tasks. Being able to more effectively identify patterns in socio-demographic longitudinal data is an important component in a number of social science settings. However, performing fundamental analytical operations, such as clustering to group these data based on similarity patterns, is challenging due to the categorical and multi-dimensional nature of the data, and their corruption by missing and inconsistent values. To study these data quality issues, we employ longitudinal sequence data representations, a similarity measure designed for categorical and longitudinal data, together with state-of-the-art clustering methodologies reliant on hierarchical algorithms. The key to quantifying the similarity and difference among data records is a distance metric. Given the categorical nature of our data, we employ an “edit” type distance using Optimal Matching (OM). Because each data record has multiple variables of different types, we investigate the impact of mixing these variables in a single similarity measure. Comparing variables with binary values and those with multiple nominal values, we find that overcoming missing data problems is harder in the nominal domain than in the binary domain. Additionally, artificial clusters introduced by the alignment of leading missing values can be resolved by tuning the missing value substitution cost parameter.
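The edit-type distance described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `om_distance`, the `MISSING` marker, the default costs, and the choice to charge a separate `missing_cost` whenever exactly one of the two aligned states is missing are all assumptions made for illustration. It shows how a tunable missing-value substitution cost changes the distance between two categorical sequences.

```python
from typing import Hashable, Optional, Sequence

# Hypothetical marker for a missing state (an assumption, not from the paper).
MISSING = None


def om_distance(
    a: Sequence[Optional[Hashable]],
    b: Sequence[Optional[Hashable]],
    sub_cost: float = 2.0,
    indel_cost: float = 1.0,
    missing_cost: float = 2.0,
) -> float:
    """Edit-type distance between two categorical sequences.

    Assumed cost model: identical states cost 0, a substitution involving
    exactly one missing state costs `missing_cost`, any other substitution
    costs `sub_cost`, and each insertion or deletion costs `indel_cost`.
    """

    def substitution(x, y) -> float:
        if x == y:
            return 0.0
        if x is MISSING or y is MISSING:
            return missing_cost
        return sub_cost

    n, m = len(a), len(b)
    # prev[j] holds the cost of aligning the first i-1 states of a
    # with the first j states of b.
    prev = [j * indel_cost for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel_cost] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j] + indel_cost,        # delete a[i-1]
                curr[j - 1] + indel_cost,    # insert b[j-1]
                prev[j - 1] + substitution(a[i - 1], b[j - 1]),
            )
        prev = curr
    return prev[m]
```

Under these assumed costs, `om_distance([MISSING, "A", "B"], ["A", "A", "B"], missing_cost=0.5)` treats the leading missing value as cheap to replace, while raising `missing_cost` to 2.0 makes the same pair as distant as two sequences that differ in one observed state. A pairwise matrix of such distances could then be passed to a hierarchical clustering routine.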

Journal

2017 IEEE International Conference on Big Data (Big Data)

Year of Publication

2017

Organization

Energy Markets and Policy Department, Energy Technologies Area, Energy Analysis and Environmental Impacts Division

Research Areas

Sustainable Energy Systems, Energy & Data Behavior Analytics, EAEI Energy Markets & Policy, Metrics and evaluation, Efficiency, Electrification, and Flexibility, Energy Equity 2, Behavior Analytics 2