Skip Navigation Links

Project Information

EFFICIENT DATA REDUCTION AND SUMMARIZATION

Agency:
NSF

National Science Foundation

Project Number:
1444124
Contact PI / Project Leader:
LI, PING
Awardee Organization:
RUTGERS THE ST UNIV OF NJ NEW BRUNSWICK

Description

Abstract Text:
The ubiquitous phenomenon of massive data (including data streams) imposes considerable challenges in data visualization and exploratory data analysis. About 15 years ago, terabyte datasets were still considered `ridiculous.' However, modern datasets managed by Stanford Linear Acceleration Center (SLAC), NASA, NSA, etc. have reached the perabyte scale or larger. Corporations such as Amazon, Wal-Mart, Ebay, and search engine firms are also major generators and users of massive data. The general theme of data reduction and summarization has become an active and highly inter-disciplinary area of research. This project proposes to develop various approximation techniques, which generate a "fingerprint" or "sketch" of the massive data by transforming the original data. These `sketches' are reasonably small (hence easy to store) and can provide approximate answers which are usually good enough for practical purposes.

This proposal concerns the fundamental problems of processing/transforming massive (possibly dynamic) data. In particular, it focuses on (A) developing systematic fundamental tools for effective data reduction and efficient data summarization; (B) applying these tools to improve numerical analysis, visualization, and exploratory data analysis. Two lines of theoretically sound techniques for data reduction and summarization will be developed and further improved: (1) the method of stable random projections (SRP), effective in heavy-tailed data; (2) the method of Conditional Random Sampling (CRS), mainly for sparse data. Concrete applications of SRP and CRS will be investigated. Widely-used basic numerical algorithms can be rewritten by taking advantage of SRP or CRS. Popular methods/tools for exploratory data analysis will also benefit considerably from the development of data reduction techniques.
Project Terms:
Acceleration; Algorithms; Area; Data; Data Analyses; data reduction; Data Set; Development; Fingerprint; Imagery; improved; Methods; Process; Research; Sampling; sound; Stream; Tail; Techniques; tool; United States National Aeronautics and Space Administration

Details

Contact PI / Project Leader Information:
Name:  LI, PING
Other PI Information:
Not Applicable
Awardee Organization:
Name:  RUTGERS THE ST UNIV OF NJ NEW BRUNSWICK
City:  NEW BRUNSWICK    
Country:  UNITED STATES
Congressional District:
State Code:  NJ
District:  06
Other Information:
Fiscal Year: 2014
Award Notice Date: 05-Aug-2014
DUNS Number: 001912864
Project Start Date: 16-Apr-2014
Budget Start Date:
CFDA Code: 47.049
Project End Date: 28-Feb-2015
Budget End Date:
Agency: ?

Agency: The entity responsible for the administering of a research grant, project, or contract. This may represent a federal department, agency, or sub-agency (institute or center). Details on agencies in Federal RePORTER can be found in the FAQ page.

National Science Foundation
Project Funding Information for 2014:
Year Agency

Agency: The entity responsible for the administering of a research grant, project, or contract. This may represent a federal department, agency, or sub-agency (institute or center). Details on agencies in Federal RePORTER can be found in the FAQ page.

FY Total Cost
2014 NSF

National Science Foundation

$104,921

Results

i

It is important to recognize, and consider in any interpretation of Federal RePORTER data, that the publication and patent information cannot be associated with any particular year of a research project. The lag between research being conducted and the availability of its results in a publication or patent award varies substantially. For that reason, it's difficult, if not impossible, to associate a publication or patent with any specific year of the project. Likewise, it is not possible to associate a publication or patent with any particular supplement to a research project or a particular subproject of a multi-project grant.

ABOUT FEDERAL REPORTER RESULTS

Publications: i

Click on the column header to sort the results

PubMed = PubMed PubMed Central = PubMed Central Google Scholar = Google Scholar

Patents: i

Click on the column header to sort the results

Similar Projects

Download Adobe Acrobat Reader:Adobe Acrobat VERSION: 3.41.0 Release Notes
Back to Top