Lossless Compression of High-volume Numerical Data from Simulations pdf

**project girl** · 24-11-2012, 12:25 PM

Lossless Compression of High-volume Numerical Data from Simulations

.pdf

1Lossless Compression.pdf (Size: 221.26 KB / Downloads: 81)

Abstract

Applications in scientific computing operate with high-volume numerical data and
the occupied space should be reduced. Traditional compression algorithms cannot
provide sufficient compression ratio for such kind of data. We propose a lossless algorithm
of delta-compression (a variant of predictive coding) that packs the higher-order
differences between adjacent data elements. The algorithm takes into account varying
domain (typically, time) steps. The algorithm is simple, it has high performance and
delivers high compression ratio for smoothly changing data. Both lossless and lossy
variants of the algorithm can be used. The algorithm has been successfully applied
to the output from a simulation application that uses a solver of ordinary differential
equations.

Introduction

Applications in scientific computing often operate with large volumes of output and input
data. Despite the huge capacity of the modern disk devices, the space required for data storage
is often larger than the hardware allows. Data transfer over communication networks
is another bottleneck in scientific computing. Text compression tools cannot compress
binary numeric data. Image compression algorithms are not intended for such data. Signal
compression algorithms don’t work directly with numerical data of time series, when time
steps are varying. Also, they are typically designed for measured values, not for simulation
results and therefore not intended for lossless compression. Therefore some new data
compression algorithms must be designed.

Fixed step delta-compression

In this section we consider the simplest variant of delta-compression algorithm. Our algorithm
computes the first (and higher-order) differences using 64-bit integer arithmetic; after
that meaningless 0-s or 1-s are truncated in the result.

Using fixed step extrapolation

The algorithm of differences of ordermcan be formulated differently in terms of extrapolation
of order (m−1). When this reformulation is done, just slightly different8 computations
take place, and these can be seen from different point of view. We introduce the extrapolation
technique here (instead of differences) in order to proceed later to varying step
extrapolation algorithm in Section 4.

Varying step extrapolation algorithm

The previous algorithms assumed that the sequence a can be approximated by polynomials
with fixed steps between extrapolation points. However in practice, simulations use adaptive
ODE solvers and produce state variable values for varying, non-equidistant time steps.
Therefore we should consider smooth functions with values taken with varying steps, and
adapt the compression algorithm for such application data.

Application to simulation results

The compression algorithms were applied to the output from a numerical solver of ordinary
differential equations, which serves as a component in our software for dynamic simulation
of bearing [1]. The program produces some quantities for every time step and writes them
to the output file for analysis and further simulation. Every quantity (position, force etc.)
changes very slowly from one step to another. Extreme accuracy and lossless compression
is necessary, since a relative error of order 10−10 can substantially change the simulation
results.
To chose a particular algorithm and its order the compression routines estimate achievable
compression ratio by subsampling, trying different algorithms and choose the best
one.

Conclusion

A lossless algorithm for floating-point data compression has been developed. It has similarities
to image compression, since it works on bit level. It resembles wavelet compression
since it uses floating-point computations for compression and decompression. The algorithm
works best if the data are values of a function in some points, and this function is
close to a polynomial.
The algorithm uses subtraction of one 64-bit integer representation of floating-point
value from another (Int(aj)−Int('m(tj))). If the difference would be computed between
floating-point representations (aj − 'm(tj)) there would be no win in data storage.
The algorithm is implemented as a C++ class and linked to an industrial-level application.
The measurements show high compression ratio (in comparison with traditional tools)
as well as high speed[5]. In the future we are going to test and measure the algorithm with
the data samples from other applications.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Software Crisis pdf	study tips	1	2,117	21-09-2017, 04:31 PM Last Post: jaseela123
	Ranked, Efficient and Secure Keyword search over encrypted cloud data PPT	seminar post	1	814	21-09-2017, 11:55 AM Last Post: jaseela123
	Data Mining: What is Data Mining? Report	project girl	1	2,262	21-09-2017, 11:47 AM Last Post: jaseela123
	HOW EMAIL WORKS pdf	project girl	1	3,067	20-09-2017, 11:39 AM Last Post: jaseela123
	Cyber crime detection, investigation and prosecution pdf	seminar projects maker	1	958	20-09-2017, 11:31 AM Last Post: jaseela123
	Review: Context Aware Tools for Smart Home Development pdf	study tips	1	1,227	20-09-2017, 11:22 AM Last Post: jaseela123
	DEMONSTRATING DATAPOSSESSION AND UN CHEATABLE DATA TRANSFER	seminar flower	1	1,466	19-09-2017, 11:05 AM Last Post: jaseela123
	Getting Started with the MAXQ1103 Evaluation Kit and the CrossWorks Compiler pdf	project girl	1	969	15-09-2017, 03:11 PM Last Post: jaseela123
	Wireless Application Protocol (WAP) pdf	project girl	1	1,531	15-09-2017, 02:42 PM Last Post: jaseela123
	Processing of collected data PPT	seminar projects maker	1	718	15-09-2017, 12:48 PM Last Post: jaseela123

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.