Optimizing software in C++

**seminar flower** · 26-05-2012, 10:45 AM

An optimization guide for Windows, Linux and Mac
platforms


Introduction

This manual is for advanced programmers and software developers who want to make their
software faster. It is assumed that the reader has a good knowledge of the C++
programming language and a basic understanding of how compilers work. The C++
language is chosen as the basis for this manual for reasons explained below.
This manual is based mainly on my study of how compilers and microprocessors work. The
recommendations are based on the x86 family of microprocessors from Intel, AMD and VIA
including the 64-bit versions. The x86 processors are used in the most common platforms
with Windows, Linux, BSD and Mac OS X operating systems, though these operating
systems can also be used with other microprocessors. Much of the advice may also apply to
other platforms and other compiled programming languages.

The costs of optimizing

University courses in programming nowadays stress the importance of structured and
object-oriented programming, modularity, reusability and systematization of the software
development process. These requirements often conflict with the requirements of
optimizing the software for speed or size.
Today, it is not uncommon for software teachers to recommend that no function or method
should be longer than a few lines. A few decades ago, the recommendation was the
opposite: Don’t put something in a separate subroutine if it is only called once. The reasons
for this shift in software writing style are that software projects have become bigger and
more complex, that there is more focus on the costs of software development, and that
computers have become more powerful.
The high priority of structured software development and the low priority of program
efficiency are reflected, first and foremost, in the choice of programming language and
interface frameworks. This is often a disadvantage for the end user who has to invest in
ever more powerful computers to keep up with the ever bigger software packages and who
is still frustrated by unacceptably long response times, even for simple tasks.

Choice of microprocessor

The benchmark performance of competing brands of microprocessors is very similar
thanks to heavy competition. Processors with multiple cores are advantageous for
applications that can be divided into multiple threads that run in parallel. Small lightweight
processors with low power consumption are actually quite powerful and may be sufficient for
less intensive applications.
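
As a rough illustration of dividing work into threads that run in parallel, here is a minimal sketch using std::thread from the C++ standard library. It is not taken from the manual. The function name parallel_sum and the parameter num_threads are hypothetical, and the sketch assumes the work (summing an array) splits cleanly into independent chunks.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums data[] using num_threads worker threads,
// each handling one contiguous chunk of the input.
long long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    std::vector<long long> partial(num_threads, 0);  // one result slot per thread
    std::vector<std::thread> workers;
    // Chunk size rounded up so that all elements are covered.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            // Accumulate in a local variable and store once at the end,
            // so threads do not write repeatedly to neighbouring slots.
            long long local = 0;
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            partial[t] = local;
        });
    }
    for (auto& w : workers) w.join();  // wait for all chunks to finish
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Whether this is actually faster than a single-threaded loop depends on the array size and the number of cores. For small inputs the cost of starting threads is likely to dominate.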

Choice of programming language

Before starting a new software project, it is important to decide which programming
language is best suited for the project at hand. Low-level languages are good for optimizing
execution speed or program size, while high-level languages are good for making clear and
well-structured code and for fast and easy development of user interfaces and interfaces to
network resources, databases, etc.
The efficiency of the final application depends on the way the programming language is
implemented. The highest efficiency is obtained when the code is compiled and distributed
as binary executable code. Most implementations of C++, Pascal and Fortran are based on
compilers.
Several other programming languages are implemented with interpretation. The program
code is distributed as source and interpreted line by line when it is run. Examples include
JavaScript, PHP, ASP and UNIX shell scripts. Interpreted code is very inefficient because the
body of a loop is interpreted again and again for every iteration of the loop.

Conclusion

Vectorized code often contains a lot of extra instructions for converting the data to the right
format and getting them into the right positions. The amount of extra data conversion and
shuffling that is needed determines whether it is profitable to use vectorized code or not.
The code in example 12.7 is slower than non-vectorized code on older processors, but
faster on processors with 128-bit execution units. The code in examples 12.6b and 12.6c is
faster than the non-vectorized code on all processors despite the extra data conversion,
packing and unpacking. This is because the bottleneck here is not data conversion and
packing, but division. Division is very time-consuming and there is a lot to save by doing
division in single precision vectors. The code in examples 12.8b and 12.8c benefits a lot from
vectorization.
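
The numbered examples referred to above are not reproduced in this post. To give an idea of what "data conversion plus division in single precision vectors" can look like, here is a minimal sketch of my own, assuming an x86 target with SSE2. The function name divide_ints and the macro SKETCH_HAVE_SSE2 are hypothetical, and a scalar fallback is used where SSE2 is not available.

```cpp
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Hypothetical helper: out[i] = float(in[i]) / divisor for i in [0, n).
void divide_ints(const int* in, float* out, std::size_t n, float divisor) {
    std::size_t i = 0;
#if SKETCH_HAVE_SSE2
    const __m128 div = _mm_set1_ps(divisor);  // divisor in all four lanes
    for (; i + 4 <= n; i += 4) {
        // Load four 32-bit integers (unaligned load).
        __m128i vi = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in + i));
        // Extra conversion step: integers to single precision floats.
        __m128 vf = _mm_cvtepi32_ps(vi);
        // Four divisions in one instruction, then store the results.
        _mm_storeu_ps(out + i, _mm_div_ps(vf, div));
    }
#endif
    // Scalar loop for the remaining elements (or for all of them without SSE2).
    for (; i < n; ++i) out[i] = static_cast<float>(in[i]) / divisor;
}
```

The conversion instruction is the kind of overhead the conclusion talks about. Here it is cheap compared to the division, which is why a loop of this shape tends to gain from vectorization.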
