Copyright 1994, 1995, 1996, 1999, 2000, 2001, 2002 Free Software
Foundation, Inc.

This file is free documentation; the Free Software Foundation gives
unlimited permission to copy, distribute and modify it.


Perftools-Specific Install Notes
================================

*** NOTE FOR 64-BIT LINUX SYSTEMS

The glibc built-in stack-unwinder on 64-bit systems has some problems
with the perftools libraries. (In particular, the cpu/heap profiler
may be in the middle of malloc, holding some malloc-related locks when
it invokes the stack unwinder. The built-in stack unwinder may call
malloc recursively, which may require the thread to acquire a lock it
already holds: deadlock.)

For that reason, if you use a 64-bit system, we strongly recommend you
install libunwind before trying to configure or install gperftools.
libunwind can be found at

   http://download.savannah.gnu.org/releases/libunwind/libunwind-0.99-beta.tar.gz

Even if you already have libunwind installed, you should check the
version. Versions older than this will not work properly; too-new
versions introduce new code that does not work well with perftools
(because libunwind can call malloc, which will lead to deadlock).

There have been reports of crashes with libunwind 0.99 (see
http://code.google.com/p/gperftools/issues/detail?id=374).
Alternatively, you can use a more recent libunwind (e.g. 1.0.1) at the
cost of adding a bit of boilerplate to your code. For details, see
http://groups.google.com/group/google-perftools/msg/2686d9f24ac4365f

CAUTION: if you install libunwind from the url above, be aware that
you may have trouble if you try to statically link your binary with
perftools: that is, if you link with 'gcc -static -lgcc_eh ...'.
This is because both libunwind and libgcc implement the same C++
exception handling APIs, but they implement them differently on
some platforms. This is not likely to be a problem on ia64, but
may be on x86-64.

Also, if you link binaries statically, make sure that you add
-Wl,--eh-frame-hdr to your linker options. This is required so that
libunwind can find the information generated by the compiler
required for stack unwinding.

Using -static is rare, though, so unless you know this will affect
you it probably won't.

If you cannot or do not wish to install libunwind, you can still try
to use the built-in stack unwinder. The built-in stack unwinder
requires that your application, the tcmalloc library, and system
libraries like libc, all be compiled with a frame pointer. This is
*not* the default for x86-64.

If you are on an x86-64 system, know that you have a set of system
libraries with frame-pointers enabled, and compile all your
applications with -fno-omit-frame-pointer, then you can enable the
built-in perftools stack unwinder by passing the
--enable-frame-pointers flag to configure.

Even with the use of libunwind, there are still known problems with
stack unwinding on 64-bit systems, particularly x86-64. See the
"64-BIT ISSUES" section in README.

If you encounter problems, try compiling perftools with './configure
--enable-frame-pointers'. Note you will need to compile your
application with frame pointers (via 'gcc -fno-omit-frame-pointer
...') in this case.


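The choice described above can be sketched as a small shell helper.
This is only an illustration: the function name suggest_configure_flags
and its "yes"/"no" argument are hypothetical, and it does not detect
libunwind or frame-pointer-enabled system libraries for you.

```shell
#!/bin/sh
# Hypothetical helper (not part of perftools): print the extra
# ./configure flag suggested by the 64-bit note.
#   $1 = machine type, as printed by `uname -m`
#   $2 = "yes" if libunwind is installed, "no" otherwise
suggest_configure_flags() {
    machine=$1
    have_libunwind=$2
    case "$machine" in
        x86_64)
            if [ "$have_libunwind" = "no" ]; then
                # Built-in unwinder: only appropriate when the application
                # and the system libraries are built with frame pointers.
                echo "--enable-frame-pointers"
            fi
            ;;
    esac
}

# Example use:
#   ./configure $(suggest_configure_flags "$(uname -m)" no)
```

With libunwind installed, or on a non-x86-64 machine, the helper prints
nothing and a plain './configure' is used.

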
*** TCMALLOC LARGE PAGES: TRADING TIME FOR SPACE

You can set a compiler directive that makes tcmalloc faster, at the
cost of using more space (due to internal fragmentation).

Internally, tcmalloc divides its memory into "pages." The default
page size is chosen to minimize memory use by reducing fragmentation.
The cost is that keeping track of these pages can cost tcmalloc time.
We've added a new, experimental flag to tcmalloc that enables a larger
page size. In general, this will increase the memory needs of
applications using tcmalloc. However, in many cases it will speed up
the applications as well, particularly if they allocate and free a lot
of memory. We've seen average speedups of 3-5% on Google
applications.

This feature is still very experimental; it's not even a configure
flag yet. To build libtcmalloc with large pages, run

   ./configure <normal flags> CXXFLAGS=-DTCMALLOC_LARGE_PAGES

(or add -DTCMALLOC_LARGE_PAGES to your existing CXXFLAGS argument).


*** SMALL TCMALLOC CACHES: TRADING SPACE FOR TIME

You can set a compiler directive that makes tcmalloc use less memory
for overhead, at the cost of some time.

Internally, tcmalloc keeps information about some of its internal data
structures in a cache. This speeds up memory operations that need to
access this internal data. We've added a new, experimental flag to
tcmalloc that reduces the size of this cache, decreasing the memory
needs of applications using tcmalloc.

This feature is still very experimental; it's not even a configure
flag yet. To build libtcmalloc with smaller internal caches, run

   ./configure <normal flags> CXXFLAGS=-DTCMALLOC_SMALL_BUT_SLOW

(or add -DTCMALLOC_SMALL_BUT_SLOW to your existing CXXFLAGS argument).


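Setting CXXFLAGS on the configure command line replaces the default
that configure would otherwise pick (typically '-g -O2' with g++), so
you may want to add the define to your existing flags rather than pass
it alone. A minimal sketch follows; the helper name append_flag and the
starting flags '-O2 -g' are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: append a flag to a flag string unless it is
# already present.
#   $1 = existing flags (may be empty)
#   $2 = flag to add
append_flag() {
    flags=$1
    new=$2
    case " $flags " in
        *" $new "*) echo "$flags" ;;             # already there
        *) echo "${flags:+$flags }$new" ;;       # add, space-separated
    esac
}

# Example use (the starting flags are an assumption):
#   ./configure <normal flags> \
#       CXXFLAGS="$(append_flag "-O2 -g" -DTCMALLOC_LARGE_PAGES)"
```

The same helper works for -DTCMALLOC_SMALL_BUT_SLOW.

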
*** NOTE FOR ___tls_get_addr ERROR

When compiling perftools on some old systems, like RedHat 8, you may
get an error like this:

   ___tls_get_addr: symbol not found

This means that you have a system where some parts are updated enough
to support Thread Local Storage, but others are not. The perftools
configure script can't always detect this kind of case, leading to
that error. To fix it, just comment out the line

   #define HAVE_TLS 1

in your config.h file before building.


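If you would rather script that edit than do it by hand, something
like the following should work. It is a sketch: the function name
disable_tls is hypothetical, and you must pass the path to the config.h
that configure generated in your build tree.

```shell
#!/bin/sh
# Hypothetical helper: comment out "#define HAVE_TLS 1" in a config.h.
#   $1 = path to the generated config.h
disable_tls() {
    config_h=$1
    # Write to a temporary file first; `sed -i' is not portable.
    sed 's|^#define HAVE_TLS 1$|/* #define HAVE_TLS 1 */|' \
        "$config_h" > "$config_h.tmp" &&
        mv "$config_h.tmp" "$config_h"
}

# Example use, after ./configure and before make:
#   disable_tls path/to/config.h
```

Note that re-running configure regenerates config.h, so the edit has to
be repeated afterwards.

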
*** TCMALLOC AND DLOPEN

To improve performance, we use the "initial exec" model of Thread
Local Storage in tcmalloc. The price for this is that the library will
not work correctly if it is loaded via dlopen(). This should not be a
problem, since loading a malloc-replacement library via dlopen is
asking for trouble in any case: some data will be allocated with one
malloc, some with another. If, for some reason, you *do* need to use
dlopen on tcmalloc, the easiest way is to use a version of tcmalloc
with TLS turned off; see the ___tls_get_addr note above.


*** COMPILING ON NON-LINUX SYSTEMS

Perftools has been tested on the following systems:

   FreeBSD 6.0 (x86)
   FreeBSD 8.1 (x86_64)
   Linux CentOS 5.5 (x86_64)
   Linux Debian 4.0 (PPC)
   Linux Debian 5.0 (x86)
   Linux Fedora Core 3 (x86)
   Linux Fedora Core 4 (x86)
   Linux Fedora Core 5 (x86)
   Linux Fedora Core 6 (x86)
   Linux Fedora Core 13 (x86_64)
   Linux Fedora Core 14 (x86_64)
   Linux RedHat 9 (x86)
   Linux Slackware 13 (x86_64)
   Linux Ubuntu 6.06.1 (x86)
   Linux Ubuntu 6.06.1 (x86_64)
   Linux Ubuntu 10.04 (x86)
   Linux Ubuntu 10.10 (x86_64)
   Mac OS X 10.3.9 (Panther) (PowerPC)
   Mac OS X 10.4.8 (Tiger) (PowerPC)
   Mac OS X 10.4.8 (Tiger) (x86)
   Mac OS X 10.5 (Leopard) (x86)
   Mac OS X 10.6 (Snow Leopard) (x86)
   Solaris 10 (x86_64)
   Windows XP, Visual Studio 2003 (VC++ 7.1) (x86)
   Windows XP, Visual Studio 2005 (VC++ 8) (x86)
   Windows XP, Visual Studio 2008 (VC++ 9) (x86)
   Windows XP, Visual Studio 2010 (VC++ 10) (x86)
   Windows XP, MinGW 5.1.3 (x86)
   Windows XP, Cygwin 5.1 (x86)

It works in its full generality on the Linux systems
tested (though see 64-bit notes above). Portions of perftools work on
the other systems. The basic memory-allocation library,
tcmalloc_minimal, works on all systems. The cpu-profiler also works
fairly widely. However, the heap-profiler and heap-checker are not
yet as widely supported. In general, the 'configure' script will
detect what OS you are building for, and only build the components
that work on that OS.

Note that tcmalloc_minimal is perfectly usable as a malloc/new
replacement, so it is possible to use tcmalloc on all the systems
above, by linking in libtcmalloc_minimal.

** FreeBSD:

The following binaries build and run successfully (creating
libtcmalloc_minimal.so and libprofiler.so in the process):

   % ./configure
   % make tcmalloc_minimal_unittest tcmalloc_minimal_large_unittest \
          addressmap_unittest atomicops_unittest frag_unittest \
          low_level_alloc_unittest markidle_unittest memalign_unittest \
          packed_cache_test stacktrace_unittest system_alloc_unittest \
          thread_dealloc_unittest profiler_unittest.sh
   % ./tcmalloc_minimal_unittest    # to run this test
   % [etc]                          # to run other tests

Four caveats: first, frag_unittest tries to allocate 400M of memory,
and if you have less virtual memory on your system, the test may
fail with a bad_alloc exception.

Second, profiler_unittest.sh sometimes fails in the "fork" test.
This is because stray SIGPROF signals from the parent process are
making their way into the child process. (This may be a kernel
bug that only exists in older kernels.) The profiling code itself
is working fine. This only affects programs that call fork(); for
most programs, the cpu profiler is entirely safe to use.

Third, perftools depends on /proc to get shared library
information. If you are running a FreeBSD system without proc,
perftools will not be able to map addresses to functions. Some
unittests will fail as a result.

Finally, the new test introduced in perftools-1.2,
profile_handler_unittest, fails on FreeBSD. It has something to do
with how the itimer works. The cpu profiler test passes, so I
believe the functionality is correct and the issue is with the test
somehow. If anybody is an expert on itimers and SIGPROF in
FreeBSD, and would like to debug this, I'd be glad to hear the
results!

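The first caveat can be checked before running frag_unittest. The
sketch below assumes your shell's 'ulimit -v' prints the virtual-memory
limit in kilobytes or the word "unlimited" (true of the common sh
implementations, but not required by POSIX); the helper name
vm_limit_ok is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a `ulimit -v' value leaves room for
# the 400M that frag_unittest tries to allocate.
#   $1 = output of `ulimit -v' (kilobytes, or "unlimited")
vm_limit_ok() {
    limit=$1
    needed_kb=$((400 * 1024))
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$needed_kb" ]
}

# Example use:
#   vm_limit_ok "$(ulimit -v)" || echo "frag_unittest may fail with bad_alloc"
```

A limit at or above 400M is necessary but not sufficient: the test
needs memory beyond the allocation itself, so it can still fail on a
system that passes this check.
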
libtcmalloc.so successfully builds, and the "advanced" tcmalloc
functionality all works except for the leak-checker, which has
Linux-specific code:

   % make heap-profiler_unittest.sh maybe_threads_unittest.sh \
          tcmalloc_unittest tcmalloc_both_unittest \
          tcmalloc_large_unittest              # THESE WORK
   % make -k heap-checker_unittest.sh \
             heap-checker-death_unittest.sh    # THESE DO NOT

Note that unless you specify --enable-heap-checker explicitly,
'make' will not build the heap-checker unittests on a FreeBSD
system.

I have not tested other *BSD systems, but they are probably similar.

** Mac OS X:

I've tested OS X 10.5 [Leopard], OS X 10.4 [Tiger] and OS X 10.3
[Panther] on both Intel (x86) and PowerPC systems. For Panther
systems, perftools does not work at all: it depends on a header
file, OSAtomic.h, which is new in 10.4. (It's possible to get the
code working for Panther/i386 without too much work; if you're
interested in exploring this, drop an e-mail.)

For the other systems, the binaries and libraries that
successfully build are exactly the same as for FreeBSD. See that
section for a list of binaries and instructions on building them.

In addition, it appears OS X regularly fails profiler_unittest.sh
in the "thread" test (in addition to occasionally failing in the
"fork" test). It looks like OS X often delivers the profiling
signal to the main thread, even when it's sleeping, rather than
spawned threads that are doing actual work. If anyone knows
details of how OS X handles SIGPROF (via setitimer()) events with
threads, and has insight into this problem, please send mail to
google-perftools@googlegroups.com.

** Solaris 10 x86:

I've only tested using the GNU C++ compiler, not the Sun C++
compiler. Using g++ requires setting the PATH appropriately when
configuring.

   % PATH=${PATH}:/usr/sfw/bin/:/usr/ccs/bin ./configure
   % PATH=${PATH}:/usr/sfw/bin/:/usr/ccs/bin make [...]

Again, the binaries and libraries that successfully build are
exactly the same as for FreeBSD. See that section for a list of
binaries, and instructions on building them. (However, while
libprofiler.so can be used to generate profiles, pprof is not very
successful at reading them -- necessary helper programs like nm
don't seem to be installed by default on Solaris, or perhaps are
only installed as part of the Sun C++ compiler package.)

** Windows (MSVC, Cygwin, and MinGW):

Work on Windows is rather preliminary: we haven't found a good way
to get stack traces in release mode on Windows (that is, when FPO
is enabled), so the heap profiling may not be reliable in that
case. Also, heap-checking and CPU profiling do not yet work at
all. But as in other ports, the basic tcmalloc library
functionality, overriding malloc and new and such (and even
Windows-specific functions like _aligned_malloc!), is working fine,
at least with VC++ 7.1 (Visual Studio 2003) through VC++ 10.0,
in both debug and release modes. See README.windows for
instructions on how to install on Windows using Visual Studio.

Cygwin can compile some but not all of perftools. Furthermore,
there is a problem with exception-unwinding in cygwin (it can call
malloc, which can call the exception-unwinding-setup code, which
can lead to an infinite loop). I've committed a workaround to the
exception unwinding problem, but it only works in debug mode and
when statically linking in tcmalloc. I hope to have a more proper
fix in a later release. To configure under cygwin, run

   ./configure --disable-shared CXXFLAGS=-g && make

Most of perftools will compile under cygwin (cygwin doesn't allow
weak symbols, so the heap-checker and a few other pieces of
functionality will not compile). 'make' will compile those
libraries and tests that can be compiled. You can run 'make check'
to make sure the basic functionality is working. I've heard reports
that some versions of cygwin fail calls to pthread_join() with
EINVAL, causing several tests to fail. If you have any insight into
this, please mail google-perftools@googlegroups.com.

This Windows functionality is also available using MinGW and Msys.
In this case, you can use the regular './configure && make'
process. 'make install' should also work. The Makefile will limit
itself to those libraries and binaries that work on Windows.


Basic Installation
==================

These are generic installation instructions.

The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation. It uses
those values to create a `Makefile' in each directory of the package.
It may also create one or more `.h' files containing system-dependent
definitions. Finally, it creates a shell script `config.status' that
you can run in the future to recreate the current configuration, and a
file `config.log' containing compiler output (useful mainly for
debugging `configure').

It can also use an optional file (typically called `config.cache'
and enabled with `--cache-file=config.cache' or simply `-C') that saves
the results of its tests to speed up reconfiguring. (Caching is
disabled by default to prevent problems with accidental use of stale
cache files.)

If you need to do unusual things to compile the package, please try
to figure out how `configure' could check whether to do them, and mail
diffs or instructions to the address given in the `README' so they can
be considered for the next release. If you are using the cache, and at
some point `config.cache' contains results you don't want to keep, you
may remove or edit it.

The file `configure.ac' (or `configure.in') is used to create
`configure' by a program called `autoconf'. You only need
`configure.ac' if you want to change it or regenerate `configure' using
a newer version of `autoconf'.

The simplest way to compile this package is:

  1. `cd' to the directory containing the package's source code and type
     `./configure' to configure the package for your system. If you're
     using `csh' on an old version of System V, you might need to type
     `sh ./configure' instead to prevent `csh' from trying to execute
     `configure' itself.

     Running `configure' takes a while. While running, it prints some
     messages telling which features it is checking for.

  2. Type `make' to compile the package.

  3. Optionally, type `make check' to run any self-tests that come with
     the package.

  4. Type `make install' to install the programs and any data files and
     documentation.

  5. You can remove the program binaries and object files from the
     source code directory by typing `make clean'. To also remove the
     files that `configure' created (so you can compile the package for
     a different kind of computer), type `make distclean'. There is
     also a `make maintainer-clean' target, but that is intended mainly
     for the package's developers. If you use it, you may have to get
     all sorts of other programs in order to regenerate files that came
     with the distribution.

Compilers and Options
=====================

Some systems require unusual options for compilation or linking that
the `configure' script does not know about. Run `./configure --help'
for details on some of the pertinent environment variables.

You can give `configure' initial values for configuration parameters
by setting variables in the command line or in the environment. Here
is an example:

     ./configure CC=c89 CFLAGS=-O2 LIBS=-lposix

*Note Defining Variables::, for more details.

Compiling For Multiple Architectures
====================================

You can compile the package for more than one kind of computer at the
same time, by placing the object files for each architecture in their
own directory. To do this, you must use a version of `make' that
supports the `VPATH' variable, such as GNU `make'. `cd' to the
directory where you want the object files and executables to go and run
the `configure' script. `configure' automatically checks for the
source code in the directory that `configure' is in and in `..'.

If you have to use a `make' that does not support the `VPATH'
variable, you have to compile the package for one architecture at a
time in the source code directory. After you have installed the
package for one architecture, use `make distclean' before reconfiguring
for another architecture.

Installation Names
==================

By default, `make install' will install the package's files in
`/usr/local/bin', `/usr/local/man', etc. You can specify an
installation prefix other than `/usr/local' by giving `configure' the
option `--prefix=PATH'.

You can specify separate installation prefixes for
architecture-specific files and architecture-independent files. If you
give `configure' the option `--exec-prefix=PATH', the package will use
PATH as the prefix for installing programs and libraries.
Documentation and other data files will still use the regular prefix.

In addition, if you use an unusual directory layout you can give
options like `--bindir=PATH' to specify different values for particular
kinds of files. Run `configure --help' for a list of the directories
you can set and what kinds of files go in them.

If the package supports it, you can cause programs to be installed
with an extra prefix or suffix on their names by giving `configure' the
option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'.

Optional Features
=================

Some packages pay attention to `--enable-FEATURE' options to
`configure', where FEATURE indicates an optional part of the package.
They may also pay attention to `--with-PACKAGE' options, where PACKAGE
is something like `gnu-as' or `x' (for the X Window System). The
`README' should mention any `--enable-' and `--with-' options that the
package recognizes.

For packages that use the X Window System, `configure' can usually
find the X include and library files automatically, but if it doesn't,
you can use the `configure' options `--x-includes=DIR' and
`--x-libraries=DIR' to specify their locations.

Specifying the System Type
==========================

There may be some features `configure' cannot figure out
automatically, but needs to determine by the type of machine the package
will run on. Usually, assuming the package is built to be run on the
_same_ architectures, `configure' can figure that out, but if it prints
a message saying it cannot guess the machine type, give it the
`--build=TYPE' option. TYPE can either be a short name for the system
type, such as `sun4', or a canonical name which has the form:

     CPU-COMPANY-SYSTEM

where SYSTEM can have one of these forms:

     OS
     KERNEL-OS

See the file `config.sub' for the possible values of each field. If
`config.sub' isn't included in this package, then this package doesn't
need to know the machine type.

If you are _building_ compiler tools for cross-compiling, you should
use the `--target=TYPE' option to select the type of system they will
produce code for.

If you want to _use_ a cross compiler that generates code for a
platform different from the build platform, you should specify the
"host" platform (i.e., that on which the generated programs will
eventually be run) with `--host=TYPE'.

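To make the CPU-COMPANY-SYSTEM form concrete, here is a small sketch
that splits a canonical name into its three fields. The helper name
split_system_type is hypothetical, and it only handles full canonical
names, not short names such as `sun4'.

```shell
#!/bin/sh
# Hypothetical helper: split a canonical system type into its fields.
#   $1 = canonical name of the form CPU-COMPANY-SYSTEM
split_system_type() {
    name=$1
    cpu=${name%%-*}        # everything before the first dash
    rest=${name#*-}        # everything after the first dash
    company=${rest%%-*}    # next field
    system=${rest#*-}      # remainder: OS or KERNEL-OS
    echo "cpu=$cpu company=$company system=$system"
}

# Example use:
#   split_system_type x86_64-pc-linux-gnu
#   split_system_type sparc-sun-solaris2.10
```

The first example has a SYSTEM of the KERNEL-OS form (`linux-gnu'); the
second has the plain OS form (`solaris2.10').
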
Sharing Defaults
================

If you want to set default values for `configure' scripts to share,
you can create a site shell script called `config.site' that gives
default values for variables like `CC', `cache_file', and `prefix'.
`configure' looks for `PREFIX/share/config.site' if it exists, then
`PREFIX/etc/config.site' if it exists. Or, you can set the
`CONFIG_SITE' environment variable to the location of the site script.
A warning: not all `configure' scripts look for a site script.

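As an illustration, the commands below write a small site script and
show how it might be used. The file name config.site.example and all
of the values in it are assumptions chosen for the example. The tests
rely on `configure' initializing `prefix' to NONE and `cache_file' to
/dev/null before it reads the site script, so each default is applied
only when the user has not given a value.

```shell
#!/bin/sh
# Write a sample site script (file name and values are illustrative).
cat > config.site.example <<'EOF'
# Sample config.site: set a default only where the user gave no value.
if test -z "$CC"; then
    CC=gcc
fi
if test "$prefix" = NONE; then
    prefix=/opt/perftools
fi
if test "$cache_file" = /dev/null; then
    cache_file=config.cache
fi
EOF

# Example use:
#   CONFIG_SITE=./config.site.example ./configure
```
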
Defining Variables
==================

Variables not defined in a site shell script can be set in the
environment passed to `configure'. However, some packages may run
`configure' again during the build, and the customized values of these
variables may be lost. In order to avoid this problem, you should set
them in the `configure' command line, using `VAR=value'. For example:

     ./configure CC=/usr/local2/bin/gcc

will cause the specified gcc to be used as the C compiler (unless it is
overridden in the site shell script).

`configure' Invocation
======================

`configure' recognizes the following options to control how it
operates.

`--help'
`-h'
     Print a summary of the options to `configure', and exit.

`--version'
`-V'
     Print the version of Autoconf used to generate the `configure'
     script, and exit.

`--cache-file=FILE'
     Enable the cache: use and save the results of the tests in FILE,
     traditionally `config.cache'. FILE defaults to `/dev/null' to
     disable caching.

`--config-cache'
`-C'
     Alias for `--cache-file=config.cache'.

`--quiet'
`--silent'
`-q'
     Do not print messages saying which checks are being made. To
     suppress all normal output, redirect it to `/dev/null' (any error
     messages will still be shown).

`--srcdir=DIR'
     Look for the package's source code in directory DIR. Usually
     `configure' can determine that directory automatically.

`configure' also accepts some other, not widely useful, options. Run
`configure --help' for more details.