NEWS

  1. == 2 March 2021 ==
  2. gperftools 2.9.1 is out!
  3. Minor fixes landed since previous release:
  4. * OSX builds now prefer backtrace() and have somewhat working heap
  5. sampling.
  6. * Incorrect assertion failure was fixed that crashed tcmalloc if
  7. assertions were on and sized delete was used. More details in github
  8. issue #1254.
  9. == 21 February 2021 ==
  10. gperftools 2.9 is out!
  11. Few more changes landed compared to rc:
  12. * Venkatesh Srinivas has contributed thread-safety annotations
  13. support.
  14. * couple more unit test bugs that caused tcmalloc_unittest to fail on
  15. recent clang have been fixed.
  16. * usage of unsupportable linux_syscall_support.h has been removed from
  17. few places. Building with --disable-heap-checker now completely
  18. avoids it. Expect complete death of this header in next major
  19. release.
  20. == 14 February 2021 ==
  21. gperftools 2.9rc is out!
  22. Here are notable changes:
  23. * Jarno Rajahalme has contributed fix for crashing bug in syscalls
  24. support for aarch64.
  25. * User SSE4 has contributed basic support for Elbrus 2000 architecture
  26. (!)
  27. * Venkatesh Srinivas has contributed cleanup to atomic ops.
  28. * Đoàn Trần Công Danh has fixed cpu profiler compilation on musl.
  29. * there is now better backtracing support for aarch64 and
  30. riscv. x86-64 with frame pointers now also defaults to this new
  31. "generic" frame pointer backtracer.
  32. * emergency malloc is now enabled by default. Fixes hang on musl when
  33. libgcc backtracer is enabled.
  34. * bunch of legacy config tests have been removed.
  35. == 20 December 2020 ==
  36. gperftools 2.8.1 is out!
  37. Here are notable changes:
  38. * previous release contained change to release memory without page
  39. heap lock, but this change had at least one bug that caused
  40. crashes and corruption when running under aggressive decommit mode
  41. (this is not default). While we check for other bugs, this feature
  42. was reverted. See github issue #1204 and issue #1227.
  43. * stack traces depth captured by gperftools is now up to 254 levels
  44. deep. Thanks to Kerrick Staley for this small but useful tweak.
  45. * Levon Ter-Grigoryan has contributed small fix for compiler warning.
  46. * Grant Henke has contributed updated detection of program counter
  47. register for OS X on arm64.
  48. * Tim Gates has contributed small typo fix.
  49. * Steve Langasek has contributed basic build fixes for riscv64 (!).
  50. * Isaac Hier and okhowang have contributed preliminary port of build
  51. infrastructure to cmake. This works, but it is very preliminary.
  52. Autotools-based build is the only officially supported build for
  53. now.
  54. == 6 July 2020 ==
  55. gperftools 2.8 is out!
  56. Here are notable changes:
  57. * ProfilerGetStackTrace is now officially supported API for
  58. libprofiler. Contributed by Kirill Müller.
  59. * Build failures on mingw were fixed. This fixed issue #1108.
  60. * Build failure of page_heap_test on MSVC was fixed.
  61. * Ryan Macnak contributed fix for compiling linux syscall support on
  62. i386 and recent GCCs. This fixed issue #1076.
  63. * test failures caused by new gcc 10 optimizations were fixed. Same
  64. change also fixed tests on clang.
  65. == 8 Mar 2020 ==
  66. gperftools 2.8rc is out!
  67. Here are notable changes:
  68. * building code now requires c++11 or later. Bundled MSVC project was
  69. converted to Visual Studio 2015.
  70. * User obones contributed fix for windows x64 TLS callbacks. This
  71. fixed leak of thread caches on thread exits in 64-bit windows.
  72. * releasing memory back to kernel is now made with page heap lock
  73. dropped.
  74. * HolyWu contributed fix for correct malloc patching on debug builds
  75. on windows. This configuration previously crashed.
  76. * Romain Geissler contributed fix for tls access during early tls
  77. initialization on dlopen.
  78. * large allocation reports are now silenced by default, since not all
  79. programs want their stderr polluted by those messages. Contributed
  80. by Junhao Li.
  81. * HolyWu contributed improvements to MSVC project files. Notably,
  82. there is now project for "overriding" version of tcmalloc.
  83. * MS-specific _recalloc is now correctly zeroing only malloced
  84. part. This fix was contributed by HolyWu.
  85. * Brian Silverman contributed correctness fix to sampler_test.
  86. * Gabriel Marin ported few fixes from chromium's fork. As part of
  87. those fixes, we reduced number of static initializers (forbidden in
  88. chromium). Also we now make syscalls via syscall function instead of
  89. reimplementing direct way to make syscalls on each platform.
  90. * Brian Silverman fixed flakiness in page heap test.
  91. * There is now configure flag to skip installing perl pprof, since
  92. external golang pprof is much superior. --disable-deprecated-pprof
  93. is the flag.
  94. * Fabrice Fontaine contributed fixes to drop use of nonstandard
  95. __off64_t type.
  96. * Fabrice Fontaine contributed build fix to check for presence of
  97. nonstandard __sbrk functions. It is only used by mmap hooks code and
  98. (rightfully) not available on musl.
  99. * Fabrice Fontaine contributed build fix around mmap64 macro and
  100. function conflict in some cases.
  101. * there is now configure time option to enable aggressive decommit by
  102. default. Contributed by Laurent
  103. Stacul. --enable-aggressive-decommit-by-default is the flag.
  104. * Tulio Magno Quites Machado Filho contributed build fixes for ppc
  105. around ucontext access.
  106. * User pkubaj contributed couple build fixes for FreeBSD/ppc.
  107. * configure now always assumes we have mmap. This fixes configure
  108. failures on some linux guests inside virtualbox. This fixed issue
  109. #1008.
  110. * User shipujin contributed syscall support fixes for mips64 (big and
  111. little endian).
  112. * Henrik Edin contributed configurable support for wide range of
  113. malloc page sizes. 4K, 8K, 16K, 32K, 64K, 128K and 256K are now
  114. supported via existing --with-tcmalloc-pagesize flag to configure.
  115. * Jon Kohler added overheads fields to per-size-class textual
  116. stats. Stats that are available via
  117. MallocExtension::instance()->GetStats().
  118. * tcmalloc can now avoid fallback from memfs to default sys
  119. allocator. TCMALLOC_MEMFS_DISABLE_FALLBACK switches this on. This
  120. was contributed by Jon Kohler.
  121. * Ilya Leoshkevich fixed mmap syscall support on s390.
  122. * Todd Lipcon contributed small build warning fix.
  123. * User prehistoricpenguin contributed misc source file mode fixes (we
  124. still had a few c++ files marked executable).
  125. * User invalid_ms_user contributed fix for typo.
  126. * Jakub Wilk contributed typo fixes.
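
To illustrate the note above about making syscalls via the syscall function, here is a minimal sketch. It is Linux-specific and is not taken from gperftools sources; the helper name RawGetPid is hypothetical. It only shows that the generic libc syscall() wrapper can issue a system call by number, so no per-platform assembly is needed.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper (not a gperftools function): issues the getpid
// system call through the generic libc syscall() wrapper instead of
// hand-written, per-architecture assembly. Linux-only.
long RawGetPid() {
  return syscall(SYS_getpid);
}
```

The value returned by RawGetPid() should match what getpid() reports for the same process.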
  127. == 29 Apr 2018 ==
  128. gperftools 2.7 is out!
  129. Few people contributed minor, but important fixes since rc.
  130. Changes:
  131. * bug in span stats printing introduced by new scalable page heap
  132. change was fixed.
  133. * Christoph Müllner has contributed couple warnings fixes and initial
  134. support for aarch64_ilp32 architecture.
  135. * Ben Dang contributed documentation fix for heap checker.
  136. * Fabrice Fontaine contributed fix for linking benchmarks with
  137. --disable-static.
  138. * Holy Wu has added sized deallocation unit tests.
  139. * Holy Wu has enabled support of sized deallocation (c++14) on recent
  140. MSVC.
  141. * Holy Wu has fixed MSVC build in WIN32_OVERRIDE_ALLOCATORS mode. This
  142. closed issue #716.
  143. * Holy Wu has contributed cleanup of config.h used on windows.
  144. * Mao Huang has contributed couple simple tcmalloc changes from
  145. chromium code base. Making our tcmalloc forks a tiny bit closer.
  146. * issue #946 that caused compilation failures on some Linux clang
  147. installations has been fixed. Much thanks to github user htuch for
  148. helping to diagnose issue and proposing a fix.
  149. * Tulio Magno Quites Machado Filho has contributed build-time fix for
  150. PPC (for problem introduced in one of commits since RC).
  151. == 18 Mar 2018 ==
  152. gperftools 2.7rc is out!
  153. Changes:
  154. * Most notable change in this release is that very large allocations
  155. (>1MiB) are now handled by O(log n) implementation. This is
  156. contributed by Todd Lipcon based on earlier work by Aliaksei
  157. Kandratsenka and James Golick. Special thanks to Alexey Serbin for
  158. contributing OSX fix for that commit.
  159. * detection of sized deallocation support is improved. Which should
  160. fix another set of issues building on OSX. Much thanks to Alexey
  161. Serbin for reporting the issue, suggesting a fix and verifying it.
  162. * Todd Lipcon made a change to extend page heaps freelists to 1 MiB
  163. (up from 1MiB - 8KiB). This may help a little for some workloads.
  164. * Ishan Arora contributed typo fix to docs
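
The O(log n) handling of very large allocations mentioned above can be pictured with an ordered set of free spans. The sketch below is an illustration only, not the actual page heap code: the names LargeSpanIndex, FreeSpan, AddFree and TakeBestFit are hypothetical. Free spans are keyed by (length, start page), so lower_bound() finds the smallest span that is large enough in logarithmic time.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical model of a free span: (length in pages, starting page).
// Ordering by length first lets lower_bound() find the best fit.
using FreeSpan = std::pair<std::size_t, std::size_t>;

class LargeSpanIndex {
 public:
  // Records a free span. Insertion into std::set is O(log n).
  void AddFree(std::size_t start_page, std::size_t length) {
    spans_.emplace(length, start_page);
  }

  // Finds the smallest free span with at least `pages` pages, removes it
  // from the index and stores it in *out. Returns false if nothing fits.
  // Both the lookup and the erase are O(log n).
  bool TakeBestFit(std::size_t pages, FreeSpan* out) {
    auto it = spans_.lower_bound(FreeSpan(pages, 0));
    if (it == spans_.end()) return false;
    *out = *it;
    spans_.erase(it);
    return true;
  }

  std::size_t size() const { return spans_.size(); }

 private:
  std::set<FreeSpan> spans_;
};
```

A linear scan over a free list would give the same answer; the ordered set only changes the cost of finding it.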
  165. == 9 Dec 2017 ==
  166. gperftools 2.6.3 is out!
  167. Just two fixes were made in this release:
  168. * Stephan Zuercher has contributed a build fix for some recent XCode
  169. versions. See issue #942 for more details.
  170. * assertion failure on some windows builds introduced by 2.6.2 was
  171. fixed. Thanks to github user nkeemik for reporting it and testing
  172. fix. See issue #944 for more details.
  173. == 30 Nov 2017 ==
  174. gperftools 2.6.2 is out!
  175. Most notable change is recently added support for C++17 over-aligned
  176. allocation operators contributed by Andrey Semashev. I've extended his
  177. implementation to have roughly same performance as malloc/new. This
  178. release also has native support for C11 aligned_alloc.
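
For readers unfamiliar with the two features named above, here is a small sketch using only the standard library. It assumes a C++17 compiler and a C library that provides aligned_alloc; the type CacheLinePadded and the helper functions are hypothetical and exist only for this example. Nothing here is gperftools-specific: an allocator that supports these operators is simply what ends up serving such calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical over-aligned type: 64-byte alignment exceeds the default
// alignment guaranteed by plain operator new.
struct alignas(64) CacheLinePadded {
  char data[64];
};

inline bool IsAligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// In C++17 this new-expression is routed to
// operator new(std::size_t, std::align_val_t).
bool OverAlignedNewIsAligned() {
  CacheLinePadded* p = new CacheLinePadded;
  bool ok = IsAligned(p, alignof(CacheLinePadded));
  delete p;
  return ok;
}

// C11 aligned_alloc, exposed in C++17 as std::aligned_alloc. The size is
// kept a multiple of the alignment, as the C standard asks.
bool AlignedAllocIsAligned() {
  void* p = std::aligned_alloc(64, 128);
  if (p == nullptr) return false;
  bool ok = IsAligned(p, 64);
  std::free(p);
  return ok;
}
```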
  179. Rest is mostly bug fixes:
  180. * Jianbo Yang has contributed a fix for potentially severe data race
  181. introduced by malloc fast-path work in gperftools 2.6. This race
  182. could cause occasional violation of total thread cache size
  183. constraint. See issue #929 for more details.
  184. * Correct behavior in out-of-memory condition in fast-path cases was
  185. restored. This was another bug introduced by fast-path optimization
  186. in gperftools 2.6 which caused operator new to silently return NULL
  187. instead of doing correct C++ OOM handling (calling new_handler and
  188. throwing bad_alloc).
  189. * Khem Raj has contributed couple build fixes for newer glibcs (ucontext_t vs
  190. struct ucontext and loff_t definition)
  191. * Piotr Sikora has contributed build fix for OSX (not building unwind
  192. benchmark). This was issue #910 (thanks to Yuriy Solovyov for
  193. reporting it).
  194. * Dorin Lazăr has contributed fix for compiler warning
  195. * issue #912 (occasional deadlocking calling getenv too early on
  196. windows) was fixed. Thanks to github user shangcangriluo for
  197. reporting it.
  198. * Couple earlier lsan-related commits still causing occasional issues
  199. linking on OSX have been reverted. See issue #901.
  200. * Volodimir Krylov has contributed GetProgramInvocationName for FreeBSD
  201. * changsu lee has contributed couple minor correctness fixes (missing
  202. va_end() and missing free() call in rarely executed Symbolize path)
  203. * Andrew C. Morrow has contributed some more page heap stats. See issue
  204. #935.
  205. * some cases of build-time warnings from various gcc/clang versions
  206. about throw() declarations have been fixed.
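
The "correct C++ OOM handling" mentioned in the list above (calling new_handler and throwing bad_alloc) follows the loop the C++ standard describes for operator new. The sketch below is a standalone illustration, not tcmalloc code: RawAlloc, ReleaseReserve, AllocateOrThrow and the two counters are hypothetical, and the failure is simulated with a counter rather than real memory pressure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Simulation state (hypothetical): RawAlloc fails while this is > 0.
int g_failures_left = 0;
int g_handler_calls = 0;

// Stand-in for a low-level allocator that reports failure with nullptr.
void* RawAlloc(std::size_t size) {
  if (g_failures_left > 0) return nullptr;
  return std::malloc(size);
}

// Example new_handler: pretends to free some memory so a retry can succeed.
void ReleaseReserve() {
  ++g_handler_calls;
  --g_failures_left;
}

// The retry loop expected of operator new: on failure call the installed
// new_handler and try again; with no handler installed, throw bad_alloc.
void* AllocateOrThrow(std::size_t size) {
  for (;;) {
    if (void* p = RawAlloc(size)) return p;
    std::new_handler handler = std::get_new_handler();
    if (handler == nullptr) throw std::bad_alloc();
    handler();
  }
}
```

Returning NULL from such a path, as the regression described above did, skips both the handler call and the exception.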
  207. == 9 July 2017 ==
  208. gperftools 2.6.1 is out! This is mostly bug-fixes release.
  209. * issue #901: build issue on OSX introduced in last-time commit in 2.6
  210. was fixed (contributed by Francis Ricci)
  211. * tcmalloc_minimal now works on 32-bit ABI of mips64. This is issue
  212. #845. Much thanks to Adhemerval Zanella and github user mtone.
  213. * Romain Geissler contributed build fix for -std=c++17. This is pull
  214. request #897.
  215. * As part of fixing issue #904, tcmalloc atfork handler is now
  216. installed early. This should fix slight chance of hitting deadlocks
  217. at fork in some cases.
  218. == 4 July 2017 ==
  219. gperftools 2.6 is out!
  220. * Kim Gräsman contributed documentation update for HEAPPROFILESIGNAL
  221. environment variable
  222. * KernelMaker contributed fix for population of min_object_size field
  223. returned by MallocExtension::GetFreeListSizes
  224. * commit 8c3dc52fcfe0 "issue-654: [pprof] handle split text segments"
  225. was reverted. Some OSX users reported issues with this commit. Given
  226. our pprof implementation is strongly deprecated it is best to drop
  227. recently introduced features rather than breaking it badly.
  228. * Francis Ricci contributed improvement for interaction with leak
  229. sanitizer.
  230. == 22 May 2017 ==
  231. gperftools 2.6rc4 is out!
  232. Dynamic sized delete is disabled by default again. There is no hope of
  233. it working with eager dynamic symbols resolution (-z now linker
  234. flag). More details in
  235. https://bugzilla.redhat.com/show_bug.cgi?id=1452813
  236. == 21 May 2017 ==
  237. gperftools 2.6rc3 is out!
  238. gperftools compilation on older systems (e.g. rhel 5) was fixed. This
  239. was originally reported in github issue #888.
  240. == 14 May 2017 ==
  241. gperftools 2.6rc2 is out!
  242. Just 2 small fixes on top of 2.6rc. Particularly, Rajalakshmi
  243. Srinivasaraghavan contributed build fix for ppc32.
  244. == 14 May 2017 ==
  245. gperftools 2.6rc is out!
  246. Highlights of this release are performance work on malloc fast-path
  247. and support for more modern visual studio runtimes, and deprecation of
  248. bundled pprof. Other significant performance-affecting changes are
  249. reverting central free list transfer batch size back to 32 and
  250. disabling of aggressive decommit mode by default.
  251. Note, while we still ship perl implementation of pprof, everyone is
  252. strongly advised to use golang reimplementation of pprof from
  253. https://github.com/google/pprof.
  254. Here are notable changes in more details (and see ChangeLog for full
  255. details):
  256. * a bunch of performance tweaks to tcmalloc fast-path were
  257. merged. This speeds up critical path of tcmalloc by few tens of
  258. %. Well tuned and allocation-heavy programs should see substantial
  259. performance boost (should apply to all modern elf platforms). This
  260. is based on Google-internal tcmalloc changes for fast-path (with
  261. obvious exception of lacking per-cpu mode, of course). Original
  262. changes were made by Aliaksei Kandratsenka. And Andrew Hunter,
  263. Dmitry Vyukov and Sanjay Ghemawat contributed with reviews and
  264. discussions.
  265. * Architectures with 48 bits address space (x86-64 and aarch64) now
  266. use faster 2 level page map. This was ported from Google-internal
  267. change by Sanjay Ghemawat.
  268. * Default value of TCMALLOC_TRANSFER_NUM_OBJ was returned back to
  269. 32. Larger values have been found to hurt certain programs (but help
  270. some other benchmarks). Value can still be tweaked at run time via
  271. environment variable.
  272. * tcmalloc aggressive decommit mode is now disabled by default
  273. again. It was found to degrade performance of certain tensorflow
  274. benchmarks. Users who prefer smaller heap over small performance win
  275. can still set environment variable TCMALLOC_AGGRESSIVE_DECOMMIT=t.
  276. * runtime switchable sized delete support has been fixed and re-enabled
  277. (on GNU/Linux). Programs that use C++ 14 or later that use sized
  278. delete can again be sped up by setting environment variable
  279. TCMALLOC_ENABLE_SIZED_DELETE=t. Support for enabling sized
  280. deallocation support at compile-time is still present, of course.
  281. * tcmalloc now explicitly avoids use of MADV_FREE on Linux, unless
  282. TCMALLOC_USE_MADV_FREE is defined at compile time. This is because
  283. performance impact of MADV_FREE is not well known. Original issue
  284. #780 raised by Mathias Stearn.
  285. * issue #786 with occasional deadlocks in stack trace capturing via
  286. libunwind was fixed. It was originally reported as Ceph issue:
  287. http://tracker.ceph.com/issues/13522
  288. * ChangeLog is now automatically generated from git log. Old ChangeLog
  289. is now ChangeLog.old.
  290. * tcmalloc now provides implementation of nallocx. Function was
  291. originally introduced by jemalloc and can be used to return real
  292. allocation size given allocation request size. This is ported from
  293. Google-internal tcmalloc change contributed by Dmitry Vyukov.
  294. * issue #843 which made tcmalloc crash when used with erlang runtime
  295. was fixed.
  296. * issue #839 which caused tcmalloc's aggressive decommit mode to
  297. degrade performance in some corner cases was fixed.
  298. * Bryan Chan contributed support for 31-bit s390.
  299. * Brian Silverman contributed compilation fix for 32-bit ARMs
  300. * Issue #817 that was causing tcmalloc to fail on windows 10 and
  301. later, as well as on recent msvc was fixed. We now patch _free_base
  302. as well.
  303. * a bunch of minor documentation/typo fixes by: Mike Gaffney
  304. <mike@uberu.com>, iivlev <iivlev@productengine.com>, savefromgoogle
  305. <savefromgoogle@users.noreply.github.com>, John McDole
  306. <jtmcdole@gmail.com>, zmertens <zmertens@asu.edu>, Kirill Müller
  307. <krlmlr@mailbox.org>, Eugene <n.eugene536@gmail.com>, Ola Olsson
  308. <ola1olsson@gmail.com>, Mostyn Bramley-Moore <mostynb@opera.com>
  309. * Tulio Magno Quites Machado Filho has contributed removal of
  310. deprecated glibc malloc hooks.
  311. * Issue #827 that caused intercepting malloc on osx 10.12 to fail was
  312. fixed, by copying fix made by Mike Hommey to jemalloc. Much thanks
  313. to Koichi Shiraishi and David Ribeiro Alves for reporting it and
  314. testing fix.
  315. * Aman Gupta and Kenton Varda contributed minor fixes to pprof (but
  316. note again that pprof is deprecated)
  317. * Ryan Macnak contributed compilation fix for aarch64
  318. * Francis Ricci has fixed unaligned memory access in debug allocator
  319. * TCMALLOC_PAGE_FENCE_NEVER_RECLAIM now actually works thanks to
  320. contribution by Andrew Morrow.
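
The 2 level page map mentioned in the list above can be sketched as a two-step radix lookup from page number to a per-page value. This is an illustration under assumed parameters (48-bit addresses, 8 KiB pages, an 18/17 bit split between root and leaf), not the actual tcmalloc data structure; TwoLevelPageMap, PageOf and all constants are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Assumed parameters, for illustration only.
constexpr int kAddressBits = 48;
constexpr int kPageShift = 13;                        // 8 KiB pages
constexpr int kPageBits = kAddressBits - kPageShift;  // 35-bit page numbers
constexpr int kLeafBits = 17;
constexpr int kRootBits = kPageBits - kLeafBits;      // 18
constexpr std::uint64_t kLeafMask = (std::uint64_t{1} << kLeafBits) - 1;

inline std::uint64_t PageOf(std::uint64_t address) {
  return address >> kPageShift;
}

class TwoLevelPageMap {
 public:
  TwoLevelPageMap() : root_(std::size_t{1} << kRootBits) {}

  // Stores a value for a page, allocating the leaf on first use.
  void Set(std::uint64_t page, void* value) {
    assert((page >> kPageBits) == 0);
    std::unique_ptr<Leaf>& leaf = root_[page >> kLeafBits];
    if (!leaf) leaf.reset(new Leaf());  // value-initialized: all nullptr
    leaf->values[page & kLeafMask] = value;
  }

  // Two dependent loads: root slot, then leaf slot.
  void* Get(std::uint64_t page) const {
    if ((page >> kPageBits) != 0) return nullptr;
    const std::unique_ptr<Leaf>& leaf = root_[page >> kLeafBits];
    return leaf ? leaf->values[page & kLeafMask] : nullptr;
  }

 private:
  struct Leaf {
    void* values[std::size_t{1} << kLeafBits];
  };
  std::vector<std::unique_ptr<Leaf>> root_;
};
```

Compared with a deeper map, each lookup touches one level fewer, which is where the speedup on the fast path would come from.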
  321. == 12 Mar 2016 ==
  322. gperftools 2.5 is out!
  323. Just single bugfix was merged after rc2. Which was fix for issue #777.
  324. == 5 Mar 2016 ==
  325. gperftools 2.5rc2 is out!
  326. New release contains just few commits on top of first release
  327. candidate. One of them is build fix for Visual Studio. Another
  328. significant change is that dynamic sized delete is now disabled by
  329. default. It turned out that IFUNC relocations are not supporting our
  330. advanced use case on all platforms and in all cases.
  331. == 21 Feb 2016 ==
  332. gperftools 2.5rc is out!
  333. Here are major changes since 2.4:
  334. * we've moved to github!
  335. * Bryan Chan has contributed s390x support
  336. * stacktrace capturing via libgcc's _Unwind_Backtrace was implemented
  337. (for architectures with missing or broken libunwind).
  338. * "emergency malloc" was implemented. Which unbreaks recursive calls
  339. to malloc/free from stacktrace capturing functions (such as glibc's
  340. backtrace() or libunwind on arm). It is enabled by
  341. --enable-emergency-malloc configure flag or by default on arm when
  342. --enable-stacktrace-via-backtrace is given. It is another fix for a
  343. number of common issues people had on platforms with missing or broken
  344. libunwind.
  345. * C++14 sized-deallocation is now supported (on gcc 5 and recent
  346. clangs). It is off by default and can be enabled at configure time
  347. via --enable-sized-delete. On GNU/Linux it can also be enabled at
  348. run-time by either TCMALLOC_ENABLE_SIZED_DELETE environment variable
  349. or by defining tcmalloc_sized_delete_enabled function which should
  350. return 1 to enable it.
  351. * we've lowered default value of transfer batch size to 512. Previous
  352. value (bumped up in 2.1) was too high and caused performance
  353. regression for some users. 512 should still give us performance
  354. boost for workloads that need higher transfer batch size while not
  355. penalizing other workloads too much.
  356. * Brian Silverman's patch finally stopped arming profiling timer
  357. unless profiling is started.
  358. * Andrew Morrow has contributed support for obtaining cache size of the
  359. current thread and softer idling (for use in MongoDB).
  360. * we've implemented few minor performance improvements, particularly
  361. on malloc fast-path.
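
As a reminder of what sized deallocation (mentioned in the list above) means at the language level, here is a small standalone sketch. It uses a class-specific operator delete so that the example does not replace the global allocator; the type Packet and the variable g_last_deleted_size are hypothetical. The point is only that the compiler hands the object size to the deallocation function, which lets an allocator skip looking the size up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Records the size passed to the sized deallocation function below.
std::size_t g_last_deleted_size = 0;

struct Packet {
  char payload[48];

  static void* operator new(std::size_t size) {
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
  }

  // Sized form: `delete ptr` passes sizeof(Packet) as the second argument.
  static void operator delete(void* p, std::size_t size) noexcept {
    g_last_deleted_size = size;
    std::free(p);
  }
};
```

The C++14 feature applies the same idea to the global operator delete(void*, std::size_t).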
  362. A number of smaller fixes were made. Many of them were contributed:
  363. * issue that caused spurious profiler_unittest.sh failures was fixed.
  364. * Jonathan Lambrechts contributed improved callgrind format support to
  365. pprof.
  366. * Matt Cross contributed better support for debug symbols in separate
  367. files to pprof.
  368. * Matt Cross contributed support for printing collapsed stack frame
  369. from pprof aimed at producing flame graphs.
  370. * Angus Gratton has contributed documentation fix mentioning that on
  371. windows only tcmalloc_minimal is supported.
  372. * Anton Samokhvalov has made tcmalloc use mi_force_{un,}lock on OSX
  373. instead of pthread_atfork. Which apparently fixes forking
  374. issues tcmalloc had on OSX.
  375. * Milton Chiang has contributed support for building 32-bit gperftools
  376. on arm8.
  377. * Patrick LoPresti has contributed support for specifying alternative
  378. profiling signal via CPUPROFILE_TIMER_SIGNAL environment variable.
  379. * Paolo Bonzini has contributed support configuring filename for
  380. sending malloc tracing output via TCMALLOC_TRACE_FILE environment
  381. variable.
  382. * user spotrh has enabled use of futex on arm.
  383. * user mitchblank has contributed better declaration for arg-less
  384. profiler functions.
  385. * Tom Conerly contributed proper freeing of memory allocated in
  386. HeapProfileTable::FillOrderedProfile on error paths.
  387. * user fdeweerdt has contributed curl arguments handling fix in pprof
  388. * Fredrik Mellbin fixed tcmalloc's idea of mangled new and delete
  389. symbols on windows x64
  390. * Dair Grant has contributed cacheline alignment for ThreadCache
  391. objects
  392. * Fredrik Mellbin has contributed updated windows/config.h for Visual
  393. Studio 2015 and other windows fixes.
  394. * we're not linking libpthread to libtcmalloc_minimal anymore. Instead
  395. libtcmalloc_minimal links to pthread symbols weakly. As a result
  396. single-threaded programs remain single-threaded when linking to or
  397. preloading libtcmalloc_minimal.so.
  398. * Boris Sazonov has contributed mips compilation fix and printf misuse
  399. fix in pprof.
  400. * Adhemerval Zanella has contributed alignment fixes for statically
  401. allocated variables.
  402. * Jens Rosenboom has contributed fixes for heap-profiler_unittest.sh
  403. * gshirishfree has contributed better description for GetStats method.
  404. * cyshi has contributed spinlock pause fix.
  405. * Chris Mayo has contributed --docdir argument support for configure.
  406. * Duncan Sands has contributed fix for function aliases.
  407. * Simon Que contributed better include for malloc_hook_c.h
  408. * user wmamrak contributed struct timespec fix for Visual Studio 2015.
  409. * user ssubotin contributed typo fix in PrintAvailability code.
  410. == 10 Jan 2015 ==
  411. gperftools 2.4 is out! The code is exactly same as 2.4rc.
  412. == 28 Dec 2014 ==
  413. gperftools 2.4rc is out!
  414. Here are changes since 2.3:
  415. * enabled aggressive decommit option by default. It was found to
  416. significantly improve memory fragmentation with negligible impact on
  417. performance. (Thanks to investigation work performed by Adhemerval
  418. Zanella)
  419. * added ./configure flags for tcmalloc pagesize and tcmalloc
  420. allocation alignment. Larger page sizes have been reported to
  421. improve performance occasionally. (Patch by Raphael Moreira Zinsly)
  422. * sped-up hot-path of malloc/free. By about 5% on static library and
  423. about 10% on shared library. Mainly due to more efficient checking
  424. of malloc hooks.
  425. * improved stacktrace capturing in cpu profiler (due to issue found by
  426. Arun Sharma). As part of that issue pprof's handling of cpu profiles
  427. was also improved.
  428. == 7 Dec 2014 ==
  429. gperftools 2.3 is out!
  430. Here are changes since 2.3rc:
  431. * (issue 658) correctly close socketpair fds on failure (patch by glider)
  432. * libunwind integration can be disabled at configure time (patch by
  433. Raphael Moreira Zinsly)
  434. * libunwind integration is disabled by default for ppc64 (patch by
  435. Raphael Moreira Zinsly)
  436. * libunwind integration is force-disabled for OSX. It was not used by
  437. default anyways. Fixes compilation issue I saw.
  438. == 2 Nov 2014 ==
  439. gperftools 2.3rc is out!
  440. Most small improvements in this release were made to pprof tool.
  441. New experimental Linux-only (for now) cpu profiling mode is a notable
  442. big improvement.
  443. Here are notable changes since 2.2.1:
  444. * (issue-631) fixed debugallocation miscompilation on mmap-less
  445. platforms (courtesy of user iamxujian)
  446. * (issue-630) reference to wrong PROFILE (vs. correct CPUPROFILE)
  447. environment variable was fixed (courtesy of WenSheng He)
  448. * pprof now has option to display stack traces in output for heap
  449. checker (courtesy of Michael Pasieka)
  450. * (issue-636) pprof web command now works on mingw
  451. * (issue-635) pprof now handles library paths that contain spaces
  452. (courtesy of user mich...@sebesbefut.com)
  453. * (issue-637) pprof now has an option to not strip template arguments
  454. (patch by jiakai)
  455. * (issue-644) possible out-of-bounds access in GetenvBeforeMain was
  456. fixed (thanks to user abyss.7)
  457. * (issue-641) pprof now has an option --show_addresses (thanks to user
  458. yurivict). New option prints instruction address in addition to
  459. function name in stack traces
  460. * (issue-646) pprof now works around some issues of addr2line
  461. reportedly when DWARF v4 format is used (patch by Adam McNeeney)
  462. * (issue-645) heap profiler exit message now includes remaining memory
  463. allocated info (patch by user yurivict)
  464. * pprof code that finds location of /proc/<pid>/maps in cpu profile
  465. files is now fixed (patch by Ricardo M. Correia)
  466. * (issue-654) pprof now handles "split text segments" feature of
  467. Chromium for Android. (patch by simonb)
  468. * (issue-655) potential deadlock on windows caused by early call to
  469. getenv in malloc initialization code was fixed (bug reported and fix
  470. proposed by user zndmitry)
  471. * incorrect detection of arm 6zk instruction set support
  472. (-mcpu=arm1176jzf-s) was fixed. (Reported by pedronavf on old
  473. issue-493)
  474. * new cpu profiling mode on Linux is now implemented. It sets up
  475. separate profiling timers for separate threads. Which improves
  476. accuracy of profiling on Linux a lot. It is off by default. And is
  477. enabled if both librt.f is loaded and CPUPROFILE_PER_THREAD_TIMERS
  478. environment variable is set. But note that all threads need to be
  479. registered via ProfilerRegisterThread.
  480. == 21 Jun 2014 ==
  481. gperftools 2.2.1 is out!
  482. Here's list of fixes:
  483. * issue-626 was closed. Which fixes initialization of statically linked
  484. tcmalloc.
  485. * issue 628 was closed. It adds missing header file into source
  486. tarball. This fixes compilation on PPC Linux.
  487. == 3 May 2014 ==
  488. gperftools 2.2 is out!
  489. Here are notable changes since 2.2rc:
  490. * issue 620 (crash on windows when c runtime dll is reloaded) was
  491. fixed
  492. == 19 Apr 2014 ==
  493. gperftools 2.2rc is out!
  494. Here are notable changes since 2.1:
  495. * a number of fixes for a number of compilers and platforms. Notably
  496. Visual Studio 2013, recent mingw with c++ threads and some OSX
  497. fixes.
  498. * we now have mips and mips64 support! (courtesy of Jovan Zelincevic,
  499. Jean Lee, user xiaoyur347 and others)
  500. * we now have aarch64 (aka arm64) support! (contributed by Riku
  501. Voipio)
  502. * there's now support for ppc64-le (by Raphael Moreira Zinsly and
  503. Adhemerval Zanella)
  504. * there's now some support of uclibc (contributed by user xiaoyur347)
  505. * google/ headers will now give you deprecation warning. They are
  506. deprecated since 2.0
  507. * there's now new api: tc_malloc_skip_new_handler (ported from chromium
  508. fork)
  509. * issue-557: added support for dumping heap profile via signal (by
  510. Jean Lee)
  511. * issue-567: Petr Hosek contributed SysAllocator support for windows
  512. * Joonsoo Kim contributed several speedups for central freelist code
  513. * TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable now works
  514. * configure scripts are now using AM_MAINTAINER_MODE. It'll only
  515. affect folks who modify source from .tar.gz and want automake to
  516. automatically rebuild Makefile-s. See automake documentation for
  517. that.
  518. * issue-586: detect main executable even if PIE is active (based on
  519. patch by user themastermind1). Notably, it fixes profiler use with
  520. ruby.
  521. * there is now support for switching backtrace capturing method at
  522. runtime (via TCMALLOC_STACKTRACE_METHOD and
  523. TCMALLOC_STACKTRACE_METHOD_VERBOSE environment variables)
  524. * there is new backtrace capturing method using -finstrument-functions
  525. prologues contributed by user xiaoyur347
  526. * few cases of crashes/deadlocks in profiler were addressed. See
  527. (famous) issue-66, issue-547 and issue-579.
  528. * issue-464 (memory corruption in debugalloc's realloc after
  529. memallign) is now fixed
  530. * tcmalloc is now able to release memory back to OS on windows
  531. (issue-489). The code was ported from chromium fork (by a number of
  532. authors).
  533. * Together with issue-489 we ported chromium's "aggressive decommit"
  534. mode. In this mode (settable via malloc extension and via
  535. environment variable TCMALLOC_AGGRESSIVE_DECOMMIT), free pages are
  536. returned back to OS immediately.
  537. * MallocExtension::instance() is now faster (based on patch by
  538. Adhemerval Zanella)
  539. * issue-610 (hangs on windows in multibyte locales) is now fixed
The following people helped with ideas or patches (based on git log,
some contributions purely in bugtracker might be missing): Andrew
C. Morrow, yurivict, Wang YanQing, Thomas Klausner,
davide.italiano@10gen.com, Dai MIKURUBE, Joon-Sung Um, Jovan
Zelincevic, Jean Lee, Petr Hosek, Ben Avison, drussel, Joonsoo Kim,
Hannes Weisbach, xiaoyur347, Riku Voipio, Adhemerval Zanella, Raphael
Moreira Zinsly

== 30 July 2013 ==

gperftools 2.1 is out!

Just a few fixes were merged after rc. Most notably:

* Some fixes for debug allocation on POWER/Linux

== 20 July 2013 ==

gperftools 2.1rc is out!

As a result of more than a year of contributions we're ready for the
2.1 release.

But before making that step I'd like to create an RC and make sure
people have a chance to test it.

Here are notable changes since 2.0:

* fixes for building on newer platforms. Notably, there's now initial
  support for x32 ABI (--enable-minimal only at this time)
* new getNumericProperty stats for cache sizes
* added HEAP_PROFILER_TIME_INTERVAL variable (see documentation)
* added environment variable to control heap size (TCMALLOC_HEAP_LIMIT_MB)
* added environment variable to disable release of memory back to OS
  (TCMALLOC_DISABLE_MEMORY_RELEASE)
* cpu profiler can now be switched on and off by sending it a signal
  (specified in CPUPROFILESIGNAL)
* (issue 491) fixed race-ful spinlock wake-ups
* (issue 496) added some support for fork-ing of process that is using
  tcmalloc
* (issue 368) improved memory fragmentation when large chunks of
  memory are allocated/freed

== 03 February 2012 ==

I've just released gperftools 2.0

The `google-perftools` project has been renamed to `gperftools`. I
(csilvers) am stepping down as maintainer, to be replaced by
David Chappelle. Welcome to the team, David! David has been an
active contributor to perftools in the past -- in fact, he's the
only person other than me that already has commit status. I am
pleased to have him take over as maintainer.

I have both renamed the project (the Google Code site was renamed a
few weeks ago), and bumped the major version number up to 2, to reflect
the new community ownership of the project. Almost all the
[http://gperftools.googlecode.com/svn/tags/gperftools-2.0/ChangeLog changes]
are related to the renaming.

The main functional change from google-perftools 1.10 is that
I've renamed the `google/` include-directory to be `gperftools/`
instead. New code should `#include <gperftools/tcmalloc.h>`/etc.
(Most users of perftools don't need any perftools-specific includes at
all, so this is mostly directed to "power users.") I've kept the old
names around as forwarding headers to the new, so `#include
<google/tcmalloc.h>` will continue to work.

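For code that has to build against both old and new installs, the include rename can be handled at preprocessing time. The following is only a sketch, assuming a compiler that supports `__has_include` (C++17 or a recent gcc/clang) and assuming the installed `tcmalloc.h` defines `TC_VERSION_STRING`; the helper name `DescribeTcmallocHeader` and the `HAVE_TCMALLOC_HEADER` macro are made up for illustration. It uses only preprocessor macros, so it does not need to link against tcmalloc.

```cpp
#include <string>

// Prefer the new include directory; fall back to the old forwarding header.
#if defined(__has_include)
#  if __has_include(<gperftools/tcmalloc.h>)
#    include <gperftools/tcmalloc.h>
#    define HAVE_TCMALLOC_HEADER "gperftools/tcmalloc.h"
#  elif __has_include(<google/tcmalloc.h>)
#    include <google/tcmalloc.h>
#    define HAVE_TCMALLOC_HEADER "google/tcmalloc.h"
#  endif
#endif

// Hypothetical helper: reports which header (if any) was picked up.
std::string DescribeTcmallocHeader() {
#if defined(HAVE_TCMALLOC_HEADER) && defined(TC_VERSION_STRING)
  return std::string(HAVE_TCMALLOC_HEADER) + ": " + TC_VERSION_STRING;
#elif defined(HAVE_TCMALLOC_HEADER)
  return std::string(HAVE_TCMALLOC_HEADER) + ": version macros unavailable";
#else
  return "tcmalloc headers not found";
#endif
}
```

Printing the result of `DescribeTcmallocHeader()` at startup is one way to confirm which include path a given build actually used.
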
(The other functional change which I snuck in is getting rid of some
bash-isms in one of the unittest driver scripts, so it could run on
Solaris.)

Note that some internal names still contain the text `google`, such as
the `google_malloc` internal linker section. I think that's a
trickier transition, and can happen in a future release (if at all).

=== 31 January 2012 ===

I've just released perftools 1.10

There is an API-incompatible change: several of the methods in the
`MallocExtension` class have changed from taking a `void*` to taking a
`const void*`. You should not be affected by this API change
unless you've written your own custom malloc extension that derives
from `MallocExtension`, but since it is a user-visible change, I have
upped the `.so` version number for this release.

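To see why a `void*` to `const void*` change matters to derived classes, consider the sketch below. The classes are hypothetical stand-ins (they are not the real `MallocExtension`, and the method name is used only for illustration): a subclass written against the old signature silently stops overriding the base method, while one marked `override` with the new signature is checked by the compiler.

```cpp
#include <cstddef>

// Hypothetical stand-in for a base class whose virtual method now takes
// const void* (this is NOT the real MallocExtension).
class ExtensionBase {
 public:
  virtual ~ExtensionBase() {}
  virtual std::size_t GetAllocatedSize(const void* p) { (void)p; return 0; }
};

// Written against the old API: the parameter is void*, so this declares a
// new function that hides the base method instead of overriding it.
class StaleExtension : public ExtensionBase {
 public:
  virtual std::size_t GetAllocatedSize(void* p) { (void)p; return 42; }
};

// Updated to the new signature; `override` makes the compiler verify it.
class UpdatedExtension : public ExtensionBase {
 public:
  std::size_t GetAllocatedSize(const void* p) override { (void)p; return 42; }
};

// Calls through the base interface, as library code would.
std::size_t QueryThroughBase(ExtensionBase& ext, const void* p) {
  return ext.GetAllocatedSize(p);
}
```

Calling `QueryThroughBase` on a `StaleExtension` reaches the base implementation, not the subclass one, which is the kind of silent breakage a custom extension would need to guard against after upgrading.
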
This release focuses on improvements to linux-syscall-support.h,
including ARM and PPC fixups and general cleanups. I hope this will
magically fix an array of bugs people have been seeing.

There is also exciting news on the porting front, with support for
patching win64 assembly contributed by IBM Canada! This is an
important step -- perhaps the most difficult -- to getting perftools
to work on 64-bit windows using the patching technique (it doesn't
affect the libc-modification technique). `preamble_patcher_test` has
been added to help test these changes; it is meant to compile under
x86_64, and won't work under win32.

For the full list of changes, including improved `HEAP_PROFILE_MMAP`
support, see the
[http://gperftools.googlecode.com/svn/tags/google-perftools-1.10/ChangeLog ChangeLog].

=== 24 January 2012 ===

The `google-perftools` Google Code page has been renamed to
`gperftools`, in preparation for the project being renamed to
`gperftools`. In the coming weeks, I'll be stepping down as
maintainer for the perftools project, and as part of that Google is
relinquishing ownership of the project; it will now be entirely
community run. The name change reflects that shift. The 'g' in
'gperftools' stands for 'great'. :-)

=== 23 December 2011 ===

I've just released perftools 1.9.1

I missed including a file in the tarball that is needed to compile on
ARM. If you are not compiling on ARM, or have successfully compiled
perftools 1.9, there is no need to upgrade.

=== 22 December 2011 ===

I've just released perftools 1.9

This change has a slew of improvements, from better ARM and freebsd
support, to improved performance by moving some code outside of locks,
to better pprof reporting of code with overloaded functions.

The full list of changes is in the
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.9/ChangeLog ChangeLog].

=== 26 August 2011 ===

I've just released perftools 1.8.3

The star-crossed 1.8 series continues; in 1.8.1, I had accidentally
removed some code that was needed for FreeBSD. (Without this code
many apps would crash at startup.) This release re-adds that code.
If you are not on FreeBSD, or are using FreeBSD with perftools 1.8 or
earlier, there is no need to upgrade.

=== 11 August 2011 ===

I've just released perftools 1.8.2

I was incorrectly calculating the patch-level in the configuration
step, meaning the TC_VERSION_PATCH #define in tcmalloc.h was wrong.
Since the testing framework checks for this, it was failing. Now it
should work again. This time, I was careful to re-run my tests after
upping the version number. :-)

If you don't care about the TC_VERSION_PATCH #define, there's no
reason to upgrade.

=== 26 July 2011 ===

I've just released perftools 1.8.1

I was missing an #include that caused the build to break under some
compilers, especially newer gcc's, that wanted it. This only affects
people who build from source, so only the .tar.gz file is updated from
perftools 1.8. If you didn't have any problems compiling perftools
1.8, there's no reason to upgrade.

=== 15 July 2011 ===

I've just released perftools 1.8

Of the many changes in this release, a good number pertain to porting.
I've revamped OS X support to use the malloc-zone framework; it should
now Just Work to link in tcmalloc, without needing
`DYLD_FORCE_FLAT_NAMESPACE` or the like. (This is a pretty major
change, so please feel free to report feedback at
google-perftools@googlegroups.com.) 64-bit Windows support is also
improved, as is ARM support, and the hooks are in place to improve
FreeBSD support as well.

On the other hand, I'm seeing hanging tests on Cygwin. I see the same
hanging even with (the old) perftools 1.7, so I'm guessing this is
either a problem specific to my Cygwin installation, or nobody is
trying to use perftools under Cygwin. If you can reproduce the
problem, and even better have a solution, you can report it at
google-perftools@googlegroups.com.

Internal changes include several performance and space-saving tweaks.
One is user-visible (but in "stealth mode", and otherwise
undocumented): you can compile with `-DTCMALLOC_SMALL_BUT_SLOW`. In
this mode, tcmalloc will use less memory overhead, at the cost of
running (likely not noticeably) slower.

There are many other changes as well, too numerous to recount here,
but present in the
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.8/ChangeLog ChangeLog].

=== 7 February 2011 ===

Thanks to endlessr..., who
[http://code.google.com/p/google-perftools/issues/detail?id=307 identified]
why some tests were failing under MSVC 10 in release mode. It does not look
like these failures point toward any problem with tcmalloc itself; rather, the
problem is with the test, which made some assumptions that broke under
some of the aggressive optimizations used in MSVC 10. I'll fix the test, but in
the meantime, feel free to use perftools even when compiled under MSVC
10.

=== 4 February 2011 ===

I've just released perftools 1.7

I apologize for the delay since the last release; so many great new
patches and bugfixes kept coming in (and are still coming in; I also
apologize to those folks who have to slip until the next release). I
picked this arbitrary time to make a cut.

Among the many new features in this release is a multi-megabyte
reduction in the amount of tcmalloc overhead under x86_64, improved
performance in the case of contention, and many many bugfixes,
especially architecture-specific bugfixes. See the
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.7/ChangeLog ChangeLog]
for full details.

One architecture-specific change of note is added comments in the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.7/README README]
for using tcmalloc under OS X. I'm trying to get my head around the
exact behavior of the OS X linker, and hope to have more improvements
for the next release, but I hope these notes help folks who have been
having trouble with tcmalloc on OS X.

*Windows users*: I've heard reports that some unittests fail on
Windows when compiled with MSVC 10 in Release mode. All tests pass in
Debug mode. I've not heard of any problems with earlier versions of
MSVC. I don't know if this is a problem with the runtime patching (so
the static patching discussed in README_windows.txt will still work),
a problem with perftools more generally, or a bug in MSVC 10. Anyone
with windows expertise that can debug this, I'd be glad to hear from!

=== 5 August 2010 ===

I've just released perftools 1.6

This version also has a large number of minor changes, including
support for `malloc_usable_size()` as a glibc-compatible alias to
`malloc_size()`, the addition of SVG-based output to `pprof`, and
experimental support for tcmalloc large pages, which may speed up
tcmalloc at the cost of greater memory use. To use tcmalloc large
pages, see the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.6/INSTALL INSTALL file];
for all changes, see the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.6/ChangeLog ChangeLog].

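As a rough illustration of what `malloc_usable_size()` reports, the sketch below asks the allocator how many bytes are actually usable in a block. It assumes a glibc-style `<malloc.h>` (so it is Linux-oriented), and the helper name `UsableSizeFor` is invented for this example. It works with whichever malloc is linked in, tcmalloc or the system one.

```cpp
#include <cstddef>
#include <cstdlib>
#include <malloc.h>  // malloc_usable_size (glibc-style; not on every platform)

// Hypothetical helper: allocates `requested` bytes and returns how many
// bytes the allocator says are usable in that block (0 on failure).
std::size_t UsableSizeFor(std::size_t requested) {
  void* p = std::malloc(requested);
  if (p == nullptr) return 0;
  std::size_t usable = malloc_usable_size(p);
  std::free(p);
  return usable;
}
```

The returned value is at least the requested size and is often larger, because allocators round requests up to internal size classes.
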
OS X NOTE: improvements in the profiler unittest have turned up an OS
X issue: in multithreaded programs, it seems that OS X often delivers
the profiling signal (from setitimer()) to the main thread, even when
it's sleeping, rather than spawned threads that are doing actual work.
If anyone knows details of how OS X handles SIGPROF events (from
setitimer) in threaded programs, and has insight into this problem,
please send mail to google-perftools@googlegroups.com.

To see if you're affected by this, look for profiling time that pprof
attributes to `___semwait_signal`. This is work being done in other
threads, that is being attributed to sleeping-time in the main thread.

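For readers unfamiliar with the mechanism, here is a minimal single-threaded sketch of the SIGPROF/`setitimer` pattern that the note refers to. It is not the profiler's actual code: the names `OnSigprof` and `CountProfTicks` and the 10 ms interval are assumptions chosen for illustration. The timer counts CPU time consumed by the process, and each expiry delivers SIGPROF to some thread of the process, which is where the OS X behavior described above comes in.

```cpp
#include <cstring>
#include <signal.h>
#include <sys/time.h>

static volatile sig_atomic_t g_ticks = 0;

// Signal handler: only bumps a counter (async-signal-safe).
static void OnSigprof(int) { g_ticks = g_ticks + 1; }

// Hypothetical helper: burns CPU until `wanted` SIGPROF ticks arrive (or a
// spin limit is hit) and returns the number of ticks seen, or -1 on error.
int CountProfTicks(int wanted) {
  g_ticks = 0;

  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sa.sa_handler = OnSigprof;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  struct sigaction old_sa;
  if (sigaction(SIGPROF, &sa, &old_sa) != 0) return -1;

  struct itimerval timer;
  timer.it_interval.tv_sec = 0;
  timer.it_interval.tv_usec = 10000;  // fire every 10 ms of CPU time
  timer.it_value = timer.it_interval;
  if (setitimer(ITIMER_PROF, &timer, nullptr) != 0) {
    sigaction(SIGPROF, &old_sa, nullptr);
    return -1;
  }

  // ITIMER_PROF only advances while the process is using CPU, so spin.
  volatile unsigned long spin = 0;
  while (g_ticks < wanted && spin < 4000000000UL) spin = spin + 1;

  struct itimerval stop;
  std::memset(&stop, 0, sizeof(stop));
  setitimer(ITIMER_PROF, &stop, nullptr);  // disarm the timer
  sigaction(SIGPROF, &old_sa, nullptr);    // restore the previous handler
  return g_ticks;
}
```

In a multithreaded program the handler would additionally have to record which thread it ran on; comparing that against the threads doing real work is one way to observe the misattribution described above.
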
=== 20 January 2010 ===

I've just released perftools 1.5

This version has a slew of changes, leading to somewhat faster
performance and improvements in portability. It adds features like
`ITIMER_REAL` support to the cpu profiler, and `tc_set_new_mode` to
mimic the windows function of the same name. Full details are in the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.5/ChangeLog ChangeLog].

=== 11 September 2009 ===

I've just released perftools 1.4

The major change in this release is the addition of a debugging malloc
library! If you link with `libtcmalloc_debug.so` instead of
`libtcmalloc.so` (and likewise for the `minimal` variants) you'll get
a debugging malloc, which will catch double-frees, writes to freed
data, `free`/`delete` and `delete`/`delete[]` mismatches, and even
(optionally) writes past the end of an allocated block.

We plan to do more with this library in the future, including
supporting it on Windows, and adding the ability to use the debugging
library with your default malloc in addition to using it with
tcmalloc.

There is also the usual complement of bug fixes, documented in the
ChangeLog, and a few minor user-tunable knobs added to components like
the system allocator.

=== 9 June 2009 ===

I've just released perftools 1.3

Like 1.2, this has a variety of bug fixes, especially related to the
Windows build. One of my bugfixes is to undo the weird `ld -r` fix to
`.a` files that I introduced in perftools 1.2: it caused problems on
too many platforms. I've reverted back to normal `.a` files. To work
around the original problem that prompted the `ld -r` fix, I now
provide `libtcmalloc_and_profiler.a`, for folks who want to link in
both.

The most interesting API change is that I now not only override
`malloc`/`free`/etc, I also expose them via a unique set of symbols:
`tc_malloc`/`tc_free`/etc. This enables clients to write their own
memory wrappers that use tcmalloc:
{{{
   void* malloc(size_t size) { void* r = tc_malloc(size); Log(r); return r; }
}}}

=== 17 April 2009 ===

I've just released perftools 1.2.

This is mostly a bugfix release. The major change is internal: I have
a new system for creating packages, which allows me to create 64-bit
packages. (I still don't do that for perftools, because there is
still no great 64-bit solution, with libunwind still giving problems
and --disable-frame-pointers not practical in every environment.)

Another interesting change involves Windows: a
[http://code.google.com/p/google-perftools/issues/detail?id=126 new patch]
allows users to choose to override malloc/free/etc on Windows
rather than patching, as is done now. This can be used to create
custom CRTs.

My fix for this
[http://groups.google.com/group/google-perftools/browse_thread/thread/1ff9b50043090d9d/a59210c4206f2060?lnk=gst&q=dynamic#a59210c4206f2060 bug involving static linking]
ended up being to make libtcmalloc.a and
libperftools.a a big .o file, rather than a true `ar` archive. This
should not yield any problems in practice -- in fact, it should be
better, since the heap profiler, leak checker, and cpu profiler will
now all work even with the static libraries -- but if you find it
does, please file a bug report.

Finally, the profile_handler_unittest provided in the perftools
testsuite (new in this release) is failing on FreeBSD. The end-to-end
test that uses the profile-handler is passing, so I suspect the
problem may be with the test, not the perftools code itself. However,
I do not know enough about how itimers work on FreeBSD to be able to
debug it. If you can figure it out, please let me know!

=== 11 March 2009 ===

I've just released perftools 1.1!

It has many changes since perftools 1.0 including

* Faster performance due to dynamically sized thread caches
* Better heap-sampling for more realistic profiles
* Improved support on Windows (MSVC 7.1 and cygwin)
* Better stacktraces in linux (using VDSO)
* Many bug fixes and feature requests

Note: if you use the CPU-profiler with applications that fork without
doing an exec right afterwards, please see the README. Recent testing
has shown that profiles are unreliable in that case. The problem has
existed since the first release of perftools. We expect to have a fix
for perftools 1.2. For more details, see
[http://code.google.com/p/google-perftools/issues/detail?id=105 issue 105].

Everyone who uses perftools 1.0 is encouraged to upgrade to perftools
1.1. If you see any problems with the new release, please file a bug
report at http://code.google.com/p/google-perftools/issues/list.

Enjoy!