  1. 06 Mar, 2018 1 commit
    • Spelling fixes (#543) · 69a52cff
      Wink Saville authored
      Upstream spelling fix changes from Pony, ec47ba8f565726414552f4bbf97d7,
      by ka7@la-evento.com that affected google/benchmark.
  2. 02 Mar, 2018 2 commits
    • Add Solaris support (#539) · 47df49e5
      alekseyshl authored
      * Add Solaris support
      
      Define BENCHMARK_OS_SOLARIS for Solaris.
      
      Platform specific implementations added:
      * number of CPUs detection
      * CPU cycles per second detection
      * Thread CPU usage
      * Process CPU usage
      
      * Remove the special case for per process CPU time for Solaris, it's the same as the default.
    • Use STCK to get the CPU clock on s390x (#540) · ff2c255a
      Robert Guo authored
  3. 21 Feb, 2018 4 commits
    • Print the executable name as part of the context. (#534) · 56f52ee2
      Eric authored
      * Print the executable name as part of the context.
      
      A common use case of the library is to run two different
      versions of a benchmark to compare them. In my experience
      this often means compiling a benchmark twice, renaming
      one of the executables, and then running the executables
      back-to-back. In this case the name of the executable
      is important contextual information.  Unfortunately the
      benchmark does not report this information.
      
      This patch adds the executable name to the context reported
      by the benchmark.
      
      * attempt to fix tests on Windows
      
      * attempt to fix tests on Windows
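A minimal sketch of the idea in the commit above, assuming the executable name is taken from `argv[0]` and prepended to the reported context. `FormatContextLine` and the exact wording of the line are hypothetical and are not taken from the library.

```cpp
#include <string>

// Hypothetical helper; the wording of the real context line is an assumption.
inline std::string FormatContextLine(const char* argv0) {
  // argv[0] may be null or empty on some platforms, so fall back to a marker.
  const std::string name =
      (argv0 != nullptr && argv0[0] != '\0') ? argv0 : "<unknown>";
  return "Running " + name + "\n";
}
```

With two renamed builds, the first context line then distinguishes the runs, for example `./bench_old` versus `./bench_new`.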
    • Fix typo in README.md (#535) · 19048b7b
      Jonathan Wakely authored
    • Ensure std::iterator_traits<StateIterator> instantiates. · 858688b8
      Eric Fiselier authored
      Due to ADL lookup performed on the begin and end functions
      of `for (auto _ : State)`, std::iterator_traits may get
      incidentally instantiated. This patch ensures the library
      can tolerate that.
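As a rough illustration of what lets `std::iterator_traits` instantiate cleanly, the sketch below gives a range-for iterator the five nested typedefs that the primary template reads. `CountdownIterator`, `CountdownRange`, and `CountIterations` are hypothetical names and are not the library's `StateIterator`.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical iterator for illustration only; not the library's StateIterator.
struct CountdownIterator {
  struct Value {};  // what `auto _` binds to in the range-for loop

  // The five nested typedefs std::iterator_traits looks for.
  typedef std::forward_iterator_tag iterator_category;
  typedef Value value_type;
  typedef Value reference;
  typedef Value pointer;
  typedef std::ptrdiff_t difference_type;

  std::size_t remaining;

  Value operator*() const { return Value(); }
  CountdownIterator& operator++() {
    --remaining;
    return *this;
  }
  bool operator!=(const CountdownIterator& other) const {
    return remaining != other.remaining;
  }
};

// Hypothetical range type so that `for (auto _ : range)` compiles.
struct CountdownRange {
  std::size_t count;
  CountdownIterator begin() const { return CountdownIterator{count}; }
  CountdownIterator end() const { return CountdownIterator{0}; }
};

// Instantiating the traits is well-formed because the typedefs exist.
static_assert(
    std::is_same<std::iterator_traits<CountdownIterator>::iterator_category,
                 std::forward_iterator_tag>::value,
    "iterator_traits must instantiate for the iterator");

// Counts how many times the range-for body runs.
inline std::size_t CountIterations(std::size_t n) {
  std::size_t runs = 0;
  for (auto _ : CountdownRange{n}) {
    (void)_;  // silence unused-variable warnings
    ++runs;
  }
  return runs;
}
```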
  4. 14 Feb, 2018 2 commits
    • Don't include <sys/resource.h> on Fuchsia. (#531) · 6ecf8a8e
      Ian McKellar authored
      * Don't include <sys/resource.h> on Fuchsia.
      
      It doesn't support POSIX resource measurement and timing APIs.
      
      Change-Id: Ifab4bac4296575f042c699db1ce5a4f7c2d82893
      
      * Add BENCHMARK_OS_FUCHSIA for Fuchsia
      
      Change-Id: Ic536f9625e413270285fbfd08471dcb6753ddad1
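The guard described above might look roughly like the following sketch. The `SKETCH_` macro names and `ProcessCpuSeconds` are hypothetical; the assumptions are that the compiler predefines `__Fuchsia__` on that platform and that `std::clock()` is an acceptable fallback where `getrusage` is unavailable.

```cpp
#include <ctime>

// Hypothetical macro names; the commit itself introduces BENCHMARK_OS_FUCHSIA.
#if defined(__Fuchsia__)
#define SKETCH_OS_FUCHSIA 1
#endif

// Only pull in the POSIX resource header where it is expected to work.
#if !defined(SKETCH_OS_FUCHSIA) && (defined(__unix__) || defined(__APPLE__))
#define SKETCH_HAS_RUSAGE 1
#include <sys/resource.h>
#endif

// Returns the CPU time consumed by this process, in seconds.
inline double ProcessCpuSeconds() {
#if defined(SKETCH_HAS_RUSAGE)
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) == 0) {
    return static_cast<double>(usage.ru_utime.tv_sec) +
           static_cast<double>(usage.ru_utime.tv_usec) * 1e-6 +
           static_cast<double>(usage.ru_stime.tv_sec) +
           static_cast<double>(usage.ru_stime.tv_usec) * 1e-6;
  }
#endif
  // Portable fallback (an assumption of this sketch, not the library's code).
  return static_cast<double>(std::clock()) / CLOCKS_PER_SEC;
}
```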
    • Improve State packing: put important members on first cache line. (#527) · 207b9c7a
      Eric authored
      * Improve State packing: put important members on first cache line.
      
      This patch does a few different things to ensure commonly accessed
      data is on the first cache line of the `State` object.
      
      First, it moves the `error_occurred_` member to reside after
      the `started_` and `finished_` bools, since there was internal
      padding there that was unused.
      
      Second, it moves `batch_leftover_` and `max_iterations` further up
      in the struct declaration. These variables are used in the calculation
      of `iterations()` which users might call within the loop. Therefore
      it's more important they exist on the first cache line.
      
      Finally, this patch turns the bool members into bitfields. Although
      this shouldn't have much of an effect currently, because padding is
      still needed between the last bool and the first size_t, it should
      help in future changes that require more "bool like" members.
      
      * Remove bitfield change for now
      
      * Move bools (and their padding) to end of "first cache line" vars.
      
      I think it makes the most sense to move the padding required
      following the group of bools to the end of the variables we want
      on the first cache line.
      
      This also means that the `total_iterations_` variable, which is the
      most accessed, has the same address as the State object.
      
      * Fix static assertion after moving bools
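The layout idea can be sketched with a stand-in struct. `PackedState` below is hypothetical: the hot member names follow the commit message, while the member types, the cold members, and the 64-byte cache line are assumptions of this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's State class.
struct PackedState {
  // Hot members first: the most frequently accessed one sits at offset 0,
  // so it shares its address with the object itself.
  std::uint64_t total_iterations_;
  std::uint64_t batch_leftover_;
  std::uint64_t max_iterations;
  // The bools are grouped at the end of the hot region, so the padding that
  // follows them does not separate the hot members from each other.
  bool started_;
  bool finished_;
  bool error_occurred_;
  // Cold members follow (assumed for illustration).
  std::uint64_t bytes_processed_;
  std::uint64_t items_processed_;
};

// Assumption: 64-byte cache lines, which is common but not universal.
constexpr std::size_t kAssumedCacheLine = 64;

static_assert(offsetof(PackedState, total_iterations_) == 0,
              "the hottest member should share the object's address");
static_assert(offsetof(PackedState, error_occurred_) + sizeof(bool) <=
                  kAssumedCacheLine,
              "all hot members should fit on the first cache line");
```

A static assertion of this kind is what has to be updated whenever members are reordered, which matches the last fixup in the commit.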
  5. 13 Feb, 2018 5 commits
    • Fixups following addition of KeepRunningBatch (296ec569) (#526) · 3924ee7b
      Samuel Panzer authored
      * Support State::KeepRunningBatch().
      
      State::KeepRunning() can take large amounts of time relative to quick
      operations (on the order of 1ns, depending on hardware). For such
      sensitive operations, it is recommended to run batches of repeated
      operations.
      
      This commit simplifies handling of total_iterations_. Rather than
      predecrementing such that total_iterations_ == 1 signals that
      KeepRunning() should exit, total_iterations_ == 0 now signals the
      intention for the benchmark to exit.
      
      * Create better fast path in State::KeepRunningBatch()
      
      * Replace int parameter with size_t to fix signed mismatch warnings
      
      * Ensure benchmark State has been started even on error.
      
      * Simplify KeepRunningBatch()
      
      * Implement KeepRunning() in terms of KeepRunningBatch().
      
      * Improve codegen by helping the compiler understand dead code.
      
      * Dummy commit for build bots' benefit.
    • Attempt to fix travis timeouts during apt-get. (#528) · 37dbe80f
      Eric authored
      * Attempt to fix travis timeouts during apt-get.
      
      During some builds, travis fails to update the apt-get indexes.
      This causes the build to fail in different ways.
      
      This patch attempts to avoid this issue by manually calling
      apt-get update. I'm not sure if it'll work, but it's worth a try.
      
      * Fix missing semicolons in command
    • Make output tests more stable on slow machines. · dd8dcc8d
      Eric Fiselier authored
      The appveyor bot sometimes fails because the time it
      outputs is 6 digits long, but the output test regex expects at most
      5 digits. This patch increases the size to 6 digits to placate the
      test. This should not *really* affect the correctness of the test.
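To make the digit-count change concrete, here is a small sketch with a hypothetical pattern that accepts up to six digits before the unit. The real output tests use their own, longer expressions; `MatchesTimeColumn` is an invented name.

```cpp
#include <regex>
#include <string>

// Hypothetical pattern: optional padding, one to six digits, then " ns".
inline bool MatchesTimeColumn(const std::string& cell) {
  static const std::regex kTimeColumn("^ *[0-9]{1,6} ns$");
  return std::regex_match(cell, kTimeColumn);
}
```

A six-digit time such as `201165 ns` matches, while a seven-digit one would still be rejected.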
    • Fix GTest workaround on MSVC · 562f9d25
      Eric Fiselier authored
    • Work around Gtest build failure caused by -Werror=unused-function. (#529) · 906749a4
      Eric authored
      We're propagating extra warning flags to the gtest build, which
      can cause it to fail. This patch prevents passing "-Wextra" to
      gtest, since the library itself doesn't test with that flag.
  6. 10 Feb, 2018 1 commit
    • Support State::KeepRunningBatch(). (#521) · 296ec569
      Samuel Panzer authored
      * Support State::KeepRunningBatch().
      
      State::KeepRunning() can take large amounts of time relative to quick
      operations (on the order of 1ns, depending on hardware). For such
      sensitive operations, it is recommended to run batches of repeated
      operations.
      
      This commit simplifies handling of total_iterations_. Rather than
      predecrementing such that total_iterations_ == 1 signals that
      KeepRunning() should exit, total_iterations_ == 0 now signals the
      intention for the benchmark to exit.
      
      * Create better fast path in State::KeepRunningBatch()
      
      * Replace int parameter with size_t to fix signed mismatch warnings
      
      * Ensure benchmark State has been started even on error.
      
      * Simplify KeepRunningBatch()
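The iteration accounting described above can be modelled in a few lines. `MiniState` and `RunToCompletion` are hypothetical and deliberately simplified: this is not the library's `State` class, it does no timing, and it assumes the batch size is nonzero.

```cpp
#include <cstddef>

// Hypothetical, simplified model of the accounting in the commit message.
class MiniState {
 public:
  explicit MiniState(std::size_t max_iterations)
      : total_iterations_(0),
        batch_leftover_(0),
        max_iterations_(max_iterations),
        started_(false) {}

  // Returns true while another batch of n iterations should run (n > 0).
  bool KeepRunningBatch(std::size_t n) {
    // Fast path: enough iterations remain for a whole batch.
    if (total_iterations_ >= n) {
      total_iterations_ -= n;
      return true;
    }
    // First call: load the iteration budget, then retry the fast path.
    if (!started_) {
      started_ = true;
      total_iterations_ = max_iterations_;
      if (total_iterations_ >= n) {
        total_iterations_ -= n;
        return true;
      }
    }
    // A partial batch remains: run one last full batch, record the excess.
    if (total_iterations_ != 0) {
      batch_leftover_ = n - total_iterations_;
      total_iterations_ = 0;
      return true;
    }
    // total_iterations_ == 0 is the signal to stop.
    return false;
  }

  // KeepRunning() expressed as a batch of one.
  bool KeepRunning() { return KeepRunningBatch(1); }

  // Iterations executed so far, including any overshoot of the last batch.
  std::size_t iterations() const {
    if (!started_) return 0;
    return max_iterations_ - total_iterations_ + batch_leftover_;
  }

 private:
  std::size_t total_iterations_;
  std::size_t batch_leftover_;
  std::size_t max_iterations_;
  bool started_;
};

// Drives the loop the way a benchmark body would; returns the batch count.
inline std::size_t RunToCompletion(MiniState& state, std::size_t batch_size) {
  std::size_t batches = 0;
  while (state.KeepRunningBatch(batch_size)) ++batches;
  return batches;
}
```

In this model a budget of 10 iterations with a batch size of 4 runs three batches, so 12 iterations execute and the overshoot of 2 is recorded as leftover.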
  7. 04 Feb, 2018 1 commit
  8. 29 Jan, 2018 1 commit
  9. 19 Jan, 2018 1 commit
  10. 12 Jan, 2018 1 commit
    • Wrap COMPILER macros. (#514) · 9f5694ce
      Dominic Hamon authored
      Some command lines or build systems may already set these (e.g., bazel), so
      make sure that takes priority.
      
      Fixes #513
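Wrapping a macro definition so that an externally supplied one takes priority could be sketched as follows. The `SKETCH_COMPILER_*` names and `DetectedCompiler` are hypothetical; the commit applies the same kind of guard to the library's own COMPILER macros.

```cpp
// Define each macro only if the build system has not already done so.
#if defined(__clang__)
#if !defined(SKETCH_COMPILER_CLANG)
#define SKETCH_COMPILER_CLANG
#endif
#elif defined(_MSC_VER)
#if !defined(SKETCH_COMPILER_MSVC)
#define SKETCH_COMPILER_MSVC
#endif
#elif defined(__GNUC__)
#if !defined(SKETCH_COMPILER_GCC)
#define SKETCH_COMPILER_GCC
#endif
#endif

// Reports which branch was taken, purely for demonstration.
inline const char* DetectedCompiler() {
#if defined(SKETCH_COMPILER_CLANG)
  return "clang";
#elif defined(SKETCH_COMPILER_MSVC)
  return "msvc";
#elif defined(SKETCH_COMPILER_GCC)
  return "gcc";
#else
  return "unknown";
#endif
}
```

Passing `-DSKETCH_COMPILER_GCC` on the command line no longer triggers a redefinition, because the inner guard skips the `#define`.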
  11. 05 Jan, 2018 3 commits
    • Merge pull request #509 from efcs/fix-gtest-install · e1c3a83b
      Eric authored
      Prevent GTest and GMock from being installed with Google Benchmark.
    • Prevent GTest and GMock from being installed with Google Benchmark. · 778b85a7
      Eric Fiselier authored
      When users satisfy the GTest dependency by placing a googletest
      directory in the project, the targets from GTest and GMock incorrectly
      get installed alongside this library. We shouldn't be installing
      our test dependencies.
      
      This patch forces the options that control installation for googletest
      to OFF.
    • Updated documentation. (#503) · 052421c8
      Winston Du authored
      For people who, like me, get this library via CMake's ExternalProject_Add.
      I would like a long-term tutorial from someone who really understands CMake on how to actually link one external project's dependencies to another added external project.
  12. 14 Dec, 2017 1 commit
  13. 13 Dec, 2017 2 commits
    • Add support for GTest based unit tests. (#485) · 7db02be2
      Eric authored
      * Add support for GTest based unit tests.
      
      As Dominic and I have previously discussed, there is some
      need/desire to improve the testing situation in Google Benchmark.
      
      One step to fixing this problem is to make it easier to write
      unit tests by adding support for GTest, which is what this patch does.
      
      By default it looks for an installed version of GTest. However the
      user can specify -DBENCHMARK_BUILD_EXTERNAL_GTEST=ON to instead
      download, build, and use a copy of gtest from source. This is
      quite useful when Benchmark is being built in non-standard configurations,
      such as against libc++ or in 32 bit mode.
    • Document new 'v2' branch meant for unstable development. · de725e5a
      Eric Fiselier authored
      This patch documents the newly added v2 branch, which will
      be used to stage, test, and receive feedback on upcoming
      features, most of which will be breaking changes which can't
      be directly applied to master.
  14. 07 Dec, 2017 1 commit
  15. 04 Dec, 2017 1 commit
  16. 30 Nov, 2017 2 commits
  17. 29 Nov, 2017 2 commits
  18. 27 Nov, 2017 1 commit
    • Console reporter: properly account for the length of custom counter names (#484) · ec5684ed
      Roman Lebedev authored
      Old output example:
      ```
      Benchmark                                                 Time           CPU Iterations  CPUTime,s   Pixels/s ThreadingFactor
      ------------------------------------------------------------------------------------------------------------------------------
      20170525_0036TEST.RAF/threads:8/real_time                45 ms         45 ms         16   0.718738 79.6277M/s   0.999978   2.41419GB/s    22.2613 items/s FileSize,MB=111.050781; MPix=57.231360
      ```
      
      New output example:
      ```
      Benchmark                                                 Time           CPU Iterations  CPUTime,s   Pixels/s ThreadingFactor
      ------------------------------------------------------------------------------------------------------------------------------
      20170525_0036TEST.RAF/threads:8/real_time                45 ms         45 ms         16   0.713575 80.1713M/s        0.999571   2.43067GB/s    22.4133 items/s FileSize,MB=111.050781; MPix=57.231360
      ```
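The alignment fix can be sketched as sizing each counter column by its header. Everything here is hypothetical, including the helper names and the minimum width of 10; the reporter's actual code may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Assumed minimum column width; the reporter's real value may differ.
const std::size_t kMinCounterWidth = 10;

// A column must be at least as wide as its header.
inline std::size_t CounterColumnWidth(const std::string& name) {
  return std::max(kMinCounterWidth, name.size());
}

// Right-aligns text in a field of the given width, never truncating.
inline std::string RightAlign(const std::string& text, std::size_t width) {
  if (text.size() >= width) return text;
  return std::string(width - text.size(), ' ') + text;
}

// Builds one row: each cell is padded to the width of its column's header.
inline std::string FormatCounterRow(const std::vector<std::string>& names,
                                    const std::vector<std::string>& cells) {
  std::string row;
  const std::size_t n = std::min(names.size(), cells.size());
  for (std::size_t i = 0; i < n; ++i) {
    row += ' ';
    row += RightAlign(cells[i], CounterColumnWidth(names[i]));
  }
  return row;
}
```

Under these assumptions a 15-character header such as `ThreadingFactor` gets a 15-character column, so its value lines up under the header instead of shifting the columns after it.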
  19. 26 Nov, 2017 2 commits
    • Improve BENCHMARK_UNREACHABLE() implementation. · 2ec7399c
      Eric Fiselier authored
      This patch primarily changes the BENCHMARK_UNREACHABLE()
      implementation under MSVC to use __assume(false) instead
      of being a NORETURN function, which ironically caused
      unreachable code warnings.
      
      Second, since the NORETURN function attempt generated the
      warnings we meant to avoid, it has been replaced with a dummy
      null statement.
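A sketch of an unreachable-code marker along the lines described above. `SKETCH_UNREACHABLE`, `Unit`, and `UnitSuffix` are hypothetical names; the `__assume(false)` branch and the null-statement fallback come from the commit message, while the `__builtin_unreachable()` branch for GCC and Clang is an assumption of this sketch.

```cpp
// Hypothetical macro; the library's own macro is BENCHMARK_UNREACHABLE().
#if defined(_MSC_VER)
#define SKETCH_UNREACHABLE() __assume(false)
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE() __builtin_unreachable()
#else
#define SKETCH_UNREACHABLE() ((void)0)  // dummy null statement
#endif

enum class Unit { kNanosecond, kMicrosecond, kMillisecond };

// Maps a unit to its suffix; every enumerator returns inside the switch.
inline const char* UnitSuffix(Unit unit) {
  switch (unit) {
    case Unit::kNanosecond:
      return "ns";
    case Unit::kMicrosecond:
      return "us";
    case Unit::kMillisecond:
      return "ms";
  }
  // Control never gets here for valid enumerators; the marker documents
  // that and lets the compiler drop the fall-through path.
  SKETCH_UNREACHABLE();
}
```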
    • Improve CPU Cache info reporting -- Add Windows support. (#486) · 11dc3682
      Eric authored
      * Improve CPU Cache info reporting -- Add Windows support.
      
      This patch does a couple of things regarding CPU Cache reporting.

      First, it adds an implementation on Windows. Second, it fixes
      the JSONReporter to correctly (and actually) output the CPU
      configuration information.

      Third, it detects and reports the number of
      physical CPUs that share the same cache.
  20. 22 Nov, 2017 1 commit
    • Refactor System information collection -- Add CPU Cache Info (#483) · 27e0b439
      Eric authored
      * Refactor System information collection.
      
      This patch refactors the system information collection,
      and in particular information about the target CPU. The
      motivation is to make it easier to access CPU information,
      and easier to add new information as need be.
      
      This patch additionally adds information about the cache
      sizes of the CPU.
      
      * Address review comments: Clean up integer types.
      
      This commit cleans up the integer types used in ValueUnion to
      follow the Google style guide.
      
      Additionally it adds a BENCHMARK_UNREACHABLE macro to assist
      in documenting/catching unreachable code paths.
      
      * Rename ValueUnion accessors.
  21. 17 Nov, 2017 1 commit
    • Add NetBSD support (#482) · aad6a5fa
      Kamil Rytarowski authored
      Define BENCHMARK_OS_NETBSD for NetBSD.
      
      Add detection of cpuinfo_cycles_per_second and cpuinfo_num_cpus.
      This code shares detection of these properties with FreeBSD.
  22. 15 Nov, 2017 1 commit
  23. 13 Nov, 2017 1 commit
  24. 07 Nov, 2017 2 commits
    • [Tools] A new, more versatile benchmark output compare tool (#474) · 5e66248b
      Roman Lebedev authored
      * [Tools] A new, more versatile benchmark output compare tool
      
      Sometimes there is more than one implementation of some functionality,
      and the obvious use case is to benchmark them to see which is better.

      Currently, there is no easy way to compare the benchmarking results
      in that case. The obvious solution is to have multiple binaries, each
      one containing and running one implementation. Each binary must then
      use exactly the same benchmark family name, which is super bad,
      because now the binary name has to carry all the info about the
      benchmark family...

      What if I tell you that is not the solution?
      What if we could avoid producing one binary per benchmark family,
      with the same family name used in each binary,
      and instead keep all the related families in one binary,
      with their proper names, AND still be able to compare them?
      
      There are three modes of operation:
      1. Just compare two benchmarks, which is what `compare_bench.py` did:
      ```
      $ ../tools/compare.py benchmarks ./a.out ./a.out
      RUNNING: ./a.out --benchmark_out=/tmp/tmprBT5nW
      Run on (8 X 4000 MHz CPU s)
      2017-11-07 21:16:44
      ------------------------------------------------------
      Benchmark               Time           CPU Iterations
      ------------------------------------------------------
      BM_memcpy/8            36 ns         36 ns   19101577   211.669MB/s
      BM_memcpy/64           76 ns         76 ns    9412571   800.199MB/s
      BM_memcpy/512          84 ns         84 ns    8249070   5.64771GB/s
      BM_memcpy/1024        116 ns        116 ns    6181763   8.19505GB/s
      BM_memcpy/8192        643 ns        643 ns    1062855   11.8636GB/s
      BM_copy/8             222 ns        222 ns    3137987   34.3772MB/s
      BM_copy/64           1608 ns       1608 ns     432758   37.9501MB/s
      BM_copy/512         12589 ns      12589 ns      54806   38.7867MB/s
      BM_copy/1024        25169 ns      25169 ns      27713   38.8003MB/s
      BM_copy/8192       201165 ns     201112 ns       3486   38.8466MB/s
      RUNNING: ./a.out --benchmark_out=/tmp/tmpt1wwG_
      Run on (8 X 4000 MHz CPU s)
      2017-11-07 21:16:53
      ------------------------------------------------------
      Benchmark               Time           CPU Iterations
      ------------------------------------------------------
      BM_memcpy/8            36 ns         36 ns   19397903   211.255MB/s
      BM_memcpy/64           73 ns         73 ns    9691174   839.635MB/s
      BM_memcpy/512          85 ns         85 ns    8312329   5.60101GB/s
      BM_memcpy/1024        118 ns        118 ns    6438774   8.11608GB/s
      BM_memcpy/8192        656 ns        656 ns    1068644   11.6277GB/s
      BM_copy/8             223 ns        223 ns    3146977   34.2338MB/s
      BM_copy/64           1611 ns       1611 ns     435340   37.8751MB/s
      BM_copy/512         12622 ns      12622 ns      54818   38.6844MB/s
      BM_copy/1024        25257 ns      25239 ns      27779   38.6927MB/s
      BM_copy/8192       205013 ns     205010 ns       3479    38.108MB/s
      Comparing ./a.out to ./a.out
      Benchmark                 Time             CPU      Time Old      Time New       CPU Old       CPU New
      ------------------------------------------------------------------------------------------------------
      BM_memcpy/8            +0.0020         +0.0020            36            36            36            36
      BM_memcpy/64           -0.0468         -0.0470            76            73            76            73
      BM_memcpy/512          +0.0081         +0.0083            84            85            84            85
      BM_memcpy/1024         +0.0098         +0.0097           116           118           116           118
      BM_memcpy/8192         +0.0200         +0.0203           643           656           643           656
      BM_copy/8              +0.0046         +0.0042           222           223           222           223
      BM_copy/64             +0.0020         +0.0020          1608          1611          1608          1611
      BM_copy/512            +0.0027         +0.0026         12589         12622         12589         12622
      BM_copy/1024           +0.0035         +0.0028         25169         25257         25169         25239
      BM_copy/8192           +0.0191         +0.0194        201165        205013        201112        205010
      ```
      
      2. Compare two different filters of one benchmark:
      (for simplicity, the benchmark is executed twice)
      ```
      $ ../tools/compare.py filters ./a.out BM_memcpy BM_copy
      RUNNING: ./a.out --benchmark_filter=BM_memcpy --benchmark_out=/tmp/tmpBWKk0k
      Run on (8 X 4000 MHz CPU s)
      2017-11-07 21:37:28
      ------------------------------------------------------
      Benchmark               Time           CPU Iterations
      ------------------------------------------------------
      BM_memcpy/8            36 ns         36 ns   17891491   211.215MB/s
      BM_memcpy/64           74 ns         74 ns    9400999   825.646MB/s
      BM_memcpy/512          87 ns         87 ns    8027453   5.46126GB/s
      BM_memcpy/1024        111 ns        111 ns    6116853    8.5648GB/s
      BM_memcpy/8192        657 ns        656 ns    1064679   11.6247GB/s
      RUNNING: ./a.out --benchmark_filter=BM_copy --benchmark_out=/tmp/tmpAvWcOM
      Run on (8 X 4000 MHz CPU s)
      2017-11-07 21:37:33
      ----------------------------------------------------
      Benchmark             Time           CPU Iterations
      ----------------------------------------------------
      BM_copy/8           227 ns        227 ns    3038700   33.6264MB/s
      BM_copy/64         1640 ns       1640 ns     426893   37.2154MB/s
      BM_copy/512       12804 ns      12801 ns      55417   38.1444MB/s
      BM_copy/1024      25409 ns      25407 ns      27516   38.4365MB/s
      BM_copy/8192     202986 ns     202990 ns       3454   38.4871MB/s
      Comparing BM_memcpy to BM_copy (from ./a.out)
      Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
      --------------------------------------------------------------------------------------------------------------------
      [BM_memcpy vs. BM_copy]/8            +5.2829         +5.2812            36           227            36           227
      [BM_memcpy vs. BM_copy]/64          +21.1719        +21.1856            74          1640            74          1640
      [BM_memcpy vs. BM_copy]/512        +145.6487       +145.6097            87         12804            87         12801
      [BM_memcpy vs. BM_copy]/1024       +227.1860       +227.1776           111         25409           111         25407
      [BM_memcpy vs. BM_copy]/8192       +308.1664       +308.2898           657        202986           656        202990
      ```
      
      3. Compare filter one from benchmark one to filter two from benchmark two:
      (for simplicity, the benchmark is executed twice)
      ```
      $ ../tools/compare.py benchmarksfiltered ./a.out BM_memcpy ./a.out BM_copy
      RUNNING: ./a.out --benchmark_filter=BM_memcpy --benchmark_out=/tmp/tmp_FvbYg
      Run on (8 X 4000 MHz CPU s)
      2017-11-07 21:38:27
      ------------------------------------------------------
      Benchmark               Time           CPU Iterations
      ------------------------------------------------------
      BM_memcpy/8            37 ns         37 ns   18953482   204.118MB/s
      BM_memcpy/64           74 ns         74 ns    9206578   828.245MB/s
      BM_memcpy/512          91 ns         91 ns    8086195   5.25476GB/s
      BM_memcpy/1024        120 ns        120 ns    5804513   7.95662GB/s
      BM_memcpy/8192        664 ns        664 ns    1028363   11.4948GB/s
      RUNNING: ./a.out --benchmark_filter=BM_copy --benchmark_out=/tmp/tmpDfL5iE
      Run on (8 X 4000 MHz CPU s)
      2017-11-07 21:38:32
      ----------------------------------------------------
      Benchmark             Time           CPU Iterations
      ----------------------------------------------------
      BM_copy/8           230 ns        230 ns    2985909   33.1161MB/s
      BM_copy/64         1654 ns       1653 ns     419408   36.9137MB/s
      BM_copy/512       13122 ns      13120 ns      53403   37.2156MB/s
      BM_copy/1024      26679 ns      26666 ns      26575   36.6218MB/s
      BM_copy/8192     215068 ns     215053 ns       3221   36.3283MB/s
      Comparing BM_memcpy (from ./a.out) to BM_copy (from ./a.out)
      Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
      --------------------------------------------------------------------------------------------------------------------
      [BM_memcpy vs. BM_copy]/8            +5.1649         +5.1637            37           230            37           230
      [BM_memcpy vs. BM_copy]/64          +21.4352        +21.4374            74          1654            74          1653
      [BM_memcpy vs. BM_copy]/512        +143.6022       +143.5865            91         13122            91         13120
      [BM_memcpy vs. BM_copy]/1024       +221.5903       +221.4790           120         26679           120         26666
      [BM_memcpy vs. BM_copy]/8192       +322.9059       +323.0096           664        215068           664        215053
      ```
      
      * [Docs] Document tools/compare.py
      
      * [docs] Document how the change is calculated
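The Time and CPU columns in the comparison tables above look like relative differences against the old measurement. The sketch below assumes that formula, along with a particular treatment of zero values; `RelativeChange` is a hypothetical name, and the tool's own documentation is the authority on the exact calculation.

```cpp
#include <cmath>

// Assumed formula: the change relative to the old measurement.
inline double RelativeChange(double old_value, double new_value) {
  if (old_value == 0.0 && new_value == 0.0) return 0.0;
  if (old_value == 0.0) {
    // Assumption: use the mean of both values as the denominator.
    return (new_value - old_value) / ((old_value + new_value) / 2.0);
  }
  return (new_value - old_value) / std::fabs(old_value);
}
```

Applied to the rounded table values 36 and 227, this gives about +5.31, close to the +5.2829 shown above, which was presumably computed from the unrounded timings.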
    • Reorder inline to avoid warning on MSVC (#469) · 90aa8665
      Dominic Hamon authored
      Fixes #467