Measure the performance of different implementations of cards using bench::mark().

library(cards)
library(reticulate)

phevaluator <- import("phevaluator")

Data Frame

Benchmark the initial data.frame implementation against an integer() approach similar to PH Evaluator's card.py.

New Deck

Create a new deck using new_deck_df() and an integer vector.

deck <- new_deck_df()
deck_int <- 0:51
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()     15µs   17.4µs    57757.    1.25KB     52.0
bench::mark(0:51)
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 0:51              0      1ns 54243985.        0B        0

While new_deck_df() is not designed to be called frequently, using an integer vector is much faster.

Deal

Compare performance of deal_hand_df() to sampling integers:

bench::mark(deal_hand_df(deck))
#> # A tibble: 1 × 6
#>   expression              min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>         <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck)   13.5µs   15.7µs    62932.    3.73KB     56.7
bench::mark(sample(deck_int, 5))
#> # A tibble: 1 × 6
#>   expression               min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>          <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 sample(deck_int, 5)   1.48µs    1.8µs   541573.      264B     54.2

deal_hand_df() is about 9 times slower than sample() (median 15.7µs versus 1.8µs).
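
The ratio can also be read directly from bench::mark() by benchmarking both expressions in one call. The sketch below is only an illustration: deck_df_demo is a hypothetical stand-in for the data frame deck (the real new_deck_df() may be structured differently), and the row-sampling expression is an assumption about how a data frame deal might work, not the package's deal_hand_df().

```r
# Hypothetical stand-in for a data frame deck; not the package's new_deck_df()
deck_df_demo <- data.frame(
  rank = rep(2:14, each = 4),
  suit = rep(c("C", "D", "H", "S"), times = 13)
)
deck_int_demo <- 0:51

if (requireNamespace("bench", quietly = TRUE)) {
  res <- bench::mark(
    df = deck_df_demo[sample(nrow(deck_df_demo), 5), ],
    int = sample(deck_int_demo, 5),
    check = FALSE,    # hands are random, so results cannot be compared
    relative = TRUE   # report timings relative to the fastest expression
  )
  print(res)
}
```

With relative = TRUE, the median column shows how many times slower each expression is than the fastest one, so no manual division is needed.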

Print

Test performance of print_hand_df() against a simple function that prints cards based on integers.

test_hand <- deal_hand_df(deck)
bench::mark(print_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand)   60.1µs   70.4µs    14198.    10.5KB     22.7

print_hand_int <- function(h) {
  cards <- paste0(rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4), c("C", "D", "H", "S"))
  paste(cards[h + 1], collapse = " ")
}
test_hand_int <- sample(0:51, 5)
bench::mark(print_hand_int(test_hand_int))
#> # A tibble: 1 × 6
#>   expression                         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_int(test_hand_int)    4.3µs      5µs   194139.      928B     19.4

print_hand_df() is 14-15 times slower than the integer approach.

Evaluate

Test performance of eval_hand_df() with a single hand and with randomly selected hands:

bench::mark(eval_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                   min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>              <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand)   42.7µs   49.4µs    20295.      37KB     16.4
bench::mark(eval_hand_df(deal_hand_df(deck)))
#> # A tibble: 1 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(deal_hand_df(deck))     47µs    62µs    16019.      264B     18.4

As expected for a naive poker hand evaluator, performance of eval_hand_df() is poor compared to fast algorithms.

Summary

An integer-based implementation would likely be much faster than the first implementation using data.frame. Rank and suit can be derived using integer division and modulo arithmetic respectively, and tabulate() is a faster replacement for rle().

0:51 %/% 4
#>  [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5  5  5  6
#> [26]  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11 11 12 12
#> [51] 12 12
tabulate(0:51 %/% 4 + 1, 13)
#>  [1] 4 4 4 4 4 4 4 4 4 4 4 4 4
0:51 %% 4
#>  [1] 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
#> [39] 2 3 0 1 2 3 0 1 2 3 0 1 2 3
tabulate(0:51 %% 4 + 1, 4)
#> [1] 13 13 13 13

bench::mark(rle(sort(sample(0:51, 5) %/% 4 + 1)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                         <bch:t> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 rle(sort(sample(0:51, 5)%/%4 + 1))  15.3µs 17.8µs    56065.      264B     16.8
bench::mark(tabulate(sample(0:51, 5) %/% 4 + 1, 13))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 tabulate(sample(0:51, 5)%/%4 + 1, … 2.17µs 2.67µs   371405.      264B        0

Note that the tabulate approach is 7 times faster than sorting and run length encoding.
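
To show how these pieces could fit together, here is a minimal sketch of a category evaluator built on tabulate(). The function hand_category_sketch() is a hypothetical helper written for this illustration, not the package's eval_hand(), and it assumes cards are integers in 0:51 with rank = id %/% 4 and suit = id %% 4.

```r
# Hypothetical helper, not the package's eval_hand(): classify a five-card
# hand of integers in 0:51, where rank = id %/% 4 and suit = id %% 4.
hand_category_sketch <- function(hand) {
  stopifnot(length(hand) == 5, !anyDuplicated(hand), all(hand %in% 0:51))
  rank_counts <- tabulate(hand %/% 4 + 1, 13)  # cards held of each rank
  suit_counts <- tabulate(hand %% 4 + 1, 4)    # cards held of each suit
  groups <- sort(rank_counts[rank_counts > 0], decreasing = TRUE)

  is_flush <- any(suit_counts == 5)
  ranks <- which(rank_counts == 1)             # 1 = deuce, ..., 13 = ace
  # Five distinct consecutive ranks, or the wheel A-2-3-4-5
  is_straight <- length(ranks) == 5 &&
    (ranks[5] - ranks[1] == 4 || identical(ranks, c(1L, 2L, 3L, 4L, 13L)))

  if (is_straight && is_flush) {
    "Straight Flush"
  } else if (groups[1] == 4) {
    "Four of a Kind"
  } else if (groups[1] == 3 && groups[2] == 2) {
    "Full House"
  } else if (is_flush) {
    "Flush"
  } else if (is_straight) {
    "Straight"
  } else if (groups[1] == 3) {
    "Three of a Kind"
  } else if (groups[1] == 2 && groups[2] == 2) {
    "Two Pair"
  } else if (groups[1] == 2) {
    "One Pair"
  } else {
    "High Card"
  }
}

hand_category_sketch(c(51L, 50L, 49L, 48L, 47L))
#> [1] "Four of a Kind"
hand_category_sketch(c(50L, 46L, 42L, 38L, 34L))
#> [1] "Straight Flush"
```

The sketch only returns the category; it does not order hands within a category.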

Integer

Benchmark the second implementation using integer().

New Deck

Create a new deck using new_deck() and new_deck_df().

deck_df <- new_deck_df()
deck <- new_deck()
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   15.2µs   17.6µs    56749.    1.25KB     17.0
bench::mark(new_deck())
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck()        0     41ns 16483091.        0B        0

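new_deck() is roughly 400 times faster by median (41ns versus 17.6µs), though the reported minimum of 0 shows the call is at the limit of timer resolution.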

Deal

Compare performance of deal_hand_df() and deal_hand():

bench::mark(deal_hand_df(deck_df))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck_df)   13.6µs   15.9µs    62865.      264B     18.9
bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)   1.64µs   2.05µs   480074.    3.02KB        0

deal_hand() is about 8 times faster.

Print

Test performance of print_hand_df() against print_hand().

test_hand_df <- deal_hand_df(deck_df)
test_hand <- deal_hand(deck)

bench::mark(print_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand_df)   59.6µs   69.5µs    14379.        0B     20.8
bench::mark(print_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand(test_hand)   4.18µs   4.84µs   204047.    8.52KB        0

print_hand() is about 14 times faster.

Evaluate

Test performance of eval_hand_df() and eval_hand() with a single hand.

bench::mark(eval_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand_df)   43.3µs   50.4µs    19826.        0B     16.4
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.48µs   1.72µs   560767.    27.5KB     56.1

eval_hand() is about 29 times faster, but is still expected to perform poorly compared to fast algorithms.

Multiple Hands

Compare performance evaluating and printing multiple hands.

bench::mark({
  deck <- new_deck_df()
  replicate(50, {
    hand <- deal_hand_df(deck)
    paste0(print_hand_df(hand), ": ", eval_hand_df(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck_df() replicate(… 6.57ms 6.86ms      146.    34.2KB     20.9
bench::mark({
  deck <- new_deck()
  replicate(50, {
    hand <- deal_hand(deck)
    paste0(print_hand(hand), ": ", eval_hand(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                           <bch> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck() replicate(50, … 478µs  531µs     1855.      74KB     12.5

Overall, the new implementation is about 13 times faster.

Python

Benchmark the integer() approach against PH Evaluator using reticulate.

Import

Test performance of phevaluator using reticulate::import(), starting with sample_cards():

bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)    1.6µs   2.05µs   479972.      264B        0
bench::mark(phevaluator$sample_cards(5L))
#> # A tibble: 1 × 6
#>   expression                        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$sample_cards(5L)   18.4µs   21.2µs    46311.        0B     9.26

phevaluator$sample_cards() is about 10 times slower than the R integer implementation.

Also test phevaluator$evaluate_cards() against the R integer method. evaluate_cards() expects five to seven integers passed as individual parameters.
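
Because the hand is stored as a single integer vector, it has to be spread across separate arguments before the call. The sketch below shows the do.call() and as.list() idiom with a hypothetical stand-in function, evaluate_five(), so it runs without Python; it is not part of phevaluator or the package.

```r
# Hypothetical stand-in for a function that takes cards as separate arguments
evaluate_five <- function(c1, c2, c3, c4, c5) sum(c1, c2, c3, c4, c5)

hand <- c(51L, 50L, 49L, 48L, 47L)

# as.list() turns the vector into a list of five elements, and do.call()
# passes each element as its own argument, so the two calls are equivalent
spread <- do.call(evaluate_five, as.list(hand))
direct <- evaluate_five(51L, 50L, 49L, 48L, 47L)
identical(spread, direct)
#> [1] TRUE
```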

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.48µs    1.8µs   508330.        0B     50.8
bench::mark(do.call(phevaluator$evaluate_cards, as.list(test_hand)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 20.9µs 24.1µs    41663.        0B     8.33

Surprisingly, phevaluator is about 13 times slower than eval_hand() here, and only about twice as fast as the original data frame implementation. Test again using some specific hands to avoid the overhead of do.call() and as.list():

four_aces <- c(51L, 50L, 49L, 48L, 47L)
royal_flush <- c(50L, 46L, 42L, 38L, 34L)

bench::mark(eval_hand(four_aces))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(four_aces)    697ns    902ns  1071335.        0B        0
bench::mark(phevaluator$evaluate_cards(51L, 50L, 49L, 48L, 47L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(51L, 50… 18.9µs 21.8µs    46051.        0B     13.8

bench::mark(eval_hand(royal_flush))
#> # A tibble: 1 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(royal_flush)   1.19µs   1.48µs   656659.        0B        0
bench::mark(phevaluator$evaluate_cards(50L, 46L, 42L, 38L, 34L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(50L, 46… 18.6µs 21.4µs    47016.        0B     9.41

Calling evaluate_cards() directly doesn’t significantly change the results. Test once more with random hands:

bench::mark(eval_hand(deal_hand(deck)))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(deal_hand(deck))   2.79µs   3.81µs   254239.      264B     25.4
bench::mark(do.call(phevaluator$evaluate_cards, as.list(deal_hand(deck))))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 22.9µs 26.6µs    37724.      264B     11.3

Conclusion: using phevaluator via reticulate::import() is not a faster way to evaluate hands. Note, however, that phevaluator$evaluate_cards() does more than eval_hand(): phevaluator computes the exact rank of a hand among all poker hands, while eval_hand() only determines the hand rank category.

C/C++

Benchmark the integer() approach against the C/C++ implementation of PH Evaluator using Rcpp.

The current version only implements eval_hand_phe(), which uses EvaluateCards() and describeCategory() to return the hand rank category.
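
To illustrate the difference between an exact hand rank and a hand rank category, the sketch below maps a rank to its category in R. It assumes ranks run from 1 (best) to 7462 (worst), following the standard 7462 equivalence classes of five-card poker hands. category_from_rank() is a hypothetical helper for illustration and is not how describeCategory() is implemented.

```r
# Assumption: ranks run from 1 (best) to 7462 (worst), one per distinct
# five-card hand value. Hypothetical helper, not part of the package.
category_from_rank <- function(rank) {
  stopifnot(all(rank >= 1 & rank <= 7462))
  # First rank of each category, best category first
  starts <- c(1, 11, 167, 323, 1600, 1610, 2468, 3326, 6186)
  labels <- c("Straight Flush", "Four of a Kind", "Full House", "Flush",
              "Straight", "Three of a Kind", "Two Pair", "One Pair",
              "High Card")
  # findInterval() returns the index of the last start value <= rank
  labels[findInterval(rank, starts)]
}

category_from_rank(c(1, 166, 167, 7462))
#> [1] "Straight Flush" "Four of a Kind" "Full House"     "High Card"
```

The category sizes behind the boundaries are 10, 156, 156, 1277, 10, 858, 858, 2860 and 1277, which sum to 7462.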

Evaluate

Test performance of eval_hand() and eval_hand_phe() with a single hand:

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.39µs   1.93µs   429286.        0B     42.9
bench::mark(eval_hand_phe(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_phe(test_hand)    410ns    492ns  1961129.    2.24KB        0

Somewhat surprisingly, eval_hand_phe() is only about 4 times faster than eval_hand(). However, eval_hand_phe() doesn’t just evaluate the hand rank category, it also determines the exact hand rank.

Reviewing the benchmarks on the PH Evaluator README and on my own system, the compiled C/C++ implementation should be capable of about 70 million hands per second, while eval_hand_phe() achieves about 2 million per second. This is likely due to the additional overhead of using R, and, more importantly, the additional call to describeCategory(), as the benchmark code only calls EvaluateCards().
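
As a quick check of the arithmetic, the throughput implied by the benchmark table above can be derived from the median time per call. The 70 million figure is the one quoted in the text; the variable names are only for illustration.

```r
median_sec <- 492e-9              # median time of eval_hand_phe(test_hand)
hands_per_sec <- 1 / median_sec   # roughly 2 million hands per second
native_rate <- 70e6               # rate quoted for the compiled C/C++ code

round(hands_per_sec)
round(native_rate / hands_per_sec, 1)  # how far the R wrapper falls short
```

This ignores the per-iteration overhead of bench::mark() itself, so it is only a rough estimate.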

A future implementation could implement the full pheval libraries and the C++ code in card_sampler.h to generate random hands in a standalone R package using Rcpp Modules.