Measure the performance of different implementations of cards using bench::mark().

library(cards)
library(reticulate)

phevaluator <- import("phevaluator")

Data Frame

Benchmark the initial implementation using data.frame compared to an integer() approach similar to PH Evaluator card.py.

New Deck

Create a new deck using new_deck_df() and an integer vector.

deck <- new_deck_df()
deck_int <- 0:51
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   20.5µs   21.5µs    44850.    1.25KB     53.9
bench::mark(0:51)
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 0:51              0     41ns 26275285.        0B        0

While new_deck_df() is not designed to be called frequently, using an integer vector is much faster.

Deal

Compare performance of deal_hand_df() to sampling integers:

bench::mark(deal_hand_df(deck))
#> # A tibble: 1 × 6
#>   expression              min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>         <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck)   18.2µs     20µs    47860.    3.73KB     19.2
bench::mark(sample(deck_int, 5))
#> # A tibble: 1 × 6
#>   expression               min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>          <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 sample(deck_int, 5)   2.05µs   2.42µs   388419.      264B        0

deal_hand_df() is about 8 times slower than sample() by median time (20µs versus 2.42µs).
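
Ratios like this can also be read straight from bench::mark() by passing both expressions in a single call. The sketch below is only an illustration: deck_df_demo is a hypothetical stand-in data frame, not the object returned by new_deck_df(), and the row-sampling expression is an assumption about what a data frame deal might look like, not the package's deal_hand_df().

```r
# Hypothetical stand-ins (not the package's own objects): a 52-row data frame
# deck and the integer deck used above.
deck_df_demo <- data.frame(
  rank = rep(2:14, each = 4),
  suit = rep(c("C", "D", "H", "S"), times = 13)
)
deck_int <- 0:51

if (requireNamespace("bench", quietly = TRUE)) {
  res <- bench::mark(
    data_frame = deck_df_demo[sample(nrow(deck_df_demo), 5), ],
    integer    = sample(deck_int, 5),
    check      = FALSE, # results are random, so equality cannot be checked
    relative   = TRUE   # report timings relative to the fastest expression
  )
  print(res)
}
```

With relative = TRUE, the fastest expression shows 1 in the timing columns and the other row shows how many times slower it is, which avoids dividing medians by hand.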

Print

Test performance of print_hand_df() against a simple function that prints cards based on integers.

test_hand <- deal_hand_df(deck)
bench::mark(print_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand)   80.2µs   85.4µs    11582.    10.5KB     16.5

print_hand_int <- function(h) {
  cards <- paste0(rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4), c("C", "D", "H", "S"))
  paste0(cards[h + 1], collapse = " ")
}
test_hand_int <- sample(0:51, 5)
bench::mark(print_hand_int(test_hand_int))
#> # A tibble: 1 × 6
#>   expression                         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_int(test_hand_int)   5.66µs   6.03µs   154248.      928B        0

print_hand_df() is about 14 times slower than the integer approach.
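
Note that print_hand_int() rebuilds the 52 card labels on every call. A possible variant, sketched here with the hypothetical names card_labels and print_hand_cached(), builds the lookup table once and only indexes into it. Whether this is measurably faster has not been benchmarked here.

```r
# Build the 52 labels once: card 0 is "2C", card 51 is "AS".
card_labels <- paste0(
  rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4),
  c("C", "D", "H", "S")
)

# Hypothetical variant of print_hand_int() that reuses the lookup table.
print_hand_cached <- function(h) {
  paste0(card_labels[h + 1], collapse = " ")
}

print_hand_cached(c(50L, 46L, 42L, 38L, 34L))
#> [1] "AH KH QH JH TH"
```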

Evaluate

Test performance of eval_hand_df() with a single hand and with randomly selected hands:

bench::mark(eval_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                   min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>              <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand)   56.3µs   60.4µs    16320.      37KB     14.4
bench::mark(eval_hand_df(deal_hand_df(deck)))
#> # A tibble: 1 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(deal_hand_df(deck))   63.1µs    80µs    12366.      264B     12.3

As expected for a naive poker hand evaluator, performance of eval_hand_df() is poor compared to fast algorithms.

Summary

An implementation using integer would likely be much faster than the first implementation using data.frame. Rank and suit can be derived using integer division and modulo arithmetic, respectively, and tabulate() is a faster replacement for rle().

0:51 %/% 4
#>  [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5  5  5  6
#> [26]  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11 11 12 12
#> [51] 12 12
tabulate(0:51 %/% 4 + 1, 13)
#>  [1] 4 4 4 4 4 4 4 4 4 4 4 4 4
0:51 %% 4
#>  [1] 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
#> [39] 2 3 0 1 2 3 0 1 2 3 0 1 2 3
tabulate(0:51 %% 4 + 1, 4)
#> [1] 13 13 13 13

bench::mark(rle(sort(sample(0:51, 5) %/% 4 + 1)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                         <bch:t> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 rle(sort(sample(0:51, 5)%/%4 + 1))  20.5µs 21.9µs    44963.      264B     13.5
bench::mark(tabulate(sample(0:51, 5) %/% 4 + 1, 13))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 tabulate(sample(0:51, 5)%/%4 + 1, … 2.95µs 3.24µs   300685.      264B     30.1

Note that the tabulate() approach is about 7 times faster than sorting and run-length encoding.
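
To show how rank and suit counts from tabulate() could drive a category evaluator, here is a minimal sketch. The function hand_category() and its category labels are hypothetical and written for this illustration. It is not the package's eval_hand(), and it assumes the 0:51 encoding above (rank = card %/% 4, suit = card %% 4).

```r
# Hypothetical sketch: classify a five-card hand from tabulate() counts.
hand_category <- function(hand) {
  stopifnot(length(hand) == 5, all(hand %in% 0:51), !anyDuplicated(hand))

  ranks <- hand %/% 4 # 0 = deuce, ..., 12 = ace
  suits <- hand %% 4  # 0:3

  rank_counts <- tabulate(ranks + 1, 13)
  top <- sort(rank_counts, decreasing = TRUE)[1:2] # two largest rank counts

  is_flush <- any(tabulate(suits + 1, 4) == 5)

  # A straight has five distinct ranks spanning four steps, or the wheel A-2-3-4-5.
  present <- which(rank_counts > 0) - 1
  is_straight <- length(present) == 5 &&
    (max(present) - min(present) == 4 || setequal(present, c(0, 1, 2, 3, 12)))

  if (is_straight && is_flush) "Straight Flush"
  else if (top[1] == 4) "Four of a Kind"
  else if (top[1] == 3 && top[2] == 2) "Full House"
  else if (is_flush) "Flush"
  else if (is_straight) "Straight"
  else if (top[1] == 3) "Three of a Kind"
  else if (top[1] == 2 && top[2] == 2) "Two Pair"
  else if (top[1] == 2) "One Pair"
  else "High Card"
}

hand_category(c(51L, 50L, 49L, 48L, 47L))
#> [1] "Four of a Kind"
hand_category(c(50L, 46L, 42L, 38L, 34L))
#> [1] "Straight Flush"
```

No sorting of the hand itself is needed: the two tabulate() calls produce fixed-length count vectors, and every category test reads from those.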

Integer

Benchmark the second implementation using integer().

New Deck

Create a new deck using new_deck() and new_deck_df().

deck_df <- new_deck_df()
deck <- new_deck()
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   20.3µs   21.6µs    45423.    1.25KB     13.6
bench::mark(new_deck())
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck()        0     82ns 13364698.        0B        0

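new_deck() is roughly 260 times faster by median time (21.6µs versus 82ns), although timings this close to the timer resolution are noisy.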

Deal

Compare performance of deal_hand_df() and deal_hand():

bench::mark(deal_hand_df(deck_df))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck_df)   18.2µs   19.4µs    50958.      264B     15.3
bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)    2.3µs   2.54µs   380449.    3.02KB        0

deal_hand() is about 7 to 8 times faster.

Print

Test performance of print_hand_df() against print_hand():

test_hand_df <- deal_hand_df(deck_df)
test_hand <- deal_hand(deck)

bench::mark(print_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand_df)   80.4µs   84.9µs    11581.        0B     14.5
bench::mark(print_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand(test_hand)   5.62µs   5.95µs   163167.    8.45KB        0

print_hand() is about 14 times faster.

Evaluate

Test performance of eval_hand_df() and eval_hand() with a single hand.

bench::mark(eval_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand_df)   56.3µs   59.3µs    16646.        0B     14.4
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   2.13µs   2.38µs   408740.    27.5KB        0

eval_hand() is about 25 times faster, but is still expected to perform poorly compared to fast algorithms.

Multiple Hands

Compare performance evaluating and printing multiple hands.

bench::mark({
  deck <- new_deck_df()
  replicate(50, {
    hand <- deal_hand_df(deck)
    paste0(print_hand_df(hand), ": ", eval_hand_df(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck_df() replicate(… 7.93ms 8.35ms      120.    34.2KB     16.1
bench::mark({
  deck <- new_deck()
  replicate(50, {
    hand <- deal_hand(deck)
    paste0(print_hand(hand), ": ", eval_hand(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                           <bch> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck() replicate(50, … 607µs  651µs     1533.      74KB     8.34

Overall, the new implementation is about 13 times faster.
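
The overall ratio can be checked with simple arithmetic on the min and median columns of the two results above, converted to microseconds. The data frame timings_us is only a scratch table for this calculation.

```r
# Timings copied from the two bench::mark() results above, in microseconds.
timings_us <- data.frame(
  stat       = c("min", "median"),
  data_frame = c(7930, 8350),
  integer    = c(607, 651)
)

# Speedup of the integer implementation over the data frame implementation.
timings_us$speedup <- round(timings_us$data_frame / timings_us$integer, 1)
timings_us$speedup
#> [1] 13.1 12.8
```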

Python

Benchmark the integer() approach against PH Evaluator, called from R using reticulate.

Import

Test performance of phevaluator using reticulate::import(), starting with sample_cards():

bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)   2.25µs   2.54µs   382791.      264B        0
bench::mark(phevaluator$sample_cards(5L))
#> # A tibble: 1 × 6
#>   expression                        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$sample_cards(5L)   24.5µs   25.6µs    38316.        0B     7.66

phevaluator$sample_cards() is about 10 times slower than the R integer implementation.

Also test phevaluator$evaluate_cards() against the R integer method. evaluate_cards() expects five to seven integers passed as individual parameters.
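
Because the cards must be separate arguments, an integer vector has to be spread with do.call() and as.list(). The sketch below shows the mechanics with count_cards(), a hypothetical pure-R stand-in that only counts its arguments. It does not call Python or evaluate anything.

```r
# Hypothetical stand-in for a function that takes five to seven cards as
# separate scalar arguments. It returns the number of arguments it received.
count_cards <- function(...) {
  args <- list(...)
  stopifnot(length(args) >= 5, length(args) <= 7, all(lengths(args) == 1))
  length(args)
}

hand <- c(51L, 50L, 49L, 48L, 47L)

# Spread the vector into five individual arguments.
spread <- do.call(count_cards, as.list(hand))

# Equivalent to writing the arguments out by hand.
direct <- count_cards(51L, 50L, 49L, 48L, 47L)

identical(spread, direct)
#> [1] TRUE
```

Passing the vector itself, as in count_cards(hand), fails the argument-count check in this stand-in, because the function then sees a single argument of length five.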

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   2.13µs   2.34µs   415141.        0B        0
bench::mark(do.call(phevaluator$evaluate_cards, as.list(test_hand)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 26.9µs 28.1µs    34834.        0B     6.97

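Surprisingly, phevaluator called through reticulate is about 12 times slower than eval_hand() and takes about half as long as the original data frame implementation (28.1µs versus 59.3µs). Test again using some specific hands, avoiding the overhead of do.call() and as.list():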

four_aces <- c(51L, 50L, 49L, 48L, 47L)
royal_flush <- c(50L, 46L, 42L, 38L, 34L)

bench::mark(eval_hand(four_aces))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(four_aces)   1.07µs   1.15µs   842025.        0B        0
bench::mark(phevaluator$evaluate_cards(51L, 50L, 49L, 48L, 47L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(51L, 50… 24.8µs 25.9µs    38062.        0B     7.61

bench::mark(eval_hand(royal_flush))
#> # A tibble: 1 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(royal_flush)   1.84µs   2.05µs   469090.        0B        0
bench::mark(phevaluator$evaluate_cards(50L, 46L, 42L, 38L, 34L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(50L, 46… 24.4µs 25.4µs    38781.        0B     7.76

Calling evaluate_cards() directly doesn’t significantly change the results. Test once more with random hands:

bench::mark(eval_hand(deal_hand(deck)))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(deal_hand(deck))   3.77µs   4.92µs   204364.      264B        0
bench::mark(do.call(phevaluator$evaluate_cards, as.list(deal_hand(deck))))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 29.9µs 31.3µs    31296.      264B     9.39

Conclusion: using phevaluator via reticulate::import() is not a faster way to evaluate hands. Note, however, that phevaluator$evaluate_cards() does more than eval_hand(): phevaluator ranks every poker hand, while eval_hand() only determines the hand rank category.

C/C++

Benchmark the integer() approach against the C/C++ implementation of PH Evaluator using Rcpp.

The current version only implements eval_hand_phe(), which uses EvaluateCards() and describeCategory() to return the hand rank category.

Evaluate

Test performance of eval_hand() and eval_hand_phe() with a single hand:

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   2.09µs   2.34µs   411985.        0B     41.2
bench::mark(eval_hand_phe(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_phe(test_hand)    615ns    697ns  1386481.    2.23KB        0

Somewhat surprisingly, eval_hand_phe() is only about 3 times faster than eval_hand() by median time (697ns versus 2.34µs). However, eval_hand_phe() doesn’t just evaluate the hand rank category; it also determines the exact hand rank.

Reviewing the benchmarks on the PH Evaluator README and on my own system, the compiled C/C++ implementation should be capable of about 70 million hands per second, while eval_hand_phe() achieves about 1.4 million per second. This is likely due to the additional overhead of calling from R and, more importantly, the additional call to describeCategory(), as the benchmark code only calls EvaluateCards().
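
To put the gap in per-call terms, the figures above can be converted between throughput and time per hand. This is plain arithmetic on the numbers already quoted (a 697ns median and 70 million hands per second), not a new measurement.

```r
ns_per_call_phe   <- 697  # median time of eval_hand_phe() from the table above
throughput_native <- 70e6 # hands per second quoted for the compiled evaluator

# Calls per second implied by the median, in millions.
round(1e9 / ns_per_call_phe / 1e6, 2)
#> [1] 1.43

# Nanoseconds per hand implied by the native throughput.
ns_per_hand_native <- 1e9 / throughput_native
round(ns_per_hand_native, 1)
#> [1] 14.3

# Ratio of the per-call cost from R to the native per-hand cost.
round(ns_per_call_phe / ns_per_hand_native)
#> [1] 49
```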

A future implementation could implement the full pheval libraries and the C++ code in card_sampler.h to generate random hands in a standalone R package using Rcpp Modules.