Measure the performance of different implementations of cards using bench::mark().

library(cards)
library(reticulate)

phevaluator <- import("phevaluator")

Data Frame

Benchmark the initial implementation using data.frame compared to an integer() approach similar to PH Evaluator card.py.

New Deck

Create a new deck using new_deck_df() and an integer vector.

deck <- new_deck_df()
deck_int <- 0:51
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   15.5µs   17.8µs    54920.    1.25KB     49.5
bench::mark(0:51)
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 0:51              0      1ns 47797406.        0B        0

While new_deck_df() is not designed to be called frequently, using an integer vector is much faster.

Deal

Compare performance of deal_hand_df() to sampling integers:

bench::mark(deal_hand_df(deck))
#> # A tibble: 1 × 6
#>   expression              min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>         <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck)   13.5µs     16µs    60601.    3.73KB     48.5
bench::mark(sample(deck_int, 5))
#> # A tibble: 1 × 6
#>   expression               min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>          <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 sample(deck_int, 5)   1.56µs   1.84µs   520412.      264B     52.0

deal_hand_df() is about 9 times slower than sample().
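
The ratio quoted above can also be obtained in a single call rather than by comparing two tables by eye. The following is a minimal sketch, assuming the bench package is installed. The two expressions are base R stand-ins chosen for illustration, not functions from cards. Setting check = FALSE is needed because the sampled results differ, and relative = TRUE reports each timing as a multiple of the fastest expression.

```r
# Stand-in comparison using base R only; swap in deal_hand_df(deck) and
# sample(deck_int, 5) to reproduce the comparison in this section.
deck_int <- 0:51
deck_chr <- as.character(deck_int)

if (requireNamespace("bench", quietly = TRUE)) {
  res <- bench::mark(
    integer = sample(deck_int, 5),
    character = sample(deck_chr, 5),
    check = FALSE,   # results are random, so they cannot be compared
    relative = TRUE  # report timings relative to the fastest expression
  )
  print(res)
}
```

With relative = TRUE, the fastest expression shows a median of 1 and the other rows read directly as "times slower".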

Print

Test performance of print_hand_df() against a simple function that prints cards based on integers.

test_hand <- deal_hand_df(deck)
bench::mark(print_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand)     61µs   71.3µs    13728.    10.2KB     20.6

print_hand_int <- function(h) {
  cards <- paste0(rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4), c("C", "D", "H", "S"))
  paste(cards[h + 1], collapse = " ")
}
test_hand_int <- sample(0:51, 5)
bench::mark(print_hand_int(test_hand_int))
#> # A tibble: 1 × 6
#>   expression                         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_int(test_hand_int)   4.18µs   4.92µs   192518.      928B     19.3

print_hand_df() is 14-15 times slower than the integer approach.

Evaluate

Test performance of eval_hand_df() with a single hand and with randomly selected hands:

bench::mark(eval_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                   min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>              <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand)   42.6µs   49.9µs    19719.      37KB     16.4
bench::mark(eval_hand_df(deal_hand_df(deck)))
#> # A tibble: 1 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(deal_hand_df(deck))   49.2µs  64.4µs    15264.      264B     16.4

As expected for a naive poker hand evaluator, performance of eval_hand_df() is poor compared to fast algorithms: at roughly 50µs per hand, it evaluates about 20,000 hands per second.

Summary

An implementation using integer would likely be much faster than the first implementation using data.frame. Rank and suit can be derived using integer division and modulo arithmetic respectively, and tabulate() is a faster replacement for rle().

0:51 %/% 4
#>  [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5  5  5  6
#> [26]  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11 11 12 12
#> [51] 12 12
tabulate(0:51 %/% 4 + 1, 13)
#>  [1] 4 4 4 4 4 4 4 4 4 4 4 4 4
0:51 %% 4
#>  [1] 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
#> [39] 2 3 0 1 2 3 0 1 2 3 0 1 2 3
tabulate(0:51 %% 4 + 1, 4)
#> [1] 13 13 13 13

bench::mark(rle(sort(sample(0:51, 5) %/% 4 + 1)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                         <bch:t> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 rle(sort(sample(0:51, 5)%/%4 + 1))  15.6µs 18.5µs    52808.      264B     15.8
bench::mark(tabulate(sample(0:51, 5) %/% 4 + 1, 13))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 tabulate(sample(0:51, 5)%/%4 + 1, … 2.13µs 2.71µs   361695.      264B     36.2

Note that the tabulate approach is 7 times faster than sorting and run length encoding.
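
To illustrate why rank counts from tabulate() are useful, here is a minimal sketch that classifies a five-card hand by its count pattern alone. The function name count_pattern_sketch() and the category labels are hypothetical, and this is not the eval_hand() implementation from cards. It deliberately ignores straights and flushes, and it assumes the 0:51 encoding used above, where rank is the card value %/% 4.

```r
# Hypothetical helper for illustration only: classify by rank counts.
# Straights and flushes are not detected here.
count_pattern_sketch <- function(hand) {
  stopifnot(length(hand) == 5, all(hand %in% 0:51))
  counts <- tabulate(hand %/% 4 + 1, 13)  # cards held of each rank
  # Sort the non-zero counts, e.g. 4 and 1 become the pattern "41"
  pattern <- paste(sort(counts[counts > 0], decreasing = TRUE), collapse = "")
  switch(pattern,
    "41" = "Four of a Kind",
    "32" = "Full House",
    "311" = "Three of a Kind",
    "221" = "Two Pair",
    "2111" = "One Pair",
    "11111" = "No Pair"
  )
}

count_pattern_sketch(c(51L, 50L, 49L, 48L, 47L))  # four aces and a king
#> [1] "Four of a Kind"
```

Because the counts come from a single tabulate() call, no sorting or run length encoding is needed to find pairs, trips, or quads.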

Integer

Benchmark the second implementation using integer().

New Deck

Create a new deck using new_deck() and new_deck_df().

deck_df <- new_deck_df()
deck <- new_deck()
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   15.2µs   18.1µs    54184.    1.25KB     16.3
bench::mark(new_deck())
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck()        0     41ns 18612248.        0B        0

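new_deck() is hundreds of times faster (about 440 times by median time), though timings of a few nanoseconds are near the limit of what bench::mark() can resolve.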

Deal

Compare performance of deal_hand_df() and deal_hand():

bench::mark(deal_hand_df(deck_df))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck_df)   13.7µs   16.4µs    60272.      264B     18.1
bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)   1.64µs   2.09µs   460479.    3.02KB        0

deal_hand() is about 8 times faster.

Print

Test performance of print_hand_df() against print_hand().

test_hand_df <- deal_hand_df(deck_df)
test_hand <- deal_hand(deck)

bench::mark(print_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand_df)   61.3µs   71.1µs    13832.        0B     18.7
bench::mark(print_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand(test_hand)   4.06µs   4.92µs   196323.    8.25KB     19.6

print_hand() is about 14 times faster.

Evaluate

Test performance of eval_hand_df() and eval_hand() with a single hand.

bench::mark(eval_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand_df)   43.9µs   51.3µs    19243.        0B     16.4
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.44µs    1.8µs   527576.    27.5KB        0

eval_hand() is about 28 times faster, but is still expected to perform poorly compared to fast algorithms.

Multiple Hands

Compare performance evaluating and printing multiple hands.

bench::mark({
  deck <- new_deck_df()
  replicate(50, {
    hand <- deal_hand_df(deck)
    paste0(print_hand_df(hand), ": ", eval_hand_df(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck_df() replicate(… 6.73ms 7.17ms      139.    34.2KB     18.3
bench::mark({
  deck <- new_deck()
  replicate(50, {
    hand <- deal_hand(deck)
    paste0(print_hand(hand), ": ", eval_hand(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                           <bch> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck() replicate(50, … 488µs  555µs     1770.      74KB     12.6

Overall, the new implementation is about 13 times faster.

Python

Benchmark the integer() approach against PH Evaluator, called from R using reticulate.

Import

Test performance of phevaluator using reticulate::import(), starting with sample_cards():

bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)    1.6µs   2.09µs   455059.      264B        0
bench::mark(phevaluator$sample_cards(5L))
#> # A tibble: 1 × 6
#>   expression                        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$sample_cards(5L)   18.4µs   21.2µs    45382.        0B     9.08

phevaluator$sample_cards() is about 10 times slower than the R integer implementation.

Also test phevaluator$evaluate_cards() against the R integer method. evaluate_cards() expects five to seven integers passed as individual parameters.

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.52µs    1.8µs   539130.        0B        0
bench::mark(do.call(phevaluator$evaluate_cards, as.list(test_hand)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 20.8µs 24.1µs    40723.        0B     12.2
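
Because evaluate_cards() takes each card as a separate argument, the call above wraps the hand in as.list() and do.call(). The sketch below shows the difference using count_args(), a hypothetical base R stand-in for a function with a variable number of arguments, so it runs without Python.

```r
# Hypothetical stand-in for a function that takes cards as separate arguments.
count_args <- function(...) length(list(...))

hand <- c(51L, 50L, 49L, 48L, 47L)

count_args(hand)                    # the vector arrives as a single argument
#> [1] 1
do.call(count_args, as.list(hand))  # each card becomes its own argument
#> [1] 5
```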

Surprisingly, phevaluator called this way is about 13 times slower than eval_hand() and within a factor of about 2 of the original data frame implementation. Test again using some specific hands and avoid the overhead of do.call() and as.list():

four_aces <- c(51L, 50L, 49L, 48L, 47L)
royal_flush <- c(50L, 46L, 42L, 38L, 34L)

bench::mark(eval_hand(four_aces))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(four_aces)    779ns    861ns  1078273.        0B        0
bench::mark(phevaluator$evaluate_cards(51L, 50L, 49L, 48L, 47L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(51L, 50… 18.9µs 22.1µs    44465.        0B     8.89

bench::mark(eval_hand(royal_flush))
#> # A tibble: 1 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(royal_flush)   1.27µs    1.6µs   558776.        0B     55.9
bench::mark(phevaluator$evaluate_cards(50L, 46L, 42L, 38L, 34L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(50L, 46… 18.5µs 21.6µs    45395.        0B     9.08

Calling evaluate_cards() directly doesn’t significantly change the results. Test once more with random hands:

bench::mark(eval_hand(deal_hand(deck)))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(deal_hand(deck))   2.83µs    3.9µs   252261.      264B     25.2
bench::mark(do.call(phevaluator$evaluate_cards, as.list(deal_hand(deck))))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 23.2µs 26.9µs    36666.      264B     11.0

Conclusion: using phevaluator via reticulate::import() is not a faster way to evaluate hands. Note, however, that phevaluator$evaluate_cards() does more than eval_hand(): phevaluator ranks all poker hands, while eval_hand() only determines the hand rank category.
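
To make the distinction between an exact rank and a rank category concrete, here is a minimal sketch that maps a numeric rank to a category. It assumes the common 1 to 7462 scale for distinct five-card hand values, with 1 as the strongest hand, and the standard counts of distinct hands per category. The function name rank_category_sketch() and the labels are hypothetical, and whether phevaluator returns exactly this scale should be checked against its documentation.

```r
# Hypothetical helper: map an exact hand rank (1 = best, 7462 = worst,
# an assumed scale) to one of nine hand rank categories.
rank_category_sketch <- function(rank) {
  stopifnot(all(rank >= 1 & rank <= 7462))
  # Highest rank value belonging to each of the first eight categories
  upper <- c(10, 166, 322, 1599, 1609, 2467, 3325, 6185)
  labels <- c(
    "Straight Flush", "Four of a Kind", "Full House", "Flush", "Straight",
    "Three of a Kind", "Two Pair", "One Pair", "High Card"
  )
  # left.open = TRUE makes each interval (lower, upper], so 10 is still a
  # straight flush and 11 is the best four of a kind
  labels[findInterval(rank, upper, left.open = TRUE) + 1]
}

rank_category_sketch(c(1, 11, 167, 7462))
#> [1] "Straight Flush" "Four of a Kind" "Full House"     "High Card"
```

An evaluator that only needs the category can stop at this coarser answer, which is part of why the two functions are not doing equal work.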

C/C++

Benchmark the integer() approach against the C/C++ implementation of PH Evaluator using Rcpp.

The current version only implements eval_hand_phe(), which uses EvaluateCards() and describeCategory() to return the hand rank category.

Evaluate

Test performance of eval_hand() and eval_hand_phe() with a single hand:

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.48µs   1.84µs   526542.        0B        0
bench::mark(eval_hand_phe(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_phe(test_hand)    451ns    533ns  1736909.    2.24KB     174.

Somewhat surprisingly, eval_hand_phe() is only about 3 times faster than eval_hand(). However, eval_hand_phe() doesn’t just evaluate the hand rank category; it also determines the exact hand rank.

Reviewing the benchmarks on the PH Evaluator README and on my own system, the compiled C/C++ implementation should be capable of about 70 million hands per second, while eval_hand_phe() achieves about 1.7 million per second (the itr/sec value above). This is likely due to the additional overhead of using R, and, more importantly, the additional call to describeCategory(), as the benchmark code only calls EvaluateCards().
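
The size of that gap can be worked out from the figures quoted above. This is back-of-the-envelope arithmetic using the 533ns median from the eval_hand_phe() table and the 70 million hands per second figure; the variable names are illustrative only.

```r
median_ns <- 533                         # eval_hand_phe() median, in ns
hands_per_sec_r <- 1e9 / median_ns       # throughput implied by the median
ns_per_hand_native <- 1e9 / 70e6         # time per hand at 70 million/sec
slowdown <- median_ns / ns_per_hand_native

round(hands_per_sec_r / 1e6, 2)  # million hands per second via R
#> [1] 1.88
round(ns_per_hand_native, 1)     # ns per hand for the compiled benchmark
#> [1] 14.3
round(slowdown)                  # eval_hand_phe() time relative to native
#> [1] 37
```

The median implies a slightly higher throughput than the itr/sec column because the two are computed differently, but either way each call from R costs several tens of times the native per-hand budget.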

A future implementation could implement the full pheval libraries and the C++ code in card_sampler.h to generate random hands in a standalone R package using Rcpp Modules.