Measure the performance of different implementations of cards using bench::mark().

library(cards)
library(reticulate)

phevaluator <- import("phevaluator")

Data Frame

Benchmark the initial data.frame implementation against an integer() approach similar to PH Evaluator's card.py.

New Deck

Create a new deck using new_deck_df() and an integer vector.

deck <- new_deck_df()
deck_int <- 0:51
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   14.9µs   17.2µs    57930.    1.25KB     58.0
bench::mark(0:51)
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 0:51              0      1ns 67680050.        0B        0

While new_deck_df() is not designed to be called frequently, creating an integer vector is orders of magnitude faster (a median of about 1ns versus 17.2µs) and allocates no memory.

Deal

Compare performance of deal_hand_df() to sampling integers:

bench::mark(deal_hand_df(deck))
#> # A tibble: 1 × 6
#>   expression              min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>         <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck)   13.4µs   15.5µs    63639.    3.73KB     57.3
bench::mark(sample(deck_int, 5))
#> # A tibble: 1 × 6
#>   expression               min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>          <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 sample(deck_int, 5)   1.44µs    1.8µs   532720.      264B     107.

deal_hand_df() is about 8 to 9 times slower than sample() (median 15.5µs versus 1.8µs).

Print

Test performance of print_hand_df() against a simple function that prints cards based on integers.

test_hand <- deal_hand_df(deck)
bench::mark(print_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand)   59.8µs   68.8µs    14368.    10.5KB     20.4

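# Print a hand of integer cards (0-51), where rank = card %/% 4 and suit = card %% 4
print_hand_int <- function(h) {
  # Build all 52 labels in card order: "2C", "2D", "2H", "2S", "3C", ..., "AS"
  cards <- paste0(rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4), c("C", "D", "H", "S"))
  # Cards are 0-based, R vectors are 1-based
  paste(cards[h + 1], collapse = " ")
}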
test_hand_int <- sample(0:51, 5)
bench::mark(print_hand_int(test_hand_int))
#> # A tibble: 1 × 6
#>   expression                         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_int(test_hand_int)   4.14µs   4.76µs   200509.      928B        0

print_hand_df() is 14-15 times slower than the integer approach.
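
print_hand_int() above rebuilds the 52 labels on every call. A minimal sketch of one possible refinement builds the lookup table once and only indexes into it afterwards. The names card_labels and label_cards are hypothetical and are not part of the cards package:

```r
# Hypothetical names for illustration; not part of the cards package.
# Build the 52 labels once: each rank repeats four times, suits are recycled.
card_labels <- paste0(
  rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4),
  c("C", "D", "H", "S")
)

# Cards are encoded 0-51, so shift by one for R's 1-based indexing.
label_cards <- function(h) paste(card_labels[h + 1], collapse = " ")

label_cards(c(50L, 46L, 42L, 38L, 34L))
#> [1] "AH KH QH JH TH"
```

Whether this is measurably faster than print_hand_int() would need to be confirmed with bench::mark().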

Evaluate

Test performance of eval_hand_df() with a single hand and with randomly selected hands:

bench::mark(eval_hand_df(test_hand))
#> # A tibble: 1 × 6
#>   expression                   min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>              <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand)   42.1µs   47.8µs    20681.      37KB     16.6
bench::mark(eval_hand_df(deal_hand_df(deck)))
#> # A tibble: 1 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(deal_hand_df(deck))   47.2µs  62.2µs    15985.      264B     18.5

As expected for a naive poker hand evaluator, performance of eval_hand_df() is poor compared to fast algorithms.

Summary

An implementation using integer would likely be much faster than the first implementation using data.frame. Rank and suit can be derived using integer division and modulo arithmetic respectively, and tabulate() is a faster replacement for rle().

0:51 %/% 4
#>  [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5  5  5  6
#> [26]  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11 11 12 12
#> [51] 12 12
tabulate(0:51 %/% 4 + 1, 13)
#>  [1] 4 4 4 4 4 4 4 4 4 4 4 4 4
0:51 %% 4
#>  [1] 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
#> [39] 2 3 0 1 2 3 0 1 2 3 0 1 2 3
tabulate(0:51 %% 4 + 1, 4)
#> [1] 13 13 13 13

bench::mark(rle(sort(sample(0:51, 5) %/% 4 + 1)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                         <bch:t> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 rle(sort(sample(0:51, 5)%/%4 + 1))  15.3µs 17.5µs    55210.      264B     16.6
bench::mark(tabulate(sample(0:51, 5) %/% 4 + 1, 13))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 tabulate(sample(0:51, 5)%/%4 + 1, … 2.17µs 2.67µs   370748.      264B        0

The tabulate() approach is about 6 to 7 times faster than sorting and run-length encoding.
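
To illustrate how the rank and suit counts from tabulate() could drive a category evaluator, here is a minimal sketch. It assumes a five-card hand encoded as integers 0-51 with rank = card %/% 4 and suit = card %% 4, as above. The function classify_hand_sketch() and its category labels are hypothetical and are not the package's eval_hand():

```r
# Hypothetical sketch; this is not the package's eval_hand().
# Assumes exactly five distinct cards encoded 0-51.
classify_hand_sketch <- function(hand) {
  rank_counts <- tabulate(hand %/% 4 + 1, 13) # cards held per rank (2..A)
  suit_counts <- tabulate(hand %% 4 + 1, 4)   # cards held per suit
  held <- which(rank_counts > 0)              # rank positions present, ascending

  is_flush <- any(suit_counts == 5)
  # A straight is five distinct ranks spanning five positions, or A-2-3-4-5.
  is_straight <- length(held) == 5 &&
    (held[5] - held[1] == 4 || identical(held, c(1L, 2L, 3L, 4L, 13L)))

  # The two largest rank counts identify pairs, trips, and quads.
  top <- sort(rank_counts, decreasing = TRUE)[1:2]

  if (is_straight && is_flush) {
    "Straight Flush"
  } else if (top[1] == 4) {
    "Four of a Kind"
  } else if (top[1] == 3 && top[2] == 2) {
    "Full House"
  } else if (is_flush) {
    "Flush"
  } else if (is_straight) {
    "Straight"
  } else if (top[1] == 3) {
    "Three of a Kind"
  } else if (top[1] == 2 && top[2] == 2) {
    "Two Pair"
  } else if (top[1] == 2) {
    "One Pair"
  } else {
    "High Card"
  }
}

classify_hand_sketch(c(51L, 50L, 49L, 48L, 47L))
#> [1] "Four of a Kind"
classify_hand_sketch(c(50L, 46L, 42L, 38L, 34L))
#> [1] "Straight Flush"
```

The sketch only returns a category. It does not order hands within a category, which is the harder part of a full evaluator.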

Integer

Benchmark the second implementation using integer().

New Deck

Create a new deck using new_deck() and new_deck_df().

deck_df <- new_deck_df()
deck <- new_deck()
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#>   expression         min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck_df()   14.8µs   17.2µs    57689.    1.25KB     17.3
bench::mark(new_deck())
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 new_deck()     41ns     82ns 10928520.        0B        0

new_deck() is roughly 200 times faster (median 82ns versus 17.2µs).

Deal

Compare performance of deal_hand_df() and deal_hand():

bench::mark(deal_hand_df(deck_df))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand_df(deck_df)   13.3µs   15.9µs    62993.      264B     18.9
bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)    1.6µs   2.01µs   487704.    3.02KB        0

deal_hand() is about 8 times faster.

Print

Test performance of print_hand_df() against print_hand():

test_hand_df <- deal_hand_df(deck_df)
test_hand <- deal_hand(deck)

bench::mark(print_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand_df(test_hand_df)     60µs   69.1µs    14392.        0B     20.8
bench::mark(print_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                 min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 print_hand(test_hand)   4.14µs    4.8µs   205401.    8.52KB        0

print_hand() is about 14 times faster.

Evaluate

Test performance of eval_hand_df() and eval_hand() with a single hand.

bench::mark(eval_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_df(test_hand_df)   38.7µs   44.6µs    22498.        0B     18.0
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)    943ns   1.19µs   796311.    27.5KB        0

eval_hand() is more than 35 times faster, but is still expected to be slow compared to fast algorithms.

Multiple Hands

Compare performance evaluating and printing multiple hands.

bench::mark({
  deck <- new_deck_df()
  replicate(50, {
    hand <- deal_hand_df(deck)
    paste0(print_hand_df(hand), ": ", eval_hand_df(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck_df() replicate(… 6.39ms 6.85ms      146.    34.1KB     20.8
bench::mark({
  deck <- new_deck()
  replicate(50, {
    hand <- deal_hand(deck)
    paste0(print_hand(hand), ": ", eval_hand(hand))
  })
})
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                           <bch> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 { deck <- new_deck() replicate(50, … 468µs  533µs     1856.      74KB     12.6

Overall, the new implementation is about 13 times faster.

Python

Benchmark the integer() approach against the Python implementation of PH Evaluator, called through reticulate.

Import

Test performance of phevaluator using reticulate::import(), starting with sample_cards():

bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 deal_hand(deck)    1.6µs   2.01µs   488220.      264B        0
bench::mark(phevaluator$sample_cards(5L))
#> # A tibble: 1 × 6
#>   expression                        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$sample_cards(5L)   18.3µs   21.1µs    47407.        0B     9.48

phevaluator$sample_cards() is about 10 times slower than the R integer implementation.

Also test phevaluator$evaluate_cards() against the R integer method. evaluate_cards() expects five to seven integers passed as individual parameters.

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)    984ns   1.19µs   804573.        0B        0
bench::mark(do.call(phevaluator$evaluate_cards, as.list(test_hand)))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 20.4µs 23.5µs    41422.        0B     12.4
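
The do.call() and as.list() combination used above spreads a vector into separate arguments. A minimal base R sketch of the difference, using a hypothetical stand-in function count_args() rather than phevaluator itself:

```r
# Hypothetical stand-in for a function that takes cards as separate arguments.
count_args <- function(...) length(list(...))

hand <- c(51L, 50L, 49L, 48L, 47L)

count_args(hand)                    # the whole vector arrives as one argument
#> [1] 1
do.call(count_args, as.list(hand))  # each card arrives as its own argument
#> [1] 5
```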

Surprisingly, phevaluator called through reticulate is about 20 times slower than eval_hand(), and only about twice as fast as the original data frame implementation. Test again using some specific hands and avoid the overhead of do.call() and as.list():

four_aces <- c(51L, 50L, 49L, 48L, 47L)
royal_flush <- c(50L, 46L, 42L, 38L, 34L)

bench::mark(eval_hand(four_aces))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(four_aces)    697ns    861ns  1081486.        0B        0
bench::mark(phevaluator$evaluate_cards(51L, 50L, 49L, 48L, 47L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(51L, 50… 18.7µs 21.3µs    47006.        0B     9.40

bench::mark(eval_hand(royal_flush))
#> # A tibble: 1 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(royal_flush)   1.23µs   1.52µs   612538.        0B     61.3
bench::mark(phevaluator$evaluate_cards(50L, 46L, 42L, 38L, 34L))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 phevaluator$evaluate_cards(50L, 46… 18.4µs 21.3µs    47038.        0B     9.41

Calling evaluate_cards() directly doesn’t significantly change the results. Test once more with random hands:

bench::mark(eval_hand(deal_hand(deck)))
#> # A tibble: 1 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(deal_hand(deck))   2.75µs   3.77µs   258085.      264B     25.8
bench::mark(do.call(phevaluator$evaluate_cards, as.list(deal_hand(deck))))
#> # A tibble: 1 × 6
#>   expression                             min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                          <bch:> <bch:>     <dbl> <bch:byt>    <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 23.1µs 26.7µs    37618.      264B     11.3

Conclusion: using phevaluator via reticulate::import() is not a faster way to evaluate hands. It is important to note that phevaluator$evaluate_cards() does more than eval_hand(), as phevaluator ranks all poker hands and eval_hand() only determines the hand rank category.
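
To make the difference between an exact hand rank and a hand rank category concrete, the sketch below collapses a rank into its category. It assumes a rank scale running from 1 (strongest) to 7462 (weakest), which is the number of distinct five-card hand classes. Whether phevaluator returns exactly this scale is an assumption here, and rank_to_category() is a hypothetical helper:

```r
# Number of distinct hand classes per category, strongest category first.
class_sizes <- c(10, 156, 156, 1277, 10, 858, 858, 2860, 1277)
categories <- c(
  "Straight Flush", "Four of a Kind", "Full House", "Flush", "Straight",
  "Three of a Kind", "Two Pair", "One Pair", "High Card"
)

# Upper rank bound of each category: 10 166 322 1599 1609 2467 3325 6185 7462
upper <- cumsum(class_sizes)

# Hypothetical helper; assumes rank 1 is the strongest hand (an assumption).
rank_to_category <- function(rank) {
  categories[findInterval(rank, upper, left.open = TRUE) + 1]
}

rank_to_category(c(1, 11, 7462))
#> [1] "Straight Flush" "Four of a Kind" "High Card"
```

Many ranks map to the same category, which is why an evaluator that returns the exact rank does more work than one that stops at the category.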

C/C++

Benchmark the integer() approach against the C/C++ implementation of PH Evaluator using Rcpp.

The current version only implements eval_hand_phe(), which uses EvaluateCards() and describeCategory() to return the hand rank category.

Evaluate

Test performance of eval_hand() and eval_hand_phe() with a single hand:

bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#>   expression                min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>           <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand(test_hand)   1.02µs   1.27µs   758335.        0B        0
bench::mark(eval_hand_phe(test_hand))
#> # A tibble: 1 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 eval_hand_phe(test_hand)    410ns    492ns  1933874.    2.24KB        0

Somewhat surprisingly, eval_hand_phe() is only about 2.5 times faster than eval_hand(). However, eval_hand_phe() doesn’t just evaluate the hand rank category; it also determines the exact hand rank.

Reviewing the benchmarks in the PH Evaluator README and on my own system, the compiled C/C++ implementation should be capable of about 70 million hands per second, while eval_hand_phe() achieves about 2 million per second. This is likely due to the overhead of calling from R and, more importantly, the additional call to describeCategory(), as the benchmark code only calls EvaluateCards().
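
As a rough sanity check on the gap, the arithmetic below converts the median time from the eval_hand_phe() benchmark table into hands per second and compares it with the quoted native rate. The 70 million figure is the approximate number stated above, not a new measurement:

```r
median_phe_ns <- 492   # median of eval_hand_phe(test_hand) from the table above
native_rate <- 70e6    # approximate hands per second quoted for compiled C/C++

r_rate <- 1e9 / median_phe_ns   # hands per second implied by the R median
native_ns <- 1e9 / native_rate  # time available per hand at the native rate

round(r_rate)
#> [1] 2032520
round(native_ns, 1)
#> [1] 14.3
native_rate / r_rate
#> [1] 34.44
```

At roughly 14ns per hand in native code, even a few hundred nanoseconds of per-call overhead on the R side would dominate the measurement.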

A future implementation could wrap the full pheval libraries, along with the C++ code in card_sampler.h for generating random hands, in a standalone R package using Rcpp Modules.