Measure the performance of different implementations of cards using bench::mark().
library(cards)
library(reticulate)
phevaluator <- import("phevaluator")
Data Frame
Benchmark the initial implementation using data.frame against an integer() approach similar to PH Evaluator card.py.
New Deck
Create a new deck using new_deck_df() and an integer vector.
deck <- new_deck_df()
deck_int <- 0:51
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 new_deck_df() 15µs 17.4µs 57757. 1.25KB 52.0
bench::mark(0:51)
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 0:51 0 1ns 54243985. 0B 0
While new_deck_df() is not designed to be called frequently, using an integer vector is much faster.
Deal
Compare performance of deal_hand_df() to sampling integers:
bench::mark(deal_hand_df(deck))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 deal_hand_df(deck) 13.5µs 15.7µs 62932. 3.73KB 56.7
bench::mark(sample(deck_int, 5))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 sample(deck_int, 5) 1.48µs 1.8µs 541573. 264B 54.2
deal_hand_df() is about 9 times slower than sample() by median time (15.7µs versus 1.8µs).
Test performance of print_hand_df() against a simple function that prints cards based on integers.
test_hand <- deal_hand_df(deck)
bench::mark(print_hand_df(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 print_hand_df(test_hand) 60.1µs 70.4µs 14198. 10.5KB 22.7
# Map 0-based card ids to labels such as "2C" or "AS" and join them into one string
print_hand_int <- function(h) {
  cards <- paste0(rep(c(2:9, "T", "J", "Q", "K", "A"), each = 4), c("C", "D", "H", "S"))
  paste(cards[h + 1], collapse = " ")
}
test_hand_int <- sample(0:51, 5)
bench::mark(print_hand_int(test_hand_int))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 print_hand_int(test_hand_int) 4.3µs 5µs 194139. 928B 19.4
print_hand_df() is 14-15 times slower than the integer approach.
Evaluate
Test performance of eval_hand_df() with a single hand and with randomly selected hands:
bench::mark(eval_hand_df(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand_df(test_hand) 42.7µs 49.4µs 20295. 37KB 16.4
bench::mark(eval_hand_df(deal_hand_df(deck)))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:t> <dbl> <bch:byt> <dbl>
#> 1 eval_hand_df(deal_hand_df(deck)) 47µs 62µs 16019. 264B 18.4
As expected for a naive poker hand evaluator, performance of eval_hand_df() is poor compared to fast algorithms.
Summary
An implementation using integer would likely be much faster than the first implementation using data.frame. Rank and suit can be derived using integer division and modulo arithmetic respectively, and tabulate() is a faster replacement for rle().
0:51 %/% 4
#> [1] 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 6
#> [26] 6 6 6 7 7 7 7 8 8 8 8 9 9 9 9 10 10 10 10 11 11 11 11 12 12
#> [51] 12 12
tabulate(0:51 %/% 4 + 1, 13)
#> [1] 4 4 4 4 4 4 4 4 4 4 4 4 4
0:51 %% 4
#> [1] 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
#> [39] 2 3 0 1 2 3 0 1 2 3 0 1 2 3
tabulate(0:51 %% 4 + 1, 4)
#> [1] 13 13 13 13
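Applied to a single five-card hand, the same arithmetic gives the rank and suit counts that a category check needs. The following is a minimal sketch, assuming cards are encoded as 0:51 with rank = id %/% 4 and suit = id %% 4; describe_hand_int() is a hypothetical helper for illustration and is not part of the cards package.

```r
# Hypothetical helper for illustration; not part of the cards package.
# Assumes cards are integers 0:51 with rank = id %/% 4 and suit = id %% 4.
describe_hand_int <- function(hand) {
  rank_counts <- tabulate(hand %/% 4 + 1, 13)  # number of cards of each rank
  suit_counts <- tabulate(hand %% 4 + 1, 4)    # number of cards of each suit
  list(
    # Sorted non-zero rank counts, e.g. "41" for four of a kind plus a kicker
    pattern = paste(sort(rank_counts[rank_counts > 0], decreasing = TRUE), collapse = ""),
    # A five-card hand is a flush when one suit holds all five cards
    flush = any(suit_counts == 5)
  )
}

describe_hand_int(c(51L, 50L, 49L, 48L, 47L))$pattern  # "41"
describe_hand_int(c(50L, 46L, 42L, 38L, 34L))$flush    # TRUE
```

The pattern string separates pairs, trips, full houses and quads without sorting the hand, which is the step rle() would otherwise require.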
bench::mark(rle(sort(sample(0:51, 5) %/% 4 + 1)))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:t> <bch:> <dbl> <bch:byt> <dbl>
#> 1 rle(sort(sample(0:51, 5)%/%4 + 1)) 15.3µs 17.8µs 56065. 264B 16.8
bench::mark(tabulate(sample(0:51, 5) %/% 4 + 1, 13))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 tabulate(sample(0:51, 5)%/%4 + 1, … 2.17µs 2.67µs 371405. 264B 0
Note that the tabulate approach is 7 times faster than sorting and run length encoding.
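The ratio can also be read directly from a single bench::mark() call rather than computed by hand. This is a sketch of one way to do it, using the check and relative arguments of bench::mark(); the names rle and tabulate are only labels for the two expressions, and check = FALSE is needed because each expression draws its own random hand.

```r
library(bench)

# Benchmark both approaches in one call. relative = TRUE rescales each
# column so that the best value is 1 and the others are multiples of it.
res <- bench::mark(
  rle = rle(sort(sample(0:51, 5) %/% 4 + 1)),
  tabulate = tabulate(sample(0:51, 5) %/% 4 + 1, 13),
  check = FALSE,    # results differ because the hands are random
  relative = TRUE
)

# The median column now shows how many times slower each expression is
# than the fastest one.
res[, c("expression", "median", "itr/sec")]
```

Exact ratios vary between runs and machines, so the figures quoted in the text should be treated as approximate.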
Integer
Benchmark the second implementation using integer().
New Deck
Create a new deck using new_deck() and new_deck_df().
deck_df <- new_deck_df()
deck <- new_deck()
bench::mark(new_deck_df())
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 new_deck_df() 15.2µs 17.6µs 56749. 1.25KB 17.0
bench::mark(new_deck())
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 new_deck() 0 41ns 16483091. 0B 0
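new_deck() is several hundred times faster by median time (41ns versus 17.6µs), although timings this small are close to the resolution of the timer.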
Deal
Compare performance of deal_hand_df() and deal_hand():
bench::mark(deal_hand_df(deck_df))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 deal_hand_df(deck_df) 13.6µs 15.9µs 62865. 264B 18.9
bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 deal_hand(deck) 1.64µs 2.05µs 480074. 3.02KB 0
deal_hand() is about 8 times faster.
Test performance of print_hand_df() against print_hand().
test_hand_df <- deal_hand_df(deck_df)
test_hand <- deal_hand(deck)
bench::mark(print_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 print_hand_df(test_hand_df) 59.6µs 69.5µs 14379. 0B 20.8
bench::mark(print_hand(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 print_hand(test_hand) 4.18µs 4.84µs 204047. 8.52KB 0
print_hand() is about 14 times faster.
Evaluate
Test performance of eval_hand_df() and eval_hand() with a single hand.
bench::mark(eval_hand_df(test_hand_df))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand_df(test_hand_df) 43.3µs 50.4µs 19826. 0B 16.4
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand(test_hand) 1.48µs 1.72µs 560767. 27.5KB 56.1
eval_hand() is about 29 times faster, but is still expected to perform poorly compared to fast algorithms.
Multiple Hands
Compare performance evaluating and printing multiple hands.
bench::mark({
  deck <- new_deck_df()
  replicate(50, {
    hand <- deal_hand_df(deck)
    paste0(print_hand_df(hand), ": ", eval_hand_df(hand))
  })
})
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 { deck <- new_deck_df() replicate(… 6.57ms 6.86ms 146. 34.2KB 20.9
bench::mark({
  deck <- new_deck()
  replicate(50, {
    hand <- deal_hand(deck)
    paste0(print_hand(hand), ": ", eval_hand(hand))
  })
})
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl>
#> 1 { deck <- new_deck() replicate(50, … 478µs 531µs 1855. 74KB 12.5
Overall, the new implementation is 13-14 times faster.
Python
Benchmark the integer() approach against PH Evaluator using reticulate.
Import
Test performance of phevaluator using reticulate::import(), starting with sample_cards():
bench::mark(deal_hand(deck))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 deal_hand(deck) 1.6µs 2.05µs 479972. 264B 0
bench::mark(phevaluator$sample_cards(5L))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 phevaluator$sample_cards(5L) 18.4µs 21.2µs 46311. 0B 9.26
phevaluator$sample_cards() is about 10 times slower than the R integer implementation.
Also test phevaluator$evaluate_cards() against the R integer method. evaluate_cards() expects five to seven integers passed as individual parameters.
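Because the cards must arrive as separate arguments, an integer vector has to be spread across the call. The sketch below illustrates the do.call() and as.list() pattern with a stand-in function; count_args() is hypothetical and only reports how many arguments it received, so no Python installation is needed to run it.

```r
# Stand-in for a function that takes each card as a separate argument;
# count_args() is hypothetical and only counts the arguments it receives.
count_args <- function(...) length(list(...))

hand <- c(51L, 50L, 49L, 48L, 47L)

count_args(hand)                    # 1: the whole vector is a single argument
do.call(count_args, as.list(hand))  # 5: one argument per card
```

Passing the vector directly would hand the callee one argument of length five, which is why the benchmark wraps the call in do.call().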
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand(test_hand) 1.48µs 1.8µs 508330. 0B 50.8
bench::mark(do.call(phevaluator$evaluate_cards, as.list(test_hand)))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 20.9µs 24.1µs 41663. 0B 8.33
Surprisingly, phevaluator is almost as slow as the original data frame implementation. Test again using some specific hands, avoiding the overhead of do.call() and as.list():
four_aces <- c(51L, 50L, 49L, 48L, 47L)
royal_flush <- c(50L, 46L, 42L, 38L, 34L)
bench::mark(eval_hand(four_aces))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand(four_aces) 697ns 902ns 1071335. 0B 0
bench::mark(phevaluator$evaluate_cards(51L, 50L, 49L, 48L, 47L))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 phevaluator$evaluate_cards(51L, 50… 18.9µs 21.8µs 46051. 0B 13.8
bench::mark(eval_hand(royal_flush))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand(royal_flush) 1.19µs 1.48µs 656659. 0B 0
bench::mark(phevaluator$evaluate_cards(50L, 46L, 42L, 38L, 34L))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 phevaluator$evaluate_cards(50L, 46… 18.6µs 21.4µs 47016. 0B 9.41
Calling evaluate_cards() directly doesn’t significantly change the results. Test once more with random hands:
bench::mark(eval_hand(deal_hand(deck)))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand(deal_hand(deck)) 2.79µs 3.81µs 254239. 264B 25.4
bench::mark(do.call(phevaluator$evaluate_cards, as.list(deal_hand(deck))))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 do.call(phevaluator$evaluate_cards… 22.9µs 26.6µs 37724. 264B 11.3
Conclusion: using phevaluator via reticulate::import() is not a faster way to evaluate hands.
It is important to note that phevaluator$evaluate_cards() does more than eval_hand(), as phevaluator ranks all poker hands and eval_hand() only determines the hand rank category.
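To make the difference concrete, an exact rank carries enough information to recover the category, but not the other way round. The following is a minimal sketch, assuming ranks run from 1 (strongest) to 7462 (weakest) and that category boundaries follow the standard counts of distinct five-card hand classes; rank_category() is a hypothetical helper and not part of cards or phevaluator.

```r
# Upper bound of each category's rank range, strongest category first.
# Assumption: ranks run from 1 (best hand) to 7462 (worst hand).
upper <- c(10L, 166L, 322L, 1599L, 1609L, 2467L, 3325L, 6185L, 7462L)
categories <- c(
  "Straight Flush", "Four of a Kind", "Full House", "Flush", "Straight",
  "Three of a Kind", "Two Pair", "One Pair", "High Card"
)

# Hypothetical helper: map an exact rank to its hand rank category
rank_category <- function(rank) {
  stopifnot(all(rank >= 1L & rank <= 7462L))
  # left.open = TRUE makes each interval (lower, upper], so a rank equal
  # to an upper bound stays in the stronger category
  categories[findInterval(rank, upper, left.open = TRUE) + 1L]
}

rank_category(c(1L, 10L, 11L, 7462L))
# "Straight Flush" "Straight Flush" "Four of a Kind" "High Card"
```

Two hands in the same category can still be ordered by their exact ranks, which is the extra work the comparison above does not account for.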
C/C++
Benchmark the integer() approach against the C/C++ implementation of PH Evaluator using Rcpp.
The current version only implements eval_hand_phe(), which uses EvaluateCards() and describeCategory() to return the hand rank category.
Evaluate
Test performance of eval_hand() and eval_hand_phe() with a single hand:
bench::mark(eval_hand(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand(test_hand) 1.39µs 1.93µs 429286. 0B 42.9
bench::mark(eval_hand_phe(test_hand))
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 eval_hand_phe(test_hand) 410ns 492ns 1961129. 2.24KB 0
Somewhat surprisingly, eval_hand_phe() is only about 4 times faster than eval_hand(). However, eval_hand_phe() doesn’t just evaluate the hand rank category; it also determines the exact hand rank.
Based on the benchmarks in the PH Evaluator README and on my own system, the compiled C/C++ implementation should be capable of about 70 million hands per second, while eval_hand_phe() achieves about 2 million per second. This is likely due to the additional overhead of calling from R and, more importantly, the additional call to describeCategory(), as the benchmark code only calls EvaluateCards().
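As a rough check of the throughput arithmetic, a median time per call converts to calls per second as follows. The 492ns median is taken from the eval_hand_phe() benchmark output above and the 70 million figure from the text; both are approximate and vary by machine.

```r
median_ns <- 492          # median time per eval_hand_phe() call, from the output above
native_per_sec <- 70e6    # approximate native throughput quoted in the text

r_per_sec <- 1e9 / median_ns        # calls per second implied by the median
native_ns <- 1e9 / native_per_sec   # time per hand implied by the native figure

round(r_per_sec / 1e6, 2)           # 2.03 million calls per second through R
round(native_ns, 1)                 # 14.3 nanoseconds per hand in native code
round(native_per_sec / r_per_sec)   # 34: size of the gap between the two
```

A per-call cost of a few hundred nanoseconds is small in absolute terms, so most of the gap is plausibly fixed call overhead rather than evaluation time.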
A future implementation could wrap the full pheval libraries and the C++ code in card_sampler.h to generate random hands in a standalone R package using Rcpp Modules.