This is an analysis of the Chicken Noodle Cube. This is a collection of Magic the Gathering cards I use to play with my friends. While this is a fufilling creative and social endeavor, it also presents me with a rich dataset.
I compiled the database ahead of time. See the shell scripts in the github repo.
Now, onto the analysis! cards is the main table we’re working with
Code
import duckdbimport pandas as pdimport plotnine as pnimport seaborn as snsfrom types import SimpleNamespacecube = duckdb.connect("cube.duckdb")cube.query("select * from cards limit 1").to_df().iloc[0]
name Basri, Tomorrow's Champion
cmc 1
card_type Legendary Creature - Human Knight
color W
set dft
collector_number 3
rarity rare
color_category White
status Owned
maybeboard_ind False
notes None
Name: 0, dtype: object
( cube.query(""" select color_category, cmc as cmc from cards where color_category != 'Lands' """).to_df() .pipe(pn.ggplot, pn.aes(x="cmc", color="color_category")) + pn.stat_ecdf()+ pn.facet_wrap("color_category")+ pn.scale_color_manual(style.cmc_scale)+ pn.theme_minimal())
Cards I own:
Code
cube.query("select status, count(*) as num from cards where maybeboard_ind=False group by status")
┌───────────┬───────┐
│ status │ num │
│ varchar │ int64 │
├───────────┼───────┤
│ Not Owned │ 5 │
│ Owned │ 367 │
└───────────┴───────┘
Keywords
Code
( cube.query(""" select color_category , sum(COLUMNS(kw.* EXCLUDE (id))) from cards_with_info inner join keyword_counts as kw using(id) where 1=1 and maybeboard_ind=False and cmc > 0 group by color_category """).to_df().set_index("color_category").T# sort by most common .assign(total=lambda df: df.sum(axis=1)) .sort_values("total", ascending=False)# show top 10 .head(10).astype(int))