planet is an R package for inferring ethnicity (1), gestational age (2), and cell composition (3) from placental DNA methylation data.

See full documentation at https://victor.rbind.io/planet

Installation

You can install planet from this GitHub repo:

devtools::install_github('wvictor14/planet')

Usage

See vignettes for more detailed usage.

Example Data

All functions in this package take as input DNAm data from the 450k and EPIC DNAm microarrays. For best performance, I suggest providing unfiltered data normalized with noob and BMIQ. A processed example dataset, plBetas, is provided to show the format this data should be in. predictEthnicity returns a data frame (a tibble) and predictAge returns a numeric vector, as the examples below show.
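The input format can be sanity-checked before calling any planet function. The following is a minimal sketch, assuming beta values are stored as a numeric matrix with CpG probes in rows and samples in columns (inferred from how plBetas is indexed in the cell composition example). The check_betas helper, the probe IDs, and the values are hypothetical and not part of planet.

```r
# Toy beta matrix standing in for plBetas: CpGs in rows, samples in columns.
# Probe IDs, sample names, and values are made up for illustration.
betas <- matrix(
  c(0.91, 0.12, 0.55,
    0.88, 0.10, 0.61),
  nrow = 3,
  dimnames = list(c("cg0000001", "cg0000002", "cg0000003"),
                  c("sample_A", "sample_B"))
)

# Hypothetical helper (not part of planet): basic checks on the input matrix.
check_betas <- function(x, required_cpgs) {
  stopifnot(is.matrix(x), is.numeric(x))
  stopifnot(!is.null(rownames(x)), !is.null(colnames(x)))
  list(
    in_range   = all(x >= 0 & x <= 1, na.rm = TRUE),  # betas lie in [0, 1]
    n_present  = sum(required_cpgs %in% rownames(x)), # probes found in the data
    n_required = length(required_cpgs)
  )
}

res <- check_betas(betas,
                   required_cpgs = c("cg0000001", "cg0000003", "cg0000009"))
message(res$n_present, " of ", res$n_required, " predictors present.")
```

Counting how many required probes are present mirrors the "predictors present" message that the prediction functions print, and makes it easier to spot data that was filtered too aggressively.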

A quick example of each major function is illustrated with this example data:

library(minfi)
library(planet)
library(magrittr) # provides the %>% pipe used below

# load example data
data(plBetas)
data(plPhenoData) # sample information

Predict Ethnicity

predictEthnicity(plBetas) %>%
  head()
#> 1860 of 1860 predictors present.
#> # A tibble: 6 x 7
#>   Sample_ID Predicted_ethni~ Predicted_ethni~ Prob_African Prob_Asian
#>   <chr>     <chr>            <chr>                   <dbl>      <dbl>
#> 1 GSM19449~ Caucasian        Caucasian            0.00331    0.0164  
#> 2 GSM19449~ Caucasian        Caucasian            0.000772   0.000514
#> 3 GSM19449~ Caucasian        Caucasian            0.000806   0.000699
#> 4 GSM19449~ Caucasian        Caucasian            0.000883   0.000792
#> 5 GSM19449~ Caucasian        Caucasian            0.000885   0.00130 
#> 6 GSM19449~ Caucasian        Caucasian            0.000852   0.000973
#> # ... with 2 more variables: Prob_Caucasian <dbl>, Highest_Prob <dbl>
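The probability columns can be used to flag samples whose predicted class is uncertain. This is a minimal sketch on a toy data frame that reuses the column names Prob_African, Prob_Asian, Prob_Caucasian, and Highest_Prob from the output above. The probabilities, the sample IDs, the Top_class and Low_confidence columns, and the 0.75 cutoff are all assumptions made for illustration, as is the reading of Highest_Prob as the largest of the three class probabilities.

```r
# Toy stand-in for predictEthnicity() output; all values are made up.
eth <- data.frame(
  Sample_ID      = c("s1", "s2", "s3"),
  Prob_African   = c(0.01, 0.40, 0.05),
  Prob_Asian     = c(0.02, 0.35, 0.90),
  Prob_Caucasian = c(0.97, 0.25, 0.05)
)

prob_cols <- c("Prob_African", "Prob_Asian", "Prob_Caucasian")
probs <- as.matrix(eth[, prob_cols])

# Largest class probability per sample, and the class it belongs to.
eth$Highest_Prob <- unname(apply(probs, 1, max))
eth$Top_class <- sub("^Prob_", "",
                     prob_cols[max.col(probs, ties.method = "first")])

# Hypothetical cutoff for flagging uncertain calls.
cutoff <- 0.75
eth$Low_confidence <- eth$Highest_Prob < cutoff

print(eth[, c("Sample_ID", "Top_class", "Highest_Prob", "Low_confidence")])
```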

Predict Gestational Age

planet provides three gestational age clocks for placental DNA methylation data from Lee Y. et al. 2019 (2). To select a specific clock, set the type argument of predictAge:

predictAge(plBetas, type = 'RPC') %>%
  head()
#> 558 of 558 predictors present.
#> [1] 38.46528 33.09680 34.32520 35.50937 37.63910 36.77051
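Because predictAge returns a bare numeric vector, it helps to pair the predictions with sample IDs before further analysis. The sketch below uses the six predicted values printed above. It assumes that predictions follow the column order of plBetas, so that they correspond to the first six sample IDs shown in the cell composition output. The column names Predicted_GA and Below_37_weeks are hypothetical.

```r
# Predicted gestational ages (weeks) copied from the predictAge() output.
predicted_ga <- c(38.46528, 33.09680, 34.32520, 35.50937, 37.63910, 36.77051)

# Assumption: predictions are in the same order as the columns of plBetas,
# so these are the matching sample IDs.
sample_ids <- c("GSM1944936", "GSM1944939", "GSM1944942",
                "GSM1944944", "GSM1944946", "GSM1944948")

ga <- data.frame(Sample_ID = sample_ids, Predicted_GA = predicted_ga)

# Flag samples with a predicted gestational age under 37 weeks.
ga$Below_37_weeks <- ga$Predicted_GA < 37

print(ga)
```

With the real data, the same table could be built from colnames(plBetas) and the vector returned by predictAge, under the same ordering assumption.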

Predict Cell Composition

Reference data to infer cell composition on placental villi DNAm samples (3) can be used with cell deconvolution from minfi or EpiDISH. These are provided in this package as plCellCpGsThird and plCellCpGsFirst for third trimester (term) and first trimester samples, respectively.

data('plCellCpGsThird')

# projectCellType is not exported by minfi, hence the triple colon
minfi:::projectCellType(

  # subset your data to cell cpgs
  plBetas[rownames(plCellCpGsThird), ],

  # input the reference cpg matrix
  plCellCpGsThird,

  lessThanOne = FALSE) %>%

  head()
#>            Trophoblasts    Stromal      Hofbauer Endothelial       nRBC
#> GSM1944936    0.1091279 0.04891919  0.000000e+00  0.08983998 0.05294062
#> GSM1944939    0.2299918 0.00000000 -1.806592e-19  0.07888007 0.03374149
#> GSM1944942    0.1934287 0.03483540  0.000000e+00  0.09260353 0.02929310
#> GSM1944944    0.2239896 0.06249135  1.608645e-03  0.11040693 0.04447951
#> GSM1944946    0.1894152 0.07935955  0.000000e+00  0.10587439 0.05407587
#> GSM1944948    0.2045124 0.07657717  0.000000e+00  0.09871149 0.02269798
#>            Syncytiotrophoblast
#> GSM1944936           0.6979477
#> GSM1944939           0.6377822
#> GSM1944942           0.6350506
#> GSM1944944           0.5467642
#> GSM1944946           0.6022329
#> GSM1944948           0.6085825
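The estimates above include a tiny negative value (-1.806592e-19), presumably numerical noise, and the rows do not sum to exactly 1. The following is a minimal sketch of tidying such output, using the first two rows printed above. Clamping at zero is a suggested post-processing step, not something planet or minfi does for you.

```r
# First two rows of the projectCellType() output shown above.
cell_props <- rbind(
  GSM1944936 = c(0.1091279, 0.04891919,  0.000000e+00,
                 0.08983998, 0.05294062, 0.6979477),
  GSM1944939 = c(0.2299918, 0.00000000, -1.806592e-19,
                 0.07888007, 0.03374149, 0.6377822)
)
colnames(cell_props) <- c("Trophoblasts", "Stromal", "Hofbauer",
                          "Endothelial", "nRBC", "Syncytiotrophoblast")

# Set tiny negative estimates to exactly zero; dimnames are preserved.
cell_props_clean <- pmax(cell_props, 0)

# Row totals are close to, but not exactly, 1.
totals <- rowSums(cell_props_clean)
print(round(totals, 3))
```

Whether to rescale each row to sum to 1 afterwards depends on the downstream analysis, so the sketch leaves the totals as they are.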