devtools::install_github("dionnecargy/SeroTrackR") # To download the package
install.packages(c("tidyverse", "knitr", "targets"), repos = "https://cloud.r-project.org") # Download the other packages used in this tutorial
library(SeroTrackR) # To load the package
library(tidyverse) # for data wrangling and visualisation
library(knitr) # for RMarkdown visualisation and PDF generation
File Name Convention
The following files are required for this app. You might like to organise your folder as follows:
The raw data files should all contain “plate1”, “plate2”, “plate3”, etc., in the file name. Ensure there are no special characters or spaces between the “plate” and number. To keep things simple, please also do not write any leading 0’s before the number e.g., do not write “plate01”.
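The naming rule above can be checked before running an analysis. The following is a minimal sketch in base R; the file names are invented for illustration, and the regular expression is only an assumed reading of the rule (lowercase "plate" followed directly by a number with no leading zero), not the check the package itself performs.

```r
# Hypothetical file names, invented for illustration
file_names <- c(
  "exp1_plate1.csv",   # valid
  "exp1_plate01.csv",  # leading zero
  "exp1_plate 2.csv",  # space between "plate" and the number
  "exp1_plate_3.csv",  # special character between "plate" and the number
  "exp1_plate12.csv"   # valid
)

# "plate" followed immediately by a number that does not start with 0
valid_name <- grepl("plate[1-9][0-9]*", file_names)

# Show each file name next to the result of the check
data.frame(file_names, valid_name)
```

Any file flagged as invalid here would need renaming before it matches the convention described above.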
For the MAGPIX and Intelliflex Machines:
You can pre-program the MAGPIX machine so that you can export all the raw data directly from the machine once the plate reading is completed. There is no need to edit the raw data file that comes from the MAGPIX.
Within your plate layout in the MAGPIX, you can use the “U” button for all unknown samples, the “B” button for Background or Blank samples, and the “S” button for Standard Curve samples. For the control wells, please feel free to edit these labels so that the ID is just “B”, “S1”, “S2”, “S3”, …, “S10”.
You can also write in the Operator i.e., who ran the assay! This is useful to track variation in plates between experiments.
For the Bioplex Machines:
Isolate names will tend to be written as ‘X1’, ‘X2’, ‘X3’… and saved as an .xlsx file. Specifics on Bio-Plex machines will be added shortly.
Plate Layout File
The plate layout file should contain all of the plate layouts in each tab. For each 96-well plate that you run on the Luminex machine, prepare a plate layout that includes the sample labels that will match your raw data. The application will match the raw data to the corresponding sample based on the plate layout that you import.
Make sure that your sample labels in the plate layout are as follows:
Standards: Labels start with “S” and then a number as required (e.g. S1, S2, S3 or Standard1, Standard2, Standard3).
Blanks: Labels start with “B” and then a number as required if there is more than one blank sample (e.g. ‘B1’, ‘B2’, or ‘Blank 1’, ‘Blank2’, etc.).
Unknown Samples: Label your unknown samples according to your specific sample codes (e.g. ABC001, ABC002).
The package expects standards to start with “S” and blanks to start with “B”, but everything else with a label will be considered an unknown sample. If you have other types of samples, for example a positive control, you can use a different sample label to the other unknown study samples (i.e. “PositiveControl” in addition to the “ABC” study codes).
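The labelling rule can be illustrated with a small helper. This is a sketch only: `classify_label()` is a hypothetical function written for this example, not part of {SeroTrackR}, and it assumes the rule is a simple test on the first letter of each label.

```r
# Hypothetical helper: assign a sample type from the first letter of the label
classify_label <- function(label) {
  kind <- rep("unknown", length(label))
  kind[grepl("^B", label)] <- "blank"
  kind[grepl("^S", label)] <- "standard"
  # Wells without a label are not treated as samples
  kind[is.na(label) | label == ""] <- NA_character_
  kind
}

labels <- c("S1", "Standard2", "B1", "Blank2", "ABC001", "PositiveControl", "")
label_types <- classify_label(labels)
data.frame(labels, label_types)
```

Under this assumed rule, a study code that itself begins with “S” or “B” would be read as a standard or a blank, so it is worth checking your sample codes against the two prefixes.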
The standards S1-S10 correspond to the following dilution concentrations:

| S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 |
|---|---|---|---|---|---|---|---|---|---|
| 1/50 | 1/100 | 1/200 | 1/400 | 1/800 | 1/1600 | 1/3200 | 1/6400 | 1/12800 | 1/25600 |
The standards S1-S5 correspond to the following dilution concentrations:

| S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|
| 1/50 | 1/250 | 1/1250 | 1/6250 | 1/31250 |
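Both dilution series follow a simple pattern: the 10-point curve is a two-fold serial dilution starting at 1/50, and the 5-point curve is a five-fold serial dilution starting at 1/50. The short sketch below reproduces the denominators listed above; the variable names are chosen for this example only.

```r
# Denominators of the 10-point series: two-fold steps from 1/50
ten_point <- 50 * 2^(0:9)
names(ten_point) <- paste0("S", 1:10)

# Denominators of the 5-point series: five-fold steps from 1/50
five_point <- 50 * 5^(0:4)
names(five_point) <- paste0("S", 1:5)

# Print as "1/denominator" labels
paste0("1/", ten_point)
paste0("1/", five_point)
```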
Data Analysis: runPvSeroPipeline()
Run the global function runPvSeroPipeline() from the {SeroTrackR} R package! It performs, in order, all of the steps of the Plasmodium vivax serology test and treat protocol as found in our application.
Using Tutorial Dataset: Load the Data
We will be using the built-in files in the R package for this tutorial, as shown below.
This is a table containing the classification results (seropositive or seronegative) for each SampleID. In this case, the classification results are stored in the pred_class_max column because we chose sens_spec = "maximised". If you change it to another type of threshold, then the suffix of that column will change accordingly.
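One way to see which example files ship with an installed package is to list its extdata folder with base R's `system.file()`. This sketch assumes the example files sit under extdata (the FAQ path “../inst/extdata/” suggests so); it does not assume any particular file names.

```r
# Locate the package's extdata folder; system.file() returns "" if it is not found
example_dir <- system.file("extdata", package = "SeroTrackR")

# List whatever files are bundled there (empty if the package is not installed)
example_files <- list.files(example_dir, full.names = TRUE)
basename(example_files)
```

The full paths in `example_files` can then be passed wherever a raw data or plate layout path is expected.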
You will also see the relative antibody unit (RAU) values (columns with antigen names), whether the sample passed QC check (QC_total) and the plate that they were run on.
final_analysis[[1]] %>% head() %>% kable()
| SampleID | Plate | QC_total | EBP | LF005 | LF010 | LF016 | MSP8 | PTEX150 | PvCSS | RBP2b.P87 | pred_class_max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ABC013 | plate1 | pass | 0.0003339 | 0.0015045 | 0.0002163 | 0.0014567 | 0.0000195 | 0.0001591 | 0.0000772 | 0.0003714 | seropositive |
| ABC097 | plate2 | pass | 0.0004324 | 0.0015615 | 0.0001944 | 0.0013373 | 0.0000195 | 0.0001549 | 0.0000705 | 0.0009189 | seropositive |
| ABC181 | plate3 | pass | 0.0003822 | 0.0015832 | 0.0002144 | 0.0013711 | 0.0000195 | 0.0001582 | 0.0000710 | 0.0002070 | seropositive |
| ABC022 | plate1 | pass | 0.0200000 | 0.0200000 | 0.0007373 | 0.0194885 | 0.0006247 | 0.0003145 | 0.0006008 | 0.0004895 | seropositive |
| ABC106 | plate2 | pass | 0.0057123 | 0.0193731 | 0.0007458 | 0.0195240 | 0.0006263 | 0.0003077 | 0.0006480 | 0.0126203 | seropositive |
| ABC190 | plate3 | pass | 0.0098260 | 0.0200000 | 0.0007400 | 0.0200000 | 0.0006020 | 0.0003171 | 0.0006555 | 0.0175253 | seropositive |
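A quick way to summarise a results table like this one is to count classifications per plate among the samples that passed QC. The sketch below rebuilds only the identifying columns of the six rows shown above as a stand-in data frame; with real output you would presumably start from final_analysis[[1]] instead.

```r
# Stand-in for final_analysis[[1]], using the six rows displayed above
results <- data.frame(
  SampleID = c("ABC013", "ABC097", "ABC181", "ABC022", "ABC106", "ABC190"),
  Plate = c("plate1", "plate2", "plate3", "plate1", "plate2", "plate3"),
  QC_total = rep("pass", 6),
  pred_class_max = rep("seropositive", 6)
)

# Keep samples that passed QC, then cross-tabulate plate against classification
passed <- results[results$QC_total == "pass", ]
counts <- table(passed$Plate, passed$pred_class_max)
counts
```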
Standard Curve Plot
The standard curve plots are generated from the antibody data from the standards you indicated in your plate layout (e.g. S1-S10) and Median Fluorescent Intensity (MFI) units are displayed in log10-scale. In the case of the PvSeroTaT multi-antigen panel, the antigens will be displayed and in general your standard curves should look relatively linear (only when the y-axis is on logarithmic scale).
final_analysis[[2]]
Bead Counts QC Plot
A summary of the bead counts for each plate well is displayed, with blue indicating that there are sufficient beads (≥15) and red indicating that there are not enough. If any of the wells are red, they should be double-checked manually and re-run on a new plate if required.
The function will inform you whether there are “No repeats necessary” or provide a list of samples to be re-run. In the example data, the beads in plate 2 wells A1 and A2 will need to be repeated.
The Median Fluorescent Intensity (MFI) units for each antigen are displayed for your blank samples. In general, each blank sample should have ≤50 MFI for each antigen; if the values are higher, they should be cross-checked manually.
In the example data, blank samples recorded higher MFI values for LF005 on plate 1 and should be checked to confirm this is expected from the assay.
final_analysis[[5]]
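The two QC thresholds described above (at least 15 beads per well, and at most 50 MFI in blank wells) can be expressed as simple flags. This is a sketch on an invented data frame: the column names and values are hypothetical and do not reflect the package's internal data structure.

```r
# Hypothetical per-well QC table, invented for illustration
qc <- data.frame(
  Plate = c("plate1", "plate1", "plate2", "plate2"),
  Well = c("A1", "A2", "A1", "A2"),
  Type = c("blank", "unknown", "blank", "unknown"),
  BeadCount = c(40, 12, 55, 30),
  MFI = c(65, 1200, 20, 800)
)

# Flag wells with too few beads (fewer than 15)
qc$low_beads <- qc$BeadCount < 15

# Flag blank wells whose MFI exceeds 50
qc$high_blank <- qc$Type == "blank" & qc$MFI > 50

# Wells that need a manual check or a re-run
flagged <- qc[qc$low_beads | qc$high_blank, ]
flagged
```

In this toy table the first well is flagged for a high blank and the second for a low bead count.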
Model Output Plot
The automated data processing in this app allows you to convert your Median Fluorescent Intensity (MFI) data into Relative Antibody Units (RAU) by fitting a 5-parameter logistic function to the standard curve on a per-antigen level. The results from this log-log conversion should look relatively linear for each antigen.
final_analysis[[6]]
$plate1
$plate2
$plate3
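To make the idea of a 5-parameter logistic (5PL) curve concrete, the sketch below defines one common parameterisation and its inverse. The parameter names and values are assumptions chosen for illustration; the package may use a different parameterisation and fitting procedure. Inverting the curve maps an observed MFI back to a position on the dilution axis, which is the general idea behind reading a relative unit off a standard curve.

```r
# One common 5PL form: response rises from `bottom` (x = 0) towards `top` (large x)
five_pl <- function(x, bottom, top, mid, slope, asym) {
  top + (bottom - top) / (1 + (x / mid)^slope)^asym
}

# Algebraic inverse of the function above: recover x from a response y
inverse_five_pl <- function(y, bottom, top, mid, slope, asym) {
  mid * (((bottom - top) / (y - top))^(1 / asym) - 1)^(1 / slope)
}

# Hypothetical parameter values, not fitted to any real data
params <- list(bottom = 10, top = 30000, mid = 0.001, slope = 1.2, asym = 0.8)

# Dilutions of a 10-point, two-fold standard curve starting at 1/50
dilution <- 1 / (50 * 2^(0:9))

# Forward: dilution to simulated MFI; backward: MFI to dilution
mfi <- do.call(five_pl, c(list(x = dilution), params))
recovered <- do.call(inverse_five_pl, c(list(y = mfi), params))

all.equal(recovered, dilution)
```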
Run Classification: No
no_classification_final_analysis <- runPvSeroPipeline(
  raw_data = your_raw_data,
  plate_layout = your_plate_layout,
  platform = "magpix",
  location = "ETH",
  experiment_name = "experiment1",
  classify = "No", # key if you do NOT want any classification performed, i.e. you do not have PvSeroTaT antigens
  algorithm_type = "antibody_model",
  sens_spec = "maximised"
)
For all of these analyses you can run as many plates as you wish.
5-Point Standard Curve
Step 1: Load your data!
Firstly, we will be using the example data that is built into the package. To analyse your own data, replace the system.file() argument with the file path to your own files.
Caitlin and Dionne have worked on a function to (a) process raw serological data and (b) convert MFI to RAU. The runPlasmoPipeline() function will output three data frames:
All_Results: All columns of every MFI to RAU conversion
MFI_RAU: Just the SampleID, Plate, MFI and RAU values per antigen
MFI_RAU_long: SampleID, Plate, MFI, RAU, Antigen, Species (long-format df)
results_10stdcurve <- runPlasmoPipeline(
  raw_data = your_raw_data_10std,
  platform = "magpix",
  plate_layout = your_plate_layout_10std,
  panel = "panel1",
  std_point = 10, # here make sure you write 10!
  experiment_name = "10-point standard curve"
)
For the LDH data analysis, follow the steps below:
runLDHpipeline(
  raw_data = "your/raw/data.xlsx", # Mandatory
  plate_layout = "your/plate/layout/file.xlsx", # Mandatory
  platform = "bioplex", # Defaults to "bioplex" but "magpix" also works
  location = "ETH", # Defaults to "ETH" but "PNG" also works
  dilution = c(1000000, 333333.33, 111111.11, 37037.04, 12345.68,
               4115.23, 1371.74, 457.25, 152.42, 50.81), # Default values shown but other levels can be added
  experiment_name = "experiment1", # Defaults to "experiment1"
  file_path = NULL # Defaults to current folder/working directory
)
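The default dilution values shown above form a three-fold serial dilution starting at 1,000,000, rounded to two decimal places. The sketch below reproduces them, which can be a convenient starting point if you need to build a different series; the variable name is chosen for this example only.

```r
# Three-fold serial dilution from 1,000,000 over ten levels, rounded to 2 decimals
ldh_dilution <- round(1e6 / 3^(0:9), 2)
ldh_dilution
```

A vector built this way could be supplied to the dilution argument in place of the hard-coded numbers.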
Visualisation of the {SeroTrackR} R Package
We have used the {targets} R package to generate a pipeline! This allows us to:
Automatically detect the dependencies of each step
One-command execution
Automatic caching
Automatic detection of changes in data and/or code
For more information on {targets} see this tutorial.
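For readers new to {targets}, a pipeline is declared in a `_targets.R` file as a list of targets, and dependencies are inferred from the names used in each command. The sketch below is a generic, hypothetical three-step pipeline: the target names and the file path are placeholders, and it is not the pipeline used inside {SeroTrackR}.

```r
# _targets.R: a hypothetical pipeline; names and path are placeholders
library(targets)

list(
  # Track an input file so that changes to it invalidate downstream targets
  tar_target(raw_file, "data/plate1.csv", format = "file"),
  # Depends on raw_file because the command refers to it by name
  tar_target(raw_data, read.csv(raw_file)),
  # Depends on raw_data; skipped on re-runs if nothing upstream has changed
  tar_target(n_wells, nrow(raw_data))
)
```

With such a file in the working directory, `tar_make()` runs the whole pipeline in one command and skips targets that are already up to date, and `tar_visnetwork()` draws the dependency graph.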
FAQs
I have multiple plate layout files. How can I input them?
Use the getPlateLayout() function to create a master plate layout file to then input into the other functions in the package!
getPlateLayout("../inst/extdata/")
Here, replace “../inst/extdata/” with the path to the main folder that contains your folders. For example, if your folder looks like this: