The following steps walk through an example study, MTBLS1684 (n=499), which contains Agilent QTOF 6550 data collected in RP-ESI-POS mode.
Download and install the latest version of R Software from https://cran.r-project.org/
Download and install the latest version of RStudio software from https://www.rstudio.com/products/rstudio/download/
Create a directory named "MTBLS1684" on your computer's hard drive.
Download the raw data (.D) files for this case study from https://www.ebi.ac.uk/metabolights/MTBLS1684 and copy them to the "MTBLS1684" directory.
Download the reference m/z and RT file from https://github.com/idslme/IDSL.IPA/blob/main/Reference_masses_peak_annotation.xlsx and copy it to the "MTBLS1684" directory.
Use the MSConvert utility from ProteoWizard to convert the .D files to mzML format.
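The conversion step can also be scripted. The following is a minimal Python sketch, assuming ProteoWizard's msconvert command-line tool is installed and available on the PATH. The helper names build_msconvert_command and convert_all are illustrative only and are not part of IDSL.IPA or ProteoWizard. By default the sketch only builds the commands so they can be inspected before anything is run.

```python
from pathlib import Path
import subprocess

def build_msconvert_command(raw_path, out_dir):
    """Return the msconvert argument list that converts one .D folder to mzML."""
    return ["msconvert", str(raw_path), "--mzML", "-o", str(out_dir)]

def convert_all(data_dir, run=False):
    """Build one msconvert command per .D folder in data_dir.

    The commands are executed only when run=True (assumes msconvert is on PATH).
    """
    data_dir = Path(data_dir)
    commands = [build_msconvert_command(p, data_dir)
                for p in sorted(data_dir.glob("*.D"))]
    if run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if msconvert fails
    return commands
```

Calling convert_all("MTBLS1684") returns the list of commands; passing run=True executes them one after another and writes the mzML files into the same directory.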
The pipeline requires 48 parameters divided into 8 sections. For the "MTBLS1684" data, use the settings below in each section to run the pipeline on your computer.
Beginners are advised to fill in these sections using the online form at https://ipa.idsl.me/ipa-comprehensive-analysis/online-form
R experts can edit the R script directly in the RStudio IDE.
Note : the IDSL.IPA pipeline works on both Linux and Windows operating systems.
There are three ways to prepare the parameter input for the IDSL.IPA pipeline; see https://ipa.idsl.me/ipa-comprehensive-analysis for details.
Section 1 (Global Parameters)
PARAM0001 (Peak List for individual LC/HRMS files) : YES
PARAM0002 (Aligned peak table) : YES
PARAM0003 (Gap-filled peak table) : YES
PARAM0004 (Annotate peak table using a reference database) : YES
PARAM0005 (Targeted Analysis) : NO
PARAM0006 (Number of parallel threads) : 8 (change this value to match the number of threads available on your computer; many modern CPUs provide two threads per core).
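To pick a value for PARAM0006, it helps to check how many logical CPUs (threads) the machine reports. The sketch below uses Python's os.cpu_count() for this. Leaving one thread free for the operating system is an assumption made here for illustration, not an IDSL.IPA recommendation, and suggest_thread_count is a hypothetical helper name.

```python
import os

def suggest_thread_count(reserve=1):
    """Suggest a PARAM0006 value: logical CPUs minus `reserve`, never below 1."""
    logical = os.cpu_count() or 1  # cpu_count() can return None on some systems
    return max(1, logical - reserve)
```

On a machine that reports 8 logical CPUs, suggest_thread_count() gives 7 and suggest_thread_count(reserve=0) gives 8.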
Section 2 (Data Import and Export)
PARAM0007 (Location of the LC/HRMS data) : "full path of the MTBLS1684 directory"
PARAM0008 (List of files) : "All"
Note : to process only a subset of the files, provide a semicolon-separated (;) list of file names instead of "All".
PARAM0009 (Data format) : "mzML"
Note : IDSL.IPA depends on the mzR package to read the mzML files.
PARAM0010 (Location of the output files) : "full path of the MTBLS1684 directory"
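When only a subset of files is needed for PARAM0008, typing the semicolon-separated list by hand is error-prone. The following is a small Python sketch that builds such a string from the mzML files found in a directory. The function name semicolon_file_list and the optional keep filter are assumptions made for illustration and are not part of IDSL.IPA.

```python
from pathlib import Path

def semicolon_file_list(data_dir, keep=None):
    """Join the mzML file names in data_dir with ';'.

    keep is an optional function that takes a file name and returns True
    for the files that should stay in the list.
    """
    names = sorted(p.name for p in Path(data_dir).glob("*.mzML"))
    if keep is not None:
        names = [n for n in names if keep(n)]
    return ";".join(names)
```

For example, semicolon_file_list("MTBLS1684", keep=lambda n: n.startswith("00")) would keep only the files whose names start with "00"; the resulting string can be pasted as the PARAM0008 value.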
Section 3 (Pairing of potential C12 and C13 peaks)
PARAM0011 (Instrument noise level) : 500
Note : this noise level is used only for removing noisy C12 peaks.
PARAM0012 (Cutoff for the maximum ratio of putative C12 and C13 peaks) : 90
Section 4 (Chromatographic Peak Detection)
PARAM0013 (Mass tolerance to create EICs) : 0.01
PARAM0014 (RT tolerance to remove redundant peaks) : 0.05
PARAM0015 (Smoothing windows for LOESS) : 12
PARAM0017 (Fronting and tailing peaks resolving factor) : 0.05
PARAM0018 (Rounding factor for m/z values) : 2
Section 5 (Chromatographic Peak Analysis and Data Reduction)
PARAM0019 (perform recursive mass correction) : YES
PARAM0020 (number of extra scans on both sides of the corrected mass) : 50
PARAM0021 (minimum peak height) : 1000
PARAM0022 (% cutoff for maximum missing scans) : 30
PARAM0023 (minimum nIsoPairs) : 3
PARAM0024 (minimum % nIsoPairs) : 30
PARAM0025 (maximum ratio of cumulative C12/C13 ratio) : 80
PARAM0026 (maximum ratio of peak width at half height) : 1
PARAM0027 (minimum signal to noise (local)) : 2
PARAM0028 (number of points for data interpolation) : 100
Section 6 (Retention time correction and peak alignment)
PARAM0029 (perform retention time correction) : YES
PARAM0030 (reference sample list) : "003.mzML;004.mzML;005.mzML;007.mzML;008.mzML;009.mzML;010.mzML;011.mzML;012.mzML;014.mzML"
Note : separate the sample names in the list with semicolons (;)
PARAM0031 (minimum % of the recurring peaks in reference samples) : 100
PARAM0032 (Retention time correction method) : "RetentionIndex"
PARAM0033 (Reference peak tolerance for "RetentionIndex" to minimize local RT errors) : 5
PARAM0034 (Degree for the polynomial regression) : ""
PARAM0035 (mass tolerance for peak alignment) : 0.01
PARAM0036 (RT tolerance for peak alignment) : 0.05
PARAM0037 (number of m/z slices for parallel computation) : 20
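A typo in the PARAM0030 reference sample list is easy to miss. The sketch below, in Python, splits a semicolon-separated value into file names and reports any reference sample that is not among the available mzML files. Both helper names are hypothetical, and the check is a convenience suggested here rather than something IDSL.IPA is known to perform.

```python
def parse_sample_list(value):
    """Split a semicolon-separated parameter value into clean file names."""
    return [name.strip() for name in value.split(";") if name.strip()]

def missing_reference_samples(reference_value, available_files):
    """Return the reference sample names that are absent from available_files."""
    available = set(available_files)
    return [n for n in parse_sample_list(reference_value) if n not in available]
```

An empty result means every reference sample was found; any names returned should be corrected before running the retention time correction.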
Section 7 (Gap-filling)
PARAM0038 (mass tolerance) : 0.01
PARAM0039 (RT tolerance) : 0.1
PARAM0040 (extra scans on both sides of the peak apex for calculating peak area) : 20
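To give a feel for what PARAM0040 controls, the following Python sketch integrates a chromatogram over a window of extra scans on each side of the peak apex. The trapezoidal rule is an assumption chosen for illustration; the integration method actually used by IDSL.IPA may differ, and window_area is a hypothetical helper name.

```python
def window_area(rt, intensity, apex_index, extra_scans=20):
    """Trapezoidal area over apex_index +/- extra_scans, clipped to the data range."""
    lo = max(0, apex_index - extra_scans)
    hi = min(len(rt) - 1, apex_index + extra_scans)
    area = 0.0
    for i in range(lo, hi):
        # area of one trapezoid between consecutive scans
        area += 0.5 * (intensity[i] + intensity[i + 1]) * (rt[i + 1] - rt[i])
    return area
```

A larger window captures more of the peak tails but also more of any neighboring signal, which is the trade-off behind the choice of this value.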
Section 8 (Peak Annotation)
PARAM0041 (reference file location) : "MTBLS1684"
PARAM0042 (reference file name) : "Reference_masses_peak_annotation.xlsx"
PARAM0043 (mass tolerance) : 0.01
PARAM0044 (RT tolerance) : 0.05
PARAM0045 (Use corrected RT values) : YES
PARAM0046 (Compound centric annotation) : YES
PARAM0047 (Sample centric annotation) : YES
PARAM0048 (Gap-filling for the sample centric annotation) : YES
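The mass and RT tolerances in this section can be read as a simple window test around each reference compound. The following Python sketch illustrates that idea under stated assumptions: the tolerances are taken to be absolute differences in the same units as the reference file, the compound names and values are made-up placeholders, and annotate_peak is a hypothetical helper rather than the IDSL.IPA implementation.

```python
def annotate_peak(mz, rt, reference, mz_tol=0.01, rt_tol=0.05):
    """Return names of reference compounds within both tolerances of a peak.

    reference maps a compound name to a (reference m/z, reference RT) pair.
    """
    return [name for name, (ref_mz, ref_rt) in reference.items()
            if abs(mz - ref_mz) <= mz_tol and abs(rt - ref_rt) <= rt_tol]

# Placeholder reference entries, not taken from the real reference file.
reference = {
    "compound_A": (180.063, 1.20),
    "compound_B": (180.070, 1.22),
    "compound_C": (255.232, 7.90),
}
```

With these placeholder values, a peak at m/z 180.065 and RT 1.21 falls within both windows of compound_A and compound_B, which shows that a single peak can receive more than one candidate annotation when reference entries lie close together.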
Expected results and outcomes : once the IDSL.IPA R script completes the calculations, you should obtain the results available at https://zenodo.org/record/4708401 and https://zenodo.org/record/4708411