3. Understanding the Text Format of the AMRSAPDB v1 Clinical Database
Each line in the text format of AMRSAPDB v1 Clinical starts with a two-character code, followed by four blank space characters, followed by the data.
Each line is kept to 90 characters in length for better readability of the data. The two-character codes are described below.
ID => AMRSAPDB v1 Accession
PN => Peptide Name
SO => Source Organism
PL => Peptide Length
TX => Taxonomy
PE => Peptide Existence Level
TO => Target Organism
AC => Counts of Amino Acids
AF => Frequencies of Amino Acids
MA => Missing Amino Acid(s)
MO => Most Occurring Amino Acid(s)
LO => Less Occurring Amino Acid(s)
BA => Hydrophobic Amino Acid(s) Count
LA => Hydrophilic Amino Acid(s) Count
AA => Acidic Amino Acid(s) Count
BC => Basic Amino Acid(s) Count
MC => Modified Amino Acid(s) Count
MF => Modified Amino Acid(s) Frequencies
MM => Molecular Mass
AI => Aliphatic Index
II => Instability Index (Half Life)
HP => Hydrophobicity (GRAVY)
HM => Hydrophobic Moment
IP => Isoelectric Point
NC => Net Charge
SF => Secondary Structure Fraction
AR => Aromaticity
ME => Molar Extinction Coefficient (cysteine|cystine)
MS => Mass Shift
SC => Structural Class of the Peptide
XR => Database Cross-reference
SQ => Sequence
XX => Blank Line
// => End of an entry
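
The layout described above (two-character code, four blank spaces, data, with "//" closing each entry) can be read with a few lines of Python. The following is only a minimal sketch under stated assumptions: the function name parse_entries, the sample accession and the sample sequence are hypothetical, and it assumes that a field spanning several lines repeats its code on every line.

```python
from collections import defaultdict


def parse_entries(lines):
    """Yield one dict per entry, mapping each two-character code to its data lines."""
    entry = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        code = line[:2]
        if code == "//":
            # "//" marks the end of an entry: emit it and start a new one.
            if entry:
                yield dict(entry)
            entry = defaultdict(list)
        elif code == "XX" or not line.strip():
            # "XX" is a blank separator line and carries no data.
            continue
        else:
            # Assumed layout: 2-character code + 4 blanks, so data starts at index 6.
            entry[code].append(line[6:].strip())
    if entry:
        # Tolerate a final entry that is missing its "//" terminator.
        yield dict(entry)


# Hypothetical sample entry; these values are not taken from the database.
sample = """\
ID    AMRSAP0001
PN    Example peptide
XX
SQ    GIGKFLHSAK
SQ    KFGKAFVGEI
//
""".splitlines()

entries = list(parse_entries(sample))
sequence = "".join(entries[0]["SQ"])
```

Each code maps to a list rather than a single string so that fields wrapped across several lines are preserved in order; joining the SQ lines, as in the last statement, rebuilds the full sequence.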