SCAT: Structural Correlation Analysis Tool

SCAT is designed for the analysis of the structural properties of the tryptophan environment in proteins. The SCAT module calculates the structural properties of tryptophan residues and assigns each residue to one of the five spectral-structural classes. The module is fully interoperable with the PDB: after the user provides the protein's PDB code, the program downloads and parses the PDB file.

Input of data

Data input consists of two steps.

Step 1. Provide the exact 4-character PDB code of the file.
Please remember that the submitted protein structure must not contain hydrogen atoms (protons).
Step 2. Select the chains, protein residues, atoms and hetatoms to be used for the calculation. A link to the selected file in the PDB is provided. For each chain, a subset of the residues can be selected: enter the range of residues (inclusive) in the text box. If you leave the box blank, all residues in the chain will be used.

Examples:

A separate line is reserved for comments. The comments will appear in all output files together with other information about the selected PDB file.
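The two input steps can be illustrated with a minimal sketch. It is not SCAT's own code: the function names are hypothetical, the check assumes the classic PDB identifier format (one digit from 1 to 9 followed by three letters or digits), and the residue range is passed as two integers because the exact text-box syntax is not specified here.

```python
import re

# Assumption: classic 4-character PDB identifiers, e.g. "1abc".
PDB_CODE = re.compile(r"[1-9][A-Za-z0-9]{3}")


def is_valid_pdb_code(code):
    """Return True if the string looks like a 4-character PDB code."""
    return PDB_CODE.fullmatch(code.strip()) is not None


def select_residues(residue_numbers, first=None, last=None):
    """Keep residues in the inclusive range [first, last].

    A blank selection (first or last is None) keeps every residue,
    mirroring the behaviour described for an empty text box.
    """
    if first is None or last is None:
        return list(residue_numbers)
    return [n for n in residue_numbers if first <= n <= last]
```

Because the range is inclusive, select_residues(range(1, 11), 3, 5) keeps residues 3, 4 and 5.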


Tips: structural calculations

SCAT calculations

The calculation of the structural parameters of the environment of tryptophan residues is expected to take 1-20 minutes, depending on the number of tryptophan residues in the protein. If the results do not appear within 30 minutes, please contact the PFAST administrator.

Output file: 16-parameters.txt

The results of the structural-correlation analysis are presented as six ASCII files. The 16-parameters.txt file contains summary information about the structural parameters of the environment of tryptophan residues.

Line 1-10

Date of calculation
4 character PDB code
Name of protein (from PDB file)
Source of protein (from PDB file)
Authors (from PDB file)
Comment line (user comments from the previous step)
Resolution
[number of residues in protein] number of tryptophan residues {the position of tryptophan residues in sequence of protein}

Next Lines

The next lines contain the 16 structural parameters of the environment of each tryptophan residue in the protein. Detailed information about the calculation of the structural parameters can be found in the Background section or in the papers.

  1. Accessibility to solvent (%) of the Nε1 atom of the indole ring, calculated taking into account all hetatoms included in the analysis
  2. Accessibility to solvent (%) of the Nε1 atom of the indole ring, calculated with all hetatoms excluded from the analysis
  3. Accessibility to solvent (%) of the CZ2 atom of the indole ring, calculated taking into account all hetatoms included in the analysis
  4. Accessibility to solvent (%) of the CZ2 atom of the indole ring, calculated with all hetatoms excluded from the analysis
  5. Averaged accessibility to solvent (%) of the nine atoms of the indole ring of the tryptophan residue, calculated taking into account all hetatoms included in the analysis
  6. Averaged accessibility to solvent (%) of the nine atoms of the indole ring of the tryptophan residue, calculated with all hetatoms excluded from the analysis
  7. Packing density (Den1): the number of neighbor atoms at distance < 5.5 Å from the indole ring
  8. Packing density (Den2): the number of neighbor atoms at distance < 7.5 Å from the indole ring
  9. Parameter B1: crystallographic B-factors of the polar atoms at distance < 5.5 Å from the indole ring, normalized to the mean B-factor value of all the Cα atoms in the crystal structure
  10. Parameter B2: crystallographic B-factors of the polar atoms at distance between 5.5 and 7.5 Å from the indole ring, normalized to the mean B-factor value of all the Cα atoms in the crystal structure
  11. Parameter R1 [ R1 = Acc*B1 ]: "dynamic accessibility", a dynamic characteristic of the microenvironment
  12. Parameter R2 [ R2 = Acc*B2 ]: "dynamic accessibility", a dynamic characteristic of the microenvironment
  13. Parameter A1, relative polarity of the environment: the fraction of polar atoms among all atoms around the tryptophan residue at distance < 5.5 Å
  14. Parameter A2, relative polarity of the environment: the fraction of polar atoms among all atoms around the tryptophan residue at distance between 5.5 and 7.5 Å
  15. Charge difference: the sum of all charges near the pyrrole ring minus the sum of all charges near the benzene ring within a sphere of 5.5 Å
  16. Charge difference: the sum of all charges near the pyrrole ring minus the sum of all charges near the benzene ring within a sphere of 7.5 Å
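Several of the parameters above (Den1, Den2, B1, B2, R1, R2, A1, A2) follow directly from a list of neighbor atoms, which the sketch below illustrates. It is not the SCAT implementation. It assumes that the distance to the indole ring means the distance to the nearest ring atom, that B1 and B2 are the mean B-factor of the polar atoms in each zone divided by the mean Cα B-factor, and that Acc is an accessibility value supplied by the caller. The function name and the input layout are hypothetical.

```python
import math

POLAR_ELEMENTS = {"N", "O", "S"}


def _polar_fraction(atoms):
    """Fraction of polar atoms in a list of (element, b_factor) pairs."""
    if not atoms:
        return 0.0
    polar = sum(1 for element, _ in atoms if element in POLAR_ELEMENTS)
    return polar / len(atoms)


def _normalized_b(atoms, mean_ca_b):
    """Mean B-factor of the polar atoms divided by the mean C-alpha B-factor."""
    polar_b = [b for element, b in atoms if element in POLAR_ELEMENTS]
    if not polar_b:
        return 0.0
    return (sum(polar_b) / len(polar_b)) / mean_ca_b


def environment_parameters(ring_xyz, neighbours, acc, mean_ca_b):
    """neighbours: list of (element, (x, y, z), b_factor) tuples."""
    inner, shell = [], []
    for element, xyz, b_factor in neighbours:
        # Assumption: distance to the ring = distance to the nearest ring atom.
        d = min(math.dist(xyz, ring_atom) for ring_atom in ring_xyz)
        if d < 5.5:
            inner.append((element, b_factor))
        elif d < 7.5:
            shell.append((element, b_factor))
    b1 = _normalized_b(inner, mean_ca_b)
    b2 = _normalized_b(shell, mean_ca_b)
    return {
        "Den1": len(inner),               # atoms closer than 5.5 A
        "Den2": len(inner) + len(shell),  # atoms closer than 7.5 A
        "B1": b1,
        "B2": b2,
        "R1": acc * b1,
        "R2": acc * b2,
        "A1": _polar_fraction(inner),
        "A2": _polar_fraction(shell),
    }
```

Atoms at 7.5 Å or farther are ignored, so the input list may contain the whole structure.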

Output file: List of all atoms.txt

The file contains a list of all protein and solvent atoms within 7.5 Å of the atoms of the indole ring of tryptophan residues.

Line 1-10

Date of calculation
4 character PDB code
Name of protein (from PDB file)
Source of protein (from PDB file)
Authors (from PDB file)
Comment line (user comments from the previous step)
Resolution
[number of residues in protein] number of tryptophan residues {the position of tryptophan residues in sequence of protein}

Next Lines

List of all protein and solvent atoms within 7.5 Å of the atoms of the indole ring of tryptophan residues. The file gives the following information: the neighbor atom name, the residue name and number, the atom's B-factor, and the distance (in Å) to the nearest of the nine atoms of the indole ring of the tryptophan residue.

Output file: List of polar atoms.txt

The file contains a list of polar (O, N and S) protein and solvent atoms within 7.5 Å of the atoms of the indole ring of tryptophan residues.

Line 1-10

Date of calculation
4 character PDB code
Name of protein (from PDB file)
Source of protein (from PDB file)
Authors (from PDB file)
Comment line (user comments from the previous step)
Resolution
[number of residues in protein] number of tryptophan residues {the position of tryptophan residues in sequence of protein}

Next Lines

List of polar (O, N and S) protein and solvent atoms within 7.5 Å of the atoms of the indole ring of tryptophan residues. The file gives the following information: the neighbor atom name, the residue name and number, the atom's B-factor, and the distance (in Å) to the nearest of the nine atoms of the indole ring of the tryptophan residue.

Output file: Result.txt

The file contains information about the orientation of protein and solvent atoms located within 7.5 Å of the atoms of the indole ring of tryptophan residues.

Line 1-10

Date of calculation
4 character PDB code
Name of protein (from PDB file)
Source of protein (from PDB file)
Authors (from PDB file)
Comment line (user comments from the previous step)
Resolution
[number of residues in protein] number of tryptophan residues {the position of tryptophan residues in sequence of protein}

Table

To describe the surroundings of each of the nine atoms of the indole ring independently, a new coordinate system is introduced, centered at each atom of the indole ring in turn.
First, three atoms of the indole ring are listed in the table. The new coordinate system is centered at the first of the three listed atoms.

The table has 3 parts, which contain information about the neighbors of the first of the three indole atoms:

  1. list of polar (O, N and S) protein and solvent atoms within 5.5 Å of the first of the three atoms of the indole ring of the tryptophan residue.
  2. list of carbon protein atoms within 5.5 Å of the first of the three atoms of the indole ring of the tryptophan residue.
  3. list of polar (O, N and S) protein and solvent atoms at distance between 5.5 and 7.5 Å from the first of the three atoms of the indole ring of the tryptophan residue.

For each atom, the table contains the X, Y, Z coordinates as given in the PDB file, the recalculated X, Y, Z coordinates centered at the atoms of the tryptophan residue, and the spherical coordinates, which give the distance (ρ) and orientation (φ is the azimuth and θ is the elevation) of each atom. The last column contains the B-factor values.
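The conversion from recentered Cartesian coordinates to spherical coordinates can be sketched as follows. This is only an illustration under stated assumptions: the origin is simply translated to the chosen indole atom (the orientation of the local axes used by SCAT is not reproduced), φ is measured in the XY plane from the X axis, and θ is the elevation above the XY plane. The function name is hypothetical.

```python
import math


def to_spherical(center, atom):
    """Return (rho, phi, theta) of `atom` relative to `center`.

    rho   - distance in the same units as the input (Angstrom for PDB data)
    phi   - azimuth in radians, measured in the XY plane from the X axis
    theta - elevation in radians, measured from the XY plane
    """
    x = atom[0] - center[0]
    y = atom[1] - center[1]
    z = atom[2] - center[2]
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return 0.0, 0.0, 0.0  # the atom coincides with the origin
    phi = math.atan2(y, x)
    theta = math.asin(z / rho)
    return rho, phi, theta
```

With this convention an atom lying directly above the origin along Z has an elevation of π/2, and an atom in the XY plane has an elevation of 0.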

Output file: Result-table.txt

The file contains information about the orientation and distances of potential hydrogen-bond partners of the atoms of the tryptophan residue, selected among all polar protein and solvent atoms.

Line 1-10

Date of calculation
4 character PDB code
Name of protein (from PDB file)
Source of protein (from PDB file)
Authors (from PDB file)
Comment line (user comments from the previous step)
Resolution
[number of residues in protein] number of tryptophan residues {the position of tryptophan residues in sequence of protein}

Next Lines

The data in this part of the table are presented only for tryptophan pairs in which the distance between the centers of the indole rings is ≤ 12 Å.

The data include

  1. Cos Qda - cosine of the angle (Qda) between the emission dipole of the donor and the absorption dipole of the acceptor
  2. k^2 - orientation factor
  3. Ro - Förster distance for the donor-acceptor pair
  4. E, % - efficiency of energy transfer
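The quantities listed above are related by the standard Förster relations, sketched below for illustration. This is not SCAT's code: the function names are hypothetical, and the dipole directions and the donor-acceptor separation vector are assumed to be supplied by the caller. The orientation factor is computed as k^2 = (cos Qda - 3 cos Qd cos Qa)^2, where Qd and Qa are the angles between each dipole and the separation vector, and the efficiency as E = 1 / (1 + (R/Ro)^6).

```python
import math


def _unit(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orientation_factor(donor_dipole, acceptor_dipole, separation):
    """k^2 = (cos Qda - 3 cos Qd cos Qa)^2 for the given directions."""
    d = _unit(donor_dipole)
    a = _unit(acceptor_dipole)
    r = _unit(separation)
    cos_da = _dot(d, a)
    return (cos_da - 3.0 * _dot(d, r) * _dot(a, r)) ** 2


def transfer_efficiency(distance, forster_distance):
    """Energy-transfer efficiency as a fraction (multiply by 100 for %)."""
    return 1.0 / (1.0 + (distance / forster_distance) ** 6)
```

The orientation factor ranges from 0 (for example, mutually perpendicular dipoles that are both perpendicular to the separation vector) to 4 (collinear dipoles), and the efficiency is exactly one half when the distance equals Ro.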

Next Lines

The distances between the centers of the indole rings of all tryptophan pairs

Table

The first part of the table contains the following parameters, calculated separately for each of the nine atoms of the indole ring of the tryptophan residue:

Accessibility to solvent (%) calculated without taking water molecules into account
Accessibility to solvent (%) calculated taking water molecules into account
Number of all neighbor atoms within 5.5 and 7.5 Å
Number of all neighbor polar (N, O, S) atoms within 5.5 and 7.5 Å
Number of all polar protein atoms within 5.5 and 7.5 Å
Number of all water molecules within 5.5 and 7.5 Å
Number of all other hetatoms within 5.5 and 7.5 Å
Fraction of polar atoms among all atoms within 5.5 and 7.5 Å
Fraction of protein polar atoms among all atoms within 5.5 and 7.5 Å
The sum of positive charges within 5.5 and 7.5 Å
The sum of negative charges within 5.5 and 7.5 Å

The second part of the table contains the averaged information for the whole tryptophan residue at distances less than 5.5 Å, from 5.5 Å to 7.5 Å, and less than 7.5 Å.

Accessibility to solvent (%) calculated without taking water molecules into account
Accessibility to solvent (%) calculated taking water molecules into account
Averaged B-factor of polar atoms
Parameters B1 and B2: crystallographic B-factors of polar atoms at distance < 5.5 Å and in the range of 5.5-7.5 Å from the indole ring, normalized to the mean B-factor value of all the Cα atoms in the crystal structure
Parameters R1 and R2 [ R1 = Acc*B1 and R2 = Acc*B2 ]: dynamic accessibilities, which are dynamic characteristics of the microenvironment
Number of all neighbor atoms
Number of all polar neighbor atoms
Number of all water molecules
Fraction of polar atoms among all neighbor atoms
Parameters A1 and A2 - relative polarity of the environment: the fraction of polar atoms among all atoms
Sum of all positive charges around tryptophan
Sum of all negative charges around tryptophan
Difference between positive and negative charges around tryptophan
Charge difference: the sum of all charges near the pyrrole ring minus the sum of all charges near the benzene ring

The next parts of the table contain information (atom name, residue name and number, R - distance, orientation (cos(THETA) and cos(FI)) and B-factors) about possible hydrogen-bond partners located within 5.5 Å and in the range of 5.5-7.5 Å from each of the nine atoms of the indole ring. According to the geometric criteria for a hydrogen bond:
cos(THETA) must be close to 1 for possible donors (these atoms are considered as potential donors: main-chain nitrogen atoms; Sγ of Cys; Nε2 and Nδ1 of His; Nζ of Lys; Nδ2 of Asn; Nε2 of Gln; Nε, Nη1 and Nη2 of Arg; Oγ of Ser; Oγ1 of Thr; Oη of Tyr); and cos(THETA) and cos(FI) must be close to 0 and 1, respectively, for possible acceptors (these atoms are considered as potential acceptors: main-chain carbonyl oxygen; Oδ1 and Oδ2 of Asp and Asn; Sγ of Cys; Oε1 and Oε2 of Glu and Gln; Nδ1 of His; Sδ of Met; Oγ of Ser; Oγ1 of Thr; Oη of Tyr).

All atoms listed above are considered as potential partners for hydrogen bonding if they are located within a cone with angles differing by less than ca. 20° from the ideal geometry of H-bonds, i.e. |cos(THETA)| > 0.9 for possible donors, and |cos(THETA)| < 0.35 and cos(FI) > 0.9 for possible acceptors.
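The thresholds above translate into a simple filter, sketched here for illustration. The function name and the "donor"/"acceptor" role labels are hypothetical, and the cosines are assumed to have been computed already, for example read from the table.

```python
def is_potential_hbond_partner(role, cos_theta, cos_fi=None):
    """Apply the geometric thresholds quoted above.

    role      - "donor" or "acceptor" (hypothetical labels for this sketch)
    cos_theta - cos(THETA) of the candidate atom
    cos_fi    - cos(FI) of the candidate atom (used for acceptors only)
    """
    if role == "donor":
        return abs(cos_theta) > 0.9
    if role == "acceptor":
        return cos_fi is not None and abs(cos_theta) < 0.35 and cos_fi > 0.9
    raise ValueError("role must be 'donor' or 'acceptor'")
```

Only the absolute value of cos(THETA) is tested, so a donor on either side of the ring plane passes the check.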

The last part of the table contains information about the sulphur atoms of Cys and Met, which are considered good quenchers of the fluorescence of the tryptophan residue.

Last Line

The averaged B-factor of all Cα atoms in the protein

Output file: Summary.txt

The Summary.txt file contains information about the assignment of tryptophan residues to the spectral-structural classes. Detailed information about the assignment procedure can be found in the Background section.

Line 1-10

Date of calculation
4 character PDB code
Name of protein (from PDB file)
Source of protein (from PDB file)
Authors (from PDB file)
Comment line (user comments from the previous step)
Resolution
[number of residues in protein] number of tryptophan residues {the position of tryptophan residues in sequence of protein}

Next Lines

These lines contain the classification scores, Mahalanobis distances and probabilities (the sum of all 5 probabilities equals 1) of assignment of each tryptophan residue to one of the five spectral-structural classes.

The classification score is used to determine the most probable class to which a tryptophan belongs. A tryptophan residue belongs to the class for which it has the highest classification score.

The Mahalanobis distance is a measure of the distance between a tryptophan residue and a class centroid that takes into account the correlations within the class. Class centroids are calculated from the training set.

The posterior probability that a tryptophan residue belongs to a particular class depends on the Mahalanobis distance to that class centroid: the smaller the distance, the higher the probability (the posterior probability is calculated taking into account the a priori probability). A tryptophan residue is assigned to the class for which it has the highest posterior probability.
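The relation between Mahalanobis distance and posterior probability can be illustrated with a small sketch. It is not the procedure used by SCAT, which is described in the Background section. It assumes a discriminant model in which all classes share one covariance matrix and each class is normally distributed around its centroid, so that the posterior is proportional to prior * exp(-D^2 / 2). The function names are hypothetical and the inverse covariance matrix is taken as given.

```python
import math


def mahalanobis_squared(x, centroid, inv_cov):
    """Squared Mahalanobis distance (x - c)^T S^-1 (x - c)."""
    diff = [a - b for a, b in zip(x, centroid)]
    n = len(diff)
    return sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(n) for j in range(n))


def posterior_probabilities(x, centroids, inv_cov, priors):
    """Posterior class probabilities under the shared-covariance assumption."""
    weights = [
        prior * math.exp(-0.5 * mahalanobis_squared(x, c, inv_cov))
        for c, prior in zip(centroids, priors)
    ]
    total = sum(weights)
    return [w / total for w in weights]
```

The returned probabilities sum to 1, and with equal priors the class with the nearest centroid receives the highest probability.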

More information about the calculation of the classification score, Mahalanobis distance and probability of assignment can be found in the Background section.