SMILES is an acronym for Simplified Molecular Input Line Entry System. It is a chemical notation system used to represent a molecular structure by a linear string of symbols. The SMILES notation system was specifically designed for computer use by chemists.
Atoms
Atoms are represented by their atomic symbols.
C is carbon
N is nitrogen
S is sulfur
F is fluorine
I is iodine
P is phosphorus
O is oxygen
Cl is chlorine
Hydrogen atoms are not written in SMILES notation. The program works out how many hydrogens are attached to each atom, which greatly simplifies the notation.
Compound     Structural Formula     SMILES Notation
Ethylene     CH2=CH2                C=C
Propylene    CH2=CH-CH3             C=CC
2-Butene     CH3-CH=CH-CH3          CC=CC
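To make "hydrogen attachments are determined by the program" concrete, here is a minimal Python sketch of how a program might fill in the hydrogens for an unbranched chain. The function name `implicit_hydrogens` and the `VALENCE` table are made up for this illustration, and the table assumes each atom has its usual lowest valence in a neutral molecule. Real chemistry software handles many more cases.

```python
# Usual valences for the atoms listed in this post (an assumption that covers
# neutral organic molecules only; charged or unusual atoms are out of scope).
VALENCE = {"C": 4, "N": 3, "O": 2, "S": 2, "P": 3, "F": 1, "Cl": 1, "I": 1}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def implicit_hydrogens(smiles):
    """Return (symbol, hydrogen count) for each atom of an unbranched chain."""
    atoms = []      # atomic symbols in the order they appear
    used = []       # total bond order already used by each atom
    pending = 1     # order of the bond to the next atom (single by default)
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
            i += 1
            continue
        # "Cl" is one two-letter symbol, not carbon followed by something else.
        symbol = "Cl" if smiles.startswith("Cl", i) else ch
        if symbol not in VALENCE:
            raise ValueError(f"unsupported symbol {symbol!r} in {smiles!r}")
        if atoms:
            used[-1] += pending   # bond from the previous atom...
            used.append(pending)  # ...to this one
        else:
            used.append(0)
        atoms.append(symbol)
        pending = 1
        i += len(symbol)
    # Whatever valence is left over is filled with hydrogens.
    return [(a, VALENCE[a] - u) for a, u in zip(atoms, used)]


for example in ("C=C", "C=CC", "CC=CC"):
    print(example, implicit_hydrogens(example))
```

For `CC=CC` the sketch gives 3, 1, 1 and 3 hydrogens, which matches the CH3-CH=CH-CH3 formula in the table.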
Bonds
Single bonds do not need to be shown. A double bond is written as = and a triple bond as #.
Compound     Structural Formula     SMILES Notation
Propane      CH3-CH2-CH3            CCC
Butane       CH3-CH2-CH2-CH3        CCCC
Branches
Branches in a molecular structure are enclosed in parentheses. The SMILES examples in the lists above all represent straight, unbranched compounds. When a structure contains a branch, the branch is written in parentheses immediately after the atom it is attached to. The figure below illustrates branching.
A single structure can have more than one valid SMILES notation. As an example, valid SMILES notations for the isobutyric acid structure include the following:
CC(C)C(=O)O
C(C)(C)C(=O)O
OC(=O)C(C)C
O=C(O)C(C)C
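One way to see that these four strings describe the same structure is to read each one into a list of atoms and bonds and compare the results. The Python sketch below does that with a stack for the parentheses. The names `parse_branched` and `fingerprint` are invented for this example, only the atoms listed in this post are supported, and rings are left out. The fingerprint is only a rough check: equal fingerprints are necessary for two structures to be the same, but they are not a full proof.

```python
import re

# One token per atom, bond symbol, or parenthesis. Only the atoms used in this
# post are recognised; ring digits are deliberately left out of this sketch.
TOKEN = re.compile(r"Cl|[CNOSPFI]|[=#()]")
BOND_ORDER = {"=": 2, "#": 3}


def parse_branched(smiles):
    """Return (atoms, bonds) where bonds are (index, index, order) triples."""
    if TOKEN.sub("", smiles):
        raise ValueError(f"unsupported characters in {smiles!r}")
    atoms, bonds = [], []
    stack = []        # atoms to return to when a branch closes
    previous = None   # index of the atom the next atom bonds to
    order = 1         # single bond unless a bond symbol says otherwise
    for token in TOKEN.findall(smiles):
        if token == "(":
            stack.append(previous)      # remember where the branch started
        elif token == ")":
            previous = stack.pop()      # go back to the branch point
        elif token in BOND_ORDER:
            order = BOND_ORDER[token]
        else:
            atoms.append(token)
            current = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, current, order))
            previous, order = current, 1
    return atoms, bonds


def fingerprint(smiles):
    """Order-independent summary: each atom with its sorted neighbour list."""
    atoms, bonds = parse_branched(smiles)
    neighbours = [[] for _ in atoms]
    for a, b, order in bonds:
        neighbours[a].append((atoms[b], order))
        neighbours[b].append((atoms[a], order))
    return sorted((atoms[i], tuple(sorted(n))) for i, n in enumerate(neighbours))


notations = ["CC(C)C(=O)O", "C(C)(C)C(=O)O", "OC(=O)C(C)C", "O=C(O)C(C)C"]
print(all(fingerprint(s) == fingerprint(notations[0]) for s in notations))
```

All four isobutyric acid notations give the same fingerprint, while the unbranched isomer `CCCC(=O)O` gives a different one.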
Cyclic Structures
(1) Cyclic structures require numbers to indicate where the ring starts and stops. The numbers 1 through 9 are used to indicate the starting and terminating atoms.
(2) The same number is used to indicate the starting and terminating atom for each ring. The starting and terminating atoms must be connected to each other!
(3) Each number that is used (1, 2, 3, etc.) must appear twice and only twice in the entire SMILES notation.
(4) Numbers are entered immediately following the atoms used to indicate the starting and terminating positions.
(5) A starting or terminating atom can be associated with two consecutive numbers. For example, naphthalene can be coded as: c12ccccc1cccc2. The lowercase c stands for an aromatic carbon atom.
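The ring rules above can be sketched as a short Python check. The function name `ring_closures` is made up for this example. It assumes single-digit ring numbers and only the atoms listed in this post, and it enforces the "twice and only twice" rule exactly as stated here, so treat it as an illustration and not a full SMILES reader.

```python
def ring_closures(smiles):
    """Map each ring digit to the pair of atom indices it joins."""
    open_rings = {}   # digit -> index of the atom that opened the ring
    closed = {}       # digit -> (opening atom index, closing atom index)
    atom_index = -1   # index of the most recently read atom
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch.isalpha():
            # Treat "Cl" as one atom; every other letter is a one-letter atom.
            i += 2 if smiles.startswith("Cl", i) else 1
            atom_index += 1
            continue
        if ch.isdigit():
            if atom_index < 0:
                raise ValueError("a ring digit must follow an atom")
            if ch in closed:
                raise ValueError(f"ring digit {ch} appears more than twice")
            if ch in open_rings:
                # Second appearance: this atom closes the ring.
                closed[ch] = (open_rings.pop(ch), atom_index)
            else:
                # First appearance: this atom opens the ring.
                open_rings[ch] = atom_index
        i += 1   # digits, bond symbols and parentheses are one character each
    if open_rings:
        raise ValueError(f"ring digit(s) never closed: {sorted(open_rings)}")
    return closed


print(ring_closures("c12ccccc1cccc2"))
```

For naphthalene, the first atom (index 0) opens both rings: ring 1 closes on atom 5 and ring 2 closes on atom 9. A string such as `C1CC`, where the number 1 appears only once, is rejected.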
Here are some examples of structures and their notations to help you understand better.
Unbranched chain
Propane      CH3-CH2-CH3            CCC
Ethylene     CH2=CH2                C=C
2-Butene     CH3-CH=CH-CH3          CC=CC
Branched chain
[Figure: branched-chain example]
Cyclic structure
[Figure: cyclic-structure example]
If you want to learn more about SMILES notation, you can learn from here.
That's all from us. We hope what we have shown gives you a better understanding. :)