Selecting Molecules for Virtual Screening

E. K. Davies and C. J. Davies
Treweren Consultants Ltd, Holmleigh, Evesham Road, Harvington, Evesham WR11 8LU, UK

Abstract

This paper describes the origin of the 3.5 billion molecules that were used in the CAN-DDO Screen Saver Cancer Research project. In addition to commercially available catalogues and well-established combinatorial libraries, de novo derivative generation was used to increase the number of molecules by two orders of magnitude. The importance of drug-likeness criteria is discussed, together with the criteria that were used.

Introduction

In order to make a significant contribution to pharmaceutical research, High Throughput Screening (HTS) requires a large number of molecules to be tested for biological activity. It is not unusual to be able to test over 10,000 samples per day, which means that most pharmaceutical in-house historical collections can be tested in 1-3 months. Unless larger numbers of molecules are to be tested, any further reduction in this time is unlikely to reduce significantly the timescales for drug discovery research. However, there are two other issues of potential concern: the cost of testing and the small quantity of each sample in the collection. Consequently, if all molecules were tested on all screens, over time many companies would face the prospect of consuming their entire in-house historical collection, or at least of substantially reducing the quantities of samples. For some companies combinatorial chemistry provides a means of generating samples for HTS. However, even if it were conceivable to make all drug-like molecules by combinatorial methods, the costs would be prohibitive.

This paper focuses on early stage lead generation and assumes that the elucidation of the human genome and the rapid increase in the number of 3-D protein receptor structures will mean that it is not necessary to make and biologically test large numbers of drug-like molecules. Instead, vast numbers of molecules should be computationally evaluated. Fortunately, there is a wide range of software which claims to be applicable to this problem^1,2, which leaves the issue of how to generate appropriate molecules in sufficient numbers.

Drug-likeness

It is obvious that many small organic molecules, such as those that are insoluble, reactive or toxic, are generally unsuitable as potential drugs. The importance of ADME-Tox (Absorption, Distribution, Metabolism, Excretion and Toxicity) is widely acknowledged and significant progress has been made in understanding and predicting such molecular characteristics^3-7. Nonetheless, the best predictions remain approximate, and some of the crudest and fastest approaches, which eliminate molecules based on simple properties such as molecular mass, number of heteroatoms, LogP etc., are often used. In addition, it is common practice to eliminate reactive and toxic molecules based on undesirable substructures.

The THINK software⁸ includes functionality to calculate the range of properties shown in Table I. A number of these properties can be used in combination to filter out molecules which are not drug-like. The CAN-DDO project⁹ used molecular mass, number of centres (hydrogen bond donors, acceptors, charged atoms, ring centres etc.), polar surface area, number of rotatable bonds and number of conformations. The choice of properties and ranges was complicated by the fact that some molecules which are marketed as drugs have property values outside the ranges indicated in Table I. It is acknowledged that during lead optimisation the molecular weight and lipophilicity often increase, and consequently leads which start with high values of either of these parameters are perhaps poor choices for optimisation.

Table I THINK properties and filters used for CAN-DDO

Name	Minimum^(a)	Maximum^(a)	Comment
Atoms			Counter
Bonds			Counter
HetAtoms			Counter
Donors			Count of hydrogen donors
Acceptors			Count of hydrogen acceptors
Positives			Count of positively charged centres
Negatives			Count of negatively charged centres
Acids			Count of acidic H-donors
Bases			Count of basic H-acceptors
Rings			Counter
Aromatic			Counter
HetAromatic			Counter
Branches			Counter
Halogens			Counter
Centres	2	9	Count of centres
Chirals			Count of chiral atoms
Mass	150	800
Flexibility			On geometric scale ^(b)
Volume			Based on VdW radii
Area			Based on VdW radii
Lipophilicity			On scale of 0-1 ^(b)
PSA			Polar surface area
NPSA			Non-polar surface area
PFA			Polar fractional area
NPFA			Non-polar fractional area
XSA	20	240	O+N surface area
XFA			O+N fractional area
CPK-Contacts			Counter
VDW-Contacts			Counter
Rot-Bonds	0	10	Count excluding rings
Conformers	0	1000000	Based on product of increments
E-Torsion			Torsional energy

(a) Where no minimum or maximum is specified, no property filter was used
(b) The algorithms used are described elsewhere
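
The range filters of Table I can be illustrated with a minimal Python sketch. It assumes that the property values have already been calculated elsewhere (THINK itself is not called), that the bounds are inclusive, and that the dictionary keys simply reuse the names in Table I; the function names and the example molecule are hypothetical.

```python
from typing import Dict, List, Tuple

# Ranges copied from Table I. Properties listed there without a
# minimum/maximum were not used as filters and are omitted here.
CAN_DDO_RANGES: Dict[str, Tuple[float, float]] = {
    "Centres": (2, 9),
    "Mass": (150, 800),
    "XSA": (20, 240),
    "Rot-Bonds": (0, 10),
    "Conformers": (0, 1_000_000),
}


def violations(props: Dict[str, float]) -> List[str]:
    """Return the names of all filtered properties outside their range.

    Assumption: bounds are inclusive; the paper does not say.
    """
    failed = []
    for name, (low, high) in CAN_DDO_RANGES.items():
        if not (low <= props[name] <= high):
            failed.append(name)
    return failed


def is_drug_like(props: Dict[str, float]) -> bool:
    """A molecule passes the property filters if nothing is out of range."""
    return not violations(props)


# Hypothetical precomputed property values for one molecule.
example = {"Centres": 5, "Mass": 320.4, "XSA": 85.0,
           "Rot-Bonds": 4, "Conformers": 2592}
print(is_drug_like(example), violations(example))
```

Returning the full list of violated properties, rather than a single yes/no answer, makes it easy to see which ranges would need to be relaxed to admit known drugs that fall outside them.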

If properties are used alone, then it is inevitable that some molecules are included which are reactive, easily metabolised or toxic. Consequently, it is common practice when selecting molecules for experimental High Throughput Screening to eliminate molecules which are considered undesirable based on a list of substructures. The substructures used in the CAN-DDO project (and by THINK by default) are indicated in Table II. This list was constructed following discussions with several vendors of samples for High Throughput Screening and certain pharmaceutical companies. Again, there are examples of drugs which contain many of these substructures and consequently some chemists would use a smaller or different list.

Table II Substructures used as filters

Unstable	Reactive	Undesirable
NOC	CC(H)=C(H)C=O	[M]
HNO	C=COH	[Si]N
[ND]=O	C=CNH	[Si]O
[NITR]	O=C[HAL]	SH
N[SSP2]	N=C[HAL]	C1XC1
O[SSP2]	S=C[HAL]	C1SC1
OO	[HAL]C[HAL]	SC#N
SS	HOCOH	C(=O)S
NN	O=COC=O	NP
N#N	COS(=O)(=O)C	PS
N=N=N	COS(=O)OC	C=P
N[CAK]O		P[HAL]
N=C=O		CN#C
N=C=S		PC#N
N=C=N		O=CC#N
O[HAL]		OCC#N
N[HAL]		NC#N
S[HAL]		CC(H)=O
OC[HAL]		PP
NC[HAL]
C=COH

THINK supports atom types and wildcards in square brackets

ND Nitrogen with double bond
NITR Nitrogen of nitro group
SSP2 Sp2 hybridised sulphur
CAK Alkyl chain carbon
HAL F, Cl, Br or I
M Metal atom
X Nitrogen or Oxygen

Catalogue Collections

Prior to commencing the CAN-DDO project the catalogues from some 13 suppliers were filtered to eliminate molecules that were not drug-like. The versions of the catalogues included in HTS Chemicals¹⁰ were used in some cases although updates were used where these were readily available. The numbers of molecules eliminated and some of the common reasons are summarized in Table III. The filters were applied to 3D structures in the order of Tables I and II with the consequence that once a molecule was eliminated it was not determined whether any other filters would also remove it. The most common reason for eliminating molecules was the inability of THINK 1.0 to create valid 3D coordinates. While THINK 1.14 has some improvements, the inability to create 3D structures for bridged ring systems remains the most significant limitation - notwithstanding some difficulties synthesising such molecules.

Table III Results of filtering catalogue molecules

Supplier	Total	Filtered	3D	Centres	Mass	XSA	[ND]=O	N-N	C=CNH	[HAL]C[HAL]
Asinex	55003	16243	12221	6120	60	1427	6259	5727	1310	1935
Bionet	24046	7935	6265	602	120	548	948	1863	330	2461
Chembridge	234402	108986	12912	25254	93	9689	25757	19094	12607	5710
Chemstar	49473	16744	9151	4204	268	1131	6681	3948	3933	1217
Comgenics-5	60000	20585	771	17565	0	940	1108	1769	3229	2331
Comgenics-10	40000	12810	606	12892	0	464	662	63	14	1525
Labotest	25700	7755	5069	2337	370	900	2067	3593	440	454
Maybridge	53929	24314	2229	3271	29	2178	4299	5304	1624	5417
Orion	18405	6144	2442	1178	163	530	2445	1621	196	1753
Sigma-Aldrich	71410	26392	11986	6133	2071	4252	5988	3530	463	2292
ChemDiv 01	14062	5998	1413	1725	0	332	1622	1249	547	398
ChemDiv 10	13202	5546	1077	1687	0	358	1437	1061	697	589
ChemDiv 50	115796	44419	6905	18660	1	2491	1420	12834	7185	3521
SPECS	79727	28306	21148	7077	492	1817	5465	6280	3581	1300
SVETA	97425	30758	24578	8901	126	1877	11029	10352	4063	2132
TRG	18533	7620	3317	1604	74	391	842	1866	1354	150
Totals	971113	370555	122090	119210	3867	29325	78029	80154	41573	33185
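
Because the filters were applied in a fixed order and a molecule was not examined further once eliminated, each molecule contributes to at most one reason column. A minimal Python sketch of this bookkeeping follows; the filter predicates, the dictionary representation of a molecule and the function name are illustrative assumptions, not the THINK implementation.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

Molecule = Dict[str, float]                       # hypothetical property record
Filter = Tuple[str, Callable[[Molecule], bool]]   # (reason, "fails this filter?")


def tally_first_failures(molecules: Iterable[Molecule],
                         filters: List[Filter]) -> Tuple[int, Counter]:
    """Apply filters in order; record only the first reason for elimination."""
    reasons: Counter = Counter()
    passed = 0
    for mol in molecules:
        for reason, fails in filters:
            if fails(mol):
                reasons[reason] += 1
                break          # later filters are never evaluated
        else:
            passed += 1        # no filter eliminated the molecule
    return passed, reasons


# Two of the Table I range filters, in table order (Centres before Mass).
filters: List[Filter] = [
    ("Centres", lambda m: not 2 <= m["Centres"] <= 9),
    ("Mass", lambda m: not 150 <= m["Mass"] <= 800),
]
toy = [
    {"Centres": 5, "Mass": 300},    # passes
    {"Centres": 12, "Mass": 900},   # fails both, counted under Centres only
    {"Centres": 4, "Mass": 120},    # fails Mass
]
passed, reasons = tally_first_failures(toy, filters)
print(passed, dict(reasons))
```

One consequence of this scheme is that the counts for filters late in the order understate how many molecules they would remove if applied on their own.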

It is apparent that, based on the THINK criteria, a significant proportion of the catalogue molecules marketed for High Throughput Screening are not drug-like. Most suppliers appear to eliminate molecules on molecular mass but not on the number of centres (or heteroatoms). Although it is known that some suppliers filter on lipophilicity, the range of property calculations in use means that this is not apparent from our work. To obtain the final set of molecules used for the CAN-DDO project, 55,760 duplicates were removed and an update of the ASINEX catalogue and the drug-like subset of the NCI collection were added, giving a total of 409,843 drug-like molecules which might be available for biological testing.

Combinatorial Chemistry Libraries

Combinatorial chemistry is now well-established as a means of making large numbers of molecules for High Throughput Screening. The range of chemistry which can be automated is quite large and continues to grow¹¹. Some of these libraries were developed to optimise specific series of molecules and lack the generality of reagents that is common to many of the more well-known libraries. A set of 23 libraries was selected from the literature; these are summarized in Table IV together with the results of filtering using the THINK standard drug-like criteria.

Table IV Libraries and R-groups

Library	Core	R-groups	Total^(a)	Filtered
L1	S=C1N([3])C(=O)C([1])N1C[2]	[1]C(H)=O[2]N=C=S[3]C(H)(N)C(=O)OH	268320	126105
L2	C1C(C(=O)O)N(C(=O)[2])C([1])S1	[1]C(=O)[HAL][2]C(H)=O	372528	217989
L4^(b)	[1]-[2]	[1]N=C=O[2]NH	144744	114269
L5^(b)	[1]-[2]	[1]C(=O)Cl[2]NH	267972	179137
L6^(b)	[1]-[2]	[1]OC(=O)Cl[2]NH	52812	33702
L7^(b)	[1]-[2]	[1]S(=O)(=O)Cl[2]NH	76284	55321
L8	[1]C(=O)N([2])C([3])C(=O)O[4]	[1]C(=O)Cl[2]N#C[3]N=C=S[4]OH	1156780872	28135200
L18	O=C(O)c1ccc2N([1])C(=O)N([2])c2c1	[1]N(H)(H)[2][HAL]	2364897	50119
L20	O=C(N)c1ccc2n([1])c(C[2])nc2c1	[1]N(H)(H)[2]NH	2529108	466930
L21	n1c2ccccc2n([1])c1CS[2]	[1]N(H)(H)[2]SH	156453	88314
L23B	NC(=O)c1ccc(cc1)-c2c([2])c([1])no2	[1]C(=O)OR'[2][HAL]	2183826	477703
L29	c1ccc(cc1)-c2nc([2])n([1])c2C(=O)N	[1]N(H)(H)[2]C(=O)OH	1543842	559489
L32	N1([1])CCC(=O)N([2])C1(=O)	[1]N(H)(H)[2]N=C=O	95682	71236
L36	c12ccccc1C(=O)N([1])C(=O)N([2])2	[1]N(H)(H)[2]N(H)(H)	1671849	566401
L37	[1]C1C(=O)NC([3])C(=O)N1C[2]	[1]C(H)(N(H)(H))C(=O)OH[2]C(H)=O[3]C(H)(N(H)(H))C(=O)OH	2875392	972364
L39	o1c([1])c([2])cc1C(=O)OCC	[1]C(=O)OH[2]C#CH	90744	62665
L41	n1([1])cc([2])cc1[3]	[1]C(=O)OH[2]C(H)(H)C(=O)H[3]C(H)(H)[NITR](=O)[0-]	1578753	618937
L43	C1([2])CC(=O)C=CN1C(=O)[1]	[1]C(=O)Cl[3][HAL]	250573	191834
L4A	[1]NC(=O)[2]	[1]N=C=O[2]NH	144744	125387
L5A	[1]C(=O)[2]	[1]C(=O)Cl[2]NH	267972	240212
L5C	[1]C(=O)[2]	[1]C(=O)OH[2]NH	958782	736052
L7A	[1]S(=O)(=O)[2]	[1]S(=O)(=O)Cl[2]NH	76284	67502
L44	C1([2])OC(=O)C=CN1C(=O)[1]	[1]C(=O)OH[2][HAL]	2183826	1170344
Total			1176936259	35327212

(a) Based on filtered R-groups in Sigma-Aldrich catalogue
(b) These libraries do not utilise valid reactions

During this work it became apparent that it was necessary to implement within THINK functionality to enumerate libraries faster than commercial products such as Chem-X¹⁰, and in such a way that reagents and the associated products can be eliminated prior to enumeration. This required applying the upper property filters and the substructure filters to the R-groups. In addition, use is made of the fact that many of the properties can be estimated by summing the properties of the R-groups, with the consequence that products can be eliminated without performing a detailed enumeration. As a result, the effective enumeration speed was increased.
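
The idea of eliminating products from summed R-group properties can be sketched in Python for a two-position library and a single property, molecular mass. The sketch assumes that each R-group mass is the contribution of the substituent as it appears in the product, so that the product mass is approximately the core mass plus the R-group masses; the function name and the numbers in the example are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def count_products_within_mass(core_mass: float,
                               r1_masses: Sequence[float],
                               r2_masses: Sequence[float],
                               max_mass: float = 800.0) -> int:
    """Count R1 x R2 products whose estimated mass is <= max_mass.

    No product structure is built: the estimate is core + R1 + R2.
    """
    budget = max_mass - core_mass
    # Upper filter applied to the R-groups themselves: a reagent that
    # already exceeds the budget cannot appear in any acceptable product.
    r1 = [m for m in r1_masses if m <= budget]
    r2 = sorted(m for m in r2_masses if m <= budget)
    total = 0
    for m1 in r1:
        # Binary search for the number of R2 groups that still fit.
        total += bisect_right(r2, budget - m1)
    return total


# Hypothetical core and R-group masses.
print(count_products_within_mass(100.0, [50, 200, 750], [30, 400, 600]))
```

Sorting one R-group list and using a binary search avoids visiting every R1/R2 pair; with more than two positions the same pruning would be applied recursively to the remaining mass budget.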

De Novo Derivative Generation

The number of molecules which can be made by combinatorial chemistry depends upon the available reagents and the number of libraries. In general, libraries with 4 or more R-groups, such as peptides and Ugi-type libraries, have a large proportion of molecules that are not drug-like because of their flexibility or molecular mass. Furthermore, some care needs to be taken when choosing libraries to avoid repeated use of the same lists of R-groups with subtly different cores. Thus, if larger numbers of molecules for virtual screening are to be generated, it might be necessary to consider reagents which are not commercially available.

For the CAN-DDO project, rather than generating reagent series and then enumerating and filtering the corresponding libraries, derivatives of molecules were generated and filtered at search time. This has the advantage of reducing the amount of data which has to be pre-processed, and it can also be applied to catalogue molecules. The current implementation uses a list of transformations which are summarized in Table V and include a range of oxidations, reductions, additions, eliminations and substitutions. The algorithm selects one of these at random and determines whether it can be applied. If so, one location is selected at random and the resulting molecule is checked for drug-likeness. In addition, in order to bias the molecules generated to be similar to the starting molecule, an annealing step is performed which effectively reduces the probability of a large increase in molecular mass and in the other properties included in the drug-likeness criteria. It should also be recognised that it is not necessary to be restricted to transformations which correspond to chemical reactions.

Table V Transforms used by de novo derivative generator

Substructure	Replacement	Type
[A]C(H)(H)[A]	[A]C(H)(H)C(H)(H)[A]	Chain increase
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(H)(H)C(=O)C(H)(H)C(H)(H)	Chain increase
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(H)(H)C(=O)N(H)C(H)(H)C(H)(H)	Chain increase
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(H)(H)C(=O)OC(H)(H)C(H)(H)	Chain increase
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(H)(H)OC(H)(H)C(H)(H)	Chain increase
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(H)(H)N(H)C(H)(H)C(H)(H)	Chain increase
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(H)(H)SC(H)(H)C(H)(H)	Chain increase
C(=O)C	C(=O)OC	Chain increase
C(=O)C	C(=O)N(H)C	Chain increase
C(H)(H)C(H)(H)	C(H)(H)	Chain reduction
C(H)(H)C(=O)C(H)(H)	C(H)(H)	Chain reduction
C(H)(H)C(=O)N(H)	C(H)(H)	Chain reduction
C(=O)OC(H)(H)	C(H)(H)	Chain reduction
C(H)(H)OC(H)(H)	C(H)(H)	Chain reduction
C(H)(H)N(H)C(H)(H)	C(H)(H)	Chain reduction
C(H)(H)SC(H)(H)	C(H)(H)	Chain reduction
C(=O)OC	C(=O)C	Chain reduction
C(=O)N(H)C	C(=O)C	Chain reduction
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)OC(H)(H)	Heteroatoms
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)C(=O)C(H)(H)	Heteroatoms
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)N(H)C(H)(H)	Heteroatoms
C(H)(H)C(H)(H)C(H)(H)	C(H)(H)SC(H)(H)	Heteroatoms
[OAL]	C(H)(H)	Heteroatoms
C=O	C(H)(H)	Heteroatoms
[NAL]	C(H)	Heteroatoms
[SSP2]	C(H)(H)	Heteroatoms
OH	Cl	Substitution
OH	Br	Substitution
OH	I	Substitution
OH	F	Substitution
OH	C(H)(H)(H)	Substitution
OH	C#N	Substitution
OH	C(H)(H)N(H)(H)	Substitution
OH	N(H)(H)	Substitution
OH	OC(H)(H)(H)	Substitution
OH	OC(H)(H)C(H)(H)(H)	Substitution
Cl	OH	Substitution
Br	OH	Substitution
I	OH	Substitution
F	OH	Substitution
C(H)(H)(H)	OH	Substitution
C#N	OH	Substitution
C(H)(H)N(H)(H)	OH	Substitution
N(H)(H)	OH	Substitution
OC(H)(H)(H)	OH	Substitution
OC(H)(H)C(H)(H)(H)	OH	Substitution
N(H)(H)	N(C(H)(H)(H))(C(H)(H)(H))	Substitution
N(H)(H)	N(C(H)(H)C(H)(H)(H))(C(H)(H)C(H)(H)(H))	Substitution
N(H)(H)	F	Substitution
N(H)(H)	Cl	Substitution
N(H)(H)	Br	Substitution
N(H)(H)	I	Substitution
N(H)(H)	OH	Substitution
N(H)(H)	C#N	Substitution
N(H)(H)	N(H)C(=O)N(H)(H)	Substitution
N(H)(H)	N(H)C(=S)N(H)(H)	Substitution
N(H)(H)	C(H)(H)(H)	Substitution
N(H)(H)	OC(H)(H)(H)	Substitution
N(C(H)(H)(H))(H)	N(H)(H)	Substitution
N(C(H)(H)(H))(C(H)(H)(H))	N(H)(H)	Substitution
N(C(H)(H)C(H)(H)(H))(H)	N(H)(H)	Substitution
N(C(H)(H)C(H)(H)(H))(C(H)(H)C(H)(H)(H))	N(H)(H)	Substitution
F	N(H)(H)	Substitution
Cl	N(H)(H)	Substitution
Br	N(H)(H)	Substitution
I	N(H)(H)	Substitution
OH	N(H)(H)	Substitution
C#N	N(H)(H)	Substitution
N(H)C(=O)N(H)(H)	N(H)(H)	Substitution
N(H)C(=S)N(H)(H)	N(H)(H)	Substitution
C(H)(H)(H)	N(H)(H)	Substitution
OC(H)(H)(H)	N(H)(H)	Substitution
H	OH	Substitution
F	H	Substitution
Cl	H	Substitution
Br	H	Substitution
I	H	Substitution
OH	H	Substitution
N(H)(H)	H	Substitution
C(H)(H)(H)	H	Substitution
C(H)(H)C(H)(H)(H)	H	Substitution
C(H)(H)C(H)(H)C(H)(H)(H)	H	Substitution
N(H)	NC(=O)C(H)(H)(H)	Substitution
N(H)	NC(H)(H)(H)	Substitution
N(H)	NC(H)(H)C(H)(H)(H)	Substitution
[CAR]H	[CAR]N(H)(H)	Substitution
C(OH)(H)	C=O	Oxidation
CN(H)(H)	C[N+](=O)[O-]	Oxidation
C=O	C(OH)(H)	Reduction
C[N+](=O)[O-]	CN(H)(H)	Reduction
C=C	C(H)C(H)	Hydrogenation
C([HAL])C(H)	C=C	Elimination
C(OH)C(H)	C=C	Elimination
C=C	C(H)C(H)	Addition
C=C	C(F)C(H)	Addition
C=C	C(Cl)C(H)	Addition
C=C	C(Br)C(H)	Addition
C=C	C(I)C(H)	Addition
C=C	C(OH)C(H)	Addition

THINK supports atom types and wildcards in square brackets

A Any atom
OAL Aliphatic (non-aromatic) oxygen
NAL Aliphatic (non-aromatic) nitrogen
SSP2 Sp2 hybridised sulphur
CAR Aromatic carbon
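
The generation loop described above (choose a transform at random, test whether it applies, choose one location at random, check drug-likeness, then anneal) can be sketched in Python under strong simplifying assumptions. Molecules are treated as plain strings in the line notation of Table V and transforms as literal text replacements, whereas real substructure matching operates on the molecular graph. The mass estimate counts only atoms written explicitly in upper case. The Metropolis-style acceptance rule and the temperature value are assumptions, since the annealing step is not specified in detail, and each attempt here starts from the original molecule, which is one possible reading of the text. All function names are hypothetical.

```python
import math
import random
import re
from typing import List, Optional, Set

# A few literal (substructure, replacement) pairs taken from Table V.
TRANSFORMS = [
    ("OH", "Cl"),
    ("OH", "N(H)(H)"),
    ("Cl", "OH"),
    ("C(=O)C", "C(=O)OC"),
    ("C(=O)OC", "C(=O)C"),
]

ATOM_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06,
             "F": 18.998, "Cl": 35.45, "Br": 79.904, "I": 126.904}
ATOM_RE = re.compile(r"Cl|Br|[CNOSFIH]")  # two-letter symbols tried first


def crude_mass(mol: str) -> float:
    """Sum the masses of explicitly written atoms (implicit H is ignored)."""
    return sum(ATOM_MASS[sym] for sym in ATOM_RE.findall(mol))


def occurrences(mol: str, pattern: str) -> List[int]:
    """Start positions of every (possibly overlapping) literal match."""
    return [m.start() for m in re.finditer("(?=" + re.escape(pattern) + ")", mol)]


def try_derivative(mol: str, rng: random.Random,
                   max_mass: float = 800.0,
                   temperature: float = 20.0) -> Optional[str]:
    """Attempt one random transform; return the derivative or None."""
    pattern, replacement = rng.choice(TRANSFORMS)
    sites = occurrences(mol, pattern)
    if not sites:
        return None                       # transform cannot be applied
    i = rng.choice(sites)                 # one location chosen at random
    candidate = mol[:i] + replacement + mol[i + len(pattern):]
    new_mass = crude_mass(candidate)
    if new_mass > max_mass:               # stand-in for the drug-likeness check
        return None
    gain = new_mass - crude_mass(mol)
    # Assumed annealing rule: mass increases are accepted with a
    # probability that falls off exponentially with the size of the gain.
    if gain > 0 and rng.random() >= math.exp(-gain / temperature):
        return None
    return candidate


def derivatives(mol: str, seed: int, attempts: int = 100) -> Set[str]:
    """Distinct derivatives found in a fixed number of seeded attempts."""
    rng = random.Random(seed)
    found: Set[str] = set()
    for _ in range(attempts):
        d = try_derivative(mol, rng)
        if d is not None:
            found.add(d)
    return found


print(sorted(derivatives("CC(=O)OH", seed=1)))
```

Passing an explicit seed to `random.Random` makes each series reproducible, and giving each starting molecule its own seed keeps the random choices for different molecules independent.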

When starting from similar molecules, it is conceptually possible to generate identical molecules, and in the worst case the series might converge to give identical structures at each subsequent step. The probability of this occurring is effectively eliminated by using a different random number generator seed for each molecule. In the CAN-DDO project, 100 derivatives were attempted for each of the 35 million starting molecules, generating approximately 3.5 billion molecules. In order to give some indication of the possible overlap in derivatives, some simple experiments were performed:

(a) Starting with 3 commercially available molecules, 100 derivatives were generated twice for each molecule using different random number seeds. This resulted in 3, 5 and 3 identical molecules.
(b) A further experiment took two derivatives that had been generated (which may or may not be related by a single transform) and generated 100 derivatives from each of these using the same random number seed. This resulted in 2, 0 and 0 identical molecules.
(c) A final experiment extracted 3 random molecules from a medium sized library (L4A) and the lists of molecules in this library that were more than 99% similar to these using THINK's functional group based keys. For each of the 3 starting molecules and an arbitrary similar molecule, 100 derivatives were generated using the same random number seed. On comparison, no identical molecules were found.

These experiments confirm the intuitive conclusion that, if the catalogues or libraries contain molecules that are closely related (e.g. by a transform known to the de novo derivative generator), then there is a small chance of generating duplicate molecules. In practice, very few molecules are related in this way, and even for these different random number generator seeds are used, so the number of duplicate molecules is negligible.

The derivatives generated depend upon the transformations present, the random number generator and the starting seed. To reproduce a given analogue series that may be of historic interest, the order of transformations, the random number generator or the random number seed can be modified. Figure 1 shows an interesting series which starts from burimamide (a molecule which shows some histamine H2 antagonist activity) and proceeds through a series of transformations to cimetidine (which was the first commercial H2 antagonist) and then to the more potent ranitidine. Although this example omits some of the earliest molecules and many analogues that were made, the discovery of ranitidine was made independently, and those who discovered cimetidine failed to find a significantly better molecule¹². Perhaps if THINK had been available at the time, the fortunes and future of the company that discovered cimetidine would have been different.

Conclusions

This paper suggests that it is possible to generate vast numbers of drug-like molecules for virtual screening and that the catalogue molecules available represent a small proportion of these (about 0.1%). It is also apparent that the proportion of drug-like molecules from huge libraries is usually small and consequently such libraries are likely to be of lower value. There may be scope to increase the numbers of derivatives and/or the number of libraries to cover more drug-like molecules. However, this will inevitably give more hits for subsequent analysis and it is not necessarily true that this will have the consequence of reducing timescales and costs for drug discovery.

One might expect the inclusion of derivatives to result in series of related hits, which may be indicative of hits that are synthetically accessible and are potentially good leads. In addition, the approach should identify families that might be missed if only a small number of representatives of each family (or cluster) were screened. Obviously, some virtual screening software is too slow and/or cannot be run on a sufficiently large distributed processing scale to include derivatives in the virtual screening.

References

(1) Warr, W. Virtual High-Throughput Screening: Computational Tools for Drug Discovery and Design, in Spectrum Life Sciences, Decision Resources, Inc., MA, USA, 2001.
(2) Li, J., Murray, C.W., Waszkowycz, B. and Young, S.C. Targeted molecular diversity in drug discovery: integration of structure-based design and combinatorial chemistry. Drug Discov. Today 1998, 3, 105-112.
(3) Lipinski, C.A., Lombardo, F., Dominy, B.W. and Feeney, P.J. Experimental and Computational Approaches to Estimate Solubility and Permeability in Drug Discovery and Development Settings. Adv. Drug Delivery Rev. 1997, 23, 3-25.
(4) Clark, D.E. and Pickett, S.D. Computational methods for the prediction of 'drug-likeness'. Drug Discov. Today 2000, 5, 49-58.
(5) Liu, R. and So, S-S. Development of Quantitative Structure-Property Relationship Models for Early ADME Evaluation in Drug Discovery. 1. Aqueous Solubility. J. Chem. Inf. Comput. Sci. 2001, 41, 1633-1639.
(6) Liu, R., Sun, H. and So, S-S. Development of Quantitative Structure-Property Relationship Models for Early ADME Evaluation in Drug Discovery. 2. Blood-Brain Barrier Penetration. J. Chem. Inf. Comput. Sci. 2001, 41, 1623-1632.
(7) Beresford, A.P., Selick, H.E. and Tarbit, M.H. The emerging importance of predictive ADME simulation in drug discovery. Drug Discov. Today 2002, 7, 109-116.
(8) Davies, E.K. and Davies, C.J. THINK: A new program for Virtual Screening. In preparation.
(9) Hand, L. Computing for Cancer Research. The Scientist 2001, 15, 1-5.
(10) HTS Chemicals was developed at Chemical Design by the authors for use with Chem-X. Both products were discontinued after the company was purchased by Oxford Molecular in 1998.
(11) Solid Phase Synthesis database available from Accelrys (www.Accelrys.com).
(12) Ganellin, C.R. Chemistry and Structure-Activity Relationships of Drugs Acting at Histamine Receptors, in Pharmacology of Histamine Receptors; Ganellin, C.R. and Parsons, M.E., Eds; John Wright & Sons, 1982; Chapter 2, pp 10-102.
