The Small Molecule Suite

The Small Molecule Suite (SMS) is a free, open-access tool developed by the Harvard Program in Therapeutic Sciences (HiTS) and funded by the NIH. The goal of the SMS is to help scientists understand and work with the targets of molecular probes, approved drugs and other drug-like molecules, while acknowledging the complexity of polypharmacology: the phenomenon that virtually all drug-like molecules bind multiple target proteins. The SMS combines data from the ChEMBL database with prepublished data from the Laboratory of Systems Pharmacology. The methodology for calculating selectivities and similarities is explained in Moret et al., Cell Chem Biol 2019 (which can also be used to cite the Small Molecule Suite).

This work is licensed under the Creative Commons Attribution-ShareAlike license.

Compound affinity and binding assertions

Target Affinity Spectrum (TAS) values are binding affinity assertions that aggregate compound binding data from heterogeneous sources, such as full dose-response affinity measurements, single dose binding assays and binding assertions from the literature.

The binding assertions take the values 1, 2 and 3, with 1 representing the strongest and 3 the weakest binding. A value of 10 represents confirmed non-binding.

See the publication for details.
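
As a minimal illustration of the scale described above, the following sketch maps TAS values to readable labels. Only the values 1, 2, 3 and 10 and their ordering come from the text; the label strings and the function names `describe_tas` and `is_binder` are hypothetical and are not part of the SMS.

```python
# Hypothetical labels; only the ordering (1 strongest, 3 weakest,
# 10 confirmed non-binding) is taken from the description above.
TAS_LABELS = {
    1: "strongest binding",
    2: "intermediate binding",
    3: "weakest binding",
    10: "confirmed non-binding",
}


def describe_tas(value: int) -> str:
    """Return a readable label for a TAS value; raise on unknown values."""
    try:
        return TAS_LABELS[value]
    except KeyError:
        raise ValueError(f"unexpected TAS value: {value!r}") from None


def is_binder(value: int) -> bool:
    """True for binding assertions (1 to 3), False for non-binding (10)."""
    describe_tas(value)  # validates the value first
    return value != 10
```

Rejecting unknown values explicitly, rather than treating them as non-binding, keeps a missing or malformed TAS entry from being mistaken for a confirmed negative result.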

Target gene

The Selectivity app helps you find selective and potent small molecules against your target of interest.

To use the Selectivity app:

  1. Select a gene of interest in the top left corner of the application
  2. Change the filter settings as needed
  3. Look at the 'Affinity and selectivity plot' and select a region of compounds you are interested in
  4. The 'Affinity and selectivity data' table will update to reflect your selection in (3). Select the compound you are most interested in to see all its known targets in the 'Affinity and selectivity reference' (you may have to scroll down).
Reference compound
Compare selected compound to all other compounds in the database.

The Similarity app helps you find compounds similar to your compound of interest.

To use the Similarity app:

  1. Select a reference compound and set filters as desired. Three plots show up under 'Compound similarity plots'. These plots describe the similarity to the reference compound in phenotype (PFP), targets (TAS), and chemical structure (structural similarity, calculated using Morgan2 fingerprints in RDKit).
  2. Select an area of the compound similarity plots you are interested in. The compounds in that area will show up in table format under 'Compound similarity data'.
  3. In the 'Compound similarity data', select a compound to see its Target Affinity Spectrum in the 'Compound similarity selections' that shows up below (you may have to scroll down).
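
The structural similarity plot compares the fingerprints of two compounds. The sketch below shows one common way to score such a comparison: the Tanimoto (Jaccard) coefficient over the sets of "on" bits in two fingerprints. That the SMS uses this particular coefficient is an assumption, since the text only names Morgan2 fingerprints computed with RDKit. The bit indices in the example are made up and were not generated by RDKit.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


# Made-up on-bit indices standing in for two Morgan fingerprints.
reference = {3, 17, 42, 128, 511}
candidate = {3, 17, 42, 900}

# 3 shared bits out of 6 distinct bits gives 0.5.
similarity = tanimoto(reference, candidate)
print(similarity)
```

A score of 1.0 means the two fingerprints set exactly the same bits, and 0.0 means they share none.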

Type or paste gene symbols in the text box below to generate a downloadable table of drugs targeting those genes. One gene per line.

This tool uses HUGO symbols. Please see for help.
Selecting a choice will populate the input above with an example list of genes.

Add compounds with the given selectivity to the library.
Add compounds that are approved or in clinical development to the library.
Add compounds endorsed by experts to the library.

The Library app helps you build custom small molecule libraries.

To use the Library app:

  1. Submit a list of targets that you want to build the library for (in HUGO nomenclature), or select one of the pre-selected gene lists.
  2. Select the selectivity level up to which compounds should be included.
  3. Select which approval phases you want to include for clinical compounds.
  4. Select whether to include the compounds from (4.0 star rating only).
  5. Choose whether to view the table per target or per compound.
  6. Download the library.
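
The filtering in the steps above can be sketched roughly as follows. This is not the SMS implementation. The `Candidate` record, its fields `selectivity_rank` and `max_phase`, the function `build_library`, and the compound names are all hypothetical. The sketch also assumes that a compound is included if it meets either the selectivity criterion or the clinical phase criterion, and it leaves out the expert-endorsement step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    compound: str
    target: str            # gene symbol of the target
    selectivity_rank: int  # hypothetical: 1 = most selective level
    max_phase: int         # hypothetical: 0 = not in clinical development


def build_library(candidates, targets, max_rank, phases, by="target"):
    """Group matching compounds per target (or targets per compound)."""
    wanted = {t.strip() for t in targets if t.strip()}
    table = defaultdict(list)
    for c in candidates:
        if c.target not in wanted:
            continue
        # Assumed rule: keep a compound if it is selective enough
        # OR has reached one of the requested clinical phases.
        if c.selectivity_rank <= max_rank or c.max_phase in phases:
            if by == "target":
                table[c.target].append(c.compound)
            else:
                table[c.compound].append(c.target)
    return dict(table)


candidates = [
    Candidate("compound_A", "EGFR", 1, 0),
    Candidate("compound_B", "EGFR", 3, 4),
    Candidate("compound_C", "BRAF", 2, 0),
    Candidate("compound_D", "KRAS", 1, 4),
]
library = build_library(candidates, ["EGFR", "BRAF"], max_rank=1, phases={4})
```

With these made-up inputs, `library` holds both EGFR compounds (one kept for selectivity, one for clinical phase) and nothing for BRAF, whose only candidate meets neither criterion.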

Download Small Molecule Suite data

SMS version based on ChEMBL v29

The entire Small Molecule Suite dataset is available for download.
The data are organized in separate tables. Documentation for each table and its relationships to the other tables is available.

Table documentation Download tables from Synapse

Download tables in CSV format
Name | Description | Size
lsp_biochem_agg | Table of aggregated biochemical affinity measurements. All available data for a single compound-target pair were aggregated by taking the first quartile. | 15.2 MB
lsp_biochem | Table of biochemical affinity measurements. | 52.9 MB
lsp_clinical_info | Table of the clinical approval status of compounds. Sourced from ChEMBL. | 45.5 kB
lsp_commercial_availability | Table of the commercial availability of compounds. Sourced from eMolecules. | 388.0 MB
lsp_compound_dictionary | Primary table listing all compounds in the database. During compound processing, distinct salts of the same compound are aggregated into a single compound entry in this table. The constituent compound IDs for each compound in this table are available in the lsp_compound_mapping table. | 1.2 GB
lsp_compound_library | Library of optimal compounds for each target. See 10.1016/j.chembiol.2019.02.018 for details. | 115.5 kB
lsp_compound_mapping | Table of mappings between compound IDs from different sources and the internal lspci_ids. | 295.6 MB
lsp_compound_names | Table of all annotated names for compounds. The sources for compound names generally distinguish between primary and alternative (secondary) names. | 13.2 MB
lsp_manual_curation | Table of manual compound-target binding assertions. | 9.6 kB
lsp_one_dose_scan_agg | Table of aggregated single dose compound activity measurements, as opposed to full dose-response affinity measurements. All available data for a single concentration and compound-target pair were aggregated by taking the first quartile. | 2.2 MB
lsp_one_dose_scans | Table of single dose compound activity measurements, as opposed to full dose-response affinity measurements. | 5.2 MB
lsp_phenotypic_agg | Table of aggregated phenotypic assays performed on the compounds. All available data for a single assay and compound-target pair were aggregated by taking the first quartile. | 70.6 MB
lsp_phenotypic | Table of phenotypic assays performed on the compounds. | 82.7 MB
lsp_references | External references for the data in the database. | 2.2 MB
lsp_selectivity | Table of selectivity assertions of compounds to their targets. See 10.1016/j.chembiol.2019.02.018 for details. | 24.9 MB
lsp_structures | Additional secondary InChIs for compounds. | 14.2 MB
lsp_target_dictionary | Table of drug targets. The original drug targets are mostly annotated as ChEMBL or UniProt IDs. For convenience we converted these IDs to Entrez gene IDs. The original mapping between ChEMBL and UniProt target IDs is in the table `lsp_target_mapping`. | 1.4 MB
lsp_target_mapping | Mapping between the original ChEMBL target IDs, their corresponding UniProt IDs and Entrez gene IDs. A single UniProt or ChEMBL ID can refer to a protein complex, so multiple gene IDs often map to the same UniProt or ChEMBL ID. | 273.8 kB
lsp_tas_references | Table that makes it easier to link TAS values to the references that were used to compute them. | 3.8 MB
lsp_tas | Table of Target Affinity Spectrum (TAS) values for the affinity between compound and target. TAS enables aggregation of affinity measurements from heterogeneous sources and assays into a single value. See 10.1016/j.chembiol.2019.02.018 for details. | 10.6 MB
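
Several of the aggregated tables listed above are described as taking the first quartile of all measurements for a compound-target pair. The sketch below illustrates that kind of aggregation with the Python standard library. The function name `aggregate_first_quartile` and the input layout are hypothetical. The quartile interpolation method (`"inclusive"`) is also an assumption, since the text does not say which quartile definition the SMS uses.

```python
from collections import defaultdict
from statistics import quantiles


def aggregate_first_quartile(measurements):
    """Aggregate (compound_id, target_id, value) rows per compound-target pair.

    Returns a dict mapping (compound_id, target_id) to the first quartile
    of that pair's values.
    """
    grouped = defaultdict(list)
    for compound_id, target_id, value in measurements:
        grouped[(compound_id, target_id)].append(value)

    result = {}
    for pair, values in grouped.items():
        if len(values) == 1:
            # A single measurement is its own summary.
            result[pair] = values[0]
        else:
            # quantiles(..., n=4) returns the three quartile cut points;
            # index 0 is the first quartile.
            result[pair] = quantiles(values, n=4, method="inclusive")[0]
    return result


# Made-up measurements: five values for one pair, one value for another.
rows = [
    (1, 7, 10.0), (1, 7, 50.0), (1, 7, 30.0), (1, 7, 20.0), (1, 7, 40.0),
    (2, 7, 5.0),
]
aggregated = aggregate_first_quartile(rows)
```

When lower values indicate stronger binding, taking the first quartile rather than the mean favors the stronger measurements while remaining less sensitive to a single extreme value than taking the minimum.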