Producing visually attractive depictions of chemical structures that communicate enough detail about the molecule to be easy to interpret is straightforward for simple molecules but challenging for complex structures, and even for some of nominal complexity. Consider the images below.


It is not complex, yet some algorithms cannot clean even this fairly simple structure. Appropriate cleaning will give:


When cleaning is performed incorrectly, the resulting depictions can be very confusing and can communicate incorrect information to the user. Consider the structure below.


At first glance the user might think a naphthyl ring system is present. However, look at the bonds within the ring and you will notice it is not a naphthyl ring. CLEANing the molecule gives us:


We acknowledge that we have some “pretty ugly” structures online as a result of depositing structures from various sources, and with millions of structures we are sure to have problems. For our curators and our depositors we have provided the ability to CLEAN molecules online. The capability has been introduced into the Structure Drawing Applet and is outlined in a technical note online.
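One simple sign that a depiction needs CLEANing is a pair of atoms that are not bonded to each other but are drawn almost on top of one another. The snippet below is a minimal sketch of such a check, not the method used by the Structure Drawing Applet. The function name `find_overlaps`, the `min_dist` threshold, and the sample coordinates are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import hypot


def find_overlaps(coords, bonds, min_dist=0.5):
    """Return index pairs of non-bonded atoms drawn closer than min_dist.

    coords: list of (x, y) positions, one per atom (hypothetical 2D layout)
    bonds:  list of (i, j) index pairs for bonded atoms
    min_dist: assumed threshold, in the same units as coords
    """
    bonded = {frozenset(b) for b in bonds}
    overlaps = []
    for i, j in combinations(range(len(coords)), 2):
        # Bonded atoms are expected to be close, so skip them.
        if frozenset((i, j)) in bonded:
            continue
        (x1, y1), (x2, y2) = coords[i], coords[j]
        if hypot(x2 - x1, y2 - y1) < min_dist:
            overlaps.append((i, j))
    return overlaps


# Made-up layout: atoms 2 and 3 are both attached to atom 1
# and have been drawn nearly on top of each other.
coords = [(0.0, 0.0), (1.0, 0.0), (1.5, 0.87), (1.6, 0.9)]
bonds = [(0, 1), (1, 2), (1, 3)]
print(find_overlaps(coords, bonds))
```

A check like this only flags a problem. Fixing it means generating new coordinates, which is the hard part of any CLEAN routine.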

CLEANing algorithms are not easy to perfect. None are ideal. But they ARE absolutely necessary. Those of you receiving the CrystalEye feed will see a lot of interesting depictions, especially for the organometallics. The organic compounds should be fairly easy to CLEAN up under most conditions, as the examples below illustrate.



Hopefully the CLEANing capabilities we have introduced will help our curators and depositors improve the depictions of structures on ChemSpider.


One Response to “The Challenges of CLEANing Chemical Structures”

  1. Chris Singleton says:

    Definitely agree that structure cleanliness is a significant problem; in some cases a poor depiction may not lead to an error but still makes the structure less clear. This is evident in the case of some steroids on ChemSpider, such as ethynyl estradiol, prednisone, and prednisolone, in which the -OH overlaps with the methyl. If you look at testosterone in comparison, the -OH does not overlap with the methyl since the -OH carbon does not have an additional R-group on it.
