As discussed in an earlier blog post, I spent some time chatting with Paul Doherty and Peter Murray-Rust this weekend, specifically about InChIs and InChIKeys. I’d originally suggested to Paul that he put InChIs on the site so that I could use them to check for the presence of the structures he draws in the ChemSpider database. Well, now that he’s started to include them in his postings, I get to check them.

I started with a search on the term Diazonamide A in PubChem. Two hits, shown below.

Diazonamide A on PubChem 

Diazonamide A is a complex structure. The structure representations drawn above are NASTY and do need cleaning for sure. Unfortunately ChemSpider has the same issues for these two structures (see below) since we did not CLEAN the PubChem dataset. Cleaning these structures is not an easy task. We are working on improving this as discussed previously.

The two shown on PubChem have different Isomeric SMILES and different InChIs. Why? They differ at ONE stereocenter: see the difference highlighted in red below, with an arrow pointing to that stereocenter.

Stereo Differences

It is appropriate to have two structures in PubChem: they are unique. But now we don’t know which one is Diazonamide A. Shucks.

And so, let’s check eMolecules. I didn’t find any hits. eMolecules did take a lot of PubChem into their dataset. The reason for not finding it might be “it’s not there” because of eMolecules’ focus on commercial suppliers, OR because “I queried incorrectly”. I don’t know.

So, to ChemSpider. A search on Diazonamide A gave SIX hits. Two of these are the exact ones from PubChem. The four others are shown below…

Marinlit Diazonamide A

Of these four, two have an “additional oxygen and two hydrogen atoms” in the molecular formula.

Is that right or wrong? TotallySynthetic has the formula as C40H34Cl2N6O6, so we’ll trust Paul and curate these two records as IN ERROR as shown below (a primary advantage of ChemSpider is that we allow curation!). I’ll also let Marinlit know…


 Curate Out
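For anyone who wants to run this kind of formula check over more than a handful of records, here is a minimal Python sketch. The reference formula C40H34Cl2N6O6 is the one from TotallySynthetic; the "suspect" formula C40H36Cl2N6O7 is my own derivation from the "additional oxygen and two hydrogen atoms" noted above, not a value copied from the records. The function names are purely illustrative, and the parser only handles simple formulas with no brackets, charges or isotopes.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a simple formula such as C40H34Cl2N6O6 (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def formula_difference(reference, candidate):
    """Return {element: candidate_count - reference_count} for elements that differ."""
    ref, cand = parse_formula(reference), parse_formula(candidate)
    return {
        element: cand[element] - ref[element]
        for element in sorted(set(ref) | set(cand))
        if cand[element] != ref[element]
    }

REFERENCE = "C40H34Cl2N6O6"  # formula given by TotallySynthetic
SUSPECT = "C40H36Cl2N6O7"    # assumption: reference plus one O and two H

if __name__ == "__main__":
    print(formula_difference(REFERENCE, SUSPECT))
```

An empty result means the two formulae agree; anything else flags a record worth a closer look before curating it.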

The differences between Diazonamide A and the “incorrect structure” are shown below just for information. They are SIGNIFICANTLY different.

Differences in Marinlit

Let’s take a look at the different InChIs for all the structures we are considering: 2 from PubChem, 2 from ChemSpider (from Marinlit) and 1 from TotallySynthetic. They are ALL different and all differ in the stereochemistry layer.

PubChem CID: 395475

PubChem CID: 5492609
InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27-,30-,39-,40u/m0/s1/f/h44-45H

ChemSpider CID: 10478902 


ChemSpider CID: 17212293 

InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27+,30+,39-,40+/m1/s1

Totally Synthetic



The TotallySynthetic structure has the stereo layer t23-,27-,30-,39?,40-. In real terms this relates to five stereocenters. At this point I have to question whether the structure as drawn is correct or not. What we have is five structures with the SAME connectivity but with different stereochemistry. This is similar to the issue I blogged about previously regarding Taxol.
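Eyeballing stereo layers gets tedious quickly, so here is a rough Python sketch of how the comparison could be automated. It pulls the `/t` layer out of an InChI (or takes a bare layer string) and reports which numbered centers carry different parity marks across records. The three layers used are the ones quoted in this post; the function names are my own invention. Please treat it as a simplification: it compares the raw `/t` marks only and ignores the `/m` and `/s` layers, which also bear on absolute configuration, and it assumes a single-component structure.

```python
import re

def stereo_centers(text):
    """Return {atom_number: parity} parsed from the first /t layer found in text."""
    match = re.search(r"(?:^|/)t([^/]*)", text.strip())
    if not match:
        return {}
    centers = {}
    for item in match.group(1).split(","):
        parsed = re.fullmatch(r"(\d+)([-+?u])", item.strip())
        if parsed:
            centers[int(parsed.group(1))] = parsed.group(2)
    return centers

def differing_centers(records):
    """List atom numbers whose parity mark is not identical across all records."""
    parsed = {name: stereo_centers(layer) for name, layer in records.items()}
    atoms = sorted({atom for centers in parsed.values() for atom in centers})
    return [
        atom for atom in atoms
        if len({centers.get(atom) for centers in parsed.values()}) > 1
    ]

# Stereo layers as quoted in this post (raw /t marks only; /m is ignored here).
RECORDS = {
    "PubChem CID 5492609": "t23-,27-,30-,39-,40u",
    "ChemSpider 17212293": "t23-,27+,30+,39-,40+",
    "TotallySynthetic": "t23-,27-,30-,39?,40-",
}

if __name__ == "__main__":
    for atom in differing_centers(RECORDS):
        marks = {name: stereo_centers(layer).get(atom) for name, layer in RECORDS.items()}
        print(atom, marks)
```

On these three layers, only center 23 carries the same mark everywhere; centers 27, 30, 39 and 40 all disagree in at least one record.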


I’m not sure that we have the right structure of Diazonamide A yet, but I’m sure we’ll get there very quickly with this question out there in the open. I HOPE SO!

