How many features can I expect per sample?

The number of features is difficult to predict and, realistically, not a good proxy for the amount of knowledge you can gain from a dataset. Some compounds with many spin systems (I like to use the macrolide Erythromycin as an example) show enough resolution between their signals to yield many spin system features. Others, which have very long spin systems (think a fatty acid), will have few. In either case, each extract queried in MADByTE will produce all resolvable spin system features. The data filtration is rigorous: we toss out many points from consideration if they are too heavily conflicted, so what you end up with is a robust set of features with which to prioritize downstream effort.

If the system uses TOCSY data, is it possible for me to use a COSY instead?

Actually, yes! There are benefits and drawbacks to using a COSY rather than a TOCSY, depending on what types of samples you are working with. For one, a TOCSY spectrum leaves room for peak picking to be ‘imperfect’: if you miss a peak in your picking scheme, you should expect to see that signal elsewhere in the chain of signals. With a COSY, you may not see the signal elsewhere, so it may be dropped from the processing. One major benefit is that a COSY gives you better connectivity information: you know exactly which proton is the neighbor. This can expedite your downstream work, but again, it should be taken as an option, not a recommendation.
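To make the trade-off concrete, here is a toy sketch in Python. It is not MADByTE's code: the four-proton chain, the function names, and the idealized peak lists are all assumptions made for illustration. It models a TOCSY as correlating every pair of protons in a spin system and a COSY as correlating only direct neighbors, then drops one picked peak and checks whether the spin system still comes out in one piece.

```python
from itertools import combinations

def tocsy_peaks(chain):
    """Idealized TOCSY: every pair of protons in the spin system correlates."""
    return {frozenset(pair) for pair in combinations(chain, 2)}

def cosy_peaks(chain):
    """Idealized COSY: only directly coupled (adjacent) protons correlate."""
    return {frozenset(pair) for pair in zip(chain, chain[1:])}

def connected_groups(protons, peaks):
    """Group protons into spin systems by following the picked cross peaks."""
    groups = []
    unseen = set(protons)
    while unseen:
        start = unseen.pop()
        group, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            for peak in peaks:
                if current in peak:
                    (other,) = peak - {current}
                    if other not in group:
                        group.add(other)
                        frontier.append(other)
        unseen -= group
        groups.append(group)
    return groups

# Hypothetical linear spin system H1-H2-H3-H4 (labels must be unique).
chain = ["H1", "H2", "H3", "H4"]

# Pretend the H2/H3 cross peak was missed during peak picking.
missed = frozenset({"H2", "H3"})

tocsy_groups = connected_groups(chain, tocsy_peaks(chain) - {missed})
cosy_groups = connected_groups(chain, cosy_peaks(chain) - {missed})

print(len(tocsy_groups))  # 1: the other TOCSY correlations still link the chain
print(len(cosy_groups))   # 2: the COSY chain breaks into H1/H2 and H3/H4
```

In this idealized picture the TOCSY tolerates the missed peak because the remaining correlations are redundant, while the COSY loses the only link between H2 and H3. On the other hand, the COSY peak list tells you directly which protons are neighbors, which the all-pairs TOCSY list cannot.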

NMR is solvent dependent… so what solvent should I use?

Use your favorite solvent! MADByTE was built for general use and widespread application. That means that as long as you stick to YOUR favorite solvent, the underlying theory and programming remain exactly the same. As a suggestion, if more than one solvent is in use in your practice/laboratory, try keeping each solvent's data sets in a separate project so the processing treats them separately.

How does the dereplication library work?

The dereplication library is currently built as an ‘in house’ utility, which is to say that it only uses information that you give it. This means that if you have a library of compounds at your disposal, you’ll need to run at least an HSQC on each of them. The good news: there is a very big push in the NMR community to make databases of raw data publicly available (doi.org/10.1039/C7NP00064B). This means that in the very near future, databases of various natural products in different solvents should be readily available for comparison.