Schrodinger (NASDAQ:SDGR) is a computational biology platform that uses a physics-informed algorithm, free energy perturbation (FEP), combined with machine learning to design drugs. In addition to licensing its code to pharmaceutical companies, the company operates a partnership model in which SDGR contributes its software and takes equity or cash upon the completion of milestones. Net of cash and these securities, SDGR trades at 8x revenue with a history of losses. In 2022, software revenue (SaaS licensing, consulting, and maintenance) was $135m, and drug discovery revenue, earned from its biotech partnerships, was $45m. The driving force of underlying value is the success of the software business: in the long term, drug discovery revenue depends on pharmaceutical companies partnering with SDGR because of the value of its software.
I rate SDGR as a short because its software is likely becoming obsolete. The first part of this write-up will set up the situation: fundamentals like revenue are flatlining, and business economics as measured by operating, net, and gross margin are all deteriorating. The second part will explain why: I argue that the company's software performance is drifting further and further from the state of the art, providing a reason for continued difficulties. The third part will provide a valuation and conclusion.
Part 1: Current Business Situation
Before getting to the short thesis, I would like to recover a valuation for SDGR’s cash, securities, and equity ownership of firms. Here are the fair value estimates of the securities it holds:
The levels refer to how cleanly one can estimate the valuation, with level 1 being a publicly quoted security and level 3 being a private asset without easy-to-reference comps. The level 3 valuations are based on liquidation values, which significantly undervalue these securities. Being conservative and ascribing the upper end of valuation to the level 3 securities results in a total portfolio valuation of roughly $875m. Most of the difference is driven by Nimbus, which sits in the level 3 bucket but likely carries a value between $38m and $380m based on a 3.8% ownership stake (pg. 16) and a $1b to $10b company valuation. That stake is responsible for 95% of the delta between the $480m reported valuation and the $875m I provide. SDGR also has a portfolio (pg. 9) of other drugs, mostly in discovery, that I value at $200m. It also has an additional joint venture drugs portfolio (pg. 10), which pays out as drug discovery revenue upon certain milestones and which I won't double count. The company also increased cash and marketable securities by $100m by the end of Q2 due to Nimbus selling a drug in development to Takeda. I'm probably overvaluing Nimbus, as I count the cash from the partial Nimbus sale and also value the company at the top of its valuation range. Taking all this into account, SDGR trades at 8x revenues.
SDGR’s Q2 earnings were bad. For the first time in its history as a public company, year-over-year earnings declined. Here are the reported earnings:
While drug discovery declines are cause for concern, the decline in software revenue, while more moderate, is also more problematic. Management says the drug discovery declines are due to the timing of milestones driven by JV business decisions. Considering Q1 drug discovery revenue was abnormally high, this could be a valid explanation. However, software revenue is the main leading indicator of SDGR's value proposition. Flatlining software suggests not only negative results for software but ultimately less future drug discovery revenue, as firms become less willing to partner with inferior technology.
Software revenue has been decelerating and even trending downward for some time now. Here is a chart illustrating software revenue (dark blue) and drug development (light blue), where revenue is a 4-quarter rolling average.
Not only was trailing-12-month software revenue lower in Q2 than in Q1, but Q1 was below Q4. In Q1, this was masked by drug development revenue that was abnormally large relative to guidance. Q2 software revenue declined by 2%, and Q1 software revenue declined by 3%. While the company reported a first-half profit of $133m, this was entirely due to fair value changes of $223.7m, mostly related to Nimbus's sale of its TYK2 program to Takeda. Since I have revalued Nimbus at a reasonable, if not high, valuation, the expected changes in equity valuation should be roughly zero going forward, so it makes sense to treat this as a one-off. Without these gains, SDGR's first-half 2023 income is roughly -$90m.
Perhaps to compensate for the poor quarter, the company raised 2023 software guidance even though it guided Q3 software revenue in line with current quarterly revenue rather than the 15%-18% year-over-year growth the full-year forecast implies. Taken together, Q4 software revenue must blow current trends out of the water. Additionally, the company cut drug discovery guidance to $50m-$70m from $70m-$90m, still slightly above the $45m earned in 2022. This is disappointing, as drug discovery revenue through Q2 is already 85% of the 2022 total. After earnings were released, SDGR tanked, suggesting investors don't believe management either. In my opinion, management will shred its credibility if, after issuing such aggressive guidance, it cannot meet it.
Flattening revenue in conjunction with large losses suggests poor business economics. Taking a longer-term view reveals structural problems with customer acquisition costs:
This chart illustrates gross profit (orange), SG&A (yellow), and R&D (aqua green). In 2017, operating profit (gross profit minus SG&A and R&D) was negative, but the hope was that operating leverage would improve margins. Instead, as this chart shows, gross profit increased while SG&A and R&D increased even more. One could argue that today's R&D will drive tomorrow's profits, but the growth in SG&A is less excusable: much of the administrative cost base should be fixed, and sales spending should become more efficient as revenue and gross profit increase. Given the SG&A numbers, the lifetime value of a customer looks negative. The explanation that makes the most sense to me is that the value proposition is deteriorating, and the company needs to spend more money just to tread water.
Additionally, given that the revenue is flattening, it feels difficult to cut costs down to profitability:
This reproduces a picture shared above with additional guidance information. At the top end of revenue guidance, which I suggested before is somewhat unbelievable, revenue growth is 28% while operating expense growth is guided at 29%, suggesting that while cost growth is slowing, cost control remains out of hand. The company touts improved operating cash flow (a smaller negative number); however, annualized it is still -$99m versus -$120m, with stock-based compensation (pg. 10) increasing at a $6m higher run rate. As revenue flatlines, turning profitable will be a challenge given the cost structure of the business. The reason for the deteriorating fundamentals, I will argue, is that SDGR's software is becoming less effective compared to market alternatives.
Part Two: The Obsolescence of SDGR's Software
Analysis of the Underlying Software
There is no denying that the underlying business is slowing down. I have suggested the reason for this is problems with the competitiveness of the underlying software. This section will illustrate the competitive challenges contributing to its software obsolescence. Before discussing the issue with the software, I must explain what it does. This is SDGR’s presentation on their platform and provides a high-level description of their algorithmic approach:
The idea behind the SDGR approach is to screen roughly 1 billion candidate molecules for efficacy given a specific interaction (for example, a specific protein binding site). To begin the search for candidate molecules, they take 1,000 random molecules and evaluate the likelihood of each compound binding with the protein. They use the FEP algorithm to predict the change in Gibbs free energy. If the change in Gibbs free energy is negative, then the molecular reaction should happen spontaneously (modulo activation energy).
After they have calculated the change in free energy for molecules interacting with the protein of interest, they have a free energy score. You can train a machine learning algorithm to take in a molecule and attempt to reproduce the FEP score. FEP is slow, for reasons I will discuss later, but the machine learning algorithm can produce accurate scores orders of magnitude faster. Thus, you run FEP on 1,000 molecules, train your machine learning model on that data, and then use the model to score the remaining molecules among the 1 billion.
FEP is still more accurate than the machine learning surrogate, so you take the top 5,000 scoring molecules and run them through FEP. The top 10 molecules are then synthesized in a lab, and hopefully eight out of the ten have promising characteristics. This is how SDGR identifies promising drug compounds. Now I want to dig deeper into the FEP algorithm.
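The funnel described above can be sketched in a few lines. Everything here is a toy stand-in: the scoring functions, library size, and integer molecule IDs are invented for illustration and are not SDGR's actual code or models.

```python
import random

def fep_score(mol_id):
    # Stand-in for an expensive, accurate FEP free-energy calculation.
    return random.Random(mol_id).gauss(0.0, 1.0)

def surrogate_score(mol_id):
    # Stand-in for the fast ML surrogate: the true score plus approximation
    # noise.  In the real pipeline this model is trained on the seed set.
    return fep_score(mol_id) + random.Random(mol_id + 1).gauss(0.0, 0.3)

library = range(100_000)                           # proxy for ~1B molecules

rng = random.Random(0)
seed_set = rng.sample(range(100_000), 1000)        # 1) random seed molecules
seed_scores = {m: fep_score(m) for m in seed_set}  # 2) FEP on the seed set
                                                   #    ("training" data)
scored = sorted(library, key=surrogate_score, reverse=True)
shortlist = scored[:5000]                          # 3) surrogate screens all
refined = sorted(shortlist, key=fep_score, reverse=True)
candidates = refined[:10]                          # 4) FEP re-ranks shortlist
print(candidates)                                  # 5) top 10 go to the lab
```

The key economic point is visible in the structure: FEP runs only 6,000 times (1,000 seeds plus the 5,000-molecule shortlist) while the cheap surrogate touches the whole library.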
Free Energy Perturbation
FEP is the most important step in SDGR's approach. Thus, it's useful to understand how it works. According to this source, the difference between the free energies of the two states is dA_AB = A_B - A_A = -kT ln(Z_B / Z_A).
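This identity can be sanity-checked numerically with Zwanzig's classic FEP estimator, dA = -kT ln < exp(-(U_B - U_A)/kT) >_A, on two toy one-dimensional harmonic "states". The potentials below are my own illustrative stand-ins, not a molecular force field:

```python
import math, random

kT, kA, kB = 1.0, 1.0, 2.0            # temperature and two spring constants

def U_A(x):
    return 0.5 * kA * x * x           # potential energy of state A

def U_B(x):
    return 0.5 * kB * x * x           # potential energy of state B

# Boltzmann sampling from state A is exact here: a Gaussian with var = kT/kA
rng = random.Random(42)
xs = [rng.gauss(0.0, math.sqrt(kT / kA)) for _ in range(200_000)]

# Zwanzig / FEP estimator:  dA = -kT * ln < exp(-(U_B - U_A)/kT) >_A
mean_exp = sum(math.exp(-(U_B(x) - U_A(x)) / kT) for x in xs) / len(xs)
dA_fep = -kT * math.log(mean_exp)

# Analytic answer from dA = -kT ln(Z_B/Z_A), with Z = sqrt(2*pi*kT/k)
dA_exact = 0.5 * kT * math.log(kB / kA)
print(round(dA_fep, 3), round(dA_exact, 3))   # both ~0.347
```

In a real system you cannot sample the Boltzmann distribution of state A exactly, which is where MCMC comes in.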
Markov Chain Monte Carlo (MCMC) is a way to sample from a distribution using only likelihood ratios, without knowing the normalizing term that makes probability distributions integrate to one. Intuitively, if you are at a point and thinking about moving to another point, you can calculate the unnormalized probability of being at both points, and thus the relative likelihood ratio between the two. If the point you are moving to has a likelihood that is half as large, you move to that point with 50% probability and stay put with the other 50%. In this way, you visit all points according to their likelihood ratios, and over many samples you end up sampling from the normalized distribution, PA. This is how to sample from PA without knowing its probability density function.
MCMC is slow. To illustrate why, consider a point that is 10x less likely than another point. If you visit the less likely point once, then to maintain the correct likelihood ratio you need to visit the more likely point 10 times (or, with continuous distributions, 10 times in the same small neighborhood). This doesn't happen once but many times over the course of sampling. To make things more difficult, if the U_A function is not similar to the U_B function, your expectation will not converge. Thus, in practice, you divide the "distance" from A to B into N smaller steps, where going from step 1 to step 2 covers one Nth of the distance from A to B. This is where the term "perturbation" in FEP comes from.
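The acceptance rule described above is the Metropolis algorithm, and a minimal sketch fits in a dozen lines. The target here is a deliberately simple unnormalized density of my own choosing, not SDGR's implementation:

```python
import math, random

def metropolis(log_p, x0, n_steps, step=1.0, seed=0):
    # Metropolis sampler: needs only likelihood *ratios*, so log_p may be
    # unnormalized, exactly as described in the text.
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)
        # accept with probability min(1, p(prop) / p(x))
        if math.log(rng.random()) < log_p(prop) - log_p(x):
            x = prop
        samples.append(x)
    return samples

# Unnormalized target: exp(-(x-2)^2 / 2), i.e. N(2, 1) without its constant
def log_p(x):
    return -0.5 * (x - 2.0) ** 2

samples = metropolis(log_p, x0=0.0, n_steps=50_000)
burn = samples[5_000:]                      # discard warm-up samples
mean = sum(burn) / len(burn)
print(round(mean, 2))                       # ~2.0
```

Note how many correlated steps are needed to pin down even the mean of a one-dimensional Gaussian; high-dimensional molecular systems are far worse, which is the slowness the text describes.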
An alternative to MCMC is molecular dynamics (MD). The benefit of MD is that it allows molecules to move: when the binding behavior of a molecule on a protein involves a rotation or translation of the molecule, MD can take this into account. MD solves the differential equation governing molecular behavior by calculating molecular motion in a series of small time steps, small enough that the discretization error stays manageable. This shares similarities with the Runge-Kutta approach to differential equation solving but preserves energy. The problem with MD, as relayed to me by a biophysicist, is that "The approach cannot model large proteins with any degree of accuracy, so the solution is to model everything that happens in the whole cell." What this expert is pointing out is that you can't model a single molecule without information on the system in general, but if you model everything in the system, your error becomes too large. MD takes a similar amount of time to MCMC, so while it is an alternative, it illustrates that the alternatives have their own problems as well.
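For intuition on the "small time steps, preserves energy" point, here is a velocity Verlet integrator, the standard symplectic scheme in MD codes, applied to a toy one-dimensional harmonic force. This is an illustrative stand-in, not a protein simulation:

```python
def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=10_000):
    # Velocity Verlet: unlike a generic Runge-Kutta step, this scheme keeps
    # the total energy bounded over very long runs, which is why MD uses it.
    a = force(x) / mass
    for _ in range(n_steps):
        x += v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass
        v += 0.5 * (a + a_new) * dt       # velocity update (averaged accel.)
        a = a_new
    return x, v

k = 1.0
def spring(x):
    return -k * x                         # toy harmonic "bond" force

x, v = velocity_verlet(1.0, 0.0, spring)  # start at x=1 with zero velocity
energy = 0.5 * v * v + 0.5 * k * x * x    # should stay ~0.5, its start value
print(round(energy, 4))
```

After 10,000 steps (about 16 oscillation periods) the energy drift is negligible; the cost, as the text notes, is that each step must be tiny, so long trajectories for big systems are expensive.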
Enter Machine Learning
I've explained why SDGR's FEP algorithm is slow. The way SDGR proposes to speed it up is by using a machine learning algorithm, the graph convolutional neural network (GCNN), to predict FEP scores. In principle, it makes sense why training a machine learning algorithm could speed things up. There is some direct relationship from A_A to A_B; due to algorithmic constraints, however, we cannot get at it without solving multiple intermediate steps. Many classes of neural networks can approximate any function given enough data, so such a network will likely be able to learn the direct relationship without solving the intermediate subproblems. GCNNs can't do precisely that, but there is a lot of evidence that they can mimic this ability, approximating almost all graph functions given enough data.
What the GCNN will do is take the structure of all the molecules in a system and attempt to predict the change in free energy. The change in free energy and accompanying activation energies will tell you how much of the molecule in question binds with the protein. At a high level, the GCNN combines information only from neighboring nodes in a graph. For molecules, the graph nodes are the atoms that have a direct bond (i.e., edge) with another atom:
Schrodinger describes the GCNN process on a molecule:
Going from diagrams 1 to 5 along the image: the nodes (i.e., atoms) are all labeled, perhaps with an indicator variable specifying the atom type, the degree of the atom, and perhaps edge labels like bond type. Call this collection of node labels u. The node labels are then transformed by a linear function, u' = Wu + b (diagrams 2 and 3), where W is a matrix and b a vector that map u into a possibly different-dimensional u'. These are the parameters adjusted when training the model, and they are applied to every node. Finally, for each node, you sum the transformed labels of its adjacent (i.e., bonded) nodes and assign the result as that node's new features (diagrams 4 and 5). With the new node features in hand, you can run this procedure as many times as you want.
To predict a scalar quantity, you take all the node features, multiply them by a set of learned parameters, and sum the result into a scalar. To train a GCNN, you feed in all the molecules and adjust the parameters so that the difference between the FEP-calculated free energy and the GCNN-predicted free energy is as small as possible in absolute terms.
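The transform-then-sum-over-neighbors step can be written out in a few lines. The toy three-atom "molecule", random weights, and crude readout below are my own illustration of a generic GCNN layer, not SDGR's architecture:

```python
import random

def gcnn_layer(features, adjacency, W, b):
    # One message-passing layer: transform each node's features with the
    # shared (W, b), then assign each node the sum of its *bonded*
    # neighbors' transformed features.
    transformed = []
    for u in features:                          # u' = W u + b, per node
        transformed.append([sum(wij * uj for wij, uj in zip(row, u)) + bi
                            for row, bi in zip(W, b)])
    new_features = []
    for neighbours in adjacency:                # sum over bonded atoms
        agg = [0.0] * len(b)
        for j in neighbours:
            agg = [a + t for a, t in zip(agg, transformed[j])]
        new_features.append(agg)
    return new_features

# Toy "molecule": three atoms in a chain 0-1-2, with 2-d one-hot-ish labels
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
adjacency = [[1], [0, 2], [1]]                  # bonds as neighbour lists

rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # learned
b = [0.0, 0.0]                                                  # parameters

h = gcnn_layer(features, adjacency, W, b)
readout = sum(sum(node) for node in h)          # crude scalar prediction head
print(readout)
```

Training would adjust W and b (by gradient descent) so that the readout matches the FEP-computed free energy across many molecules; stacking layers lets information travel across multiple bonds.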
Going back to the whole process, SDGR uses their FEP process first to train the GCNN, then after scoring all molecules via the GCNN, SDGR uses FEP to get better estimates of the best-performing molecules:
From what I can tell, this is essentially the SDGR approach in a nutshell. Next, I will discuss issues with their approach.
Analysis of SDGR’s Algorithm
The first thing that strikes me is that MCMC can almost always be replaced with a better algorithm. People like it because it's familiar and often don't know the alternatives, but alternatives exist: MCMC can usually be replaced with variational approximations or simulation-based inference. It turns out that PA can be approximated via variational methods to solve this problem:
Remember, we can't sample from PA because it's an unnormalized distribution. It turns out, though, that you can use what is called the KL divergence to train a density estimator that you can sample from:
The good thing about the KL divergence is that it returns a (pseudo) distance between distributions, and its minimum, 0, implies Qφ = PA (almost everywhere, but no one cares about that detail). Thus, if you can sample from Qφ, then after minimizing this distance by modifying φ, you get a Qφ that is a distribution very similar to PA. You can then calculate the expectation with respect to Qφ by directly sampling from it, avoiding MCMC on PA altogether. This procedure was proposed in 2021 for a modified FEP algorithm, the Bennett acceptance ratio (BAR). BAR sometimes outperforms traditional FEP due to better statistical properties, in part thanks to the variational approach, and adding the variational capability will certainly cut down on computational resources for certain queries.
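A bare-bones sketch of the variational idea: fit a Gaussian Qφ to an unnormalized target by minimizing a Monte Carlo estimate of the (reverse) KL divergence, E_Q[log Q - log P̃]. The target, grid search, and sample size are my own toy choices; real implementations use richer density estimators and gradient-based optimization:

```python
import math, random

def log_p_tilde(x):
    # UNNORMALIZED target: exp(-(x-3)^2 / 2), i.e. N(3, 1) minus its constant
    return -0.5 * (x - 3.0) ** 2

rng = random.Random(0)
eps = [rng.gauss(0.0, 1.0) for _ in range(5_000)]   # shared base samples

def reverse_kl(mu, sigma):
    # Monte Carlo estimate of E_q[log q - log p~] for q = N(mu, sigma),
    # up to an additive constant (which doesn't move the minimizer).
    total = 0.0
    for e in eps:
        x = mu + sigma * e                          # reparameterized sample
        log_q = -math.log(sigma) - 0.5 * e * e      # log N(x; mu, sigma) + c
        total += log_q - log_p_tilde(x)
    return total / len(eps)

# Crude optimization: grid search over phi = (mu, sigma)
grid = [(mu / 10, s / 10) for mu in range(20, 41) for s in range(5, 16)]
best = min(grid, key=lambda phi: reverse_kl(*phi))
print(best)   # close to (3.0, 1.0): q has matched the normalized target
```

Once Qφ is fitted, expectations under the target are plain averages of direct samples from Qφ, with no acceptance steps or correlated chains.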
Given the ease of implementation, it is surprising that, to the best of my knowledge, SDGR has not discussed or implemented this approach. Perhaps the variational approach will not be as easy to implement for MD (though the paper suggests it is likely possible), but SDGR is still pushing MCMC approaches even where it doesn't use MD. This suggests to me that perhaps they aren't aware of the state of the art. Perhaps you can't run variational inference on every FEP query, but there are queries where it could substitute for MCMC. One of the benefits of a BAR-plus-variational-inference approach is that you don't need thousands of GPUs. Despite the potential savings on an estimated $12m of cost (5,000 V100 GPUs, costing $14k each, on a 5-year replacement cycle), the fact that this approach has still not made it into the software is indicative of a culture too focused on the chemistry and not enough on the statistical models.
These issues wouldn't be an immediate problem, because we know FEP works. If it were the best method for the job, the fact that it's not perfect wouldn't stop most people from using it. However, models are being introduced that beat the FEP approach on both speed and accuracy. To illustrate the point in more depth, I'd like to highlight a paper published recently in Proceedings of the National Academy of Sciences (which I will refer to as Qiao et al.). Many papers are cutting into SDGR's expertise, but I will discuss just this one. Qiao et al. use geometric deep learning, a field related to graph neural networks, to calculate quantum mechanical properties. It boasts free energy errors between 0.17 and 0.35 kcal per mole, versus SDGR's errors of 0.7 to 1 kcal per mole. SDGR's errors are reported as root mean squared error, while Qiao et al. use mean absolute error, which is always somewhat lower. However, in my experience with these metrics, the difference doesn't fully close the gap.
In addition, since the evaluation sets differ between the two approaches, it is difficult to know with confidence that Qiao et al. outperform SDGR. However, in the appendix of their paper, Qiao et al. show their algorithm (OrbNet-Equi) outperforming Density Functional Theory (DFT) (the B97-c result), often by large margins:
DFT is more accurate than MD approaches but cannot be used for FEP because it's too slow. Qiao et al.'s method is 100x to 1000x faster than efficient DFT methods, which, combined with its better performance, makes it probably a good deal better than SDGR's FEP approach. Since the error in calculated free energy is lower, the molecules Qiao et al.'s method proposes will be better. The code is open source, so you can use an approach that is probably better than FEP at zero cost, although it won't come with as much supporting software infrastructure. Most biotech firms are founded by doctorate holders, often with at least one founder or early hire having serious quantitative skills. Compared to the rest of the quantitative drug design landscape, SDGR's algorithm seems to me not to measure up, and that is likely becoming clear to biotech executives.
This paper is just one of many implementing machine learning for drug design. Machine learning approaches are disruptive innovations: to compete with these papers, SDGR would have to retool, and its scientists would have to learn entirely different skills. These approaches are essentially paradigm-changing. There is likely no amount of sustaining innovation that can make FEP better than machine learning approaches, because there is so much low-hanging fruit in these recently invented techniques, while innovation in FEP is slow: approaches have been stuck around the 1 kcal per mole error rate for 10 years.
As a metaphor, consider computer vision. Before 2012, computer vision consisted of scientists programming rules into an algorithm. This is somewhat like SDGR using MD or MCMC, expert algorithms, to supply the free energy labels its model learns to predict. In 2012, AlexNet blew every other computer vision system out of the water in the ImageNet competition, in large part by offloading expert suggestions and building a model that could be trained on more data and fewer expert rules (we can talk about convolutions another day). From 2012 onward, no non-neural-network technique has outperformed neural networks on ImageNet. There is still a role for the expert, but machine learning systems with enough data can usually outperform expert decision-making, and this transition leaves algorithms that don't put statistics and data first in danger of obsolescence.
Part 3: Wrap-Up
A company trading at 8x revenue seems reasonable if it is growing 30% a year and close to profitability; SDGR is neither of these things. For a company whose revenue is plateauing and whose path to profitability is improbable, one could argue for valuing it at its cash, securities, and equity investments, or even less, since the company loses money every year. On top of the baseline $1.2b in security assets, I model the business.
Even though I argue that revenue growth is flatlining, for the sake of conservatism I assume SDGR will grow revenue by an average of 15% a year for the next 5 years. This is in line with management's first-year guidance, which the market likely doesn't believe given the price decline after Q2 earnings. I'm also conservative in assuming growth won't slow. I then value the residual at a 20x P/E multiple, which, given my 10% discount rate, implies a 5% perpetual growth rate. Net profit margins start at -60% and increase linearly over the 5 years to 15%. A -60% starting margin is generous given net margins below -80% last year, and a 15% terminal margin, while too low for a software company, is also likely generous considering how unlikely the path to profitability is. Below is a DCF chart; the numbers other than percentages are in billions:
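The mechanics just described can be sketched in code. The $0.18b starting revenue (2022 software plus drug discovery) is my assumption, and my inputs won't exactly match the model in the chart, so this illustrates the structure rather than reproducing the precise result:

```python
# Sketch of the DCF described above (all $ figures in billions).
def dcf(rev0, growth=0.15, m0=-0.60, m5=0.15, pe=20, r=0.10, years=5):
    value, rev = 0.0, rev0
    for t in range(1, years + 1):
        rev *= 1 + growth
        margin = m0 + (m5 - m0) * (t - 1) / (years - 1)   # linear margin ramp
        profit = rev * margin
        value += profit / (1 + r) ** t                    # discounted earnings
    value += pe * profit / (1 + r) ** years               # terminal value at
    return value                                          # 20x year-5 profit

# A 20x terminal P/E at a 10% discount rate implies 1/(r - g) = 20, i.e. g = 5%
print(round(1 / (0.10 - 0.05)))      # 20
print(round(dcf(0.18), 2))           # ~0.46, i.e. ~$460m with these inputs
```

Note how the early loss-making years subtract meaningful value, so nearly all the worth sits in the discounted terminal multiple on year-5 profit.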
The total value is $563m compared to a current enterprise value of $1.5b. This analysis suggests SDGR has a downside of roughly $1b, or 37% of the market capitalization.
The biggest risk in the short term is that revenue reaccelerates. If growth returns to 25% a year and the EV/Sales multiple rerates to 15x, the potential market cap could be 72% higher. This risk is difficult to mitigate, but I think it is small. If you hold this through earnings, there is always a chance they report great numbers. That being said, the market response may be muted even if growth reaccelerates, because the pathway to profitability is difficult, and in the current market environment growth stocks need to show not only growth but also the ability to generate profits.
Additional risks include the portfolio/equity investments producing a breakout drug that looks impressive and has a high chance of being approved. My 1-year holding period mitigates this risk, as most of the drugs that aren't explicitly valued are in discovery. The current cost to borrow is low, and I wouldn't worry about it unless it reaches 10%. At the same time, I will place a stop loss 15% above current prices. I would also think about covering if earnings suggest revenue or profitability is inflecting in the positive direction. I don't see a way for SDGR to become profitable, but if that does look like it's happening, one should cover the short. This is easier said than done, given that the price will be much higher after good earnings. However, this is not really a valuation short (although valuation plays a role), and I would not want to be caught in thesis drift, watching the company become both more overvalued and more viable as a business.
At the time of its founding in 1990, the business was a great idea: using algorithmic approaches to select drugs was ahead of its time. In the meantime, though, the world caught up to SDGR. To illustrate: SDGR is using geometric deep learning to approximate FEP calculations, while researchers have used geometric deep learning to outperform FEP methods entirely. As its growth slows due to the loss of its competitive edge, I recommend shorting the company with a conservative potential profit of 37%. The investment period should be one year, and if results don't look like they are deteriorating, one should close the short. If revenues continue to flatline, this trade should be successful even if management gets costs under better control.