Article written by Keshav Malhotra – Embryologist · Chair Embryology, ISAR · Chair, Embryology SIG, ASPIRE · Executive Member, Alpha Scientists in Reproductive Medicine.
A patient sat across from me a couple of days ago holding a printout. It was an embryology report, a single page with her embryos listed and graded. She pointed to one near the top and asked why it had been given a lower grade than an embryo her friend had received at a different clinic. Hers was a 4AA. Her friend’s was a 5AA. The number was smaller, so she had assumed her embryo was worse.
I explained that grading carries real variability, and that no grade determines an outcome on its own. The number she had fixed on was a measure of expansion, a stage of development, not a score of worth. Then I told her which of her own embryos we intended to transfer first. She asked how I had decided, and a longer conversation followed. She left reassured. I did not.
Because the question underneath her confusion was a fair one, and we had never really given her an answer to it. She wanted to understand how we had decided the order of her embryos. And the honest truth was that for most of my career, that decision had been happening somewhere I could not easily show her. It was happening in my head.
We Are Already Ranking. We Just Don’t Admit It.
Here is something we rarely say out loud. Every embryologist, in every lab, on every transfer day, is ranking. We look at a cohort, we weigh one embryo against another, and we decide which one goes first. That is ranking, plain and simple. It is the actual work.
What we write down, though, is a grade. We assign letters and numbers against a fixed morphological standard, and we hand that over as if it were the decision. But the grade was never the decision. It was a description we used to get to the decision. The ranking, the part that actually determines what happens to the patient, has been happening informally, idiosyncratically, and invisibly, shaped by experience and instinct that we never had to make explicit.
This works, more or less, when one experienced embryologist is doing it consistently. It stops working the moment you ask a fair question of it. Would another embryologist in the same lab rank this cohort the same way? Would I rank it the same way next week? Can I explain to the patient in front of me why this embryo and not that one? For a process that decides the order of someone’s chance at a child, those are not unreasonable things to expect. And an in-the-head ranking cannot answer any of them.
Where Grading Lets Us Down
Grading is useful, and I am not arguing that we throw it away. But an honest look at it reveals problems we tend to talk around.
The first is variability. The same embryo, assessed by two competent embryologists, can receive two different grades. Assessed by the same embryologist on two different days, it can still come out differently. Consensus criteria have narrowed this, but they have not removed it. A label that shifts depending on who is looking, and when, is a fragile thing to present as fixed.
The second is false precision. The boundary between a BB and a BC blastocyst is a convention we have drawn across a biological continuum. The embryo does not know which side of the line it sits on. When we hand a patient a categorical grade, we imply a sharpness the biology does not have. It was precisely this that tripped up my patient, who read a difference in expansion stage as a difference in worth.
The third is the one I find most revealing, and it happens constantly in a busy lab. Picture a cohort of five blastocysts that have all expanded well, all with good inner cell mass and good trophectoderm. On the Gardner system they sit in the same band. Grading says they are equivalent. But you have one transfer slot, and you must choose one to go first. So you rank them anyway, quietly, using judgment the grade does not capture and the record does not show. The moment grading matters most, when a real choice has to be made, is the moment it most often fails to discriminate, and the informal ranking takes over without anyone naming it.
The fourth is the misreading my patient ran into. A grade looks like an absolute score, so it invites comparison across patients and clinics that is not clinically meaningful. The label encourages the exact mistake that caused her distress.
And the fifth is simply that a grade describes an embryo. It does not produce a decision. The step from description to decision still happens, but we keep it hidden, even from ourselves.
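Two of these problems are easy to see once a grade is pulled apart. The sketch below is only an illustration, not a description of any real lab system. It assumes the common three-character blastocyst label (an expansion stage from 1 to 6, then a letter for inner cell mass and a letter for trophectoderm), and the names `parse_grade`, `group_by_grade` and the embryo IDs are hypothetical. It shows that a 4AA and a 5AA differ only in expansion stage, and that a cohort of five embryos with the same label collapses into a single band that says nothing about order.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class GardnerGrade(NamedTuple):
    expansion: int      # stage of development (1-6), not a quality score
    icm: str            # inner cell mass letter, A-C
    trophectoderm: str  # trophectoderm letter, A-C


# Assumption: only full three-character labels such as "4AA" are handled.
_PATTERN = re.compile(r"^([1-6])([ABC])([ABC])$")


def parse_grade(label: str) -> GardnerGrade:
    """Split a label like '4AA' into its three separate components."""
    match = _PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised blastocyst grade: {label!r}")
    stage, icm, te = match.groups()
    return GardnerGrade(int(stage), icm, te)


def group_by_grade(cohort: dict[str, str]) -> dict[GardnerGrade, list[str]]:
    """Group embryo IDs by identical grade; each group is an unresolved tie."""
    groups: dict[GardnerGrade, list[str]] = defaultdict(list)
    for embryo_id, label in cohort.items():
        groups[parse_grade(label)].append(embryo_id)
    return dict(groups)


if __name__ == "__main__":
    hers, friends = parse_grade("4AA"), parse_grade("5AA")
    # The quality letters are identical; only the expansion stage differs.
    print(hers.icm == friends.icm and hers.trophectoderm == friends.trophectoderm)
    print(hers.expansion, friends.expansion)

    # Hypothetical cohort: five blastocysts that share one label.
    cohort = {f"E{i}": "4AA" for i in range(1, 6)}
    for grade, ids in group_by_grade(cohort).items():
        # One group holding all five IDs: the grade cannot say who goes first.
        print(grade, ids)
```

Nothing in this output tells the embryologist which of the five to transfer. That choice still has to be made somewhere, which is the point of the paragraphs above.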
What About the Expensive Answers?
At this point a reader will reasonably ask about the tools built to solve selection. Why not test the embryos genetically, or watch them develop continuously, and let that settle the order?
Both routes are real, and both have a place. But neither is the answer for most labs or most patients. Preimplantation genetic testing adds significant cost, requires a biopsy that is invasive to the embryo, and is simply not appropriate or affordable for every cycle. Time-lapse morphokinetics asks for expensive incubators and imaging systems that a large share of the world’s labs do not have and will not have soon.
The reality of global IVF is that an enormous amount of it happens with conventional morphology, a microscope, and an embryologist’s judgment. Building our thinking only around the high-resource setting ignores where most embryos are actually selected.
So the practical question is not whether we can buy our way to better selection. It is whether we can do better with what nearly every lab already has. And the honest answer is yes, because the gap is not a technology gap. It is a standardisation gap. We are already ranking. We are just doing it without a shared method and without a way to show our work.
The Case for Standardised, Transparent Ranking
Ranking starts from the decision rather than the description. It asks which embryo should go first, then second, and orders the cohort accordingly. This sounds like a small reframe. It is not.
Ranking maps directly onto the clinical question, so it stops pretending the grade was the answer. It holds the context of the specific cohort in front of you, which a fixed grade discards. A 4AA as one of six is a different situation from the same 4AA as a patient’s only blastocyst. The grade is identical. The decision is not. Ranking carries that difference. It is also more honest about uncertainty, because it claims only to order this set of embryos by priority, which is the one claim we actually need to make.
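One way to picture what it means for a ranking to carry its cohort with it is a record that cannot be written without one. The sketch below is a minimal illustration under stated assumptions: the order itself is supplied by the embryologist, the code does not decide anything, and `RankedEmbryo`, `record_ranking`, the embryo IDs and the rationale strings are all hypothetical placeholders. It simply stores each embryo's position together with the size of its cohort and the reason given, so that the same 4AA reads differently as one of six than as an only blastocyst.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedEmbryo:
    embryo_id: str
    grade: str        # the recorded description, unchanged
    position: int     # priority within this cohort only
    cohort_size: int  # context a bare grade discards
    rationale: str    # the reasoning, written down rather than kept in the head

    def describe(self) -> str:
        """Render the position as relative to the cohort, not as a score."""
        return (
            f"{self.embryo_id} ({self.grade}): priority {self.position} "
            f"of {self.cohort_size} in this cohort. {self.rationale}"
        )


def record_ranking(ordered: list[tuple[str, str, str]]) -> list[RankedEmbryo]:
    """Turn an embryologist's chosen order into explicit records.

    `ordered` holds (embryo_id, grade, rationale) tuples, first choice first.
    """
    size = len(ordered)
    return [
        RankedEmbryo(embryo_id, grade, position, size, rationale)
        for position, (embryo_id, grade, rationale) in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    placeholder = "Reason noted at assessment."
    # Hypothetical cohort of six, with a 4AA placed second by the embryologist.
    cohort_of_six = record_ranking([
        ("E3", "5AA", placeholder),
        ("E1", "4AA", placeholder),
        ("E2", "4AB", placeholder),
        ("E5", "4BA", placeholder),
        ("E4", "3BB", placeholder),
        ("E6", "3BC", placeholder),
    ])
    # Hypothetical cohort where the same grade is the only blastocyst.
    only_blastocyst = record_ranking([("E1", "4AA", placeholder)])

    print(cohort_of_six[1].describe())
    print(only_blastocyst[0].describe())
```

The grade field is identical in both printed lines. What differs is the position and the cohort size, and because the description always names the cohort, it is harder to mistake for a mark out of ten.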
This is not a fringe position. The 2025 update to the Istanbul Consensus moved in exactly this direction, recommending that several morphological criteria not well enough supported for grading be used instead to rank embryos within a cohort. The field’s own consensus is quietly acknowledging that for part of what we assess, ranking is the more appropriate framework. That deserves more attention than it has had.
What AI Actually Revealed
Here is the part I did not expect. AI did not replace my judgment. It made it visible.
The first time I watched an AI tool lay a patient’s embryos out in rank order, it was doing explicitly what I had been doing implicitly for years. It was ranking. Seeing that order on a screen forced me to admit something about my own practice I had never quite articulated. The decision was always relative. The absolute grade had always been a stand-in for a ranking I was performing in my head and never had to defend.
There is an irony worth naming, because it connects straight back to my patient. Whether we hand someone a grade or a ranking score, a compact label invites the same mistake. It looks absolute, so it gets read as a verdict. Her grade was misread for exactly the reason a ranking score can be. That is not an argument against ranking. It is an argument for communicating it honestly, for being clear that a position is a position within one patient’s cohort, not a mark out of ten. The fault was never the label. It was that we presented a relative decision in the costume of an absolute one.
Used well, a tool like this does not deskill the embryologist, and it does not demand a new building full of equipment. It offers something most labs lack, which is a standardised, repeatable way to do the ranking we are already doing by instinct, and a way to put it in front of a patient and say, here is the order, and here is why. The value is not that the machine is cleverer than the embryologist. The value is that it makes the reasoning legible.
Back to the Patient
The grade on her report was never a verdict on her embryo. It was one input into a decision about sequence, the same ordering embryologists make every day, usually without ever writing it down. Her instinct to compare it to her friend’s was only possible because we had handed her something that looked absolute. What she actually wanted was simpler, and entirely reasonable. She wanted to understand how the order had been chosen.
That clarity is something we owe our patients, and giving it does not require a biopsy or a new incubator. It asks for something cheaper and harder: the honesty to admit that ranking is the work, the discipline to standardise how we do it, and the willingness to explain it out loud. Grade when we record. Rank when we choose. And stop dressing one up as the other.
She did not need a better embryo. She needed to understand the one she had, and the decision we had made about it. The embryos on that printout were exactly what they were before she walked in. The only thing that needed to change was us, and how plainly we were willing to show our work.
