Moving Clinical Trial Goalposts · Episode 32 · 37:13

Judith: Welcome to Berry's In the
Interim podcast, where we explore the

cutting edge of innovative clinical
trial design for the pharmaceutical and

medical industries, and so much more.

Let's dive in.

Scott Berry: Welcome, everybody, to In the Interim. I'm Scott Berry, your host, and I have my usual partner in statistics, my partner in crime: Kert Viele is here to join me. Welcome back to In the Interim, Kert.

Kert Viele: Thanks Scott.

Scott Berry: So we have an interesting topic, and part of it is the failure of AI. That's not the topic, but Kert and I both looked everywhere to see if there's a name for this, to see what AI said about it. And we can't find a name for it, so it's a little bit unnamed. So we're calling it, I'm calling it, the uncertainty of novelty. The AI threw that out as something, and we'll talk about what this is.

And the alternative name for this is something else. I know Kert has a daughter in college and I have a son in college, and we learn everything about what's cool in the world from our children. We learn slang from them. We've picked up things like "dude" and "bro," and they have specific meanings and you have to use them in the right particular way. But my son uses a phrase when something just seems awkward: he says, "What are we doing here?" And so it's just that phrase and the way he phrases it. So it's a little bit of mixing the uncertainty of novelty and "what are we doing here?"

Okay, so what are we doing here? We are talking about what happens when you do something novel, and I don't like to use the word innovation because it's almost self-glossing, it's inherently positive, but when you're doing new things, you're doing something novel. There are hurdles, and there are people who look at the approach and point out issues with the novelty. And the interesting thing is, those issues exist, and they may be worse in the thing you did before that wasn't novel. But yet this is pointed out, and barriers are put on novelty that aren't put on what we do every day. So what are we doing here?

I'll jump into the first example; Kert and I are going to present examples of this from over the years. And we want to be really careful here that we're not being critical. Kert and I both have a great deal of respect and appreciation for regulators, and for journal editors, the people who do this. But there are times, and these are the people who generally raise these hurdles, the "what about this, what about that" questions, so they're going to show up in the examples today. And it's not only regulators, but they'll show up in our examples, and we don't want to come across as critical of all of them.

Kert Viele: We should be up front that we agree with regulators and journal editors probably more often than with our college-age kids.

Scott Berry: Yeah, yes. A topic for a different day.

So I was working with a sponsor and we designed, I'll give an example. Example number one: we were building a trial in oncology, and it was a combination therapy compared to standard of care. The combination was adding experimental agent A to standard of care. Is that better than standard of care alone?

Okay. And we were looking at overall response rate (ORR) as an endpoint. This looks at the tumor having a 50% reduction, there are specific criteria to this, but that's the endpoint. And the question is, what would happen on standard of care? We actually had data from other trials that had presented ORR rates for standard of care, and we used that in a Bayesian model, where we proposed to enroll three to one: three to the combination therapy that included the experimental agent and one to the standard of care, and to use Bayesian borrowing to estimate the standard-of-care rate. The historical data showed about a 15% ORR, and we proposed this to the agency, so a randomized trial.

Now, it was back in the old days with some of these FDA meetings, when sponsors asked you to be physically located with them, with the pharmaceutical company. So I flew to the sponsor and sat there for a telephone call on the design. These days we do this over virtual meetings and you're not forced to fly anywhere; back then it wasn't even flying to DC for the meeting. So I was, as you can guess, hammered by the regulators on this design.

Their complaint was: okay, you do this borrowing from the 15% historical rate, but maybe the modern-day average for the standard of care you'd be comparing to your new arm is more like 30%. And by borrowing from the 15% rate, we were doing dynamic borrowing, so yes, type I error would be inflated: if the truth is now 30% for the standard of care and we're borrowing from 15%, you get inflated type I error. Dynamic borrowing is great because as the concurrent control data move away from the historical rate, the borrowing backs off and the type I error inflation comes down, and if the discrepancy gets even larger, it decreases further. Beautiful.

I thought it was beautiful, and I was crushed by the FDA statistician saying: you're inflating type I error, you can't do this. Okay, and they're sitting at a table with the sponsor. So they came back and said, our suggestion to you is that you run a single-arm trial of your combination, compare to a historical rate of 15%, and show a statistically significant improvement over the 15%; that would be sufficient and a good trial.

And so I'm sitting there with the sponsor and I'm a little shook by this, and of course this is a beautiful thing for the sponsor: they perceive it to be an easier road to go down. Meanwhile I want to jump up and down and say, wait a minute, this doesn't make any sense. We'll come back to it not making sense, but of course I have to say absolutely nothing. I have to be quiet and say absolutely nothing in all of this.

So of course, the interesting thing is that by running a single-arm trial against a number, nobody ever asks: what's your type I error if that number is wrong? But borrowing from that same number while enrolling new standard-of-care patients is a novel thing to do, so they asked the question: what's the type I error if in truth it's 30%? And of course, in the solution they came up with, if the current state of nature is that it's 30%, the type I error is much, much higher in what they asked the sponsor to do. It's much worse than the novel approach, even though the novel approach actually asks you to investigate that question. It's a better answer to that very concern, but yet: oh no, we can't do the novel thing. So a little bit of my son's "what are we doing here?"
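
To make this concrete, here is a minimal simulation sketch of the comparison Scott describes. The sample sizes, historical counts, prior weight, and decision thresholds are hypothetical placeholders, and the dynamic borrowing is implemented as a simple robust mixture prior standing in for the actual trial's model; the point is only the relative false-positive rates when the true modern control ORR is 30% and the combination adds nothing.

```python
# Hypothetical numbers throughout: historical control ORR ~15%, modern
# standard-of-care ORR drifted to 30%, and the combination adds nothing
# (its true ORR is also 30%).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
TRUE_RATE = 0.30            # both arms: the drug adds nothing over modern SOC
HIST_X, HIST_N = 30, 200    # historical control data, ~15% ORR (assumed)

def single_arm_reject(n=80, null_rate=0.15, alpha=0.025):
    """Single-arm trial of the combination tested against the fixed 15% rate."""
    x = rng.binomial(n, TRUE_RATE)
    p_value = stats.binom.sf(x - 1, n, null_rate)   # one-sided exact test
    return p_value < alpha

def borrowing_reject(n_trt=120, n_ctl=40, w_prior=0.5, threshold=0.975):
    """3:1 randomized trial; the control arm uses a mixture prior that
    down-weights the historical 15% data when the concurrent data conflict."""
    x_t = rng.binomial(n_trt, TRUE_RATE)
    x_c = rng.binomial(n_ctl, TRUE_RATE)
    a_i, b_i = 1 + HIST_X, 1 + HIST_N - HIST_X      # informative component
    # posterior mixture weight via beta-binomial marginal likelihoods
    m_i = stats.betabinom.pmf(x_c, n_ctl, a_i, b_i)
    m_v = stats.betabinom.pmf(x_c, n_ctl, 1, 1)
    w_post = w_prior * m_i / (w_prior * m_i + (1 - w_prior) * m_v)
    # draw the control rate from the posterior mixture, treatment from a flat-prior posterior
    k = 20000
    use_informative = rng.random(k) < w_post
    p_c = np.where(use_informative,
                   rng.beta(a_i + x_c, b_i + n_ctl - x_c, k),
                   rng.beta(1 + x_c, 1 + n_ctl - x_c, k))
    p_t = rng.beta(1 + x_t, 1 + n_trt - x_t, k)
    return np.mean(p_t > p_c) > threshold           # "win" if P(trt > ctl) is high

sims = 2000
print("single-arm vs fixed 15%, false-positive rate:",
      np.mean([single_arm_reject() for _ in range(sims)]))
print("randomized 3:1 with dynamic borrowing, false-positive rate:",
      np.mean([borrowing_reject() for _ in range(sims)]))
```

Under these assumed numbers the single-arm design declares a win almost every time, while the randomized design with borrowing is only modestly inflated above 2.5%.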

Kert Viele: So I wrote an article on LinkedIn on whether statisticians can change reality, and in effect that's what's being done here. Type I error asks: are you making a false conclusion if the null is true? You had protection if the null was 15%, so you had type I error control. But the real scientific null you want to test is whether the control equals the treatment, and if you're not enrolling control, you don't have protection for that. You have worse protection than what you proposed.

We were both in a session at RISW where TBA was presenting, and she had an example where she proposed a non-inferiority trial with a 2.5-point non-inferiority margin. And it's the same issue: you inflate type I error if you're borrowing data that's not sufficiently on point, the usual problem. But the regulatory solution to this was: okay, we don't want you to borrow, but we will let you use a non-inferiority margin of three. So in effect this is changing the goalposts; it's changing reality. You're saying, okay, if you're going to borrow, we think 2.6 worse is a bad drug, but if you're not going to borrow, we'll think 2.6 worse is a good drug.

Scott Berry: Hmm.

Kert Viele: In this case, if your real goal is "I don't want to approve a drug that's 2.5 worse," and your margin is three, your type I error for that question is inflated. And it's inflated everywhere, not just in certain places.

Scott Berry: Okay, so the 2.5 margin in the first design, with borrowing, has a certain type I error that can be larger than nominal. And by moving to a margin of three, that 2.5 question has a higher type I error than it would have had with the borrowing in that scenario. But somehow there's a comfort in the usual sort of thing and not in the novelty.
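
A quick back-of-the-envelope sketch of that point, with a made-up outcome standard deviation and sample size: if the question you care about is "is the drug 2.5 points worse?", widening the margin to 3 inflates the error rate for that question, with no borrowing involved.

```python
# Assumed outcome SD and per-arm sample size; only the relative error rates matter.
import numpy as np
from scipy.stats import norm

sd, n_per_arm = 10.0, 300                  # hypothetical values
se = sd * np.sqrt(2 / n_per_arm)           # SE of the treatment-control difference

def prob_declare_noninferior(true_diff, margin, alpha=0.025):
    # declare non-inferiority if the lower confidence bound exceeds -margin,
    # i.e. if (d_hat + margin) / se > z_{1 - alpha}
    return norm.cdf((true_diff + margin) / se - norm.ppf(1 - alpha))

# error rate for the question "the drug is 2.5 points worse" under each margin
print("margin 2.5:", round(prob_declare_noninferior(-2.5, 2.5), 3))   # = 0.025
print("margin 3.0:", round(prob_declare_noninferior(-2.5, 3.0), 3))   # > 0.025
```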

Kert Viele: Yeah, it's something we've done before and there's comfort with that, and I have some sympathy for that, but of course it causes us a lot of frustration at times.

Scott Berry: Mm-hmm.

Okay, so you have another example.

Kert Viele: Yeah, so this was several years ago. We're going to switch to enrichment designs, basket trials. We were presenting a design, it was in psychiatry, and it was a drug they were interested in testing in an indication, but that indication could be divided up into four groups. So we proposed an enrichment trial where, the four groups sat in a two-by-two factorial, we would start looking at whether it works in the top row of that table, or the left column, or whatever the case may be. We were going to try to personalize where the drug worked and where it didn't. And the problem with this is the standard problem: we're going to use a hierarchical model to do this. I should be clear, the reason we can't do separate trials in the four groups is that it's enormous; we'd have to run at least four times the size of the trial, probably more like six here, because some of the groups are small. So we were going to use...

Scott Berry: But also you have the expectation that the effect might have some similarity across the four groups, so it's not just cutting corners. Yep.

Kert Viele: Yeah, these aren't four random groups. There's the possibility of separate effects, but the expectation that they'll be similar. So anyway, the issue when you do this kind of borrowing is there's always a concern: suppose it works in three of the groups and it doesn't work in the fourth. You borrow, you bring that fourth group up, you inflate the type I error. So instead of having a two and a half percent type I error, there's this 4% or 8% chance that you're going to incorrectly bring along that last group, so you're going to have this chance of 25% of the population being wrong. We went back and forth on this, and eventually what happened is all of the enrichment was completely removed and they ran a pooled analysis. And the pooled analysis, as you were saying in your example, is worse in terms of this kind of error. If the drug actually works in three groups and not the fourth, and you pool, the type I error is 70, 80%; you're almost guaranteed to carry that group along. It's one of those situations where the null now makes the implicit assumption about reality that everyone is equal, and we don't think about those type I errors. I got jaded enough after this instance to think it's almost better to do a pooled analysis and run all your subgroups as sensitivity analyses, where everybody says, well, it's all post hoc and nothing has type I error control.
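
A minimal sketch of that pooled-analysis arithmetic, with hypothetical effect sizes and sample sizes: four equal subgroups, a treatment benefit in three of them and none in the fourth, and a single pooled one-sided test at 2.5%.

```python
# All numbers are hypothetical; the quantity of interest is how often a
# significant pooled result "carries along" the subgroup with no benefit.
import numpy as np

rng = np.random.default_rng(2)
n_per_group = 75                                 # per arm, per subgroup
effects = np.array([0.3, 0.3, 0.3, 0.0])         # treatment effect in SD units
z_crit = 1.96                                    # one-sided 2.5% test

def pooled_trial_wins():
    trt = np.concatenate([rng.normal(e, 1, n_per_group) for e in effects])
    ctl = np.concatenate([rng.normal(0, 1, n_per_group) for _ in effects])
    z = (trt.mean() - ctl.mean()) / np.sqrt(1 / trt.size + 1 / ctl.size)
    return z > z_crit                            # a pooled "win" covers all four groups

carried = np.mean([pooled_trial_wins() for _ in range(5000)])
print(f"P(positive pooled result despite the null fourth group) ~ {carried:.2f}")
```

With these assumed numbers the pooled result is positive roughly three-quarters of the time, in the range Kert quotes.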

Scott Berry: Hmm, yeah. So you were proposing a method, and it was criticized, and yet it did better on that very concern than the final solution did.

Kert Viele: Yeah.

Scott Berry: So this is a really interesting issue as we get into understanding more about disease, understanding more about populations and the potential for heterogeneity of effect. I remember going to the FDA with the GBM AGILE platform trial, where we were looking at potential subgroups in oncology, and we presented this problem where we were borrowing across them. And it's the usual thing, so we don't want to be too critical of people: initially the reaction was, oh, by looking at this question, you have to control type I error in every possible group. And I thought Lisa LaVange, who was at the FDA at the time, was fantastic on this. She said: the FDA approves drugs all the time where there's somebody in that indication who doesn't benefit. We can't possibly know that every single person in that population benefits, but yet we do a pooled analysis and a mean effect, and we approve it.

So the question is, what's a type I error? If you go into your example of four populations and you end up with the decision that it works in three, and the truth is that it works in two but not the third, have you committed a type I error? And she said, for that particular example, we should count it as a type I error if you make an approval and it doesn't benefit any of the groups. But if there's a population in there that it does benefit, then that's a mixed result. We want to worry about the likelihood of a mixed result, and we wanted to report that, but to label it a type I error? She didn't think it was a type I error. So we had to show strong control of type I error as she defined it. And I thought that, at the time, was an amazing reaction to novel designs that were asking new questions: understanding the breadth of errors we had always been making, and being able to apply that to a novel scenario.

Kert Viele: And I do think this remains an active research area, and we don't have a conclusion on it. But we're used to things like: we will tolerate a two and a half percent chance of getting it wrong for everybody, approving the drug when it doesn't work for anybody. That's the standard, prototypical Stat 200 example. So the question is, what happens when there's a 2.6% chance of getting it wrong for only 20% of the population? Is that a worse error? I'd argue it's less of an error, but this actually needs to be worked out, and then we need a metric that works in this case.

Scott Berry: Yeah. Okay, we may come back to enrichment trials. We think the draft ICH E20 is misleading on them and imposes a restriction on them. Maybe I shouldn't go there. So, do you want to describe what the draft ICH E20 potentially says, since we're on enrichment trials?

Kert Viele: Alright, well, since you went there. The other interesting aspect when you do enrichment is that there's a statement in the guidance, and it's in a number of guidances, it isn't unique to ICH E20. If you investigate multiple groups and you show an effect in some of the groups, and let's assume it's a strong effect and so on, the question is: what do you have to do with the groups you're leaving behind? Do you need to show that it doesn't work in those groups? And if you do have to show that, the question then is, what is the burden of proof? Do you have to demonstrate the null is true, which is obviously hard? And Scott, I'm going to let you take this; I know it's near and dear to your heart right now.

Scott Berry: Yeah. And we got a response from regulators quoting the draft ICH E20, saying that for enrichment trials you have to show that the effect is significantly larger in the groups you're taking forward than in the ones you're not. That burden has never been placed on a sponsor before. If you run a phase two trial and you decide, these are my inclusion and exclusion criteria for phase three, you are never asked to demonstrate that the people you're not enrolling get a different benefit from the drug. It would be an enormous burden; it would mean you'd never run an enrichment trial. In the standard development pathway you never have to do that. But now, because you're addressing the potential differential, which by the way is better for patients, this isn't a bad thing to be doing, you have a higher burden by doing it than if you don't. And of course, what that does is push you to do the same old thing, which is worse.

Kert Viele: And essentially, to come back to what you're saying: if you have an inclusion criterion, say I'm going to enroll people over 50, they never ask you to show that it doesn't work for a 45-year-old. It's just not part of it. I do think there's a road here. It is nice to show there's a differential effect, and all of these designs are aimed at showing a differential effect. The question is, the guidance is vague on what that standard of evidence is. There's certainly a level of evidence that the design provides; we just don't know what the bar is, and statistical significance is way too high.

Scott Berry: Yep. Or having to prove that the drug doesn't work. Because many times in an enrichment trial, you don't enroll groups where the effect is probably positive but likely small: very severe patients, where you might have a small benefit but they're hard to reach and there's largely little difference, or very healthy patients, where everybody does well. So it's just a statistical decision not to enroll a group where you likely have a benefit, but to have to prove you don't? We would all grind to a halt.

Kert Viele: Yeah.

Scott Berry: A similar example, shifting gears to another one, came up as platform trials came to the forefront. A platform trial is a trial where, in a single trial, you enroll multiple experimental arms. So a simple platform trial is: we enroll experimental arms A, B, and C, and they're all compared to a common control, equal randomization, simple trial. But in the same trial there's A, B, and C comparing to a control. Initially, when we and others were doing designs like this, the reaction was: you have to control the type I error for A adjusting for the fact that you're testing B and C. So instead of a 2.5% type I error for A, which is the standard, you have to split that because you're investigating B and C.

And we kind of understand how we got here. We use terms like experiment-wide type I error when we're looking at older trials that only addressed one arm, or addressed versions of one treatment, like doses. The parallel to a platform trial looking at A, B, and C is running three separate trials: one of A against control, one of B against control, and one of C against control. The standard is that they each get 0.025 error. But by putting them in the same trial, you have to adjust for the other ones, and by the way, if you add a fourth arm, you might have to adjust even further. In this scenario, a burden is put on the platform that's never put on the trials in the other circumstance, and by the way, it would crush platform trials; they'd never be run.

The nice thing about this example is that there was that initial reaction, from regulators, journals, reviewers, but I think we've largely come to the recognition that, oh, this is something we've always done, and this is not an additional burden we should place on the novelty when we don't place it in other scenarios; FDA guidance documents now reflect this. Now, there are interesting cases where the arms are all the same mechanism of action, or happen to be doses, and we all recognize that's different science. But in the case where there are three different experimental arms, this was new, and that adjustment was the initial thinking, even though it completely differed from what we normally do.
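
A minimal sketch of the comparison Scott is drawing, with a hypothetical sample size and normal outcome: arms A, B, and C each tested against a shared concurrent control at one-sided 2.5% have the same per-arm type I error as three separate two-arm trials; sharing the control correlates the tests but does not change each arm's marginal error rate.

```python
# All three experimental arms are truly null here; sample size is hypothetical.
import numpy as np

rng = np.random.default_rng(4)
n = 100                                  # patients per arm
z_crit = 1.96                            # one-sided 2.5% per comparison
sims = 20000

rejections = np.zeros(3)
for _ in range(sims):
    control = rng.normal(0, 1, n)
    for j in range(3):                   # arms A, B, C compared to the SAME control
        arm = rng.normal(0, 1, n)
        z = (arm.mean() - control.mean()) / np.sqrt(2 / n)
        rejections[j] += z > z_crit

# each arm's marginal error stays near 0.025, just as in three separate trials
print("per-arm type I error with a shared control:", rejections / sims)
```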

Kert Viele: And I can actually see putting a multiplicity penalty on separate trials. We talk all the time about, say, the amyloid hypothesis in Alzheimer's, which has somewhat evolved recently, so it isn't as good an example as it was five years ago. But the notion of: what does one positive trial out of 30 investigating the same mechanism mean? The key issue here is that if you're going to do one thing for separate trials, you should be doing the same thing in the platform, whatever it is.

Scott Berry: Yep. And we all think we actually do a poor job of analyzing across trials. Part of it is that all the characteristics we look at are within-trial type I error, within-trial power, and so on. A call for a different day is whether we'd be better off with one large trial or two small, adequate and well-controlled trials. But other things come up in that circumstance.

Kert Viele: I've seen bizarre ones where they've said a company should only have a certain amount of type I error, and it's sort of, well, then the company needs to spin off a unit in order to test another drug, and things like that. It gets a little strange.

Scott Berry: Yeah, it gets strange, because it probably means I have no type I error left and...

Kert Viele: You don't, that's...

Scott Berry: ...I need to retire.

Kert Viele: It's the only reason we hire.

Scott Berry: Yep.

Okay. You have a response-adaptive randomization scenario.

Kert Viele: Oh, I guess I do. So this is somewhat related to the platform trials example; it involves multiple arms. Those are the specifics, but the general topic here is the ethical discussions that come up in adaptive trials, where we're looking at the data at an interim, so someone knows what the data look like.

And you often get into ethical discussions of: have you broken equipoise? So in an RAR example, you look at the data. One arm is doing great, the other two arms are doing poorly. You increase allocation to the one and decrease allocation to the two that are doing worse, and they're not doing badly enough to fail, they're just doing worse. And then the question is: well, that's unethical, because you have evidence that the good drug is better.

So have you broken equipoise? First off, I certainly have some sympathy for the argument: should you treat the next patient as best you can according to what you know? The counterargument, of course, is that we can't just always use the best arm. We know we don't learn efficiently when that happens. If I said, after the first patient, I'll pick the best arm and always give that, I won't get the right answer at the end. So I have to balance this learning and randomization somehow.

But the thing that always seems odd to me about this is: what if I were to not look at the data at all? I haven't looked at the data, I'm not running an interim. I could be sitting in a situation where I have the same data set, where X is doing better than Y and Z, and I just continue to allocate one-to-one-to-one, equal allocation. In that case I give more of Y and Z to patients in the trial than I would doing RAR. So the patients in the trial suffer; they're going to have worse results. I also know that RAR, done correctly, is more likely to get me the right answer, has more power, and has a better chance of picking the right dose.

And by looking at the data, I also end up treating the patients outside the trial better. So in effect, with the knowledge that I have, I'm being told that if I use it, I treat the patients in the trial better, I treat the patients outside the trial better, everybody wins, but somehow that's unethical, whereas if I blindly hurt people, that's okay. And, to be provocative here, this is the example I use when I want to get a rise out of people: in most settings, if you refuse to look at information and cause harm, that's considered negligence. It shouldn't be a virtue here.
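
A minimal sketch of the kind of interim update being described, for a binary endpoint on three arms: compute the posterior probability that each arm is best and tilt the next block's randomization toward it, a tempered Thompson-style rule. The interim counts and the tempering power are made-up illustrations, not any particular trial's algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)

responders = np.array([18, 10, 9])     # made-up interim data on arms X, Y, Z
patients   = np.array([40, 40, 40])

# Beta(1,1) priors give Beta posteriors for each arm's response rate
draws = rng.beta(1 + responders, 1 + patients - responders, size=(100_000, 3))
p_best = np.bincount(draws.argmax(axis=1), minlength=3) / draws.shape[0]

alloc = p_best ** 0.5                  # temper toward equal allocation, then renormalize
alloc /= alloc.sum()

print("P(arm is best):       ", np.round(p_best, 3))
print("next-block allocation:", np.round(alloc, 3))
```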

Scott Berry: Yeah. And I've heard the same thing: if at an interim there's an 80% chance dose two is better than dose one, you shouldn't give dose one anymore, and the act of doing the interim and learning that answer means you can't give it anymore. But if you don't do the interim, it's okay to do fixed randomization. The knowledge is there, the data are the same, but somehow ignorance is bliss and it's okay in that circumstance. Response-adaptive randomization, even just doing interims, brings out a lot of these scenarios where the perception is that, because a look was made, you now have to do things that it's okay to simply ignore in other circumstances.

Okay, I'm going to jump to an analysis-method example. This one is near and dear to my heart. If you were at RISW a couple of weeks ago, I probably shouldn't timestamp this, but I talked about this a little bit there.

So, in analyzing an ordinal outcome, we have developed utility weights for an ordinal outcome called the modified Rankin Scale, the mRS. There are seven values of the mRS, from zero, which is perfect neurological status, through integer values up to six, which is dead, with differential levels of disability from 1, 2, 3, 4, 5 up to 6. We have done multiple studies that look at the impact of the different states, and we weight them relative to economic burden and to patient preference; the two are almost identical. We've created weights for the scale and labeled it the utility-weighted mRS.

And we ran this in a trial design, actually tied to an enrichment trial design: the DAWN trial. It was the primary endpoint in the DAWN trial, and the trial was incredibly successful; endovascular therapy is an amazing therapy, it's awesome.

We proposed this, and in the journal Stroke the investigators from MR CLEAN wrote in about the utility-weighted mRS. There were a number of complaints, but the biggest complaint was: you can't do that, because you're using relative weights for the outcomes when not everybody in the population agrees with those weights.

There were other points that we wrote back to them about, and they came back, and the single point, and lots of people have this criticism, was that you're putting weights on those outcomes and not everybody in the population is the same. They even looked at EQ-5D data from their own trial, MR CLEAN, which also showed a benefit of endovascular therapy, and showed that the utility people assigned to states 1, 2, 3 had variance; not everybody said the same thing, so you can't use this weighting system. And their solution was: you should use a proportional odds model.

Many others say you need to dichotomize the ordinal outcome. If you dichotomize the ordinal outcome, you are imposing utilities on those states of 1, 1, 1, 0, 0, 0, 0, and you are imposing that everybody in the population has those weights. If you use a proportional odds model, you are also imposing weights on the outcomes; the weights happen to depend on prevalence, but you're imposing a single weight for each outcome for everybody in the population. It's just a little more implicit.

And this was a very smart group, they do wonderful trials, but the reaction was: you can't do this, because you're imposing weights that not everybody in the population agrees with, so do this other thing instead, which does exactly the same thing. It imposes a weight on those values for everybody in the population. It violates exactly the thing they thought was the worst part of our approach, and somehow that's okay, where what we did was bad. Largely it was because we wrote down what the weights were, whereas you can hide them implicitly behind a statistical assumption if you don't write them down.
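
A small numerical illustration of that point: analyses of the 0 to 6 mRS of this form are differences of weighted means over the states, and dichotomizing at 0 to 2 versus 3 to 6 is itself a utility weighting with weights (1, 1, 1, 0, 0, 0, 0). The utility weights and the two arms' mRS distributions below are made-up numbers, not the published UW-mRS weights or the DAWN data.

```python
import numpy as np

utility   = np.array([1.0, 0.9, 0.75, 0.6, 0.35, 0.0, 0.0])   # assumed weights
dichotomy = np.array([1, 1, 1, 0, 0, 0, 0])                   # "good outcome" = mRS 0-2

# hypothetical distributions over mRS 0..6 for treatment and control
p_trt = np.array([0.10, 0.18, 0.21, 0.16, 0.14, 0.07, 0.14])
p_ctl = np.array([0.05, 0.08, 0.17, 0.17, 0.20, 0.10, 0.23])

def weighted_effect(weights):
    # expected weight on treatment minus expected weight on control
    return weights @ p_trt - weights @ p_ctl

print("dichotomized (mRS 0-2) difference:", round(weighted_effect(dichotomy), 3))
print("utility-weighted mRS difference:  ", round(weighted_effect(utility), 3))
```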

Kert Viele: And I think this comes up, we've talked about it offline a lot, in a lot of situations where we statisticians are implicitly making clinical judgments in trials that I don't want to make. If I form a win ratio statistic, there are implied utilities in those levels. If death is the first level and time on ventilation is the second level, you're making an implicit decision about how much extra death is compensated by how much less mechanical ventilation. It's in there, but it's not written down explicitly; it's just in the statistics and it's not thought about. Whereas if I were to write this out explicitly, here is the utility for different patient paths, then it becomes a major argument. And I would much rather have clinicians and patients making those kinds of judgments explicitly, even if imperfectly, than me making them implicitly where no one sees it.
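
A minimal sketch of the pairwise rule inside a win ratio, using the ordering in Kert's example (survival first, then time on mechanical ventilation); the hierarchy of comparisons is where the implicit trade-off lives. The data are toy numbers.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(6)

def compare(t, c):
    """+1 if the treatment patient wins the pair, -1 if they lose, 0 if tied.
    Level 1: survival (dying loses); level 2: fewer ventilator days."""
    if t["died"] != c["died"]:
        return -1 if t["died"] else 1
    if t["vent_days"] != c["vent_days"]:
        return 1 if t["vent_days"] < c["vent_days"] else -1
    return 0

def simulate_arm(n, p_death, mean_vent):
    return [{"died": bool(rng.random() < p_death),
             "vent_days": int(rng.poisson(mean_vent))} for _ in range(n)]

trt = simulate_arm(100, p_death=0.20, mean_vent=6)   # made-up arm parameters
ctl = simulate_arm(100, p_death=0.22, mean_vent=8)

results = [compare(t, c) for t, c in product(trt, ctl)]
wins, losses = results.count(1), results.count(-1)
print("win ratio:", round(wins / losses, 2))
```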

Scott Berry: Yeah. A great example of that is the CAFS analysis in ALS trials, where you take function, the ALSFRS, and death, and you do a nonparametric test of the two because they don't want to weight death against function; it's hard. So they do this nonparametric analysis, which weights death and function implicitly through a statistical assumption. It's just not written down, and somehow that's okay. Whereas, as you say, I'd much rather have this done explicitly, even if imperfectly, by the people who should be working on it, not by a statistical assumption.

Alright. I think we have managed not to insult anybody by saying...

Kert Viele: We certainly didn't intend to.

Scott Berry: ...what are we doing here? I think it's the natural thing: when you're doing something different, something I'll call novel, you're doing new things, and you start to address questions you never addressed with the old, comfortable approach. All of a sudden you worry about that issue and realize, uh-oh, the thing I was doing before actually had exactly the same issue; we just didn't really worry about it. The comfort of that.

Kert Viele: I think whenever you do something new, we should always be thinking in depth: tear it apart, what are the risks? All of that makes sense. But at the end of the day, what we really end up doing is defining a new metric: what are we considering to be the problem we're trying to solve? The new methods don't have to be perfect on that; they need to be better than what existed before. And in order to assess that, you need a common framework. So you don't want to switch it around, where the novel method is assessed on this metric and the old method is assessed on something different; that's not a good comparison.

Scott Berry: Mm-hmm. I actually think that's one of the values of clinical trial simulation: we can look in depth at a deep set of operating characteristics that are very hard to examine otherwise. So an apples-to-apples comparison of approaches, picking the better ones, with explicit performance and goals written down and shown, I think is very helpful.

All right. Our time in the interim is done for today. Kert, a really cool topic. We see examples of this, and I appreciate you coming on In the Interim.

Kert Viele: Thanks.

Scott Berry: Alright, everybody, thank you for joining us in the interim, until the next one.
