Good question, for me a constant struggle!
I would like to add to the excellent suggestions and remarks already made.
As with any measure of success and its KPIs, it starts with why you want to use threat modelling. One goal may be to promote security awareness. Another may be to prioritise which security features or controls you need to build into your systems or programs.
These two goals require different types of measurements and KPIs.
For the implementation, where the core goal is to raise security awareness, I would suggest looking at various security maturity measurement models like the SANS security awareness maturity model. You can also mix and match with a team competence model like DASA.
For threat modelling where the goal is to create more secure applications, there is a pitfall that can be difficult to steer clear of: the preparedness paradox.
I have created threat models to limit security incidents → I don’t have that many incidents, so why would I create a threat model for every application or solution change?
A potentially good approach is to measure the “before and after” of a threat model.
Before: on completion of the first version of the model, where you have identified threats and ways to counter them.
After: when you have implemented (some of) these countermeasures and evaluated how they affect the residual threat or risk.
This works best if you have a severity score for each threat and a ‘mitigation score’ for each countermeasure.
It could look something like this (a very, very simple visualisation just to… well… visualise):
Before: Threat = 5, Countermeasure = 0, Residual risk = 5
After: Threat = 5, Countermeasure = 3, Residual risk = 2
You can use this approach either to represent the threats in the system or solution, or to estimate the impact of a countermeasure on a threat.
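To make the arithmetic concrete, here is a minimal sketch of the before/after scoring. It assumes residual risk is simply severity minus mitigation, floored at zero, which is my simplification of the example above; the `Threat` class and its field names are made up for illustration and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """Hypothetical record for one threat in a threat model."""
    name: str
    severity: int        # assumed scale, e.g. 0 (none) to 5 (critical)
    mitigation: int = 0  # assumed combined score of implemented countermeasures

    def residual_risk(self) -> int:
        # Simple subtraction, as in the example; never drops below zero.
        return max(self.severity - self.mitigation, 0)


# Same threat, scored before and after a countermeasure is implemented.
before = Threat("example threat", severity=5, mitigation=0)
after = Threat("example threat", severity=5, mitigation=3)

print("Before:", before.residual_risk())  # Before: 5
print("After:", after.residual_risk())    # After: 2
```

A real scoring scheme may well be non-linear (for example multiplying likelihood and impact), so treat the subtraction as a placeholder for whatever your team agrees on.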
When you add a state or an update time to the threat model, you can measure progress over time, or per state the solution or threat model was in, compared with the previous time or state.
These are examples; you may think of other triggers more in line with the SDLC states of the solution you are analysing.
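As a rough sketch of tracking this over time: the snippet below stores one snapshot per state or update date and reports the change in total residual risk compared with the previous snapshot. The `Snapshot` class, the state names, the dates and the scores are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Snapshot:
    """Hypothetical snapshot of a threat model at one state or date."""
    state: str                 # e.g. an SDLC state; names are assumptions
    taken_on: date
    residual_risks: list[int]  # one residual risk score per threat

    def total(self) -> int:
        return sum(self.residual_risks)


def progress(snapshots):
    """Return (state, total, change vs previous snapshot), ordered by date."""
    rows = []
    previous = None
    for snap in sorted(snapshots, key=lambda s: s.taken_on):
        total = snap.total()
        delta = None if previous is None else total - previous
        rows.append((snap.state, total, delta))
        previous = total
    return rows


# Made-up history for one solution with three threats.
history = [
    Snapshot("release", date(2024, 3, 1), [2, 1, 3]),
    Snapshot("design", date(2024, 1, 10), [5, 4, 3]),
    Snapshot("build", date(2024, 2, 5), [2, 4, 3]),
]

for state, total, delta in progress(history):
    print(state, total, delta)
```

A negative change means the total residual risk went down between two snapshots, which is the kind of trend you could report as a KPI.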
I do suggest using a tool that supports this: mitigation scoring of countermeasures, severity scoring of threats, and threat model update tracking (state or date-time stamp of change).
When you do this manually, a scoring poker method (like agile poker for user stories) can help, provided you have the right subject matter experts in the room during the poker session.
I hope this helps!