Skip to content

bug(medcat): CU-869c0g9f7 Fix embedding linker disambiguation#314

Merged
mart-r merged 4 commits intomainfrom
bug/medcat/CU-869c0g9f7-embedding-linker-disambiguation
Feb 3, 2026
Merged

bug(medcat): CU-869c0g9f7 Fix embedding linker disambiguation#314
mart-r merged 4 commits intomainfrom
bug/medcat/CU-869c0g9f7-embedding-linker-disambiguation

Conversation

@mart-r
Copy link
Collaborator

@mart-r mart-r commented Feb 3, 2026

This PR will fix the issue with embedding linker that force it (in its default configuration with use_ner_link_candidates = True) not to any disambiguation.

Currently, whenever one runs the embedding linker it only links entities that had only one 1 link candidate. That's because of a condition in the _pre_inference method that ignored all other entities if use_ner_link_candidates was set.

This PR fixes that by:

  • Adding a test to show the failure
  • Fixing the condition to allow ambiguous names to be parsed by the linker

NOTE:

  • The first commit / workflow is intended to fail (at tests) since it includes the test, but not the underlying fix
  • The subsequent commmit / workflow should succeed as the fix is included

@tomolopolis
Copy link
Member

le.append(entity)
continue
elif self.cnf_l.use_ner_link_candidates:
elif self.cnf_l.use_ner_link_candidates and not len(entity.link_candidates):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor but this is the actual line so not super trivial. For my non python brain benefit - do you have to not len here?

Can you do just a not?

Suggested change
elif self.cnf_l.use_ner_link_candidates and not len(entity.link_candidates):
elif self.cnf_l.use_ner_link_candidates and not entity.link_candidates:

I think it's ultimately saying self.cnf_l.use_ner_link_candidates and len(entity.link_candidates) == 0, just the not len isnt a pattern I've seen before. I'm totally happy if it's super pythonic and standard, as a lot of things are.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, absolutely - having not entity.link_candidates would be better. I just started off with len(entity.link_candidates) == 0 and thought I'd simplify it. But you're right, relying on the truthy nature of a non-empty list is the best way here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big change ! Thanks for clearing up

@mart-r mart-r merged commit 2cb6421 into main Feb 3, 2026
21 checks passed
@mart-r mart-r deleted the bug/medcat/CU-869c0g9f7-embedding-linker-disambiguation branch February 3, 2026 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants