This thesis presents an exhaustive state-of-the-art (SOTA) description of visual question answering (VQA) models, the theoretical considerations and implementation details for training a VQA model that learns towards a semantically and conceptually strong latent space spanned by the ConceptNet Numberbatch embeddings, and an analysis of model behaviour using explainability tools.
Recommended citation: Jacobsen, Albert Kjøller; Højbjerg, Phillip Chavarria; Jacobsen, Aron Djurhuus. (2022). "Visual Question Answering with Knowledge-based Semantics." DTU Department of Applied Mathematics and Computer Science. https://findit.dtu.dk/en/catalog/62c6c822d4fccf03d747b3db