The official implementation of the paper "Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification" (ICMR 2025).
📌 Note: We strongly recommend referring to the [arXiv] version, which contains updated content and refinements beyond the ICMR 2025 publication.
conda create -n CDGLT python==3.8.0
conda activate CDGLT
pip install -r requirement.txt
Download image data: here
Download label file: here
Download OCR content of memes: here
Download the dataset division file: here (the 6/2/2 train/val/test split from Vincy2King/M3F-MEME, adopted in our experiments).
The structure of the data directory:
-data/
    -Eimages/
        -Eimages/
    -avg_test_label_E
    -avg_train_label_E
    -avg_val_label_E
    -E_text.csv
    -label_E.csv
Note: The label_E.csv and E_text.csv files offered on Kaggle have an encoding flaw. It can be fixed by opening them in VS Code and clicking Select Encoding -> Save with Encoding -> UTF-8. We provide the repaired files in the data directory of this repository.
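Alternatively, a minimal Python sketch can re-encode the two files; the latin-1 source encoding below is an assumption, so adjust it if the decode fails:

```python
# Re-encode the flawed CSVs to UTF-8.
# The "latin-1" source encoding is an assumption, not confirmed by the repo.
for name in ["data/label_E.csv", "data/E_text.csv"]:
    with open(name, "r", encoding="latin-1") as f:
        text = f.read()
    with open(name, "w", encoding="utf-8") as f:
        f.write(text)
```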
The pretrained CLIP model we used: openai/clip-vit-large-patch14
The pretrained GPT-2 model we used: openai-community/gpt2
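For reference, both backbones can be loaded through Hugging Face transformers roughly as follows; this is a sketch, and the repository's own scripts may wrap these calls differently:

```python
from transformers import CLIPModel, CLIPProcessor, GPT2LMHeadModel, GPT2Tokenizer

# CLIP backbone used for image and text embeddings
clip_model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# GPT-2 used for the prompt token IDs
gpt2_tokenizer = GPT2Tokenizer.from_pretrained("openai-community/gpt2")
gpt2_model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")
```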
Obtain the 6/2/2 train/val/test label files for each specific task:
# {YOUR_PATH}/CDGLT$
cd ./utils
python split_train_val_test.py
The split files will then be written to data/E_split/.
Task 0: Sentiment Analysis; Task 1: Sentiment Analysis; Task 2: Intention Detection; Task 3: Offensiveness Detection; Task 4: Metaphor Identification.
The first column in each generated .csv file holds the image ID (e.g., the ID of the file named image_ (26).jpg is 26).
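To illustrate the ID convention, here is a small sketch that maps IDs back to file names; the split file name train_4.csv and the header-less layout are assumptions for illustration:

```python
import pandas as pd

# Hypothetical split file; the first column holds image IDs per the note above.
split = pd.read_csv("data/E_split/train_4.csv", header=None)
image_ids = split[0].astype(int)
# Reconstruct file names; note the space in the original naming scheme.
file_names = [f"image_ ({i}).jpg" for i in image_ids]
```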
# {YOUR_PATH}/CDGLT$
cd ./utils
python write_clipText_feature.py
python write_clipViT_feature.py
python write_gpt2_prompt_tokenid.py
The embeddings and GPT-2 token IDs will be written to feature/cache_E/.
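Conceptually, these scripts encode every meme once so that training never has to run the frozen backbones again. A rough sketch of the image side (the image path, cache file name, and torch.save format are illustrative assumptions, not the scripts' exact behavior):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# Encode one meme image with the CLIP vision tower (no gradients needed).
image = Image.open("data/Eimages/Eimages/image_ (26).jpg")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    feat = model.get_image_features(**inputs)  # shape (1, 768) for ViT-L/14
torch.save(feat, "feature/cache_E/image_26.pt")  # hypothetical cache layout
```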
# {YOUR_PATH}/CDGLT$
bash ./train_MI.sh
This bash script uses the nohup command, so the Python program runs in the background and its standard output is redirected to a log file.
