The official implementation of the paper "Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification" (ICMR 2025).
📌 Note: We strongly recommend referring to the [arXiv] version, which contains updated content and refinements beyond the ICMR 2025 publication.
conda create -n CDGLT python==3.8.0
conda activate CDGLT
pip install -r requirement.txt
Download image data: here
Download label file: here
Download OCR content of memes: here
Download the dataset division file: here (the 6/2/2 train/val/test split from Vincy2King/M3F-MEME, adopted in our experiments).
The structure of the data directory:
-data/
    -Eimages/
        -Eimages/
    -avg_test_label_E
    -avg_train_label_E
    -avg_val_label_E
    -E_text.csv
    -label_E.csv
Note: The label_E.csv and E_text.csv files offered on Kaggle have an encoding flaw. It can be fixed by opening them in VS Code and clicking Select Encoding -> Save with Encoding -> UTF-8. We provide the repaired files in the data directory of this repository.
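Alternatively, a minimal Python sketch can re-encode the two files; the latin-1 source encoding below is an assumption, so adjust it if the decode fails:

```python
# Re-encode the flawed CSVs to UTF-8.
# The "latin-1" source encoding is an assumption, not confirmed by the repo.
for name in ["data/label_E.csv", "data/E_text.csv"]:
    with open(name, "r", encoding="latin-1") as f:
        text = f.read()
    with open(name, "w", encoding="utf-8") as f:
        f.write(text)
```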
The pretrained CLIP model we used: openai/clip-vit-large-patch14
The pretrained GPT-2 model we used: openai-community/gpt2
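For reference, both backbones can be loaded through Hugging Face transformers roughly as follows; this is a sketch, and the repository's own scripts may wrap these calls differently:

```python
from transformers import CLIPModel, CLIPProcessor, GPT2LMHeadModel, GPT2Tokenizer

# CLIP backbone used for image and text embeddings
clip_model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# GPT-2 used for the prompt token IDs
gpt2_tokenizer = GPT2Tokenizer.from_pretrained("openai-community/gpt2")
gpt2_model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")
```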
Obtain the 6/2/2 train/val/test label files for each specific task:
# {YOUR_PATH}/CDGLT$
cd ./utils
python split_train_val_test.py
The split files will then be written to data/E_split/.
Task 0: Sentiment Analysis; Task 1: Sentiment Analysis; Task 2: Intention Detection; Task 3: Offensiveness Detection; Task 4: Metaphor Identification.
The first column in each generated .csv file holds the image ID (e.g., the ID of the file named image_ (26).jpg is 26).
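To illustrate the ID convention, here is a small sketch that maps IDs back to file names; the split file name train_4.csv and the header-less layout are assumptions for illustration:

```python
import pandas as pd

# Hypothetical split file; the first column holds image IDs per the note above.
split = pd.read_csv("data/E_split/train_4.csv", header=None)
image_ids = split[0].astype(int)
# Reconstruct file names; note the space in the original naming scheme.
file_names = [f"image_ ({i}).jpg" for i in image_ids]
```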
# {YOUR_PATH}/CDGLT$
cd ./utils
python write_clipText_feature.py
python write_clipViT_feature.py
python write_gpt2_prompt_tokenid.py
The embeddings and GPT-2 token IDs will be written to feature/cache_E/.
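Conceptually, these scripts encode every meme once so that training never has to run the frozen backbones again. A rough sketch of the image side (the image path, cache file name, and torch.save format are illustrative assumptions, not the scripts' exact behavior):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# Encode one meme image with the CLIP vision tower (no gradients needed).
image = Image.open("data/Eimages/Eimages/image_ (26).jpg")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    feat = model.get_image_features(**inputs)  # shape (1, 768) for ViT-L/14
torch.save(feat, "feature/cache_E/image_26.pt")  # hypothetical cache layout
```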
# {YOUR_PATH}/CDGLT$
bash ./train_MI.sh
This bash script uses the nohup command, so the Python program runs in the background and its standard output is redirected to a log file.
