All Posts (71)

[Paper Review] Adaptive Multimodal Fusion: Dynamic Attention Allocation for Intent Recognition (AAAI, 2025)

Paper: https://ojs.aaai.org/index.php/AAAI/article/view/33898 Code: https://github.com/Freyrlake/MVCL-DAF Dynamically assigns weights to each modality depending on the input sample ..
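The excerpt above describes assigning modality weights dynamically per input sample. As a rough, hypothetical sketch of that general idea (not the MVCL-DAF implementation; the module and variable names below are made up for illustration), a per-sample attention over modality features could look like this:

```python
import torch
import torch.nn as nn

class DynamicModalityFusion(nn.Module):
    """Sketch only: score each modality's features per sample,
    softmax the scores into weights, and fuse by weighted sum."""
    def __init__(self, dim: int, num_modalities: int = 2):
        super().__init__()
        # one small scorer per modality (hypothetical design choice)
        self.scorers = nn.ModuleList(nn.Linear(dim, 1) for _ in range(num_modalities))

    def forward(self, feats):  # feats: list of (B, dim) tensors, one per modality
        scores = torch.cat([s(f) for s, f in zip(self.scorers, feats)], dim=-1)  # (B, M)
        weights = torch.softmax(scores, dim=-1)                                   # (B, M)
        stacked = torch.stack(feats, dim=1)                                       # (B, M, dim)
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)                       # (B, dim)

# usage with dummy "text" and "audio" features
fused = DynamicModalityFusion(dim=256)([torch.randn(4, 256), torch.randn(4, 256)])
```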

[Paper Review] CAFuser: Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes (IEEE Robotics and Automation Letters, 2025)

Paper: https://arxiv.org/abs/2410.10791 C..

[Paper Review] Multi-layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices (CVPR, 2025)

Paper: https://arxiv.org/abs/2410.10604 A paper published at CVPR 2025. The code can be found below ..

[Paper Review] Multi-modal Vision Pre-training for Medical Image Analysis (CVPR, 2025)

Paper: https://arxiv.org/abs/2410.10604 These days, when using multi-modal approaches that leverage several kinds of data at once ..

[Paper Review] Deep Multimodal Data Fusion (ACM Computing Surveys, 2024)

Paper: https://dl.acm.org/doi/10.1145/3649447 A survey paper published in ACM Computing Surveys. I needed to organize my understanding of multimodal data fusion, and this paper turned out to be a good reference. 1. Introduction Conventional taxonomy Traditionally, many studies classify and discuss multimodal data fusion as follows, and judging from recent papers this taxonomy still seems to be the one researchers use most. Early Fusion: a method that fuses raw or preprocessed data from different modalities before feeding them into the model. (In the figure, the data from Modality 1 and Modality 2 are f..
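To make the early fusion definition above concrete, here is a minimal sketch of my own (not taken from the survey): preprocessed features from two modalities are simply concatenated before a single model sees them. The feature dimensions and the downstream classifier are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

# Early fusion sketch: combine (preprocessed) modality inputs into one
# tensor *before* the model, then train a single model on the fused input.
image_feat = torch.randn(8, 512)    # Modality 1, e.g. image features (hypothetical dims)
sensor_feat = torch.randn(8, 64)    # Modality 2, e.g. tabular/sensor features

fused_input = torch.cat([image_feat, sensor_feat], dim=-1)   # (8, 576)

model = nn.Sequential(nn.Linear(512 + 64, 128), nn.ReLU(), nn.Linear(128, 10))
logits = model(fused_input)          # the downstream task operates on the fused input
```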

[Paper Review] Can LLMs Understand Time Series Anomalies? (ICLR, 2025)

Paper: https://arxiv.org/abs/2410.05440 This paper was submitted to ICLR 2025, and the related code can be found on the GitHub below ..

[Paper Review] Empirical data drift detection experiments on real-world medical imaging data (Nature Communications, 2024)

Paper: https://www.nature.com/articles/s41467-024-46142-w This paper carries out an empirical study of data drift detection in the medical domain. (I looked it up because I am interested in data drift detection; I have no background in the medical domain, so my understanding may be limited.) Introduction Monitoring the performance degradation of an AI model is common practice, but monitoring data drift in the input data (systemic changes to the input distribution) is generally not done. However, the authors argue that when real-time evaluation is difficult or labeling costs are high, tracking data drift is A..
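As a rough illustration of what tracking input data drift can look like in practice (a generic sketch, not the monitoring pipeline evaluated in this paper), one could compare a reference distribution of some input feature against a batch of incoming data with a two-sample test:

```python
import numpy as np
from scipy.stats import ks_2samp

# Generic drift-check sketch: compare an input feature's distribution
# at deployment time against a reference sample from training time.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)    # training-time feature values
production = rng.normal(loc=0.3, scale=1.0, size=1_000)   # slightly shifted incoming data

stat, p_value = ks_2samp(reference, production)
if p_value < 0.01:   # the threshold here is an arbitrary illustrative choice
    print(f"Possible drift detected (KS statistic={stat:.3f}, p={p_value:.2e})")
```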

[Paper Review] MM-LLMs: Recent Advances in MultiModal Large Language Models (2024)

Paper: https://arxiv.org/abs/2401.13601 This paper summarizes research on multimodal LLMs ..

[Paper Review] Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks (IEEE TPAMI, 2015, Exemplar-CNN)

Paper: https://arxiv.org/pdf/1406.6909 Introduction This paper introduced self-supervised learning, (probably) for the first time. To overcome the problem that papers on unsupervised learning always raise (building a labeled dataset is hard), it proposes pre-training a CNN on an unlabeled dataset. Method Create Surrogate Training Data Let's look at how the input-output pairs are constructed for training the Exemplar-CNN. First, as in Figure 1, N patches containing objects are extracted from the images in the unlabeled dataset. (More precisely, gradien..
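As a rough sketch of the surrogate-class idea described above (my own illustration, not the paper's exact sampling procedure; the truncated sentence refers to how patches are selected using image gradients): each sampled patch becomes its own pseudo-class, and heavily augmented copies of that patch become its training examples. The augmentation parameters below are arbitrary.

```python
import torch
from torchvision import transforms

# Surrogate training data sketch for Exemplar-CNN:
# every seed patch is its own class; augmented copies are its samples.
augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.ToTensor(),
])

def make_surrogate_batch(patches, copies_per_patch=8):
    """patches: list of PIL patches cropped from unlabeled images.
    Returns augmented images and pseudo-labels (one class per seed patch)."""
    images, labels = [], []
    for class_id, patch in enumerate(patches):
        for _ in range(copies_per_patch):
            images.append(augment(patch))
            labels.append(class_id)
    return torch.stack(images), torch.tensor(labels)
```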

[Paper Review] DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection (2023)

Paper: https://arxiv.org/abs/2303.08730 Introduction This ..