All Relations between "representation" and ViT

Publication | Sentence | Publish Date | Extraction Date | Species
Qiying Yang, Rongzuo Gu. An Unsupervised Method for Industrial Image Anomaly Detection with Vision Transformer-Based Autoencoder. Sensors (Basel, Switzerland). vol 24. issue 8. 2024-04-27. PMID:38676057. To mitigate these issues, this study proposes an unsupervised anomaly detection model employing the Vision Transformer (ViT) architecture, incorporating a transformer structure to understand the global context between image blocks, thereby extracting a superior representation of feature information. 2024-04-27 2024-04-29 Not clear
Gongshu Wang, Ning Jiang, Yunxiao Ma, Duanduan Chen, Jinglong Wu, Guoqi Li, Dong Liang, Tianyi Ya. Connectional-style-guided contextual representation learning for brain disease diagnosis. Neural networks : the official journal of the International Neural Network Society. vol 175. 2024-04-23. PMID:38653077. Specifically, it has a Vision Transformer (ViT) encoder and leverages mask reconstruction as the proxy task and Gram matrices to guide the representation of connectional information. 2024-04-23 2024-04-26 Not clear
Zhiyong Xiao, Yuhong Zhang, Zhaohong Deng, Fei Li. Light3DHS: A lightweight 3D hippocampus segmentation method using multiscale convolution attention and vision transformer. NeuroImage. 2024-04-16. PMID:38626817. Considering the importance of local features and global semantics for 3D segmentation, we used a lightweight ViT to learn high-level features of scale invariance and further fuse local-to-global representation. 2024-04-16 2024-04-19 Not clear
Bowen Tang, Zhangming Niu, Xiaofeng Wang, Junjie Huang, Chao Ma, Jing Peng, Yinghui Jiang, Ruiquan Ge, Hongyu Hu, Luhao Lin, Guang Yan. Automated molecular structure segmentation from documents using ChemSAM. Journal of cheminformatics. vol 16. issue 1. 2024-03-13. PMID:38475916. This study introduces a deep learning approach to chemical structure segmentation, employing a Vision Transformer (ViT) to discern the structural patterns of chemical compounds from their graphical representations. 2024-03-13 2024-03-15 Not clear
Abinaya K, Sivakumar. A Deep Learning-Based Approach for Cervical Cancer Classification Using 3D CNN and Vision Transformer. Journal of imaging informatics in medicine. 2024-02-12. PMID:38343216. The proposed model leverages the capability of 3D CNN to extract spatiotemporal features from cervical images and employs the ViT model to capture and learn complex feature representations. 2024-02-12 2024-02-15 Not clear
Jiarong Ye, Shivam Kalra, Mohammad Saleh Mir. Cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical ViT. Scientific reports. vol 14. issue 1. 2024-02-08. PMID:38331955. Cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical ViT. 2024-02-08 2024-02-11 Not clear
Jiarong Ye, Shivam Kalra, Mohammad Saleh Mir. Cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical ViT. Scientific reports. vol 14. issue 1. 2024-02-08. PMID:38331955. In this work, we introduce CypherViT, a cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical Vision Transformer (ViT). 2024-02-08 2024-02-11 Not clear
Yuzhong Chen, Zhenxiang Xiao, Yu Du, Lin Zhao, Lu Zhang, Zihao Wu, Dajiang Zhu, Tuo Zhang, Dezhong Yao, Xintao Hu, Tianming Liu, Xi Jian. A Unified and Biologically Plausible Relational Graph Representation of Vision Transformers. IEEE transactions on neural networks and learning systems. vol PP. 2024-01-02. PMID:38163310. To answer these fundamental questions, we, for the first time, propose a unified and biologically plausible relational graph representation of ViT models. 2024-01-02 2024-01-05 Not clear
Yuzhong Chen, Zhenxiang Xiao, Yu Du, Lin Zhao, Lu Zhang, Zihao Wu, Dajiang Zhu, Tuo Zhang, Dezhong Yao, Xintao Hu, Tianming Liu, Xi Jian. A Unified and Biologically Plausible Relational Graph Representation of Vision Transformers. IEEE transactions on neural networks and learning systems. vol PP. 2024-01-02. PMID:38163310. However, there is still a key lack of a unified representation of different ViT architectures for systematic understanding and assessment of model representation performance. 2024-01-02 2024-01-05 Not clear
Yuzhong Chen, Zhenxiang Xiao, Yu Du, Lin Zhao, Lu Zhang, Zihao Wu, Dajiang Zhu, Tuo Zhang, Dezhong Yao, Xintao Hu, Tianming Liu, Xi Jian. A Unified and Biologically Plausible Relational Graph Representation of Vision Transformers. IEEE transactions on neural networks and learning systems. vol PP. 2024-01-02. PMID:38163310. Using this unified relational graph representation, we found that: 1) model performance was closely related to graph measures; 2) the proposed relational graph representation of ViT has high similarity with real BNNs; and 3) there was a further improvement in model performance when training with a superior model to constrain the aggregation graph. 2024-01-02 2024-01-05 Not clear
Servas Adolph Tarimo, Mi-Ae Jang, Emmanuel Edward Ngasa, Hee Bong Shin, HyoJin Shin, Jiyoung Wo. WBC YOLO-ViT: 2 Way - 2 stage white blood cell detection and classification with a combination of YOLOv5 and vision transformer. Computers in biology and medicine. vol 169. 2023-12-28. PMID:38154163. YOLO (fast object detection) and ViT (powerful image representation capabilities) are effectively integrated into 16 classes. 2023-12-28 2023-12-31 Not clear
Md Haidar Sharif, Lei Jiao, Christian W Omli. CNN-ViT Supported Weakly-Supervised Video Segment Level Anomaly Detection. Sensors (Basel, Switzerland). vol 23. issue 18. 2023-09-28. PMID:37765792. In this paper, we first address taking advantage of two pretrained feature extractors for CNN (e.g., C3D and I3D) and ViT (e.g., CLIP), for effectively extracting discerning representations. 2023-09-28 2023-10-07 Not clear
Kaicong Sun, Qian Wang, Dinggang She. Joint Cross-Attention Network with Deep Modality Prior for Fast MRI Reconstruction. IEEE transactions on medical imaging. vol PP. 2023-09-11. PMID:37695966. To enhance the representation ability of the proposed model, we deploy Vision Transformer (ViT) and CNN in the image and k-space domains, respectively. 2023-09-11 2023-10-07 Not clear
Zhong-Yu Li, Shanghua Gao, Ming-Ming Chen. SERE: Exploring Feature Self-Relation for Self-Supervised Transformer. IEEE transactions on pattern analysis and machine intelligence. vol PP. 2023-08-30. PMID:37647184. Self-relation-based learning further enhances the relation modeling ability of ViT, resulting in stronger representations that stably improve performance on multiple downstream tasks. 2023-08-30 2023-09-07 Not clear
Zhong-Yu Li, Shanghua Gao, Ming-Ming Chen. SERE: Exploring Feature Self-Relation for Self-Supervised Transformer. IEEE transactions on pattern analysis and machine intelligence. vol PP. 2023-08-30. PMID:37647184. As an alternative to CNNs, Vision Transformers (ViT) have strong representation ability with spatial self-attention and channel-level feedforward networks. 2023-08-30 2023-09-07 Not clear
Jie Ma, Yalong Bai, Bineng Zhong, Wei Zhang, Ting Yao, Tao Me. Visualizing and Understanding Patch Interactions in Vision Transformer. IEEE transactions on neural networks and learning systems. vol PP. 2023-05-24. PMID:37224360. Vision Transformer (ViT) has become a leading tool in various computer vision tasks, owing to its unique self-attention mechanism that learns visual representations explicitly through cross-patch information interactions. 2023-05-24 2023-08-14 Not clear
Jie Peng, Yangbin Xu, Luqing Luo, Haiyang Liu, Kaiqiang Lu, Jian Li. Regularized Denoising Masked Visual Pretraining for Robust Embodied PointGoal Navigation. Sensors (Basel, Switzerland). vol 23. issue 7. 2023-04-13. PMID:37050615. In the visual module, a self-supervised pretraining method, dubbed Regularized Denoising Masked Autoencoders (RDMAE), is designed to enable the Vision Transformer (ViT)-based visual encoder to learn robust representations. 2023-04-13 2023-08-14 Not clear
Yingjie Tian, Kunlong Ba. End-to-End Multitask Learning With Vision Transformer. IEEE transactions on neural networks and learning systems. vol PP. 2023-04-05. PMID:37018576. In this article, we draw on the recent success of Vision Transformer (ViT) to propose a multitask representation learning method called Multitask ViT (MTViT), which proposes a multiple-branch transformer to sequentially process the image patches (i.e., tokens in the transformer) that are associated with various tasks. 2023-04-05 2023-08-14 Not clear
Shuiqing Zhao, Yanan Wu, Mengmeng Tong, Yudong Yao, Wei Qian, Shouliang Q. CoT-XNet: Contextual Transformer with Xception Network for diabetic retinopathy grading. Physics in medicine and biology. 2022-11-02. PMID:36322995. Recently, a Vision Transformer (ViT) has shown comparable or even superior performance to CNNs, and it also learns different visual representations from CNNs. 2022-11-02 2023-08-14 Not clear
Pan Huang, Peng He, Sukun Tian, Mingrui Ma, Peng Feng, Hualiang Xiao, Francesco Mercaldo, Antonella Santone, Jing Qi. A ViT-AMC Network with Adaptive Model Fusion and Multiobjective Optimization for Interpretable Laryngeal Tumor Grading from Histopathological Images. IEEE transactions on medical imaging. vol PP. 2022-08-26. PMID:36018875. Then, we propose a multiobjective optimization method to solve the problem that ViT and AMC blocks cannot simultaneously have good feature representation. 2022-08-26 2023-08-14 Not clear