This content originally appeared on HackerNoon and was authored by Writings, Papers and Blogs on Text Models
Table of Links
6 Phi-3-Vision
A Example prompt for benchmarks
6.3 Safety
To ensure that Phi-3-Vision aligns with Microsoft’s Responsible AI (RAI) principles, we applied safety post-training in both the Supervised Fine-Tuning (SFT) stage and the Direct Preference Optimization (DPO) stage. To build the safety training datasets, we used not only text-only RAI datasets but also a variety of in-house Multi-Modal (MM) RAI datasets covering the harm categories identified in both public and internal MM RAI benchmarks. For RAI evaluation, we performed a rigorous quantitative assessment on both public and internal benchmarks, in conjunction with a human evaluation conducted by Microsoft’s internal red team.
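
The DPO stage mentioned above can be illustrated with the standard per-pair DPO objective. This is only a minimal sketch: the text does not state which variant of the objective, which `beta`, or which preference pairs were used. The function name `dpo_pair_loss`, the default `beta=0.1`, and the reading of "chosen" as a safe response and "rejected" as a harmful one are assumptions made for illustration.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (hypothetical helper).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    In a safety setting, "chosen" would plausibly be the safe response and
    "rejected" the harmful one (an assumption, not stated in the text).
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)) written as softplus(-margin) for stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference model assign identical log-probabilities, the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the chosen response relative to the reference.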
\ In Table 3, we present the evaluation results of Phi-3-Vision on three MM RAI benchmarks: one internal and two public (RTVLM [LLY+24] and VLGuard [ZBY+24]). We compare these results with those of other models: the open-source Llava-1.5 [LLLL23], Llava-1.6 [LLL+24], and Qwen-VL-Chat [BBY+23], as well as the proprietary GPT4-V [Ope23]. The results indicate that safety post-training notably improves the RAI performance of Phi-3-Vision across all RAI benchmarks. In Figure 7, we further break down the performance across the RAI categories of the VLGuard and internal benchmarks, showing that safety post-training improves Phi-3-Vision's RAI performance in nearly all categories.
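
A per-category comparison of the kind described above could be computed as follows. This is a rough sketch, not the authors' evaluation code: the helper names `per_category_mean` and `improvement`, the `(category, score)` record layout, and the convention that a higher score means safer behavior are all assumptions. The sample records are placeholders and are not results from the paper.

```python
from collections import defaultdict


def per_category_mean(records):
    """Average score per harm category.

    `records` is an iterable of (category, score) pairs (assumed layout).
    """
    totals = defaultdict(lambda: [0.0, 0])  # category -> [sum, count]
    for category, score in records:
        totals[category][0] += score
        totals[category][1] += 1
    return {cat: total / count for cat, (total, count) in totals.items()}


def improvement(before, after):
    """Per-category change in mean score for categories present in both."""
    return {cat: after[cat] - before[cat] for cat in before if cat in after}


# Placeholder records only; these are NOT numbers from the paper.
before_records = [("category_a", 0.0), ("category_a", 1.0), ("category_b", 1.0)]
after_records = [("category_a", 1.0), ("category_a", 1.0), ("category_b", 1.0)]

delta = improvement(per_category_mean(before_records),
                    per_category_mean(after_records))
```

Reporting the change per category, rather than a single aggregate, makes it visible when post-training helps in most categories but not in all of them.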
\
\
:::info Authors:
(1) Marah Abdin;
(2) Sam Ade Jacobs;
(3) Ammar Ahmad Awan;
(4) Jyoti Aneja;
(5) Ahmed Awadallah;
(6) Hany Awadalla;
(7) Nguyen Bach;
(8) Amit Bahree;
(9) Arash Bakhtiari;
(10) Jianmin Bao;
(11) Harkirat Behl;
(12) Alon Benhaim;
(13) Misha Bilenko;
(14) Johan Bjorck;
(15) Sébastien Bubeck;
(16) Qin Cai;
(17) Martin Cai;
(18) Caio César Teodoro Mendes;
(19) Weizhu Chen;
(20) Vishrav Chaudhary;
(21) Dong Chen;
(22) Dongdong Chen;
(23) Yen-Chun Chen;
(24) Yi-Ling Chen;
(25) Parul Chopra;
(26) Xiyang Dai;
(27) Allie Del Giorno;
(28) Gustavo de Rosa;
(29) Matthew Dixon;
(30) Ronen Eldan;
(31) Victor Fragoso;
(32) Dan Iter;
(33) Mei Gao;
(34) Min Gao;
(35) Jianfeng Gao;
(36) Amit Garg;
(37) Abhishek Goswami;
(38) Suriya Gunasekar;
(39) Emman Haider;
(40) Junheng Hao;
(41) Russell J. Hewett;
(42) Jamie Huynh;
(43) Mojan Javaheripi;
(44) Xin Jin;
(45) Piero Kauffmann;
(46) Nikos Karampatziakis;
(47) Dongwoo Kim;
(48) Mahmoud Khademi;
(49) Lev Kurilenko;
(50) James R. Lee;
(51) Yin Tat Lee;
(52) Yuanzhi Li;
(53) Yunsheng Li;
(54) Chen Liang;
(55) Lars Liden;
(56) Ce Liu;
(57) Mengchen Liu;
(58) Weishung Liu;
(59) Eric Lin;
(60) Zeqi Lin;
(61) Chong Luo;
(62) Piyush Madan;
(63) Matt Mazzola;
(64) Arindam Mitra;
(65) Hardik Modi;
(66) Anh Nguyen;
(67) Brandon Norick;
(68) Barun Patra;
(69) Daniel Perez-Becker;
(70) Thomas Portet;
(71) Reid Pryzant;
(72) Heyang Qin;
(73) Marko Radmilac;
(74) Corby Rosset;
(75) Sambudha Roy;
(76) Olatunji Ruwase;
(77) Olli Saarikivi;
(78) Amin Saied;
(79) Adil Salim;
(80) Michael Santacroce;
(81) Shital Shah;
(82) Ning Shang;
(83) Hiteshi Sharma;
(84) Swadheen Shukla;
(85) Xia Song;
(86) Masahiro Tanaka;
(87) Andrea Tupini;
(88) Xin Wang;
(89) Lijuan Wang;
(90) Chunyu Wang;
(91) Yu Wang;
(92) Rachel Ward;
(93) Guanhua Wang;
(94) Philipp Witte;
(95) Haiping Wu;
(96) Michael Wyatt;
(97) Bin Xiao;
(98) Can Xu;
(99) Jiahang Xu;
(100) Weijian Xu;
(101) Sonali Yadav;
(102) Fan Yang;
(103) Jianwei Yang;
(104) Ziyi Yang;
(105) Yifan Yang;
(106) Donghan Yu;
(107) Lu Yuan;
(108) Chengruidong Zhang;
(109) Cyril Zhang;
(110) Jianwen Zhang;
(111) Li Lyna Zhang;
(112) Yi Zhang;
(113) Yue Zhang;
(114) Yunan Zhang;
(115) Xiren Zhou.
:::
:::info This paper is available on arXiv under a CC BY 4.0 DEED license.
:::
\

Writings, Papers and Blogs on Text Models | Sciencx (2025-07-09T15:30:07+00:00) Benchmarking Multimodal Safety: Phi-3-Vision’s Robust RAI Performance. Retrieved from https://www.scien.cx/2025/07/09/benchmarking-multimodal-safety-phi-3-visions-robust-rai-performance/