Publications

(2025). Seeing What You Say: Expressive Image Generation from Speech. Proceedings of the IEEE International Conference on Computer Vision Workshops.
(2025). Probabilistic Language-Image Pre-Training. International Conference on Learning Representations.
(2025). MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation. Transactions on Machine Learning Research.
(2025). DNNs May Determine Major Properties of Their Outputs Early, with Timing Possibly Driven by Bias. arXiv preprint arXiv:2502.08167.
(2024). Similarity of neural architectures using adversarial attack transferability. European Conference on Computer Vision.
(2024). SeiT++: Masked Token Modeling Improves Storage-efficient Training. European Conference on Computer Vision.
(2024). Rotary position embedding for vision transformer. European Conference on Computer Vision.
(2023). TADA: Timestep-Aware Data Augmentation for Diffusion Models. NeurIPS Workshop.
(2023). SeiT: Storage-efficient vision training with tokens using 1% of pixel storage. Proceedings of the IEEE/CVF International Conference on Computer Vision.
(2022). Few-shot font generation with weakly supervised localized representations. IEEE Transactions on Pattern Analysis and Machine Intelligence.