(Principles | Practice) T2I-Adapter

Paper [T2I-Adapter] #

T2I-Adapter[10] #

T2I-Adapter is a lightweight adapter that gives text-to-image models more accurate structural guidance. It works by learning an alignment between the internal knowledge of the text-to-image model and an external control signal, such as an edge map or a depth map.
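One way to picture this alignment is as feature injection: the adapter turns the control signal into feature maps, and these are added to the text-to-image model's own encoder features, scale by scale. The following is a minimal sketch under that assumption, not the reference implementation. The helper name `inject_adapter_features`, the `weight` parameter, and the toy flat lists that stand in for tensors are all hypothetical.

```python
def inject_adapter_features(encoder_feats, adapter_feats, weight=1.0):
    """Add adapter features to encoder features, one scale at a time.

    Each argument is a list with one entry per scale; every entry is a flat
    list of floats standing in for a real feature tensor (an assumption made
    to keep the sketch dependency-free).
    """
    if len(encoder_feats) != len(adapter_feats):
        raise ValueError("expected one adapter feature map per encoder scale")
    fused = []
    for enc, ada in zip(encoder_feats, adapter_feats):
        if len(enc) != len(ada):
            raise ValueError("feature maps at the same scale must match in size")
        # Element-wise addition; `weight` scales how strongly the control
        # signal influences the model (hypothetical knob for illustration).
        fused.append([e + weight * a for e, a in zip(enc, ada)])
    return fused


# Toy data: two scales with tiny "feature maps".
encoder = [[1.0, 2.0], [3.0]]
adapter = [[0.5, 0.5], [1.0]]
fused = inject_adapter_features(encoder, adapter)
print(fused)
```

Setting `weight=0.0` in this sketch leaves the encoder features unchanged, which corresponds to running the base model without any control signal.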

The T2I-Adapter design is simple: the condition is passed through four feature extraction blocks and three downsample blocks. This makes it fast and easy to train separate adapters for different conditions, each of which can be plugged into the text-to-image model. T2I-Adapter is similar to ControlNet, but it is smaller (~77M parameters) and faster because it runs only once for the whole diffusion process rather than at every denoising step. The downside is that performance may be slightly worse than ControlNet's.
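The layout described above can be made concrete with a small shape calculation and a toy sampling loop. This is a minimal sketch, not the reference implementation. The 512×512 condition size, the initial downscale factor of 8, and the channel widths (320, 640, 1280, 1280) are assumptions based on a common Stable Diffusion 1.x configuration. The names `adapter_feature_shapes`, `CountingAdapter`, and `sample` are hypothetical.

```python
def adapter_feature_shapes(height=512, width=512, unshuffle=8,
                           channels=(320, 640, 1280, 1280)):
    """Return (channels, height, width) for each adapter feature map."""
    h, w = height // unshuffle, width // unshuffle  # assumed initial downscale
    shapes = []
    for index, c in enumerate(channels):
        if index > 0:  # a downsample block sits between consecutive stages
            h, w = h // 2, w // 2
        shapes.append((c, h, w))
    return shapes


class CountingAdapter:
    """Stand-in adapter that records how often it is run."""

    def __init__(self):
        self.calls = 0

    def __call__(self, condition_size):
        self.calls += 1
        return adapter_feature_shapes(*condition_size)


def sample(adapter, condition_size, num_steps=50):
    """Toy sampling loop: run the adapter once, reuse its output every step."""
    features = adapter(condition_size)  # does not depend on the timestep
    reuse_count = 0
    for _ in range(num_steps):
        # A real denoiser would add `features` to its encoder activations here.
        reuse_count += 1
    return features, reuse_count


adapter = CountingAdapter()
features, reuse_count = sample(adapter, (512, 512))
for shape in features:
    print(shape)
print("adapter runs:", adapter.calls, "| denoising steps:", reuse_count)
```

Four stages with a downsample between consecutive stages give exactly three downsample steps. Because the adapter output does not depend on the timestep, the loop reuses it, whereas a ControlNet-style branch would be evaluated inside the loop at every step.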

Motivation[1] #

[Figure: Untitled.png]

Method[1] #

[Figure: Untitled-1.png]

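References #

1. [Latest work from Peking University and Tencent] T2I-Adapter: More Controllable Text-to-Image Generation V

1xx. T2I-Adapter: Unlocking More Control Capabilities of SD Models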

1xx. Efficient Controllable Generation for SDXL with T2I-Adapters

Practice #

  1. T2I-Adapter on Hugging Face