Has DALLE-2 acquired its own language!?

小猫遊りょう(たかにゃし・りょう) @jaguring1

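This gave me chills... DALL-E2 has apparently acquired its own language. For example, if you enter "Two whales talking about food, with subtitles" in English, it generates an image that reflects the prompt, but the text in the image looks like gibberish. Yet when that string is fed back into DALL-E2, it generates seafood, so the results are consistent. giannisdaras.github.io/publications/D… pic.twitter.com/xDsotDySI9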

2022-06-01 08:59:46
小猫遊りょう(たかにゃし・りょう) @jaguring1

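Entering "Two farmers talking (with subtitles)" into DALL-E2 produces a plausible image, and again the generated text looks like gibberish, but feeding those strings back into DALL-E2 generates images of vegetables and birds. It looks as if the farmers, holding vegetables in their arms, are talking in "DALL-E2 language" about damage caused by birds.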

2022-06-01 09:42:32
小猫遊りょう(たかにゃし・りょう) @jaguring1

The tweet everyone is talking about ↓ twitter.com/giannis_daras/…

2022-06-01 09:44:37
Giannis Daras @giannis_daras

DALLE-2 has a secret language. "Apoploe vesrreaitais" means birds. "Contarra ccetnxniams luryca tanniounons" means bugs or pests. The prompt: "Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons" gives images of birds eating bugs. A thread (1/n)🧵 pic.twitter.com/VzWfsCFnZo

2022-06-01 02:44:25
小猫遊りょう(たかにゃし・りょう) @jaguring1

One of the paper's authors: twitter.com/AlexGDimakis/s…

2022-06-01 09:46:42
Alex Dimakis @AlexGDimakis

My student Giannis discovered that DALLE2 has a secret language. This can be used to create absurd prompts that generate images. E.g. ''Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons'' generates Birds eating Bugs! We wrote a short paper on our experiments. twitter.com/giannis_daras/…

2022-06-01 02:47:43
小猫遊りょう(たかにゃし・りょう) @jaguring1

Reaction from OpenAI leadership: twitter.com/gdb/status/153…

2022-06-01 09:49:59
Greg Brockman @gdb

Turns out DALL-E can read the seemingly gibberish writing it produces. Built its own mini-language that is consistent between its text input space and image output space: twitter.com/giannis_daras/…

2022-06-01 09:07:50
小猫遊りょう(たかにゃし・りょう) @jaguring1

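If a wider range of tests were run, quite a few failure cases would probably turn up, and it is possible that only the cases that happened to succeed were selected. Still, what I wonder is: how likely is it that the "input text", the "generated image", and "DALL-E2's interpretation of the mysterious string shown in the image" would all turn out to be consistent with one another?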

2022-06-01 10:14:11
Giannis Daras @giannis_daras

A known limitation of DALLE-2 is that it struggles with text. For example, the prompt: "Two farmers talking about vegetables, with subtitles" gives an image that appears to have gibberish text on it. However, the text is not as random as it initially appears... (2/n) pic.twitter.com/B3e5qVsTKu

2022-06-01 02:44:26
Giannis Daras @giannis_daras

We feed the text "Vicootes" from the previous image to DALLE-2. Surprisingly, we get (dishes with) vegetables! We then feed the words: "Apoploe vesrreaitars" and we get birds. It seems that the farmers are talking about birds, messing with their vegetables! (3/n) pic.twitter.com/OiU7NPTbor

2022-06-01 02:44:27
Giannis Daras @giannis_daras

Another example: "Two whales talking about food, with subtitles". We get an image with the text "Wa ch zod rea" written on it. Apparently, the whales are actually talking about their food in the DALLE-2 language. (4/n) pic.twitter.com/cqlUYXlLvf

2022-06-01 02:44:28
Giannis Daras @giannis_daras

Some words from the DALLE-2 language can be learned and used to create absurd prompts. For example, "painting of Apoploe vesrreaitais" gives a painting of a bird. "Apoploe vesrreaitais" means to the model "something that flies" and can be used across diverse styles. (5/n) pic.twitter.com/w73iKN4kM1

2022-06-01 02:44:29
Giannis Daras @giannis_daras

The discovery of the DALLE-2 language creates many interesting security and interpretability challenges. Currently, NLP systems filter text prompts that violate the policy rules. Gibberish prompts may be used to bypass these filters. (6/n)

2022-06-01 02:44:29
Giannis Daras @giannis_daras

We wrote a small paper with @AlexGDimakis summarizing our findings. Please find the paper here: giannisdaras.github.io/publications/D… Arxiv version coming soon. (7/n, n=7).

2022-06-01 02:44:29
Giannis Daras @giannis_daras

Based on valid comments, we updated our paper with a discussion on Limitations and changed the title to Discovering the Hidden Vocabulary of DALLE-2. Thanks to @mraginsky @rctatman @benjamin_hilton and others for useful comments.

2022-06-01 10:23:50
Giannis Daras @giannis_daras

Responses to some of the criticism can be found here: twitter.com/giannis_daras/…

2022-06-03 15:12:11
Giannis Daras @giannis_daras

An update on the hidden vocabulary of DALLE-2. While a lot of the feedback we received was constructive, some of the comments need to be addressed. A thread, with some new gibberish text and some discussion 🧵 (1/N)

2022-06-03 15:09:27

Responses to some of the criticism are as follows.

Giannis Daras @giannis_daras

@benjamin_hilton said that we got lucky with the whales example. We found another similar example. "Two men talking about soccer, with subtitles" gives the word "tiboer". This seems to give sports in ~4/10 images. (2/N) pic.twitter.com/OilSoWqzVH

2022-06-03 15:09:29
Giannis Daras @giannis_daras

A few people, including @realmeatyhuman, asked whether our method works beyond natural images (of birds, etc). Yes, we found some examples that seem statistically significant. E.g. "doitcdces" seems related (~4/10 images) to students (or learning). (3/N) pic.twitter.com/gtfywmRBkD

2022-06-03 15:09:30
Giannis Daras @giannis_daras

Similarly, "comafuruder" seems correlated (~4/10) to sickness/hospitals/patients. (4/N) pic.twitter.com/aGg7uB1Iaq

2022-06-03 15:09:32
Giannis Daras @giannis_daras

@BarneyFlames, @mattgroh pointed out that "Apoploe", our gibberish word for birds, has similar BPE encoding to "Apodidae". Interestingly, "Apodidae" produces ~1/10 birds (but many flying insects), while our gibberish "Apoploe" gives 10/10. (5/N) pic.twitter.com/rz3JZ9Lg3x

2022-06-03 15:09:33
Giannis Daras @giannis_daras

However, "Apodidae Ploceidae" (two names of real bird families) indeed gives 10/10 birds. Therefore, one possible explanation is that our gibberish tokens are mashups of parts of real words. This seems reasonable. It is interesting that DALLE-2 generates those mashups. (6/N)

2022-06-03 15:09:34
Giannis Daras @giannis_daras

Our gibberish tokens might have many meanings. @benjamin_hilton ran "Contarra ccetnxniams luryca tanniounons" and pointed out that not all are bugs. Indeed, our gibberish text produces a statistically significant fraction, but rarely a 100% match to the target concept. (7/N) pic.twitter.com/SWC0Tr9R2E

2022-06-03 15:09:35
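
Editorial note: the thread reports hit rates such as "~4/10 images" and calls them statistically significant, but it does not say which test or baseline was used. As a rough sketch only, one way to reason about such a claim is a one-sided binomial tail: if a random gibberish string produced the target concept in a single image with some base rate p, how often would at least 4 of 10 images match by chance? The base rates below are hypothetical values chosen for illustration, not figures from the paper, and the calculation assumes the 10 images are independent draws.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical per-image base rates (assumptions, not measured values):
# the chance that an unrelated prompt yields the target concept anyway.
assumed_base_rates = (0.01, 0.05, 0.20)

for p in assumed_base_rates:
    # Chance of seeing 4 or more matching images out of 10 by luck alone.
    print(f"base rate {p:.2f}: P(>=4 of 10) = {binomial_tail(4, 10, p):.4f}")
```

Under these assumptions the conclusion depends heavily on the baseline: with a 5% base rate, 4 of 10 matches would be unlikely by chance (roughly 0.001), while with a 20% base rate it would not be unusual (roughly 0.12). This does not account for the selection effect raised earlier in the discussion, where only successful examples are reported.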